Why do unsupervised transformers only update on predict and not also on learn? #542
-
I came across an issue when I wanted to pretrain a model: it would fail to learn anything. After quite some time I figured out that the StandardScaler in my pipeline (StandardScaler -> PAClassifier) would always output 0/0/0/0... for all features. Thanks to the documentation and a couple of issues/discussions here, I realized that this is the desired behaviour: unsupervised transformers only update when calling predict, not when calling learn.
I see why you would want to update transformers when predicting, but I don't understand why they are not also updated when calling learn. I'm not judging, just asking, so it would be great if someone could explain this to me.
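
For concreteness, here is a minimal sketch of what I mean (the data is invented for illustration). With only `learn_one` calls, the scaler's running statistics never update, so the scaled features come out as zeros, matching what I observed:

```python
from river import linear_model, preprocessing

scaler = preprocessing.StandardScaler()
model = scaler | linear_model.PAClassifier()

# Toy observation, purely for illustration.
x, y = {"x1": 1.0, "x2": -2.0}, True

# learn_one updates the classifier but NOT the unsupervised StandardScaler,
# so the scaler's running mean/variance stay at their initial values...
model.learn_one(x, y)

# ...and the scaled features it feeds downstream are all 0.
print(scaler.transform_one(x))  # {'x1': 0.0, 'x2': 0.0}
```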
-
It's a good question and comes up a lot.

The reason we update the transformers in `predict_one` is that we have all the information we need at that point, and it performs better to update the transformers as soon as possible. If we also updated them in `learn_one`, we could end up updating them twice, which is not desirable.

Indeed, the current way to do pretraining is to call `predict_one` before `learn_one`. I understand it's not ideal. What we could do is add a `learn_unsupervised` boolean parameter to the `learn_one` method.
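
For reference, here is a minimal sketch of that predict-before-learn pretraining pattern (the dataset and loop are invented for illustration):

```python
from river import linear_model, preprocessing

model = preprocessing.StandardScaler() | linear_model.PAClassifier()

# Hypothetical pretraining data, just to show the call order.
pretraining_set = [
    ({"x1": 1.0, "x2": -2.0}, True),
    ({"x1": 0.5, "x2": 3.0}, False),
]

for x, y in pretraining_set:
    # predict_one updates the unsupervised StandardScaler, so its running
    # statistics are in place before the supervised update below.
    model.predict_one(x)
    # learn_one then updates the PAClassifier on properly scaled features.
    model.learn_one(x, y)
```

If the `learn_unsupervised` flag mentioned above were added, the extra `predict_one` call could presumably be replaced by something like `model.learn_one(x, y, learn_unsupervised=True)` — but to be clear, that parameter is only a proposal at this point, not an existing part of the API.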