Missing fit
feature
#155
Comments
Could you explain what you mean? 🐙 🍺
In sklearn and similar toolkits we generally
So far I haven't started thinking about the missing fit feature, as fit is generally called on models that can be fitted to a dataset (at least in my experience so far).
Agree. Yes, we can mention that in the (future) FAQ page.
@mk2510 Any plans to add
In any ML task, the assumption is that the test data are not available during training and only become available in the prediction phase.
Assume someone wants to categorize reviews using tfidf + Naive Bayes. The required steps would be the following: fit tfidf on the train part and generate (the transform part in scikit-learn) the tfidf values on the train part.
The problem is that the current implementation does not keep any state (which also brings many advantages, such as simplicity). The tfidf functions do not return a fitted model, but rather the already transformed values.
We need to take a clear position on this point. Taking exactly the same approach as scikit-learn would probably not make sense, but we still need to consider this fact. Opinions?
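To make the fit/transform split concrete, here is a minimal sketch of the workflow described above, written with scikit-learn's `TfidfVectorizer` and `MultinomialNB` rather than this library's stateless functions. The toy reviews and labels are invented for illustration, and this only shows what state has to be carried from training to prediction; it is not a proposal for the final API.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy data, invented purely for illustration.
train_texts = [
    "great product, works well",
    "terrible, broke after a day",
    "really great value",
    "terrible quality, do not buy",
]
train_labels = ["pos", "neg", "pos", "neg"]

# Test reviews are assumed to be unseen until prediction time.
test_texts = [
    "great value, would recommend",
    "terrible, broke immediately",
]

# fit: learn the vocabulary and IDF weights from the train part only,
# then transform the train part into tfidf values.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)

# Train the classifier on the train tfidf values.
clf = MultinomialNB()
clf.fit(X_train, train_labels)

# transform: reuse the fitted state on the test part. Words that were
# not seen during fit (e.g. "recommend") are simply ignored.
X_test = vectorizer.transform(test_texts)
predictions = clf.predict(X_test)
```

The point of the sketch is that `vectorizer` is the state in question: without an object (or some returned value) holding the fitted vocabulary and IDF weights, the test part cannot be mapped into the same feature space as the train part.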