Can classifier update() be faster than training from scratch? #123

DSA101 opened this Issue Apr 15, 2016 · 2 comments



DSA101 commented Apr 15, 2016

I am building a dataset and training a NaiveBayesClassifier as the dataset grows. Instead of retraining the classifier from scratch each time I add a few new entries, I was hoping to use the update() method to add only the new entries, in order to cut training time when new data is added. What I discovered is that loading a pickled trained classifier and updating it with just the new entries is not faster than retraining it from scratch. Re-reading the docs, they do say that update() will "Update the classifier with new training data and re-trains the classifier", which implies retraining on the entire dataset...

Question: is there such a thing as incremental retraining, or does update() realistically process the entire dataset from scratch every time I want to add new data to the classifier?
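
To make the question concrete: a Naive Bayes model is fully determined by label counts and per-label word counts, so in principle new examples only need to increment those counts. Below is a minimal sketch of that idea in plain Python. The class `IncrementalNB` and its methods are hypothetical names used for illustration only; this is not the library's `NaiveBayesClassifier` and makes no claim about how its update() is implemented. It assumes whitespace tokenization and Laplace (add-one) smoothing.

```python
import math
from collections import Counter, defaultdict


class IncrementalNB:
    """Hypothetical multinomial Naive Bayes that stores raw counts.

    Because only counts are stored, adding examples costs time
    proportional to the new examples, not to the whole dataset.
    """

    def __init__(self):
        self.label_counts = Counter()            # label -> number of examples
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.vocab = set()                       # every word seen so far

    def update(self, examples):
        """Fold (text, label) pairs into the stored counts."""
        for text, label in examples:
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        """Return the most probable label, or None if nothing was trained."""
        total_examples = sum(self.label_counts.values())
        # Words never seen in training carry no information, so skip them.
        words = [w for w in text.lower().split() if w in self.vocab]
        best_label, best_score = None, float("-inf")
        for label, n_examples in self.label_counts.items():
            score = math.log(n_examples / total_examples)  # log prior
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in words:
                # Laplace smoothing: add one to every word count.
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


clf = IncrementalNB()
clf.update([("I love this library", "pos"), ("This is terrible", "neg")])
# Later, only the new entries are processed:
clf.update([("What a great tool", "pos"), ("awful and terrible", "neg")])
print(clf.classify("I love this great tool"))
```

In this sketch, two successive update() calls leave the model in the same state as one call with all the data, which is the behaviour the question is asking about. An object like this holds only standard containers, so it should also be possible to pickle it between sessions and keep updating it after loading.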

@DSA101 DSA101 changed the title from Can classifier update() be faster than train from scratch? to Can classifier update() be faster than training from scratch? Apr 15, 2016



IvRRimum commented Jun 6, 2016

Hey, I have a question. How do you save the classifications? I have an SQLite database with a string and a status, but how do I save the classifications themselves?




sloria commented Aug 16, 2017

#136 is now merged and released.

@sloria sloria closed this Aug 16, 2017
