Can classifier update() be faster than training from scratch? #123
Comments
Hey, I have a question. How do you save the classifications? I have an SQLite database with string and status, but how do I save the classifications themselves?
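If "classifications" here means the trained classifier object itself, one common approach is to pickle it and store the bytes in a BLOB column next to the existing data. The following is only a sketch under assumptions: the `models` table, its columns, and the dict standing in for a trained classifier are all hypothetical, and any picklable object would be handled the same way.

```python
import pickle
import sqlite3

# Hypothetical stand-in for a trained classifier; any picklable object works.
model = {"labels": ["pos", "neg"], "word_counts": {"pos": {"love": 1}, "neg": {"hate": 1}}}

# In-memory database for the demo; pass a file path to persist between runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE models (name TEXT PRIMARY KEY, data BLOB)")

# Serialize the object to bytes and store it as a BLOB.
conn.execute(
    "INSERT INTO models (name, data) VALUES (?, ?)",
    ("sentiment", pickle.dumps(model)),
)
conn.commit()

# Later: read the bytes back and rebuild the object.
row = conn.execute(
    "SELECT data FROM models WHERE name = ?", ("sentiment",)
).fetchone()
restored = pickle.loads(row[0])
```

Only unpickle data you wrote yourself, since `pickle.loads` can execute arbitrary code from untrusted input.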
This was referenced Sep 1, 2016
#136 is now merged and released.
I am building a dataset and training a NaiveBayesClassifier as the dataset grows. Instead of retraining the classifier every time I add a few new entries, I was hoping to use the update() method to add just the new entries and train on them alone, cutting training time when new data is added. What I discovered is that loading a pickled trained classifier and updating it with only the new entries is not faster than retraining it from scratch. Re-reading the docs, they do say that update() "Update the classifier with new training data and re-trains the classifier", which implies retraining on the entire dataset...
Question: is there such a thing as incremental retraining, or does it realistically process the entire dataset from scratch every time I want to update the classifier with new data?
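To illustrate what incremental retraining could look like in principle: a Naive Bayes model is fully described by label and word counts, so new examples can be folded in by bumping those counts without revisiting old data. The sketch below is a toy, self-contained implementation, not the library's NaiveBayesClassifier; the class name `IncrementalNaiveBayes`, the whitespace tokenizer, and the add-one smoothing are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


class IncrementalNaiveBayes:
    """Toy multinomial Naive Bayes that stores only counts (hypothetical class)."""

    def __init__(self):
        self.label_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # word occurrences per label
        self.vocab = set()                       # all words seen so far

    def update(self, examples):
        """Fold new (text, label) pairs into the counts; old data is not revisited."""
        for text, label in examples:
            self.label_counts[label] += 1
            for word in text.lower().split():
                self.word_counts[label][word] += 1
                self.vocab.add(word)

    def classify(self, text):
        """Return the label with the highest log-probability (add-one smoothing)."""
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.label_counts.items():
            score = math.log(n_docs / total_docs)
            n_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (n_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


old = [("I love this", "pos"), ("I hate this", "neg")]
new = [("what a great day", "pos"), ("what an awful day", "neg")]

# Incremental: train on the old data, then add only the new entries.
incremental = IncrementalNaiveBayes()
incremental.update(old)
incremental.update(new)

# From scratch: train once on everything.
scratch = IncrementalNaiveBayes()
scratch.update(old + new)
```

Both models end up with identical counts, and the cost of each `update()` call depends only on the number of new entries. A classifier that keeps the raw training set and rebuilds the model on every update cannot get that saving.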