Language Modeling #2077
@alvations @stevenbird The tests that fail currently all have to do with python 2 lacking I can easily compensate for
@copper-head thanks for all this... fine to add this dependency.
…g always done after initialization.
…gramCounter's init.
Ok, it looks like there's more tweaking that I'll have to do to make python 2.7 happy. I'll get to it soon.
Looks like python 2 is now happy :)
Wow, what a big contribution!
[CI: retest]
Thanks @copper-head!
Thanks @copper-head!! An awesome update to the model package!!
lol, only took me... 4 years ;)
How about announcing it on nltk-dev?
Sure!
The next one... once it's packaged I hope you can announce your work on nltk-dev
Ok, after getting some feedback on my previous attempt, I reworked things a bit.
This time there are tests aplenty, and I've tried to add documentation as well.
I have regression tests for:
Since I didn't add the Simple Good Turing estimator yet, I can't say anything about the issues related to that. Which brings me to the next point: I spent most of my effort making sure the basic data structures are usable and robust, so I didn't have as much time to add a lot of model classes. I'm hoping, however, that the infrastructure the module provides makes it easy for folks to create models on their own.
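To give a rough idea of what "counting structure plus a thin model class" can look like, here is a minimal plain-Python sketch. It is an illustration only: `ToyNgramCounter` and `ToyMLE` are hypothetical names made up for this example and are not the classes or the API added in this PR.

```python
from collections import Counter, defaultdict


def ngrams(tokens, n):
    """Yield successive n-grams (as tuples) from a list of tokens."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])


class ToyNgramCounter:
    """Hypothetical counting structure for illustration (not the PR's class)."""

    def __init__(self, order):
        self.order = order
        # Maps a context tuple to a Counter of the words seen after it.
        self.counts = defaultdict(Counter)

    def update(self, sentences):
        """Count every n-gram of length `order` in the tokenized sentences."""
        for sent in sentences:
            for gram in ngrams(sent, self.order):
                context, word = gram[:-1], gram[-1]
                self.counts[context][word] += 1


class ToyMLE:
    """Hypothetical maximum-likelihood model built on top of the counter."""

    def __init__(self, counter):
        self.counter = counter

    def score(self, word, context=()):
        """Relative frequency of `word` after `context`; 0.0 if unseen."""
        following = self.counter.counts.get(tuple(context))
        if not following:
            return 0.0
        return following[word] / sum(following.values())


counter = ToyNgramCounter(order=2)
counter.update([["a", "b", "a", "c"], ["a", "b"]])
model = ToyMLE(counter)
print(model.score("b", ["a"]))  # "b" follows "a" in 2 of the 3 observed cases
```

The point of the split is that a new model only needs to decide how to turn counts into scores; the counting itself stays in one place.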
Would be nice to merge this into the next release :)
Happy modeling!