Add more code examples / tutorials #117
Comments
I wanted to pick up this issue. Besides the three you mentioned, is there anything else that I should look out for? Things I should avoid doing or make sure that I do?
Hey @theSage21, thanks for signing up! 👍 The workflow for topic modeling is mostly standard and well-covered in textacy; it includes file I/O, preprocessing, spaCy parsing, tokenization into terms, vectorization, model training, and visualization of results. This is another good candidate. Investigating similarities of documents / sentences using metrics in the Really, though, I recommend just doing an analysis that's interesting to you, using
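For readers following along, the first few stages of the workflow listed above (preprocessing, tokenization into terms, vectorization) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library; it does not use textacy's or spaCy's API, the helper names `preprocess`, `tokenize`, and `vectorize` are hypothetical, and it stops before model training and visualization.

```python
import re
from collections import Counter


def preprocess(text):
    """Lowercase the text and collapse runs of whitespace (hypothetical helper)."""
    return re.sub(r"\s+", " ", text.lower()).strip()


def tokenize(text):
    """Split text into alphabetic terms; a stand-in for real spaCy parsing."""
    return re.findall(r"[a-z]+", text)


def vectorize(tokenized_docs):
    """Build a sorted vocabulary and a dense document-term count matrix."""
    vocab = sorted({term for doc in tokenized_docs for term in doc})
    index = {term: i for i, term in enumerate(vocab)}
    matrix = []
    for doc in tokenized_docs:
        row = [0] * len(vocab)
        for term, count in Counter(doc).items():
            row[index[term]] = count
        matrix.append(row)
    return vocab, matrix


# Two toy documents, made up for illustration only.
raw_docs = ["Topic models  find themes.", "Themes recur across documents."]
tokenized = [tokenize(preprocess(doc)) for doc in raw_docs]
vocab, doc_term_matrix = vectorize(tokenized)
```

The resulting `doc_term_matrix` has one row per document and one column per vocabulary term, which is the kind of input a topic model would typically be trained on.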
Awesome library! Just discovered it at work and am about to give it a go! I'm about to train a topic model and would love to post a tutorial here soon. Great stuff!
I'm having an issue with the TF-IDF function, but I suspect I am using it, or understanding how to use it, incorrectly. I have posted a question on SO and would be happy to write a usage doc once I understand it properly. https://stackoverflow.com/questions/55764766/calculate-td-idf-for-a-single-word-in-textacy
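As background for the question above, here is a minimal sketch of computing a TF-IDF weight for a single word in one document of a small corpus. It uses only the standard library and is not textacy's implementation: the function name `tfidf_for_term` is hypothetical, and the weighting (raw count for TF, `log(N / df) + 1` for IDF) is one common variant chosen as an assumption. Libraries often apply different smoothing or normalization, so their absolute values may differ.

```python
import math
from collections import Counter


def tfidf_for_term(term, doc_index, tokenized_docs):
    """Return the TF-IDF weight of `term` in the document at `doc_index`.

    Assumed weighting: TF is the raw count in the document, and
    IDF is log(n_docs / doc_frequency) + 1.
    """
    n_docs = len(tokenized_docs)
    tf = Counter(tokenized_docs[doc_index])[term]
    df = sum(1 for doc in tokenized_docs if term in doc)
    if tf == 0 or df == 0:
        # The term is absent from this document (or from the whole corpus).
        return 0.0
    idf = math.log(n_docs / df) + 1.0
    return tf * idf


# Toy corpus, made up for illustration only.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]
tokenized_docs = [doc.split() for doc in docs]

cat_weight = tfidf_for_term("cat", 0, tokenized_docs)  # appears in 2 of 3 docs
mat_weight = tfidf_for_term("mat", 0, tokenized_docs)  # appears in 1 of 3 docs
```

Because "mat" occurs in fewer documents than "cat", it receives the larger weight in the first document, which is the behavior TF-IDF is designed to produce.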
Expected Behavior
Users expect to learn from code examples and tutorials more so than from reading an API reference. We should oblige.
Current Behavior
Fairly brief usage examples are embedded throughout the code in docstrings, but there are few "end-to-end" examples to follow along with.
Possible Solution
Create a separate directory for tutorials, and add more detailed examples (in jupyter notebooks?) there. Create additional rst files to include in the official docs. Examples that have been conveyed to me:
- Vectorizer
- using the textacy.extract and textacy.keyterms modules, like textacy.extract.pos_regex_matches() and textacy.keyterms.sgrank
- text_utils.clean_terms() to a terms list
Context
I've gotten more than one email about this... Clearly there's a need.