The word "visualize" is misleading for hero.top_words #93
Comments
Hey @vidyap-xgboost, my apologies, but I'm not sure I understood your point and your suggestions. Could you please rephrase it a bit and expand on "I would suggest that a bar plot be returned which takes the top_words as input"? For you to know the
Also, I noticed that you are very good at writing and interested in helping with the documentation. One idea might be to add a blog post on "How to use Pandas on NLP and text mining tasks". Would you be interested in helping to write such an article? I have quite a lot of ideas, and I can help you formulate them if you would like! 🥳
What I meant by a
And yes, it makes a lot of sense not to even have this function, as you rightly pointed out, but I suppose it was added as a 'feature' under visualization.
--
Thanks for noticing my writing 😄! I am really interested in contributing blog articles. May I ask where these articles will be hosted? I'm also looking for ideas to contribute to the examples.
I see! Something useful to keep in mind is that Pandas is super powerful and it already allows for this kind of visualization. For instance, given a corpus, if we want to look at the 10 most common words as a bar plot, we can simply do:
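The snippet that originally followed here did not survive in this copy of the thread. As a minimal sketch of the idea, assuming a corpus stored in a Pandas Series of raw strings and simple whitespace tokenization, it might look like the code below. The helper names `top_words` and `plot_top_words` and the sample corpus are hypothetical, chosen only for illustration; this is not necessarily the code that was posted, and it is not Texthero's own implementation.

```python
import pandas as pd


def top_words(corpus: pd.Series, n: int = 10) -> pd.Series:
    """Return the n most frequent words in the corpus (hypothetical helper)."""
    # Lowercase, split each document on whitespace, and flatten to one word per row.
    words = corpus.str.lower().str.split().explode()
    # value_counts sorts by frequency in descending order.
    return words.value_counts().head(n)


def plot_top_words(corpus: pd.Series, n: int = 10):
    """Draw the n most frequent words as a bar plot (requires matplotlib)."""
    return top_words(corpus, n).plot.bar(title=f"Top {n} words")


# Tiny made-up corpus, for illustration only.
corpus = pd.Series([
    "the cat sat on the mat",
    "the dog chased the cat",
    "a dog and a cat",
])

print(top_words(corpus, n=2))
```

Calling `plot_top_words(corpus)` returns a Matplotlib axes object containing the bar plot, provided Matplotlib is installed.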
That's it! Awesome, isn't it? The point, again, is probably that we need to explain to users some useful tricks for dealing with text datasets in Pandas... so what we can really do is create some articles that explain all of these things (I would have loved to have these a year ago, for instance...)
Great that you are interested in contributing blog articles! They will appear here: texthero.org/blog. The way you do that (it will change a bit in the future (#40), but this is no big trouble) is that you write the article in Markdown format and then you add it under
If you are motivated and want to go further, I'm looking for someone who is willing to support me in managing the whole documentation of Texthero. This includes:
If you want to get more involved and are interested, I can assign you the role of "documentation maintainer"! 📝 ⚡ (It's quite interesting, I would say, as you will have to peer review all the PRs related to the documentation, organize the documentation, and help other users learn better!)
@jbesomi I'd be more than happy to contribute to the documentation of TextHero in every aspect and to make it easier for others to use and understand this awesome library! It would be really helpful if you could give me some basic pointers on what a 'documentation maintainer' should be doing and any guidelines they should follow while peer reviewing PRs. I can start working on a blog post if you have any suggestions.
Great then! I'm glad to receive your help!
Unfortunately, I don't have a complete response yet. Some thoughts:
Let me know your opinion! 👍
@jbesomi Thank you for explaining everything!
From the documentation, "visualization" would mean listing the top_words as a Pandas Series grouped by topics, as shown in this example:
So, if I have a dataset without any topics, then I would just get a Pandas Series of top_words, which is not a "visualization". For this, I would suggest that a bar plot be returned which takes the top_words as input.
visualization.py