is it possible to present result as cloud ? #425

Closed
om35 opened this issue Jan 31, 2022 · 6 comments

@om35

om35 commented Jan 31, 2022

Is it possible with BERTopic to present the results (topics and words) as a word cloud like this?
[image: cloud]

Size of word = frequency.
If yes, how can we do this? With which tools? At what level? Any ideas, please.

@MaartenGr
Owner

You can typically achieve this with the wordcloud package. You can find a minimal example of working with frequencies here.

So something like this:

from sklearn.datasets import fetch_20newsgroups
from bertopic import BERTopic
from wordcloud import WordCloud
import matplotlib.pyplot as plt

def create_wordcloud(topic_model, topic):
    # get_topic returns (word, c-TF-IDF score) pairs; the score drives word size
    text = {word: value for word, value in topic_model.get_topic(topic)}
    wc = WordCloud(background_color="white", max_words=1000)
    wc.generate_from_frequencies(text)
    # Render the cloud without axes
    plt.imshow(wc, interpolation="bilinear")
    plt.axis("off")
    plt.show()

# Train topic model
docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']
topic_model = BERTopic(verbose=True)
topics, probs = topic_model.fit_transform(docs)

# Show word cloud
create_wordcloud(topic_model, topic=1)

@MaartenGr
Owner

Closing this for now. However, if you want to discuss this further, let me know and I'll make sure to reopen the issue!

@AbuBakarrr

You are 😍ᒐ♡ᘎᙓ😍, Sir MaartenGr... such great support.

💙💙💙💚💚💚💛💛💛💜💜💜

@rsg-iik

rsg-iik commented Feb 14, 2023

Hi @MaartenGr,

Thanks a lot for creating this awesome package!
I am able to get really valuable insights for my use case using this.

I would like to create word clouds for all topics together so that this process can be automated. The code above (#425) works for me, and I am using it to get word clouds for all topics (e.g. 10 topics generated by BERTopic) as:

for topic in range(0, 10):
    create_wordcloud(topic_model, topic)

Is there any way we could extract all topics together from topic_model instead of manually passing the number of topics to the create_wordcloud function? (Top2Vec has this ability to generate word clouds for all topics together.) Please let me know.
Thanks

@MaartenGr
Owner

@rsg-iik That is currently not implemented as it would require adding another dependency to BERTopic. For that reason, I added how to create a word cloud to the documentation for the new release with instructions on how to install the additional dependency.
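
In the meantime, the loop above can be written without hard-coding the number of topics. The following is only a rough sketch, not something built into BERTopic: it assumes a fitted model whose get_topics() returns a dict mapping each topic id to a list of (word, score) pairs, and the helper names topic_frequencies and save_wordclouds, along with the "topic" file prefix, are made up for illustration. The wordcloud import sits inside the function because it is an optional extra dependency.

```python
def topic_frequencies(topic_model, include_outliers=False):
    """Return {topic_id: {word: score}} for every topic in a fitted model.

    Assumes topic_model.get_topics() yields {topic_id: [(word, score), ...]}.
    Topic -1 (BERTopic's outlier topic) is skipped unless requested.
    """
    return {
        topic_id: {word: score for word, score in words if word and score > 0}
        for topic_id, words in topic_model.get_topics().items()
        if include_outliers or topic_id != -1
    }


def save_wordclouds(topic_model, prefix="topic"):
    """Write one PNG per topic and return the list of file paths (hypothetical helper)."""
    from wordcloud import WordCloud  # optional dependency: pip install wordcloud

    paths = []
    for topic_id, freqs in topic_frequencies(topic_model).items():
        if not freqs:  # WordCloud cannot draw from an empty frequency dict
            continue
        wc = WordCloud(background_color="white", max_words=1000)
        wc.generate_from_frequencies(freqs)
        path = f"{prefix}_{topic_id}.png"
        wc.to_file(path)
        paths.append(path)
    return paths
```

With a fitted model, save_wordclouds(topic_model) would then write topic_0.png, topic_1.png, and so on, one file per topic the model found.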

@rsg-iik

rsg-iik commented Feb 16, 2023

@MaartenGr Thanks for letting me know!
I really appreciate your prompt response. I look forward to using the new version.
