Does GPU help? #31
Apologies, I managed to try it on a GPU-enabled cloud server and it was significantly faster.
Yes! Using a GPU is highly recommended to speed up inference at the sentence-transformers stage. However, if you do not have a GPU available to you, you can actually use TF-IDF instead, since BERTopic also accepts precomputed embeddings:

```python
from bertopic import BERTopic
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer

# Create TF-IDF sparse matrix
docs = fetch_20newsgroups(subset='all', remove=('headers', 'footers', 'quotes'))['data']
vectorizer = TfidfVectorizer(min_df=5)
embeddings = vectorizer.fit_transform(docs)

# Run BERTopic with the precomputed embeddings
model = BERTopic(allow_st_model=True)
topics, probabilities = model.fit_transform(docs, embeddings)
```

Note that I used the parameter `allow_st_model=True`.

EDIT: I did not see your response before posting, but I will leave this up here for those who are interested in other embedding methods.
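As a quick sanity check independent of BERTopic, the precomputed embedding matrix must have one row per document, since `fit_transform(docs, embeddings)` aligns them positionally. A minimal sketch with a tiny hypothetical corpus (leaving `min_df` at its default, since `min_df=5` would empty the vocabulary on so few documents):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical sample corpus, just to illustrate the row-per-document contract
docs = [
    "the gpu speeds up transformer inference",
    "tf-idf is a sparse alternative embedding",
    "topic modeling groups similar documents",
    "sentence transformers produce dense embeddings",
]

vectorizer = TfidfVectorizer()
embeddings = vectorizer.fit_transform(docs)

# Each document maps to exactly one row of the sparse matrix
assert embeddings.shape[0] == len(docs)
print(embeddings.shape[0])  # → 4
```

If the row count and document count disagree, BERTopic cannot pair documents with their embeddings, so checking this before a long fit can save time.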
Thanks @MaartenGr! This was very useful.
Hi, firstly, thank you so much for this library. I've tried it, and it does take some time to get the topics.
Just wondering, would having a GPU help speed-wise? Is the speed bottlenecked at the sentence-transformers embedding portion?