
Curious about why the tmt.reduce() method is faster than BERTopic's original UMAP method? #23

Open
xiongot opened this issue Jan 15, 2024 · 0 comments


xiongot commented Jan 15, 2024

For the same documents, dimensionality reduction in BERTopic took 1.5 hours, but tmt.reduce() took only a little over 10 minutes.

The following is the output of tmt.reduce():

UMAP(angular_rp_forest=True, metric='cosine', min_dist=0.0, n_components=5, n_neighbors=5, random_state=473921, verbose=2)
Wed Jan 10 00:40:47 2024 Construct fuzzy simplicial set
Wed Jan 10 00:40:48 2024 Finding Nearest Neighbors
Wed Jan 10 00:40:48 2024 Building RP forest with 37 trees
Wed Jan 10 00:41:01 2024 NN descent for 19 iterations
	 1  /  19
	 2  /  19
	 3  /  19
	 4  /  19
	Stopping threshold met -- exiting after 4 iterations
Wed Jan 10 00:41:30 2024 Finished Nearest Neighbor Search
Wed Jan 10 00:41:34 2024 Construct embedding
Epochs completed:   0%|            0/200 [00:09]
	completed  0  /  200 epochs
	completed  20  /  200 epochs
	completed  40  /  200 epochs
	completed  60  /  200 epochs
	completed  80  /  200 epochs
	completed  100  /  200 epochs
	completed  120  /  200 epochs
	completed  140  /  200 epochs
	completed  160  /  200 epochs
	completed  180  /  200 epochs
Wed Jan 10 00:54:18 2024 Finished embedding
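
To see where the time goes in this run, the stage timestamps from the log can be subtracted from one another. This is a minimal sketch that assumes the timestamps are the ones shown in the log; the `LOG` list and the `stage_durations` helper are illustrative names and are not part of tmt, BERTopic, or UMAP.

```python
from datetime import datetime

# Stage timestamps copied from the verbose UMAP log above.
# Assumption: the first line's date is read as Jan 10, matching every other line.
LOG = [
    ("Wed Jan 10 00:40:47 2024", "Construct fuzzy simplicial set"),
    ("Wed Jan 10 00:41:30 2024", "Finished Nearest Neighbor Search"),
    ("Wed Jan 10 00:41:34 2024", "Construct embedding"),
    ("Wed Jan 10 00:54:18 2024", "Finished embedding"),
]


def stage_durations(log):
    """Return (stage label, seconds until the next logged stage) pairs."""
    times = [datetime.strptime(ts, "%a %b %d %H:%M:%S %Y") for ts, _ in log]
    return [
        (log[i][1], (times[i + 1] - times[i]).total_seconds())
        for i in range(len(log) - 1)
    ]


durations = stage_durations(LOG)
total = sum(sec for _, sec in durations)

for label, sec in durations:
    print(f"{label}: {sec:.0f} s ({sec / total:.0%} of total)")
print(f"Total: {total:.0f} s")
```

On these timestamps the whole run takes 811 s (about 13.5 minutes), of which 764 s falls between "Construct embedding" and "Finished embedding". The nearest-neighbor search is a small share of the total, so a comparison with the BERTopic run would need the same verbose log from that side.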
