What is ClustrLab2k13?

ClustrLab2k13 is a powerful Python-based tool for clustering text, built using Streamlit.

How does it work?

The tool combines Google's Universal Sentence Encoder with openTSNE, a fast implementation of t-SNE. It accepts plain text files or CSV files with a single column of text. Given a plain text file, it uses sentence embedding similarity to group sentences into what we can call "pseudo paragraphs." If you prefer to avoid this grouping, use the CSV mode. All data, including the text, embeddings, and t-SNE output, can be downloaded. Much of the code for this tool is derived from my previous repository, 'Feed Visualizer'.
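The grouping idea can be illustrated with a minimal sketch. This is not the code from app.py or utils.py: the function names, the rule of merging only adjacent sentences, and the 0.5 threshold are all assumptions made for illustration, and the two-dimensional vectors stand in for real Universal Sentence Encoder embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def group_into_pseudo_paragraphs(sentences, embeddings, threshold=0.5):
    """Merge each sentence into the previous group when it is similar enough
    to the sentence before it; otherwise start a new group.

    The threshold and the adjacent-sentence rule are hypothetical choices.
    """
    if len(sentences) != len(embeddings):
        raise ValueError("need exactly one embedding per sentence")
    groups = []
    for i, sentence in enumerate(sentences):
        similar_to_previous = (
            i > 0
            and cosine_similarity(embeddings[i - 1], embeddings[i]) >= threshold
        )
        if similar_to_previous:
            groups[-1].append(sentence)
        else:
            groups.append([sentence])
    return [" ".join(group) for group in groups]


# Toy vectors standing in for real sentence embeddings.
sentences = ["Cats purr.", "Kittens meow.", "Stocks fell."]
toy_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.0, 1.0]]

paragraphs = group_into_pseudo_paragraphs(sentences, toy_embeddings)
print(paragraphs)
```

Here the first two sentences have nearly parallel vectors and end up in one pseudo paragraph, while the third starts a new one.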

How to run?

streamlit run app.py

How to use?

Context-based help is available for each of the options. I won't bore 🥱 you by writing a manual here; instead, explore the tool and let it guide you.

How to see full-screen charts?

Each chart has a button that toggles full-screen view.

What does the 'use zero-shot embedding' option do?

Instead of relying on Google's Universal Sentence Encoder, the 'use zero-shot embedding' option uses Hugging Face's zero-shot classification to generate embeddings from the labels you provide. Each sentence is scored against every label, and the scores become its embedding. For example, if you assign the labels "positive, negative, neutral," the resulting embedding for a sentence could look like "0.3, 0.4, 0.3".
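As a rough sketch of how label scores can become an embedding, the snippet below reorders the output of a zero-shot classifier into a fixed label order. It is an illustration, not the tool's actual code: `scores_to_embedding` and `zero_shot_embed` are hypothetical names, the default model chosen by the `transformers` pipeline is an assumption, and the result dictionary is mocked so the example runs without downloading a model.

```python
def scores_to_embedding(result, labels):
    """Return the scores in the order of `labels`.

    The zero-shot pipeline returns its labels sorted by descending score,
    so the scores are reordered to keep each dimension tied to one label.
    """
    score_by_label = dict(zip(result["labels"], result["scores"]))
    return [score_by_label[label] for label in labels]


def zero_shot_embed(texts, labels):
    """Embed each text as its zero-shot scores (needs `transformers` installed).

    Hypothetical helper: it downloads a default model and is slow on a CPU.
    """
    from transformers import pipeline  # imported lazily; not needed for the demo below

    classifier = pipeline("zero-shot-classification")
    return [
        scores_to_embedding(classifier(text, candidate_labels=labels), labels)
        for text in texts
    ]


labels = ["positive", "negative", "neutral"]

# Mocked result in the shape the pipeline returns (values are made up).
mock_result = {
    "sequence": "It was fine.",
    "labels": ["negative", "positive", "neutral"],
    "scores": [0.4, 0.3, 0.3],
}

embedding = scores_to_embedding(mock_result, labels)
print(embedding)  # [0.3, 0.4, 0.3]
```

Keeping the label order fixed matters because clustering compares embeddings dimension by dimension.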

Note: Be careful with this option unless you have a GPU, since zero-shot classification can be slow on a CPU. The feature has not yet been tested with a GPU on large datasets.

