ViTopic: Topic Modeling on Images with Clustering of Vision Transformer Embeddings

Image data is essential in many fields, but collecting and labeling it is time-consuming and difficult. This project aims to develop an automated system that clusters and describes unlabeled images so that analysts can process visual data quickly and accurately. The primary objective is a tool that groups images into clusters and generates text-based topics for each cluster, which should make image processing more efficient, reduce labor costs, and simplify the analysis of visual data.

How to Use It

Coming Soon...

How It Works

Below are the general steps to generate context-guided visual topics:

1. Generate vision embeddings and image captions using pre-trained models.
2. Generate a pair-wise similarity matrix using the vision embeddings and some similarity function. This matrix can be used for visual similarity search.
3. Assign clusters to the vision embeddings using a clustering algorithm.
4. Use the cluster information and the image captions to generate a Class-based Term-Frequency Inverse-Document-Frequency (c-TF-IDF) matrix.
5. Extract frequent words in each cluster from the c-TF-IDF matrix.
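
As a rough illustration of the second, fourth, and fifth steps above, here is a minimal standard-library Python sketch. It is not the project's implementation. The embeddings, captions, and cluster labels are hand-written placeholders standing in for the outputs of the pre-trained models and the clustering algorithm. The similarity function is assumed to be cosine similarity. The c-TF-IDF weighting follows one common formulation (term frequency within a cluster, times log(1 + average words per cluster / total term frequency)), which may differ from the variant used in the report. The function names and the small stop-word list are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Hypothetical stop-word list, used only to keep the toy example readable.
STOPWORDS = {"a", "at", "on", "the", "with"}


def cosine_similarity_matrix(embeddings):
    """Pair-wise cosine similarity between embedding vectors."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in embeddings]
    return [
        [sum(a * b for a, b in zip(u, v)) / (nu * nv)
         for v, nv in zip(embeddings, norms)]
        for u, nu in zip(embeddings, norms)
    ]


def c_tf_idf(captions, labels):
    """Class-based TF-IDF scores, treating each cluster as one document."""
    # Pool the words of all captions that share a cluster label.
    class_counts = defaultdict(Counter)
    for caption, label in zip(captions, labels):
        words = [w for w in caption.lower().split() if w not in STOPWORDS]
        class_counts[label].update(words)

    # Term frequency across all clusters and average words per cluster.
    total_counts = Counter()
    for counts in class_counts.values():
        total_counts.update(counts)
    avg_words = sum(total_counts.values()) / len(class_counts)

    scores = {}
    for label, counts in class_counts.items():
        n_words = sum(counts.values())
        scores[label] = {
            term: (tf / n_words) * math.log(1 + avg_words / total_counts[term])
            for term, tf in counts.items()
        }
    return scores


def top_terms(scores, k=2):
    """Highest-scoring terms per cluster; ties are broken alphabetically."""
    return {
        label: [term for term, _ in
                sorted(term_scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]]
        for label, term_scores in scores.items()
    }


# Placeholder inputs (assumptions, not real model outputs).
embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
captions = [
    "a dog running on grass",
    "a dog playing with a ball",
    "a city street at night",
    "a busy city street with cars",
]
labels = [0, 0, 1, 1]  # would come from the clustering step

similarity = cosine_similarity_matrix(embeddings)
topics = top_terms(c_tf_idf(captions, labels))
print(topics)
```

Each row of `similarity` can be sorted to find the images most similar to a given image, and `topics` maps each cluster label to its highest-scoring caption words.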

For more information on the methods, results, and evaluation, read the project's report in the ./docs/ViTopic – Report.pdf file.

Final Project for ITCS-5156 @ UNC Charlotte
