Srijan2001/NLP

NLP

The overall goal of this task is to take a dataset that already has some labels and see whether it can be organised in an unsupervised way. If it can, the same approach could reduce the labeling effort needed for another task. The job, then, is to find ways to keep the human labeling effort as low as possible.

The datasets used in this task are: 1) MIT Movie Corpus 2) MIT Restaurant Corpus

The datasets have been analysed, and statistics such as the distribution of each tag and the number of samples have been calculated.
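As an illustration of this kind of analysis, here is a minimal sketch that counts sentences, tokens and entity tags. It assumes the data is in a BIO layout with one token per line (tag first, then the word, separated by whitespace) and a blank line between sentences. The function name `tag_statistics` and the sample queries are hypothetical and are not taken from the repository.

```python
from collections import Counter


def tag_statistics(bio_text):
    """Count sentences, tokens and entities in BIO-formatted text.

    Assumed layout: one token per line as "<tag> <word>", with a blank
    line separating sentences.
    """
    tag_counts = Counter()
    num_sentences = 0
    num_tokens = 0
    in_sentence = False
    for line in bio_text.splitlines():
        line = line.strip()
        if not line:
            # A blank line closes the current sentence.
            in_sentence = False
            continue
        if not in_sentence:
            num_sentences += 1
            in_sentence = True
        tag = line.split()[0]
        num_tokens += 1
        # Count each entity once, at its B- tag; skip O and I- tags.
        if tag.startswith("B-"):
            tag_counts[tag[2:]] += 1
    return {"sentences": num_sentences, "tokens": num_tokens, "tags": tag_counts}


# Made-up sample in the assumed layout (illustrative only).
sample = (
    "O\twhat\n"
    "O\tmovies\n"
    "O\tstar\n"
    "B-ACTOR\tbruce\n"
    "I-ACTOR\twillis\n"
    "\n"
    "O\tshow\n"
    "O\tme\n"
    "B-GENRE\tcomedy\n"
    "O\tfilms\n"
    "O\tfrom\n"
    "B-YEAR\t1995\n"
)
stats = tag_statistics(sample)
print(stats["sentences"], stats["tokens"], dict(stats["tags"]))
```

Counting only `B-` tags gives the number of entities rather than the number of tagged tokens. Counting every non-`O` tag instead would give the token-level distribution.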

After this, the data has been grouped into clusters of similar intent in two ways:

  1. K-Means clustering
  2. Sentence Transformers based clustering
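To make the clustering step concrete, here is a minimal K-Means sketch in plain Python. It is not the repository's implementation. The word-count features, the farthest-first initialisation, the function names and the sample queries are all assumptions chosen so that the example runs without external libraries.

```python
import math


def bag_of_words(sentences):
    """Turn each sentence into a word-count vector over a shared vocabulary."""
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    vectors = []
    for s in sentences:
        vec = [0.0] * len(vocab)
        for w in s.lower().split():
            vec[index[w]] += 1.0
        vectors.append(vec)
    return vectors


def kmeans(vectors, k, max_iterations=20):
    """Lloyd's algorithm; returns one cluster id per input vector."""
    # Assumed initialisation: start from the first vector, then repeatedly
    # add the vector farthest from the centroids chosen so far.
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        farthest = max(vectors, key=lambda v: min(math.dist(v, c) for c in centroids))
        centroids.append(list(farthest))

    labels = None
    for _ in range(max_iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c])) for v in vectors
        ]
        if new_labels == labels:
            break  # Assignments stopped changing, so the run has converged.
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Made-up queries in the style of the two corpora (illustrative only).
queries = [
    "show me action movies",
    "show me comedy movies",
    "find a cheap restaurant",
    "find a nearby restaurant",
]
labels = kmeans(bag_of_words(queries), k=2)
print(labels)
```

The `kmeans` function only needs a list of equal-length numeric vectors. For the second approach, dense sentence embeddings could presumably be passed in place of the word counts. Producing those embeddings requires an external model and is not shown here.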
