syedhadi816/Similarity-Measurement-in-Text-Data
This folder contains three files: data50.csv, label.csv, and group.csv. data50.csv holds a sparse representation of the bags of words, with each row containing three fields: articleId, wordId, and count. To find out which group an article belongs to, use label.csv, where line i contains the groupId for articleId i. The group names are in group.csv, where line i contains the name of group i.
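The file layout described above could be read roughly as follows. This is a minimal sketch, not the repository's own loader: it assumes the files are comma-separated with no header row and that articleId and groupId are 1-based (so line i maps to ID i). The function names `load_bags`, `load_lines`, and `group_name` are hypothetical, and the inline sample rows and group names are made up to stand in for the real files.

```python
import csv
import io
from collections import defaultdict


def load_bags(rows):
    """Build {articleId: {wordId: count}} from (articleId, wordId, count) rows."""
    bags = defaultdict(dict)
    for article_id, word_id, count in rows:
        bags[int(article_id)][int(word_id)] = int(count)
    return dict(bags)


def load_lines(handle):
    """Return stripped non-empty lines; index i - 1 holds the value for ID i."""
    return [line.strip() for line in handle if line.strip()]


def group_name(article_id, labels, groups):
    """Look up an article's group name (assumes 1-based IDs)."""
    group_id = int(labels[article_id - 1])
    return groups[group_id - 1]


# Made-up sample data standing in for data50.csv, label.csv and group.csv.
data_csv = io.StringIO("1,3,2\n1,7,1\n2,3,4\n")
label_csv = io.StringIO("2\n1\n")
group_csv = io.StringIO("sports\npolitics\n")

bags = load_bags(csv.reader(data_csv))
labels = load_lines(label_csv)
groups = load_lines(group_csv)
```

To use the real files, replace each `io.StringIO(...)` with `open("data50.csv", newline="")` and so on. Storing each article as a dictionary keeps the representation sparse, since only words that actually occur in the article take up space.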
About
Measuring similarity in textual data (Bag of Words model) using Jaccard distance, Cosine similarity and Euclidean distance.
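As an illustration, the three measures could be computed on sparse bags of words like this. It is a sketch under stated assumptions rather than the repository's implementation: each article is assumed to be a dictionary mapping wordId to count, and the Jaccard distance here is the set-based variant, which ignores counts and compares only which words appear. The function names are hypothetical.

```python
import math


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over the sets of words present (counts ignored)."""
    words_a, words_b = set(a), set(b)
    union = words_a | words_b
    if not union:
        return 0.0  # two empty articles are treated as identical
    return 1.0 - len(words_a & words_b) / len(union)


def cosine_similarity(a, b):
    """Dot product of the count vectors divided by the product of their norms."""
    dot = sum(count * b.get(word, 0) for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # similarity with an empty article is defined as 0 here
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between count vectors; missing words count as 0."""
    words = set(a) | set(b)
    return math.sqrt(sum((a.get(w, 0) - b.get(w, 0)) ** 2 for w in words))
```

Note that the first and last functions return distances (0 means identical), while cosine similarity runs the other way (1 means the count vectors point in the same direction), so the three values need to be put on a common footing before they are compared.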