homasms/nlp_project

NLP-Based Comparative Analysis of Two Maulana Books

Overview:
This project analyzes the poetry of Maulana Rumi by comparing two of his best-known books, the Masnavi and the Divan-e Shams. It uses NLP tools and methods such as word counts, frequency rankings, and TF-IDF to surface patterns in the vocabulary of each book.

Data Gathering:

  • To gather data, run crawling.py.
  • To obtain data for Masnavi, use the -m argument.
  • To collect data for Divan-e Shams, use the -s argument.
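
The source of crawling.py is not shown here, so the following is only a minimal sketch of how a script could expose the -m and -s flags described above. The function names parse_args and selected_books, the book labels, and the choice to process both books when no flag is given are assumptions for illustration, not the project's actual code.

```python
import argparse


def parse_args(argv=None):
    """Parse the two book-selection flags (-m and -s)."""
    parser = argparse.ArgumentParser(
        description="Gather poem data for one or both books."
    )
    parser.add_argument("-m", action="store_true", help="gather data for Masnavi")
    parser.add_argument("-s", action="store_true", help="gather data for Divan-e Shams")
    return parser.parse_args(argv)


def selected_books(args):
    """Return the books to process.

    Assumption: when neither flag is given, both books are processed.
    """
    books = []
    if args.m:
        books.append("masnavi")
    if args.s:
        books.append("divan-e-shams")
    return books or ["masnavi", "divan-e-shams"]


if __name__ == "__main__":
    print(selected_books(parse_args()))
```

With this layout, `python crawling.py -m` would select only Masnavi and `python crawling.py -s` only Divan-e Shams.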

Word Breaking:
Run with no arguments to process both books.

  • Use -m argument to break Masnavi into its constituent words.
  • Use -s argument to break Divan-e Shams into its constituent words.
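
As a rough illustration of the word-breaking step, the sketch below splits text into words with a regular expression. The name break_words and the tokenization rule are assumptions; the project may use a dedicated Persian NLP library or different rules. The pattern keeps the zero-width non-joiner (U+200C) inside a word, since Persian uses it within compound forms.

```python
import re

# A word is a run of letters; runs joined by the zero-width non-joiner
# (U+200C) are kept together as a single word. Digits, underscores, and
# punctuation act as separators.
WORD_RE = re.compile(r"[^\W\d_]+(?:\u200c[^\W\d_]+)*")


def break_words(text):
    """Return the list of words found in text, in order of appearance."""
    return WORD_RE.findall(text)


if __name__ == "__main__":
    # Opening half-line of the Masnavi, used only as sample input.
    for word in break_words("بشنو از نی چون حکایت می\u200cکند"):
        print(word)
```

A regular expression is the simplest option that needs no extra dependencies; it does not normalize character variants or handle diacritics, which a fuller preprocessing pipeline would likely need.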

Statistics:
Run with no arguments to execute every analysis listed below.
Use the following arguments for specific statistical analyses:

  • -b: Number of units
  • -w: Number of words
  • -u: Number of unique words in each dataset
  • -c: Number of common words
  • -r: 10 most frequent words in each dataset
  • -rnf: 10 words chosen by RNF from each dataset
  • -tf: 10 words chosen by TF-IDF from each dataset
  • -hist: Plot word counts for the 100 most frequent words in each dataset
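
Several of the statistics above can be sketched with the standard library alone. The example below covers the word count (-w), unique words (-u), common words (-c), most frequent words (-r), and a plain TF-IDF ranking (-tf). The function names and the exact TF-IDF weighting (term frequency times log(N / document frequency), with each book treated as one document) are assumptions, since the project's own formulas are not stated.

```python
import math
from collections import Counter


def basic_stats(words_a, words_b, top_n=10):
    """Compute word, unique-word, and common-word counts plus top words."""
    count_a, count_b = Counter(words_a), Counter(words_b)
    return {
        "words": (len(words_a), len(words_b)),
        "unique": (len(count_a), len(count_b)),
        "common": len(count_a.keys() & count_b.keys()),
        "top": (count_a.most_common(top_n), count_b.most_common(top_n)),
    }


def top_tfidf(docs, top_n=10):
    """Rank words per dataset by TF-IDF.

    docs maps a dataset name to its list of words.
    """
    counts = {name: Counter(words) for name, words in docs.items()}
    n_docs = len(counts)
    # Document frequency: in how many datasets each word appears.
    doc_freq = Counter(word for c in counts.values() for word in c)
    result = {}
    for name, c in counts.items():
        total = sum(c.values())
        scores = {
            w: (n / total) * math.log(n_docs / doc_freq[w]) for w, n in c.items()
        }
        result[name] = sorted(scores, key=scores.get, reverse=True)[:top_n]
    return result


if __name__ == "__main__":
    # Toy word lists standing in for the two books.
    masnavi = ["ney", "ney", "love", "sun"]
    shams = ["love", "love", "sun", "moon"]
    print(basic_stats(masnavi, shams, top_n=2))
    print(top_tfidf({"masnavi": masnavi, "shams": shams}, top_n=2))
```

With only two datasets, this unsmoothed formula gives a score of zero to every word that appears in both books, so the ranking highlights vocabulary distinctive to each one. A smoothed variant would be needed to rank shared words as well.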

Feel free to explore and contribute to this project for a deeper understanding of Maulana Rumi's poetic expressions.

About

Preprocessing and several NLP tasks on Maulana's books
