NLP-Based Comparative Analysis of Two Maulana Books
Overview:
This project analyzes the poetry of Maulana Rumi by comparing two of his best-known books, Masnavi and Divan-e Shams. The analysis uses NLP tools and methods to uncover patterns and nuances within the poetry.
Data Gathering:
- To gather data, run crawling.py.
- To obtain data for Masnavi, use the -m argument.
- To collect data for Divan-e Shams, use the -s argument.
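The flag handling described above could be sketched as follows. This is only an illustration of how -m and -s might be parsed with argparse; the function names build_parser and selected_books and the labels "masnavi" and "shams" are hypothetical and are not taken from crawling.py.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the -m / -s flags listed above.
    parser = argparse.ArgumentParser(description="Choose which book to gather.")
    parser.add_argument("-m", action="store_true", help="gather data for Masnavi")
    parser.add_argument("-s", action="store_true", help="gather data for Divan-e Shams")
    return parser


def selected_books(argv):
    # Return the (assumed) book labels requested on the command line.
    args = build_parser().parse_args(argv)
    books = []
    if args.m:
        books.append("masnavi")
    if args.s:
        books.append("shams")
    return books


if __name__ == "__main__":
    import sys
    print(selected_books(sys.argv[1:]))
```

In this sketch both flags can be combined, and passing no flag selects nothing; the real script may behave differently.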
Word Breaking:
Run without arguments to execute the entire code.
- Use -m argument to break Masnavi into its constituent words.
- Use -s argument to break Divan-e Shams into its constituent words.
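As a rough idea of what breaking a verse into words can look like, here is a minimal regex-based sketch. It is an assumption, not the project's actual method: the function name break_into_words is hypothetical, and the choice to keep the zero-width non-joiner (U+200C) inside a word is a common convention for Persian text rather than something stated in this README.

```python
import re

# A word is a run of Unicode letters; a zero-width non-joiner (U+200C)
# between two letter runs is kept so compound Persian forms stay together.
# Digits, underscores, and punctuation are treated as separators.
WORD_PATTERN = re.compile(r"[^\W\d_]+(?:\u200c[^\W\d_]+)*")


def break_into_words(text):
    # Return the words of `text` in their original order.
    return WORD_PATTERN.findall(text)


if __name__ == "__main__":
    verse = "بشنو این نی، چون شکایت می\u200cکند"
    print(break_into_words(verse))
```

Diacritics (combining marks) are not letters under this pattern, so text that carries them would need to be normalized first.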
Statistics:
Run without arguments to execute the entire code.
Use the following arguments to run a specific statistical analysis:
- -b: Number of units
- -w: Number of words
- -u: Number of unique words in each dataset
- -c: Number of common words
- -r: 10 most frequently used words in each dataset
- -rnf: 10 words chosen by RNF from each dataset
- -tf: 10 words chosen by TF-IDF from each dataset
- -hist: Plot word counts for the top 100 words in each dataset
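Several of the statistics above (word counts, unique words, common words, most used words, and TF-IDF) can be illustrated with a short sketch over two token lists. The function names basic_stats and tf_idf are hypothetical, and the TF-IDF weighting shown here (term frequency times a smoothed inverse document frequency, ln((1 + N) / (1 + df)) + 1) is one common variant chosen for illustration; the project may use a different formula.

```python
import math
from collections import Counter


def basic_stats(words_a, words_b, top_k=10):
    # Word totals, unique-word totals, shared vocabulary size, and top-k words.
    counts_a, counts_b = Counter(words_a), Counter(words_b)
    return {
        "words": (len(words_a), len(words_b)),
        "unique": (len(counts_a), len(counts_b)),
        "common": len(counts_a.keys() & counts_b.keys()),
        "top": (counts_a.most_common(top_k), counts_b.most_common(top_k)),
    }


def tf_idf(docs):
    # docs maps a dataset name to its list of tokens.
    # Returns a mapping: dataset name -> {word: TF-IDF score}.
    n_docs = len(docs)
    counts = {name: Counter(tokens) for name, tokens in docs.items()}
    doc_freq = Counter()
    for word_counts in counts.values():
        doc_freq.update(word_counts.keys())  # each dataset counts a word once
    scores = {}
    for name, word_counts in counts.items():
        total = sum(word_counts.values())
        scores[name] = {
            word: (n / total) * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1)
            for word, n in word_counts.items()
        }
    return scores


if __name__ == "__main__":
    # Toy token lists standing in for the two datasets.
    a = ["love", "love", "sun", "reed"]
    b = ["love", "wine", "wine"]
    print(basic_stats(a, b, top_k=1))
    ranked = sorted(tf_idf({"a": a, "b": b})["b"].items(), key=lambda kv: -kv[1])
    print(ranked)
```

With only two datasets, an unsmoothed IDF would give every shared word a score of zero, which is why the smoothed form is used in this sketch.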
Feel free to explore and contribute to this project for a deeper understanding of Maulana Rumi's poetic expressions.