This project covers the classification and generation of poems, along with web scraping to build our own dataset. It is divided into several components, each using different technologies and frameworks.
- First dataset for generation: Kaggle - Poetry Foundation Poems
- Second dataset for generation: Kaggle - Complete Poetryfoundationorg Dataset
- Kaggle dataset for generation: Kaggle - Poem Classification NLP
- Our first dataset for classification (144 possible classes): Kaggle - Poems Dataset NLP (topics part)
- Creation of our own dataset for classification (5 possible classes): Kaggle - Poems Classification Dataset
- Poetry Foundation Terms of Service for Robots: Poetry Foundation Robots.txt
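Because part of the data comes from scraping, it can help to check a site's robots.txt rules before fetching any page. The sketch below shows one way to do that with Python's standard `urllib.robotparser`. The rules in `EXAMPLE_RULES`, the `example.org` URLs, the `poem-scraper` user agent, and the helper names are all hypothetical and chosen for illustration; they are not the actual Poetry Foundation rules or the scraper used in this project.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; NOT the real Poetry Foundation robots.txt.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /admin/
Crawl-delay: 2
"""


def build_parser(robots_text: str) -> RobotFileParser:
    """Parse robots.txt content that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def allowed(parser: RobotFileParser, url: str, user_agent: str = "poem-scraper") -> bool:
    """Return True if the parsed rules allow `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(EXAMPLE_RULES)
print(allowed(parser, "https://example.org/poems/12345"))
print(allowed(parser, "https://example.org/admin/login"))
print(parser.crawl_delay("poem-scraper"))
```

In a real scraper, the same parser could be pointed at the live file with `set_url(...)` followed by `read()`, and the crawl delay, when one is declared, could be used as the pause between requests.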
We built our own classification dataset by scraping the Poetry Foundation website. It covers five topics: nature, art & sciences, love, relationships, and religion, and the classes are fairly evenly distributed.
See: Kaggle Dataset
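One way to quantify how evenly the five topics are represented is to compute each class's share of the dataset and the ratio between the largest and smallest class. The sketch below uses made-up labels (`toy_labels`) and hypothetical helper names; it does not reflect the real counts in the dataset.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of samples that belongs to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def imbalance_ratio(labels):
    """Return largest class size divided by smallest class size (1.0 = perfectly balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Made-up labels for illustration; NOT the real distribution of the dataset.
toy_labels = (
    ["nature"] * 3
    + ["love"] * 2
    + ["religion"] * 2
    + ["relationships"] * 2
    + ["art & sciences"] * 1
)

print(class_shares(toy_labels))
print(imbalance_ratio(toy_labels))
```

A ratio close to 1 would indicate a balanced dataset, in which case plain accuracy is a reasonable first metric; a larger ratio would suggest also reporting per-class metrics.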
src
├── classification
│   ├── FNN
│   ├── Logistic Regression & Naive Bayes
│   ├── RNN / LSTM
│   ├── Transformers
│   └── XGBoost
└── generation
    ├── Ngram
    ├── Transformers
    └── RNN
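To give an idea of what the simplest generation approach in the tree above involves, here is a minimal word-level bigram (n-gram with n = 2) generator in plain Python. It is an illustrative sketch under simple assumptions (whitespace tokenization, no smoothing), not the code found in the `Ngram` folder; the function names and the toy text are hypothetical.

```python
import random
from collections import defaultdict


def train_bigram(text):
    """Map each word to the list of words that follow it in the training text."""
    tokens = text.lower().split()
    model = defaultdict(list)
    for current, following in zip(tokens, tokens[1:]):
        model[current].append(following)
    return dict(model)


def generate(model, start, length=10, seed=0):
    """Generate up to `length` words by repeatedly sampling an observed follower."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    word = start
    output = [word]
    for _ in range(length - 1):
        followers = model.get(word)
        if not followers:  # dead end: the word was never followed by anything
            break
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)


# Toy training text for illustration only.
toy_model = train_bigram("the rose is red the rose is sweet")
print(generate(toy_model, "the", length=6))
```

Because followers are stored with repetition, sampling uniformly from the list reproduces the observed bigram frequencies. Larger values of n would key the model on tuples of preceding words instead of a single word.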
- angelo.eap
- valentin.san
- christophe.nguyen
- alexandre.devaux-riviere
- paul.duhot
- mael.reynaud