andreamorgar/poesIA

poesIA: a poetry generator in Spanish 📚📚💻

A poetry generator built from a scraped corpus of Spanish poetry. EDA and general NLP tasks are included.

Poem generator 📚🤯

Here is a visualization of the generator in action, built with Streamlit. Given some words, it generates a beautiful poem in Spanish ✨✨✨✨

wordcloud

Exploratory Data Analysis 🔎🔎

We generated an overview of the whole dataset and analyzed the scope and size of the vocabulary involved, producing some nice visualizations ☁️☁️☁️

wordcloud

We also computed word counts and searched for relations between authors and poems across the whole dataset 📈

Author count
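The word and author counts described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the `poems` list, the author labels, and the `tokenize` helper are hypothetical stand-ins for the scraped dataset and its preprocessing.

```python
import re
from collections import Counter

# Hypothetical toy corpus: (author, poem text) pairs standing in for the scraped dataset.
poems = [
    ("Autor A", "La luna mira la noche"),
    ("Autor A", "La noche canta sola"),
    ("Autor B", "El mar es la luna"),
]


def tokenize(text):
    """Lowercase the text and keep runs of letters, including accented Spanish ones."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())


# Word frequencies over the whole corpus.
word_counts = Counter(token for _, text in poems for token in tokenize(text))

# Number of poems per author.
poems_per_author = Counter(author for author, _ in poems)

# Vocabulary size is the number of distinct tokens.
vocabulary_size = len(word_counts)

print(vocabulary_size)
print(word_counts.most_common(3))
print(poems_per_author.most_common())
```

Counts like these are also the usual input for a word cloud or a bar chart of poems per author.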

We also looked at specific authors and compared them. We detected relations in the text such as antithesis and polysemy. Awesome, isn't it? 🤩

graph2

An embedding model was built to detect polysemy, similar words, and common word collocations in poetry. So many word relations in poems!

wordcloud.
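To give a feel for how similar words and collocations can be found, here is a small sketch that uses plain co-occurrence counts and cosine similarity instead of a trained embedding model. It is a simplified stand-in, not the project's model: the `lines` data and the helper names `cooccurrence_vectors`, `most_similar`, and `collocations` are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt

# Hypothetical tokenized verses; the real project works on the scraped corpus.
lines = [
    ["la", "luna", "blanca", "brilla"],
    ["la", "luna", "fría", "brilla"],
    ["el", "sol", "dorado", "quema"],
    ["el", "sol", "rojo", "quema"],
]


def cooccurrence_vectors(lines, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in lines:
        for i, word in enumerate(tokens):
            start, stop = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, stop):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(word, vectors, topn=3):
    """Rank the other words by how similar their contexts are to those of `word`."""
    target = vectors.get(word, Counter())
    scores = [(other, cosine(target, vec)) for other, vec in vectors.items() if other != word]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


def collocations(lines, topn=3):
    """Return the most frequent adjacent word pairs (bigrams)."""
    bigrams = Counter((a, b) for tokens in lines for a, b in zip(tokens, tokens[1:]))
    return bigrams.most_common(topn)


vectors = cooccurrence_vectors(lines)
print(most_similar("blanca", vectors, topn=1))
print(collocations(lines, topn=2))
```

Words that appear in the same contexts, such as the two adjectives following "luna" here, end up with similar vectors. A trained embedding model applies the same idea with dense, learned vectors.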

We also made Voronoi graphs 📈📈📈📈

wordcloud.

Relevant code

notebooks

  • [EDA of the poetry dataset](notebooks/data exploration.ipynb): Exploratory Data Analysis of the dataset, including a complete basic NLP task!

  • [Poem generator code](notebooks/poetry generator.ipynb): code to generate synthetic poems with an RNN.
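The generation step in the second notebook follows the usual pattern for text generation: start from seed text, predict a distribution over the next character, sample from it, and repeat. The sketch below shows only that loop. It deliberately swaps the RNN for simple next-character counts so that it runs without a deep learning library, so treat `fit_next_char_counts`, `generate`, and the toy `corpus` as hypothetical illustrations and not the notebook's code.

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy training text; the real model is trained on the scraped corpus.
corpus = "la luna mira la noche y la noche mira el mar"


def fit_next_char_counts(text):
    """Count which character follows each character (a stand-in for a trained RNN)."""
    counts = defaultdict(Counter)
    for current, following in zip(text, text[1:]):
        counts[current][following] += 1
    return counts


def generate(seed, counts, length=40, rng=None):
    """Extend `seed` one character at a time by sampling from the next-character counts."""
    if not seed:
        raise ValueError("seed must contain at least one character")
    rng = rng or random.Random(0)  # fixed seed so the sketch is reproducible
    output = list(seed)
    while len(output) < length:
        options = counts.get(output[-1])
        if not options:  # no known continuation for this character
            break
        chars, weights = zip(*options.items())
        output.append(rng.choices(chars, weights=weights, k=1)[0])
    return "".join(output)


counts = fit_next_char_counts(corpus)
print(generate("la ", counts, length=30))
```

With an RNN, the `counts` lookup would be replaced by a forward pass that returns next-character probabilities conditioned on the whole sequence so far. The sampling loop itself stays the same.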

Talks

This project was presented as a talk at PyConES 2020 (Pandemic Edition). You can find the slides in this repo and the video on YouTube.
