# Dataset: African American Poetry

The [**African American Periodical Poetry** dataset](https://www.responsible-datasets-in-context.com/posts/african-american-periodical-poetry/aa-periodical-poetry.html) contains information about poems published in African American periodicals.
Dataset adapted from: https://www.responsible-datasets-in-context.com/posts/african-american-periodical-poetry/aa-periodical-poetry.html
Key columns in the dataset include:

- **author (first last)**: The full name of the poet.
- **author (last name)**: The last name of the poet.
- **year**: The year when the poem was published.
- **month**: The month when the poem was published.
- **title**: The title of the poem.
- **venue**: The periodical in which the poem was published.
- **form (if known)**: The type of poem (e.g., sonnet, free verse).
- **gender (if known)**: The gender of the poet.
- **themes**: The themes of the poem.
- **text**: The full text of the poem.
- **published in (city)**: The city where the poem was published.


In [15]:
# Read in the dataset
import pandas as pd
aap = pd.read_csv("https://raw.githubusercontent.com/melaniewalsh/responsible-datasets-in-context/main/datasets/aa-periodical-poetry/African-American-Periodical-Poetry_1900-1928-Created-by-Amardeep-Singh-and-Kate-Hennessey,-Lehigh-University.csv")

# Wrangling

In [16]:
# Calculate the length of each poem:
aap['Poem_Length'] = aap['text'].str.len()
aap.head()

Unnamed: 0,title,author (first last),author (last name),text,month,year,venue,edited by,form (if known),gender (if known),themes,second venue,published in (city),Magazine Type,Author Bio.,Poem_Length
0,New Wars,Benjamin Griffith Brawley,Brawley,HURL on the lance! Break up the ancient peace!...,November,1900,Colored American,Walter W. Wallace,Common Measure,male,"Spanish-American War, Empire",,Boston,Predom. Black,https://en.wikipedia.org/wiki/Benjamin_Griffit...,1418
1,A Picture,Olivia Ward Bush-Banks,Bush-Banks,I drew a picture long ago —\nA picture of a su...,June,1900,Colored American,Walter W. Wallace,Common Measure,female,,,Boston,Predom. Black,https://en.wikipedia.org/wiki/Olivia_Ward_Bush...,1391
2,The Christmas Reunion,Augustus M. Hodges,Hodges,"Twas a bright Christmas morning in ""Ole Kentuc...",December,1900,Colored American,Walter W. Wallace,,male,Slavery,,Boston,Predom. Black,https://en.wikipedia.org/wiki/Augustus_M._Hodges,4477
3,A Memorial of Frederick Douglass,C. Henry Holmes,Holmes,"He was a noble hero, born in an humble state,\...",September,1900,Colored American,Walter W. Wallace,Elegy,male,Frederick Douglass,,Boston,Predom. Black,,1136
4,The Negro's Worth,Alonzo Milton Skrine,Skrine,"Who casts a slur on Negro worth, a stain on Ne...",December,1900,Colored American,Walter W. Wallace,,male,"Civil War, Spanish-American War, Labor, Slaver...",,Boston,Predom. Black,https://scalar.lehigh.edu/african-american-poe...,1564
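Note that `str.len()` counts characters, including spaces and newline characters, so `Poem_Length` is a character count rather than a line or word count. The sketch below uses invented toy text (not rows from the dataset) to show how the three measures differ; the column names `n_chars`, `n_words`, and `n_lines` are hypothetical.

```python
import pandas as pd

# Toy data, invented for illustration (not rows from the dataset)
toy = pd.DataFrame({"text": ["one two\nthree", "a b c d", None]})

# Characters, including spaces and "\n" (this is what Poem_Length measures)
toy["n_chars"] = toy["text"].str.len()
# Whitespace-separated tokens
toy["n_words"] = toy["text"].str.split().str.len()
# Number of newline characters plus one
toy["n_lines"] = toy["text"].str.count("\n") + 1

print(toy)
```

A missing `text` value stays missing in all three columns, so poems without a transcription drop out of any length statistic.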


In [17]:
# Explode the dataset by theme: split the comma-separated `themes` string
# into a list, then give each theme its own row.
aap["themes"] = aap["themes"].str.split(", ")
poems_themes = aap.explode('themes')
poems_themes.head()

Unnamed: 0,title,author (first last),author (last name),text,month,year,venue,edited by,form (if known),gender (if known),themes,second venue,published in (city),Magazine Type,Author Bio.,Poem_Length
0,New Wars,Benjamin Griffith Brawley,Brawley,HURL on the lance! Break up the ancient peace!...,November,1900,Colored American,Walter W. Wallace,Common Measure,male,Spanish-American War,,Boston,Predom. Black,https://en.wikipedia.org/wiki/Benjamin_Griffit...,1418
0,New Wars,Benjamin Griffith Brawley,Brawley,HURL on the lance! Break up the ancient peace!...,November,1900,Colored American,Walter W. Wallace,Common Measure,male,Empire,,Boston,Predom. Black,https://en.wikipedia.org/wiki/Benjamin_Griffit...,1418
1,A Picture,Olivia Ward Bush-Banks,Bush-Banks,I drew a picture long ago —\nA picture of a su...,June,1900,Colored American,Walter W. Wallace,Common Measure,female,,,Boston,Predom. Black,https://en.wikipedia.org/wiki/Olivia_Ward_Bush...,1391
2,The Christmas Reunion,Augustus M. Hodges,Hodges,"Twas a bright Christmas morning in ""Ole Kentuc...",December,1900,Colored American,Walter W. Wallace,,male,Slavery,,Boston,Predom. Black,https://en.wikipedia.org/wiki/Augustus_M._Hodges,4477
3,A Memorial of Frederick Douglass,C. Henry Holmes,Holmes,"He was a noble hero, born in an humble state,\...",September,1900,Colored American,Walter W. Wallace,Elegy,male,Frederick Douglass,,Boston,Predom. Black,,1136
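To make the effect of `str.split` followed by `explode` easier to see, here is a minimal sketch on an invented three-row table (the titles and themes are made up). It shows that a poem with two themes becomes two rows that share the same index, while a poem with no theme keeps a single row with a missing value.

```python
import pandas as pd

# Toy data, invented for illustration
toy = pd.DataFrame({
    "title": ["Poem A", "Poem B", "Poem C"],
    "themes": ["War, Empire", None, "Slavery"],
})

# Turn "War, Empire" into ["War", "Empire"]; missing values stay missing
toy["themes"] = toy["themes"].str.split(", ")

# One row per (poem, theme) pair; the original index is repeated
exploded = toy.explode("themes")

print(exploded)
print(len(toy), "poems ->", len(exploded), "rows")
```

Because exploded rows repeat poem-level columns such as `Poem_Length`, per-poem statistics are safer to compute on the un-exploded table.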


# Visualizations

## Histograms

In [18]:
# Visualize: How many times each theme is represented.
import plotly.express as px
fig = px.histogram(poems_themes, x='themes')
fig.show()

In [19]:
# Visualize: Occurrence of themes, grouped by gender (i.e., a grouped bar chart).
fig = px.histogram(poems_themes, y='themes',color = "gender (if known)", barmode = "group", height = 2000)
fig.update_yaxes(categoryorder='total ascending')
fig.show()

Religion, Racism, Race, Progress and Racial Uplift, and Slavery are the top five themes addressed in African American poetry in this period.
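A ranking like this can also be read off numerically instead of from the chart. The sketch below uses an invented Series of theme labels to show how `value_counts` produces the ranking; the values are made up and do not reflect the real counts.

```python
import pandas as pd

# Toy theme labels, invented for illustration
themes = pd.Series(
    ["Religion", "Race", "Religion", "Slavery", "Race", "Religion", None],
    name="themes",
)

# value_counts drops missing values by default and sorts in descending order
counts = themes.value_counts()

print(counts.head(2))
```

On the real data, the equivalent call would be `poems_themes['themes'].value_counts().head(5)`.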

In [20]:
# Visualize: Percentage of each gender within each theme.
fig = px.histogram(poems_themes, y='themes',color = "gender (if known)", barnorm = "percent", height = 2000)
fig.update_yaxes(categoryorder='total ascending')
fig.show()

In most themes, male poets wrote a larger share of the poems than female poets.
Note: this may reflect the composition of the dataset rather than a thematic preference, because the dataset contains more poems by male poets than by female poets.
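One way to account for this imbalance is to compare each theme's gender split with the overall gender split. The sketch below does this on an invented table using `pd.crosstab` with `normalize="index"`, which computes row-wise shares similar to what `barnorm = "percent"` displays; the data and the short column name `gender` are assumptions for illustration.

```python
import pandas as pd

# Toy data, invented for illustration
toy = pd.DataFrame({
    "themes": ["Religion", "Religion", "Religion", "Nature",
               "Nature", "Nature", "Nature", "Religion"],
    "gender": ["male", "male", "female", "male",
               "female", "female", "male", "male"],
})

# Overall share of each gender across all rows (the base rate)
overall = toy["gender"].value_counts(normalize=True)

# Share of each gender within each theme (each row sums to 1)
by_theme = pd.crosstab(toy["themes"], toy["gender"], normalize="index")

print(overall)
print(by_theme)
```

A theme leans toward male poets only if its male share is above the overall male share, not merely above 50%.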

## Violin plots

In [21]:
# Visualize: Poem length grouped by gender
fig = px.violin(aap, y="Poem_Length",
                x="gender (if known)",
                color = "gender (if known)",
                color_discrete_sequence = ["yellow", "green"])
fig.update_traces(opacity=0.5)
fig.show()

The violin plot suggests that, in this dataset, poems by male poets tend to be longer than poems by female poets.
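A numeric summary can back up what the violin plot shows. The sketch below, on invented lengths, compares the count, median, and mean per group; the short column name `gender` and all values are assumptions for illustration.

```python
import pandas as pd

# Toy data, invented for illustration
toy = pd.DataFrame({
    "gender": ["male", "male", "male", "female", "female"],
    "Poem_Length": [1000, 1200, 5000, 900, 1100],
})

# Count, median, and mean poem length for each gender
summary = toy.groupby("gender")["Poem_Length"].agg(["count", "median", "mean"])

print(summary)
```

The median is less sensitive than the mean to a handful of very long poems, so reporting both gives a fairer comparison.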

## Line plot

In [35]:
# Visualize: average `Poem_Length` for every year.
# reset_index() turns the grouped Series into a DataFrame
# with 'year' and 'Poem_Length' columns.
aap_year_length = aap.groupby('year')['Poem_Length'].mean().reset_index()
fig = px.line(
    aap_year_length,
    x='year',
    y='Poem_Length',
    title='Average Poem Length Over Years')
fig.show()


Average poem length peaks in 1910, 1913, and 1915, but has an overall downward trend.
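A yearly average can be driven by very few poems, so it helps to look at the number of poems behind each point. The sketch below, on invented values, computes the mean and the count per year side by side; the output column names `mean_length` and `n_poems` are hypothetical.

```python
import pandas as pd

# Toy data, invented for illustration
toy = pd.DataFrame({
    "year": [1910, 1910, 1911, 1912, 1912, 1912],
    "Poem_Length": [800, 1200, 4000, 900, 1000, 1100],
})

# Mean length and number of poems for each year
yearly = toy.groupby("year")["Poem_Length"].agg(
    mean_length="mean",
    n_poems="count",
)

print(yearly)
```

In this toy table the 1911 "peak" rests on a single poem, which is the kind of case worth checking before interpreting a spike in the real data.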

## Scatter plot

In [50]:
# Poem length by year and venue.
# The keys in `labels` must match the column names exactly ('Poem_Length').
fig = px.scatter(aap, x='year', y='Poem_Length', color='venue',
                 title='Poem Length Over Time',
                 labels={'year': 'Year_Published', 'Poem_Length': 'Poem_Length'})
fig.show()


- Voice of the Negro published many African American poems from 1904 to 1907.
- Colored American Magazine published many African American poems in 1903 and 1909.
- The Crisis published African American poems consistently from 1911 to 1920.
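Observations about which venue published in which years can be checked with a count table. The sketch below builds a venue-by-year table with `pd.crosstab` on invented rows; the venue names are taken from the dataset, but the years and counts are made up.

```python
import pandas as pd

# Toy data, invented for illustration (one row per poem)
toy = pd.DataFrame({
    "year": [1904, 1904, 1905, 1911, 1911, 1912],
    "venue": ["Voice of the Negro", "Voice of the Negro", "Voice of the Negro",
              "The Crisis", "The Crisis", "The Crisis"],
})

# Rows are venues, columns are years, cells are the number of poems
counts = pd.crosstab(toy["venue"], toy["year"])

print(counts)
```

On the real data, the equivalent call would be `pd.crosstab(aap['venue'], aap['year'])`.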

In [23]:
# Poem length over years, faceted by gender.
# Use `aap` (one row per poem) rather than the exploded `poems_themes`,
# which repeats a poem once for each of its themes.
fig = px.scatter(aap, x='year', y='Poem_Length', color='venue', facet_col='gender (if known)',
                 title='Poem Length Over Time by Gender',
                 labels={'year': 'Year_Published', 'Poem_Length': 'Poem_Length'})
fig.show()


More poems by male poets than by female poets were published.