# Exploring Genre Trends in Top Novels

## Imports

In [33]:
import pandas as pd
import altair as alt
import matplotlib.pyplot as plt
import numpy as np

## Reading Data and Cleaning

In [6]:
# Load the top 500 novels dataset (library_top_500.csv), which also carries Goodreads rating columns
df_novels = pd.read_csv("https://raw.githubusercontent.com/melaniewalsh/responsible-datasets-in-context/main/datasets/top-500-novels/library_top_500.csv", sep=',', header=0, low_memory=False)

In [17]:
df_novels.head()  # not rendered here: a notebook cell only displays its last expression
print(df_novels.columns)

Index(['top_500_rank', 'title', 'author', 'pub_year', 'orig_lang', 'genre',
       'author_birth', 'author_death', 'author_gender', 'author_primary_lang',
       'author_nationality', 'author_field_of_activity', 'author_occupation',
       'oclc_holdings', 'oclc_eholdings', 'oclc_total_editions',
       'oclc_holdings_rank', 'oclc_editions_rank', 'gr_avg_rating',
       'gr_num_ratings', 'gr_num_reviews', 'gr_avg_rating_rank',
       'gr_num_ratings_rank', 'oclc_owi', 'author_viaf', 'gr_url', 'wiki_url',
       'pg_eng_url', 'pg_orig_url'],
      dtype='object')


In [36]:
df_novels['genre'].value_counts()

history       53
fantasy       48
romance       33
bildung       27
scifi         21
thrillers     21
mystery       18
action        16
political     15
horror         8
autobio        8
allegories     7
war            4
Name: genre, dtype: int64

In [35]:
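# Missing genres are stored as the string 'na'; convert them to NaN and drop those rows
df_novels['genre'] = df_novels['genre'].replace('na', np.nan)
# .copy() makes the filtered frame independent of the original
df_novels = df_novels.dropna(subset=['genre']).copy()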
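
Before dropping rows, it can help to check how many are affected. The following is a minimal sketch on a small made-up frame, assuming (as in the cell above) that missing genres are encoded as the string `'na'`. The `toy` rows and the names `n_placeholder` and `cleaned` are hypothetical and are not part of the notebook.

```python
import numpy as np
import pandas as pd

# Hypothetical rows for illustration only; not taken from the dataset.
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "genre": ["fantasy", "na", "history", "na"],
})

# Count the 'na' placeholders before converting them to real missing values.
n_placeholder = int((toy["genre"] == "na").sum())

# Replace the placeholder, drop unlabeled rows, and take an explicit copy
# so the result is independent of the original frame.
cleaned = (
    toy.assign(genre=toy["genre"].replace("na", np.nan))
       .dropna(subset=["genre"])
       .copy()
)

print(f"dropped {n_placeholder} of {len(toy)} rows; {len(cleaned)} remain")
```

Running the same count on `df_novels` before the `dropna` call would show how many of the 500 novels have no genre label.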

In [37]:
df_novels.describe()

| statistic | top_500_rank | pub_year | oclc_holdings | oclc_eholdings | oclc_total_editions | oclc_holdings_rank | oclc_editions_rank | gr_avg_rating | gr_avg_rating_rank | gr_num_ratings_rank | oclc_owi |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 279.0 | 279.0 | 276.0 | 276.0 | 276.0 | 276.0 | 276.0 | 279.0 | 279.0 | 279.0 | 276.0 |
| mean | 250.111111 | 1940.885305 | 10617.681159 | 2100.358696 | 930.518116 | 250.543478 | 253.811594 | 3.996989 | 237.90681 | 232.103943 | 1738878000.0 |
| std | 149.625979 | 64.084374 | 6228.401035 | 3182.875874 | 1158.603851 | 149.023226 | 146.201482 | 0.22339 | 144.138767 | 142.995573 | 2894866000.0 |
| min | 1.0 | 1605.0 | 1231.0 | 25.0 | 26.0 | 1.0 | 1.0 | 3.29 | 1.0 | 1.0 | 3180.0 |
| 25% | 116.0 | 1906.0 | 6675.25 | 351.5 | 235.75 | 120.75 | 130.25 | 3.86 | 108.0 | 113.5 | 431848.2 |
| 50% | 246.0 | 1955.0 | 8351.5 | 540.5 | 438.0 | 246.5 | 264.5 | 3.99 | 244.0 | 221.0 | 26455700.0 |
| 75% | 384.5 | 1995.0 | 12268.75 | 2606.25 | 1169.5 | 384.25 | 376.25 | 4.145 | 355.0 | 352.5 | 2931328000.0 |
| max | 500.0 | 2013.0 | 37702.0 | 15545.0 | 9017.0 | 494.0 | 494.0 | 4.62 | 499.0 | 498.0 | 12392500000.0 |


## Understanding The Distribution of Genres in Top Ranked Novels

In [38]:
# Count the novels in each genre
genre_counts = df_novels['genre'].value_counts().reset_index()
genre_counts.columns = ['Genre', 'Count']

genre_chart = alt.Chart(genre_counts).mark_bar().encode(
    x=alt.X('Count:Q', title='Number of Novels'),
    y=alt.Y('Genre:N', title='Genre', sort='-x'),
    color='Genre:N'
).properties(title='Distribution of Genres in Top Ranked Novels')

genre_chart

## Calculating Average Ratings by Genre in Top Ranked Novels

In [39]:
# Calculate average ratings by genre
average_ratings_by_genre = df_novels.groupby('genre')['gr_avg_rating'].mean().reset_index()

# Create a bar chart for average ratings by genre
rating_chart = alt.Chart(average_ratings_by_genre).mark_bar().encode(
    x=alt.X('gr_avg_rating:Q', title='Average Rating'),
    y=alt.Y('genre:N', title='Genre', sort='-x'),
    color='genre:N'
).properties(title='Average Ratings by Genre in Top Ranked Novels')

rating_chart
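
A mean alone hides how many novels each genre contributes, and some genres here are small (for example, `war` has 4 titles and `horror` has 8). A minimal sketch of reporting group size and spread next to each mean is shown below. It uses made-up ratings, and the names `toy`, `summary`, `mean_rating`, `n_novels`, and `std_rating` are hypothetical.

```python
import pandas as pd

# Hypothetical ratings for illustration only; not taken from the dataset.
toy = pd.DataFrame({
    "genre": ["fantasy", "fantasy", "fantasy", "war", "history", "history"],
    "gr_avg_rating": [4.0, 4.2, 4.4, 4.5, 3.8, 4.0],
})

# Named aggregation keeps the group size and spread next to each mean.
# A genre with a single title gets NaN for its standard deviation.
summary = (
    toy.groupby("genre")["gr_avg_rating"]
       .agg(mean_rating="mean", n_novels="count", std_rating="std")
       .sort_values("mean_rating", ascending=False)
       .reset_index()
)

print(summary)
```

In this toy data the top-ranked genre rests on a single title, which is the kind of case where a high average should be read with caution.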

## Observing Trends in Genre Popularity Over Time

In [43]:
# Count the novels published in each year, per genre
genre_trends = df_novels.groupby(['pub_year', 'genre']).size().reset_index(name='count')

trend_chart = alt.Chart(genre_trends).mark_line().encode(
    x=alt.X('pub_year:O', title='Publication Year'),
    y=alt.Y('count:Q', title='Number of Novels'),
    color='genre:N',
    tooltip=['pub_year:O', 'genre:N', 'count:Q']
).properties(title='Trends in Genre Popularity Over Time')

trend_chart
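
With roughly 279 novels spread over four centuries, most year and genre combinations contain a single title, so per-year lines are noisy. One option is to aggregate by decade before plotting. The sketch below uses made-up rows and assumes `pub_year` holds numeric years; the `toy` frame and the `decade` and `decade_counts` names are hypothetical.

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from the dataset.
toy = pd.DataFrame({
    "pub_year": [1895, 1899, 1903, 2001, 2005, 2009],
    "genre": ["autobio", "autobio", "history",
              "thrillers", "thrillers", "history"],
})

# Floor each year to the start of its decade, e.g. 1895 -> 1890.
toy["decade"] = (toy["pub_year"] // 10) * 10

# Count novels per decade and genre, mirroring the per-year grouping.
decade_counts = (
    toy.groupby(["decade", "genre"])
       .size()
       .reset_index(name="count")
)

print(decade_counts)
```

The resulting frame could be passed to the same kind of `alt.Chart(...).mark_line()` call used above, with `decade:O` on the x-axis in place of `pub_year:O`.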

## Conclusion

Looking at the trends in literary genres over time, it’s interesting to see how the mix of genres shifts with publication year. Among these top-ranked novels, autobiographical titles (the `autobio` genre) are concentrated in the late 1800s and early 1900s. Readers may have been drawn to personal stories as a way to connect with historical events through the eyes of those who lived them. This pattern rests on only 8 `autobio` titles out of 279 labeled novels, so it is suggestive rather than conclusive.

Fast forward to the 2000s, and the chart shows a shift towards thrillers, a genre with 21 titles in the labeled data. This may reflect faster-paced lives and a growing appetite for suspenseful, gripping stories. As the world becomes more complex and technology-driven, readers seem to crave narratives that keep them on the edge of their seats, filled with twists and turns.

When we look at average ratings across genres, most sit close together: across all labeled novels, the Goodreads average rating has a mean of about 4.0 and a standard deviation of about 0.22. The three genres that stand out at the top are fantasy, horror, and autobiographies. Fantasy, with its ability to whisk readers away to imaginative worlds, offers an escape from reality. Horror taps into our primal fears and curiosity about the unknown. Autobiographies, with their authenticity and personal touch, continue to resonate with those looking for real-life stories that inspire and connect. Horror and autobiographies have only 8 titles each, so their averages are less stable than those of larger genres.

Historical fiction has the largest presence among the labeled novels, with 53 of 279 titles (about 19%), followed by fantasy with 48. This genre’s popularity makes sense, as many readers are fascinated by the past. Historical novels not only entertain but also provide context for understanding current issues. They blend fact and fiction in a way that makes history feel alive and relevant.

These trends raise some intriguing questions for future exploration. For instance, how have major societal changes influenced which genres gain popularity over time? Additionally, looking into different reader demographics could reveal what various groups are drawn to, potentially highlighting voices and stories that deserve more attention.

It would also be interesting to dig deeper into why autobiographies have become less popular in favor of thrillers. This shift could tell us a lot about changing cultural values and what we, as readers, are looking for in our stories.