# Analysis of Pitchfork Reviews from 1999 to 2016

## The Data

A database provided by Nolan Conway on Kaggle documents 18,393 Pitchfork reviews published between January 5, 1999 and January 8, 2017. Each record includes the album's title, release year, and artist, as well as the body of the review, among other fields.
    
Some questions I hope to answer through this analysis are as follows:
    
1. Which genres are most and least favored by reviewers?
2. Is there a strong association between release year and average review score? If so, does the popular belief that the best music came out in 2016 hold up?
3. Is there a strong association between the average review score and the time of year the review was posted?

## Genre Favorability

A reader of Pitchfork magazine may associate the publication primarily with alternative, pop, and hip-hop music. Pitchfork reviews span many genres, but if the majority of readers favor specific genres, it stands to reason that these genres may receive higher ratings on average. It is also worth noting that the most recent review in this data set is nine years old, making any conclusions potentially outdated. 

In [1]:
%load_ext sql
%sql duckdb:///pitchfork_reviews.sqlite

In [2]:
%%sql
SELECT genres.genre, ROUND(AVG(reviews.score),2) as avg_score
    FROM genres
    JOIN reviews
    ON genres.reviewid = reviews.reviewid
    WHERE genres.genre IS NOT NULL
    GROUP BY genres.genre
    ORDER BY avg_score DESC;

genre,avg_score
global,7.43
experimental,7.34
jazz,7.3
folk/country,7.2
metal,6.95
rock,6.94
electronic,6.92
rap,6.9
pop/r&b,6.88


As shown above, the genres with the highest average rating are global, experimental, and jazz, contrary to our initial assumption. Counting the reviews in each genre may show a different trend.

In [3]:
%%sql
SELECT genre, COUNT(*) as num_reviewed
    FROM genres
    WHERE genre IS NOT NULL
    GROUP BY genre
    ORDER BY num_reviewed DESC;

genre,num_reviewed
rock,9436
electronic,3874
experimental,1815
rap,1559
pop/r&b,1432
metal,860
folk/country,685
jazz,435
global,217


These results align more closely with our expectation. Poor genre specificity is the price of clean categorization, but we can reasonably map our expected genres onto rock, experimental, rap, and pop/r&b. All four place in the top five by number of reviews, alongside electronic. We also see that global music, the highest rated on average, is the least commonly reviewed. With only 217 reviews, its average rests on a small sample, which may partly explain its unexpectedly high rating in the previous results.
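To read the average score and the review count side by side, the two queries above can be combined into one. The following is a minimal sketch that runs the combined query against a toy in-memory SQLite database through Python's standard `sqlite3` module. The table and column names match the queries above, but the rows are invented for illustration, and the count here is taken after the join, so it could differ slightly from the count in the previous cell.

```python
import sqlite3

# Toy in-memory database. The rows are invented for illustration only
# and are not taken from the Pitchfork dataset.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reviews (reviewid INTEGER, score REAL);
    CREATE TABLE genres (reviewid INTEGER, genre TEXT);
    INSERT INTO reviews VALUES (1, 8.0), (2, 6.0), (3, 7.0), (4, 9.0);
    INSERT INTO genres VALUES (1, 'rock'), (2, 'rock'), (3, 'rock'), (4, 'global');
""")

# Average score and number of reviews per genre in a single result set.
query = """
    SELECT genres.genre,
           ROUND(AVG(reviews.score), 2) AS avg_score,
           COUNT(*) AS num_reviewed
    FROM genres
    JOIN reviews ON genres.reviewid = reviews.reviewid
    WHERE genres.genre IS NOT NULL
    GROUP BY genres.genre
    ORDER BY avg_score DESC;
"""
genre_summary = conn.execute(query).fetchall()
for genre, avg_score, num_reviewed in genre_summary:
    print(genre, avg_score, num_reviewed)
```

In the toy data, the top-rated genre is also the one with a single review, which mirrors the pattern seen for global music above.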

## Ratings by Year

Every year when Grammy season rolls around, there is discussion about the overall quality of the music released in the past year. Many of us have our own opinions about which years are the "good years." It is a popular sentiment among millennials and some of Gen Z that 2016 was a particularly impressive year for music. While we cannot analyze the years following 2016, we can at least see how Pitchfork authors rated the music of the years leading up to it, and how 2016 ranks among them.

In [5]:
%%sql
SELECT years.year, ROUND(AVG(reviews.score),2) as avg_score, COUNT(*) as num_reviews, 
    COUNT(CASE WHEN reviews.score = 10 THEN 1 END) as num_perfect_scores
    FROM years
    JOIN reviews
    ON years.reviewid = reviews.reviewid
    WHERE years.year>=1999 AND years.year<2017
    GROUP BY years.year
    ORDER BY avg_score DESC;

year,avg_score,num_reviews,num_perfect_scores
1999,7.29,116,1
2004,7.2,1046,5
2016,7.19,1205,3
2015,7.1,1153,5
2014,7.09,1134,4
2013,7.05,1200,4
2011,7.04,1140,5
2005,7.04,1216,3
2000,7.03,220,2
2001,7.02,579,1


The highest average scores belong to 1999, 2004, and 2016, so Pitchfork authors appear to agree with the internet here, at least to the extent that 2016 ranks near the top. Note that 1999 has only 116 reviews, far fewer than most other years shown, which makes its place at number one dubious. Interestingly, there does not seem to be a strong relationship between the average score for a given year and the number of perfect scores given during that year, shown above as num_perfect_scores.
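One way to put a number on the relationship between avg_score and num_perfect_scores is a Pearson correlation coefficient. The sketch below uses only the ten rows shown in the output above, so it says nothing about years that are not displayed. The `pearson` helper is written here for illustration and is not part of the original notebook. A value near zero would be consistent with the observation that the two columns are not strongly related.

```python
from math import sqrt

# Values copied from the ten rows shown in the query output above.
avg_score = [7.29, 7.2, 7.19, 7.1, 7.09, 7.05, 7.04, 7.04, 7.03, 7.02]
num_perfect = [1, 5, 3, 5, 4, 4, 5, 3, 2, 1]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

r = pearson(avg_score, num_perfect)
print(f"Pearson r = {r:.2f}")
```

Because the number of perfect scores also depends on how many albums were reviewed in a year, a rate such as perfect scores per review might be a fairer quantity to correlate.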

## Ratings by Month

Most of us would probably say that our mood varies from month to month. Many people are happiest in the summer and during the holidays, though this is not true for everyone. Can we see this intuitive trend reflected in the average rating given by Pitchfork authors each month? Alternatively, will we see trends that align more closely with the typical artist release schedule? It may even be the case that the highest-rated months are those with the fewest new releases, as this is when authors would be more likely to review their old favorites. In that case, we can expect peaks during the holidays and from late winter to spring.

In [9]:
%config SqlMagic.displaylimit = None

In [11]:
%%sql
SELECT pub_month, ROUND(AVG(score),2) as avg_score
    FROM reviews
    GROUP BY pub_month
    ORDER BY pub_month;

pub_month,avg_score
1,7.01
2,6.99
3,6.98
4,6.94
5,7.04
6,6.97
7,6.96
8,6.97
9,7.04
10,7.08


Average scores vary little from month to month, but there are peaks in May, October, and December. These months align with the beginning of summer and the holiday season in the United States. The data may reflect monthly trends in happiness, though more research would be needed to support the idea that people are generally happier at these times. Many other factors are likely at play.
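Whether a gap of about a tenth of a point between months is meaningful depends on how much individual scores vary and how many reviews each month contains. A rough check is to compare the difference between two monthly means with its standard error. The sketch below illustrates the arithmetic on invented scores for two hypothetical months. The `standard_error` helper and the sample values are assumptions made for illustration, not results from the dataset.

```python
from math import sqrt
from statistics import mean, stdev

def standard_error(scores):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return stdev(scores) / sqrt(len(scores))

# Invented scores for two hypothetical months (not taken from the dataset).
month_a = [6.5, 7.0, 7.5, 8.0, 6.0]
month_b = [6.8, 7.2, 7.4, 8.1, 6.0]

diff = mean(month_b) - mean(month_a)
# Standard error of the difference between two independent means.
se_diff = sqrt(standard_error(month_a) ** 2 + standard_error(month_b) ** 2)

print(f"difference = {diff:.2f}, standard error = {se_diff:.2f}")
```

If the difference is small relative to its standard error, the apparent peak could plausibly be noise. With well over a thousand reviews per month in the real data, the standard errors would be much smaller than in this toy example.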