# **Detecting Bias in Music Reviews at pitchfork.com**

With nearly a quarter of a million readers per day, the music review publication Pitchfork is a hugely influential force in the independent music scene. A positive album review from Pitchfork can help launch an artist into more mainstream success; notable examples include Arcade Fire and Sufjan Stevens.

Pitchfork has also been criticized for perceived biases in its content. A review of hip-hop artist M.I.A.'s 2007 album *Kala* erroneously stated that Diplo had produced all of the album's tracks, when, in reality, M.I.A. had produced most of the tracks herself. In an interview with Pitchfork, M.I.A. later said, "There is an issue especially with what male journalists write about me and say 'this *must* have come from a guy.'" Pitchfork's early reviews often focused disproportionately on independent rock, though in recent years the publication appears to have featured a more balanced mixture of genres.

My goal in this project is to examine bias in the reviews, specifically looking to answer the following questions:
- How has the proportion of reviews per genre evolved over time?
- Is there evidence of bias in scoring based on the genre of an album?
- Is there evidence of gender bias in Pitchfork's reviews? Specifically:
    - How has the ratio of reviews of female artists to male artists evolved over time?
    - Is there evidence of bias in scoring based on the artist's gender or on the reviewer's gender?


---

## Preliminaries

In [2]:
import re
from itertools import product

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
import gender_guesser.detector as gender

%matplotlib inline
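The `re` and `gender_guesser` imports suggest that gender is later inferred from names. `gender_guesser` works on first names, so a full name such as a reviewer byline would first need to be reduced to its first token. The following is only a sketch of that preprocessing step, not the notebook's own method; the helper name `first_name` is hypothetical.

```python
import re

def first_name(full_name):
    """Return the first alphabetic token of a name, or None if there is none.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    if not isinstance(full_name, str):
        return None  # handles missing values such as NaN or None
    # Skip leading whitespace, then capture a run of Unicode letters.
    match = re.match(r"\s*([^\W\d_]+)", full_name)
    return match.group(1) if match else None

names = ["Claire Lobenfeld", "Hubert Adjei-Kontoh", "Stephen Thomas Erlewine"]
firsts = [first_name(n) for n in names]
```

The resulting token could then be passed to a detector created with `gender.Detector()`. Note that a name beginning with an initial, such as "A. G. Cook", yields only the single letter, which a name-based detector is unlikely to recognize.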

In [3]:
# Load the data set and preview the first ten rows
pitchfork = pd.read_csv('pitchfork.csv', header=0)
pitchfork.head(10)

Unnamed: 0,album,artist,best_new_music,best_new_reissue,genre,label,review_date,review_text,reviewer,score,year
0,Apple,A. G. Cook,False,False,Electronic,PC Music,2020-09-24T05:00:00,PC Music founder A. G. Cook settled on Apple a...,Claire Lobenfeld,7.5,2020
1,A Day in a Yellow Beat,Yellow Days,False,False,Pop/R&B,RCA|Sony,2020-09-22T05:00:00,"Since age 17, George van den Broek has perform...",Ashley Bardhan,5.9,2020
2,BREACH,Fenne Lily,False,False,Folk/Country|Rock,Dead Oceans,2020-09-22T05:00:00,There comes a time in a disenchanted young man...,Cat Zhang,6.7,2020
3,BC,Bwoy Coyote,False,False,Pop/R&B,self-released,2020-09-22T05:00:00,Not much information is readily available on B...,Hubert Adjei-Kontoh,7.1,2020
4,Host,Cults,False,False,Rock,Sinderlyn,2020-09-22T05:00:00,The indie-pop world has completed several rota...,Arielle Gordon,6.8,2020
5,The Studio Albums 1978-1991,Dire Straits,False,False,Rock,Rhino|Warner,2020-09-23T05:00:00,"Stats don’t lie, but the tales they tell can b...",Stephen Thomas Erlewine,8.0,2020
6,The Liz Tape,Armani Caesar,False,False,Rap,Griselda,2020-09-23T05:00:00,Following a long hot streak that cemented the ...,Evan Rytlewski,7.2,2020
7,The Times EP,Neil Young,False,False,Rock,Reprise,2020-09-23T05:00:00,Neil Young’s live audiences this year have inc...,Jesse Jarnow,6.9,2020
8,Shore,Fleet Foxes,True,False,Folk/Country,Anti-,2020-09-23T05:00:00,"For Robin Pecknold, the music of Fleet Foxes h...",Matthew Strauss,8.3,2020
9,Virgo World,Lil Tecca,False,False,Rap,Republic|Galactic,2020-09-24T05:00:00,"On his 2013 song “Lonely,” South Carolina rapp...",Alphonse Pierre,6.0,2020


In [4]:
# Convert review_date to type 'datetime' and preview the result
pitchfork['review_date'] = pd.to_datetime(pitchfork['review_date'])
pitchfork.head(10)

Unnamed: 0,album,artist,best_new_music,best_new_reissue,genre,label,review_date,review_text,reviewer,score,year
0,Apple,A. G. Cook,False,False,Electronic,PC Music,2020-09-24 05:00:00,PC Music founder A. G. Cook settled on Apple a...,Claire Lobenfeld,7.5,2020
1,A Day in a Yellow Beat,Yellow Days,False,False,Pop/R&B,RCA|Sony,2020-09-22 05:00:00,"Since age 17, George van den Broek has perform...",Ashley Bardhan,5.9,2020
2,BREACH,Fenne Lily,False,False,Folk/Country|Rock,Dead Oceans,2020-09-22 05:00:00,There comes a time in a disenchanted young man...,Cat Zhang,6.7,2020
3,BC,Bwoy Coyote,False,False,Pop/R&B,self-released,2020-09-22 05:00:00,Not much information is readily available on B...,Hubert Adjei-Kontoh,7.1,2020
4,Host,Cults,False,False,Rock,Sinderlyn,2020-09-22 05:00:00,The indie-pop world has completed several rota...,Arielle Gordon,6.8,2020
5,The Studio Albums 1978-1991,Dire Straits,False,False,Rock,Rhino|Warner,2020-09-23 05:00:00,"Stats don’t lie, but the tales they tell can b...",Stephen Thomas Erlewine,8.0,2020
6,The Liz Tape,Armani Caesar,False,False,Rap,Griselda,2020-09-23 05:00:00,Following a long hot streak that cemented the ...,Evan Rytlewski,7.2,2020
7,The Times EP,Neil Young,False,False,Rock,Reprise,2020-09-23 05:00:00,Neil Young’s live audiences this year have inc...,Jesse Jarnow,6.9,2020
8,Shore,Fleet Foxes,True,False,Folk/Country,Anti-,2020-09-23 05:00:00,"For Robin Pecknold, the music of Fleet Foxes h...",Matthew Strauss,8.3,2020
9,Virgo World,Lil Tecca,False,False,Rap,Republic|Galactic,2020-09-24 05:00:00,"On his 2013 song “Lonely,” South Carolina rapp...",Alphonse Pierre,6.0,2020


---

## Genre Analysis

In [6]:
# Keep only reviews with a genre label, and only the columns needed here
genre_df = pitchfork.loc[pitchfork['genre'].notna(),
                         ['genre', 'score', 'review_date']].copy()
genre_df

Unnamed: 0,genre,score,review_date
0,Electronic,7.5,2020-09-24 05:00:00
1,Pop/R&B,5.9,2020-09-22 05:00:00
2,Folk/Country|Rock,6.7,2020-09-22 05:00:00
3,Pop/R&B,7.1,2020-09-22 05:00:00
4,Rock,6.8,2020-09-22 05:00:00
...,...,...,...
22890,Electronic,7.5,2001-11-07 06:00:03
22891,Electronic,9.4,2001-11-08 06:00:00
22893,Folk/Country,7.1,2001-11-11 06:00:00
22894,Rock,8.5,2001-11-08 06:00:02
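
Some rows carry several genres joined by `|` (for example `Folk/Country|Rock`), so counting reviews per genre requires splitting those labels first. The sketch below shows one possible way to do this and to compute each genre's yearly share; it assumes `review_date` has already been converted to datetime, and the function name `genre_share_by_year` and the small `sample` frame are illustrative rather than taken from the notebook.

```python
import pandas as pd

def genre_share_by_year(df):
    """Return a year-by-genre table of the share of genre tags in each year."""
    exploded = df.assign(
        genre=df["genre"].str.split("|"),   # one list of genres per review
        year=df["review_date"].dt.year,     # calendar year of the review
    ).explode("genre")                      # one row per (review, genre) pair
    counts = exploded.groupby(["year", "genre"]).size().unstack(fill_value=0)
    # Normalize each row so that the shares within a year sum to 1.
    return counts.div(counts.sum(axis=1), axis=0)

# Tiny illustrative frame with the same columns as genre_df
sample = pd.DataFrame({
    "genre": ["Electronic", "Folk/Country|Rock", "Rock", "Electronic"],
    "score": [7.5, 6.7, 6.8, 9.4],
    "review_date": pd.to_datetime(
        ["2020-09-24", "2020-09-22", "2020-09-22", "2001-11-08"]
    ),
})
shares = genre_share_by_year(sample)
```

Because a multi-genre review contributes one tag per genre, these shares are proportions of genre tags rather than of reviews. Dividing by the number of distinct reviews per year instead is an equally reasonable choice, in which case the rows would no longer sum to 1.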
