Analysis of Suicide Rates over Time and by Country

The dataset can be found at: https://ourworldindata.org/suicide

The data provides suicide rates for every country by year, from 1990 to 2019.

This notebook looks at how suicide rates have changed over time, and whether they shifted as notable world events transpired (such as the economic recession of 2008). I'd also like to look for a correlation between suicide rates and socioeconomic status (through the lens of developing countries compared to developed countries).

In [36]:
import pandas as pd
import numpy as np

In [37]:
data = pd.read_csv("suicide-death-rates.csv")
data = data[['Code', 'Year', 'Deaths - Self-harm - Sex: Both - Age: Age-standardized (Rate)']]

In [38]:
df = pd.DataFrame(data[['Deaths - Self-harm - Sex: Both - Age: Age-standardized (Rate)']])

In [39]:
df.describe()

,Deaths - Self-harm - Sex: Both - Age: Age-standardized (Rate)
count,8220.0
mean,12.144443
std,7.818673
min,1.633003
25%,7.035846
50%,10.941042
75%,14.881956
max,90.05703


In [40]:
p = pd.pivot_table(data, values = 'Deaths - Self-harm - Sex: Both - Age: Age-standardized (Rate)',
                  index = ['Code'],
                  columns = ['Year'])

In [41]:
p

Code,1990,1991,1992,1993,1994,1995,1996,1997,1998,1999,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
AFG,8.277699,8.174310,8.139174,8.231295,8.362751,8.408568,8.462155,8.528773,8.580934,8.699422,...,7.167642,6.985272,6.813385,6.653602,6.519977,6.431005,6.359541,6.293208,6.217334,6.125511
AGO,17.585853,17.399133,17.336500,17.487724,17.570218,17.254165,16.453348,16.134170,16.392509,16.399825,...,13.493817,13.398542,13.323194,13.114195,12.685184,12.534813,12.339597,12.317334,12.318951,12.289975
ALB,4.095761,4.400781,4.245704,4.165197,3.955223,4.190963,4.446323,4.760317,4.976018,4.995366,...,5.123011,5.110897,5.062669,5.038927,5.093646,5.149327,5.094738,5.033410,4.976800,4.927489
AND,9.710626,9.678968,9.660766,9.608765,9.550137,9.346259,9.164296,8.840730,8.641683,8.463644,...,7.353222,7.296121,7.271072,7.246787,7.331045,7.274367,7.253508,7.220837,7.185147,7.142659
ARE,6.849037,6.843866,6.867228,6.910262,6.947565,7.140118,7.125145,7.181258,7.105295,7.050719,...,5.719666,5.695260,5.666375,5.672497,5.540472,5.491611,5.427374,5.346215,5.313478,5.305595
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
WSM,16.585142,16.678847,16.813221,16.856149,16.867577,16.822767,16.669872,16.538949,16.325204,16.095968,...,13.841139,13.767277,13.694606,13.599901,13.534322,13.487125,13.469427,13.424446,13.308559,13.106483
YEM,6.794136,6.670706,6.599601,6.550718,6.541268,6.484669,6.430625,6.400727,6.363073,6.414934,...,6.004509,5.990398,5.875512,5.879535,5.751489,5.814312,5.881797,5.973942,6.130623,6.137930
ZAF,20.694871,20.716873,21.772021,19.877567,20.243746,20.123545,19.605909,22.110833,22.687580,21.772517,...,16.536706,14.980357,14.406254,14.113579,14.156845,13.903631,13.236054,13.217262,13.428478,13.497059
ZMB,14.620612,15.253332,15.785769,16.227256,16.716787,17.219897,17.486454,17.616710,17.686612,17.631471,...,15.777716,15.547681,15.218264,14.998929,14.840872,14.692415,14.490416,14.444245,14.206155,13.825075


In [42]:
# Average the rate over all rows for each year (an unweighted mean), rounded to 2 decimals.
# Note: this reuses the name p, replacing the per-country pivot above.
rate_col = 'Deaths - Self-harm - Sex: Both - Age: Age-standardized (Rate)'
p = np.round(pd.pivot_table(data, values=[rate_col],
                                index=['Year'],
                                aggfunc='mean',
                                fill_value=0),2)

# nsmallest already returns its rows sorted in ascending order, so no separate sort is needed.
lowest_years = p.nsmallest(10, rate_col)

In [43]:
lowest_years

Year,Deaths - Self-harm - Sex: Both - Age: Age-standardized (Rate)
2019,10.17
2018,10.29
2017,10.36
2016,10.49
2015,10.61
2014,10.69
2013,10.82
2012,11.0
2011,11.18
2010,11.38


The 10 years with the lowest average suicide rates are the 10 most recent years, in order! This is surprising. I am very curious whether COVID-19 will affect this trend once that data becomes available.
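As a quick sanity check of the "in order" observation, the yearly averages shown in the output above can be tested for a strict year-over-year decline. This is only a sketch: the values are copied by hand from the table above rather than read from the CSV, and the names `yearly_mean`, `yoy_change`, and `strictly_decreasing` are made up for this example.

```python
import pandas as pd

# Yearly average rates for 2010-2019, copied from the output table above.
yearly_mean = pd.Series(
    {2010: 11.38, 2011: 11.18, 2012: 11.00, 2013: 10.82, 2014: 10.69,
     2015: 10.61, 2016: 10.49, 2017: 10.36, 2018: 10.29, 2019: 10.17},
    name="mean_rate",
)
yearly_mean.index.name = "Year"

# Year-over-year change; the first year has no predecessor, so it is NaN.
yoy_change = yearly_mean.diff().round(2)

# True only if every year is lower than the one before it.
strictly_decreasing = bool((yoy_change.dropna() < 0).all())

print(yoy_change)
print("Strictly decreasing:", strictly_decreasing)
```

In the notebook itself, the same check could be run on the full yearly pivot instead of hand-copied values.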

In [44]:
# nlargest already returns its rows sorted in descending order, so no separate sort is needed.
highest_years = p.nlargest(10, 'Deaths - Self-harm - Sex: Both - Age: Age-standardized (Rate)')

In [45]:
highest_years

Year,Deaths - Self-harm - Sex: Both - Age: Age-standardized (Rate)
1995,13.65
1994,13.64
1993,13.49
1996,13.48
1997,13.4
1998,13.33
1992,13.29
1999,13.27
1991,13.18
1990,13.14


All of the top 10 years with the highest average suicide rates are in the 1990s. Clearly, the average suicide rate across the dataset has been trending downward over the last 30 years. Here, however, the ordering is less consistent: 1995 is the highest overall, while 1990 is only the 10th highest, so the average rose through the early 1990s before it started to decline.
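The shape of the 1990s figures can be checked the same way: find the peak year, then confirm the averages rise up to it and fall after it. Again, this is a sketch under assumptions: the values are copied by hand from the output above, and `nineties`, `peak_year`, `rising_to_peak`, and `falling_after_peak` are hypothetical names used only here.

```python
import pandas as pd

# Yearly average rates for 1990-1999, copied from the output table above.
nineties = pd.Series(
    {1990: 13.14, 1991: 13.18, 1992: 13.29, 1993: 13.49, 1994: 13.64,
     1995: 13.65, 1996: 13.48, 1997: 13.40, 1998: 13.33, 1999: 13.27},
    name="mean_rate",
)
nineties.index.name = "Year"

# Year with the highest average rate in the decade.
peak_year = nineties.idxmax()

# .loc slices by label and includes both endpoints, so the peak year is in each slice.
rising_to_peak = bool(nineties.loc[:peak_year].is_monotonic_increasing)
falling_after_peak = bool(nineties.loc[peak_year:].is_monotonic_decreasing)

print("Peak year:", peak_year)
print("Rises up to the peak:", rising_to_peak)
print("Falls after the peak:", falling_after_peak)
```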