# Summary
In this notebook we create a set of world maps (choropleths) coloured by the suicide rate in each country. The maps give an overview of how suicide rates changed over the years for every country in our dataset.

In [1]:
BY_YEAR_COUNTRY = "../data/processed/year_country_data.csv"

In [2]:
import numpy as np 
import pandas as pd
import plotly.express as px
from ipywidgets import interact
import ipywidgets as widgets

In [3]:
%load_ext autoreload
%autoreload 2

## Read in data

In [4]:
df = pd.read_csv(BY_YEAR_COUNTRY)
df.head()

Unnamed: 0,female_rate,male_rate,female_pop,male_pop,suicide_num_f,suicide_num_m,year,country,code,overall_rate,population,suicides_no
0,4.05,10.39,12532000.0,12454000.0,508.0,1294.0,1979,Argentina,ARG,7.212039,24986000.0,1802.0
1,7.21,18.02,6641600.0,6637800.0,479.0,1196.0,1979,Australia,AUS,12.613522,13279400.0,1675.0
2,0.0,1.83,119800.0,109500.0,0.0,2.0,1979,Barbados,BRB,0.87222,229300.0,2.0
3,15.93,30.13,4739700.0,4509800.0,755.0,1359.0,1979,Belgium,BEL,22.855289,9249500.0,2114.0
4,2.1,4.68,51202700.0,51003400.0,1074.0,2385.0,1979,Brazil,BRA,3.384338,102206100.0,3459.0


# World map overview of suicide rates

Here we create a world map coloured by suicide rate per 100,000 people. There is one map each for males, females, and the overall population. A further map shows the male/female ratio of suicide rates. Many of the extreme values in this latter map belong to countries with very small populations, so I tried to filter some of them out. In almost all countries, males are more likely to die by suicide than females (though I've heard that females are more likely to attempt suicide and that their attempts are less often fatal).
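
To make "rate per 100,000 people" concrete, here is a minimal sketch that recomputes the rate for two rows copied from the `df.head()` output above. It assumes `overall_rate` was derived as `suicides_no / population * 100,000`; the column `recomputed_rate` is a name introduced only for this illustration.

```python
import pandas as pd

# Two rows copied from the df.head() output above.
sample = pd.DataFrame(
    {
        "country": ["Argentina", "Australia"],
        "suicides_no": [1802.0, 1675.0],
        "population": [24986000.0, 13279400.0],
        "overall_rate": [7.212039, 12.613522],
    }
)

# Assumed definition: deaths divided by population, scaled to 100,000 people.
# "recomputed_rate" is a hypothetical column name used only here.
sample["recomputed_rate"] = sample["suicides_no"] / sample["population"] * 100_000

print(sample[["country", "overall_rate", "recomputed_rate"]])
```

If the two rate columns agree for these rows, the assumed definition is at least consistent with the stored values.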

### Overall rates

In [5]:
types = ["male", "female", "overall"]

@interact
def choropleth(type_plot=widgets.Dropdown(options=types, value="male")):
    # Pick the column to colour by and the matching population column for the hover text.
    if type_plot == "male":
        color = "male_rate"
        hover_data = "male_pop"
    elif type_plot == "female":
        color = "female_rate"
        hover_data = "female_pop"
    else:
        color = "overall_rate"
        hover_data = "population"
    
    fig = px.choropleth(df, locations="code", color=color, hover_name="country", animation_frame="year",
             color_continuous_scale="Reds", 
             hover_data=[hover_data])
    fig.show()

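
The dropdown-to-column mapping in the cell above could also be written as a lookup table. This is only a sketch of an alternative, not what the notebook runs: `PLOT_COLUMNS` and `columns_for` are hypothetical names, and unlike the `else` branch above, an unknown dropdown value raises a `KeyError` here instead of falling back to the overall rate.

```python
# Hypothetical lookup table: dropdown value -> (colour column, hover column).
PLOT_COLUMNS = {
    "male": ("male_rate", "male_pop"),
    "female": ("female_rate", "female_pop"),
    "overall": ("overall_rate", "population"),
}


def columns_for(type_plot):
    """Return the (color, hover_data) column names for a dropdown value."""
    return PLOT_COLUMNS[type_plot]


color, hover_data = columns_for("female")
print(color, hover_data)
```

The two returned names would then be passed to `px.choropleth` as `color=color` and `hover_data=[hover_data]`, exactly as in the original cell.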

# Male/female ratio of rates

Here we look at the ratio of the male to the female suicide rate. I filtered out countries with populations below one million, because the suicide counts there are too small to give robust ratios. We find that:

* Generally, the male suicide rate is much higher than the female one

* Most countries whose female suicide rate is higher than the male suicide rate have a population below 2 million. This could mean that the higher female suicide rate occurred by chance; the population is too small for the result to be trustworthy. The lowest ratio was found for Haiti, but there is not enough data there to draw any conclusions (in 1999 and 2003, 1 and 0 males were reported to have died by suicide, respectively).

In [9]:
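# Treat a female rate of 0 as missing so the ratio becomes NaN instead of inf.
df["female_rate"] = df["female_rate"].replace(0, np.nan)
df["ratio_male_female"] = df["male_rate"] / df["female_rate"]
# Keep only country-years with a population above one million.
selection = df.loc[df["population"] > 10**6]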
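
The small-count concern can be made concrete with a sketch that complements the population filter. It uses four rows copied from the tables in this notebook, computes the ratio from raw counts and populations rather than from the two-decimal rates, and flags rows where either sex has fewer than `MIN_DEATHS` recorded suicides. `MIN_DEATHS`, `ratio_from_counts` and `few_deaths` are hypothetical names, and the cut-off of 10 is an arbitrary assumption, not a value used elsewhere in this analysis.

```python
import pandas as pd

# Rows copied from the output tables in this notebook.
rows = pd.DataFrame(
    {
        "country": ["Haiti", "Kuwait", "Puerto Rico", "Malaysia"],
        "year": [1999, 1997, 1997, 2005],
        "suicide_num_f": [4.0, 17.0, 19.0, 2.0],
        "suicide_num_m": [1.0, 16.0, 254.0, 37.0],
        "female_pop": [4219612.0, 603700.0, 1787807.0, 13030850.0],
        "male_pop": [4099445.0, 1013800.0, 1644950.0, 12659760.0],
    }
)

MIN_DEATHS = 10  # hypothetical cut-off, chosen only for illustration

# Recompute the rates per 100,000 from counts to avoid rounding in the stored rates.
male_rate = rows["suicide_num_m"] / rows["male_pop"] * 100_000
female_rate = rows["suicide_num_f"] / rows["female_pop"] * 100_000
rows["ratio_from_counts"] = male_rate / female_rate

# Flag rows where either count is too small for the ratio to be stable.
rows["few_deaths"] = (rows["suicide_num_f"] < MIN_DEATHS) | (
    rows["suicide_num_m"] < MIN_DEATHS
)

print(rows[["country", "year", "ratio_from_counts", "few_deaths"]])
```

With these rows, Haiti and Malaysia are flagged because of their single-digit counts. For Malaysia in 2005 the ratio from counts is roughly 19, not the 14.5 obtained from the rounded rates, which shows how sensitive the ratio is when so few deaths are recorded.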

In [10]:
px.choropleth(selection, locations="code", color="ratio_male_female", animation_frame="year",
             color_continuous_scale=px.colors.diverging.Picnic, 
             hover_name="country",)

In [11]:
selection.sort_values(by="ratio_male_female", ascending=False).head(10)

Unnamed: 0,female_rate,male_rate,female_pop,male_pop,suicide_num_f,suicide_num_m,year,country,code,overall_rate,population,suicides_no,ratio_male_female
1403,1.06,15.44,1787807.0,1644950.0,19.0,254.0,1997,Puerto Rico,PRI,7.952791,3432757.0,273.0,14.566038
2237,0.02,0.29,13030850.0,12659760.0,2.0,37.0,2005,Malaysia,MYS,0.151806,25690611.0,39.0,14.5
2807,0.26,3.53,386068.0,1303524.0,1.0,46.0,2010,Qatar,QAT,2.781737,1689592.0,47.0,13.576923
2356,0.86,11.27,1512148.0,1525811.0,13.0,172.0,2006,Panama,PAN,6.089615,3037959.0,185.0,13.104651
3205,0.01,0.13,15148990.0,14717570.0,2.0,19.0,2014,Malaysia,MYS,0.070313,29866559.0,21.0,13.0
1723,0.02,0.26,8323950.0,8086898.0,2.0,21.0,2000,Syrian Arab Republic,SYR,0.140151,16410848.0,23.0,13.0
2668,0.03,0.38,3496416.0,3396844.0,1.0,13.0,2009,Jordan,JOR,0.203097,6893260.0,14.0,12.666667
2148,1.25,15.26,1833697.0,1677765.0,23.0,256.0,2004,Puerto Rico,PRI,7.945408,3511462.0,279.0,12.208
2361,1.2,14.54,1827241.0,1671795.0,22.0,243.0,2006,Puerto Rico,PRI,7.573515,3499036.0,265.0,12.116667
2040,1.2,14.27,1836820.0,1682206.0,22.0,240.0,2003,Puerto Rico,PRI,7.445242,3519026.0,262.0,11.891667


In [12]:
selection[selection["ratio_male_female"] < 1]

Unnamed: 0,female_rate,male_rate,female_pop,male_pop,suicide_num_f,suicide_num_m,year,country,code,overall_rate,population,suicides_no,ratio_male_female
190,0.19,0.13,533500.0,789800.0,1.0,1.0,1982,Kuwait,KWT,0.151137,1323300.0,2.0,0.684211
249,0.4,0.31,996000.0,969000.0,4.0,3.0,1983,Jamaica,JAM,0.356234,1965000.0,7.0,0.775
795,2.84,2.73,1763025.0,1797204.0,50.0,49.0,1990,Paraguay,PRY,2.78072,3560229.0,99.0,0.961268
1116,2.21,1.99,541900.0,906100.0,12.0,18.0,1994,Kuwait,KWT,2.071823,1448000.0,30.0,0.900452
1385,2.82,1.58,603700.0,1013800.0,17.0,16.0,1997,Kuwait,KWT,2.040185,1617500.0,33.0,0.560284
1480,2.14,1.78,701642.0,1124124.0,15.0,20.0,1998,Kuwait,KWT,1.917004,1825766.0,35.0,0.831776
1569,0.09,0.02,4219612.0,4099445.0,4.0,1.0,1999,Haiti,HTI,0.060103,8319057.0,5.0,0.222222
1682,1.86,1.64,751141.0,1222012.0,14.0,20.0,2000,Kuwait,KWT,1.72313,1973153.0,34.0,0.88172
2007,0.02,0.0,4514332.0,4385772.0,1.0,0.0,2003,Haiti,HTI,0.011236,8900104.0,1.0,0.0


# Save plots

In [13]:
# import chart_studio.plotly as py
# import chart_studio

# chart_studio.tools.set_credentials_file(username='', api_key='')

# fig = px.choropleth(df, locations="code", color="overall_rate", hover_name="country", animation_frame="year",
#              color_continuous_scale="Reds", 
#              hover_data=["population"])

# py.plot(fig, filename="combined_rate", auto_open=True)