# Analyzing Suicide Rates (1985 to 2016)




**Suicide** is the act of intentionally causing one's own death. Risk factors include mental disorders (such as depression, bipolar disorder, autism, schizophrenia, personality disorders, and anxiety disorders), physical disorders such as chronic fatigue syndrome, and substance abuse, including alcoholism and the use of benzodiazepines. Some suicides are impulsive acts due to stress, for example from financial difficulties, relationship problems such as breakups, or bullying. Those who have previously attempted suicide are at a higher risk for future attempts. Effective suicide prevention efforts include limiting access to methods of suicide (such as firearms, drugs, and poisons), treating mental disorders and substance misuse, careful media reporting about suicide, and improving economic conditions. Even though crisis hotlines are common, they have not been well studied.

**Approximately** 1.5% of people die by suicide. In a given year this is roughly 12 per 100,000 people. Rates of completed suicides are generally ***higher among men than among women***, ranging from 1.5 times as high in the developing world to 3.5 times as high in the developed world. Suicide is generally most common among those ***over the age of 70***; however, in certain countries, those aged between 15 and 30 are at the highest risk. There are an estimated 10 to 20 million non-fatal attempted suicides every year. Non-fatal suicide attempts may lead to injury and long-term disabilities. In the ***Western world, attempts are more common among young people and among females***.

## Importing Essentials

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
plt.style.use('fivethirtyeight')

## Meeting the data

In [None]:
df=pd.read_csv('/kaggle/input/suicide-rates-overview-1985-to-2016/master.csv')
df.shape

In [None]:
df.head()

In [None]:
df.info()

It looks like the `HDI for year` column contains null values.
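
To quantify how much is missing rather than reading it off `df.info()`, one option is to count nulls per column. The sketch below uses a tiny made-up frame (`toy`, with hypothetical values) in place of the real `df`, so the numbers are purely illustrative.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataset; all values are made up for illustration.
toy = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "year": [1990, 1991, 1990, 1991],
    "HDI for year": [np.nan, 0.70, np.nan, np.nan],
})

# Number and share of missing values per column, largest first.
null_counts = toy.isnull().sum().sort_values(ascending=False)
null_share = toy.isnull().mean().sort_values(ascending=False)

print(null_counts)
print(null_share)
```

Running the same two lines on `df` instead of `toy` should show how sparse `HDI for year` really is, which helps decide whether to drop or impute it.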

In [None]:
df.describe().T

## Any correlation?

In [None]:
plt.figure(figsize=(12,6))
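# numeric_only=True skips text columns, which recent pandas versions refuse to correlate
sns.heatmap(df.corr(numeric_only=True),cmap='viridis')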

`population` vs `suicides_no` and `gdp_per_capita ($)` vs `HDI for year` show a strong positive correlation.

## Male : Female ratio

In [None]:
sns.countplot(x='sex',data=df,palette='Set2');
df['sex'].value_counts()

The dataset contains an equal number of rows for males and females.

## Country count

In [None]:
plt.figure(figsize=(15,30))
sns.countplot(y=df['country'])

## Top 10 countries with highest number of suicides

In [None]:
# Total suicides per country, keeping the ten largest
top10=df.groupby('country')['suicides_no'].sum().sort_values(ascending=False).head(10)

explode=[0.1,0.1,0.1,0.1,0.3,0.2,0.2,0.1,0.2,0.4]
fig,ax=plt.subplots(1,2,figsize=(12, 6))
# Bar chart on the left, pie chart on the right
top10.plot(kind='bar',cmap='Set2',ax=ax[0])
top10.plot(kind='pie',autopct='%.1f%%',explode=explode,startangle=15,cmap='Wistia',ax=ax[1]);

## Top 10 countries with highest number of suicides per 100k population

In [None]:
# Sum of the per-row rates for each country, keeping the ten largest
# (this adds up rates across years, sexes and age groups, so countries with more rows rank higher)
top10_rate=df.groupby('country')['suicides/100k pop'].sum().sort_values(ascending=False).head(10)

explode=[0.03,0.03,0.03,0.03,0.03,0.03,0.03,0.03,0.03,0.03]
fig,ax=plt.subplots(1,2,figsize=(16, 6))
# Bar chart on the left, pie chart on the right
top10_rate.plot(kind='bar',cmap='Set2',ax=ax[0])
top10_rate.plot(kind='pie',autopct='%.1f%%',explode=explode,startangle=15,cmap='Wistia',ax=ax[1])
plt.show()
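
Summing the `suicides/100k pop` column adds up one rate per row, so a country with more rows can rank higher simply because it has more data. An alternative, sketched here on a made-up frame (`toy`, with hypothetical countries and numbers), is to pool the counts first and derive a single rate from total suicides and total population.

```python
import pandas as pd

# Toy stand-in for the real dataset; all values are made up for illustration.
# Country B has twice as many rows as country A but a lower rate in every row.
toy = pd.DataFrame({
    "country": ["A", "A", "B", "B", "B", "B"],
    "suicides_no": [10, 10, 6, 6, 6, 6],
    "population": [100_000] * 6,
})
toy["suicides/100k pop"] = toy["suicides_no"] / toy["population"] * 100_000

# Approach used in the cell above: add up the per-row rates.
summed_rate = toy.groupby("country")["suicides/100k pop"].sum()

# Pooled alternative: total suicides divided by total population.
totals = toy.groupby("country")[["suicides_no", "population"]].sum()
pooled_rate = totals["suicides_no"] / totals["population"] * 100_000

print(summed_rate.sort_values(ascending=False))
print(pooled_rate.sort_values(ascending=False))
```

In this toy example the summed rates put B first, while the pooled rate puts A first. Whether the ranking changes on the real data is something to check by running the same steps on `df`.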

## What about the age?

In [None]:
# Total suicides per age group
suicide_age=df.groupby('age')['suicides_no'].sum()

fig,ax=plt.subplots(1,2,figsize=(15, 6))
explode=[0.05,0.05,0.05,0.1,0.05,0.05]
# Bar chart sorted from highest to lowest; pie chart keeps the index order
suicide_age.sort_values(ascending=False).plot(kind='bar',cmap='Set2',ax=ax[0])
ax[1].pie(suicide_age.values,explode=explode,labels=suicide_age.index,autopct='%1.1f%%',shadow=True,startangle=7)
plt.show()

In [None]:
# Sum of the per-row rates per age group
suicide_age=df.groupby('age')['suicides/100k pop'].sum()

fig,ax=plt.subplots(1,2,figsize=(15, 6))
explode=[0.05,0.05,0.05,0.1,0.05,0.05]
# Bar chart sorted from highest to lowest; pie chart keeps the index order
suicide_age.sort_values(ascending=False).plot(kind='bar',cmap='Set2',ax=ax[0])
ax[1].pie(suicide_age.values,explode=explode,labels=suicide_age.index,autopct='%1.1f%%',shadow=True,startangle=7)
plt.show()

## And the gender?

In [None]:
# Total suicides per sex
suicide_sex=df.groupby('sex')['suicides_no'].sum()

fig,ax=plt.subplots(1,2,figsize=(15, 6))
explode=[0.05,0.05]
# Bar chart on the left, pie chart on the right
suicide_sex.plot(kind='bar',cmap='Set2',ax=ax[0])
ax[1].pie(suicide_sex.values,explode=explode,labels=suicide_sex.index,autopct='%1.1f%%',shadow=True,startangle=90)
plt.show()

In [None]:
# Sum of the per-row rates per sex
suicide_sex=df.groupby('sex')['suicides/100k pop'].sum()

fig,ax=plt.subplots(1,2,figsize=(15, 6))
explode=[0.05,0.05]
# Bar chart on the left, pie chart on the right
suicide_sex.plot(kind='bar',cmap='Set2',ax=ax[0])
ax[1].pie(suicide_sex.values,explode=explode,labels=suicide_sex.index,autopct='%1.1f%%',shadow=True,startangle=90)
plt.show()
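
The introduction quotes male rates at 1.5 to 3.5 times the female rate, so a single male-to-female ratio is a handy companion to the charts. Here is a minimal sketch on a made-up frame (`toy`, with hypothetical numbers). It assumes the `sex` column holds the labels `male` and `female`, and it pools counts and population before dividing.

```python
import pandas as pd

# Toy stand-in for the real dataset; all values are made up for illustration.
toy = pd.DataFrame({
    "sex": ["male", "female", "male", "female"],
    "suicides_no": [30, 10, 45, 15],
    "population": [100_000, 100_000, 150_000, 150_000],
})

# Pool suicides and population per sex, then derive one rate per 100k for each.
by_sex = toy.groupby("sex")[["suicides_no", "population"]].sum()
rate = by_sex["suicides_no"] / by_sex["population"] * 100_000

# How many times higher the male rate is than the female rate.
ratio = rate["male"] / rate["female"]
print(rate)
print(f"male-to-female ratio: {ratio:.2f}")
```

Swapping `toy` for `df` gives the ratio for the actual dataset, which can then be compared with the range quoted in the introduction.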

## Count of generation by age group

In [None]:
plt.figure(figsize=(12, 6))
df.groupby('age')['generation'].value_counts().sort_values(ascending=False).head(10).plot(kind='bar');

## Which generation has the highest suicide rate?

In [None]:
# Total suicides per generation
generation_suicide=df.groupby('generation')['suicides_no'].sum()

fig,ax=plt.subplots(1,2,figsize=(15, 6))
explode=(0.05,0.05,0.05,0.1,0.05,0.05)
# Bar chart sorted from highest to lowest; pie chart keeps the index order
generation_suicide.sort_values(ascending=False).plot(kind='bar',cmap='Set2',ax=ax[0])
ax[1].pie(generation_suicide.values,explode=explode,labels=generation_suicide.index,autopct='%1.1f%%',shadow=True,startangle=0)
plt.show()

In [None]:
# Sum of the per-row rates per generation
generation_suicide=df.groupby('generation')['suicides/100k pop'].sum()

fig,ax=plt.subplots(1,2,figsize=(15, 6))
explode=(0.05,0.05,0.05,0.1,0.05,0.05)
# Bar chart sorted from highest to lowest; pie chart keeps the index order
generation_suicide.sort_values(ascending=False).plot(kind='bar',cmap='Set2',ax=ax[0])
ax[1].pie(generation_suicide.values,explode=explode,labels=generation_suicide.index,autopct='%1.1f%%',shadow=True,startangle=0)
plt.show()

## Which year saw the most suicides?

In [None]:
plt.figure(figsize=(12,6))
# No hue is set, so the palette argument would be ignored; it is dropped here
sns.lineplot(x='year',y='suicides_no',marker='o',data=df);

In [None]:
plt.figure(figsize=(12,6))
# No hue is set, so the palette argument would be ignored; it is dropped here
sns.lineplot(x='year',y='suicides/100k pop',marker='o',data=df);
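
By default `sns.lineplot` draws the mean of all rows for each year (with a confidence band), not the yearly total. To answer the question in the heading directly, one option is to sum per year and pick the maximum. The sketch below uses a made-up frame (`toy`, with hypothetical values) in place of `df`.

```python
import pandas as pd

# Toy stand-in for the real dataset; all values are made up for illustration.
toy = pd.DataFrame({
    "year": [2000, 2000, 2001, 2001, 2002],
    "suicides_no": [10, 20, 25, 15, 5],
})

# Total suicides recorded in each year.
yearly_total = toy.groupby("year")["suicides_no"].sum()

# Year with the largest total.
peak_year = yearly_total.idxmax()
print(yearly_total)
print("peak year:", peak_year)
```

On the real data, `yearly_total.plot(marker='o')` would give a line of totals to set beside the mean-based plots above.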

## Differentiating by gender

In [None]:
plt.figure(figsize=(12,6))
sns.lineplot(x='year',y='suicides_no',hue='sex',marker='o',data=df);

In [None]:
plt.figure(figsize=(12,6))
sns.lineplot(x='year',y='suicides/100k pop',hue='sex',marker='o',data=df,palette='Set2');

In [None]:
plt.figure(figsize=(12,6))
sns.scatterplot(x='year',y='suicides_no',hue='sex',data=df);

In [None]:
plt.figure(figsize=(12,6))
sns.scatterplot(x='year',y='suicides/100k pop',hue='sex',data=df);

## Differentiating by age

In [None]:
plt.figure(figsize=(12,6))
sns.lineplot(x='year',y='suicides_no',hue='age',marker='o',data=df);

In [None]:
plt.figure(figsize=(12,6))
sns.lineplot(x='year',y='suicides/100k pop',hue='age',marker='o',data=df,palette='Set2')
plt.legend(loc='upper right', bbox_to_anchor=(0.4,0.7,0.7, 0.7));

## Let's break it down by year for a better understanding

In [None]:
sns.catplot(x='age',y='suicides_no',col='year',data=df,kind='bar',col_wrap=3,palette='Set2');

## Adding the gender hue

In [None]:
sns.catplot(x='age',y='suicides_no',hue='sex',col='year',data=df,kind='bar',col_wrap=3,palette='Set2');

## Which generation has seen more suicides by year?

In [None]:
plt.figure(figsize=(12,6))
sns.lineplot(x='year',y='suicides_no',hue='generation',marker='o',data=df)
plt.legend(loc='upper right', bbox_to_anchor=(0.4,0.7,0.7, 0.7));

In [None]:
plt.figure(figsize=(12,6))
sns.lineplot(x='year',y='suicides/100k pop',hue='generation',marker='o',data=df)
plt.legend(loc='upper right', bbox_to_anchor=(0.4,0.7,0.7, 0.7));

## Conclusion

1. More men than women die by suicide.
2. The Russian Federation has the highest number of suicides.
3. People aged 35-54 account for the most suicides in absolute numbers. Per 100,000 people, however, those aged 75 and over have the highest rate.

# SUICIDE IS NEVER AN OPTION!
