# Introduction

This notebook analyses suicide cases using features included in the dataset, such as population, GDP and generation. The questions listed below are answered along the way.

<font color = 'blue'>
Content: 

1. [Load and Check Data](#1)
1. [Variable Description](#2)
1. [Missing Value Detection](#3)
1. [Basic Data Analysis](#4)
    * [Value amounts in the features](#5)
    * [Grouping by some features](#6)
    * [Pivot Tables](#7)
1. [Data Visualization](#8)
    * [Correlation Between Year -- Suicide Numbers -- Population -- Suicides per 100k Population -- GDP for year -- GDP per capita](#9)
    * [Numerical Variables](#10)
    * [Categorical Variables](#11)
    * [Do men or women commit suicide more?](#12)
    * [Visualize the ranking of the countries by the number of suicides](#13)
    * [Visualize the distribution of the number of suicides by years](#14)
    * [Visualize the 5 countries with the highest suicide rate in the time series](#15)
    * [Visualize the number of suicides by age groups](#16)
    * [Visualize the distribution of the number of suicides by generation](#17)
    * [Visualize the breakdown of the 2nd question by gender](#18)
    * [Visualize the breakdown of the 5th question by gender](#19)
    * [Visualize the breakdown of the 6th question by gender](#20)
    * [Visualize suicide rates in 1995 using word cloud on a country basis](#21)
    * [Visualize relations between gdp_for_year - population - suicides_number - gdp_per_capital](#22)
    * [Extra: Examine suicide cases in the Republic of Turkey by gender, age, year and generation](#23)
1. [Conclusion](#24)


In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
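# "seaborn-whitegrid" was renamed to "seaborn-v0_8-whitegrid" in Matplotlib 3.6
plt.style.use("seaborn-v0_8-whitegrid" if "seaborn-v0_8-whitegrid" in plt.style.available else "seaborn-whitegrid")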
import seaborn as sns
from wordcloud import WordCloud

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 5GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

<a id = "1"></a><br>

# Load and Check Data

This data set:
* Is 2-dimensional
* Includes 27820 rows and 12 columns, i.e. 333840 values in total
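
The figures above are tied together: the total is simply rows times columns. A minimal sketch on a made-up frame (the `toy` values are illustrative and not taken from the dataset) shows what `ndim`, `shape` and `size` report:

```python
import pandas as pd

# Toy frame with made-up values, only to illustrate ndim, shape and size
toy = pd.DataFrame({
    "country": ["A", "A", "B"],
    "year": [2000, 2001, 2000],
    "suicides_number": [10, 12, 7],
})

n_rows, n_cols = toy.shape
print(toy.ndim)   # number of axes: rows and columns
print(toy.shape)  # (rows, columns)
print(toy.size)   # total number of cells, rows * columns

# The same identity holds for the figures quoted above
assert 27820 * 12 == 333840
```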

In [None]:
#Load data
suicideDataRaw = pd.read_csv("../input/suicide-rates-overview-1985-to-2016/master.csv")
suicideData = suicideDataRaw.copy()

In [None]:
suicideData.ndim

In [None]:
suicideData.shape

In [None]:
suicideData.size

In [None]:
suicideData.columns

In [None]:
suicideData.rename(columns={'suicides_no':'suicides_number',
                            'suicides/100k pop':'suicides_per_100k_pop',
                            ' gdp_for_year ($) ':'gdp_for_year', 
                       'gdp_per_capita ($)':'gdp_per_capital'}, inplace=True)
suicideData.columns = suicideData.columns.str.replace(" ","_")
suicideData.columns = suicideData.columns.str.lower()
suicideData.columns

In [None]:
#Observations in the first 5 rows of the data set
suicideData.head()

In [None]:
#Remove country-year feature
suicideData.drop("country-year", axis = 1, inplace = True)

In [None]:
#5 random rows of observation inside the dataset
suicideData.sample(5)

In [None]:
#Some basic statistical details of the numerical values in the dataset
suicideData.describe().T

<a id = "2"></a><br>

# Variable Description
* Country: The dataset covers 101 of the 195 countries in the world.
* Year: The records span the years 1985 to 2016.
* Sex: Male/Female
* Age: Age is divided into six intervals. (5-14 / 15-24 / 25-34 / 35-54 / 55-74 / 75+)
* Generation: Six generations are included in this dataset. (Generation X / Silent / Millenials / Boomers / G.I. Generation / Generation Z)
* Population: Number of people
* Number of Suicides: Number of suicides
* Suicides per 100k people: Number of suicides divided by the population and multiplied by 100,000.
* GDP for year: Gross Domestic Product of the country for that year.
* GDP per capita: GDP divided by the total population of the country for that year (stored as `gdp_per_capital` in this notebook).
* HDI for year: Human Development Index, an index that measures life expectancy, income and education.
* float64(2): Suicides per 100k people, HDI for year
* int64(4): Year, Number of Suicides, Population size, GDP per capita
* object(5): Country, Sex, Age, GDP for year, Generation
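
As a quick illustration of the "Suicides per 100k people" definition above, the sketch below recomputes the rate from a count and a population. The numbers are made up for the example; only the column names follow this notebook.

```python
import pandas as pd

# Made-up counts and populations, for illustration only
toy = pd.DataFrame({
    "suicides_number": [21, 0, 150],
    "population": [312_900, 21_800, 1_500_000],
})

# Suicides per 100k people = suicides / population * 100,000
toy["suicides_per_100k_pop"] = (
    toy["suicides_number"] / toy["population"] * 100_000
).round(2)

print(toy)
```

Because the rate is scaled by population, a row with many suicides can still have a lower rate than a row with few.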

In [None]:
suicideData.info()

<a id = "3"></a><br>
# Missing Value Detection
* No missing data was detected in any feature except HDI for year.
* The HDI for year feature was removed from the dataset because roughly three quarters of its values are missing.
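
One way to make this kind of decision explicit is to compute the share of missing values per column and compare it with a cut-off. The sketch below uses a made-up frame and a hypothetical `threshold` of 0.5; neither comes from the original analysis.

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values; "hdi_for_year" is mostly missing on purpose
toy = pd.DataFrame({
    "year": [1990, 1991, 1992, 1993],
    "suicides_number": [5, 8, 6, 9],
    "hdi_for_year": [np.nan, np.nan, 0.71, np.nan],
})

# Share of missing values per column (0.0 = complete, 1.0 = empty)
missing_share = toy.isnull().mean()
print(missing_share)

# Hypothetical cut-off: drop columns with more than half of their values missing
threshold = 0.5
to_drop = missing_share[missing_share > threshold].index.tolist()
cleaned = toy.drop(columns=to_drop)
print(cleaned.columns.tolist())
```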

In [None]:
#Missing Value Detection by Features
suicideData.isnull().sum()

In [None]:
#Remove feature
suicideData.drop("hdi_for_year", axis = 1, inplace = True)

In [None]:
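#Convert comma-separated GDP strings such as "1,234,567" to float
suicideData.gdp_for_year = suicideData.gdp_for_year.apply(lambda x: float(''.join(x.split(','))))
#Drop the " years" suffix, including its leading space, from the age labels
suicideData.age = suicideData.age.apply(lambda x: x.replace(" years", ""))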

In [None]:
suicideData.isnull().sum().sum()

In [None]:
suicideData.sample(5)

<a id = "4"></a><br>
# Basic Data Analysis
1. Value amounts in the features
2. Grouping by some features
3. Pivot Tables
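
Before running these three steps on the full dataset, it can help to see them on a handful of rows. The sketch below uses made-up rows that only mimic the column names of this notebook.

```python
import pandas as pd

# Made-up rows that mimic the column names used in this notebook
toy = pd.DataFrame({
    "country": ["A", "A", "A", "A", "B", "B"],
    "sex": ["male", "female", "male", "female", "male", "female"],
    "year": [2000, 2000, 2001, 2001, 2000, 2000],
    "suicides_number": [30, 10, 34, 12, 8, 2],
})

# 1. Value amounts: how many rows each category has
country_counts = toy["country"].value_counts()

# 2. Grouping: average suicides_number per gender
mean_by_sex = toy.groupby("sex")["suicides_number"].mean()

# 3. Pivot table: the same average, split by gender (rows) and country (columns)
pivot = toy.pivot_table("suicides_number", index="sex", columns="country")

print(country_counts, mean_by_sex, pivot, sep="\n\n")
```

`pivot_table` aggregates with the mean by default, so each cell is the average over the rows that share that gender and country.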

<a id = "5"></a><br>
## Value amounts in the features

In [None]:
suicideData["country"].value_counts().head()

In [None]:
suicideData["country"].value_counts().tail()

In [None]:
suicideData["year"].value_counts().head()

In [None]:
suicideData["year"].value_counts().tail()

In [None]:
#Exclude 2016, which has far fewer records than the other years
suicideData = suicideData.query("year != 2016")

In [None]:
suicideData["year"].value_counts().tail()

In [None]:
suicideData["sex"].value_counts()

In [None]:
suicideData["age"].value_counts()

In [None]:
suicideData["suicides_number"].value_counts().head()

In [None]:
suicideData["population"].value_counts().head()

In [None]:
suicideData["suicides_per_100k_pop"].value_counts().head()

In [None]:
suicideData["gdp_for_year"].value_counts().head()

In [None]:
suicideData["gdp_per_capital"].value_counts().head()

In [None]:
suicideData["generation"].value_counts()

<a id = "6"></a><br>
## Grouping by some features

In [None]:
#Average numbers of suicide by gender
suicideData.groupby("sex")["suicides_number"].mean()

In [None]:
#Average suicide rates by gender
suicideData.groupby("sex")["suicides_per_100k_pop"].mean()

In [None]:
#Average suicide rates by country
#The countries with the highest suicide rates are below
suicideData.groupby("country", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False).head()

In [None]:
#Average suicide rates by country
#The countries with the lowest suicide rates are below
suicideData.groupby("country", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False).tail()

In [None]:
#Average numbers of suicide by country
#The countries with the highest numbers of suicide are below
suicideData.groupby("country", as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False).head()

In [None]:
#Average numbers of suicide by country
#The countries with the lowest numbers of suicide are below
suicideData.groupby("country", as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False).tail()

In [None]:
#Average numbers of suicide by year
suicideData.groupby("year", as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False)

In [None]:
#Average suicide rates by year
suicideData.groupby("year", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False)

In [None]:
#Average suicide numbers by age
suicideData.groupby("age", as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False)

In [None]:
#Average suicide rates by age
suicideData.groupby("age", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False)

In [None]:
#Average suicide numbers by generation
suicideData.groupby("generation", as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False)

In [None]:
#Average suicide rates by generation
suicideData.groupby("generation", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False)

<a id = "7"></a><br>
## Pivot Tables

In [None]:
#Average suicide numbers by gender and country
suicideData.pivot_table("suicides_number", index = "sex", columns = "country")

In [None]:
#Average suicide numbers by gender and year
suicideData.pivot_table("suicides_number", index = "sex", columns = "year")

<a id = "8"></a><br>
# Data Visualization

<a id = "9"></a><br>
## Correlation Between Year -- Suicide Numbers -- Population -- Suicides per 100k Population -- GDP for year -- GDP per capita
* The heat map below shows a high positive correlation between Population and GDP for year, Population and Suicide Numbers, and GDP for year and Suicide Numbers.
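
Part of the correlation between raw counts and population is a scale effect: larger groups tend to produce larger counts. The sketch below, on made-up values, suggests how a per-100k rate can behave quite differently from the count it is derived from. It is an illustration under toy assumptions, not a result from the dataset.

```python
import pandas as pd

# Made-up values: the counts grow with population, the rates do not
toy = pd.DataFrame({
    "population": [100_000, 200_000, 400_000, 800_000],
    "suicides_number": [12, 20, 44, 80],
})
toy["suicides_per_100k_pop"] = toy["suicides_number"] / toy["population"] * 100_000

# Pearson correlation matrix, the same quantity a correlation heat map displays
corr = toy.corr()
print(corr.round(2))
```

In this toy frame the count is almost perfectly correlated with population, while the rate is not positively correlated with it at all.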

In [None]:
numericVars = ["year","suicides_number", "population", "suicides_per_100k_pop", 
         "gdp_for_year", "gdp_per_capital"]

In [None]:
#Heat Map
sns.heatmap(suicideData[numericVars].corr(), annot = True, fmt = ".2f")
plt.show()

<a id = "10"></a><br>
## Numerical Variables
* Histograms show how the observations of each numerical variable are distributed.

In [None]:
#Histogram
def plot_hist(variable):
    plt.figure(figsize = (9,3))
    plt.hist(suicideData[variable])
    plt.xlabel(variable)
    plt.ylabel("Frequency")
    plt.title("{} distribution with hist".format(variable))
    plt.show()

for i in numericVars:
    plot_hist(i)

<a id = "11"></a><br>
## Categorical Variables
* Bar plots show how many observations fall into each category of the categorical variables.

In [None]:
#Gender show bar plot
sns.set(style='whitegrid')
sexCounts = suicideData['sex'].value_counts()
#The x-axis ticks already name each bar, so no hue or legend is needed
#(the hard-coded hue list could mislabel bars if the count order changed)
ax=sns.barplot(x=sexCounts.index,
               y=sexCounts.values,
               palette="Blues_d")
plt.xlabel('Gender')
plt.ylabel('Frequency')
plt.title('Show of Gender Bar Plot')
plt.show()

In [None]:
#Age show bar plot
sns.set(style='whitegrid')
ax=sns.barplot(x=suicideData['age'].value_counts().index,
               y=suicideData['age'].value_counts().values,
               palette="Blues_d")
plt.xlabel('Age')
plt.ylabel('Frequency')
plt.title('Show of Age Bar Plot')
plt.show()

In [None]:
#Generation show bar plot
sns.set(style='whitegrid')
ax=sns.barplot(x=suicideData['generation'].value_counts().index,
               y=suicideData['generation'].value_counts().values,
               palette="Blues_d")
plt.xticks(rotation = 45)
plt.xlabel('Generation')
plt.ylabel('Frequency')
plt.title('Show of Generation Bar Plot')
plt.show()

In [None]:
#Countries show bar plot
f, ax = plt.subplots(figsize=(6, 15))
sns.set(style='whitegrid')
ax=sns.barplot(x=suicideData['country'].value_counts().values,
               y=suicideData['country'].value_counts().index,
               palette="Blues_d")
plt.xlabel('Frequency')
plt.ylabel('Country')
plt.title('Show of Country Bar Plot')
plt.show()

<a id = "12"></a><br>
## 1) Do men or women commit suicide more?

In [None]:
plt.figure(figsize=(8,5))
sns.barplot(x = "sex", 
            y = "suicides_number", 
            data = suicideData)
plt.xlabel('Gender')
plt.ylabel('Number of Suicide')
plt.title('Suicide Numbers by Gender')
plt.show()

In [None]:
plt.figure(figsize=(8,5))
sns.barplot(x = "sex", 
            y = "suicides_per_100k_pop", 
            data = suicideData)
plt.xlabel('Gender')
plt.ylabel('Suicide Ratio')
plt.title('Suicide Ratio by Gender')
plt.show()

<a id = "13"></a><br>
## 2) Visualize the ranking of the countries by the number of suicides.

In [None]:
countrySN = suicideData.groupby("country", as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False)
plt.figure(figsize=(6,15))
sns.barplot(x = "suicides_number", 
            y = "country", 
            data = countrySN)
plt.xlabel('Number of Suicide')
plt.ylabel('Country')
plt.title('Suicide Numbers by Countries')
plt.show()

In [None]:
countrySR = suicideData.groupby("country", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False)
plt.figure(figsize=(6,15))
sns.barplot(x = "suicides_per_100k_pop", 
            y = "country", 
            data = countrySR)
plt.xlabel('Suicide Ratio')
plt.ylabel('Country')
plt.title('Suicide Ratio by Countries')
plt.show()

<a id = "14"></a><br>
## 3) Visualize the distribution of the number of suicides by years.

In [None]:
yearMeanSN = suicideData.groupby("year", as_index = False)["suicides_number"].mean()
plt.figure(figsize=(15,6))
sns.pointplot(x = "year", 
              y = "suicides_number", 
              data = yearMeanSN)
plt.xlabel('Year')
plt.ylabel('Average Number of Suicide')
plt.title('Average Suicide Numbers by Years')
plt.show()

In [None]:
yearSumSN = suicideData.groupby("year", as_index = False)["suicides_number"].sum()
plt.figure(figsize=(15,6))
sns.pointplot(x = "year", 
              y = "suicides_number", 
              data = yearSumSN)
plt.xlabel('Year')
plt.ylabel('Number of Suicide')
plt.title('Total Suicide Numbers by Years')
plt.show()

In [None]:
yearMeanSR = suicideData.groupby("year", as_index = False)["suicides_per_100k_pop"].mean()
plt.figure(figsize=(15,6))
sns.pointplot(x = "year", 
              y = "suicides_per_100k_pop", 
              data = yearMeanSR)
plt.xlabel('Year')
plt.ylabel('Suicide Ratio')
plt.title('Suicide Ratios by Years')
plt.show()

<a id = "15"></a><br>
## 4) Visualize the 5 countries with the highest suicide rate in the time series.

In [None]:
lithuania = suicideData[suicideData["country"] == "Lithuania"]
srilanka = suicideData[suicideData["country"] == "Sri Lanka"]
russia = suicideData[suicideData["country"] == "Russian Federation"]
hungary = suicideData[suicideData["country"] == "Hungary"]
belarus = suicideData[suicideData["country"] == "Belarus"]
topFiveCountries = pd.concat([lithuania, srilanka, russia, hungary, belarus], ignore_index = True)
topFive = topFiveCountries.groupby(["country", "year"], as_index = False)["suicides_per_100k_pop"].mean()

plt.figure(figsize=(15,8))
sns.lineplot(x="year", 
             y="suicides_per_100k_pop",
             hue="country",
             data=topFive)
plt.xlabel('Year')
plt.ylabel('Suicide Ratio')
plt.title('Suicide Ratios by Years')
plt.show()

<a id = "16"></a><br>
## 5) Visualize the number of suicides by age groups.

In [None]:
ageDataSN = suicideData.groupby("age", as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "age", 
            y = "suicides_number", 
            data = ageDataSN)
plt.xlabel('Age Group')
plt.ylabel('Number of Suicide')
plt.title('Suicide Numbers by Age Groups')
plt.show()

In [None]:
ageDataSR = suicideData.groupby("age", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "age", 
            y = "suicides_per_100k_pop", 
            data = ageDataSR)
plt.xlabel('Age Group')
plt.ylabel('Suicide Ratio')
plt.title('Suicide Ratio by Age Groups')
plt.show()

<a id = "17"></a><br>
## 6) Visualize the distribution of the number of suicides by generation.

In [None]:
genDataSN = suicideData.groupby("generation", as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "generation",
            y = "suicides_number",
            data = genDataSN)
plt.xlabel('Generation')
plt.ylabel('Number of Suicide')
plt.title('Suicide Numbers by Generations')
plt.show()

In [None]:
genDataSR = suicideData.groupby("generation", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "generation", 
            y = "suicides_per_100k_pop", 
            data = genDataSR)
plt.xlabel('Generations')
plt.ylabel('Suicide Ratio')
plt.title('Suicide Ratios by Generations')
plt.show()

<a id = "18"></a><br>
## 7) Visualize the breakdown of the 2nd question by gender.

In [None]:
csDataSN = suicideData.groupby(["country","sex"], as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False)
plt.figure(figsize=(8,25))
sns.barplot(x = "suicides_number", 
            y = "country", 
            hue="sex", 
            data = csDataSN)
plt.xlabel('Number of Suicide')
plt.ylabel('Country')
plt.title('Suicide Numbers by Countries and Gender')
plt.show()

In [None]:
csDataSR = suicideData.groupby(["country","sex"], as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False)
plt.figure(figsize=(8,25))
sns.barplot(x = "suicides_per_100k_pop", 
            y = "country", 
            hue="sex", 
            data = csDataSR)
plt.xlabel('Suicide Ratio')
plt.ylabel('Country')
plt.title('Suicide Ratios by Countries and Gender')
plt.show()

<a id = "19"></a><br>
## 8) Visualize the breakdown of the 5th question by gender.

In [None]:
asDataSN = suicideData.groupby(["age","sex"], as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "age", 
            y = "suicides_number", 
            hue="sex", 
            data = asDataSN)
plt.xlabel('Age Group')
plt.ylabel('Suicide Number')
plt.title('Suicide Numbers by Age and Gender')
plt.show()

In [None]:
asDataSR = suicideData.groupby(["age","sex"], as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "age", 
            y = "suicides_per_100k_pop", 
            hue="sex", 
            data = asDataSR)
plt.xlabel('Age Group')
plt.ylabel('Suicide Ratio')
plt.title('Suicide Ratio by Age and Gender')
plt.show()

<a id = "20"></a><br>
## 9) Visualize the breakdown of the 6th question by gender.

In [None]:
gsDataSN = suicideData.groupby(["generation","sex"], as_index = False)["suicides_number"].mean().sort_values(by="suicides_number",ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "generation", 
            y = "suicides_number", 
            hue="sex", 
            data = gsDataSN)
plt.xlabel('Generation')
plt.ylabel('Suicide Number')
plt.title('Suicide Numbers by Generation and Gender')
plt.show()

In [None]:
gsDataSR = suicideData.groupby(["generation","sex"], as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop",ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "generation", 
            y = "suicides_per_100k_pop", 
            hue="sex", 
            data = gsDataSR)
plt.xlabel('Generation')
plt.ylabel('Suicide Ratio')
plt.title('Suicide Ratio by Generation and Gender')
plt.show()

<a id = "21"></a><br>
## 10) Visualize suicide rates in 1995 using word cloud on a country basis.

In [None]:
#Average suicide rate of each country in 1995
rates95 = suicideData[suicideData.year == 1995].groupby("country")["suicides_per_100k_pop"].mean()
plt.subplots(figsize=(12,12))
#Size each country name by its rate; joining the names into one string would
#split multi-word names and give every word the same weight
wordcloud = WordCloud(
                          background_color='white',
                          width=512,
                          height=384
                         ).generate_from_frequencies(rates95.to_dict())
plt.imshow(wordcloud)
plt.axis('off')
plt.savefig('graph.png')

plt.show()

<a id = "22"></a><br>
## 11) Visualize gdp_for_year & population & suicides_number & gdp_per_capital relations.

In [None]:
#Suicide Numbers vs Population by Generation and Gender
plt.figure(figsize=(12,8))
ax = sns.scatterplot(x="suicides_number", 
                     y="population",
                     hue="generation", 
                     style="sex", 
                     data=suicideData)
plt.xlabel('Suicide Number')
plt.ylabel('Population')
plt.title('Suicide Numbers vs Population by Generation and Gender')
plt.show()

In [None]:
#Suicide Numbers vs GDP for Year by Generation and Gender
plt.figure(figsize=(12,8))
ax = sns.scatterplot(x="suicides_number", 
                     y="gdp_for_year",
                     hue="generation", 
                     style="sex", 
                     data=suicideData)
plt.xlabel('Suicide Number')
plt.ylabel('GDP for Year')
plt.title('Suicide Numbers vs GDP for Year by Generation and Gender')
plt.show()

In [None]:
#GDP per Capita vs GDP for Year by Generation and Gender
plt.figure(figsize=(12,8))
ax = sns.scatterplot(x="gdp_per_capital", 
                     y="gdp_for_year",
                     hue="generation", 
                     style="sex", 
                     data=suicideData)
plt.xlabel('GDP per Capita')
plt.ylabel('GDP for Year')
plt.title('GDP per Capita vs GDP for Year by Generation and Gender')
plt.show()

In [None]:
#GDP for Year vs Population by Generation and Gender
plt.figure(figsize=(12,8))
ax = sns.scatterplot(x="gdp_for_year", 
                     y="population",
                     hue="generation", 
                     style="sex", data=suicideData)
plt.xlabel('GDP for Year')
plt.ylabel('Population')
plt.title('GDP for Year vs Population by Generation and Gender')
plt.show()

<a id = "23"></a><br>
## EXTRA: Examine suicide cases in the Republic of Turkey by gender, age, year and generation.

In [None]:
tr = suicideData[suicideData["country"] == "Turkey"]
tr

In [None]:
#Year
trYearSum = tr.groupby("year", as_index= False)["suicides_number"].sum()
plt.figure(figsize=(15,6))
sns.pointplot(x = "year", 
              y = "suicides_number",
              color="#bb3f3f",
              data = trYearSum)
plt.xlabel('Year')
plt.ylabel('Total Number of Suicide')
plt.title('Total Suicide Numbers by Years in Turkey Republic')
plt.show()

In [None]:
#Gender
trGenderSum = tr.groupby("sex", as_index= False)["suicides_number"].sum()
trGenderSum

In [None]:
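# Data to plot, read from trGenderSum instead of hard-coded totals
genderTotals = trGenderSum.set_index("sex")["suicides_number"]
labels = 'Men', 'Women'
sizes = [genderTotals["male"], genderTotals["female"]]
colors = ['lightskyblue', 'lightcoral']
explode = (0.1, 0)  # explode 1st slice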

# Plot
plt.figure(figsize = (7,7))
plt.pie(sizes, explode=explode, labels=labels, colors=colors,
autopct='%1.1f%%', shadow=True, startangle=140)

plt.axis('equal')
plt.show()

In [None]:
#Age
trAgeSum = tr.groupby("age", as_index = False)["suicides_number"].sum().sort_values(by="suicides_number", ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "age", 
            y = "suicides_number", 
            data = trAgeSum)
plt.xlabel('Age Group')
plt.ylabel('Total Suicide Number')
plt.title('Total Suicide Numbers by Age Groups in Turkey Republic')
plt.show()

In [None]:
trAgeRateAvg = tr.groupby("age", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop", ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "age", 
            y = "suicides_per_100k_pop", 
            data = trAgeRateAvg)
plt.xlabel('Age Group')
plt.ylabel('Average Suicide Ratio')
plt.title('Average Suicide Ratio by Age Groups in Turkey Republic')
plt.show()

In [None]:
#Generation
trGenSum = tr.groupby("generation", as_index = False)["suicides_number"].sum().sort_values(by="suicides_number", ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "generation", 
            y = "suicides_number", 
            data = trGenSum)
plt.xlabel('Generation')
plt.ylabel('Total Suicide Number')
plt.title('Total Suicide Numbers by Generations in Turkey Republic')
plt.show()

In [None]:
trGenRateAvg = tr.groupby("generation", as_index = False)["suicides_per_100k_pop"].mean().sort_values(by="suicides_per_100k_pop", ascending = False)
plt.figure(figsize=(8,5))
sns.barplot(x = "generation", 
            y = "suicides_per_100k_pop", 
            data = trGenRateAvg)
plt.xlabel('Generation')
plt.ylabel('Average Suicide Ratio')
plt.title('Average Suicide Ratio by Generation in Turkey Republic')
plt.show()

<a id = "24"></a><br>
# CONCLUSION
The data was examined in two ways: by number and by rate. This is because 10 suicide cases in a society of 100 people and 10 suicide cases in a society of 1 million people do not tell us the same thing. Which measure we use changes how we read the picture.
* Between 1985 and 2015, the suicide rate of men is almost 4 times that of women.
* Between 1985 and 2015, the number of suicides among men is 3.33 times that among women.
* The highest suicide rates were generally observed in countries with cold climates, such as Russia, Lithuania and Belarus.
* By number, suicide cases are most frequent in the 35-54 age group and least frequent in the 5-14 age group. By generation, the highest numbers belong to the Boomers (born between 1946 and 1964).
* By rate, suicide cases are most frequent in the 75+ age group and least frequent in the 5-14 age group. By generation, the highest rates belong to the G.I. Generation (born between 1901 and 1927).
* On the basis of both rate and number, suicides were markedly elevated between 1987 and 1995. After 1995, the rates decrease even where the numbers increase, which is consistent with growth in the world's population.
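
The two-society example from the opening of this section can be worked through in a few lines. The sketch also contrasts two ways of summarising rates: averaging row-level rates, which is what the `groupby(...).mean()` calls on `suicides_per_100k_pop` in this notebook do, and a pooled rate computed from summed counts and populations. The frame is hypothetical and the pooled rate is shown only as an alternative, not as the method used above.

```python
import pandas as pd

# Two hypothetical societies: same count, very different population
toy = pd.DataFrame({
    "society": ["small", "large"],
    "suicides_number": [10, 10],
    "population": [100, 1_000_000],
})
toy["rate_per_100k"] = toy["suicides_number"] / toy["population"] * 100_000

# Averaging the two rates weights both societies equally ...
mean_of_rates = toy["rate_per_100k"].mean()

# ... while the pooled rate weights every person equally
pooled_rate = toy["suicides_number"].sum() / toy["population"].sum() * 100_000

print(toy)
print(mean_of_rates, pooled_rate)
```

The two summaries differ by several orders of magnitude here, which is why it matters to state which one a chart or table reports.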

In short, this data set lets us evaluate suicide cases by demographic characteristics. However, it cannot tell us why they occurred, because no such information is recorded. With such a feature, many more questions could be answered.
