<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:#5642C5;
           font-size:110%;
           font-family:Verdana;
           letter-spacing:0.9px">
    <p style="padding: 10px;
              color:white;
              font-size:110%">
        If you like this notebook, please give it an <span style="color:#F28835;"><b><i>upvote</i></b></span> as it keeps me motivated to create more quality kernels.
    </p>
</div>

In [None]:
import numpy as np 
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from matplotlib import rcParams
import plotly.express as px
from plotly.offline import iplot, init_notebook_mode
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.figure_factory as ff
import folium 
from folium import plugins
rcParams['figure.figsize'] = (15,10)

# List every file available in the Kaggle input directory
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#5642C5;
           font-size:110%;
           font-family:Verdana">
    <h1 id ='loading_data' style="color:white;">1. Loading Data
    <a class="anchor-link" href="https://www.kaggle.com/shubhamksingh/suicide-data-diving-deep#loading_data" target="_self"></a>
    </h1>
</div>

In [None]:
data = pd.read_csv('../input/suicide-rates-overview-1985-to-2016/master.csv')

In [None]:
data.sample()

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#5642C5;
           font-size:110%;
           font-family:Verdana">
    <h1 id='missing_data' style="color:white;">2. Missing Data<a class="anchor-link" href="https://www.kaggle.com/shubhamksingh/suicide-data-diving-deep#missing_data" target="_self"></a></h1>
</div>

In [None]:
missing = data.isnull().sum()
missing = missing[missing>0].to_frame(name='Missing')
missing.style.background_gradient(cmap='Reds',subset=["Missing"])

`HDI for year` has 19456 missing values... well, that's a big number right there. A column this sparse will do us more harm than good. Let's get rid of it ASAP.

In [None]:
data.drop(['HDI for year'], axis=1, inplace=True)
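
Before dropping a column, it can help to look at the share of missing values rather than the raw count. The snippet below is a minimal sketch on a made-up toy frame; the 0.5 cutoff and the names `toy`, `missing_share`, `threshold` and `cleaned` are assumptions for illustration, not part of the original analysis.

```python
import pandas as pd

# Toy frame standing in for the real data (values are made up for illustration).
toy = pd.DataFrame({
    "country": ["A", "A", "B", "B"],
    "HDI for year": [None, 0.7, None, None],
    "suicides_no": [3, 5, 2, 1],
})

# Share of missing values per column, as a fraction between 0 and 1.
missing_share = toy.isnull().mean()

# Hypothetical cutoff: drop any column that is more than half empty.
threshold = 0.5
to_drop = missing_share[missing_share > threshold].index.tolist()
cleaned = toy.drop(columns=to_drop)

print(missing_share)
print(to_drop)
```

On the real data, the same `isnull().mean()` call would show how much of `HDI for year` is actually empty.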

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#5642C5;
           font-size:110%;
           font-family:Verdana">
    <h1 id='data_overview' style="color:white;">3. Data Overview<a class="anchor-link" href="https://www.kaggle.com/shubhamksingh/suicide-data-diving-deep#data_overview" target="_self"></a></h1>
</div>

In [None]:
data.shape

In [None]:
data.head(3)

In [None]:
data.tail(3)

In [None]:
data.info()

In [None]:
data.describe()

In [None]:
# Let's look at 20 random samples, shall we?
data.sample(n=20, random_state=1)

In [None]:
display(list(data.columns))

To be honest, I simply do not like the names of 3 columns. I mean, we do not want that dollar sign at the end with unnecessary spaces, and suicides_no can be changed to suicides. So, before we start our journey, I want to rename them.

In [None]:
data = data.rename(columns={'suicides_no':'suicides', ' gdp_for_year ($) ':'gdp_for_year', 'gdp_per_capita ($)':'gdp_per_capita'})

In [None]:
display(list(data.columns))

There we go, looks better!!
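
The same cleanup can be written as a small function instead of a hand-typed mapping. This is only a sketch on a toy frame: `clean_column_name` is a hypothetical helper, and the conversion at the end assumes the yearly GDP column is stored as text with thousands separators, which is worth checking with `data.info()` first.

```python
import re
import pandas as pd

def clean_column_name(name: str) -> str:
    """Strip surrounding whitespace and a trailing '($)' unit marker."""
    name = name.strip()
    return re.sub(r"\s*\(\$\)$", "", name)

# Toy frame with the two awkward column names (values are made up).
toy = pd.DataFrame({
    " gdp_for_year ($) ": ["2,156,624,900", "1,000"],
    "gdp_per_capita ($)": [796, 120],
})

# rename accepts a callable that is applied to every column name.
toy = toy.rename(columns=clean_column_name)

# Assumption: the GDP values are strings such as "2,156,624,900".
toy["gdp_for_year"] = (
    toy["gdp_for_year"].str.replace(",", "", regex=False).astype("int64")
)

print(toy.dtypes)
```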

In [None]:
display(data['country'].nunique())
display(data['country'].unique())

There are a total of 101 countries in this dataset.

In [None]:
print(f"Min Year in Dataset: {min(data.year)}\nMax Year in Dataset: {max(data.year)}")

We are dealing with data from 1985 to 2016. We will try to find good insights in this data: how suicide rates changed over time across countries and genders, along with other factors.

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">1985</h1>
</div>

In [None]:
# Dataset of the year 1985
year_1985 = data[(data['year'] == 1985)]

# Total number of suicides in year 1985 (countrywise)
year_1985 = year_1985.groupby('country')[['suicides']].sum().reset_index()

# Sort values in descending order
year_1985 = year_1985.sort_values(by='suicides', ascending=False)

# Styling output dataframe
year_1985.style.background_gradient(cmap='Reds', subset=['suicides'])

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">2016</h1>
</div>

In [None]:
# Dataset of the year 2016
year_2016 = data[(data['year'] == 2016)]

# Total number of suicides in year 2016 (countrywise)
year_2016 = year_2016.groupby('country')[['suicides']].sum().reset_index()

# Sort values in descending order
year_2016 = year_2016.sort_values(by='suicides', ascending=False)

# Styling output dataframe
year_2016.style.background_gradient(cmap='Reds', subset=['suicides'])

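If we look at the data above, we can conclude that ***more suicides were recorded in 1985*** than in 2016. Keep in mind that the two tables need not cover the same set of countries, so part of the gap may come from reporting coverage rather than a real decline. Later in this kernel we will look at the trend chart of suicide rate.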
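
Yearly totals are only comparable when the same countries report in both years. The sketch below, on a made-up toy frame, counts the countries behind each year's total and then repeats the comparison for countries present in both years; the names `coverage`, `common` and `like_for_like` are illustrative assumptions.

```python
import pandas as pd

# Toy data: three countries report in 1985, only one in 2016.
toy = pd.DataFrame({
    "country": ["A", "B", "C", "A", "A", "B"],
    "year": [1985, 1985, 1985, 2016, 2016, 1985],
    "suicides": [10, 20, 30, 5, 7, 4],
})

# How many countries stand behind each yearly total?
coverage = toy.groupby("year").agg(
    countries=("country", "nunique"),
    suicides=("suicides", "sum"),
)

# Restrict the comparison to countries that report in both years.
in_1985 = set(toy.loc[toy["year"] == 1985, "country"])
in_2016 = set(toy.loc[toy["year"] == 2016, "country"])
common = in_1985 & in_2016
like_for_like = (
    toy[toy["country"].isin(common)]
    .groupby("year")["suicides"]
    .sum()
)

print(coverage)
print(like_for_like)
```

In this toy case the raw totals fall sharply between the two years, while the like-for-like totals for the one shared country do not, which is the kind of effect to rule out on the real data.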

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#5642C5;
           font-size:110%;
           font-family:Verdana">
    <h1 id='data_viz' style="color:white;">4. Data Visualizations<a class="anchor-link" href="https://www.kaggle.com/shubhamksingh/suicide-data-diving-deep#data_viz" target="_self"></a></h1>
</div>

#### Before we start visualizing the data and extracting information from it, let's look at how many people in total committed suicide from 1985 to 2016.

In [None]:
print(f"Total number of suicides from 1985-2016: {data.suicides.sum()}")

In [None]:
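# Correlate numeric columns only; recent pandas versions raise an error on text columns
corrmat = data.select_dtypes(include='number').corr()
plt.subplots(figsize=(17,17))
plt.title("Correlation Matrix")

sns.heatmap(corrmat, vmax=0.9, square=True, cmap="Oranges", annot=True, fmt='.1f', linewidths=0.1)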

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">1985 vs 2016 (Country Wise)</h1>
</div>

In [None]:
comparison_year = pd.DataFrame({'year':[1985, 2016], 'suicides': [year_1985.suicides.sum(), year_2016.suicides.sum()]})

fig = px.bar(comparison_year, x=comparison_year['year'], y=comparison_year['suicides'], color_discrete_sequence=[["#FA4152", "#36BB91"]])
fig.update_layout(title={
                  'text': "Number of Suicides in 1985 vs 2016",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  xaxis_title="Year", 
                  yaxis_title="Number of Suicides",
                 )

fig.show()

In [None]:
fig = px.bar(year_1985, x=year_1985['suicides'], y=year_1985['country'], color=year_1985['country'], color_discrete_sequence=px.colors.qualitative.Pastel)

fig.update_layout(title={
                         'text':'Suicides in 1985 (Country wise)',
                         'y':0.98,
                         'x':0.5,
                         'xanchor': 'center',
                         'yanchor': 'top'},
                  plot_bgcolor='white', 
                  height=1000,
                  showlegend=False
                 )

fig.show()

In [None]:
fig = px.bar(year_2016, x=year_2016['suicides'], y=year_2016['country'], color=year_2016['country'], color_discrete_sequence=px.colors.qualitative.Pastel)

fig.update_layout(title={
                         'text':'Suicides in 2016 (Country wise)',
                         'y':0.98,
                         'x':0.5,
                         'xanchor': 'center',
                         'yanchor': 'top'},
                  plot_bgcolor='white', 
                  height=650
                 )

fig.show()

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">From 1985 to 2016 (The Bigger Picture)</h1>
</div>

In [None]:
year_suicides = data.groupby('year')[['suicides']].sum().reset_index()

In [None]:
year_suicides.sort_values(by='suicides', ascending=False).style.background_gradient(cmap='Reds', subset=['suicides'])

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:orange;
           font-size:110%;
           font-family:Verdana;
           letter-spacing:0.9px">
    <p style="padding: 10px;
              color:white;
              font-size:100%">
We can see from the table above that, across the countries in this dataset, the highest number of suicides was recorded in 1999.    </p>
</div>



In [None]:
fig = px.bar(year_suicides, year_suicides['year'], year_suicides['suicides'], color='year')

fig.update_layout(template='plotly_white')

fig.show()

In [None]:
country_year = data.groupby(['year','country'])[['suicides']].sum().reset_index()

# Looking at a random 1% of the data
country_year.sample(frac=0.01)

In [None]:
# For your convenience
country_year.to_csv('country_year_suicides.csv', index=False)

In [None]:
numerical_data = data.select_dtypes(exclude=['object']).copy()

numerical_data.head(3)

In [None]:
fig1 = plt.figure(figsize=(15,32))
for i in range(len(numerical_data.columns)):
    fig1.add_subplot(9, 2, i+1)
    # Pass x and y by keyword; recent seaborn versions reject them as positional arguments
    sns.scatterplot(x=numerical_data.iloc[:, i], y=numerical_data['suicides'], palette='spring', marker='D', hue=numerical_data['suicides'], legend=False)
plt.tight_layout()
plt.show()


# PLOTLY

# fig = make_subplots(rows=3, cols=2)

# fig.add_trace(
#     go.Scatter(x=numerical_data['year'], y=numerical_data['suicides']),
#     row=1, col=1
# )
# fig.add_trace(
#     go.Scatter(x=numerical_data['population'], y=numerical_data['suicides']),
#     row=1, col=2
# )
# fig.add_trace(
#     go.Scatter(x=numerical_data['suicides/100k pop'], y=numerical_data['suicides']),
#     row=2, col=1
# )
# # Update xaxis properties
# fig.update_xaxes(title_text="Year", row=1, col=1)
# fig.update_xaxes(title_text="Population", row=1, col=2)
# fig.update_xaxes(title_text="Suicides/100k Population", showgrid=False, row=2, col=1)

# # Update yaxis properties
# fig.update_yaxes(title_text="Number of Suicides", row=1, col=1)
# fig.update_yaxes(title_text="Number of Suicides", row=1, col=2)
# fig.update_yaxes(title_text="Number of Suicides", row=2, col=1)


# fig.update_layout(height=1100, title_text="Subplots based on Number of Suicides")

# fig.show()

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">Age Group</h1>
</div>

In [None]:
age_grp = data.groupby('age')[['suicides']].sum().reset_index()

age_grp.sort_values(by='suicides', ascending=False).style.background_gradient(cmap='Reds', subset=['suicides'])

In [None]:
fig = go.Figure()

fig.add_trace(go.Scatter(x=age_grp['age'],y=age_grp['suicides'], 
                         line_shape='spline',fill='tonexty')) 

fig.update_layout(title={
                        'text': "Suicide based on Age-Group",
                        'y':0.95,
                        'x':0.5,
                        'xanchor': 'center',
                        'yanchor': 'top'}, 
                  yaxis_title="Number of Suicides", 
                  xaxis_title="Age Group")

fig.update_layout(plot_bgcolor='rgb(255, 255, 255)', height=600)

fig.show()

In [None]:
fig1 = px.bar(age_grp, x='age', y='suicides', color='age')

fig1.update_layout(title={
                  'text': "Suicides based on Age-Group",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600
                 )

fig1.data[2].marker.line.width = 3
fig1.data[2].marker.line.color = "black"

# ----------------------------------------------------

fig2 = px.pie(age_grp, 'age', 'suicides', hole=.5)
fig2.update_traces(textposition='inside', textinfo='percent+label')
fig2.update_layout(title={
                        'text': "Suicide based on Age-Group",
                        'y':0.95,
                        'x':0.5,
                        'xanchor': 'center',
                        'yanchor': 'top'},
                         showlegend=False,
                         height=600
                        )

fig2.data[0].marker.line.width = 2
fig2.data[0].marker.line.color = "black"

fig1.show()
fig2.show()

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:orange;
           font-size:110%;
           font-family:Verdana;
           letter-spacing:0.9px">
    <p style="padding: 10px;
              color:white;
              font-size:100%">
        People aged 35-54 years account for the highest number of suicides in the data, which spans 1985 to 2016. The plots above also show that the 25-34 and 55-74 age groups account for a large share. These are raw counts over age bands of unequal width, so they show where most suicides occur rather than which group is at the highest risk.
    </p>
</div>
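
The age labels are plain strings, so they sort alphabetically and `5-14 years` ends up after `35-54 years` in tables and on chart axes. One way around this is an ordered categorical. The sketch below uses a toy frame, and the label spellings in `age_order` are an assumption to check against `data['age'].unique()`.

```python
import pandas as pd

# Assumed age labels; verify the exact spellings on the real data.
age_order = ["5-14 years", "15-24 years", "25-34 years",
             "35-54 years", "55-74 years", "75+ years"]

# Toy counts in arbitrary row order (values are made up).
toy = pd.DataFrame({
    "age": ["75+ years", "5-14 years", "35-54 years", "15-24 years",
            "55-74 years", "25-34 years"],
    "suicides": [6, 1, 9, 4, 8, 5],
})

# An ordered categorical makes groupby and sort_values follow age_order.
toy["age"] = pd.Categorical(toy["age"], categories=age_order, ordered=True)
age_grp_sorted = (
    toy.groupby("age", observed=True)["suicides"].sum().reset_index()
)

print(age_grp_sorted)
```

For the Plotly Express charts, passing `category_orders={"age": age_order}` should give the same axis order without changing the column type.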

In [None]:
age_year_grp = data.groupby(['year', 'age'])[['suicides']].sum().reset_index()

In [None]:
fig = px.scatter_3d(age_year_grp, 'age', 'suicides', 'year', size='suicides', color='age')
fig.update_layout(title={
                  'text': "3D Graph of Age - Suicide - Year",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=800
                 )
fig.show()

In [None]:
fig = px.scatter(age_year_grp, 'age', 'suicides', color='year', size='suicides')

fig.update_layout(title={
                  'text': "2D Graph of Age - Suicide - Year",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                 )
fig.show()

In [None]:
fig = px.bar(age_year_grp, x="age", y="suicides",color='age',animation_frame = 'year')
fig.update_layout(xaxis={'categoryorder':'total descending'})
fig.update_layout(title='Suicides by different Age Group from 1985-2016')
fig.update_layout(showlegend=False, height=600, plot_bgcolor='white', template='plotly_white')

fig.data[2].marker.line.width = 2
fig.data[2].marker.line.color = "black"

fig.show()

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">Gender</h1>
</div>

In [None]:
data.head(1)

In [None]:
gender = data.groupby('sex')[['suicides']].sum().reset_index()

In [None]:
fig1 = go.Figure()
fig1.add_trace(go.Pie(labels=gender['sex'], values=gender['suicides'], hole=.5, pull=[0.1]))
fig1.update_traces(textposition='inside', textinfo='percent+label')
fig1.update_layout(title={
                        'text': "Suicide based on Gender",
                        'y':0.95,
                        'x':0.5,
                        'xanchor': 'center',
                        'yanchor': 'top'},
                         showlegend=False,
                         height=600
                        )

fig1.data[0].marker.line.width = 3
fig1.data[0].marker.line.color = "black"

# ----------------------------------------------------

fig2 = px.bar(gender, x=gender['sex'], y=gender['suicides'], color_discrete_sequence=[["#FA4152", "#36BB91"]])

fig2.update_layout(title={
                  'text': "Number of Suicides: Male and Female",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  xaxis_title="Gender", 
                  yaxis_title="Number of Suicides",
                 )

fig1.show()
fig2.show()

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:orange;
           font-size:110%;
           font-family:Verdana;
           letter-spacing:0.9px">
    <p style="padding: 10px;
              color:white;
              font-size:100%">
        From the bar chart above, we can see that males account for far more suicides than females.    </p>
</div>

In [None]:
gender = data.groupby(['sex', 'age'])[['suicides']].sum().reset_index()

male = gender[(gender['sex']=='male')]
female = gender[(gender['sex']=='female')]

In [None]:
male.sort_values(by='suicides', ascending=False).style.background_gradient(cmap='Purples', subset=['suicides'])

In [None]:
female.sort_values(by='suicides', ascending=False).style.background_gradient(cmap='Oranges', subset=['suicides'])

In both cases (males and females), the 35-54 age group accounts for the most suicides.

In [None]:
fig = px.bar(gender, x=gender['sex'], y=gender['suicides'], color='age')

fig.update_layout(title={
                  'text': "Suicides: Male & Female based on age-group",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  xaxis_title="Gender and Age Group", 
                  yaxis_title="Number of Suicides", 
                  barmode='group'
                 )

In [None]:
numerical_data['sex'] = data['sex']
sns.pairplot(numerical_data, hue='sex', height=2.5, diag_kind='kde')

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">Population</h1>
</div>

In [None]:
pop = data.groupby(['country','population'])['suicides'].sum().reset_index()

In [None]:
fig = px.scatter(pop, 'suicides', 'population', size='suicides', color='country')

fig.update_layout(title={
                  'text': "Suicides based on Population of Country",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  xaxis_title="Number of Suicides", 
                  yaxis_title="Population of different Countries", 
                  showlegend=False)

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:orange;
           font-size:110%;
           font-family:Verdana;
           letter-spacing:0.9px">
    <p style="padding: 10px;
              color:white;
              font-size:100%">
    Countries with smaller populations record considerably fewer suicides than countries with larger populations. The correlation heatmap already hinted at this relationship between population and suicide counts, and the graph above makes it quite clear. Note that this is about absolute counts, not the rate per 100k population.
    </p>
</div>

In [None]:
pop_year = data.groupby(['year'])[['population','suicides']].sum().reset_index()

pop_year.style.background_gradient(cmap='Blues', subset=['population'])\
.background_gradient(cmap='Reds', subset=['suicides'])

In [None]:
fig = px.scatter(pop_year, 'suicides', 'population', color='population', size='suicides')
fig.update_layout(title={
                  'text': "Yearly Suicides vs Total Population",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  xaxis_title="Number of Suicides", 
                  yaxis_title="Total Population", 
                  showlegend=False)
fig.show()

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">Suicide/100k Population</h1>
</div>

In [None]:
per100k = data.groupby(['country', 'year'])[['suicides/100k pop']].sum().reset_index()

per100k.sort_values(by='suicides/100k pop', ascending=False).head(20).style.background_gradient(cmap='Blues', subset=['suicides/100k pop'])
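
Summing `suicides/100k pop` adds up the rates of every age and sex group within a country-year, so the result is not itself a rate per 100k. A possible alternative, sketched here on made-up numbers, is to total suicides and population first and divide afterwards; the names `totals`, `rate_per_100k` and `summed_rates` are illustrative assumptions.

```python
import pandas as pd

# One country-year split into four demographic groups (values are made up).
toy = pd.DataFrame({
    "country": ["A", "A", "A", "A"],
    "year": [2000, 2000, 2000, 2000],
    "suicides": [10, 30, 0, 20],
    "population": [100_000, 100_000, 50_000, 150_000],
    "suicides/100k pop": [10.0, 30.0, 0.0, 13.33],
})

grouped = toy.groupby(["country", "year"])

# Pooled rate: total suicides divided by total population.
totals = grouped[["suicides", "population"]].sum().reset_index()
totals["rate_per_100k"] = totals["suicides"] / totals["population"] * 100_000

# What summing the per-group rates gives instead.
summed = grouped["suicides/100k pop"].sum().reset_index(name="summed_rates")

print(totals)
print(summed)
```

Here the pooled rate is 15 per 100k while the summed group rates come to about 53, so the two measures can rank countries differently.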

In [None]:
fig = px.bar(per100k, x="country", y="suicides/100k pop",color='country',animation_frame = 'year')
fig.update_layout(xaxis={'categoryorder':'total descending'})
fig.update_layout(title='Suicide/100k from 1985-2016')
fig.update_layout(showlegend=False, height=600, plot_bgcolor='white', template='plotly_white')


fig.show()

In [None]:
fig = px.scatter(per100k, 'country', 'suicides/100k pop', color='country', size='suicides/100k pop')

fig.update_layout(title={
                  'text': "Suicide/100k Population of Country",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  xaxis_title="Countries", 
                  yaxis_title="Suicide/100k Population", 
                  showlegend=False)

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:orange;
           font-size:110%;
           font-family:Verdana;
           letter-spacing:0.9px">
    <p style="padding: 10px;
              color:white;
              font-size:100%">
    'Lithuania' has the highest number of suicides per 100k population in most years. In the animated bar chart above, a few countries briefly take the number one position, but Lithuania keeps returning to the top. Countries like 'Hungary', 'Russian Federation' and 'Sri Lanka' also appear at the top a few times.
    </p>
</div>

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">GDP Per Capita</h1>
</div>

In [None]:
gdp = data.groupby(['country', 'year'])[['gdp_per_capita']].mean().reset_index()
gdp

In [None]:
fig = px.scatter(gdp, 'country', 'gdp_per_capita', color='country', size='gdp_per_capita')

fig.update_layout(title={
                  'text': "GDP per Capita",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  xaxis_title="Countries", 
                  yaxis_title="GDP per Capita", 
                  showlegend=False)

In [None]:
fig1 = px.violin(gdp, 'gdp_per_capita', points='all', box=True)
fig1.update_layout(title={
                  'text': "GDP per Capita (Range)",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,  
                  showlegend=False)

# ----------------------------------------------------------

fig2 = px.box(gdp,'country', 'gdp_per_capita', color='country')
fig2.update_layout(title={
                  'text': "GDP per Capita (Country Wise)",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  showlegend=False)


fig1.show()
fig2.show()

In [None]:
gdp_suicides = data.groupby(['gdp_per_capita', 'country'])[['suicides']].sum().reset_index()

gdp_suicides

In [None]:
fig = px.scatter(gdp_suicides, 'gdp_per_capita', 'suicides', color='country', size='gdp_per_capita')
fig.update_layout(title={
                  'text': "Relation between GDP/Capita & Suicides",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  xaxis_title="GDP per Capita", 
                  yaxis_title="Suicides",
                  showlegend=False)

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:orange;
           font-size:110%;
           font-family:Verdana;
           letter-spacing:0.9px">
    <p style="padding: 10px;
              color:white;
              font-size:100%">
    The plot suggests that a higher GDP per capita tends to go along with fewer suicides. One possible explanation is that richer countries offer more jobs and better pay, but the chart compares raw counts that also depend on population size, so it shows an association rather than a cause.
    </p>
</div>
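
To check the GDP relationship with a number rather than by eye, one option is a rank correlation at country level, computed once against raw counts and once against a per-100k rate. The sketch below uses a made-up toy frame; the aggregation choices and the names `per_country`, `rho_counts` and `rho_rates` are assumptions, and Spearman correlation is computed here as the Pearson correlation of ranks.

```python
import pandas as pd

# Two rows per country, standing in for different demographic groups.
toy = pd.DataFrame({
    "country": ["A", "A", "B", "B", "C", "C"],
    "suicides": [50, 70, 10, 20, 400, 500],
    "population": [1_000_000, 1_000_000, 200_000, 200_000,
                   10_000_000, 10_000_000],
    "gdp_per_capita": [5_000, 5_000, 20_000, 20_000, 40_000, 40_000],
})

# One row per country: total counts, total population, mean GDP per capita.
per_country = toy.groupby("country").agg(
    suicides=("suicides", "sum"),
    population=("population", "sum"),
    gdp_per_capita=("gdp_per_capita", "mean"),
)
per_country["rate_per_100k"] = (
    per_country["suicides"] / per_country["population"] * 100_000
)

# Spearman correlation = Pearson correlation of the ranked values.
ranks = per_country.rank()
rho_counts = ranks["gdp_per_capita"].corr(ranks["suicides"])
rho_rates = ranks["gdp_per_capita"].corr(ranks["rate_per_100k"])

print(rho_counts, rho_rates)
```

In this toy data the sign flips between counts and rates, which is why the choice of measure matters before drawing a conclusion about GDP.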

<div style="color:white;
           padding:8px 10px 0 10px;
           display:inline-block;
           border-radius:5px;
           background-color:#EC2566;
           font-size:90%;
           font-family:Verdana">
    <h1 style="color:white;">Generation</h1>
</div>

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:#39365C;
           font-size:110%;
           font-family:Verdana;
           letter-spacing:0.9px">
    <p style="padding: 10px;
              color:white;
              font-size:95%">
    Let's learn about the different generations:<br><br>

        <b>Millennials</b>: Millennials, also known as Generation Y, are the demographic cohort following Generation X and preceding Generation Z. Researchers and popular media use the early 1980s as starting birth years and the mid-1990s to early 2000s as ending birth years, with 1981 to 1996 a widely accepted defining range for the generation.<br><br>

        <b>Generation X</b>: Generation X is the demographic cohort following the baby boomers and preceding the Millennials. Researchers and popular media typically use birth years around 1965 to 1980 to define Generation Xers, although some sources use birth years beginning as early as 1960 and ending somewhere from 1977 to 1985.<br><br>

        <b>Generation Z</b>: Generation Z, or Gen Z for short, is the demographic cohort succeeding Millennials and preceding Generation Alpha. Researchers and popular media use the mid-to-late 1990s as starting birth years and the early 2010s as ending birth years.<br><br>

        <b>The Silent Generation</b>: The Silent Generation is the demographic cohort following the Greatest Generation and preceding the baby boomers. The cohort is defined as individuals born between 1928 and 1945.<br><br>

        <b>Boomers</b>: Baby boomers are the demographic cohort following the Silent Generation and preceding Generation X. The generation is most often defined as individuals born between 1946 and 1964, during the post–World War II baby boom.<br><br>

        <b>G.I. Generation</b>: The Greatest Generation, also known as the G.I. Generation and the World War II generation, is the demographic cohort following the Lost Generation and preceding the Silent Generation. The cohort is defined as individuals generally born between 1901 and 1927.<br><br>  
    </p>
</div>


In [None]:
data.head(3)

In [None]:
data.generation.unique()

In [None]:
gen = data.groupby('generation')[['suicides']].sum().reset_index()

gen.sort_values(by='suicides', ascending=False).style.background_gradient(cmap='Purples', subset=['suicides'])

In [None]:
fig = px.bar(gen.sort_values(by='suicides', ascending=False), x='generation', y='suicides', color='generation')

fig.update_layout(title={
                  'text': "Suicides based on Generation",
                  'y':0.98,
                  'x':0.5,
                  'xanchor': 'center',
                  'yanchor': 'top'}, 
                  template='plotly_white', 
                  height=600,
                  xaxis_title="Generation", 
                  yaxis_title="Suicides",
                  showlegend=False)

<div style="color:white;
           display:fill;
           border-radius:5px;
           background-color:orange;
           font-size:110%;
           font-family:Verdana;
           letter-spacing:0.9px">
    <p style="padding: 10px;
              color:white;
              font-size:100%">
    We can conclude from the graph above that Boomers account for the most suicides, followed by the Silent Generation.<br>Generation Z accounts for the fewest, partly because its members were still very young in the years covered.
    </p>
</div>

Please take a look at my other kernels:

* https://www.kaggle.com/shubhamksingh/united-states-messed-up-covid19
* https://www.kaggle.com/shubhamksingh/formatting-notebooks-tutorial-html-markdown
* https://www.kaggle.com/shubhamksingh/cracking-covid19-prediction-in-depth-eda
* https://www.kaggle.com/shubhamksingh/top-3-stacking-blending-in-depth-eda
* https://www.kaggle.com/shubhamksingh/titanic-the-ride