## **1. Libraries**


In [42]:
import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

## **2. Import file**

In [43]:
# Read the file

## Save the path
p ='/content/osb_saludmental_conduc-suicida.xlsx'
## Read and create the dataframe
d = pd.read_excel(p)
d.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 677 entries, 0 to 676
Data columns (total 6 columns):
 #   Column                   Non-Null Count  Dtype 
---  ------                   --------------  ----- 
 0   Area                     677 non-null    int64 
 1   Location                 677 non-null    object
 2   Year                     677 non-null    int64 
 3   Behavior classification  677 non-null    object
 4   cases                    677 non-null    int64 
 5   Population               677 non-null    int64 
dtypes: int64(4), object(2)
memory usage: 31.9+ KB


## **3. Database fixes**

In [44]:
d.rename(columns = {'Year': 'year', 'Population':'population', 'Location':'location','Behavior classification':'behavior', 'Area':'area'}, inplace = True)
d.head()

Unnamed: 0,area,location,year,behavior,cases,population
0,1,Usaquén,2012,Completed suicide,24,505657
1,2,Chapinero,2012,Completed suicide,11,147658
2,3,Santa Fe,2012,Completed suicide,9,101296
3,4,San Cristóbal,2012,Completed suicide,14,385095
4,5,Usme,2012,Completed suicide,8,345128


In [45]:
# Replace the 'N.A.' placeholder with 'Sin dato' (Spanish for "no data")
d = d.replace('N.A.','Sin dato')

In [46]:
# Duplicate the dataframe (.copy() creates an independent object; plain assignment would only alias d)
d2 = d.copy()

In [47]:
# Add a new column holding the year as a string
d2['year_str'] = d2['year'].astype(str)
d2.info()


<class 'pandas.core.frame.DataFrame'>
RangeIndex: 677 entries, 0 to 676
Data columns (total 7 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   area        677 non-null    int64 
 1   location    677 non-null    object
 2   year        677 non-null    int64 
 3   behavior    677 non-null    object
 4   cases       677 non-null    int64 
 5   population  677 non-null    int64 
 6   year_str    677 non-null    object
dtypes: int64(4), object(3)
memory usage: 37.1+ KB
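
In pandas, plain assignment (`b = a`) gives a second name to the same dataframe, whereas `a.copy()` produces an independent one, so a column added through an alias also appears on the original. A minimal sketch with an invented two-row frame (the values and the names `toy`, `alias`, and `duplicate` are illustrative, not taken from the dataset):

```python
import pandas as pd

# Toy frame with made-up values (not the real dataset)
toy = pd.DataFrame({'year': [2012, 2013], 'cases': [5, 7]})

alias = toy              # second name for the same object
duplicate = toy.copy()   # independent duplicate

# Adding a column through the alias also changes `toy`
alias['year_str'] = alias['year'].astype(str)

alias_is_same = alias is toy
toy_has_new_col = 'year_str' in toy.columns
duplicate_has_new_col = 'year_str' in duplicate.columns
print(alias_is_same, toy_has_new_col, duplicate_has_new_col)
```

Whether the extra column on the original matters depends on how the original frame is used afterwards.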


In [48]:
## Set a MultiIndex on location, year and behavior
d2 = d2.set_index(['location','year','behavior'])

d2.head()

location,year,behavior,area,cases,population,year_str
Usaquén,2012,Completed suicide,1,24,505657,2012
Chapinero,2012,Completed suicide,2,11,147658,2012
Santa Fe,2012,Completed suicide,3,9,101296,2012
San Cristóbal,2012,Completed suicide,4,14,385095,2012
Usme,2012,Completed suicide,5,8,345128,2012
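
Once `location`, `year`, and `behavior` form a MultiIndex, they are no longer ordinary columns; they can be read back by level position or by level name. A small sketch on a toy frame (all values and locality names are invented for illustration):

```python
import pandas as pd

# Toy frame with made-up values (not the real dataset)
toy = pd.DataFrame({
    'location': ['A', 'B'],
    'year': [2012, 2012],
    'behavior': ['Suicide attempt', 'Suicide attempt'],
    'cases': [3, 9],
})
toy_idx = toy.set_index(['location', 'year', 'behavior'])

# Index levels can be read by position or by name
by_position = list(toy_idx.index.get_level_values(0))
by_name = list(toy_idx.index.get_level_values('location'))

# reset_index() turns the levels back into ordinary columns
restored_cols = list(toy_idx.reset_index().columns)
print(by_position, by_name, restored_cols)
```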


## **4. Pie chart**

In [49]:
### Sort the dataframe by number of cases, largest first
d2_cases = d2.sort_values('cases', ascending= False)
## Keep the 30 rows with the most cases (the variable name still says 20)
d2_cases_20 = d2_cases.iloc[0:30]

In [50]:
fig1 = px.pie(d2_cases_20, values = 'cases', names = d2_cases_20.index.get_level_values(0))

# Center the title
fig1.update_layout(title_text = 'Cases of suicidal behavior by locality',
                  title_font_size = 25,
                  title_x = 0.5)
fig1.show()

**History**

Suicide is a preventable cause of death that affects people of all ages, genders and backgrounds. Among the 30 records with the highest case counts plotted here, Kennedy accounts for the largest share of suicidal behavior cases, at 23.6%. Suba, Bosa and Ciudad Bolívar also concentrate a large number of cases, and together these four localities represent 72.7% of the cases shown. These figures support strengthening public suicide prevention policies, improving care for people with mental health problems and raising public awareness about suicide.
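
The percentages in a pie chart are each slice's total divided by the grand total of the rows passed to the chart. A hedged sketch of checking such shares by hand, using invented numbers and hypothetical locality names `A`, `B`, and `C`:

```python
import pandas as pd

# Toy data with made-up values; 'A' appears twice, as a locality can in the real rows
toy = pd.DataFrame({
    'location': ['A', 'A', 'B', 'C'],
    'cases': [30, 20, 30, 20],
})

# Sum cases per locality, then express each sum as a percentage of the total
totals = toy.groupby('location')['cases'].sum()
shares = (totals / totals.sum() * 100).round(1)
print(shares.to_dict())
```

Because the shares depend on which rows are included, a chart built from a subset of the data reports shares of that subset only.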

## **5. Box plot**

In [51]:
fig2= px.box(d2, x= 'year_str', y ='cases')

# Center the title
fig2.update_layout(title_text = 'Cases of suicidal behavior by year',
                  title_font_size = 25,
                  title_x = 0.5)
fig2.show()

**History**

The number of cases of suicidal behavior in Bogotá has increased steadily over the last 10 years, from 246 cases in 2012 to 3,482 in 2022. Although some records show no cases, the median has grown from 29 in 2012 to 208 in 2022. This increase is worrying, since suicide is a preventable cause of death.

Several factors could explain this increase. One is a rise in mental health problems such as depression and anxiety; another is a rise in social risk factors such as poverty and violence.
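
The center line of each box is the median of the per-row case counts for that year. A minimal sketch of computing the same statistic directly, on invented values rather than the real dataset:

```python
import pandas as pd

# Toy data with made-up values: three rows per year
toy = pd.DataFrame({
    'year_str': ['2012', '2012', '2012', '2022', '2022', '2022'],
    'cases': [1, 4, 10, 20, 50, 90],
})

# Median of the row-level case counts within each year
medians = toy.groupby('year_str')['cases'].median()
print(medians.to_dict())
```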

## **6. A slightly more complex graph**

In [52]:
## Create a list to select localities

c_list = ['Engativá', 'Suba', 'Kennedy', 'Chapinero', 'San Cristóbal', 'Usme']

## Select from the dataframe (after resetting the index)

d3 = d2.reset_index()
loc = d3[d3['location'].isin(c_list)]

loc.head()

Unnamed: 0,location,year,behavior,area,cases,population,year_str
1,Chapinero,2012,Completed suicide,2,11,147658,2012
3,San Cristóbal,2012,Completed suicide,4,14,385095,2012
4,Usme,2012,Completed suicide,5,8,345128,2012
7,Kennedy,2012,Completed suicide,8,26,992398,2012
9,Engativá,2012,Completed suicide,10,21,784983,2012


In [53]:
fig3 = px.line(loc, x ='population', y = 'cases', text ='year_str', color = 'location')
fig3.update_traces(textposition = 'top center')

fig3.update_layout(title_text = 'Cases of suicidal behavior by population and locality',
                  title_font_size = 25,
                  title_x = 0.5)
fig3.show()

**History**

The graph shows that cases of suicidal behavior rose markedly relative to population in the selected localities, most visibly in 2022. Kennedy stands out: it is the most populated locality and has the highest incidence of suicidal behavior throughout the city.

Kennedy's large population coincides with a worryingly high number of cases, which points to the need to understand the underlying factors. Suba and Engativá also show a considerable number of cases, suggesting similar mental health challenges in multiple areas.

## **7. Animation**

In [54]:
# Sum the numeric columns per year and behavior; numeric_only=True skips the
# text columns and avoids the pandas deprecation warning
d3 = loc.groupby(['year','behavior']).sum(numeric_only=True)
d3 = d3.reset_index()
d3.head()


Unnamed: 0,year,behavior,area,cases,population
0,2012,Completed suicide,40,109,3745398
1,2012,Suicidal ideation,40,676,3745398
2,2012,Suicide attempt,40,810,3745398
3,2013,Completed suicide,40,105,3765238
4,2013,Suicidal ideation,40,1095,3765238
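
When a grouped frame still contains text columns, `sum(numeric_only=True)` keeps only the numeric ones, which is why the result holds `area`, `cases`, and `population` but no `location` or `year_str`. A minimal sketch on a toy frame (values and locality names invented for illustration):

```python
import pandas as pd

# Toy frame with made-up values; 'location' is a text column
toy = pd.DataFrame({
    'year': [2012, 2012, 2013],
    'behavior': ['Suicide attempt', 'Suicide attempt', 'Suicide attempt'],
    'location': ['A', 'B', 'A'],
    'cases': [1, 2, 3],
})

# Group by year and behavior, summing only the numeric columns
summed = toy.groupby(['year', 'behavior']).sum(numeric_only=True).reset_index()
summed_cols = list(summed.columns)
summed_cases = list(summed['cases'])
print(summed_cols, summed_cases)
```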


In [55]:
l2 = d3.pivot(index = "year",
                    columns = "behavior",
                    values = "cases")

l2 = l2.reset_index()
l2['year_str'] = l2['year'].astype(str)
l2.head()

behavior,year,Completed suicide,Suicidal ideation,Suicide attempt,year_str
0,2012,109,676,810,2012
1,2013,105,1095,993,2013
2,2014,123,1766,1404,2014
3,2015,149,2131,1457,2015
4,2016,174,2242,1308,2016
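
The `pivot` call reshapes the table from long form (one row per year and behavior) to wide form (one row per year, one column per behavior). A small sketch of the same reshaping on invented numbers:

```python
import pandas as pd

# Toy long-form data with made-up values
long_df = pd.DataFrame({
    'year': [2012, 2012, 2013, 2013],
    'behavior': ['Suicide attempt', 'Suicidal ideation',
                 'Suicide attempt', 'Suicidal ideation'],
    'cases': [8, 6, 9, 10],
})

# One row per year, one column per behavior category
wide_df = long_df.pivot(index='year', columns='behavior', values='cases')
wide_df = wide_df.reset_index()

wide_cols = list(wide_df.columns)
attempts = list(wide_df['Suicide attempt'])
print(wide_cols, attempts)
```

`pivot` requires each index and column pair to be unique; duplicated pairs would need aggregating first.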


In [56]:
l2.rename(columns = {'Completed suicide': 'com', 'Suicidal ideation':'ide', 'Suicide attempt':'att'}, inplace = True)
l2.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11 entries, 0 to 10
Data columns (total 5 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   year      11 non-null     int64 
 1   com       11 non-null     int64 
 2   ide       11 non-null     int64 
 3   att       11 non-null     int64 
 4   year_str  11 non-null     object
dtypes: int64(4), object(1)
memory usage: 568.0+ bytes


In [57]:
# Base figure and layout

fig = go.Figure(
    layout= go.Layout(
        updatemenus = [dict(type = 'buttons', direction = 'right', x = 0.9, y = 1.16)],
        xaxis = dict(range= [2017, 2021],
                     autorange = False, tickwidth = 2, dtick = 1,
                     title_text = 'year'),
        yaxis = dict(range=[0,11000],
                     autorange = False,
                     title_text =''),
        title = 'Main suicide cases in Bogota',
        title_font_size = 30,
        title_x =0.5
    )
)

## Add Traces
init = 1

## Completed suicide
fig.add_trace(
    go.Scatter(
        x = l2.year[:init],
        y = l2.com[:init],
        name = 'Completed suicide',
        line = dict(color='green'),
        mode = 'lines'
    )
)

## Suicidal ideation
fig.add_trace(
    go.Scatter(
        x = l2.year[:init],
        y = l2.ide[:init],
        name = 'Suicidal ideation',
        line = dict(color='red'),
        mode = 'lines'
    )
)

## Suicide attempt
fig.add_trace(
    go.Scatter(
        x = l2.year[:init],
        y = l2.att[:init],
        name = 'Suicide attempt',
        line = dict(color='blue'),
        mode = 'lines'
    )
)

## Frames
frames = [
    go.Frame(
        data = [
            go.Scatter(x=l2.year[:k], y=l2.com[:k]),
            go.Scatter(x=l2.year[:k], y=l2.ide[:k]),
            go.Scatter(x=l2.year[:k], y=l2.att[:k])
        ]
    )
    for k in range(init,len(l2)+1)
]

## Animation
fig.update(frames=frames)

## play button
fig.update_layout(
    updatemenus = [
        dict(
            buttons =list([
                dict(
                label = 'Play',
                method = 'animate',
                args = [None,{'frame':{'duration':800}}]
                )
            ]

            )
        )
    ]
)
fig.show()
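
Each animation frame redraws the lines with one more year than the previous frame. The slicing pattern behind that can be sketched without plotly, using an invented list of years (the name `frame_slices` is illustrative):

```python
# Toy list of years (the real data has one entry per year in l2)
years = [2012, 2013, 2014]
init = 1

# Frame k holds the first k points, so the line grows by one point per frame
frame_slices = [years[:k] for k in range(init, len(years) + 1)]
print(frame_slices)  # [[2012], [2012, 2013], [2012, 2013, 2014]]
```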

**History**

In Bogotá (figures aggregated over the six localities selected in section 6), cases of completed suicide fluctuated over the years: they increased until 2019, fell in 2020 and rose again in 2021. Cases of suicidal ideation increased steadily from 2017 to 2021, indicating a growing concern in the population. Cases of attempted suicide decreased from 2017 to 2019 but increased in 2020 and 2021.

Suicidal ideation is the category with the highest values, peaking in 2021 at about 10 thousand cases. Attempted suicide follows with about 4 thousand cases in the same year, while completed suicide peaked in 2019 with 267 cases.