# Is there any relationship between suicides and forest area?

Required tools and libraries in this analysis:

In [47]:
import pandas as pd
import basedosdados as bd

def unpack(l):
    return ", ".join(map(str, l)) 

## Available data
### To approach this question we'll consider the following datasets:
#### 1. [Sistema de Informações sobre Mortalidade (SIM)](https://basedosdados.org/dataset/br-ms-sim?bdm_table=microdados)
CID-10 codes of interest: X60-X84 e Y87.0 

#### 2. [Censo Agropecuário](https://basedosdados.org/dataset/br-ibge-censo-agropecuario?bdm_table=municipio)


#### 3. [População brasileira](https://basedosdados.org/dataset/br-ms-populacao?bdm_table=municipio)


### References:
- [1] [Mortalidade por suicídio e notificações de lesões autoprovocadas no Brasil](https://www.gov.br/saude/pt-br/centrais-de-conteudo/publicacoes/boletins/epidemiologicos/edicoes/2021/boletim_epidemiologico_svs_33_final.pdf)

## Simplest case 
At first, lets focus on the year of 2017, as it is covered by all datasets and the most recent available Censo Agropecuário.

In [54]:
def get_sim_suic_municipio_2017():
    try:
        return pd.read_csv('../data/sim_suic_municipio.csv')
    except FileNotFoundError:

        query = f"""
                    SELECT id_municipio, SUM(numero_obitos) as numero_obitos
                    FROM basedosdados.br_ms_sim.municipio_causa
                    WHERE ano = 2017 
                    AND (
                        CONTAINS_SUBSTR(causa_basica, 'X6')
                        OR CONTAINS_SUBSTR(causa_basica, 'X7')
                        OR CONTAINS_SUBSTR(causa_basica, 'X8')
                        OR causa_basica = 'Y870'
                        )
                    GROUP BY id_municipio
                """

        sim_suic_municipio = bd.read_sql(query=query,
                                         billing_project_id='explorando-basedosdados')
        sim_suic_municipio.to_csv(
            '../data/sim_suic_municipio.csv', index=False)
        return sim_suic_municipio

sim_suic_municipio_2017 = get_sim_suic_municipio_2017()
sim_suic_municipio_2017

Downloading: 100%|██████████| 3144/3144 [00:00<00:00, 6802.77rows/s]


Unnamed: 0,id_municipio,numero_obitos
0,1600303,31
1,1600402,2
2,1600535,1
3,1100015,3
4,1100023,10
...,...,...
3139,3170800,4
3140,3171030,1
3141,3171204,3
3142,3171709,3


In [None]:
#todo: experiment with list comprehension instead of BigQueary CONTAINS_SUBSTR to generate the CID-10 code range
# ['X'+ str(n) for n in range(600,841)]

In [53]:
def get_agro_forest_municipio_2017():
    try:
        return pd.read_csv('../data/agro_forest_municipio_2017.csv')
    except FileNotFoundError:
        agro_selected_col = ('id_municipio',
                             'area_total',
                             'area_mata_natural',
                             'area_mata_plantada',
                             'area_sistema_agroflorestal',
                             'area_mata')

        query = f"""
                    SELECT {unpack(agro_selected_col)}
                    FROM basedosdados.br_ibge_censo_agropecuario.municipio
                    WHERE ano = 2017 
                """

        agro_forest_municipio_2017 = bd.read_sql(query=query,
                                                 billing_project_id='explorando-basedosdados')
        agro_forest_municipio_2017.to_csv(
            '../data/agro_forest_municipio_2017.csv', index=False)
        return agro_forest_municipio_2017


agro_forest_municipio_2017 = get_agro_forest_municipio_2017()
agro_forest_municipio_2017


Downloading: 100%|██████████| 5563/5563 [00:00<00:00, 6468.97rows/s]


Unnamed: 0,id_municipio,area_total,area_mata_natural,area_mata_plantada,area_sistema_agroflorestal,area_mata
0,1100015,372746.0,108944.0,258.0,10481.0,109202.0
1,1100023,334256.0,88865.0,615.0,1573.0,89480.0
2,1100031,113085.0,17271.0,593.0,0.0,17864.0
3,1100049,221390.0,31620.0,180.0,0.0,31800.0
4,1100056,126686.0,27954.0,0.0,489.0,27954.0
...,...,...,...,...,...,...
5558,5222005,79967.0,12186.0,715.0,0.0,12901.0
5559,5222054,62341.0,5304.0,0.0,2626.0,5304.0
5560,5222203,84770.0,5542.0,0.0,4663.0,5542.0
5561,5222302,186118.0,34659.0,724.0,1083.0,35383.0
