# Is there any relationship between suicides and forest area?

Required tools and libraries in this analysis:

In [1]:
import pandas as pd
import basedosdados as bd

## Available data
### To approach this question we'll consider the following datasets:
#### 1. [Sistema de Informações sobre Mortalidade (SIM)](https://basedosdados.org/dataset/br-ms-sim?bdm_table=microdados)
CID-10 (ICD-10) codes of interest: X60-X84 and Y87.0

#### 2. [Censo Agropecuário](https://basedosdados.org/dataset/br-ibge-censo-agropecuario?bdm_table=municipio)


#### 3. [População brasileira](https://basedosdados.org/dataset/br-ms-populacao?bdm_table=municipio)


### References:
- [1] [Mortalidade por suicídio e notificações de lesões autoprovocadas no Brasil](https://www.gov.br/saude/pt-br/centrais-de-conteudo/publicacoes/boletins/epidemiologicos/edicoes/2021/boletim_epidemiologico_svs_33_final.pdf)

## Simplest case 
To start, let's focus on 2017, since it is covered by all three datasets and is the year of the most recent Censo Agropecuário available.

In [45]:
def get_sim_suic_municipio():
    """Load 2017 suicide deaths per municipality, querying BigQuery only when no local cache exists."""
    try:
        return pd.read_csv('../data/sim_suic_municipio.csv')
    except FileNotFoundError:
        # Match CID-10 X60-X84 by prefix, plus Y87.0.
        # A plain substring match on 'X8' would also capture X85-X89 (assaults).
        query = """
                    SELECT id_municipio, SUM(numero_obitos) as numero_obitos
                    FROM basedosdados.br_ms_sim.municipio_causa
                    WHERE ano = 2017
                    AND (
                        REGEXP_CONTAINS(causa_basica, r'^X(6[0-9]|7[0-9]|8[0-4])')
                        OR causa_basica = 'Y870'
                        )
                    GROUP BY id_municipio
                """

        sim_suic_municipio = bd.read_sql(query=query,
                                         billing_project_id='explorando-basedosdados')
        # Cache the result so later runs skip the BigQuery call
        sim_suic_municipio.to_csv(
            '../data/sim_suic_municipio.csv', index=False)
        return sim_suic_municipio

sim_suic_municipio = get_sim_suic_municipio()
sim_suic_municipio

Unnamed: 0,id_municipio,numero_obitos
0,1600303,31
1,1600402,2
2,1600535,1
3,1200203,6
4,1200252,2
...,...,...
3139,3556305,1
3140,3556354,1
3141,3556800,1
3142,3557105,7


In [None]:
# TODO: experiment with a list comprehension instead of BigQuery pattern matching to generate the CID-10 code range
# ['X' + str(n) for n in range(600, 850)]  # X600..X849 covers X60-X84
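The list-comprehension idea from the TODO could look roughly like the sketch below. It assumes `causa_basica` stores four-character codes without the dot (as the `'Y870'` literal in the query suggests), so X60-X84 expands to X600 through X849. The helper names `cid10_suicide_codes` and `build_where_clause` are hypothetical and not part of `basedosdados`.

```python
def cid10_suicide_codes():
    """Return the four-character CID-10 codes X600..X849 plus Y870.

    Assumption: codes are stored without the dot, e.g. 'X700' for X70.0.
    """
    codes = [f"X{n}" for n in range(600, 850)]  # X60.0 through X84.9
    codes.append("Y870")  # sequelae of intentional self-harm
    return codes


def build_where_clause(column="causa_basica"):
    """Build a SQL `IN (...)` condition listing every code explicitly."""
    quoted = ", ".join(f"'{code}'" for code in cid10_suicide_codes())
    return f"{column} IN ({quoted})"


codes = cid10_suicide_codes()
print(len(codes), codes[0], codes[-2], codes[-1])
```

The resulting condition can replace the pattern-matching part of the `WHERE` clause. An explicit list is more verbose than a pattern, but it makes the exact set of included codes easy to inspect.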