# **Brazilian Elections**

This notebook analyzes Brazilian federal and municipal elections from 2014 to 2020 using geopandas and other visualization tools. The data comes from [Base dos Dados](https://basedosdados.org/) and from [geodata-br](https://github.com/tbrugz/geodata-br). Base dos Dados provides two datasets, which are accessed through SQL queries: [Diretórios Brasileiros](https://basedosdados.org/dataset/br-bd-diretorios-brasil), which supplies metadata such as region, state acronym, and the id of each city; and [Eleições Brasileiras](https://basedosdados.org/dataset/br-tse-eleicoes), which supplies the data about Brazilian elections. The GeoJSON file provided by geodata-br is loaded by `create-database.py`, included in this project, and then imported into MySQL with some Python code. Run that `.py` file first if you would like to run the code in this notebook yourself.

## Libraries required

In [None]:
import basedosdados as bd
import geopandas as gpd
import pandas as pd
import geoplot
import sqlalchemy 
import getpass
import plotly.express as px
from shapely import wkt


## Connecting to the database `analise_eleitoral` on **MySQL**

In [None]:
p = getpass.getpass("Enter password: ")
engine = sqlalchemy.create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                       .format(user="root",
                              pw=p,
                              db="analise_eleitoral"))
conn = engine.connect()
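
One caveat with formatting the password straight into the connection string: characters such as `@`, `/`, or `:` inside the password would be read as URL delimiters. Below is a minimal sketch of one way around this, percent-encoding the credentials with the standard library before building the URL. `build_mysql_url` is a hypothetical helper and the password shown is made up; neither is part of this project.

```python
from urllib.parse import quote


def build_mysql_url(user, password, database, host="localhost"):
    """Return a MySQL connection URL with percent-encoded credentials."""
    # safe="" makes quote() escape every reserved character, including "/"
    safe_user = quote(user, safe="")
    safe_password = quote(password, safe="")
    return f"mysql+pymysql://{safe_user}:{safe_password}@{host}/{database}"


# Made-up password, used only to show the escaping
example_url = build_mysql_url("root", "p@ss/w:rd", "analise_eleitoral")
print(example_url)
```

The resulting string can be passed to `sqlalchemy.create_engine` in place of the formatted string used in the cell.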

Note that the query below fetches the full table from MySQL. More importantly, if we take a closer look at the type of each column in the dataframe, we will notice that `geometria` is an `object`. Its values are actually WKT strings (well-known text, a text representation of geometries), since that was the format of the data when it was imported into MySQL. We would like the column to be of type `geometry`.

In [None]:
query = ''' 
SELECT * 
FROM municipalities
'''

geometries = pd.read_sql(query, conn)
geometries.dtypes

To do that, let's run the code below:

In [None]:
# references: 
# https://docs.geopandas.org/en/latest/docs/reference/api/geopandas.GeoSeries.to_wkt.html
# https://stackoverflow.com/questions/56433138/converting-a-column-of-polygons-from-string-to-geopandas-geometry

geometries['geometria'] = gpd.GeoSeries.from_wkt(geometries['geometria'])
geometries = gpd.GeoDataFrame(geometries, geometry='geometria')
geometries.dtypes
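
The same conversion can be tried on a small self-contained table, which makes the change of column type easier to see. This is only a sketch: the two squares and the `toy` names are invented, and EPSG:4326 is a placeholder CRS, so the actual CRS of the geodata-br file should be checked before relying on it.

```python
import geopandas as gpd
import pandas as pd

# Toy stand-in for the table read from MySQL: geometries stored as WKT strings
toy = pd.DataFrame({
    "id_municipio": ["1", "2"],
    "geometria": [
        "POLYGON ((0 0, 1 0, 1 1, 0 1, 0 0))",
        "POLYGON ((1 0, 2 0, 2 1, 1 1, 1 0))",
    ],
})
print(toy["geometria"].dtype)  # plain text column before the conversion

# Parse the WKT strings into shapely geometries
toy["geometria"] = gpd.GeoSeries.from_wkt(toy["geometria"])

# Assumption: EPSG:4326 is only a placeholder CRS for this sketch
toy_gdf = gpd.GeoDataFrame(toy, geometry="geometria", crs="EPSG:4326")
print(toy_gdf["geometria"].dtype)  # geometry column after the conversion
```

Declaring a CRS at this point is optional, but without one geopandas cannot reproject the data later.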

In [None]:
geometries.head()

In [None]:
conn.close()

Now the `geometries` dataframe is ready for plotting maps. Before that, let's query some electoral data from Base dos Dados using *Google BigQuery* through the *basedosdados* library.

## Importing the electoral data from **Base dos Dados**

### For the municipal elections

In [None]:
# municipal data about candidates

query = """
SELECT ano, id_municipio, id_candidato_bd, sigla_partido, cargo, idade, genero, instrucao, estado_civil, raca
FROM basedosdados.br_tse_eleicoes.candidatos
WHERE ano in (2020, 2016)
"""

candidate_mun = bd.read_sql(query, 
                                billing_project_id="analise-eleitoral-330723")

In [None]:
candidate_mun.head()

In [None]:
# municipal data about results

query = """
SELECT id_candidato_bd, resultado, votos
FROM basedosdados.br_tse_eleicoes.resultados_candidato
WHERE ano in (2020, 2016)
"""

candidate_result = bd.read_sql(query, 
                                billing_project_id="analise-eleitoral-330723")

In [None]:
candidate_result.head()

In [None]:
# merging the dfs to get the result of each candidate plus socio-economic information 

mun = pd.merge(candidate_mun, candidate_result, on='id_candidato_bd')
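
One thing worth checking in this merge: if `id_candidato_bd` identifies the same person across election years (an assumption to verify against the data), joining on that column alone pairs each candidacy with that person's results from both 2016 and 2020. The toy example below, with invented values, sketches the effect and one possible guard. Applying it to the real data would require also selecting `ano` in the results query.

```python
import pandas as pd

# Toy data: candidate "A" ran in both 2016 and 2020, candidate "B" only in 2020
toy_candidates = pd.DataFrame({
    "ano": [2016, 2020, 2020],
    "id_candidato_bd": ["A", "A", "B"],
    "sigla_partido": ["X", "Y", "Z"],
})
toy_results = pd.DataFrame({
    "ano": [2016, 2020, 2020],
    "id_candidato_bd": ["A", "A", "B"],
    "votos": [100, 250, 80],
})

# Joining on the id alone pairs every "A" candidacy with every "A" result
loose = pd.merge(toy_candidates, toy_results, on="id_candidato_bd")

# Joining on year and id keeps one row per candidacy;
# validate raises MergeError if the keys are not unique on both sides
strict = pd.merge(
    toy_candidates,
    toy_results,
    on=["ano", "id_candidato_bd"],
    validate="one_to_one",
)

print(len(loose), len(strict))
```

Here the loose merge returns 5 rows and the strict one returns 3. On the real tables `validate="one_to_one"` may be too strict, for example if results are stored per election round, so it is best treated as a diagnostic rather than a fixed setting.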


In [None]:
# then we can finally merge with the df which contains the geometry information
# (calling merge on the GeoDataFrame keeps the result a GeoDataFrame, ready for plotting)

municipal = geometries.merge(mun, on='id_municipio')

In [None]:
municipal.head()

### For the federal elections

In [None]:
# federal data about candidates

query = """
SELECT ano, id_municipio, id_candidato_bd, sigla_partido, cargo, idade, genero, instrucao, estado_civil, raca
FROM basedosdados.br_tse_eleicoes.candidatos
WHERE ano in (2018, 2014)
"""

candidate_fed = bd.read_sql(query, 
                                billing_project_id="analise-eleitoral-330723")

In [None]:
candidate_fed.head()

In [None]:
# federal data about results

query = """
SELECT id_candidato_bd, resultado, votos
FROM basedosdados.br_tse_eleicoes.resultados_candidato
WHERE ano in (2018, 2014)
"""

candidate_result_fed = bd.read_sql(query, 
                                billing_project_id="analise-eleitoral-330723")

In [None]:
candidate_result_fed.head()
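
The four queries in this section differ only in table, column list, and years. A small helper could build them from those three pieces. `build_election_query` is a hypothetical name and this is only a sketch; the returned string would be passed to `bd.read_sql` exactly as the hand-written queries are.

```python
def build_election_query(table, columns, years):
    """Build a SELECT over one Base dos Dados table, filtered by election year."""
    if not years:
        raise ValueError("at least one year is required")
    # int() rejects anything that is not a plain year before it reaches the SQL text
    year_list = ", ".join(str(int(year)) for year in years)
    column_list = ", ".join(columns)
    return f"SELECT {column_list}\nFROM {table}\nWHERE ano IN ({year_list})"


federal_results_query = build_election_query(
    "basedosdados.br_tse_eleicoes.resultados_candidato",
    ["id_candidato_bd", "resultado", "votos"],
    [2018, 2014],
)
print(federal_results_query)
```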

In [None]:
# merging the dfs to get the result of each candidate plus socio-economic information 

fed = pd.merge(candidate_fed, candidate_result_fed, on='id_candidato_bd')
fed.head()

In [None]:
# then we can finally merge with the df which contains the geometry information
# (calling merge on the GeoDataFrame keeps the result a GeoDataFrame)

federal = geometries.merge(fed, on='id_municipio')

In [None]:
# This does not seem to work because Base dos Dados aggregates the vote data by state
# and ignores the municipalities. As we do not have any geometry for states, I guess it won't work.
# Maybe we could aggregate the municipal geometries by state, but that does not sound very simple to do.

federal.head()
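
Aggregating the geometries by state may be less involved than the comment suggests, because geopandas provides `GeoDataFrame.dissolve`, which unions the geometries of all rows that share a key. The sketch below uses four invented squares. The `sigla_uf` column is a hypothetical name for the state acronym, which would first have to be joined onto `geometries` from Diretórios Brasileiros.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy municipalities: four unit squares, two per state
toy_municipalities = gpd.GeoDataFrame(
    {
        "id_municipio": ["1", "2", "3", "4"],
        "sigla_uf": ["SP", "SP", "RJ", "RJ"],  # hypothetical column name
    },
    geometry=[
        box(0, 0, 1, 1),
        box(1, 0, 2, 1),
        box(0, 2, 1, 3),
        box(1, 2, 2, 3),
    ],
)

# dissolve unions the geometries of all rows sharing the same state acronym
toy_states = toy_municipalities.dissolve(by="sigla_uf")

print(toy_states.index.tolist())
```

The result has one row per state, indexed by `sigla_uf`, so state-level results could then be merged on the state acronym instead of `id_municipio`.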