# Welcome to the Brazilian Apocalypse!

We are currently in the second year of the coronavirus pandemic. It is obvious that our institutions aren't ready to face extreme conditions for a prolonged period of time, and neither are the people. While there is a large community of paranoid people in the US who are ready to survive the collapse of civilization, in Brazil this scenario is not something we consider when thinking about the future.

Brazil is a country of continental dimensions, with more than 200 million people spread over 8.5 million square kilometers. There must be some dark, forgotten places where one could ride out an extinction-level event. There are more than five thousand cities, and we will sort through different types of data to determine the best places to enjoy the end of the world.

After ranking the cities from best to worst, we will try to optimize our escape route, choosing our starting city and defining the sequence of cities to get to our final destination. 

## 1. Definition of the problem

There are several possible causes for an apocalypse, and people are getting more creative every day. Zombies, nuclear war, large asteroids, deadly viruses, global warming, exploding volcanoes, massive earthquakes, alien invasion! Although each of these scenarios can play out in very different ways, we can look for some information that will help no matter what comes our way.

Crowded places should be avoided, food must be plentiful, and fresh water must be available. We also need medication to cure and prevent diseases, and firearms for protection. It would be smart to stay away from nuclear power plants, but solar energy might be useful. An average temperature of around 22 °C is a good choice. Being close to an airport, even a small one, should be useful for a quick getaway, if you can find and fly a plane.

We should also worry about rebuilding civilization, so we will also look at libraries and universities.

We will load data that helps determine each city's potential to fulfill each of these requirements, describing its source and how we create a metric to rank the cities. After processing all this information, our final data set will be a table where each row is a city and each column shows a score between 0 and 1 for a certain feature. Our last feature will be the sum of all scores, and we will rank the cities by this metric. Hopefully, we will have a clear winner for the whole country.
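
The scoring scheme described above can be sketched in a few lines. This is only an illustration under assumptions: the cities, the three feature columns and their values are made up, and `min_max_score` and `target_score` are hypothetical helpers, not functions used elsewhere in this notebook. Min-max scaling is one possible way to get a score between 0 and 1; features we want to avoid (crowds) are inverted, and temperature is scored by its distance from 22 °C.

```python
import pandas as pd


def min_max_score(values: pd.Series, higher_is_better: bool = True) -> pd.Series:
    """Rescale a numeric column to the 0-1 range."""
    span = values.max() - values.min()
    if span == 0:
        # Every city ties on this feature, so give all of them a neutral score.
        return pd.Series(0.5, index=values.index)
    score = (values - values.min()) / span
    return score if higher_is_better else 1 - score


def target_score(values: pd.Series, target: float) -> pd.Series:
    """Score 1 at the target value, falling linearly to 0 at the farthest city."""
    distance = (values - target).abs()
    return 1 - min_max_score(distance)


# Hypothetical cities and values, for illustration only.
cities = pd.DataFrame(
    {
        "city": ["A", "B", "C"],
        "population": [1000, 50000, 2000000],
        "crop_value": [300.0, 900.0, 100.0],
        "mean_temp": [22.0, 26.0, 18.0],
    }
).set_index("city")

scores = pd.DataFrame(
    {
        "population": min_max_score(cities["population"], higher_is_better=False),
        "crop_value": min_max_score(cities["crop_value"]),
        "mean_temp": target_score(cities["mean_temp"], target=22.0),
    }
)
scores["total"] = scores.sum(axis=1)          # last feature: sum of all scores
ranking = scores.sort_values("total", ascending=False)
print(ranking)
```

Summing unweighted scores treats every feature as equally important; weights could be added later if, say, water matters more than libraries.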

We also have a list of neighboring cities for each location, so we can map a route from any place to our sanctuary or try to find the best place within a certain distance of a starting point.
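
As a rough sketch of the routing idea, a breadth-first search over the neighbor lists finds the route that crosses the fewest cities. The `neighbors` dictionary and the city labels below are hypothetical, and `shortest_route` is an illustrative helper; counting hops ignores real road distances, which is a simplifying assumption.

```python
from collections import deque


def shortest_route(neighbors, start, goal):
    """Breadth-first search: the route with the fewest city-to-city hops."""
    previous = {start: None}          # city -> city we arrived from
    queue = deque([start])
    while queue:
        city = queue.popleft()
        if city == goal:
            # Walk back through 'previous' to rebuild the route.
            route = []
            while city is not None:
                route.append(city)
                city = previous[city]
            return route[::-1]
        for nxt in neighbors.get(city, []):
            if nxt not in previous:
                previous[nxt] = city
                queue.append(nxt)
    return None  # no chain of neighboring cities connects start and goal


# Hypothetical neighbor lists keyed by made-up city labels.
neighbors = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
    "F": [],
}
route = shortest_route(neighbors, "A", "E")
print(route)
```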

## 2. Loading

In [None]:
import folium
import pandas as pd
from bs4 import BeautifulSoup 
import requests 
import os

## Altitude, Temperature and Rainfall

In [None]:
df_climate = pd.read_excel('KoppenBrazilianmunicipalities.xls', sheet_name='Data')

In [None]:
df_climate.head()

In [None]:
df_climate['mean_temp'] = df_climate[df_climate.columns[6:18]].mean(axis=1)
df_climate['mean_rain'] = df_climate[df_climate.columns[18:]].mean(axis=1)

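# .copy() gives an independent frame, so later column assignments do not raise SettingWithCopyWarning
df_climate_clean = df_climate[['Municipality', 'IBGE-Code','State', 'Altitude', 'mean_temp', 'mean_rain']].copy()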
df_climate_clean.head()

In [None]:
# Store the IBGE code as text so it can be matched against the ids in the GeoJSON
df_climate_clean['IBGE-Code'] = df_climate_clean['IBGE-Code'].astype(str)

In [None]:
df_climate_clean.describe().T

## Loading maps

In [None]:
city_json = 'geojs-100-mun.json'

In [None]:
map_brazil = folium.Map(location = [-15.793889, -47.882778],
                       zoom_start=4)
map_brazil

In [None]:
folium.Choropleth(
    geo_data=city_json,
    name='temperature',
    data=df_climate_clean,
    columns=['IBGE-Code', 'mean_temp'],
    key_on='properties.id',
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='Mean temperature'
).add_to(map_brazil)

In [None]:
map_brazil
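
A choropleth only colors polygons whose id matches a value in the key column, so it can help to compare the two sets of ids before plotting. The sketch below uses a hypothetical in-memory GeoJSON with made-up ids and temperatures; the assumption that `properties.id` holds the code as a string mirrors the `key_on` setting above but is not checked against the real file.

```python
import pandas as pd

# Hypothetical miniature GeoJSON using the same 'properties.id' layout.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"id": "1100015"}, "geometry": None},
        {"type": "Feature", "properties": {"id": "1100023"}, "geometry": None},
    ],
}

# Hypothetical data rows; codes are cast to text like in the cells above.
data = pd.DataFrame({"IBGE-Code": [1100015, 1100031], "mean_temp": [25.1, 24.3]})
data["IBGE-Code"] = data["IBGE-Code"].astype(str)

geo_ids = {feature["properties"]["id"] for feature in geojson["features"]}
data_ids = set(data["IBGE-Code"])

missing_from_data = sorted(geo_ids - data_ids)  # polygons that would stay uncolored
missing_from_map = sorted(data_ids - geo_ids)   # rows with no polygon to draw
print(missing_from_data, missing_from_map)
```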

## Population and Human Development Index (Longevity)

In [None]:
df_atlas = pd.read_excel('Atlas 2013_municipal, estadual e Brasil.xlsx', sheet_name='MUN 91-00-10')
df_atlas = df_atlas[df_atlas['ANO']==2010]
df_atlas_clean = df_atlas[['Codmun7','Município','pesotot','IDHM_L']].copy()
df_atlas_clean['Codmun7'] = df_atlas_clean['Codmun7'].astype(str)
df_atlas_clean.head()

In [None]:
map_brazil = folium.Map(location = [-15.793889, -47.882778],
                       zoom_start=4)
folium.Choropleth(
    geo_data=city_json,
    name='idhm',
    data=df_atlas_clean,
    columns=['Codmun7', 'IDHM_L'],
    key_on='properties.id',
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='IDHM_L'
).add_to(map_brazil)

In [None]:
map_brazil

## Area

In [None]:
df_area = pd.read_excel('AR_BR_RG_UF_RGINT_RGIM_MES_MIC_MUN_2020.xls', sheet_name='AR_BR_MUN_2020')
df_area_clean = df_area[['CD_GCMUN','NM_MUN_2020','AR_MUN_2020']].copy()
df_area_clean['CD_GCMUN'] = df_area_clean['CD_GCMUN'].astype(int).astype(str)
df_area_clean.head()

In [None]:
map_brazil = folium.Map(location = [-15.793889, -47.882778],
                       zoom_start=4)
folium.Choropleth(
    geo_data=city_json,
    name='area',
    data=df_area_clean,
    columns=['CD_GCMUN', 'AR_MUN_2020'],
    key_on='properties.id',
    fill_color='YlOrRd',
    fill_opacity=0.7,
    line_opacity=0.2,
    legend_name='area'
).add_to(map_brazil)
map_brazil

## Solar incidence

In [None]:
df_solar = pd.read_csv('direct_normal_means_sedes-munic.csv',sep=';')
df_solar_clean = df_solar[['LON','LAT','NAME','STATE','ANNUAL']].copy()
df_solar_clean.head()

In [None]:
df_solar_clean = df_solar_clean.replace(['ACRE', 'ALAGOAS', 'AMAPÁ', 'AMAZONAS', 'BAHIA', 'CEARÁ',
       'DISTRITO FEDERAL', 'ESPÍRITO SANTO', 'GOIÁS', 'MARANHÃO',
       'MATO GROSSO', 'MATO GROSSO DO SUL', 'MINAS GERAIS', 'PARÁ',
       'PARAÍBA', 'PARANÁ', 'PERNAMBUCO', 'PIAUÍ', 'RIO DE JANEIRO',
       'RIO GRANDE DO NORTE', 'RIO GRANDE DO SUL', 'RONDÔNIA', 'RORAIMA',
       'SANTA CATARINA', 'SÃO PAULO', 'SERGIPE', 'TOCANTINS'],['AC', 'AL', 'AP', 'AM', 'BA', 'CE',
       'DF', 'ES', 'GO', 'MA', 'MT', 'MS', 'MG', 'PA', 'PB', 'PR', 'PE', 'PI', 'RJ', 'RN', 'RS', 
        'RO', 'RR', 'SC', 'SP', 'SE', 'TO'])
df_solar_clean.head()

In [None]:
df_solar_clean['citystate'] = df_solar_clean['NAME']+df_solar_clean['STATE']
df_solar_clean.head()

In [None]:
df_climate_clean['citystate'] = df_climate_clean['Municipality']+df_climate_clean['State']
df_climate_clean.head()

In [None]:
df_climatesolar = pd.merge(df_solar_clean, df_climate_clean, on='citystate', how='inner')
df_climatesolar_clean = df_climatesolar[['Municipality','State','IBGE-Code','LON','LAT','Altitude','mean_temp','mean_rain','ANNUAL']]
df_climatesolar_clean.head()
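
Joining on a concatenated name and state only works when both sources spell the city identically. A possible safeguard, sketched here with made-up rows, is to normalize case, accents and whitespace before building the key. `join_key` is a hypothetical helper and the `|` separator is an arbitrary choice; whether the real files actually differ in spelling is an assumption that should be checked.

```python
import unicodedata

import pandas as pd


def join_key(name: str, state: str) -> str:
    """Build a case- and accent-insensitive 'CITY|UF' key."""
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    clean_name = " ".join(ascii_name.split()).upper()  # collapse stray whitespace
    return f"{clean_name}|{state.strip().upper()}"


# Hypothetical rows that differ only in case, accents and trailing spaces.
solar = pd.DataFrame(
    {"NAME": ["SÃO PAULO", "Belem "], "STATE": ["SP", "PA"], "ANNUAL": [4500, 4200]}
)
climate = pd.DataFrame(
    {"Municipality": ["São Paulo", "Belém"], "State": ["SP", "PA"], "mean_temp": [19.0, 26.5]}
)

solar["key"] = [join_key(n, s) for n, s in zip(solar["NAME"], solar["STATE"])]
climate["key"] = [join_key(n, s) for n, s in zip(climate["Municipality"], climate["State"])]

merged = solar.merge(climate, on="key", how="inner")
print(merged[["key", "ANNUAL", "mean_temp"]])
```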

## Energy sources

In [None]:
df_energy = pd.read_csv('EmpreendimentoOperacao.csv')
df_energy = df_energy[['dscMunicipio','dscOrigemCombustivel','mdaPotenciaOutorgadakW']]
df_energy = df_energy.dropna(axis = 0, how ='any')
df_energy.head()

In [None]:
df_energy['pos'] = df_energy['dscMunicipio'].str.find('CodIbge:').astype(int)
df_energy.head()

In [None]:
df_energy['CodMunic'] = df_energy.apply(lambda x: x['dscMunicipio'][x['pos']+9:x['pos']+16],axis=1)
df_energy.head()
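
`str.find` returns -1 when the marker is missing, and the slice above would then silently return unrelated characters. A regular expression is one alternative that yields a missing value instead. The sample strings below are hypothetical; the real layout of `dscMunicipio` is assumed to be `CodIbge:` followed by a seven-digit code.

```python
import pandas as pd

# Hypothetical strings imitating the assumed 'CodIbge: <7 digits>' layout.
sample = pd.DataFrame(
    {"dscMunicipio": ["Porto Velho - RO (CodIbge: 1100205)", "no code here"]}
)

# One capture group with expand=False returns a Series; rows without a match become NaN.
sample["CodMunic"] = sample["dscMunicipio"].str.extract(
    r"CodIbge:\s*(\d{7})", expand=False
)
print(sample)
```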

In [None]:
df_energy_clean = df_energy[['CodMunic','dscOrigemCombustivel','mdaPotenciaOutorgadakW']].copy()
df_energy_clean.head()

In [None]:
df_energy_clean['mdaPotenciaOutorgadakW'] = df_energy_clean['mdaPotenciaOutorgadakW'].astype(int)

In [None]:
df_energy_final = df_energy_clean.groupby(['CodMunic','dscOrigemCombustivel'], as_index=False).agg({'mdaPotenciaOutorgadakW':'sum'})
df_energy_final.head()

In [None]:
df_energy_pivot = df_energy_final.pivot(index='CodMunic', columns='dscOrigemCombustivel', values='mdaPotenciaOutorgadakW')
df_energy_pivot.reset_index(inplace=True)
df_energy_pivot.head()
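
The grouping and pivoting steps above could also be written as a single `pivot_table` call, which additionally fills the gaps for cities that lack a given energy source. The rows and the source labels below are made up for illustration.

```python
import pandas as pd

# Hypothetical plants: two hydro plants and one solar plant in the first city.
plants = pd.DataFrame(
    {
        "CodMunic": ["1100205", "1100205", "1100205", "3550308"],
        "dscOrigemCombustivel": ["Hídrica", "Hídrica", "Solar", "Solar"],
        "mdaPotenciaOutorgadakW": [1000, 500, 20, 75],
    }
)

capacity = plants.pivot_table(
    index="CodMunic",
    columns="dscOrigemCombustivel",
    values="mdaPotenciaOutorgadakW",
    aggfunc="sum",   # total granted power per city and source
    fill_value=0,    # a city without a source gets 0 instead of NaN
).reset_index()
capacity.columns.name = None  # drop the leftover columns label
print(capacity)
```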

## Hospital beds

In [None]:
df_hospital = pd.read_csv('leitoshospitalares.csv',sep=';')
df_hospital.head()

In [None]:
# Keep only the first six characters of the field (vectorized, no row-wise apply)
df_hospital['Município'] = df_hospital['Município'].str[:6]
df_hospital.head()
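
The hospital table is keyed by six characters, while the other tables use the seven-digit IBGE code. Since the seventh digit of the IBGE code is a check digit, one way to line the tables up is to truncate the longer code. The sketch below assumes that relationship holds for this file; the `beds` column name and all values are hypothetical.

```python
import pandas as pd

# Hypothetical rows; 'beds' is an illustrative column name.
climate = pd.DataFrame({"IBGE-Code": ["1100205", "3550308"], "mean_temp": [25.6, 19.0]})
beds = pd.DataFrame({"Município": ["110020", "355030"], "beds": [1200, 30000]})

# Drop the check digit so both tables share a six-digit key.
climate["code6"] = climate["IBGE-Code"].str[:6]
merged = climate.merge(beds, left_on="code6", right_on="Município", how="left")
print(merged[["IBGE-Code", "mean_temp", "beds"]])
```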

## Neighbor cities

In [None]:
df_neighbors = pd.read_csv('adjacency_matrix.csv')
df_neighbors.head()
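
For route finding it is usually easier to work with a list of neighbors per city than with a matrix. The sketch below assumes the file is a square 0/1 matrix with city codes as both row and column labels, which is not verified here; the three-city matrix is made up.

```python
import pandas as pd

# Hypothetical 3x3 adjacency matrix: B borders both A and C.
codes = ["A", "B", "C"]
matrix = pd.DataFrame(
    [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]],
    index=codes,
    columns=codes,
)

# For each row, keep the column labels whose flag is non-zero.
neighbors = {
    city: [other for other, flag in row.items() if flag]
    for city, row in matrix.iterrows()
}
print(neighbors)
```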

## Agriculture production

In [None]:
df_agro1 = pd.read_excel('tabela1612.xlsx')
df_agro2 = pd.read_excel('tabela1613.xlsx')
df_agro = pd.merge(df_agro1, df_agro2, on=['Cód.','Município'], how='inner')
df_agro = df_agro.fillna(0)
df_agro['Total prod'] = (df_agro['Produto das lavouras temporárias'].astype(int) + df_agro['Produto das lavouras permanentes'].astype(int))/100.
df_agro_clean = df_agro[['Cód.','Total prod']]
df_agro_clean.head()

## Airports

In [None]:
url = 'https://en.wikipedia.org/wiki/List_of_airports_in_Brazil'
html_data = requests.get(url).text
soup = BeautifulSoup(html_data,"html5lib") 

In [None]:
#get the list
table_contents = []
table = soup.find_all('table')
for row in table[0].find_all('tr'):
    temp=[]
    for cell in row.find_all('td'):
        # strip surrounding whitespace and newlines so names can serve as join keys
        temp.append(cell.get_text().strip())
    table_contents.append(temp)
table_contents[:5]

In [None]:
df_airports = pd.DataFrame(table_contents,columns=['Municipio','Estado','Sigla','Sigla2','Nome','outro'])
df_airports = df_airports.drop([0]).reset_index(drop=True)
df_airports_clean = df_airports[['Municipio','Estado']].copy()
df_airports_clean.head()

In [None]:
df_airports_clean['Estado'] = df_airports_clean['Estado'].replace(['Acre', 'Alagoas', 'Amapá', 'Amazonas', 'Bahia', 'Ceará',
       'Federal District', 'Espírito Santo', 'Goiás', 'Maranhão', 'Mato Grosso', 'Mato Grosso do Sul', 
        'Minas Gerais', 'Pará', 'Paraíba', 'Paraná', 'Pernambuco', 'Piauí', 'Rio de Janeiro',
       'Rio Grande do Norte', 'Rio Grande do Sul', 'Rondônia', 'Roraima', 'Santa Catarina', 'São Paulo', 
        'Sergipe', 'Tocantins'],['AC', 'AL', 'AP', 'AM', 'BA', 'CE',
       'DF', 'ES', 'GO', 'MA', 'MT', 'MS', 'MG', 'PA', 'PB', 'PR', 'PE', 'PI', 'RJ', 'RN', 'RS', 
        'RO', 'RR', 'SC', 'SP', 'SE', 'TO'])
df_airports_clean.head()

In [None]:
df_airports_clean['citystate'] = df_airports_clean['Municipio']+df_airports_clean['Estado']
df_climateairport = pd.merge(df_airports_clean, df_climate_clean, on='citystate', how='inner')
df_climateairport.head()

In [None]:
# Count airports per city; rename_axis/reset_index name both columns explicitly
df_airports_final = df_climateairport['IBGE-Code'].value_counts().rename_axis('IBGE-Code').reset_index(name='Airports')
df_airports_final.head()
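
An inner join silently drops every airport whose city name does not match the climate table. A left join with `indicator=True` is one way to list what was lost before counting. The rows below are hypothetical and only imitate a spelling mismatch.

```python
import pandas as pd

# Hypothetical keys: the second airport row lacks the accent used in the climate table.
airports = pd.DataFrame({"citystate": ["ManausAM", "Sao PauloSP"]})
climate = pd.DataFrame(
    {"citystate": ["ManausAM", "São PauloSP"], "IBGE-Code": ["1302603", "3550308"]}
)

# indicator=True adds a '_merge' column telling where each row was found.
audit = airports.merge(climate, on="citystate", how="left", indicator=True)
unmatched = audit.loc[audit["_merge"] == "left_only", "citystate"].tolist()
print(unmatched)
```

The same check applies to the military base and library merges in the following sections.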

## Military Bases

In [None]:
url = 'https://en.wikipedia.org/wiki/List_of_Brazilian_military_bases'
html_data = requests.get(url).text
soup = BeautifulSoup(html_data,"html5lib") 

In [None]:
#get the list
table_contents = []
tables = soup.find_all('table')
for table in tables:
    for row in table.find_all('tr'):
        temp=[]
        for cell in row.find_all('td'):
            # strip surrounding whitespace and newlines so names can serve as join keys
            temp.append(cell.get_text().strip())
        table_contents.append(temp)
table_contents[:5]

In [None]:
df_military = pd.DataFrame(table_contents,columns=['Location', 'State', 'ICAO', 'Code', 'Basename','outro1','outro2'])
df_military = df_military.drop([0]).reset_index(drop=True)
df_military_clean = df_military[['Location','State']].copy()
df_military_clean.head()

In [None]:
df_military_clean['State'] = df_military_clean['State'].replace(['Acre', 'Alagoas', 'Amapá', 'Amazonas', 'Bahia', 'Ceará',
       'Federal District', 'Espírito Santo', 'Goiás', 'Maranhão', 'Mato Grosso', 'Mato Grosso do Sul', 
        'Minas Gerais', 'Pará', 'Paraíba', 'Paraná', 'Pernambuco', 'Piauí', 'Rio de Janeiro',
       'Rio Grande do Norte', 'Rio Grande do Sul', 'Rondônia', 'Roraima', 'Santa Catarina', 'São Paulo', 
        'Sergipe', 'Tocantins'],['AC', 'AL', 'AP', 'AM', 'BA', 'CE',
       'DF', 'ES', 'GO', 'MA', 'MT', 'MS', 'MG', 'PA', 'PB', 'PR', 'PE', 'PI', 'RJ', 'RN', 'RS', 
        'RO', 'RR', 'SC', 'SP', 'SE', 'TO'])
df_military_clean['citystate'] = df_military_clean['Location']+df_military_clean['State']
df_climatemilitary = pd.merge(df_military_clean, df_climate_clean, on='citystate', how='inner')
df_climatemilitary.head()

In [None]:
# Count military bases per city; rename_axis/reset_index name both columns explicitly
df_military_final = df_climatemilitary['IBGE-Code'].value_counts().rename_axis('IBGE-Code').reset_index(name='Military')
df_military_final.head()

## Libraries

In [None]:
path='libraries/'
frames = []
for filename in os.listdir(path):
    df_temp = pd.read_csv(os.path.join(path, filename), skiprows=3)
    # the two characters before the file extension are used as the state abbreviation
    df_temp['estado'] = filename[-6:-4]
    frames.append(df_temp)
# DataFrame.append was removed in pandas 2.0; pd.concat stacks the files in one call
df_libraries = pd.concat(frames, ignore_index=True)
df_libraries.head()

In [None]:
# Forward-fill the city name for rows that list additional libraries in the same city
df_libraries['MUNICÍPIO'] = df_libraries['MUNICÍPIO'].ffill()
df_libraries_clean = df_libraries[['MUNICÍPIO','estado']].copy()
df_libraries_clean['citystate'] = df_libraries_clean['MUNICÍPIO']+df_libraries_clean['estado']
df_climatelibrary = pd.merge(df_libraries_clean, df_climate_clean, on='citystate', how='inner')
df_libraries_final = df_climatelibrary['IBGE-Code'].value_counts().rename_axis('IBGE-Code').reset_index(name='Libraries')
df_libraries_final.head()

## Universities

In [None]:
df_universities = pd.read_csv('SUP_IES_2019.CSV',encoding='iso-8859-1' ,sep='|')
# Count institutions per city code
df_universities_final = df_universities['CO_MUNICIPIO'].value_counts().rename_axis('CO_MUNICIPIO').reset_index(name='Universities')
df_universities_final.head()
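
The count tables built above only contain cities that have at least one airport, base, library or university. To end up with one row per city, they can be left-joined onto a complete list of codes, with missing counts filled with zero. The sketch below uses made-up codes and counts and assumes every table shares an `IBGE-Code` column of the same type.

```python
from functools import reduce

import pandas as pd

# Hypothetical complete city list and two sparse count tables.
base = pd.DataFrame({"IBGE-Code": ["1", "2", "3"]})
airports = pd.DataFrame({"IBGE-Code": ["1"], "Airports": [2]})
libraries = pd.DataFrame({"IBGE-Code": ["1", "3"], "Libraries": [5, 1]})

# Left-join each count table in turn so no city is dropped.
features = reduce(
    lambda left, right: left.merge(right, on="IBGE-Code", how="left"),
    [airports, libraries],
    base,
)

# A city absent from a count table simply has none of that item.
count_cols = ["Airports", "Libraries"]
features[count_cols] = features[count_cols].fillna(0).astype(int)
print(features)
```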