## 1.0 Introduction

The main purpose of the `geo_location` module is to automatically fill in Brazilian regions and states from city names, state names, and postal codes (CEPs).

## 1.1 Import modules

In [1]:
# From the module
from gumly import geo_location

# Others
import pandas as pd

## 1.2 Creating the data

In [2]:
d = {'Customer': [1, 2, 3, 4],
     'City' : ['São Paulo', 'Sao Paulo', 'sao paulo', 'São Pauol']} # With a typo on the last entry

## 1.3 Filling the region from the city

In [3]:
dfregion = pd.DataFrame(data=d)
dfregion['Region'] = geo_location.city_to_region(dfregion, 'City')
dfregion

|   | Customer | City | Region |
|---|---|---|---|
| 0 | 1 | São Paulo | Sudeste |
| 1 | 2 | Sao Paulo | Sudeste |
| 2 | 3 | sao paulo | Sudeste |
| 3 | 4 | São Pauol | |

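The first three spellings resolve to the same region, while the misspelled `São Pauol` stays empty. One way to picture this behavior is a lookup on a normalized key. The sketch below is an assumption for illustration only, not necessarily how the library works internally; `normalize_name` is a hypothetical helper built on the standard library.

```python
import unicodedata


def normalize_name(name: str) -> str:
    """Return a lowercase, accent-free, whitespace-trimmed version of a name.

    Hypothetical helper for illustration; not part of gumly.
    """
    # NFKD decomposition splits an accented character into its base
    # letter followed by a combining mark (for example "ã" -> "a" + "~").
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks so that only the base letters remain.
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Lowercase and collapse any repeated or surrounding whitespace.
    return " ".join(without_accents.lower().split())


cities = ["São Paulo", "Sao Paulo", "sao paulo", "São Pauol"]
keys = [normalize_name(city) for city in cities]
print(keys)
# ['sao paulo', 'sao paulo', 'sao paulo', 'sao pauol']
```

The three valid spellings collapse to the same key, so an exact lookup would succeed for them. The typo produces a different key and would find no match.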

In [4]:
dfmicroregion = pd.DataFrame(data=d)
dfmicroregion['Microregion'] = geo_location.city_to_microregion(dfmicroregion, 'City')
dfmicroregion

|   | Customer | City | Microregion |
|---|---|---|---|
| 0 | 1 | São Paulo | São Paulo |
| 1 | 2 | Sao Paulo | São Paulo |
| 2 | 3 | sao paulo | São Paulo |
| 3 | 4 | São Pauol | |


In [6]:
dfmesoregion = pd.DataFrame(data=d)
dfmesoregion['Mesoregion'] = geo_location.city_to_mesoregion(dfmesoregion, 'City')
dfmesoregion

|   | Customer | City | Mesoregion |
|---|---|---|---|
| 0 | 1 | São Paulo | Metropolitana de São Paulo |
| 1 | 2 | Sao Paulo | Metropolitana de São Paulo |
| 2 | 3 | sao paulo | Metropolitana de São Paulo |
| 3 | 4 | São Pauol | |


In [7]:
dfimediate = pd.DataFrame(data=d)
dfimediate['Imediate_region'] = geo_location.city_to_imediate_region(dfimediate, 'City')
dfimediate

|   | Customer | City | Imediate_region |
|---|---|---|---|
| 0 | 1 | São Paulo | São Paulo |
| 1 | 2 | Sao Paulo | São Paulo |
| 2 | 3 | sao paulo | São Paulo |
| 3 | 4 | São Pauol | |


In [8]:
dfintermediarie = pd.DataFrame(data=d)
dfintermediarie['Intermediarie_region'] = geo_location.city_to_intermediarie_region(dfintermediarie, 'City')
dfintermediarie

|   | Customer | City | Intermediarie_region |
|---|---|---|---|
| 0 | 1 | São Paulo | São Paulo |
| 1 | 2 | Sao Paulo | São Paulo |
| 2 | 3 | sao paulo | São Paulo |
| 3 | 4 | São Pauol | |


## 1.4 Filling the state from the city

In [9]:
dfstate = pd.DataFrame(data=d)
dfstate['State'] = geo_location.city_to_state(dfstate, 'City')
dfstate

|   | Customer | City | State |
|---|---|---|---|
| 0 | 1 | São Paulo | São Paulo |
| 1 | 2 | Sao Paulo | São Paulo |
| 2 | 3 | sao paulo | São Paulo |
| 3 | 4 | São Pauol | |


## 1.5 Filling the region from the state

In [14]:
dstate = {'Customer': [1, 2, 3, 4],
          'State': ['Ceará', 'Ceara', 'ceara', 'ceaara']} # With a typo on the last entry

In [15]:
dfstateregion = pd.DataFrame(data=dstate)
dfstateregion['state_to_region'] = geo_location.state_to_region(dfstateregion, 'State')
dfstateregion

|   | Customer | State | state_to_region |
|---|---|---|---|
| 0 | 1 | Ceará | Nordeste |
| 1 | 2 | Ceara | Nordeste |
| 2 | 3 | ceara | Nordeste |
| 3 | 4 | ceaara | |


## 1.7 Creating the data with postal codes (CEPs)

In [20]:
dcep = {'Customer': [1, 2, 3],
        'CEP': ['03033-070', '03033070', '03033']} # Complete (with and without hyphen) and incomplete CEPs

## 1.8 Filling the state from the CEP

In [21]:
dfcepstate = pd.DataFrame(data=dcep)
dfcepstate['State'] = geo_location.cep_to_state(dfcepstate, 'CEP')
dfcepstate

|   | Customer | CEP | State |
|---|---|---|---|
| 0 | 1 | 03033-070 | São Paulo |
| 1 | 2 | 03033070 | São Paulo |
| 2 | 3 | 03033 | São Paulo |

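All three CEP formats resolve to the same state. A plausible reading is that only the leading digits matter for the lookup. The sketch below shows how the three inputs could be reduced to a common five-digit prefix; `cep_prefix` is a hypothetical helper, and the assumption that the library works this way is not confirmed by its documentation.

```python
import re


def cep_prefix(cep: str) -> str:
    """Return the first five digits of a CEP, ignoring any separators.

    Hypothetical helper for illustration; not part of gumly.
    """
    # Remove everything that is not a digit (hyphens, dots, spaces).
    digits = re.sub(r"\D", "", cep)
    # Keep at most the first five digits.
    return digits[:5]


ceps = ["03033-070", "03033070", "03033"]
prefixes = [cep_prefix(cep) for cep in ceps]
print(prefixes)
# ['03033', '03033', '03033']
```

Note that the CEPs are stored as strings. Storing them as integers would drop the leading zero and change the prefix.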

## 1.9 Filling the region from the CEP

In [23]:
dfcepregion = pd.DataFrame(data=dcep)
dfcepregion['Region'] = geo_location.cep_to_region(dfcepregion, 'CEP')
dfcepregion

|   | Customer | CEP | Region |
|---|---|---|---|
| 0 | 1 | 03033-070 | Sudeste |
| 1 | 2 | 03033070 | Sudeste |
| 2 | 3 | 03033 | Sudeste |


## 2.0 Conclusion and library advantages

This implementation is a simple way to quickly fill in Brazilian states and regions in DataFrames that lack this information. It accepts complete and incomplete postal codes (CEPs), as well as city and state names without the correct accents or capitalization. It cannot handle typos: misspelled names are left unfilled.

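Since typos are not handled, one possible workaround is to correct misspelled names before calling the library. The sketch below uses `difflib.get_close_matches` from the Python standard library. The reference list `KNOWN_CITIES`, the helper `suggest_city`, and the `0.8` cutoff are all assumptions made for illustration; in practice the reference list would come from an authoritative source of city names.

```python
import difflib

# Hypothetical reference list; a real one would hold all valid city names.
KNOWN_CITIES = ["São Paulo", "Rio de Janeiro", "Fortaleza"]


def suggest_city(name, known=KNOWN_CITIES, cutoff=0.8):
    """Return the closest known city name, or None if nothing is similar enough.

    Hypothetical helper for illustration; not part of gumly.
    """
    # get_close_matches returns up to n candidates whose similarity
    # ratio is at least `cutoff`, best match first.
    matches = difflib.get_close_matches(name, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(suggest_city("São Pauol"))  # São Paulo
print(suggest_city("Xyz"))        # None
```

A stricter cutoff reduces the risk of mapping a name to the wrong city, at the cost of leaving more typos uncorrected. Any suggested corrections are worth reviewing before they overwrite the original data.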
## References

[IBGE library](https://pypi.org/project/ibge/)

[IBGE website](https://www.ibge.gov.br/)

[Everything about CEP](https://www.correios.com.br/enviar/precisa-de-ajuda/tudo-sobre-cep)
