# Average income trends in Brazil and São Paulo compared

This study compares changes in average income between Brazil as a whole and São Paulo, its richest city, using quarterly data from the first quarter of 2012 to the first quarter of 2020, broken down by education level.
Data was queried from the [Brazilian Institute of Geography and Statistics (IBGE) API](https://servicodados.ibge.gov.br/api/docs/agregados?versao=3), read with *pandas*, and prepared for plotting with *matplotlib* and *seaborn*.

### Data querying

We want to explore how average income differs between São Paulo and Brazil as a whole.
IBGE's National Household Sample Survey (PNAD), specifically variable 5935 of aggregate table 5438, was queried using the following criteria:
- Average income of all citizens, regardless of education level;
- Average income of citizens who completed high school;
- Average income of citizens who hold a university degree;
- All quarterly periods available.

The following query URL was obtained using [IBGE's query builder](https://servicodados.ibge.gov.br/api/docs/agregados?versao=3#api-bq):

In [1]:
IBGE_query = 'https://servicodados.ibge.gov.br/api/v3/agregados/5438/periodos/201201|201202|201203|201204|201301|201302|201303|201304|201401|201402|201403|201404|201501|201502|201503|201504|201601|201602|201603|201604|201701|201702|201703|201704|201801|201802|201803|201804|201901|201902|201903|201904|202001/variaveis/5935?localidades=N1[all]|N6[3550308]&classificacao=1568[120704,11630,11632]'
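
The period list in the URL follows a regular `YYYYQQ` pattern, so the same link could also be assembled in code instead of copied from the query builder. The sketch below is one way to do that, not the method used in this study: `quarterly_periods` and `build_ibge_query` are hypothetical helpers, and the path layout is simply copied from the URL above.

```python
# Hypothetical helpers: rebuild the query URL above from its parts.
BASE_URL = "https://servicodados.ibge.gov.br/api/v3/agregados"


def quarterly_periods(first_year, last_year, last_quarter=4):
    """Return YYYYQQ codes from Q1 of first_year to last_quarter of last_year."""
    periods = []
    for year in range(first_year, last_year + 1):
        final_quarter = last_quarter if year == last_year else 4
        for quarter in range(1, final_quarter + 1):
            periods.append(f"{year}{quarter:02d}")
    return periods


def build_ibge_query(aggregate, variable, periods, localities, classification):
    """Assemble the URL using the same layout as the hand-built link."""
    return (f"{BASE_URL}/{aggregate}/periodos/{'|'.join(periods)}"
            f"/variaveis/{variable}?localidades={localities}"
            f"&classificacao={classification}")


rebuilt_query = build_ibge_query(
    aggregate=5438,                                   # aggregate table
    variable=5935,                                    # average income variable
    periods=quarterly_periods(2012, 2020, last_quarter=1),
    localities="N1[all]|N6[3550308]",                 # Brazil and São Paulo city
    classification="1568[120704,11630,11632]",        # education levels
)
```

Comparing `rebuilt_query` with `IBGE_query` is a quick check that no quarter was dropped when the period list was written by hand.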

In [2]:
import pandas as pd

data = pd.read_json(IBGE_query)
data

Unnamed: 0,id,variavel,unidade,resultados
0,5935,"Rendimento médio real de todos os trabalhos, e...",Reais,"[{'classificacoes': [{'id': '1568', 'nome': 'N..."


The response is a single row whose `resultados` column holds a nested JSON structure, so we use *pandas* to reshape it into tabular form.

### Preparing data

In addition to the time series themselves, the last column of `data` (`resultados`) contains the education-level labels.
This column needs to be expanded so that the time series can be extracted and labeled correctly.
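
To make the nesting concrete, here is a minimal sketch on a hand-built mock of the `resultados` structure. The layout is inferred from the outputs in this notebook, the figures are the first two quarters shown in the table further down, and the category codes (`120704`, `11630`) are assumed to map to the labels in the order they appear in the query. `make_series` is a hypothetical helper, and values are written as strings on the assumption that the API returns them that way.

```python
import pandas as pd


def make_series(loc_id, level_id, level_name, name, values):
    """Hypothetical helper: build one entry of a result's 'series' list."""
    return {"localidade": {"id": loc_id,
                           "nivel": {"id": level_id, "nome": level_name},
                           "nome": name},
            "serie": values}


# Mock of the nested 'resultados' list: one entry per education level.
mock_results = [
    {"classificacoes": [{"id": "1568", "nome": "Nível de instrução",
                         "categoria": {"120704": "Total"}}],
     "series": [
         make_series("1", "N1", "Brasil", "Brasil",
                     {"201201": "2729", "201202": "2575"}),
         make_series("3550308", "N6", "Município", "São Paulo (SP)",
                     {"201201": "4009", "201202": "3948"})]},
    {"classificacoes": [{"id": "1568", "nome": "Nível de instrução",
                         "categoria": {"11630": "Ensino médio completo ou equivalente"}}],
     "series": [
         make_series("1", "N1", "Brasil", "Brasil",
                     {"201201": "2452", "201202": "2282"}),
         make_series("3550308", "N6", "Município", "São Paulo (SP)",
                     {"201201": "2716", "201202": "2542"})]},
]

frames = []
for result in mock_results:
    # The education-level label is the single value of the 'categoria' dict.
    category = next(iter(result["classificacoes"][0]["categoria"].values()))
    # Flatten the nested 'localidade' and 'serie' dicts into dotted columns.
    frame = pd.json_normalize(result["series"])
    frame["categoria"] = category
    frames.append(frame)

table = pd.concat(frames, ignore_index=True)
```

Each row of `table` is one location and education level, with one `serie.YYYYQQ` column per quarter. This loop-based variant is an alternative to passing `record_path` and `meta` in a single `json_normalize` call; it is longer, but the label extraction does not depend on how the metadata column is stored.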

In [24]:
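# Flatten the nested JSON: one row per series (location x education level),
# keeping each result's 'classificacoes' entry as a metadata column.
average_income = pd.json_normalize(data.loc[0, 'resultados'],
                                   record_path='series',
                                   meta=['classificacoes'])

# Extract the education-level label from the nested classification.
# Depending on the pandas version, the metadata cell may hold the
# classification dict itself or a one-element list wrapping it.
def extract_category(classification):
    if isinstance(classification, list):
        classification = classification[0]
    return list(classification['categoria'].values())[0]

average_income['categoria'] = average_income['classificacoes'].apply(extract_category)

# Optional cleanup, left commented out so the output below shows the raw columns:
# drop redundant columns, translate some names and set the category as the new index.
# average_income = average_income.drop(columns=['localidade.id', 'localidade.nivel.id',
#                                               'localidade.nivel.nome', 'classificacoes'])
# average_income.replace({'Ensino médio completo ou equivalente': 'High School',
#                         'Ensino superior completo ou equivalente': 'University',
#                         'São Paulo (SP)': 'São Paulo',
#                         'Brasil': 'Brazil'},
#                        inplace=True)
# average_income.set_index('categoria', inplace=True)

average_income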

Unnamed: 0,localidade.id,localidade.nivel.id,localidade.nivel.nome,localidade.nome,serie.201201,serie.201202,serie.201203,serie.201204,serie.201301,serie.201302,...,serie.201802,serie.201803,serie.201804,serie.201901,serie.201902,serie.201903,serie.201904,serie.202001,classificacoes,categoria
0,1,N1,Brasil,Brasil,2729,2575,2597,2617,2745,2650,...,2727,2729,2820,3022,2721,2736,2851,3044,"{'id': '1568', 'nome': 'Nível de instrução', '...",Total
1,3550308,N6,Município,São Paulo (SP),4009,3948,3988,4139,4068,4198,...,4731,4546,4996,5142,4621,4570,4823,5140,"{'id': '1568', 'nome': 'Nível de instrução', '...",Total
2,1,N1,Brasil,Brasil,2452,2282,2281,2289,2405,2333,...,2146,2121,2210,2338,2098,2101,2188,2288,"{'id': '1568', 'nome': 'Nível de instrução', '...",Ensino médio completo ou equivalente
3,3550308,N6,Município,São Paulo (SP),2716,2542,2492,2590,2772,2795,...,2602,2494,2690,2579,2292,2250,2410,2540,"{'id': '1568', 'nome': 'Nível de instrução', '...",Ensino médio completo ou equivalente
4,1,N1,Brasil,Brasil,7081,6606,6621,6602,6881,6597,...,6205,6249,6368,6790,6081,6154,6398,6715,"{'id': '1568', 'nome': 'Nível de instrução', '...",Ensino superior completo ou equivalente
5,3550308,N6,Município,São Paulo (SP),8814,8388,8498,8646,8197,8573,...,9734,9319,10023,10272,9315,9206,9596,9850,"{'id': '1568', 'nome': 'Nível de instrução', '...",Ensino superior completo ou equivalente
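
For plotting with *seaborn*, a long (tidy) layout is usually more convenient than one column per quarter. The sketch below shows one possible reshaping step, not necessarily the one used later in this study. It assumes that every quarter column contains `serie.` in its name and ends in a six-digit `YYYYQQ` code, as in the table above. `to_long_format` is a hypothetical helper, the new column names `periodo` and `rendimento` are illustrative, and the excerpt is hand-copied from the first two quarters of the "Total" rows.

```python
import pandas as pd

# Hand-copied excerpt of the table above (values as strings, which is an
# assumption about how the API returns them).
excerpt = pd.DataFrame({
    "localidade.nome": ["Brasil", "São Paulo (SP)"],
    "categoria": ["Total", "Total"],
    "serie.201201": ["2729", "4009"],
    "serie.201202": ["2575", "3948"],
})


def to_long_format(wide):
    """Hypothetical helper: one row per (location, category, quarter)."""
    quarter_cols = [c for c in wide.columns if "serie." in c]
    id_cols = [c for c in wide.columns if c not in quarter_cols]
    tidy = wide.melt(id_vars=id_cols, value_vars=quarter_cols,
                     var_name="periodo", value_name="rendimento")
    # Keep the trailing YYYYQQ code, e.g. "serie.201201" -> "201201".
    codes = tidy["periodo"].str[-6:]
    # "201201" -> "2012Q1", parsed as a quarterly period.
    tidy["periodo"] = pd.PeriodIndex(codes.str[:4] + "Q" + codes.str[5], freq="Q")
    # Convert income values to numbers so they can be plotted.
    tidy["rendimento"] = pd.to_numeric(tidy["rendimento"])
    return tidy


long_income = to_long_format(excerpt)
```

With the data in this shape, location and education level can be mapped directly to plot aesthetics such as hue or line style.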
