# The effects of the pandemic and changes of government on fuel prices

By:
- Djállen Fabrício Lima Dias
- Flávio Mesquita Marinho Filho
- Rhuan Gabriel Do Nascimento Galdino

In [1]:
import pandas as pd

# Data

This section describes what the data contained and which information we used, and explains how we processed the files to get the results we wanted.

We used the data available at:
- [Brazilian GDP](https://www.ibge.gov.br/estatisticas/economicas/contas-nacionais/9300-contas-nacionais-trimestrais.html?=&t=series-historicas&utm_source=landing&utm_medium=explica&utm_campaign=pib#evolucao-taxa)
- [Fuel prices - monthly historical series from 2013 onward](https://www.gov.br/anp/pt-br/assuntos/precos-e-defesa-da-concorrencia/precos/precos-revenda-e-de-distribuicao-combustiveis/serie-historica-do-levantamento-de-precos#)
- [How GDP is calculated](https://www.ibge.gov.br/explica/pib.php)
- [Brazilian HDI](https://www.kaggle.com/datasets/fidelissauro/indice-desenvolviment-humano-brasil)
- [Global HDI](https://globaldatalab.org/shdi/table/shdi/?levels=1&interpolation=0&extrapolation=0)

## HDI (IDH)

In [2]:
# This file has little information, but it let us look at the Brazilian HDI in a more general way
dataIdh = pd.read_csv("arquivos/IDH/orig_idh.csv") # the original file from Kaggle
dataIdh

Unnamed: 0,ano_referencia,idh,idh_feminino,idh_masculino,expectativa_de_vida,expectativa_de_vida_feminina,expectativa_de_vida_masculina,expectativa_de_anos_escola,expectativa_de_anos_escola_feminina,expectativa_de_anos_escola_masculina
0,1991,0.616,0.0,0.0,66.31,69.33,63.44,12.33,0.0,0.0
1,1992,0.622,0.0,0.0,66.71,69.78,63.8,12.55,0.0,0.0
2,1993,0.63,0.0,0.0,67.11,70.18,64.19,12.77,0.0,0.0
3,1994,0.638,0.0,0.0,67.57,70.65,64.64,12.99,0.0,0.0
4,1995,0.646,0.0,0.0,67.92,71.09,64.9,13.22,0.0,0.0
5,1996,0.653,0.0,0.0,68.41,71.59,65.38,13.44,0.0,0.0
6,1997,0.66,0.0,0.0,68.81,72.03,65.74,13.66,0.0,0.0
7,1998,0.666,0.0,0.0,69.19,72.43,66.09,13.88,0.0,0.0
8,1999,0.671,0.0,0.0,69.52,72.81,66.38,14.1,0.0,0.0
9,2000,0.679,0.0,0.0,69.74,73.43,66.26,14.33,0.0,0.0


Since the data types did not need converting, we only had to create a new file with the data we would actually use, which covers 2017 through 2021.
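The filtering step itself is not shown in this notebook. A minimal sketch of how such a subset could be produced, assuming a frame shaped like `orig_idh.csv` (only the `ano_referencia` and `idh` columns are reproduced here, and the commented-out `to_csv` call is an assumption about how the file was written):

```python
import pandas as pd

# Small stand-in for orig_idh.csv; values are copied from the tables in this notebook.
idh = pd.DataFrame({
    "ano_referencia": [1999, 2000, 2017, 2018, 2019, 2020, 2021],
    "idh": [0.671, 0.679, 0.759, 0.764, 0.766, 0.758, 0.754],
})

# Keep only the years used in the analysis (both bounds are inclusive).
idh_2017_2021 = idh[idh["ano_referencia"].between(2017, 2021)].reset_index(drop=True)

# Assumed write step, left commented out so the sketch has no side effects.
# idh_2017_2021.to_csv("arquivos/IDH/2017-2021_idh.csv")
print(idh_2017_2021)
```

`Series.between` includes both endpoints by default, so 2017 and 2021 are both kept.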

In [3]:
dataIdh = pd.read_csv("arquivos/IDH/2017-2021_idh.csv") # the file we used
dataIdh

Unnamed: 0,ano_referencia,idh,idh_feminino,idh_masculino,expectativa_de_vida,expectativa_de_vida_feminina,expectativa_de_vida_masculina,expectativa_de_anos_escola,expectativa_de_anos_escola_feminina,expectativa_de_anos_escola_masculina
0,2017,0.759,0.753125,0.760735,74.83,78.03,71.64,15.49,15.84,15.13
1,2018,0.764,0.758375,0.765752,75.11,78.27,71.96,15.7,16.08,15.33
2,2019,0.766,0.761318,0.767225,75.34,78.47,72.2,15.6,16.03,15.17
3,2020,0.758,0.753576,0.75844,74.01,77.37,70.7,15.6,16.03,15.17
4,2021,0.754,0.750013,0.75468,72.75,76.01,69.56,15.6,16.03,15.17


The next HDI file comes from a larger one that contained information on many other countries; again, we created a file with only what we would use.

In [4]:
bigdataIdh = pd.read_csv("arquivos/IDH/GDL-Subnational-HDI-data.csv") # original file from the 'Global Data Lab'
bigdataIdh

Unnamed: 0,Country,Continent,ISO_Code,Level,GDLCODE,Region,1990,1991,1992,1993,...,2012,2013,2014,2015,2016,2017,2018,2019,2020,2021
0,Afghanistan,Asia/Pacific,AFG,National,AFGt,Total,0.273,0.279,0.287,0.297,...,0.466,0.474,0.479,0.478,0.481,0.482,0.483,0.488,0.483,0.478
1,Afghanistan,Asia/Pacific,AFG,Subnat,AFGr101,Central (Kabul Wardak Kapisa Logar Parwan Panj...,0.332,0.339,0.349,0.361,...,0.548,0.552,0.553,0.548,0.551,0.553,0.555,0.561,0.556,0.550
2,Afghanistan,Asia/Pacific,AFG,Subnat,AFGr102,Central Highlands (Bamyan Daikundi),0.281,0.288,0.297,0.308,...,0.480,0.483,0.483,0.477,0.479,0.479,0.480,0.484,0.479,0.472
3,Afghanistan,Asia/Pacific,AFG,Subnat,AFGr103,East (Nangarhar Kunar Laghman Nooristan),0.287,0.293,0.301,0.311,...,0.468,0.469,0.466,0.459,0.461,0.463,0.464,0.469,0.464,0.459
4,Afghanistan,Asia/Pacific,AFG,Subnat,AFGr104,North (Samangan Sar-e-Pul Balkh Jawzjan Faryab),0.259,0.265,0.274,0.284,...,0.466,0.480,0.492,0.497,0.500,0.501,0.502,0.507,0.502,0.497
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1967,Zimbabwe,Africa,ZWE,Subnat,ZWEr104,Mashonaland West,0.458,0.461,0.448,0.441,...,0.532,0.545,0.556,0.565,0.566,0.569,0.572,0.567,0.566,0.559
1968,Zimbabwe,Africa,ZWE,Subnat,ZWEr108,Masvingo,0.480,0.483,0.469,0.462,...,0.535,0.552,0.568,0.580,0.585,0.591,0.598,0.596,0.595,0.588
1969,Zimbabwe,Africa,ZWE,Subnat,ZWEr105,Matebeleland North,0.458,0.461,0.448,0.441,...,0.530,0.538,0.545,0.548,0.549,0.551,0.554,0.548,0.547,0.541
1970,Zimbabwe,Africa,ZWE,Subnat,ZWEr106,Matebeleland South,0.471,0.474,0.460,0.454,...,0.544,0.550,0.555,0.557,0.566,0.577,0.589,0.592,0.591,0.585


In [5]:
# this is the file we used to look further into the Brazilian HDI,
# since it contains information for some Brazilian states
brdataIdh = pd.read_csv("arquivos/IDH/2017-2021_BRASIL-GDL-Subnational-HDI-data.csv")
brdataIdh

Unnamed: 0,Region,2017,2018,2019,2020,2021
0,Total,0.759,0.764,0.766,0.758,0.754
1,Acre,0.713,0.718,0.72,0.712,0.709
2,Alagoas,0.717,0.721,0.723,0.715,0.712
3,Amapa,0.746,0.751,0.753,0.745,0.741
4,Amazonas,0.735,0.74,0.742,0.734,0.73
5,Bahia,0.733,0.738,0.74,0.732,0.728
6,Ceara,0.731,0.736,0.738,0.73,0.726
7,Distrito Federal,0.821,0.826,0.829,0.82,0.816
8,Espirito Santo,0.761,0.767,0.769,0.761,0.757
9,Goias,0.766,0.771,0.773,0.765,0.761
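The extraction of the Brazilian rows from the global file is not shown here. One way it could be done is sketched below, assuming the country is labelled `"Brazil"` in the `Country` column (not confirmed by the output above) and that the year columns are read as strings. The stand-in frame reuses values from the tables above and omits most columns.

```python
import pandas as pd

years = ["2017", "2018", "2019", "2020", "2021"]

# Stand-in for GDL-Subnational-HDI-data.csv: one row per region, one column per year.
gdl = pd.DataFrame(
    [
        ["Afghanistan", "Total", 0.482, 0.483, 0.488, 0.483, 0.478],
        ["Brazil", "Total", 0.759, 0.764, 0.766, 0.758, 0.754],
        ["Brazil", "Acre", 0.713, 0.718, 0.720, 0.712, 0.709],
    ],
    columns=["Country", "Region"] + years,
)

# Select the Brazilian rows and keep only the region name and the years of interest.
brazil = gdl.loc[gdl["Country"] == "Brazil", ["Region"] + years].reset_index(drop=True)
print(brazil)
```

Selecting rows and columns in a single `.loc` call avoids an intermediate copy of the wide table.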


## IPCA

The IPCA data is on the IBGE website. It came as Excel spreadsheets; although it was not strictly necessary, we converted the files to '.csv' to make them faster to read.

The data is split by year. Some files had the wrong type (dtype) in some columns, and null values were stored as '-' in the files; we replaced them with 0.
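The cleaning code is not part of this notebook. A minimal sketch of the '-' replacement and dtype fix, under the assumption that the cells shown as 0.0 in the treated file were '-' in the raw one (the column name `item` is hypothetical, and only two city columns are reproduced):

```python
import pandas as pd

# Stand-in for one IPCA sheet after loading: numbers arrive as text
# because they are mixed with the '-' placeholder.
raw = pd.DataFrame({
    "item": ["ARROZ", "FEIJÃO-MULATINHO", "FEIJÃO-PRETO"],
    "RJ": ["-1.29", "-", "-0.79"],
    "REC": ["-0.1", "-3.58", "-"],
})

city_cols = ["RJ", "REC"]
clean = raw.copy()
for col in city_cols:
    # Series.replace matches whole values, so negative numbers such as
    # '-1.29' are left alone and only the bare '-' becomes '0'.
    clean[col] = pd.to_numeric(clean[col].replace("-", "0")).astype("float64")

print(clean.dtypes)
```

Replacing the placeholder before calling `pd.to_numeric` means any other unexpected text still raises an error instead of being silently turned into 0.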

In [6]:
# below is an example of one of the original files, since showing all of them would be impractical
dadosIpca = pd.read_excel("arquivos/IPCA/ipca_201801Subitem.xls")
dadosIpca.head(20)
# dadosIpca.dtypes # checking that all the types are 'object'

Unnamed: 0,SISTEMA NACIONAL DE ÍNDICES DE PREÇOS AO CONSUMIDOR,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10,Unnamed: 11,Unnamed: 12,Unnamed: 13,Unnamed: 14
0,"VARIAÇÕES MENSAIS POR GRUPOS, ITENS E SUBITENS",,,,,,,,,,,,,,
1,IPCA - JANEIRO DE 2018,,,,,,,,,,,,,,
2,,,,,,,,,,,,,,,
3,,RJ,POA,BH,REC,SP,DF,BEL,FOR,SAL,CUR,GOI,VIT,CG,NACIONAL
4,,,,,,,,,,,,,,,
5,ÍNDICE GERAL,0.42,0.68,0.36,0.03,0.21,-0.15,0.08,0.34,0.35,0.26,0.05,0.7,0.1,0.29
6,ALIMENTAÇÃO E BEBIDAS,1.1,0.98,0.63,0.54,0.33,0.58,0.86,0.87,1.08,1.11,0.69,1.14,0.96,0.74
7,ALIMENTAÇÃO NO DOMICÍLIO,1.61,1.54,0.91,1.22,0.9,1.25,0.78,1.04,1.48,0.89,0.96,1.37,0.99,1.12
8,"CEREAIS, LEGUMINOSAS E OLEAGINOSAS",-1.09,2.28,-2.47,-0.94,-0.74,-2.43,-0.61,-2.4,-1.81,-0.06,-1.05,-0.98,1.22,-0.99
9,ARROZ,-1.29,2.07,-1.06,-0.1,0.38,-3.43,0.42,-1.25,-0.43,0.57,-1.47,-0.01,1.53,-0.23


In [7]:
# below is the same file shown above, but already processed
trat_dadosIpca = pd.read_csv("arquivos/IPCA/ipca_2018/ipca_201801Subitem.csv")
trat_dadosIpca.head(20)
# trat_dadosIpca.dtypes # whereas here the types are 'float64'

Unnamed: 0,"VARIAÇÕES MENSAIS POR GRUPOS, ITENS E SUBITENS - IPCA - JANEIRO DE 2018",RJ,POA,BH,REC,SP,DF,BEL,FOR,SAL,CUR,GOI,VIT,CG,NACIONAL
0,ÍNDICE GERAL,0.42,0.68,0.36,0.03,0.21,-0.15,0.08,0.34,0.35,0.26,0.05,0.7,0.1,0.29
1,ALIMENTAÇÃO E BEBIDAS,1.1,0.98,0.63,0.54,0.33,0.58,0.86,0.87,1.08,1.11,0.69,1.14,0.96,0.74
2,ALIMENTAÇÃO NO DOMICÍLIO,1.61,1.54,0.91,1.22,0.9,1.25,0.78,1.04,1.48,0.89,0.96,1.37,0.99,1.12
3,"CEREAIS, LEGUMINOSAS E OLEAGINOSAS",-1.09,2.28,-2.47,-0.94,-0.74,-2.43,-0.61,-2.4,-1.81,-0.06,-1.05,-0.98,1.22,-0.99
4,ARROZ,-1.29,2.07,-1.06,-0.1,0.38,-3.43,0.42,-1.25,-0.43,0.57,-1.47,-0.01,1.53,-0.23
5,FEIJÃO-MULATINHO,0.0,0.0,0.0,-3.58,0.0,0.0,0.0,-1.49,0.6,0.0,0.0,0.0,0.0,-1.92
6,FEIJÃO-PRETO,-0.79,2.89,0.0,0.0,0.0,0.0,-0.62,0.0,0.0,-2.96,0.0,-1.35,-2.14,-0.28
7,FEIJÃO-MACASSAR (FRADINHO),0.0,0.0,0.0,2.1,0.0,0.0,0.0,-5.68,-5.05,0.0,0.0,0.0,0.0,-3.94
8,FEIJÃO-CARIOCA (RAJADO),0.0,0.0,-6.61,-2.13,-3.64,-0.35,-3.68,0.71,-3.72,-0.22,0.16,-6.13,0.79,-3.32
9,"FARINHAS, FÉCULAS E MASSAS",0.16,-0.15,1.0,-1.89,-1.28,1.82,-0.34,-1.59,-0.06,2.22,-0.32,0.23,-0.28,-0.29


## Stocks

## Fuel Prices

The files with fuel price information were taken from the government website; the data comes from ANP (Agência Nacional do Petróleo, Gás Natural e Biocombustíveis).

In [11]:
# the file came in Excel with some formatting that was not useful to us,
# so we had to edit it to make the information easier to understand
# it was also converted to csv, because the xlsx took a long time to load
df = pd.read_excel("arquivos/Preco_Combustivel/mensal-municipios-2016-a-2018.xlsx") # 'raw' file
df2 = pd.read_csv("arquivos/Preco_Combustivel/editado-mensal-municipios-2016-a-2018.csv") # edited file
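How the raw sheet was edited is not recorded in this notebook. One possible approach is sketched below: drop the banner rows and promote the real header row. The stand-in frame mimics the raw layout with only three columns, and the position of the header row is an assumption.

```python
import pandas as pd

# Stand-in for the raw sheet as loaded by read_excel: banner rows first,
# then the real header row, then the data rows.
raw = pd.DataFrame([
    ["SUPERINTENDÊNCIA DE DEFESA DA CONCORRÊNCIA", None, None],
    ["SISTEMA DE LEVANTAMENTO DE PREÇOS", None, None],
    [None, None, None],
    ["MÊS", "PRODUTO", "PREÇO MÉDIO REVENDA"],
    ["2018-12-01 00:00:00", "OLEO DIESEL S10", 3.899],
    ["2018-12-01 00:00:00", "OLEO DIESEL S10", 3.41],
])

# Locate the header row by its first cell instead of hard-coding a row count.
header_idx = raw.index[raw[0] == "MÊS"][0]

# Everything after the header row is data; the header row supplies the column names.
edited = raw.iloc[header_idx + 1:].copy()
edited.columns = raw.iloc[header_idx].tolist()
edited = edited.reset_index(drop=True)
edited["PREÇO MÉDIO REVENDA"] = edited["PREÇO MÉDIO REVENDA"].astype(float)
print(edited)
```

When the number of banner rows is known in advance, passing `skiprows` to `pd.read_excel` would achieve the same result at load time.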

In [14]:
df # 'raw'

Unnamed: 0,"AGÊNCIA NACIONAL DO PETRÓLEO, GÁS NATURAL E BIOCOMBUSTÍVEIS - ANP",Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7,Unnamed: 8,Unnamed: 9,Unnamed: 10,Unnamed: 11,Unnamed: 12,Unnamed: 13,Unnamed: 14,Unnamed: 15,Unnamed: 16,Unnamed: 17
0,SUPERINTENDÊNCIA DE DEFESA DA CONCORRÊNCIA,,,,,,,,,,,,,,,,,
1,SISTEMA DE LEVANTAMENTO DE PREÇOS,,,,,,,,,,,,,,,,,
2,,,,,,,,,,,,,,,,,,
3,RELATÓRIO DE DEFESA DA CONCORRÊNCIA,,,,,,,,,,,,,,,,,
4,,,,,,,,,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
90497,2018-12-01 00:00:00,OLEO DIESEL S10,SUDESTE,RIO DE JANEIRO,VOLTA REDONDA,14,R$/l,3.899,0.175,3.699,4.299,0.809,0.045,3.09,0,3.09,3.09,0
90498,2018-12-01 00:00:00,OLEO DIESEL S10,SUDESTE,SAO PAULO,VOTORANTIM,36,R$/l,3.41,0.142,3.199,3.699,0.504,0.042,2.907,0.098,2.819,3.07,0.034
90499,2018-12-01 00:00:00,OLEO DIESEL S10,SUDESTE,SAO PAULO,VOTUPORANGA,42,R$/l,3.633,0.148,3.199,3.89,0.453,0.041,3.18,0.131,3.044,3.374,0.041
90500,2018-12-01 00:00:00,OLEO DIESEL S10,SUL,SANTA CATARINA,XANXERE,16,R$/l,3.438,0.198,3.169,3.77,0.411,0.058,3.027,0.047,2.97,3.072,0.015


In [15]:
df2 # edited

Unnamed: 0,MÊS,PRODUTO,REGIÃO,ESTADO,MUNICÍPIO,NÚMERO DE POSTOS PESQUISADOS,UNIDADE DE MEDIDA,PREÇO MÉDIO REVENDA,DESVIO PADRÃO REVENDA,PREÇO MÍNIMO REVENDA,PREÇO MÁXIMO REVENDA,MARGEM MÉDIA REVENDA,COEF DE VARIAÇÃO REVENDA,PREÇO MÉDIO DISTRIBUIÇÃO,DESVIO PADRÃO DISTRIBUIÇÃO,PREÇO MÍNIMO DISTRIBUIÇÃO,PREÇO MÁXIMO DISTRIBUIÇÃO,COEF DE VARIAÇÃO DISTRIBUIÇÃO
0,01-01-2016,ETANOL HIDRATADO,NORTE,PARA,ABAETETUBA,4,R$/l,3.430,0.000,3.430,3.430,0.524,0.000,2.906,0.098,2.822,2.991,0.034
1,01-01-2016,ETANOL HIDRATADO,NORDESTE,PERNAMBUCO,ABREU E LIMA,28,R$/l,2.800,0.097,2.690,2.999,0.326,0.035,2.474,0.005,2.470,2.480,0.002
2,01-01-2016,ETANOL HIDRATADO,NORDESTE,MARANHAO,ACAILANDIA,16,R$/l,3.215,0.192,2.829,3.350,0.352,0.060,2.863,0.000,2.863,2.863,0.000
3,01-01-2016,ETANOL HIDRATADO,SUDESTE,SAO PAULO,ADAMANTINA,32,R$/l,2.548,0.078,2.470,2.690,0.364,0.031,2.184,0.096,2.080,2.324,0.044
4,01-01-2016,ETANOL HIDRATADO,CENTRO OESTE,GOIAS,AGUAS LINDAS DE GOIAS,20,R$/l,3.096,0.099,2.970,3.190,0.000,0.032,0.000,0.000,0.000,0.000,0.000
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
90481,12-01-2018,OLEO DIESEL S10,SUDESTE,RIO DE JANEIRO,VOLTA REDONDA,14,R$/l,3.899,0.175,3.699,4.299,0.809,0.045,3.090,0.000,3.090,3.090,0.000
90482,12-01-2018,OLEO DIESEL S10,SUDESTE,SAO PAULO,VOTORANTIM,36,R$/l,3.410,0.142,3.199,3.699,0.504,0.042,2.907,0.098,2.819,3.070,0.034
90483,12-01-2018,OLEO DIESEL S10,SUDESTE,SAO PAULO,VOTUPORANGA,42,R$/l,3.633,0.148,3.199,3.890,0.453,0.041,3.180,0.131,3.044,3.374,0.041
90484,12-01-2018,OLEO DIESEL S10,SUL,SANTA CATARINA,XANXERE,16,R$/l,3.438,0.198,3.169,3.770,0.411,0.058,3.027,0.047,2.970,3.072,0.015
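In the edited file the `MÊS` column holds month-day-year strings (the raw file shows `2018-12-01` for the rows that appear here as `12-01-2018`). A hedged sketch of parsing that column and computing a yearly mean price per product, using a few rows copied from the table above; the `ANO` column and the unweighted mean are illustrative choices, not necessarily what the analysis used:

```python
import pandas as pd

# A few rows copied from the edited table above (only three columns reproduced).
df2_sample = pd.DataFrame({
    "MÊS": ["01-01-2016", "01-01-2016", "12-01-2018", "12-01-2018"],
    "PRODUTO": ["ETANOL HIDRATADO", "ETANOL HIDRATADO",
                "OLEO DIESEL S10", "OLEO DIESEL S10"],
    "PREÇO MÉDIO REVENDA": [3.430, 2.800, 3.899, 3.410],
})

# An explicit format avoids day/month ambiguity when parsing.
df2_sample["MÊS"] = pd.to_datetime(df2_sample["MÊS"], format="%m-%d-%Y")

# Hypothetical helper column holding the calendar year.
df2_sample["ANO"] = df2_sample["MÊS"].dt.year

# Unweighted mean of the municipal average prices, per year and product.
yearly = df2_sample.groupby(["ANO", "PRODUTO"])["PREÇO MÉDIO REVENDA"].mean()
print(yearly)
```

A mean weighted by `NÚMERO DE POSTOS PESQUISADOS` might better reflect the surveyed stations, but that is a design choice left open here.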


# Questions

- How did the pandemic affect these data, comparing the years before and after it?
- How did the change of government affect these data, comparing earlier years with the current government?
- How do the pandemic and the change of government correlate?
- Is the price of fuel consistent with Brazilian income levels?
- Did the change in fuel prices make the population use private vehicles, public transport, or ride-hailing apps more?
- Did these changes affect fares for public, private, or app-based transport?
- How did the pandemic affect Brazilian GDP, based on an analysis of the years before and after it?
- How did the pandemic affect Brazilian inflation (IPCA)?
- How did the pandemic affect the Brazilian HDI?
- How are these data correlated, and can they actually show how much the pandemic affected the lives of Brazilians?
- How did these events influence Petrobras shares and the price of fuel?
- And how does this variation in shares and prices affect the lives of Brazilians?

# Conclusion

One improvement that could be made to this analysis is to add more recent data, so that the comparison covers a longer period after the pandemic and the change of government.