<a href="https://colab.research.google.com/github/vitorsr/ccd/blob/master/baseline.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Analysis of Fire Data in the Amazon

In [1]:
!wget -q -O ccd_2019.zip https://www.dropbox.com/s/7rriacb7c6vzf3m/ccd_2019.zip

!unzip ccd_2019.zip

Archive:  ccd_2019.zip
replace bdmep_meta.csv? [y]es, [n]o, [A]ll, [N]one, [r]ename: N


## Libraries Used

In [0]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

The goal here is to analyze how fires in the Amazon rainforest relate to variations in local temperature. To do this, we use two datasets:

*   The first contains temperature measurements from several stations across the country;
*   The second contains the record of fires in Brazil.



## Reading the Data

| variable | description | unit |
|---|---|---|
| date | date and time of collection | - |
| id | collection station ID | - |
| prec | precipitation | mm |
| tair | air temperature | degrees Celsius |
| tw | wet-bulb temperature | degrees Celsius |
| tmax | maximum air temperature | degrees Celsius |
| tmin | minimum air temperature | degrees Celsius |
| urmax | maximum relative humidity | % |
| patm | atmospheric pressure | hPa |
| pnmm | mean atmospheric pressure at sea level | hPa |
| wd | wind direction | degrees |
| wsmax | wind gusts | m/s |
| n | hours of sunshine | h |
| cc | cloud cover | - |
| evap | evaporation | mm |
| ur | relative humidity | % |
| ws | wind speed | m/s |

In [3]:
df = pd.read_csv("inmetr.csv")
df.head()

Unnamed: 0,date,id,prec,tair,tw,tmax,tmin,urmax,patm,pnmm,wd,wsmax,n,cc,evap,ur,ws
0,1970-05-04 00:00:00,83010,,,,32.6,,,,,,,,,,89.75,
1,1970-05-04 12:00:00,83010,,25.3,24.0,,23.7,90.0,1005.9,,,,,5.0,,,
2,1970-05-04 18:00:00,83010,,29.2,27.1,,,85.0,1004.2,,,,,4.0,,,
3,1970-05-05 00:00:00,83010,,25.0,27.0,32.6,,92.0,1007.5,,,,,8.0,2.4,88.25,
4,1970-05-05 12:00:00,83010,0.0,25.9,24.8,,23.0,91.0,1006.4,,,,,4.0,,,


In [4]:
df_m = pd.read_csv("bdmep_meta.csv")
df_m.head()

Unnamed: 0,id,lon,lat,alt,name,state,uf,time_zone,offset_utc,time_zone.1,offset_utc.1
0,83010,-68.733333,-11.016667,260.0,Brasiléia,Acre,AC,America/Rio_Branco,-5,America/Rio_Branco,-5
1,82704,-72.666667,-7.633333,170.0,Cruzeiro do Sul,Acre,AC,America/Rio_Branco,-5,America/Rio_Branco,-5
2,82915,-67.8,-9.966667,160.0,Rio Branco,Acre,AC,America/Rio_Branco,-5,America/Rio_Branco,-5
3,82807,-70.766667,-8.166667,190.0,Tarauacá,Acre,AC,America/Rio_Branco,-5,America/Rio_Branco,-5
4,83098,-36.166667,-10.15,56.13,Coruripe,Alagoas,AL,America/Maceio,-3,America/Maceio,-3


The first file contains the main meteorological measurements, while the second serves only as a reference for the location of the measuring stations.
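
Since both files share the `id` column, station locations can be attached to the measurements with a join. The following is a minimal sketch on toy data copied from the previews above, not a step the notebook itself performs; the names `measurements`, `stations`, and `merged` are illustrative.

```python
import pandas as pd

# Toy stand-ins for inmetr.csv and bdmep_meta.csv (rows taken from the previews).
measurements = pd.DataFrame({
    "id": [83010, 83010],
    "tair": [25.3, 29.2],
})
stations = pd.DataFrame({
    "id": [83010, 82704],
    "name": ["Brasiléia", "Cruzeiro do Sul"],
    "uf": ["AC", "AC"],
    "lat": [-11.016667, -7.633333],
    "lon": [-68.733333, -72.666667],
})

# Left join keeps every measurement; validate raises an error
# if a station id appears more than once in the metadata.
merged = measurements.merge(stations, on="id", how="left", validate="many_to_one")
print(merged)
```

A left join is used here so that measurements from a station missing in the metadata would be kept (with empty location fields) rather than silently dropped.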

The data cover the period from 1970 to 2018.

Filtering for the month of August only:

In [0]:
mask = (df['date'].str.split('-',n=2,expand = True)[1]=='08')
df = df.loc[mask]
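
The same filter can also be expressed with parsed timestamps instead of string splitting. This is a sketch of an alternative, assuming the `date` strings follow the `YYYY-MM-DD HH:MM:SS` layout shown in the preview; the `sample` frame and its rows are illustrative, not taken from the full dataset.

```python
import pandas as pd

# Illustrative rows in the same layout as the `date` column.
sample = pd.DataFrame({
    "date": ["1970-05-04 12:00:00", "1970-08-01 00:00:00", "1970-08-01 12:00:00"],
    "tair": [25.3, 23.5, 20.2],
})

# Parse the strings once, then filter on the month component.
sample["date"] = pd.to_datetime(sample["date"])
august = sample.loc[sample["date"].dt.month == 8].copy()
print(august)
```

Calling `.copy()` after the filter gives an independent frame, so that assigning new columns to it later does not trigger pandas' `SettingWithCopyWarning`.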

Removing the year from the `date` column, leaving month, day, and time:

In [6]:
df['date'] = df['date'].str.split('-',n=1,expand = True)[1]
df.head()

Unnamed: 0,date,id,prec,tair,tw,tmax,tmin,urmax,patm,pnmm,wd,wsmax,n,cc,evap,ur,ws
267,08-01 00:00:00,83010,,23.5,24.0,33.2,,,1006.6,,,,,8.0,2.2,84.75,
268,08-01 12:00:00,83010,0.0,20.2,20.0,,15.8,98.0,1010.4,,,,,6.0,,,
269,08-01 18:00:00,83010,,31.5,24.9,,,59.0,1009.2,,,,,5.0,,,
270,08-02 00:00:00,83010,,23.3,24.2,33.1,,91.0,1007.2,,,,,8.0,2.7,83.5,
271,08-02 12:00:00,83010,0.0,20.0,19.9,,15.0,99.0,1009.2,,,,,5.0,,,


Removing the time of measurement from the `date` column, leaving only month and day:

In [7]:
df['date'] = df['date'].str.split(' ',n=1,expand = True)[0]
df.head()

Unnamed: 0,date,id,prec,tair,tw,tmax,tmin,urmax,patm,pnmm,wd,wsmax,n,cc,evap,ur,ws
267,08-01,83010,,23.5,24.0,33.2,,,1006.6,,,,,8.0,2.2,84.75,
268,08-01,83010,0.0,20.2,20.0,,15.8,98.0,1010.4,,,,,6.0,,,
269,08-01,83010,,31.5,24.9,,,59.0,1009.2,,,,,5.0,,,
270,08-02,83010,,23.3,24.2,33.1,,91.0,1007.2,,,,,8.0,2.7,83.5,
271,08-02,83010,0.0,20.0,19.9,,15.0,99.0,1009.2,,,,,5.0,,,
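
The two string splits above reduce `date` to a month-day key. A possible one-step equivalent, assuming the column has been parsed with `pd.to_datetime`, is sketched below; the column names `month_day` and `year` are hypothetical, and the rows are illustrative.

```python
import pandas as pd

# Illustrative August timestamps for a single station.
august = pd.DataFrame({
    "date": pd.to_datetime([
        "1970-08-01 00:00:00",
        "1971-08-01 12:00:00",
        "1970-08-02 00:00:00",
    ]),
    "id": [83010, 83010, 83010],
})

# Build the month-day key in one step and keep the year in its own column.
august["month_day"] = august["date"].dt.strftime("%m-%d")
august["year"] = august["date"].dt.year
print(august)
```

Unlike overwriting `date` in place, this keeps the original timestamp and the year available for any later per-year comparison.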


In [9]:
df = df.groupby(by=['date','id']).mean()
df.head(1000)

date,id,prec,tair,tw,tmax,tmin,urmax,patm,pnmm,wd,wsmax,n,cc,evap,ur,ws
08-01,82024,6.219512,27.235246,24.882883,32.069231,23.400000,82.653465,1001.716216,1012.100000,10.176471,1.075546,5.175000,5.283333,2.597143,84.051471,1.025690
08-01,82042,4.133333,26.858163,24.436145,32.003448,22.071875,83.881579,,,3.578947,0.445155,5.860000,5.754639,2.345455,83.840000,0.467799
08-01,82093,5.218182,25.945946,24.086486,31.000000,20.669231,85.419355,,,6.225806,0.675676,6.153846,0.683784,2.023077,88.363636,0.641026
08-01,82095,5.961905,26.821538,24.107692,31.630000,21.500000,80.377358,,,6.867925,1.218750,6.473684,1.259574,2.780000,82.944444,1.318182
08-01,82098,3.716327,28.204027,25.416779,31.976000,23.632000,79.320611,1010.518571,1012.216667,5.694656,1.385906,8.237500,4.687162,2.765306,81.761364,1.290667
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
08-03,83536,0.000000,22.933333,17.099333,29.255814,12.516327,56.486667,,,13.560000,1.500000,6.994118,3.213333,5.573529,58.590000,1.500000
08-03,83538,0.226667,17.738710,13.782203,22.569231,11.859524,67.421488,875.545690,1017.018421,15.935484,1.943463,8.028571,3.858871,4.297674,68.807692,1.912523
08-03,83539,0.000000,21.056250,15.862500,27.366667,11.350000,61.659574,,,9.479167,2.625000,,1.895833,5.185714,64.666667,2.500000
08-03,83543,0.282051,23.315044,19.342453,27.971053,16.582051,69.991071,997.501770,1019.783333,9.258929,1.185466,5.659459,4.703704,3.438889,70.175676,1.313808
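
Because the year was removed from `date`, the grouped mean above averages each station's readings for a given calendar day across all years and all measurement times. The sketch below reproduces that behavior on made-up values; `toy`, `daily_mean`, and `flat` are illustrative names.

```python
import pandas as pd

# Made-up readings: two for station 83010 on 08-01, one of them missing tmax.
toy = pd.DataFrame({
    "date": ["08-01", "08-01", "08-01", "08-02"],
    "id": [83010, 83010, 82024, 83010],
    "tair": [23.5, 20.5, 27.0, 23.0],
    "tmax": [33.0, None, 32.0, 33.5],
})

# Mean per (calendar day, station); missing values are skipped, not counted as zero.
daily_mean = toy.groupby(["date", "id"]).mean(numeric_only=True)

# reset_index turns the (date, id) MultiIndex back into ordinary columns.
flat = daily_mean.reset_index()
print(flat)
```

Flattening the index with `reset_index()` can make the result easier to join against other tables keyed by `id`.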
