# MONAN_POST

This notebook presents an example of converting the SCANTEC statistics tables (tabulated files) to CSV format. The CSV files are used as data sources for a catalog loaded with `intake`.

**Note:** the notebook [`03-intake_catalog.ipynb`](03-intake_catalog.ipynb) shows how to use the `intake` library to access these data remotely through a catalog. See the script `create_catalog.sh` for how the catalog file is created.
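
As a rough illustration of the conversion described above, the sketch below reads a small whitespace-delimited table from an in-memory string and serializes it as CSV. The three-column excerpt is an assumption about the layout of a SCANTEC table (the values are copied from the output shown at the end of this notebook); the real files have more columns.

```python
import io
import pandas as pd

# Hypothetical excerpt of a whitespace-delimited SCANTEC table
# (assumed layout; the real files contain more columns)
raw = """%Previsao vtmp:925 vtmp:850
0 0.276 -0.164
24 0.231 -0.317
48 0.165 -0.447
"""

# sep=r"\s+" treats any run of whitespace as a single delimiter
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# With no path argument, to_csv returns the CSV text instead of writing a file
csv_text = df.to_csv(index=False)
print(csv_text)
```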

In the cells below, note that the `%%time` magic is used to measure the execution time of the cell.

In [1]:
import os
import re
import numpy as np
import pandas as pd

from datetime import datetime, timedelta

In [2]:
# Attribute lists (experiments and statistics) used in the loop that writes the files
# Note: the CSV files are written locally

Exps = ['DTC', 'BAMH', 'BAMH0', 'X666']
Stats = ['VIES', 'RMSE', 'ACOR']

data = '20230216002023030300'

burl = 'https://s0.cptec.inpe.br/pesquisa/das/dist/carlos.bastarz/MONAN/monan_post/data'
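
The values defined above determine which remote files are read. The sketch below enumerates the file URLs they imply and, as an assumption not stated in the notebook, interprets `data` as two `YYYYMMDDHH` timestamps written back to back (start and end of the evaluation period). The names `start`, `end`, and `urls` are introduced here for illustration only.

```python
from datetime import datetime

# Values copied from the cell above
Exps = ['DTC', 'BAMH', 'BAMH0', 'X666']
Stats = ['VIES', 'RMSE', 'ACOR']
data = '20230216002023030300'
burl = 'https://s0.cptec.inpe.br/pesquisa/das/dist/carlos.bastarz/MONAN/monan_post/data'

# Assumption: `data` is <start><end>, each formatted as YYYYMMDDHH
start = datetime.strptime(data[:10], '%Y%m%d%H')
end = datetime.strptime(data[10:], '%Y%m%d%H')

# One remote file per (experiment, statistic) pair
urls = [f"{burl}/{stat}{exp}_{data}T.scan" for exp in Exps for stat in Stats]

print(start, end)
print(len(urls))
print(urls[0])
```

Under that reading of `data`, the period runs from 2023-02-16 00 UTC to 2023-03-03 00 UTC, and the two lists yield 12 files.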

In [3]:
%%time

# Loop over the attribute lists
# Inside the loop, the file names are built (e.g., VIESDTC_20230216002023030300T.scan);
# each file is then read with pandas and stored in the dictionary df_dic, keyed
# by file name
# Afterwards, df_dic is concatenated and the CSV file is written to disk (scantec_df.csv)

df_dic = {}

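for exp in Exps:
    for stat in Stats:
        fname = stat + exp + '_' + data + 'T.scan'
        # Build the URL with '/' explicitly; os.path.join would use '\' on Windows
        pname = burl + '/' + fname
        # The raw string avoids the invalid-escape warning for "\s"
        df_dic[fname] = pd.read_csv(pname, sep=r"\s+")

# Concatenate once, after all files have been read, and write a single CSV
dfc = pd.concat(df_dic, axis=0)
csv_name = 'scantec_df.csv'
dfc.to_csv(csv_name)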

CPU times: user 137 ms, sys: 463 µs, total: 137 ms
Wall time: 1.14 s
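
Passing a dictionary to `pd.concat` is what gives the final table its two-level index: the dictionary keys become the outer level and each table's own row index becomes the inner level. A minimal sketch with two small stand-in tables follows; the shortened file names and the RMSE values are hypothetical.

```python
import pandas as pd

# Two small stand-in tables (hypothetical names and RMSE values)
frames = {
    'VIESDTC.scan': pd.DataFrame({'%Previsao': [0, 24], 'vtmp:925': [0.276, 0.231]}),
    'RMSEDTC.scan': pd.DataFrame({'%Previsao': [0, 24], 'vtmp:925': [1.0, 1.5]}),
}

# Dictionary keys become the outer index level
combined = pd.concat(frames, axis=0)
print(combined)

# Selecting an outer key returns the rows of that one table
sub = combined.loc['RMSEDTC.scan']
print(sub)
```

This is why the CSV is read back with `index_col=[0,1]`: the first two columns of the file hold the two index levels.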


In [4]:
# Check the CSV file that was written

dft = pd.read_csv('scantec_df.csv', index_col=[0,1])

In [5]:
dft

Unnamed: 0,Unnamed: 1,%Previsao,vtmp:925,vtmp:850,vtmp:500,temp:850,temp:500,temp:250,pslc:000,umes:925,umes:850,umes:500,zgeo:850,zgeo:500,zgeo:250,uvel:850,uvel:500,uvel:250,vvel:850,vvel:500,vvel:250
VIESDTC_20230216002023030300T.scan,0,0,0.276,-0.164,-0.071,-0.152,-0.076,-0.164,-0.035,-0.000,-0.000,0.000,4.119,2.143,0.771,-0.147,0.021,0.089,0.108,-0.013,-0.018
VIESDTC_20230216002023030300T.scan,1,24,0.231,-0.317,-0.058,-0.312,-0.068,-0.255,-0.199,0.000,-0.000,0.000,3.069,-0.859,-2.729,-0.151,0.026,0.018,0.097,0.009,-0.085
VIESDTC_20230216002023030300T.scan,2,48,0.165,-0.447,-0.042,-0.448,-0.055,-0.311,-0.364,0.000,0.000,0.000,2.034,-3.114,-4.541,-0.163,0.026,0.037,0.134,-0.046,-0.044
VIESDTC_20230216002023030300T.scan,3,72,0.136,-0.518,-0.045,-0.526,-0.058,-0.433,-0.457,0.000,0.000,0.000,1.406,-4.622,-6.719,-0.125,0.040,0.050,0.159,-0.035,-0.098
VIESDTC_20230216002023030300T.scan,4,96,0.107,-0.567,-0.073,-0.578,-0.088,-0.568,-0.518,0.000,0.000,0.000,0.966,-5.849,-9.173,-0.173,0.005,0.063,0.135,0.001,-0.124
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
ACORX666_20230216002023030300T.scan,7,168,-0.017,0.162,0.526,0.161,0.526,0.538,0.986,0.092,0.129,0.231,0.581,0.564,0.672,0.365,0.571,0.666,0.351,0.431,0.556
ACORX666_20230216002023030300T.scan,8,192,-0.055,0.153,0.438,0.152,0.438,0.412,0.986,0.067,0.157,0.247,0.552,0.475,0.582,0.422,0.473,0.598,0.437,0.311,0.477
ACORX666_20230216002023030300T.scan,9,216,-0.043,0.259,0.373,0.258,0.373,0.507,0.983,0.152,0.241,0.185,0.540,0.494,0.542,0.435,0.410,0.508,0.377,0.377,0.448
ACORX666_20230216002023030300T.scan,10,240,-0.109,0.057,0.244,0.056,0.244,0.239,0.980,0.036,0.090,0.034,0.537,0.486,0.550,0.347,0.354,0.475,0.348,0.354,0.413
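
The outer index level in the table above is the source file name, which encodes both the statistic and the experiment. The sketch below splits such a name with a regular expression. The pattern assumes names of the form `<STAT><EXP>_<20 digits>T.scan`, with `<STAT>` restricted to the three statistics used in this notebook; `split_name` is a hypothetical helper, not part of SCANTEC or of this repository.

```python
import re

# Assumed naming scheme: <STAT><EXP>_<20 digits>T.scan
PATTERN = re.compile(r'^(VIES|RMSE|ACOR)(.+)_(\d{20})T\.scan$')

def split_name(fname):
    """Return (statistic, experiment, period) parsed from a file name."""
    match = PATTERN.match(fname)
    if match is None:
        raise ValueError(f"unexpected file name: {fname}")
    return match.groups()

print(split_name('VIESDTC_20230216002023030300T.scan'))
print(split_name('ACORX666_20230216002023030300T.scan'))
```

With the DataFrame loaded in the cell above, `dft.loc['VIESDTC_20230216002023030300T.scan']` would select the rows that came from that one file.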
