# PETCA - Energy Bill Analysis Project with Machine Learning and Neural Networks

## Table of Contents
- [Models Used](#modelos-utilizados)
- [Importing Packages and Libraries](#importando-os-pacotes-e-bibliotecas)
- [Importing the Datasets](#importando-os-datasets)
- [Initial Analysis of the Datasets](#análise-inicial-dos-datasets)
- [Exploratory Data Analysis](#aed)
- [Creating the Models](#criando-os-modelos)
- [Training the Models](#treinando-os-modelos)
- [Model Results](#resultados-dos-modelos)
    - [Running the Tests](#testes)
    - [Model Quality](#qualidade-dos-testes-e-resultados)
- [Discussion](#discussão)

## Models Used
- Decision Tree
- Ensemble
- Random Forest
- Convolutional Neural Networks
- Linear Regression
- Polynomial Regression
- Support Vector Machine (SVM)

## Importing Packages and Libraries

In [112]:
# train/test split
from sklearn.model_selection import train_test_split

# classification models
## Random Forest;
## Decision Tree; and
## Support Vector Machine (SVM).
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

# polynomial feature expansion (used for polynomial regression)
from sklearn.preprocessing import PolynomialFeatures

# regression models
## Linear Regression; and
## Support Vector Regression (SVR).
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVR

# support libraries -----
## Plotting
from matplotlib import pyplot as plt
import seaborn as sns

## Base libraries
import pandas as pd
import numpy as np
# ----------------------------

# TensorFlow libraries and packages
## Convolutional Neural Networks
import tensorflow as tf
from keras import layers, models
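
As a rough illustration of how some of these imports fit together, the sketch below combines `train_test_split`, `PolynomialFeatures` and `LinearRegression` into a polynomial regression. The synthetic data, the degree and the split ratio are assumptions chosen for the example; they are not taken from the notebook or from the EPE dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

# Synthetic, noise-free data (an assumption for illustration): y = 2x^2 + 3x + 1
x = np.arange(40, dtype=float).reshape(-1, 1)
y = 2 * x.ravel() ** 2 + 3 * x.ravel() + 1

# Hold out 25% of the samples for testing
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

# Expand x into [x, x^2] and fit an ordinary linear regression on top
poly = PolynomialFeatures(degree=2, include_bias=False)
model = LinearRegression().fit(poly.fit_transform(x_train), y_train)

# Coefficient of determination on the held-out samples
r2 = model.score(poly.transform(x_test), y_test)
```

Polynomial regression here is just linear regression on expanded features, which is why only `PolynomialFeatures` needs to be imported for it.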

## Importing the Datasets

In [113]:
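# Note: this file holds commercial consumption by UF, although the DataFrame is named "residencial".
df_residencial_raw = pd.read_csv("./datasets_directory/raw/CONSUMO MENSAL DE ENERGIA ELÉTRICA POR CLASSE - CONSUMO COMERCIAL POR UF.csv", sep = ",", index_col = 0)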

## Initial Analysis of the Datasets

### Residential Consumption by UF

In [114]:
df_residencial_raw.sample(10)

| Empresa de Pesquisa Energética - EPE | Unnamed: 1 | Unnamed: 2 | Unnamed: 3 | Unnamed: 4 | Unnamed: 5 | Unnamed: 6 | Unnamed: 7 | Unnamed: 8 | Unnamed: 9 | Unnamed: 10 | ... | Unnamed: 243 | Unnamed: 244 | Unnamed: 245 | Unnamed: 246 | Unnamed: 247 | Unnamed: 248 | Unnamed: 249 | Unnamed: 250 | Unnamed: 251 | Unnamed: 252 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| São Paulo | 1.464.892 | 1.421.228 | 1.416.476 | 1.579.356 | 1.386.690 | 1.280.007 | 1.348.602 | 1.324.733 | 1.430.716 | 1.455.419 | ... | 2.892.123 | 2.895.258 |  |  |  |  |  |  |  |  |
| Nota: atualização defasada para não antecipar informações de distribuidoras que devem obedecer às intruções da CVM sobre publicação de resultados. |  |  |  |  |  |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| Ceará | 95.718 | 86.036 | 90.007 | 97.130 | 90.758 | 93.769 | 87.315 | 92.398 | 97.622 | 93.272 | ... | 198.606 | 210.942 |  |  |  |  |  |  |  |  |
| Mato Grosso do Sul | 51.890 | 52.518 | 51.766 | 55.175 | 49.652 | 44.157 | 42.928 | 49.557 | 51.954 | 53.788 | ... | 112.134 | 109.172 |  |  |  |  |  |  |  |  |
| Alagoas | 31.461 | 30.119 | 30.572 | 32.021 | 30.512 | 31.649 | 27.614 | 27.011 | 29.066 | 29.911 | ... | 89.432 | 86.378 |  |  |  |  |  |  |  |  |
| Amazonas | 49.832 | 50.457 | 47.374 | 49.875 | 50.280 | 50.734 | 53.678 | 52.589 | 55.148 | 55.374 | ... | 138.722 | 137.675 |  |  |  |  |  |  |  |  |
| Amapá | 12.164 | 3.894 | 8.639 | 8.461 | 7.704 | 8.309 | 8.733 | 7.858 | 8.850 | 9.621 | ... | 25.339 | 26.448 |  |  |  |  |  |  |  |  |
| Piauí | 25.398 | 20.769 | 20.862 | 24.450 | 22.877 | 24.519 | 22.937 | 22.760 | 25.997 | 25.217 | ... | 76.114 | 75.892 |  |  |  |  |  |  |  |  |
| Acre | 7.895 | 7.329 | 7.420 | 7.345 | 7.142 | 6.955 | 7.205 | 7.580 | 8.309 | 7.910 | ... | 23.381 | 24.940 |  |  |  |  |  |  |  |  |
| Santa Catarina | 170.067 | 178.963 | 186.488 | 170.145 | 161.961 | 142.886 | 146.430 | 147.906 | 148.465 | 153.485 | ... | 546.541 | 543.175 |  |  |  |  |  |  |  |  |
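
The sampled values (for example `1.464.892` for São Paulo) appear to use `.` as a thousands separator, in which case pandas would keep them as text rather than numbers. A minimal sketch of one way to convert such strings follows. The helper name `parse_br_thousands` and the small sample Series are assumptions for illustration, and the sketch assumes the values are still strings when it runs.

```python
import pandas as pd


def parse_br_thousands(series: pd.Series) -> pd.Series:
    """Convert strings such as '1.464.892' into numbers (1464892).

    Assumes '.' is only ever a thousands separator in the input.
    Values that cannot be parsed (including missing ones) become NaN.
    """
    cleaned = series.astype(str).str.replace(".", "", regex=False)
    return pd.to_numeric(cleaned, errors="coerce")


# Two values copied from the sample above, plus one missing entry
sample = pd.Series(["1.464.892", "95.718", None], name="example")
parsed = parse_br_thousands(sample)
```

If the separator convention holds for the whole file, the `thousands` and `decimal` parameters of `pd.read_csv` may be an alternative worth checking.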


#### Number of null values

In [115]:
df_residencial_transposto = df_residencial_raw.transpose()
df_residencial_transposto.sample(10)

| Empresa de Pesquisa Energética - EPE | Consumo de energia elétrica na rede (MWh) | Sistema SIMPLES | NaN | NaN.1 | NaN.2 | TOTAL POR UF | Rondônia | Acre | Amazonas | Roraima | ... | Rio de Janeiro | São Paulo | Paraná | Santa Catarina | Rio Grande do Sul | Mato Grosso do Sul | Mato Grosso | Goiás | Distrito Federal | Nota: atualização defasada para não antecipar informações de distribuidoras que devem obedecer às intruções da CVM sobre publicação de resultados. |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unnamed: 136 |  |  |  |  | ABR | 7.906.721 | 53.46 | 18.96 | 109.084 | 16.211 | ... | 970.97 | 2.598.202 | 530.812 | 365.028 | 483.143 | 107.193 | 144.484 | 205.112 | 171.004 |  |
| Unnamed: 34 |  |  |  |  | OUT | 4.581.791 | 31.501 | 9.672 | 60.813 | 7.223 | ... | 602.321 | 1.573.310 | 283.187 | 172.002 | 267.765 | 52.846 | 77.126 | 111.893 | 103.907 |  |
| Unnamed: 31 |  |  |  |  | JUL | 4.230.515 | 30.472 | 9.341 | 59.619 | 6.062 | ... | 552.084 | 1.443.154 | 264.781 | 168.342 | 264.047 | 48.172 | 71.54 | 96.527 | 93.021 |  |
| Unnamed: 48 |  |  |  |  | DEZ | 5.161.662 | 31.473 | 10.03 | 66.297 | 8.529 | ... | 659.305 | 1.759.015 | 328.203 | 212.892 | 322.732 | 58.207 | 81.445 | 121.354 | 116.875 |  |
| Unnamed: 224 |  |  |  |  | AGO | 7.308.252 | 62.511 | 22.869 | 135.581 | 23.444 | ... | 690.904 | 2.262.362 | 530.176 | 370.186 | 382.543 | 103.458 | 145.534 | 212.732 | 158.129 |  |
| Unnamed: 120 |  |  |  |  | DEZ | 7.572.217 | 53.253 | 16.636 | 108.177 | 13.929 | ... | 889.383 | 2.563.194 | 495.333 | 321.712 | 462.041 | 104.814 | 133.112 | 189.498 | 177.349 |  |
| Unnamed: 246 |  |  |  |  | JUN |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| Unnamed: 234 |  |  |  |  | JUN | 7.600.730 | 63.088 | 21.567 | 132.674 | 22.501 | ... | 713.134 | 2.394.251 | 518.358 | 403.19 | 402.681 | 77.641 | 140.848 | 217.487 | 163.329 |  |
| Unnamed: 93 |  |  |  |  | SET | 6.113.102 | 46.396 | 14.487 | 95.488 | 11.53 | ... | 707.78 | 2.042.227 | 414.322 | 247.669 | 351.584 | 77.98 | 112.049 | 170.9 | 157.65 |  |
| Unnamed: 198 |  |  |  |  | JUN | 5.611.285 | 49.545 | 17.47 | 94.842 | 14.633 | ... | 609.429 | 1.727.509 | 408.897 | 291.964 | 297.931 | 73.437 | 123.006 | 152.168 | 128.192 |  |
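
This subsection is about null values, and the transposed sample shows several columns and at least one row that are entirely empty. One possible way to quantify that is sketched below on a small hypothetical frame; the `toy` DataFrame, its column names and its values are illustrative only and are not the EPE data.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the transposed table: rows are months, columns are states
toy = pd.DataFrame(
    {
        "MES": ["JAN", "FEV", "MAR"],
        "Acre": ["7.895", "7.329", np.nan],
        "Ceará": ["95.718", np.nan, np.nan],
    },
    index=["row_1", "row_2", "row_3"],
)

# Number of missing values in each column
nulls_per_column = toy.isna().sum()

# Number of missing values in each row
nulls_per_row = toy.isna().sum(axis=1)
```

Applied to the real table, the same two calls would show which columns can be dropped outright and which months have no data yet.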


In [116]:
columns_to_be_droped = [
    "Consumo de energia elétrica na rede (MWh)",
    "Sistema SIMPLES",
    "Nota: atualização defasada para não antecipar informações de distribuidoras que devem obedecer às intruções da CVM sobre publicação de resultados."
]

In [117]:
df_residencial_transposto.drop(columns = columns_to_be_droped, inplace = True)
df_residencial_transposto.sample(10)

| Empresa de Pesquisa Energética - EPE | NaN | NaN.1 | NaN.2 | TOTAL POR UF | Rondônia | Acre | Amazonas | Roraima | Pará | Amapá | ... | Espírito Santo | Rio de Janeiro | São Paulo | Paraná | Santa Catarina | Rio Grande do Sul | Mato Grosso do Sul | Mato Grosso | Goiás | Distrito Federal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unnamed: 234 |  |  | JUN | 7.600.730 | 63.088 | 21.567 | 132.674 | 22.501 | 197.274 | 25.023 | ... | 150.047 | 713.134 | 2.394.251 | 518.358 | 403.19 | 402.681 | 77.641 | 140.848 | 217.487 | 163.329 |
| Unnamed: 42 |  |  | JUN | 4.587.140 | 30.746 | 8.845 | 63.666 | 7.022 | 95.474 | 10.952 | ... | 95.068 | 577.943 | 1.559.608 | 294.247 | 174.883 | 283.037 | 46.731 | 73.861 | 107.7 | 108.56 |
| Unnamed: 118 |  |  | OUT | 7.071.235 | 55.138 | 17.748 | 112.59 | 14.97 | 146.416 | 21.607 | ... | 138.417 | 851.243 | 2.331.562 | 445.573 | 289.931 | 394.109 | 87.824 | 133.671 | 190.949 | 169.395 |
| Unnamed: 247 |  |  | JUL |  |  |  |  |  |  |  | ... |  |  |  |  |  |  |  |  |  |  |
| Unnamed: 166 |  |  | OUT | 7.309.398 | 59.752 | 25.05 | 105.666 | 19.052 | 164.753 | 24.081 | ... | 140.469 | 813.67 | 2.365.924 | 491.471 | 315.708 | 387.34 | 105.478 | 147.462 | 200.851 | 160.81 |
| Unnamed: 75 |  |  | MAR | 6.162.429 | 36.503 | 10.884 | 70.211 | 9.187 | 112.369 | 13.666 | ... | 123.78 | 803.336 | 2.021.962 | 398.905 | 285.548 | 407.398 | 74.204 | 108.259 | 157.397 | 138.439 |
| Unnamed: 51 |  |  | MAR | 5.242.256 | 29.042 | 9.498 | 59.746 | 7.792 | 91.565 | 10.941 | ... | 116.013 | 698.453 | 1.740.362 | 353.332 | 238.409 | 348.897 | 60.641 | 92.921 | 139.027 | 108.137 |
| Unnamed: 213 |  |  | SET | 7.274.689 | 61.224 | 23.707 | 125.349 | 21.839 | 177.392 | 25.153 | ... | 142.23 | 725.812 | 2.255.854 | 513.504 | 351.131 | 341.161 | 107.484 | 153.48 | 224.473 | 175.27 |
| Unnamed: 116 |  |  | AGO | 6.601.480 | 50.027 | 15.395 | 108.376 | 12.683 | 141.526 | 18.219 | ... | 139.785 | 780.516 | 2.068.582 | 455.781 | 278.257 | 404.969 | 86.111 | 119.864 | 183.617 | 166.481 |
| Unnamed: 96 |  |  | DEZ | 6.497.944 | 47.466 | 15.066 | 89.652 | 12.243 | 127.926 | 17.688 | ... | 134.714 | 778.591 | 2.114.706 | 416.117 | 285.78 | 424.115 | 84.721 | 112.723 | 161.371 | 158.356 |
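
Some sampled rows (for example `Unnamed: 247`) carry a month label but no consumption values at all. A hedged sketch of one way to discard such rows is shown below. The three-row `toy` frame reuses a few values visible in the samples, the `month` column name is hypothetical, and the choice of `value_columns` is an assumption about which columns hold measurements.

```python
import numpy as np
import pandas as pd

# Miniature frame: one row with data, two rows with only a month label
toy = pd.DataFrame(
    {
        "month": ["ABR", "JUN", "JUL"],
        "Acre": ["24.94", np.nan, np.nan],
        "Amapá": ["26.448", np.nan, np.nan],
    },
    index=["Unnamed: 244", "Unnamed: 246", "Unnamed: 247"],
)

# Columns that are expected to hold consumption values (assumption)
value_columns = ["Acre", "Amapá"]

# Keep only rows where at least one value column is filled
filled = toy.dropna(how="all", subset=value_columns)
```

Using `how="all"` with `subset` keeps rows that are only partially missing, which may matter if a single state is absent for a given month.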


In [118]:
columns_to_rename = pd.Series(df_residencial_transposto.columns)
columns_to_rename = columns_to_rename.fillna("new_name" + (columns_to_rename.groupby(columns_to_rename.isnull()).cumcount() + 1).astype(str))
df_residencial_transposto.columns = columns_to_rename
df_residencial_transposto.sample(10)

| Empresa de Pesquisa Energética - EPE | new_name1 | new_name2 | new_name3 | TOTAL POR UF | Rondônia | Acre | Amazonas | Roraima | Pará | Amapá | ... | Espírito Santo | Rio de Janeiro | São Paulo | Paraná | Santa Catarina | Rio Grande do Sul | Mato Grosso do Sul | Mato Grosso | Goiás | Distrito Federal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Unnamed: 16 |  |  | ABR | 4.687.080 | 26.397 | 7.587 | 55.001 | 6.593 | 80.3 | 9.605 | ... | 94.214 | 655.122 | 1.598.831 | 303.092 | 195.076 | 292.785 | 59.365 | 85.785 | 108.313 | 99.922 |
| Unnamed: 6 |  |  | JUN | 3.840.001 | 24.112 | 6.955 | 50.734 | 5.472 | 82.47 | 8.309 | ... | 75.631 | 506.022 | 1.280.007 | 239.079 | 142.886 | 245.687 | 44.157 | 61.587 | 91.441 | 87.778 |
| Unnamed: 34 |  |  | OUT | 4.581.791 | 31.501 | 9.672 | 60.813 | 7.223 | 95.213 | 10.892 | ... | 91.027 | 602.321 | 1.573.310 | 283.187 | 172.002 | 267.765 | 52.846 | 77.126 | 111.893 | 103.907 |
| Unnamed: 108 |  |  | DEZ | 7.079.565 | 50.225 | 16.036 | 100.25 | 13.597 | 128.819 | 19.642 | ... | 147.533 | 874.682 | 2.308.213 | 476.817 | 322.23 | 461.929 | 98.093 | 118.621 | 180.039 | 166.843 |
| Unnamed: 8 |  |  | AGO | 3.867.268 | 25.604 | 7.58 | 52.589 | 5.82 | 83.589 | 7.858 | ... | 75.329 | 503.307 | 1.324.733 | 249.298 | 147.906 | 248.318 | 49.557 | 62.202 | 89.515 | 84.814 |
| Unnamed: 197 |  |  | MAI | 5.774.137 | 50.229 | 16.436 | 84.384 | 14.57 | 126.892 | 18.051 | ... | 109.376 | 593.771 | 1.855.630 | 407.228 | 296.154 | 313.339 | 83.259 | 127.306 | 155.934 | 131.602 |
| Unnamed: 45 |  |  | SET | 4.787.183 | 33.884 | 9.846 | 65.619 | 7.467 | 98.375 | 12.092 | ... | 95.25 | 621.84 | 1.665.504 | 315.13 | 188.014 | 284.909 | 55.571 | 79.517 | 115.259 | 107.222 |
| Unnamed: 244 |  |  | ABR | 9.043.976 | 67.186 | 24.94 | 137.675 | 26.73 | 196.144 | 26.448 | ... | 203.481 | 846.565 | 2.895.258 | 663.592 | 543.175 | 506.31 | 109.172 | 161.94 | 256.047 | 176.765 |
| Unnamed: 182 |  |  | FEV | 8.209.493 | 55.38 | 19.788 | 104.135 | 17.042 | 135.183 | 19.099 | ... | 163.971 | 930.608 | 2.664.088 | 586.916 | 435.692 | 501.161 | 121.933 | 159.859 | 220.296 | 175.615 |
| Unnamed: 53 |  |  | MAI | 5.090.517 | 31.079 | 9.743 | 64.721 | 8.278 | 92.931 | 9.497 | ... | 111.173 | 672.621 | 1.693.988 | 318.134 | 208.745 | 306.33 | 54.7 | 81.725 | 141.226 | 115.668 |
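
The renaming cell above is compact, so a step-by-step walk-through on a toy list of column labels may help. The five labels below are an assumption made for illustration; only the technique (`groupby` on the null mask plus `cumcount`) mirrors the cell.

```python
import numpy as np
import pandas as pd

# Toy column labels: three unnamed (NaN) columns followed by two named ones
columns = pd.Series([np.nan, np.nan, np.nan, "TOTAL POR UF", "Acre"])

# Group labels by "is null" and number the members of each group from 1
position_in_group = columns.groupby(columns.isnull()).cumcount() + 1

# Build a candidate name for every position, e.g. "new_name1"
candidate_names = "new_name" + position_in_group.astype(str)

# Only the NaN labels are replaced; existing names are left untouched
renamed = columns.fillna(candidate_names)
```

Because `fillna` aligns on the Series index, the candidate names generated for the already named columns are simply ignored.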


In [132]:
df_residencial_transposto.drop(columns = ["new_name1"], inplace = True)
df_residencial_transposto.sample(10)

In [153]:
#df_residencial_transposto[~df_residencial_transposto["new_name2"].isna()]
df_residencial_transposto.groupby(["new_name2", "new_name3"]).count()

| new_name2 | new_name3 | TOTAL POR UF | Rondônia | Acre | Amazonas | Roraima | Pará | Amapá | Tocantins | Maranhão | Piauí | ... | Espírito Santo | Rio de Janeiro | São Paulo | Paraná | Santa Catarina | Rio Grande do Sul | Mato Grosso do Sul | Mato Grosso | Goiás | Distrito Federal |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 2004 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2005 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2006 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2007 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2008 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2009 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2010 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2011 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2012 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2013 | JAN | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | ... | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
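
Only January rows appear in this output. That is consistent with the year column (`new_name2`) being filled only on the January row of each year, because `groupby` drops rows whose key is null by default. Assuming that is the cause, a forward fill of the year before grouping would keep every month. The sketch below uses a hypothetical five-row frame with made-up values in `value`.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: the year is present only on the first month of each year
toy = pd.DataFrame(
    {
        "new_name2": ["2004", np.nan, np.nan, "2005", np.nan],
        "new_name3": ["JAN", "FEV", "MAR", "JAN", "FEV"],
        "value": [10, 11, 12, 13, 14],  # made-up numbers
    }
)

# Grouping as-is silently drops the rows with a missing year
counts_before = toy.groupby(["new_name2", "new_name3"]).count()

# Propagate the last seen year down to the following months
toy["new_name2"] = toy["new_name2"].ffill()

# Every (year, month) pair is now represented
counts_after = toy.groupby(["new_name2", "new_name3"]).count()
```

This relies on the rows being in chronological order, which the transposed table seems to preserve but which is worth confirming before applying it.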


## EDA
### Exploratory Data Analysis

## Creating the Models

## Training the Models

## Model Results

### Tests

### Quality of the Tests and Results

## Discussion

The discussion goes here.