## Time Series Modeling with ARIMA
### Exercise

Using the provided dataset, carry out the following activities to build robust predictive and exploratory models for understanding temporal patterns and categorical behavior in the energy consumption data:
#### ARIMA
Fit an ARIMA model to predict energy consumption (Usage_kWh) from related variables, such as reactive power and CO2.
#### Logistic Regression
Fit a logistic regression model to predict the load type (Load_Type) from variables such as energy consumption, reactive power, etc.

In [3]:
# import the libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import statsmodels.tsa.api as smt
from statsmodels.tsa.arima.model import ARIMA
import statsmodels.api as sm


In [4]:
# load the data
data = pd.read_csv('consumo_energia.csv')


In [5]:
data.columns

Index(['date', 'Usage_kWh', 'Lagging_Current_Reactive.Power_kVarh',
       'Leading_Current_Reactive_Power_kVarh', 'CO2(tCO2)',
       'Lagging_Current_Power_Factor', 'Leading_Current_Power_Factor', 'NSM',
       'WeekStatus', 'Day_of_week', 'Load_Type'],
      dtype='object')

In [6]:
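# Portuguese short names: Atrasado = lagging power factor, Adiantado = leading power factor, tipo_carga = load type
data.rename(columns={'Lagging_Current_Power_Factor':'Atrasado','Leading_Current_Power_Factor':'Adiantado','Load_Type':'tipo_carga','CO2(tCO2)':'CO2'},inplace=True)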

In [10]:
# Assign the result back to the column. The chained form
# data['tipo_carga'].replace(..., inplace=True) acts on an intermediate copy,
# raises a pandas FutureWarning, and will stop working in pandas 3.0.
data['tipo_carga'] = data['tipo_carga'].replace({'Light_Load':'Carga leve','Medium_Load':'Carga Média','Maximum_Load':'Carga máxima'})


In [11]:
data.head()

|   | date | Usage_kWh | Lagging_Current_Reactive.Power_kVarh | Leading_Current_Reactive_Power_kVarh | CO2 | Atrasado | Adiantado | NSM | WeekStatus | Day_of_week | tipo_carga |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 01/01/2018 00:15 | 3.17 | 2.95 | 0.0 | 0.0 | 73.21 | 100.0 | 900 | Weekday | Monday | Carga leve |
| 1 | 01/01/2018 00:30 | 4.0 | 4.46 | 0.0 | 0.0 | 66.77 | 100.0 | 1800 | Weekday | Monday | Carga leve |
| 2 | 01/01/2018 00:45 | 3.24 | 3.28 | 0.0 | 0.0 | 70.28 | 100.0 | 2700 | Weekday | Monday | Carga leve |
| 3 | 01/01/2018 01:00 | 3.31 | 3.56 | 0.0 | 0.0 | 68.09 | 100.0 | 3600 | Weekday | Monday | Carga leve |
| 4 | 01/01/2018 01:15 | 3.82 | 4.5 | 0.0 | 0.0 | 64.72 | 100.0 | 4500 | Weekday | Monday | Carga leve |
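
The `date` column is loaded as plain strings, while time series tools generally work best with a datetime index. Below is a minimal sketch of the conversion, using the first three rows shown above. It assumes the timestamps are day-first (`dd/mm/yyyy HH:MM`), which the sample rows alone cannot confirm.

```python
import io
import pandas as pd

# First rows of the dataset, copied from the data.head() output.
csv_text = """date,Usage_kWh
01/01/2018 00:15,3.17
01/01/2018 00:30,4.0
01/01/2018 00:45,3.24
"""

sample = pd.read_csv(io.StringIO(csv_text))

# Assumption: dates are day-first (dd/mm/yyyy).
sample['date'] = pd.to_datetime(sample['date'], format='%d/%m/%Y %H:%M')
sample = sample.set_index('date').sort_index()

# Spacing between the first two observations.
step = sample.index[1] - sample.index[0]
print(sample)
print(step)
```

With a datetime index in place, the 15-minute spacing of the observations becomes explicit rather than implied by row order.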


In [12]:
# separate the dependent and independent variables
X = data[['Atrasado','Adiantado','CO2']]
y = data['Usage_kWh']

In [13]:
X = sm.add_constant(X)
X.head()

|   | const | Atrasado | Adiantado | CO2 |
|---|---|---|---|---|
| 0 | 1.0 | 73.21 | 100.0 | 0.0 |
| 1 | 1.0 | 66.77 | 100.0 | 0.0 |
| 2 | 1.0 | 70.28 | 100.0 | 0.0 |
| 3 | 1.0 | 68.09 | 100.0 | 0.0 |
| 4 | 1.0 | 64.72 | 100.0 | 0.0 |


In [9]:
y.head()

0    3.17
1    4.00
2    3.24
3    3.31
4    3.82
Name: Usage_kWh, dtype: float64

In [14]:
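# Fit the ARIMA model: a regression on X with ARIMA(1,1,1) errors.
# exog=X.values discards the column names, so the summary labels the
# regressors const, x1 (Atrasado), x2 (Adiantado), x3 (CO2).
arima_modelo = ARIMA(y, order=(1,1,1), exog=X.values)
arima_results = arima_modelo.fit()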

In [15]:
print(arima_results.summary())

                               SARIMAX Results                                
Dep. Variable:              Usage_kWh   No. Observations:                35040
Model:                 ARIMA(1, 1, 1)   Log Likelihood             -100653.014
Date:                Fri, 13 Dec 2024   AIC                         201320.028
Time:                        22:28:15   BIC                         201379.277
Sample:                             0   HQIC                        201338.899
                              - 35040                                         
Covariance Type:                  opg                                         
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const       1.565e-05   9.05e-10   1.73e+04      0.000    1.56e-05    1.56e-05
x1             0.0445      0.003     13.431      0.000       0.038       0.051
x2             0.0177      0.002      9.758      0.0
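
The `const` row in this summary should be read with caution. With `d=1` the ARIMA error term is integrated, so the model is effectively fitted to changes in the series, and a column of ones has no changes: its first difference is identically zero. A constant regressor is therefore not identified in this specification, which may be why the reported standard error for `const` looks implausibly small. The check below only illustrates the differencing argument on toy columns.

```python
import numpy as np

const_column = np.ones(5)  # what sm.add_constant adds
varying_column = np.array([73.21, 66.77, 70.28, 68.09, 64.72])  # Atrasado values from data.head()

print(np.diff(const_column))    # all zeros: nothing left to estimate a coefficient from
print(np.diff(varying_column))  # non-zero changes carry information
```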

In [16]:
## LOGISTIC REGRESSION

In [19]:
y_logit = (data['tipo_carga']=='Carga máxima').astype(int)  # 1 = maximum load, 0 = any other load type
x_logit = sm.add_constant(data[['Usage_kWh','Atrasado','Adiantado']])

In [25]:
y_logit.value_counts()

tipo_carga
0    27768
1     7272
Name: count, dtype: int64
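
The classes are imbalanced, which matters when judging the classifier: a model that always predicts class 0 is already right most of the time. The arithmetic from the counts above:

```python
counts = {0: 27768, 1: 7272}  # from y_logit.value_counts()

total = sum(counts.values())
positive_share = counts[1] / total
baseline_accuracy = max(counts.values()) / total  # always predict the majority class

print(total)                        # 35040
print(round(positive_share, 4))     # 0.2075
print(round(baseline_accuracy, 4))  # 0.7925
```

Any accuracy figure for the fitted model is best compared against this baseline of roughly 79%, not against 50%.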

In [22]:
# fit the logistic regression model

logic_model = sm.Logit(y_logit, x_logit)
logic_result = logic_model.fit()

Optimization terminated successfully.
         Current function value: 0.364896
         Iterations 8


In [23]:
# print the results summary
print(logic_result.summary())

                           Logit Regression Results                           
Dep. Variable:             tipo_carga   No. Observations:                35040
Model:                          Logit   Df Residuals:                    35036
Method:                           MLE   Df Model:                            3
Date:                Fri, 13 Dec 2024   Pseudo R-squ.:                  0.2855
Time:                        22:38:02   Log-Likelihood:                -12786.
converged:                       True   LL-Null:                       -17894.
Covariance Type:            nonrobust   LLR p-value:                     0.000
                 coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------
const        -11.5010      0.199    -57.712      0.000     -11.892     -11.110
Usage_kWh      0.0151      0.001     26.713      0.000       0.014       0.016
Atrasado       0.0772      0.002     44.445      0.0
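
Logit coefficients are on the log-odds scale, so exponentiating them gives odds ratios. The snippet below works through the numbers visible in the summary above. Coefficients and log-likelihoods are copied from the printed output, so the results are limited by its rounding. It also reproduces the pseudo R-squared as McFadden's `1 - LL / LL-Null`, and the optimizer's "Current function value" as the mean negative log-likelihood.

```python
import math

# Values copied from the printed Logit summary (rounded as shown there).
coef = {'Usage_kWh': 0.0151, 'Atrasado': 0.0772}
log_lik, log_lik_null, n_obs = -12786.0, -17894.0, 35040

# Odds ratio: multiplicative change in the odds of 'Carga máxima' per one-unit increase.
odds_ratios = {name: math.exp(b) for name, b in coef.items()}

pseudo_r2 = 1 - log_lik / log_lik_null  # McFadden's pseudo R-squared
mean_neg_ll = -log_lik / n_obs          # compare with "Current function value"

print({k: round(v, 4) for k, v in odds_ratios.items()})  # {'Usage_kWh': 1.0152, 'Atrasado': 1.0803}
print(round(pseudo_r2, 4), round(mean_neg_ll, 4))        # 0.2855 0.3649
```

Holding the other regressors fixed, each additional kWh multiplies the estimated odds of maximum load by about 1.015, and each additional point of lagging power factor by about 1.08.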