# EBAC - Regression II - multiple regression

## Task I

#### Income prediction

We will work with the dataset 'previsao_de_renda.csv', which is the dataset for your next project. We will apply the techniques covered so far to this dataset.

|variable|description|
|-|-|
|data_ref                | Reference date on which the variables were collected |
|index                   | Client identification code|
|sexo                    | Sex of the client|
|posse_de_veiculo        | Indicates whether the client owns a vehicle|
|posse_de_imovel         | Indicates whether the client owns real estate|
|qtd_filhos              | Number of children of the client|
|tipo_renda              | Type of income of the client|
|educacao                | Education level of the client|
|estado_civil            | Marital status of the client|
|tipo_residencia         | Type of residence of the client (owned, rented, etc.)|
|idade                   | Age of the client|
|tempo_emprego           | Time in the current job|
|qt_pessoas_residencia   | Number of people living in the residence|
|renda                   | Income in reais|

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv('previsao_de_renda.csv')

In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15000 entries, 0 to 14999
Data columns (total 15 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   Unnamed: 0             15000 non-null  int64  
 1   data_ref               15000 non-null  object 
 2   id_cliente             15000 non-null  int64  
 3   sexo                   15000 non-null  object 
 4   posse_de_veiculo       15000 non-null  bool   
 5   posse_de_imovel        15000 non-null  bool   
 6   qtd_filhos             15000 non-null  int64  
 7   tipo_renda             15000 non-null  object 
 8   educacao               15000 non-null  object 
 9   estado_civil           15000 non-null  object 
 10  tipo_residencia        15000 non-null  object 
 11  idade                  15000 non-null  int64  
 12  tempo_emprego          12427 non-null  float64
 13  qt_pessoas_residencia  15000 non-null  float64
 14  renda                  15000 non-null  float64
dtypes:

1. Fit a model to predict log(renda) using all available covariates.
    - Using Patsy, encode the qualitative variables as *dummies*.
    - Always keep the most frequent category as the reference level.
    - Evaluate the parameters and check whether they make practical sense.  


2. Remove the least significant variable and analyze:
    - Look at the indicators we have seen and assess whether, in your opinion, the model improved or got worse.
    - Look at the parameters and check whether any of them changed substantially.  


3. Keep removing the least significant variables for as long as the *p-value* is greater than 5%. Compare the final model with the initial one. Look at the indicators and conclude whether the model seems better. 
    

In [4]:
# item 1

df.isna().sum()

Unnamed: 0                  0
data_ref                    0
id_cliente                  0
sexo                        0
posse_de_veiculo            0
posse_de_imovel             0
qtd_filhos                  0
tipo_renda                  0
educacao                    0
estado_civil                0
tipo_residencia             0
idade                       0
tempo_emprego            2573
qt_pessoas_residencia       0
renda                       0
dtype: int64

In [5]:
df1 = df.dropna()
df1.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 12427 entries, 0 to 14999
Data columns (total 15 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   Unnamed: 0             12427 non-null  int64  
 1   data_ref               12427 non-null  object 
 2   id_cliente             12427 non-null  int64  
 3   sexo                   12427 non-null  object 
 4   posse_de_veiculo       12427 non-null  bool   
 5   posse_de_imovel        12427 non-null  bool   
 6   qtd_filhos             12427 non-null  int64  
 7   tipo_renda             12427 non-null  object 
 8   educacao               12427 non-null  object 
 9   estado_civil           12427 non-null  object 
 10  tipo_residencia        12427 non-null  object 
 11  idade                  12427 non-null  int64  
 12  tempo_emprego          12427 non-null  float64
 13  qt_pessoas_residencia  12427 non-null  float64
 14  renda                  12427 non-null  float64
dtypes:

In [6]:
import statsmodels.api as sm
import numpy as np
import patsy

In [7]:
df1[['sexo']].value_counts()

sexo
F       7901
M       4526
dtype: int64

In [8]:
df1[['tipo_renda']].value_counts()

tipo_renda      
Assalariado         7633
Empresário          3508
Servidor público    1268
Bolsista               9
Pensionista            9
dtype: int64

In [9]:
df1[['educacao']].value_counts()

educacao           
Secundário             7045
Superior completo      4695
Superior incompleto     558
Primário                103
Pós graduação            26
dtype: int64

In [10]:
df1[['estado_civil']].value_counts()

estado_civil
Casado          8897
Solteiro        1543
União            924
Separado         739
Viúvo            324
dtype: int64

In [11]:
df1[['tipo_residencia']].value_counts()

tipo_residencia
Casa               11071
Com os pais          674
Governamental        360
Aluguel              183
Estúdio               75
Comunitário           64
dtype: int64

In [12]:
y, x = patsy.dmatrices(
    'np.log(renda) ~ C(sexo) + posse_de_veiculo + posse_de_imovel + qtd_filhos + C(tipo_renda) + C(educacao, Treatment(2)) + C(estado_civil) + C(tipo_residencia, Treatment(1)) + idade + tempo_emprego + qt_pessoas_residencia',
    df1    
)
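
The formula above selects the reference levels by position (`Treatment(2)`, `Treatment(1)`), which depends on the alphabetical order of the categories. A minimal sketch of an alternative, assuming Patsy's `Treatment(reference=...)` is given the level name instead of its index: pick the most frequent level and build the term from it. The helper `reference_term` and the toy values are hypothetical.

```python
from collections import Counter


def reference_term(column, values):
    """Build a Patsy term whose reference level is the most frequent value."""
    most_common_level, _ = Counter(values).most_common(1)[0]
    return f'C({column}, Treatment(reference="{most_common_level}"))'


# Toy values standing in for df1["tipo_residencia"] (not the real data).
toy_values = ["Casa", "Casa", "Aluguel", "Casa", "Com os pais"]
term = reference_term("tipo_residencia", toy_values)
print(term)
```

With the real data, the same helper could be called as `reference_term("educacao", df1["educacao"])` and the resulting strings joined with `+` into the formula passed to `patsy.dmatrices`.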

In [13]:
x

DesignMatrix with shape (12427, 25)
  Columns:
    ['Intercept',
     'C(sexo)[T.M]',
     'posse_de_veiculo[T.True]',
     'posse_de_imovel[T.True]',
     'C(tipo_renda)[T.Bolsista]',
     'C(tipo_renda)[T.Empresário]',
     'C(tipo_renda)[T.Pensionista]',
     'C(tipo_renda)[T.Servidor público]',
     'C(educacao, Treatment(2))[T.Primário]',
     'C(educacao, Treatment(2))[T.Pós graduação]',
     'C(educacao, Treatment(2))[T.Superior completo]',
     'C(educacao, Treatment(2))[T.Superior incompleto]',
     'C(estado_civil)[T.Separado]',
     'C(estado_civil)[T.Solteiro]',
     'C(estado_civil)[T.União]',
     'C(estado_civil)[T.Viúvo]',
     'C(tipo_residencia, Treatment(1))[T.Aluguel]',
     'C(tipo_residencia, Treatment(1))[T.Com os pais]',
     'C(tipo_residencia, Treatment(1))[T.Comunitário]',
     'C(tipo_residencia, Treatment(1))[T.Estúdio]',
     'C(tipo_residencia, Treatment(1))[T.Governamental]',
     'qtd_filhos',
     'idade',
     'tempo_emprego',
     'qt_pessoas_residen

In [14]:
sm.OLS(y, x).fit().summary()

0,1,2,3
Dep. Variable:,np.log(renda),R-squared:,0.357
Model:,OLS,Adj. R-squared:,0.356
Method:,Least Squares,F-statistic:,287.5
Date:,"Wed, 13 Jul 2022",Prob (F-statistic):,0.0
Time:,23:27:43,Log-Likelihood:,-13568.0
No. Observations:,12427,AIC:,27190.0
Df Residuals:,12402,BIC:,27370.0
Df Model:,24,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,6.5264,0.219,29.853,0.000,6.098,6.955
C(sexo)[T.M],0.7874,0.015,53.723,0.000,0.759,0.816
posse_de_veiculo[T.True],0.0441,0.014,3.119,0.002,0.016,0.072
posse_de_imovel[T.True],0.0829,0.014,5.926,0.000,0.055,0.110
C(tipo_renda)[T.Bolsista],0.2209,0.241,0.916,0.360,-0.252,0.694
C(tipo_renda)[T.Empresário],0.1551,0.015,10.387,0.000,0.126,0.184
C(tipo_renda)[T.Pensionista],-0.3087,0.241,-1.280,0.201,-0.782,0.164
C(tipo_renda)[T.Servidor público],0.0576,0.022,2.591,0.010,0.014,0.101
"C(educacao, Treatment(2))[T.Primário]",0.0141,0.072,0.196,0.844,-0.127,0.155

0,1,2,3
Omnibus:,0.858,Durbin-Watson:,2.023
Prob(Omnibus):,0.651,Jarque-Bera (JB):,0.839
Skew:,0.019,Prob(JB):,0.657
Kurtosis:,3.012,Cond. No.,2130.0


Looking at the parameters, several coefficients are not statistically significant, that is, their p-values are above 5% (for example, Bolsista with 0.360, Pensionista with 0.201 and Primário with 0.844), so there is little evidence against the null hypothesis that the coefficient is zero. This suggests the regression model can be improved by removing such variables.
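
To judge whether the parameters make practical sense, it can help to convert them to the original scale. Because the response is log(renda), a coefficient b corresponds to multiplying renda by exp(b), holding the other covariates fixed. The sketch below applies this to a few coefficients copied from the summary above; the helper `percent_effect` is hypothetical.

```python
import math


def percent_effect(coef):
    """Percent change in renda implied by a coefficient on log(renda)."""
    return (math.exp(coef) - 1.0) * 100.0


# Coefficients copied from the first model's summary.
coefs = {
    "C(sexo)[T.M]": 0.7874,
    "posse_de_veiculo[T.True]": 0.0441,
    "posse_de_imovel[T.True]": 0.0829,
    "C(tipo_renda)[T.Empresário]": 0.1551,
}
for name, coef in coefs.items():
    print(f"{name}: {percent_effect(coef):+.1f}%")
```

For small coefficients the percent effect is close to 100 times the coefficient itself; for large ones, such as `C(sexo)[T.M]`, the two differ noticeably.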

In [21]:
# item 2 - removal of the variable sexo

y, x = patsy.dmatrices(
    'np.log(renda) ~ posse_de_veiculo + posse_de_imovel + qtd_filhos + C(tipo_renda) + C(educacao, Treatment(2)) + C(estado_civil) + C(tipo_residencia, Treatment(1)) + idade + tempo_emprego + qt_pessoas_residencia',
    df1    
)

In [23]:
sm.OLS(y, x).fit().summary()

0,1,2,3
Dep. Variable:,np.log(renda),R-squared:,0.208
Model:,OLS,Adj. R-squared:,0.206
Method:,Least Squares,F-statistic:,141.6
Date:,"Thu, 14 Jul 2022",Prob (F-statistic):,0.0
Time:,00:33:27,Log-Likelihood:,-14868.0
No. Observations:,12427,AIC:,29780.0
Df Residuals:,12403,BIC:,29960.0
Df Model:,23,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,7.0122,0.243,28.915,0.000,6.537,7.488
posse_de_veiculo[T.True],0.2815,0.015,18.875,0.000,0.252,0.311
posse_de_imovel[T.True],0.0572,0.016,3.688,0.000,0.027,0.088
C(tipo_renda)[T.Bolsista],0.0855,0.268,0.319,0.750,-0.439,0.610
C(tipo_renda)[T.Empresário],0.1130,0.017,6.821,0.000,0.080,0.145
C(tipo_renda)[T.Pensionista],-0.2609,0.268,-0.974,0.330,-0.786,0.264
C(tipo_renda)[T.Servidor público],0.0108,0.025,0.437,0.662,-0.038,0.059
"C(educacao, Treatment(2))[T.Primário]",0.0910,0.080,1.138,0.255,-0.066,0.248
"C(educacao, Treatment(2))[T.Pós graduação]",-0.0367,0.158,-0.233,0.816,-0.346,0.273

0,1,2,3
Omnibus:,24.399,Durbin-Watson:,2.034
Prob(Omnibus):,0.0,Jarque-Bera (JB):,24.482
Skew:,0.108,Prob(JB):,4.83e-06
Kurtosis:,3.021,Cond. No.,2130.0


The variable sexo was removed here, although in the first model its p-value is 0.000 (t = 53.723), which makes it highly significant rather than insignificant (a variable is insignificant when its p-value is above 5%). With this removal, R-squared dropped from 0.357 to 0.208. Some of the remaining variables did become more significant than in the first model (for example, the t statistic of posse_de_veiculo rose from 3.119 to 18.875). The AIC, however, points to the first model as the better one, since it has the lower value (27190 versus 29780). Based on R-squared and AIC, the first (more complete) model is therefore better than this one without sexo.
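
As a check on the AIC comparison, the values can be recomputed from the log-likelihood and the number of estimated parameters using AIC = 2k - 2 ln(L). This is a minimal sketch that assumes k is Df Model plus one for the intercept; the inputs are the rounded figures printed in the two summaries, so the results should match the reported AIC only up to that rounding.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2*ln(L)."""
    return 2 * n_params - 2 * log_likelihood


# (Log-Likelihood, Df Model + intercept) as printed in the two summaries.
full = aic(-13568.0, 24 + 1)     # model with sexo
reduced = aic(-14868.0, 23 + 1)  # model without sexo
print(full, reduced)
```

The penalty term 2k differs by only 2 between the two models, so the gap in AIC comes almost entirely from the drop in log-likelihood.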

In [30]:
# item 3.A - removal of the variables posse_de_imovel and qt_pessoas_residencia


y, x = patsy.dmatrices(
    'np.log(renda) ~ posse_de_veiculo + qtd_filhos + C(tipo_renda) + C(educacao, Treatment(2)) + C(estado_civil) + C(tipo_residencia, Treatment(1)) + idade + tempo_emprego',
    df1    
)
sm.OLS(y, x).fit().summary()


0,1,2,3
Dep. Variable:,np.log(renda),R-squared:,0.207
Model:,OLS,Adj. R-squared:,0.205
Method:,Least Squares,F-statistic:,154.0
Date:,"Thu, 14 Jul 2022",Prob (F-statistic):,0.0
Time:,00:52:48,Log-Likelihood:,-14877.0
No. Observations:,12427,AIC:,29800.0
Df Residuals:,12405,BIC:,29960.0
Df Model:,21,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,7.5274,0.040,186.235,0.000,7.448,7.607
posse_de_veiculo[T.True],0.2822,0.015,18.914,0.000,0.253,0.311
C(tipo_renda)[T.Bolsista],0.1014,0.268,0.378,0.705,-0.424,0.627
C(tipo_renda)[T.Empresário],0.1129,0.017,6.813,0.000,0.080,0.145
C(tipo_renda)[T.Pensionista],-0.2766,0.268,-1.032,0.302,-0.802,0.249
C(tipo_renda)[T.Servidor público],0.0100,0.025,0.404,0.686,-0.038,0.058
"C(educacao, Treatment(2))[T.Primário]",0.0943,0.080,1.179,0.238,-0.062,0.251
"C(educacao, Treatment(2))[T.Pós graduação]",-0.0201,0.158,-0.127,0.899,-0.329,0.289
"C(educacao, Treatment(2))[T.Superior completo]",0.0641,0.015,4.155,0.000,0.034,0.094

0,1,2,3
Omnibus:,22.608,Durbin-Watson:,2.034
Prob(Omnibus):,0.0,Jarque-Bera (JB):,22.672
Skew:,0.104,Prob(JB):,1.19e-05
Kurtosis:,3.023,Cond. No.,1590.0


In [32]:
# item 3.B - removal of the variable posse_de_veiculo


y, x = patsy.dmatrices(
    'np.log(renda) ~ qtd_filhos + C(tipo_renda) + C(educacao, Treatment(2)) + C(estado_civil) + C(tipo_residencia, Treatment(1)) + idade + tempo_emprego',
    df1    
)
sm.OLS(y, x).fit().summary()


0,1,2,3
Dep. Variable:,np.log(renda),R-squared:,0.184
Model:,OLS,Adj. R-squared:,0.183
Method:,Least Squares,F-statistic:,139.8
Date:,"Thu, 14 Jul 2022",Prob (F-statistic):,0.0
Time:,00:56:15,Log-Likelihood:,-15053.0
No. Observations:,12427,AIC:,30150.0
Df Residuals:,12406,BIC:,30300.0
Df Model:,20,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,7.7020,0.040,192.971,0.000,7.624,7.780
C(tipo_renda)[T.Bolsista],-0.0264,0.272,-0.097,0.923,-0.559,0.506
C(tipo_renda)[T.Empresário],0.1122,0.017,6.674,0.000,0.079,0.145
C(tipo_renda)[T.Pensionista],-0.2691,0.272,-0.990,0.322,-0.802,0.264
C(tipo_renda)[T.Servidor público],-0.0005,0.025,-0.018,0.985,-0.049,0.049
"C(educacao, Treatment(2))[T.Primário]",0.0994,0.081,1.225,0.221,-0.060,0.258
"C(educacao, Treatment(2))[T.Pós graduação]",0.0444,0.160,0.277,0.782,-0.269,0.358
"C(educacao, Treatment(2))[T.Superior completo]",0.0820,0.016,5.256,0.000,0.051,0.113
"C(educacao, Treatment(2))[T.Superior incompleto]",-0.0366,0.036,-1.006,0.314,-0.108,0.035

0,1,2,3
Omnibus:,25.814,Durbin-Watson:,2.031
Prob(Omnibus):,0.0,Jarque-Bera (JB):,25.964
Skew:,0.112,Prob(JB):,2.3e-06
Kurtosis:,2.983,Cond. No.,1590.0


In [33]:
# item 3.C - removal of the variable estado_civil


y, x = patsy.dmatrices(
    'np.log(renda) ~ qtd_filhos + C(tipo_renda) + C(educacao, Treatment(2)) + C(tipo_residencia, Treatment(1)) + idade + tempo_emprego',
    df1    
)
sm.OLS(y, x).fit().summary()


0,1,2,3
Dep. Variable:,np.log(renda),R-squared:,0.18
Model:,OLS,Adj. R-squared:,0.179
Method:,Least Squares,F-statistic:,170.5
Date:,"Thu, 14 Jul 2022",Prob (F-statistic):,0.0
Time:,00:58:29,Log-Likelihood:,-15082.0
No. Observations:,12427,AIC:,30200.0
Df Residuals:,12410,BIC:,30320.0
Df Model:,16,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,7.6686,0.039,196.768,0.000,7.592,7.745
C(tipo_renda)[T.Bolsista],0.0115,0.272,0.042,0.966,-0.522,0.545
C(tipo_renda)[T.Empresário],0.1082,0.017,6.433,0.000,0.075,0.141
C(tipo_renda)[T.Pensionista],-0.3127,0.272,-1.150,0.250,-0.846,0.220
C(tipo_renda)[T.Servidor público],-0.0023,0.025,-0.090,0.928,-0.051,0.047
"C(educacao, Treatment(2))[T.Primário]",0.0895,0.081,1.102,0.270,-0.070,0.249
"C(educacao, Treatment(2))[T.Pós graduação]",0.0503,0.160,0.314,0.754,-0.264,0.364
"C(educacao, Treatment(2))[T.Superior completo]",0.0816,0.016,5.222,0.000,0.051,0.112
"C(educacao, Treatment(2))[T.Superior incompleto]",-0.0372,0.036,-1.022,0.307,-0.109,0.034

0,1,2,3
Omnibus:,32.111,Durbin-Watson:,2.031
Prob(Omnibus):,0.0,Jarque-Bera (JB):,32.345
Skew:,0.124,Prob(JB):,9.47e-08
Kurtosis:,2.977,Cond. No.,1580.0


In [34]:
# item 3.D - removal of the variable qtd_filhos


y, x = patsy.dmatrices(
    'np.log(renda) ~ C(tipo_renda) + C(educacao, Treatment(2)) + C(tipo_residencia, Treatment(1)) + idade + tempo_emprego',
    df1    
)
sm.OLS(y, x).fit().summary()


0,1,2,3
Dep. Variable:,np.log(renda),R-squared:,0.178
Model:,OLS,Adj. R-squared:,0.177
Method:,Least Squares,F-statistic:,179.5
Date:,"Thu, 14 Jul 2022",Prob (F-statistic):,0.0
Time:,01:01:12,Log-Likelihood:,-15096.0
No. Observations:,12427,AIC:,30220.0
Df Residuals:,12411,BIC:,30340.0
Df Model:,15,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,7.7498,0.036,215.286,0.000,7.679,7.820
C(tipo_renda)[T.Bolsista],-0.0049,0.272,-0.018,0.986,-0.539,0.529
C(tipo_renda)[T.Empresário],0.1029,0.017,6.124,0.000,0.070,0.136
C(tipo_renda)[T.Pensionista],-0.3155,0.272,-1.158,0.247,-0.849,0.218
C(tipo_renda)[T.Servidor público],-0.0015,0.025,-0.060,0.952,-0.051,0.048
"C(educacao, Treatment(2))[T.Primário]",0.0805,0.081,0.990,0.322,-0.079,0.240
"C(educacao, Treatment(2))[T.Pós graduação]",0.0629,0.160,0.392,0.695,-0.252,0.377
"C(educacao, Treatment(2))[T.Superior completo]",0.0822,0.016,5.253,0.000,0.052,0.113
"C(educacao, Treatment(2))[T.Superior incompleto]",-0.0425,0.036,-1.166,0.244,-0.114,0.029

0,1,2,3
Omnibus:,32.34,Durbin-Watson:,2.031
Prob(Omnibus):,0.0,Jarque-Bera (JB):,32.577
Skew:,0.125,Prob(JB):,8.43e-08
Kurtosis:,2.974,Cond. No.,1580.0


In [38]:
# item 3.E - removal of the variable idade


y, x = patsy.dmatrices(
    'np.log(renda) ~ C(tipo_renda) + C(educacao, Treatment(2)) + C(tipo_residencia, Treatment(1)) + tempo_emprego',
    df1    
)
sm.OLS(y, x).fit().summary()


0,1,2,3
Dep. Variable:,np.log(renda),R-squared:,0.178
Model:,OLS,Adj. R-squared:,0.177
Method:,Least Squares,F-statistic:,192.3
Date:,"Thu, 14 Jul 2022",Prob (F-statistic):,0.0
Time:,01:03:15,Log-Likelihood:,-15097.0
No. Observations:,12427,AIC:,30220.0
Df Residuals:,12412,BIC:,30330.0
Df Model:,14,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,7.7700,0.014,538.261,0.000,7.742,7.798
C(tipo_renda)[T.Bolsista],0.0001,0.272,0.001,1.000,-0.534,0.534
C(tipo_renda)[T.Empresário],0.1028,0.017,6.115,0.000,0.070,0.136
C(tipo_renda)[T.Pensionista],-0.3184,0.272,-1.170,0.242,-0.852,0.215
C(tipo_renda)[T.Servidor público],-0.0016,0.025,-0.065,0.948,-0.051,0.048
"C(educacao, Treatment(2))[T.Primário]",0.0792,0.081,0.974,0.330,-0.080,0.238
"C(educacao, Treatment(2))[T.Pós graduação]",0.0630,0.160,0.393,0.694,-0.251,0.378
"C(educacao, Treatment(2))[T.Superior completo]",0.0811,0.016,5.218,0.000,0.051,0.112
"C(educacao, Treatment(2))[T.Superior incompleto]",-0.0458,0.036,-1.271,0.204,-0.116,0.025

0,1,2,3
Omnibus:,32.036,Durbin-Watson:,2.031
Prob(Omnibus):,0.0,Jarque-Bera (JB):,32.268
Skew:,0.124,Prob(JB):,9.84e-08
Kurtosis:,2.974,Cond. No.,382.0


In [39]:
# item 3.F - removal of the variable tempo_emprego


y, x = patsy.dmatrices(
    'np.log(renda) ~ C(tipo_renda) + C(educacao, Treatment(2)) + C(tipo_residencia, Treatment(1))',
    df1    
)
sm.OLS(y, x).fit().summary()


0,1,2,3
Dep. Variable:,np.log(renda),R-squared:,0.008
Model:,OLS,Adj. R-squared:,0.007
Method:,Least Squares,F-statistic:,7.72
Date:,"Thu, 14 Jul 2022",Prob (F-statistic):,1.66e-15
Time:,01:05:10,Log-Likelihood:,-16266.0
No. Observations:,12427,AIC:,32560.0
Df Residuals:,12413,BIC:,32660.0
Df Model:,13,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,8.2273,0.012,664.371,0.000,8.203,8.252
C(tipo_renda)[T.Bolsista],0.1624,0.299,0.543,0.587,-0.424,0.749
C(tipo_renda)[T.Empresário],0.0341,0.018,1.853,0.064,-0.002,0.070
C(tipo_renda)[T.Pensionista],-0.3594,0.299,-1.202,0.230,-0.946,0.227
C(tipo_renda)[T.Servidor público],0.1510,0.027,5.519,0.000,0.097,0.205
"C(educacao, Treatment(2))[T.Primário]",0.0075,0.089,0.084,0.933,-0.167,0.182
"C(educacao, Treatment(2))[T.Pós graduação]",-0.1446,0.176,-0.821,0.412,-0.490,0.201
"C(educacao, Treatment(2))[T.Superior completo]",0.0642,0.017,3.757,0.000,0.031,0.098
"C(educacao, Treatment(2))[T.Superior incompleto]",-0.1060,0.040,-2.681,0.007,-0.184,-0.028

0,1,2,3
Omnibus:,231.127,Durbin-Watson:,2.037
Prob(Omnibus):,0.0,Jarque-Bera (JB):,254.697
Skew:,0.307,Prob(JB):,4.94e-56
Kurtosis:,3.34,Cond. No.,42.3


The last model (item 3.F) does not look better than the first one: its R-squared is very low (0.008) and its AIC is much higher (32560 versus 27190). Among the simpler models, the one from item 3.E seems to be the best, since dropping idade left R-squared unchanged at 0.178 and its remaining variables show better significance in terms of p-value.
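
The comparison can be laid out side by side. The sketch below only collects the R-squared and AIC values already printed in the summaries of this notebook and marks the model with the lowest AIC; it does not refit anything.

```python
# (R-squared, AIC) copied from each summary in this notebook.
models = {
    "item 1": (0.357, 27190.0),
    "item 2": (0.208, 29780.0),
    "item 3.A": (0.207, 29800.0),
    "item 3.B": (0.184, 30150.0),
    "item 3.C": (0.180, 30200.0),
    "item 3.D": (0.178, 30220.0),
    "item 3.E": (0.178, 30220.0),
    "item 3.F": (0.008, 32560.0),
}

best_by_aic = min(models, key=lambda name: models[name][1])
for name, (r2, aic_value) in models.items():
    marker = " <- lowest AIC" if name == best_by_aic else ""
    print(f"{name:9s} R2={r2:.3f} AIC={aic_value:.0f}{marker}")
```

By both indicators the initial model comes out ahead, and the largest single losses occur when sexo (item 2) and tempo_emprego (item 3.F) are removed.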