<a href="https://colab.research.google.com/github/Idalen/enem-score-predictor/blob/main/notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ML Project

In [1]:
import numpy as np
import pandas as pd

import json

import plotly.express as px
from matplotlib import pyplot as plt
import seaborn as sns

from sklearn.linear_model import ElasticNet
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
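# NOTE: despite the alias, mean_squared_error returns the MSE; take np.sqrt of the result to get the RMSE.
from sklearn.metrics import mean_squared_error as RMSE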

from pathlib import Path
from google.colab import drive

# Reducing memory usage

Because our dataset consumes a lot of memory, we applied a few strategies to reduce the memory used by Pandas.
First, we changed the data types of the columns to types that take up fewer bytes, and then converted the file to the *.parquet format, which has better support for data compression.
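
As a minimal sketch of the dtype-reduction idea described above, the snippet below downcasts a toy frame with `pd.to_numeric` and the `category` dtype. The frame and its column names are made up for illustration and are not the ENEM data; the Parquet conversion step is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for the real dataset.
n = 1_000
df = pd.DataFrame({
    "age": np.tile(np.array([17, 18, 35, 60], dtype=np.int64), n // 4),
    "score": np.tile(np.array([512.5, 640.0, 455.25, 701.75]), n // 4),
    "state": np.tile(np.array(["SP", "RJ", "SP", "BA"], dtype=object), n // 4),
})
before = df.memory_usage(deep=True).sum()

slim = df.copy()
# Pick the smallest integer / float type that still holds every value.
slim["age"] = pd.to_numeric(slim["age"], downcast="integer")
slim["score"] = pd.to_numeric(slim["score"], downcast="float")
# Repeated strings are stored once and referenced by small integer codes.
slim["state"] = slim["state"].astype("category")
after = slim.memory_usage(deep=True).sum()

print(slim.dtypes)
print(f"{before} bytes -> {after} bytes")
```

Downcasting floats trades precision for space, so it is worth checking that the columns involved tolerate `float32` before applying it to a whole frame.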

In [None]:
def reduce_mem_usage(df):
    """ iterate through all the columns of a dataframe and modify the data type
        to reduce memory usage.        
    """
    start_mem = df.memory_usage().sum() / 1024**2
    print('Memory usage of dataframe is {:.2f} MB'.format(start_mem))
    
    for col in df.columns:
        col_type = df[col].dtype
        
        if col_type != object:
            c_min = df[col].min()
            c_max = df[col].max()
            if str(col_type)[:3] == 'int':
                # Inclusive bounds: a column that touches a dtype's limits still fits in it.
                if c_min >= np.iinfo(np.int8).min and c_max <= np.iinfo(np.int8).max:
                    df[col] = df[col].astype(np.int8)
                elif c_min >= np.iinfo(np.int16).min and c_max <= np.iinfo(np.int16).max:
                    df[col] = df[col].astype(np.int16)
                elif c_min >= np.iinfo(np.int32).min and c_max <= np.iinfo(np.int32).max:
                    df[col] = df[col].astype(np.int32)
                else:
                    df[col] = df[col].astype(np.int64)
            else:
                if c_min > np.finfo(np.float32).min and c_max < np.finfo(np.float32).max:
                    df[col] = df[col].astype(np.float32)
                else:
                    df[col] = df[col].astype(np.float64)
        else:
            df[col] = df[col].astype('category')

    end_mem = df.memory_usage().sum() / 1024**2
    print('Memory usage after optimization is: {:.2f} MB'.format(end_mem))
    print('Decreased by {:.1f}%'.format(100 * (start_mem - end_mem) / start_mem))
    
    return df

# Reading the files

In [23]:
path = Path("/content/drive/MyDrive/datasets/dados-enem/")
drive.mount('/content//drive')

Drive already mounted at /content//drive; to attempt to forcibly remount, call drive.mount("/content//drive", force_remount=True).


## Notes:
* To test: whether it is worth dropping absent candidates, by plotting the scores of that group
* How to connect Jupyter over SSH
* https://python.plainenglish.io/how-to-create-a-interative-map-using-plotly-express-geojson-to-brazil-in-python-fb5527ae38fc

## 1) Data cleaning
* Initial EDA
* Handle nulls (remember to discuss and evaluate the best strategies)
* Map the values and apply one-hot encoding
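
The cleaning steps listed above can be sketched on a tiny example. The two frames are hypothetical; the code-to-label dictionary mirrors the style of the `TP_ESCOLA` mapping, and median imputation is only one of the null-handling strategies that could be evaluated. The `reindex` call is an assumption about how one might keep train and test columns aligned after one-hot encoding, not something the notebook itself does.

```python
import pandas as pd

# Hypothetical miniature train/test split.
train = pd.DataFrame({"TP_ESCOLA": [2, 3, 4, 2], "NU_IDADE": [17.0, None, 19.0, 18.0]})
test = pd.DataFrame({"TP_ESCOLA": [2, 3], "NU_IDADE": [None, 20.0]})

labels = {2: "Pública", 3: "Privada", 4: "Exterior"}
median_age = train["NU_IDADE"].median()  # statistic learned on the training split only


def clean(df):
    """Map codes to labels, fill missing ages, and one-hot encode."""
    out = df.copy()
    out["TP_ESCOLA"] = out["TP_ESCOLA"].map(labels)
    out["NU_IDADE"] = out["NU_IDADE"].fillna(median_age)
    return pd.get_dummies(out, columns=["TP_ESCOLA"])


train_enc = clean(train)
# "Exterior" never appears in the test split, so its dummy column is missing there;
# reindexing against the training columns adds it back, filled with zeros.
test_enc = clean(test).reindex(columns=train_enc.columns, fill_value=0)

print(test_enc.columns.tolist())
```

Without the alignment step, a model fitted on `train_enc` would receive a different number of columns at prediction time whenever a category is absent from one of the splits.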

## 2) Preprocessing
* Remove columns (correlated [>80%], low variance, semantics)
* (Optional) Apply PCA
* (Optional) Feature engineering
* Standardize/normalize
* Handle imbalanced data
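
A rough sketch of the preprocessing plan above, on synthetic data: drop near-constant columns, drop one column of each highly correlated pair (using the 80% figure from the list), standardize, and optionally project with PCA. The column names and the variance threshold are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from sklearn.feature_selection import VarianceThreshold
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = pd.DataFrame({
    "a": rng.normal(size=200),
    "b": rng.normal(size=200),
    "constant": np.ones(200),  # zero variance
})
# Almost perfectly correlated with "a".
X["a_copy"] = X["a"] * 2.0 + 0.01 * rng.normal(size=200)

# 1) Drop low-variance columns.
selector = VarianceThreshold(threshold=1e-3)
selector.fit(X)
X = X.loc[:, selector.get_support()]

# 2) Drop one column of every pair with |correlation| > 0.8.
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
X = X.drop(columns=[c for c in upper.columns if (upper[c] > 0.8).any()])

# 3) Standardize, then (optionally) project with PCA.
X_scaled = StandardScaler().fit_transform(X)
X_pca = PCA(n_components=2).fit_transform(X_scaled)

print(list(X.columns), X_pca.shape)
```

In a real train/test setting, the selector, scaler, and PCA would be fitted on the training split only and then applied to the test split.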

## 3) Model
* Linear regression<br>
a. Analyze the weights<br>
b. Apply regularization techniques<br> 

* Decision tree <br>
a. Depth <br>
b. Evaluate the splits (Gini impurity / entropy) <br>

* Naive Bayes <br>
a. Which features significantly affect P(score|feature)<br>
b. GaussianNaiveBayes vs. BernoulliNaiveBayes<br>

* SVM<br>
a. Evaluate the generated hyperplane / where the cut is made <br>
b. Evaluate different kernels <br>
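
To illustrate the "analyze the weights" and "apply regularization" items for the linear model, here is a small sketch on synthetic data in which only the first two features matter. The data, the `alpha` values, and the variable names are assumptions chosen for the example.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
# Synthetic target: only the first two features carry signal.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.normal(size=500)

# Fit the same model with a weak and a strong penalty and compare the weights.
models = {
    alpha: ElasticNet(alpha=alpha, l1_ratio=0.5).fit(X, y)
    for alpha in (0.01, 1.0)
}
for alpha, model in models.items():
    print(f"alpha={alpha}: coef={np.round(model.coef_, 2)}")

weak, strong = models[0.01], models[1.0]
```

With the weak penalty the coefficients stay close to the true weights, while the stronger penalty shrinks all of them and pushes the uninformative ones toward zero, which is the behavior one would look for when inspecting weights on the real features.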

In [29]:
class Model:


  _algorithms = {
      
      'ElasticNet': {
          'estimator':ElasticNet(),
          'parameters':{
              'alpha':[0.001, 0.5, 1.0],
              'l1_ratio': [0, 0.5, 1.0]
          }},

      'DecisionTree': {
          'estimator':DecisionTreeRegressor(),
          'parameters':{
              'max_depth':[100, 90, 80, 70],
              'min_samples_leaf':[1, 10, 20, 50, 100]
          }},

      # 'RandomForest': {
      #     'estimator':RandomForestRegressor(),
      #     'parameters':{
      #         'n_estimators':[11, 31, 51],
      #         'max_depth':[100, 90, 80,],
      #         'min_samples_leaf':[1, 20, 100],
      #     }},

      # 'KNN': {
      #     'estimator':KNeighborsRegressor(),
      #     'parameters':{
      #         'n_neighbors':[5, 23, 47, 83],
      #         'weights':['uniform', 'distance'],
      #         'p':[1, 1.5, 2]
      #     }},

      # 'SVM': {
      #     'estimator':SVR(),
      #     'parameters':{
      #         'kernel':['rbf', 'poly'],
      #         'gamma':[0.01, 0.5, 1.0],
      #         'C':[10, 100, 1000]
      #     }}

  }

  def __init__(self, verbose=True):
    pass

  def load(self, path, verbose=True):

    self.train_df = pd.read_parquet(path/'train.parquet').sample(40000)
    self.test_df = pd.read_parquet(path/'test.parquet').sample(10000)

    if verbose:
      print("Quantidade inicial de elementos no treino:", len(self.train_df))
      print("Quantidade inicial de elementos no teste:", len(self.test_df))
        
    self.train_df.set_index("NU_INSCRICAO", inplace=True)
    self.test_df.set_index("NU_INSCRICAO", inplace=True)

    self._targets = [col for col in self.train_df.columns if "NU_NOTA" in col]



  def prepare(self,verbose=True):

    if verbose:
      print("Mapeando valores...")    
    self._map_values(verbose)

    if verbose:
      print("Criando novas colunas...")
    self._create_features(verbose)

    if verbose:
      print("Eliminando colunas...")
    self._clear_cols(verbose)

    if verbose:
      print("Aplicando get dummies...")
    self._create_dummies(verbose)

    if verbose:
      print("Selecionando features mais importantes")
    self._feature_selection(verbose)


  def tune(self, random_state=0, verbose=True):


    X, Y = self.train_df.drop(columns=self._targets), self.train_df[self._targets] 

    self._results = {}

    gscv = None

    for name, algorithm in self._algorithms.items():
      if verbose:
        print(name)

      self._results[name] = {} 

      for target in self._targets:
        
        gscv = GridSearchCV(algorithm['estimator'], algorithm['parameters'], verbose = 3,
                             scoring='neg_root_mean_squared_error', return_train_score=True)
        gscv.fit(X, Y[target])

        self._results[name][target] = {}
        self._results[name][target]['best_params'] = gscv.best_params_
        self._results[name][target]['best_score'] = gscv.best_score_
  
    return gscv

  def _to_json(self):

    # The `with` block closes the file on exit, so no explicit close() is needed.
    with open('results.json', 'w') as fp:
      json.dump(self._results, fp)

    with open('selecteds.json', 'w') as fp:
      json.dump(self._selecteds, fp)

  def ranking(self, verbose=True):

    self._selecteds = {}

    for algoritmo in self._results:

      for target in self._targets:

        if target not in self._selecteds:
          if verbose:
            print("Chave", target, "estava vazia, vamos colocar o algoritmo", algoritmo)
          self._selecteds[target] = algoritmo
        
        else:
          if self._results[algoritmo][target]['best_score'] > self._results[self._selecteds[target]][target]['best_score']:
            if verbose:
              print("O algoritmo", algoritmo, "se mostrou mais eficiente que o", self._selecteds[target])
            self._selecteds[target] = algoritmo

    self._to_json()


  def predict(self):
    for target in self._targets:
      print("Vamos prever", target, "com o algoritmo", self._selecteds[target], "e hiperparâmetros", self._results[self._selecteds[target]][target]['best_params'])
      # Apply the training (not implemented yet)

  def correlation(self, save=False, plot=True):
    
    fig = px.imshow(self.train_df.corr())
    
    if plot:
      fig.show()

    if save:
      pass


  def plot(self, column):
    
    tmp = self.train_df[column].value_counts()
    fig = px.bar(x=tmp.index, y=tmp.values)
    fig.show()
    
    melted = pd.melt(self.train_df, id_vars=[column], value_vars=self._targets, var_name='TP_NOTA', value_name='NU_NOTA')
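    # Cap the sample size at the number of available rows; sample() raises otherwise.
    fig=px.box(melted.sample(min(len(melted), 1000000)), x='TP_NOTA', y='NU_NOTA', color=column)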
    fig.show()

  def null_analysis(self, plot=True, save=False, verbose=True):
    
    null_count = self.train_df.isna().apply(np.sum, axis=0)/self.train_df.shape[0]
    null_percentage_train = (null_count.loc[null_count!=0]*100).sort_values()
    fig_train = px.bar(x=null_percentage_train.index, y=null_percentage_train.values, title="Porcentagem de valores nulos nos dados de treino")

    null_count = self.test_df.isna().apply(np.sum, axis=0)/self.test_df.shape[0]
    null_percentage_test = (null_count.loc[null_count!=0]*100).sort_values()
    fig_test = px.bar(x=null_percentage_test.index, y=null_percentage_test.values, title="Porcentagem de valores nulos nos dados de teste")

    if plot:
      fig_train.show()
      fig_test.show()

    if save:
      pass

  def _feature_selection(self, verbose):
    
    to_drop = []
    threshold = 0.05
    # Note: the standard deviation (not the variance) is compared against the threshold.
    for col in self.train_df.columns[1:]:
      if self.train_df[col].std() < threshold:
        to_drop.append(col)
    
    self.train_df.drop(columns=to_drop, inplace=True)
    self.test_df.drop(columns=to_drop, inplace=True)
    if verbose:
      print("[VARIANCE THRESHOLD] Removendo colunas:", to_drop)

    #################################################################################

    correlation = self.train_df.corr().abs()

    upper_triangle = correlation.where(np.triu(np.ones(correlation.shape), k=1).astype(bool))

    # Drop every column whose absolute correlation with an earlier column exceeds 0.9
    to_drop = [column for column in upper_triangle.columns if any(upper_triangle[column] > 0.9)]
    
    self.train_df.drop(columns=to_drop, inplace=True)
    self.test_df.drop(columns=to_drop, inplace=True)

    if verbose:
      print('[HIGH CORRELATION] Eliminando colunas redundantes:', to_drop)

    


  def _clear_cols(self, verbose):
    
    null_count = self.train_df.isna().apply(np.sum, axis=0)/self.train_df.shape[0]
    null_percentage_train = (null_count.loc[null_count!=0]*100).sort_values()

    null_count = self.test_df.isna().apply(np.sum, axis=0)/self.test_df.shape[0]
    null_percentage_test = (null_count.loc[null_count!=0]*100).sort_values()

    to_drop_columns_train = list(null_percentage_train[null_percentage_train > 30].index)
    to_drop_columns_test = list(null_percentage_test[null_percentage_test > 30].index)

    if verbose:
      print("[NULLS] Colunas dropadas no treino:", sorted(to_drop_columns_train))
      print("[NULLS] Colunas dropadas no teste:", sorted(to_drop_columns_test))

    self.train_df.drop(columns=to_drop_columns_train, inplace=True)
    self.test_df.drop(columns=to_drop_columns_test, inplace=True)

    ###################################################################################################################

    to_drop = ['CO_MUNICIPIO_RESIDENCIA', 'NO_MUNICIPIO_RESIDENCIA', 'CO_UF_RESIDENCIA', 'CO_MUNICIPIO_NASCIMENTO', 'NO_MUNICIPIO_NASCIMENTO',
    'CO_UF_NASCIMENTO', 'SG_UF_NASCIMENTO', 'TP_ANO_CONCLUIU', 'IN_TREINEIRO', 'CO_MUNICIPIO_PROVA', 'NO_MUNICIPIO_PROVA', 'CO_UF_PROVA',
    'SG_UF_PROVA']

    self.train_df.drop(columns=to_drop, inplace=True)
    self.test_df.drop(columns=to_drop, inplace=True)

    if verbose:
      print(f'[DROP COLUMNS] Colunas retiradas por falta de relevânica:{[to_drop]}')

    ##################################################################################################################

    to_drop = self.train_df[(self.train_df['TP_STATUS_REDACAO'].isna()) & (self.train_df['TP_PRESENCA_CH']=='Presente')].index
    self.train_df.drop(to_drop, inplace=True)

    to_drop = self.test_df[(self.test_df['TP_STATUS_REDACAO'].isna()) & (self.test_df['TP_PRESENCA_CH']=='Presente')].index
    self.test_df.drop(to_drop, inplace=True)

    if verbose:
      print(f'[INCONSISTENCY] Removendo inconsistências.')

    ##################################################################################################################
    

    to_drop = ['NU_NOTA_MT', 'NU_NOTA_CH', 'NU_NOTA_CN', 'NU_NOTA_LC', 'NU_NOTA_REDACAO', 'TP_STATUS_REDACAO']
    self.train_df.dropna(subset=to_drop, inplace=True)

    try:
      self.test_df.dropna(subset=to_drop, inplace=True)
    except KeyError:
      pass  # the test set may not contain every target column

    if verbose:
      print('[NULL TARGETS] Removendo valores nulos nas colunas-alvo')


    #####################################################################################################################

    self.train_df.drop(self.train_df[self.train_df['TP_STATUS_REDACAO'] != 'Sem problemas'].index, inplace=True)
    self.test_df.drop(self.test_df[self.test_df['TP_STATUS_REDACAO'] != 'Sem problemas'].index, inplace=True)

    if verbose:
      print('[::] Removendo redações que tiraram nota 0')


  def _create_dummies(self, verbose):

    cols = [col for col in self.train_df.columns if ((self.train_df[col].dtype == 'object') or (self.train_df[col].dtype.name == 'category'))]

    self.train_df = pd.get_dummies(self.train_df, columns=cols)
    self.test_df = pd.get_dummies(self.test_df, columns=cols)

    if verbose:
      print(f"[GET DUMMIES] Colunas categóricas convertidas: {cols}")


  def _create_features(self, verbose):

    new_columns = []
    filled_columns = []
    ############################################################################################

    uf_regiao = {
      'RR':'Norte', 'AP':'Norte', 'AM':'Norte', 'PA':'Norte', 'AC':'Norte', 'RO':'Norte', 'TO':'Norte', 'MA':'Nordeste',
      'PI':'Nordeste', 'CE':'Nordeste', 'RN':'Nordeste', 'PB':'Nordeste', 'PE':'Nordeste', 'AL':'Nordeste', 'SE':'Nordeste',
      'BA':'Nordeste', 'MT':'Centro-oeste', 'DF':'Centro-oeste', 'GO':'Centro-oeste', 'MS':'Centro-oeste', 'MG':'Sudeste',
      'ES':'Sudeste', 'RJ':'Sudeste', 'SP':'Sudeste', 'PR':'Sul', 'SC':'Sul', 'RS':'Sul', 
      }

    self.train_df['NO_REGIAO_RESIDENCIA'] = self.train_df['SG_UF_RESIDENCIA'].map(uf_regiao)
    self.test_df['NO_REGIAO_RESIDENCIA'] = self.test_df['SG_UF_RESIDENCIA'].map(uf_regiao)

    new_columns.append('NO_REGIAO_RESIDENCIA')

    ############################################################################################

    mean_score_per_reg = self.train_df.groupby("NO_REGIAO_RESIDENCIA")[self._targets].mean()
    for col in self._targets:
      self.train_df["REG_NOTA_"+col.split("_")[2]+"_MEDIA"] = self.train_df['NO_REGIAO_RESIDENCIA'].apply(
          lambda row: mean_score_per_reg[col][row]) 
      self.test_df["REG_NOTA_"+col.split("_")[2]+"_MEDIA"] = self.test_df['NO_REGIAO_RESIDENCIA'].apply(
          lambda row: mean_score_per_reg[col][row]) 

      new_columns.append("REG_NOTA_"+col.split("_")[2]+"_MEDIA")
    ############################################################################################
  
    
    self.train_df['TP_MINORIA_RACIAL'] = ((self.train_df['TP_COR_RACA'] != 'Branca').astype(int) + (self.train_df['TP_COR_RACA'] != 'Amarela').astype(int)) -1
    self.test_df['TP_MINORIA_RACIAL'] = ((self.test_df['TP_COR_RACA'] != 'Branca').astype(int) + (self.test_df['TP_COR_RACA'] != 'Amarela').astype(int)) -1

    new_columns.append('TP_MINORIA_RACIAL')
    ############################################################################################

    cols = [col for col in self.train_df.columns if (("IN_" in col) and ('TREINEIRO' not in col))]

    self.train_df['TP_SITUACAO_ESPECIAL'] = self.train_df[cols].any(axis=1)
    self.test_df['TP_SITUACAO_ESPECIAL'] = self.test_df[cols].any(axis=1)

    new_columns.append('TP_SITUACAO_ESPECIAL')

    #############################################################################################


    self.train_df['TP_SOLTEIRO'] = self.train_df['TP_ESTADO_CIVIL'] == 'Solteiro(a)'
    self.test_df['TP_SOLTEIRO'] = self.test_df['TP_ESTADO_CIVIL'] == 'Solteiro(a)'

    new_columns.append('TP_SOLTEIRO')

    #############################################################################################

    median_train = self.train_df.loc[self.train_df['NU_IDADE'].notnull(), 'NU_IDADE'].median()
    
    self.train_df['NU_IDADE'] = self.train_df['NU_IDADE'].fillna(median_train)
    self.test_df['NU_IDADE'] = self.test_df['NU_IDADE'].fillna(median_train)
  
    filled_columns.append("NU_IDADE")
    #############################################################################################

    if verbose:
      print(f'[FEATURE ENGINEERING] Novas colunas: {new_columns}')
      print(f'[INPUTATION] Colunas com valores nulos preenchidos: {filled_columns}')
      


  
  def _map_values(self, verbose):
    #################################################################
    rename = {0:"0",#np.NaN,
      1:"Solteiro(a)",
      2:"Casado(a)/Mora com companheiro(a)",
      3:"Divorciado(a)/Desquitado(a)/Separado(a)",
      4:"Viúvo(a)"}

    self.train_df['TP_ESTADO_CIVIL'] = self.train_df['TP_ESTADO_CIVIL'].map(rename)
    self.test_df['TP_ESTADO_CIVIL'] = self.test_df['TP_ESTADO_CIVIL'].map(rename)

    #################################################################
    rename = {0:"0",#np.NaN,
      1:"Branca",
      2:"Preta",
      3:"Parda",
      4:"Amarela",
      5:"Indígena"}

    self.train_df['TP_COR_RACA'] = self.train_df['TP_COR_RACA'].map(rename)
    self.test_df['TP_COR_RACA'] = self.test_df['TP_COR_RACA'].map(rename)

    #################################################################
    rename = {0:"0",#np.NaN,
      1:"Brasileiro(a)",
      2:"Brasileiro(a) Naturalizado(a)",
      3:"Estrangeiro(a)",
      4:"Brasileiro(a) Nato(a), nascido(a) no exterior"
      }

    self.train_df['TP_NACIONALIDADE'] = self.train_df['TP_NACIONALIDADE'].map(rename)
    self.test_df['TP_NACIONALIDADE'] = self.test_df['TP_NACIONALIDADE'].map(rename)

    #################################################################
    rename = {1:"Já concluí o Ensino Médio",
      2:"Estou cursando e concluirei o Ensino Médio no ano corrente",
      3:"Estou cursando e concluirei o Ensino Médio após o ano corrente",
      4:"Não concluí e não estou cursando o Ensino Médio"
      }

    self.train_df['TP_ST_CONCLUSAO'] = self.train_df['TP_ST_CONCLUSAO'].map(rename)
    self.test_df['TP_ST_CONCLUSAO'] = self.test_df['TP_ST_CONCLUSAO'].map(rename)

    #################################################################
    rename = {0:"0",#np.NaN,
      1:"2018",
      2:"2017",
      3:"2016",
      4:"2015",
      5:"2014",
      6:"2013",
      7:"2012",
      8:"2011",
      9:"2010",
      10:"2009",
      11:"2008",
      12:"2007",
      13:"Antes de 2007"}

    self.train_df['TP_ANO_CONCLUIU'] = self.train_df['TP_ANO_CONCLUIU'].map(rename)
    self.test_df['TP_ANO_CONCLUIU'] = self.test_df['TP_ANO_CONCLUIU'].map(rename)

    #################################################################
    rename = {1:"0",#np.NaN,
      2:"Pública",
      3:"Privada",
      4:"Exterior"}

    self.train_df['TP_ESCOLA'] = self.train_df['TP_ESCOLA'].map(rename)
    self.test_df['TP_ESCOLA'] = self.test_df['TP_ESCOLA'].map(rename)

    #################################################################
    rename = {1:"Federal",
      2:"Estadual",
      3:"Municipal",
      4:"Privada"}

    self.train_df['TP_DEPENDENCIA_ADM_ESC'] = self.train_df['TP_DEPENDENCIA_ADM_ESC'].map(rename)
    self.test_df['TP_DEPENDENCIA_ADM_ESC'] = self.test_df['TP_DEPENDENCIA_ADM_ESC'].map(rename)

    #################################################################
    rename = {1:"Ensino Regular",
      2:"Educação Especial - Modalidade Substitutiva",
      3:"Educação de Jovens e Adultos"}

    self.train_df['TP_ENSINO'] = self.train_df['TP_ENSINO'].map(rename)
    self.test_df['TP_ENSINO'] = self.test_df['TP_ENSINO'].map(rename)

    #################################################################
    rename = {0:"Ausente",
      1:"Presente",
      2:"Eliminado"}

    for c in [col for col in self.train_df.columns if "TP_PRESENCA" in col]:
      self.train_df[c] = self.train_df[c].map(rename)
      self.test_df[c] = self.test_df[c].map(rename)

    #################################################################
    rename = {
        1:"Sem problemas",
        2:"Anulada",
        3:"Copiou texto motivador",
        4:"Em branco",
        6:"Fuga ao tema",
        7:"Não atende tipo textual",
        8:"Texto insuficiente",
        9:"Parte desconectada"
      }

    self.train_df['TP_STATUS_REDACAO'] = self.train_df['TP_STATUS_REDACAO'].map(rename)
    self.test_df['TP_STATUS_REDACAO'] = self.test_df['TP_STATUS_REDACAO'].map(rename)

    #################################################################
    # Questionnaire answers: letters become ordinal integers.
    # Q005 is already numeric, so it is left untouched.
    def ordinal(letters):
      # 'ABC' -> {'A': 1, 'B': 2, 'C': 3}
      return {letter: rank for rank, letter in enumerate(letters, start=1)}

    scale_5 = ['Q008', 'Q009', 'Q010', 'Q011', 'Q012', 'Q013', 'Q014', 'Q015',
               'Q016', 'Q017', 'Q019', 'Q022', 'Q024']
    binary = ['Q018', 'Q020', 'Q021', 'Q023', 'Q025']

    questionnaire = {
        'Q001': {**ordinal('ABCDEFG'), 'H': 0},
        'Q002': {**ordinal('ABCDEFG'), 'H': 0},
        'Q003': {**ordinal('ABCDE'), 'F': 0},
        'Q004': {**ordinal('ABCDE'), 'F': 0},
        'Q006': ordinal('ABCDEFGHIJKLMNOPQ'),
        'Q007': ordinal('ABCD'),
        **{col: ordinal('ABCDE') for col in scale_5},
        **{col: {'A': 0, 'B': 1} for col in binary},
    }

    for col, rename in questionnaire.items():
      self.train_df[col] = self.train_df[col].map(rename).astype(int)
      self.test_df[col] = self.test_df[col].map(rename).astype(int)
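
The `tune` method scores candidates with `neg_root_mean_squared_error`, so every score reported by `GridSearchCV` is a negated RMSE and the "best" score is the one closest to zero. The following standalone sketch, on synthetic data, shows one way to read the results back as plain RMSE values. The data, the grid, and the `max_iter` setting are assumptions for the example; `l1_ratio=0` is left out of the grid because pure L2 penalties tend to converge slowly with coordinate descent.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))
y = X @ np.array([2.0, -1.0, 0.0, 0.5]) + rng.normal(scale=0.5, size=300)

search = GridSearchCV(
    ElasticNet(max_iter=10_000),
    {"alpha": [0.001, 0.5, 1.0], "l1_ratio": [0.1, 0.5, 1.0]},
    scoring="neg_root_mean_squared_error",
    cv=5,
    return_train_score=True,
)
search.fit(X, y)

# Scores are negated RMSE values, so flip the sign to read them as errors.
results = search.cv_results_
table = pd.DataFrame({
    "alpha": [p["alpha"] for p in results["params"]],
    "l1_ratio": [p["l1_ratio"] for p in results["params"]],
    "train_rmse": -results["mean_train_score"],
    "test_rmse": -results["mean_test_score"],
})

print(table.sort_values("test_rmse").head(3))
print("best:", search.best_params_, "RMSE:", -search.best_score_)
```

Comparing `train_rmse` with `test_rmse` per candidate is a quick way to spot overfitting, which is why `return_train_score=True` is useful during tuning.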

In [30]:
model = Model()
model.load(path)

Quantidade inicial de elementos no treino: 40000
Quantidade inicial de elementos no teste: 10000


In [31]:
model.prepare()

Mapeando valores...
Criando novas colunas...
[FEATURE ENGINEERING] Novas colunas: ['NO_REGIAO_RESIDENCIA', 'REG_NOTA_CN_MEDIA', 'REG_NOTA_CH_MEDIA', 'REG_NOTA_LC_MEDIA', 'REG_NOTA_MT_MEDIA', 'REG_NOTA_REDACAO_MEDIA', 'TP_MINORIA_RACIAL', 'TP_SITUACAO_ESPECIAL', 'TP_SOLTEIRO']
[INPUTATION] Colunas com valores nulos preenchidos: ['NU_IDADE']
Eliminando colunas...
[NULLS] Colunas dropadas no treino: ['CO_ESCOLA', 'CO_MUNICIPIO_ESC', 'CO_UF_ESC', 'NO_MUNICIPIO_ESC', 'SG_UF_ESC', 'TP_DEPENDENCIA_ADM_ESC', 'TP_ENSINO', 'TP_LOCALIZACAO_ESC', 'TP_SIT_FUNC_ESC']
[NULLS] Colunas dropadas no teste: ['CO_ESCOLA', 'CO_MUNICIPIO_ESC', 'CO_UF_ESC', 'NO_MUNICIPIO_ESC', 'SG_UF_ESC', 'TP_DEPENDENCIA_ADM_ESC', 'TP_ENSINO', 'TP_LOCALIZACAO_ESC', 'TP_SIT_FUNC_ESC']
[DROP COLUMNS] Colunas retiradas por falta de relevânica:[['CO_MUNICIPIO_RESIDENCIA', 'NO_MUNICIPIO_RESIDENCIA', 'CO_UF_RESIDENCIA', 'CO_MUNICIPIO_NASCIMENTO', 'NO_MUNICIPIO_NASCIMENTO', 'CO_UF_NASCIMENTO', 'SG_UF_NASCIMENTO', 'TP_ANO_CONCLUIU',

In [32]:
m = model.tune()

ElasticNet
Fitting 5 folds for each of 9 candidates, totalling 45 fits


  coef_, l1_reg, l2_reg, X, y, max_iter, tol, rng, random, positive


[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.531, test=-64.730) total time=   2.7s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.809, test=-63.566) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.634, test=-64.278) total time=   1.9s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.172, test=-66.101) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.468, test=-64.959) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-64.528, test=-64.730) total time=   1.7s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-64.806, test=-63.572) total time=   1.8s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-64.632, test=-64.278) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-64.170, test=-66.101) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-64.466, test=-64.962) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-64.524, test=-64.744) total time=   1.7s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-64.803, test=-63.588) total time=   1.7s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-64.630, test=-64.277) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-64.168, test=-66.099) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-64.464, test=-64.965) total time=   1.7s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-65.656, test=-66.093) total time=   2.0s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-65.935, test=-64.902) total time=   1.9s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-65.832, test=-65.276) total time=   2.0s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-65.396, test=-66.879) total time=   2.0s
[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-65.717, test=-65.628) total time=   2.0s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.454, test=-65.849) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.733, test=-64.651) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.617, test=-65.049) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.163, test=-66.720) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.487, test=-65.460) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-65.043, test=-65.265) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-65.328, test=-64.119) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-65.186, test=-64.590) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-64.692, test=-66.473) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-65.035, t
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-66.062, test=-66.557) total time=   1.9s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-66.351, test=-65.345) total time=   2.1s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-66.258, test=-65.705) total time=   2.0s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-65.844, test=-67.231) total time=   1.9s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-66.164, test=-66.009) total time=   2.0s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-65.895, test=-66.358) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-66.184, test=-65.143) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-66.084, test=-65.519) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-65.646, test=-67.080) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-65.971, test=-65.856) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.403, test=-65.703) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.688, test=-64.553) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.553, test=-64.943) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.034, test=-66.640) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.397, t

Fitting 5 folds for each of 9 candidates, totalling 45 fits
[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-70.056, test=-69.434) total time=   2.7s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-70.015, test=-69.586) total time=   1.8s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-69.988, test=-69.710) total time=   1.9s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-69.562, test=-71.373) total time=   1.9s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-69.819, test=-70.391) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-70.052, test=-69.436) total time=   1.7s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-70.013, test=-69.580) total time=   1.7s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-69.985, test=-69.709) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-69.559, test=-71.378) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-69.817, test=-70.394) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-70.046, test=-69.458) total time=   1.6s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-70.010, test=-69.573) total time=   1.6s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-69.983, test=-69.709) total time=   1.6s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-69.556, test=-71.385) total time=   1.6s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-69.816, test=-70.396) total time=   1.6s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-71.044, test=-70.719) total time=   2.0s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-71.006, test=-70.801) total time=   2.0s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-71.070, test=-70.364) total time=   1.9s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-70.650, test=-72.044) total time=   2.0s
[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-70.916, test=-71.041) total time=   1.9s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-70.874, test=-70.489) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-70.824, test=-70.580) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-70.866, test=-70.210) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-70.455, test=-71.891) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-70.720, test=-70.866) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-70.508, test=-69.934) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-70.473, test=-70.140) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-70.451, test=-69.938) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-70.055, test=-71.599) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-70.317, t
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-71.387, test=-71.134) total time=   2.0s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-71.365, test=-71.210) total time=   1.9s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-71.459, test=-70.688) total time=   2.0s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-71.031, test=-72.336) total time=   2.0s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-71.300, test=-71.391) total time=   2.0s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-71.246, test=-70.966) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-71.217, test=-71.025) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-71.288, test=-70.534) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-70.866, test=-72.214) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-71.137, test=-71.236) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-70.826, test=-70.314) total time=   0.2s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-70.776, test=-70.532) total time=   0.2s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-70.802, test=-70.142) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-70.379, test=-71.855) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-70.647, t

Fitting 5 folds for each of 9 candidates, totalling 45 fits
[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-53.542, test=-52.669) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-53.605, test=-52.394) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-53.354, test=-53.425) total time=   1.8s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-53.011, test=-54.733) total time=   1.9s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-53.146, test=-54.232) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-53.539, test=-52.672) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-53.603, test=-52.397) total time=   1.8s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-53.352, test=-53.424) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-53.009, test=-54.735) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-53.145, test=-54.231) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-53.536, test=-52.683) total time=   1.6s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-53.601, test=-52.404) total time=   1.6s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-53.351, test=-53.424) total time=   1.6s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-53.008, test=-54.738) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-53.144, test=-54.228) total time=   2.4s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-54.423, test=-53.636) total time=   2.2s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-54.514, test=-53.249) total time=   2.0s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-54.272, test=-54.188) total time=   2.0s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-53.936, test=-55.500) total time=   2.0s
[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-54.093, test=-54.851) total time=   2.0s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-54.275, test=-53.468) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-54.360, test=-53.084) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-54.120, test=-54.017) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-53.779, test=-55.376) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-53.929, test=-54.712) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-53.994, test=-53.096) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-54.059, test=-52.802) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-53.820, test=-53.705) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-53.472, test=-55.124) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-53.615, t
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-54.758, test=-53.980) total time=   2.0s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-54.856, test=-53.620) total time=   2.0s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-54.622, test=-54.525) total time=   1.9s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-54.285, test=-55.817) total time=   1.9s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-54.452, test=-55.169) total time=   1.9s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-54.658, test=-53.880) total time=   0.2s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-54.748, test=-53.494) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-54.518, test=-54.395) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-54.171, test=-55.744) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-54.338, test=-55.056) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-54.292, test=-53.429) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-54.347, test=-53.088) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-54.121, test=-53.958) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-53.781, test=-55.448) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-53.933, t

Fitting 5 folds for each of 9 candidates, totalling 45 fits
[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-90.469, test=-90.587) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-90.426, test=-90.719) total time=   1.8s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-90.387, test=-90.906) total time=   1.8s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-90.370, test=-90.987) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-90.506, test=-90.448) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-90.464, test=-90.598) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-90.422, test=-90.724) total time=   1.8s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-90.383, test=-90.907) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-90.367, test=-90.980) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-90.504, test=-90.447) total time=   1.7s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-90.460, test=-90.621) total time=   1.7s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-90.419, test=-90.735) total time=   1.6s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-90.381, test=-90.911) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-90.366, test=-90.972) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-90.503, test=-90.444) total time=   1.6s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-92.597, test=-92.448) total time=   2.0s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-92.537, test=-92.616) total time=   2.0s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-92.491, test=-92.860) total time=   1.9s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-92.451, test=-92.894) total time=   1.9s
[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-92.629, test=-92.284) total time=   1.9s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-92.056, test=-91.885) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-91.993, test=-92.073) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-91.956, test=-92.302) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-91.896, test=-92.416) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-92.086, test=-91.757) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-90.976, test=-90.808) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-90.924, test=-91.061) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-90.886, test=-91.187) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-90.826, test=-91.507) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-91.018, t
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-93.342, test=-93.207) total time=   2.0s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-93.294, test=-93.336) total time=   1.9s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-93.244, test=-93.618) total time=   1.9s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-93.220, test=-93.577) total time=   1.9s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-93.386, test=-93.037) total time=   2.0s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-92.878, test=-92.724) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-92.822, test=-92.857) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-92.779, test=-93.137) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-92.728, test=-93.150) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-92.909, test=-92.571) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-91.487, test=-91.270) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-91.415, test=-91.512) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-91.398, test=-91.676) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-91.301, test=-91.899) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-91.510, t

Fitting 5 folds for each of 9 candidates, totalling 45 fits
[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-137.246, test=-136.103) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-136.851, test=-137.655) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-136.733, test=-138.198) total time=   1.8s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-136.235, test=-140.146) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-137.627, test=-134.618) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-137.233, test=-136.117) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-136.841, test=-137.650) total time=   1.7s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-136.723, test=-138.187) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-136.226, test=-140.145) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-137.620, test=-134.609) total time=   1.7s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-137.223, test=-136.162) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-136.835, test=-137.653) total time=   1.7s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-136.716, test=-138.187) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-136.221, test=-140.145) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-137.616, test=-134.592) total time=   1.7s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-140.427, test=-139.378) total time=   1.9s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-140.026, test=-140.919) total time=   1.9s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-139.876, test=-141.467) total time=   2.0s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-139.560, test=-142.574) total time=   1.9s
[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-140.957, test=-137.047) total time=   2.0s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-139.765, test=-138.640) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-139.356, test=-140.224) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-139.191, test=-140.825) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-138.850, test=-141.980) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-140.246, test=-136.382) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-137.917, test=-136.522) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-137.514, test=-138.260) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-137.362, test=-138.984) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-136.917, test=-140.513) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, sco
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-141.256, test=-140.238) total time=   2.0s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-140.868, test=-141.771) total time=   2.0s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-140.731, test=-142.292) total time=   1.9s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-140.438, test=-143.317) total time=   1.9s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-141.827, test=-137.909) total time=   1.9s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-140.748, test=-139.686) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-140.342, test=-141.236) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-140.191, test=-141.796) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-139.886, test=-142.854) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-141.286, test=-137.346) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-138.639, test=-137.184) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-138.161, test=-138.926) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-137.996, test=-139.650) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-137.590, test=-141.055) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, sco

DecisionTree
Fitting 5 folds for each of 20 candidates, totalling 100 fits
[CV 1/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.349, test=-93.131) total time=   0.5s
[CV 2/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.457, test=-91.870) total time=   0.5s
[CV 3/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.457, test=-93.427) total time=   0.5s
[CV 4/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.000, test=-92.353) total time=   0.5s
[CV 5/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.294, test=-91.871) total time=   0.5s
[CV 1/5] END max_depth=100, min_samples_leaf=10;, score=(train=-55.136, test=-72.654) total time=   0.3s
[CV 2/5] END max_depth=100, min_samples_leaf=10;, score=(train=-55.364, test=-72.190) total time=   0.3s
[CV 3/5] END max_depth=100, min_samples_leaf=10;, score=(train=-55.292, test=-73.234) total time=   0.3s
[CV 4/5] END max_depth=100, min_samples_leaf=10;, score=(train=-54.866, test=-73.229) total tim
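
The ElasticNet log above can be read as follows: each search tries 3 values of `alpha` and 3 values of `l1_ratio`, which gives 9 candidates, and every candidate is fitted on 5 folds, hence the 45 fits. The scores are negative, presumably because scikit-learn negates error metrics so that a higher score is always better; the exact `scoring` used inside `model.tune()` is not shown here. A minimal sketch in plain Python, with the grid values and fold scores copied from the log and illustrative variable names:

```python
from itertools import product
from statistics import mean

# Grid values as they appear in the ElasticNet log (not read from the model itself).
alphas = [0.001, 0.5, 1.0]
l1_ratios = [0, 0.5, 1.0]
n_folds = 5

# Every (alpha, l1_ratio) pair is one candidate; each is fitted once per fold.
candidates = list(product(alphas, l1_ratios))
total_fits = len(candidates) * n_folds
print(len(candidates), total_fits)  # 9 45

# Test scores of the five folds for alpha=0.001, l1_ratio=0 in the first search.
fold_test_scores = [-64.730, -63.566, -64.278, -66.101, -64.959]

# A candidate is summarised by the mean of its fold scores;
# flipping the sign turns the negated score back into an error.
mean_test_score = mean(fold_test_scores)
error = -mean_test_score
print(round(error, 3))  # 64.727
```

The candidate with the highest mean score, that is, the smallest error, is the one a grid search reports as best.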

In [33]:
model.ranking()
model.predict()

Key NU_NOTA_CN was empty; assigning the ElasticNet algorithm
Key NU_NOTA_CH was empty; assigning the ElasticNet algorithm
Key NU_NOTA_LC was empty; assigning the ElasticNet algorithm
Key NU_NOTA_MT was empty; assigning the ElasticNet algorithm
Key NU_NOTA_REDACAO was empty; assigning the ElasticNet algorithm


AttributeError: ignored
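
The body of `model.ranking()` is not shown in this section; judging by the messages above, it assigns one algorithm to each target column. A minimal sketch of one way to do that, assuming a nested dict shaped like `model._results` and keeping, per target, the algorithm with the highest (least negative) `best_score`. The function name `rank_algorithms` is hypothetical, and the scores are the ones `model._results` shows for three of the targets:

```python
# Scores as shown by model._results for three targets (other fields omitted).
results = {
    "DecisionTree": {
        "NU_NOTA_CH": {"best_score": -71.36979527895173},
        "NU_NOTA_CN": {"best_score": -66.06008092254851},
        "NU_NOTA_LC": {"best_score": -54.38656482726235},
    },
    "ElasticNet": {
        "NU_NOTA_CH": {"best_score": -69.47536198096927},
        "NU_NOTA_CN": {"best_score": -64.47326888352583},
        "NU_NOTA_LC": {"best_score": -52.89958269748065},
    },
}


def rank_algorithms(results):
    """Map each target to the algorithm with the highest best_score.

    Hypothetical helper: the notebook's own ranking() may work differently.
    """
    best = {}
    for algorithm, targets in results.items():
        for target, info in targets.items():
            current = best.get(target)
            # Scores are negated errors, so "greater" means "better".
            if (current is None
                    or info["best_score"] > results[current][target]["best_score"]):
                best[target] = algorithm
    return best


ranking = rank_algorithms(results)
print(ranking)
# {'NU_NOTA_CH': 'ElasticNet', 'NU_NOTA_CN': 'ElasticNet', 'NU_NOTA_LC': 'ElasticNet'}
```

With these scores ElasticNet wins on all three targets, by roughly 1.5 to 1.9 points of score in each case.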

In [7]:
model._results

{'DecisionTree': {'NU_NOTA_CH': {'best_params': {'max_depth': 70,
    'min_samples_leaf': 100},
   'best_score': -71.36979527895173},
  'NU_NOTA_CN': {'best_params': {'max_depth': 100, 'min_samples_leaf': 100},
   'best_score': -66.06008092254851},
  'NU_NOTA_LC': {'best_params': {'max_depth': 100, 'min_samples_leaf': 100},
   'best_score': -54.38656482726235},
  'NU_NOTA_MT': {'best_params': {'max_depth': 100, 'min_samples_leaf': 100},
   'best_score': -93.01160204110991},
  'NU_NOTA_REDACAO': {'best_params': {'max_depth': 90,
    'min_samples_leaf': 100},
   'best_score': -140.58504289884291}},
 'ElasticNet': {'NU_NOTA_CH': {'best_params': {'alpha': 0.001, 'l1_ratio': 0},
   'best_score': -69.47536198096927},
  'NU_NOTA_CN': {'best_params': {'alpha': 0.001, 'l1_ratio': 0},
   'best_score': -64.47326888352583},
  'NU_NOTA_LC': {'best_params': {'alpha': 0.001, 'l1_ratio': 0},
   'best_score': -52.89958269748065},
  'NU_NOTA_MT': {'best_params': {'alpha': 0.001, 'l1_ratio': 0},
   'best