<a href="https://colab.research.google.com/github/Idalen/enem-score-predictor/blob/main/notebook.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ML Project

In [1]:
import numpy as np
import pandas as pd

import json

import plotly.express as px
from matplotlib import pyplot as plt
import seaborn as sns

from sklearn.linear_model import ElasticNet
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR

from sklearn.model_selection import train_test_split
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import mean_squared_error as RMSE  # NOTE: computes MSE by default; take the square root to obtain RMSE

from pathlib import Path
from google.colab import drive

# Reducing memory usage

Because of the memory footprint of our dataset, we applied a few strategies to reduce the memory used by Pandas.
First, we cast each column to a data type that occupies fewer bytes; then we converted the file to the Parquet format (`.parquet`), which has better support for data compression.

In [None]:
def reduce_mem_usage(df):
    """ iterate through all the columns of a dataframe and modify the data type
        to reduce memory usage.        
    """
    start_mem = df.memory_usage().sum() / 1024**2
    print('Memory usage of dataframe is {:.2f} MB'.format(start_mem))
    
    for col in df.columns:
        col_type = df[col].dtype
        
        if col_type != object:
            c_min = df[col].min()
            c_max = df[col].max()
            if str(col_type)[:3] == 'int':
                # Inclusive bounds: a column whose extremes sit exactly on a type's limits still fits that type
                if c_min >= np.iinfo(np.int8).min and c_max <= np.iinfo(np.int8).max:
                    df[col] = df[col].astype(np.int8)
                elif c_min >= np.iinfo(np.int16).min and c_max <= np.iinfo(np.int16).max:
                    df[col] = df[col].astype(np.int16)
                elif c_min >= np.iinfo(np.int32).min and c_max <= np.iinfo(np.int32).max:
                    df[col] = df[col].astype(np.int32)
                else:
                    df[col] = df[col].astype(np.int64)
            else:
                if c_min > np.finfo(np.float32).min and c_max < np.finfo(np.float32).max:
                    df[col] = df[col].astype(np.float32)
                else:
                    df[col] = df[col].astype(np.float64)
        else:
            df[col] = df[col].astype('category')

    end_mem = df.memory_usage().sum() / 1024**2
    print('Memory usage after optimization is: {:.2f} MB'.format(end_mem))
    print('Decreased by {:.1f}%'.format(100 * (start_mem - end_mem) / start_mem))
    
    return df
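
To make the effect of downcasting concrete, here is a minimal sketch on synthetic data. The frame and its column names (`age`, `score`, `state`) are invented for the example and are not taken from the ENEM files; it uses `pd.to_numeric(..., downcast=...)` as an alternative to the manual range checks above.

```python
import numpy as np
import pandas as pd

# Synthetic data; the column names are hypothetical.
rng = np.random.default_rng(0)
demo = pd.DataFrame({
    "age": rng.integers(15, 80, size=10_000),               # 64-bit integers by default
    "score": rng.uniform(0, 1000, size=10_000),             # 64-bit floats by default
    "state": rng.choice(["SP", "RJ", "MG"], size=10_000),   # repeated strings
})

# deep=True also counts the memory held by the string values themselves.
before = demo.memory_usage(deep=True).sum()

small = demo.copy()
small["age"] = pd.to_numeric(small["age"], downcast="integer")    # smallest integer type that fits
small["score"] = pd.to_numeric(small["score"], downcast="float")  # float32
small["state"] = small["state"].astype("category")                # integer codes + lookup table

after = small.memory_usage(deep=True).sum()
print(f"{before / 1024**2:.2f} MB -> {after / 1024**2:.2f} MB")
```

Two caveats apply to this sketch: `float32` keeps only about seven significant decimal digits, so the cast is lossy for values that need more precision, and `memory_usage()` without `deep=True` undercounts string columns, which can make the reported savings look smaller or larger than they really are.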

# Reading the files

In [23]:
path = Path("/content/drive/MyDrive/datasets/dados-enem/")
drive.mount('/content//drive')

Drive already mounted at /content//drive; to attempt to forcibly remount, call drive.mount("/content//drive", force_remount=True).


## Notes:
* Test: check whether it is worth dropping absent candidates by plotting the score distribution of that group
* How to connect Jupyter over SSH
* https://python.plainenglish.io/how-to-create-a-interative-map-using-plotly-express-geojson-to-brazil-in-python-fb5527ae38fc

## 1) Data cleaning
* Initial EDA
* Handle nulls (remember to discuss and evaluate the best strategies)
* Map the values and apply one-hot encoding
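
One detail of the one-hot encoding step is worth illustrating: when the training and test frames are encoded separately, a category that appears in only one of them produces mismatched columns. The sketch below uses toy rows (invented for the example; only the `TP_ESCOLA` labels mirror the ones used in this notebook) and aligns the test frame to the training columns with `reindex`. This is one possible approach, not necessarily the one the final pipeline should adopt.

```python
import pandas as pd

# Toy frames; the rows are invented for illustration.
train = pd.DataFrame({"TP_ESCOLA": ["Pública", "Privada", "Exterior"], "NU_IDADE": [17, 18, 19]})
test = pd.DataFrame({"TP_ESCOLA": ["Pública", "Pública"], "NU_IDADE": [20, 21]})

train_enc = pd.get_dummies(train, columns=["TP_ESCOLA"])
test_enc = pd.get_dummies(test, columns=["TP_ESCOLA"])

# Encoded separately, the test frame lacks the categories it never saw.
missing = sorted(set(train_enc.columns) - set(test_enc.columns))
print(missing)

# Align the test frame to the training columns, filling absent dummies with 0.
test_enc = test_enc.reindex(columns=train_enc.columns, fill_value=0)
```

After the `reindex`, both frames expose the same columns in the same order, which is what an estimator fitted on the training frame expects at prediction time.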

## 2) Preprocessing
* Remove columns (correlated [>80%], low variance, semantics)
* (Optional) Apply PCA 
* (Optional) Feature engineering
* Standardize/Normalize
* Handle imbalanced data
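
For the standardization item in the list above, a minimal sketch with scikit-learn's `StandardScaler` follows. The arrays are synthetic stand-ins for the real feature matrices; the point it illustrates is that the scaler's statistics are learned from the training split only and then reused on the test split.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Synthetic features standing in for the real train/test matrices.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
X_test = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)  # learns mean and std from the training data only
X_test_std = scaler.transform(X_test)        # reuses the training statistics

print(X_train_std.mean(axis=0).round(3), X_train_std.std(axis=0).round(3))
```

Scaling matters most for the penalized linear model, the SVM and KNN in the grid defined later, since their objectives or distances depend on feature magnitudes; tree-based models are largely insensitive to it.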

## 3) Model
* Linear regression<br>
a. Analyze the weights<br>
b. Apply regularization techniques<br> 

* Decision tree <br>
a. Depth <br>
b. Evaluate the splits (Gini impurity / entropy) <br>

* Naive Bayes <br>
a. Which features significantly affect P(score|feature)<br>
b. GaussianNaiveBayes vs. BernoulliNaiveBayes<br>

* SVM<br>
a. Evaluate the generated hyperplane / where the split is made <br>
b. Evaluate different kernels <br>
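
For the linear regression items in the list above (weight analysis and regularization), the following sketch shows what inspecting the coefficients of an `ElasticNet` can reveal. The data are synthetic and the setup (two informative features, three noise features) is an assumption made purely for illustration.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
# Only the first two features drive the target; the other three are pure noise.
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(scale=0.1, size=500)

# l1_ratio=1.0 makes the penalty purely L1 (lasso-like).
model = ElasticNet(alpha=0.5, l1_ratio=1.0).fit(X, y)
for i, w in enumerate(model.coef_):
    print(f"feature_{i}: {w:+.3f}")
```

With a pure L1 penalty the coefficients of the noise features are driven to exactly zero while the informative ones are shrunk toward zero, so the fitted weights double as a rough feature-relevance ranking. With `l1_ratio=0` (pure L2) the weights shrink but generally stay nonzero.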

In [24]:
class Model:


  _algorithms = {
      
      'ElasticNet': {
          'estimator':ElasticNet(),
          'parameters':{
              'alpha':[0.001, 0.5, 1.0],
              'l1_ratio': [0, 0.5, 1.0]
          }},

      'DecisionTree': {
          'estimator':DecisionTreeRegressor(),
          'parameters':{
              'max_depth':[100, 90, 80, 70],
              'min_samples_leaf':[1, 10, 20, 50, 100]
          }},

      # 'RandomForest': {
      #     'estimator':RandomForestRegressor(),
      #     'parameters':{
      #         'n_estimators':[11, 31, 51],
      #         'max_depth':[100, 90, 80,],
      #         'min_samples_leaf':[1, 20, 100],
      #     }},

      # 'KNN': {
      #     'estimator':KNeighborsRegressor(),
      #     'parameters':{
      #         'n_neighbors':[5, 23, 47, 83],
      #         'weights':['uniform', 'distance'],
      #         'p':[1, 1.5, 2]
      #     }},

      # 'SVM': {
      #     'estimator':SVR(),
      #     'parameters':{
      #         'kernel':['rbf', 'poly'],
      #         'gamma':[0.01, 0.5, 1.0],
      #         'C':[10, 100, 1000]
      #     }}

  }

  def __init__(self, verbose=True):
    pass

  def load(self, path, verbose=True):

    self.train_df = pd.read_parquet(path/'train.parquet').sample(40000)
    self.test_df = pd.read_parquet(path/'test.parquet').sample(10000)

    if verbose:
      print("Quantidade inicial de elementos no treino:", len(self.train_df))
      print("Quantidade inicial de elementos no teste:", len(self.test_df))
        
    self.train_df.set_index("NU_INSCRICAO", inplace=True)
    self.test_df.set_index("NU_INSCRICAO", inplace=True)

    self._targets = [col for col in self.train_df.columns if "NU_NOTA" in col]



  def prepare(self,verbose=True):

    if verbose:
      print("Mapeando valores...")    
    self._map_values(verbose)

    if verbose:
      print("Criando novas colunas...")
    self._create_features(verbose)

    if verbose:
      print("Eliminando colunas...")
    self._clear_cols(verbose)

    if verbose:
      print("Aplicando get dummies...")
    self._create_dummies(verbose)

    if verbose:
      print("Selecionando features mais importantes")
    self._feature_selection(verbose)


  def tune(self, random_state=0, verbose=True):


    X, Y = self.train_df.drop(columns=self._targets), self.train_df[self._targets] 

    self._results = {}

    gscv = None

    for name, algorithm in self._algorithms.items():
      if verbose:
        print(name)

      self._results[name] = {} 

      for target in self._targets:
        
        gscv = GridSearchCV(algorithm['estimator'], algorithm['parameters'], verbose = 3,
                             scoring='neg_root_mean_squared_error', return_train_score=True)
        gscv.fit(X, Y[target])

        self._results[name][target] = {}
        self._results[name][target]['best_params'] = gscv.best_params_
        self._results[name][target]['best_score'] = gscv.best_score_
  
    return gscv

  def _to_json(self):

    with open('results.json', 'w') as fp:
      json.dump(self._results, fp)

    # ranking() stores its result in self._selecteds; self._ranking is never set
    with open('ranking.json', 'w') as fp:
      json.dump(self._selecteds, fp)

  def ranking(self, verbose=True):

    self._selecteds = {}

    for algoritmo in self._results:

      for target in self._targets:

        if target not in self._selecteds:
          if verbose:
            print("Chave", target, "estava vazia, vamos colocar o algoritmo", algoritmo)
          self._selecteds[target] = algoritmo
        
        else:
          if self._results[algoritmo][target]['best_score'] > self._results[self._selecteds[target]][target]['best_score']:
            if verbose:
              print("O algoritmo", algoritmo, "se mostrou mais eficiente que o", self._selecteds[target])
            self._selecteds[target] = algoritmo

    self._to_json()


  def predict(self):
    for target in self._targets:
      print("Vamos prever", target, "com o algoritmo", self._selecteds[target], "e hiperparâmetros", self._results[self._selecteds[target]][target]['best_params'])
      # TODO: fit the selected model and generate the predictions

  def correlation(self, save=False, plot=True):
    
    fig = px.imshow(self.train_df.corr())
    
    if plot:
      fig.show()

    if save:
      pass


  def plot(self, column):
    
    tmp = self.train_df[column].value_counts()
    fig = px.bar(x=tmp.index, y=tmp.values)
    fig.show()
    
    melted = pd.melt(self.train_df, id_vars=[column], value_vars=self._targets, var_name='TP_NOTA', value_name='NU_NOTA')
    # sample() without replacement cannot draw more rows than the frame contains
    fig=px.box(melted.sample(min(len(melted), 1000000)), x='TP_NOTA', y='NU_NOTA', color=column)
    fig.show()

  def null_analysis(self, plot=True, save=False, verbose=True):
    
    null_count = self.train_df.isna().apply(np.sum, axis=0)/self.train_df.shape[0]
    null_percentage_train = (null_count.loc[null_count!=0]*100).sort_values()
    fig_train = px.bar(x=null_percentage_train.index, y=null_percentage_train.values, title="Porcentagem de valores nulos nos dados de treino")

    null_count = self.test_df.isna().apply(np.sum, axis=0)/self.test_df.shape[0]
    null_percentage_test = (null_count.loc[null_count!=0]*100).sort_values()
    fig_test = px.bar(x=null_percentage_test.index, y=null_percentage_test.values, title="Porcentagem de valores nulos nos dados de teste")

    if plot:
      fig_train.show()
      fig_test.show()

    if save:
      pass

  def _feature_selection(self, verbose):
    
    to_drop = []
    threshold = 0.05
    for col in self.train_df.columns[1:]:
       if self.train_df[col].std() < threshold:
         to_drop.append(col)
    
    self.train_df.drop(columns=to_drop, inplace=True)
    self.test_df.drop(columns=to_drop, inplace=True)
    if verbose:
      print("[VARIANCE THRESHOLD] Removendo colunas:", to_drop)

    #################################################################################

    correlation = self.train_df.corr().abs()

    upper_triangle = correlation.where(np.triu(np.ones(correlation.shape), k=1).astype(bool))

    # Drop every column whose absolute correlation with an earlier column exceeds 0.9
    to_drop = [column for column in upper_triangle.columns if any(upper_triangle[column] > 0.9)]
    
    self.train_df.drop(columns=to_drop, inplace=True)
    self.test_df.drop(columns=to_drop, inplace=True)

    if verbose:
      print('[HIGH CORRELATION] Eliminando colunas redundantes:', to_drop)

    


  def _clear_cols(self, verbose):
    
    null_count = self.train_df.isna().apply(np.sum, axis=0)/self.train_df.shape[0]
    null_percentage_train = (null_count.loc[null_count!=0]*100).sort_values()

    null_count = self.test_df.isna().apply(np.sum, axis=0)/self.test_df.shape[0]
    null_percentage_test = (null_count.loc[null_count!=0]*100).sort_values()

    to_drop_columns_train = list(null_percentage_train[null_percentage_train > 30].index)
    to_drop_columns_test = list(null_percentage_test[null_percentage_test > 30].index)

    if verbose:
      print("[NULLS] Colunas dropadas no treino:", sorted(to_drop_columns_train))
      print("[NULLS] Colunas dropadas no teste:", sorted(to_drop_columns_test))

    self.train_df.drop(columns=to_drop_columns_train, inplace=True)
    self.test_df.drop(columns=to_drop_columns_test, inplace=True)

    ###################################################################################################################

    to_drop = ['CO_MUNICIPIO_RESIDENCIA', 'NO_MUNICIPIO_RESIDENCIA', 'CO_UF_RESIDENCIA', 'CO_MUNICIPIO_NASCIMENTO', 'NO_MUNICIPIO_NASCIMENTO',
    'CO_UF_NASCIMENTO', 'SG_UF_NASCIMENTO', 'TP_ANO_CONCLUIU', 'IN_TREINEIRO', 'CO_MUNICIPIO_PROVA', 'NO_MUNICIPIO_PROVA', 'CO_UF_PROVA',
    'SG_UF_PROVA']

    self.train_df.drop(columns=to_drop, inplace=True)
    self.test_df.drop(columns=to_drop, inplace=True)

    if verbose:
      print(f'[DROP COLUMNS] Colunas retiradas por falta de relevânica:{[to_drop]}')

    ##################################################################################################################

    to_drop = self.train_df[(self.train_df['TP_STATUS_REDACAO'].isna()) & (self.train_df['TP_PRESENCA_CH']=='Presente')].index
    self.train_df.drop(to_drop, inplace=True)

    to_drop = self.test_df[(self.test_df['TP_STATUS_REDACAO'].isna()) & (self.test_df['TP_PRESENCA_CH']=='Presente')].index
    self.test_df.drop(to_drop, inplace=True)

    if verbose:
      print(f'[INCONSISTENCY] Removendo inconsistências.')

    ##################################################################################################################
    

    to_drop = ['NU_NOTA_MT', 'NU_NOTA_CH', 'NU_NOTA_CN', 'NU_NOTA_LC', 'NU_NOTA_REDACAO', 'TP_STATUS_REDACAO']
    self.train_df.dropna(subset=to_drop, inplace=True)

    try:
      self.test_df.dropna(subset=to_drop, inplace=True)
    except KeyError:
      pass  # some of these columns may be missing from the test set

    if verbose:
      print('[NULL TARGETS] Removendo valores nulos nas colunas-alvo')


    #####################################################################################################################

    self.train_df.drop(self.train_df[self.train_df['TP_STATUS_REDACAO'] != 'Sem problemas'].index, inplace=True)
    self.test_df.drop(self.test_df[self.test_df['TP_STATUS_REDACAO'] != 'Sem problemas'].index, inplace=True)

    if verbose:
      print('[::] Removendo redações que tiraram nota 0')


  def _create_dummies(self, verbose):

    cols = [col for col in self.train_df.columns if ((self.train_df[col].dtype == 'object') or (self.train_df[col].dtype.name == 'category'))]

    self.train_df = pd.get_dummies(self.train_df, columns=cols)
    self.test_df = pd.get_dummies(self.test_df, columns=cols)

    if verbose:
      print(f"[GET DUMMIES] Colunas categóricas convertidas: {cols}")


  def _create_features(self, verbose):

    new_columns = []
    filled_columns = []
    ############################################################################################

    uf_regiao = {
      'RR':'Norte', 'AP':'Norte', 'AM':'Norte', 'PA':'Norte', 'AC':'Norte', 'RO':'Norte', 'TO':'Norte', 'MA':'Nordeste',
      'PI':'Nordeste', 'CE':'Nordeste', 'RN':'Nordeste', 'PB':'Nordeste', 'PE':'Nordeste', 'AL':'Nordeste', 'SE':'Nordeste',
      'BA':'Nordeste', 'MT':'Centro-oeste', 'DF':'Centro-oeste', 'GO':'Centro-oeste', 'MS':'Centro-oeste', 'MG':'Sudeste',
      'ES':'Sudeste', 'RJ':'Sudeste', 'SP':'Sudeste', 'PR':'Sul', 'SC':'Sul', 'RS':'Sul', 
      }

    self.train_df['NO_REGIAO_RESIDENCIA'] = self.train_df['SG_UF_RESIDENCIA'].map(uf_regiao)
    self.test_df['NO_REGIAO_RESIDENCIA'] = self.test_df['SG_UF_RESIDENCIA'].map(uf_regiao)

    new_columns.append('NO_REGIAO_RESIDENCIA')

    ############################################################################################

    mean_score_per_reg = self.train_df.groupby("NO_REGIAO_RESIDENCIA")[self._targets].mean()
    for col in self._targets:
      self.train_df["REG_NOTA_"+col.split("_")[2]+"_MEDIA"] = self.train_df['NO_REGIAO_RESIDENCIA'].apply(
          lambda row: mean_score_per_reg[col][row]) 
      self.test_df["REG_NOTA_"+col.split("_")[2]+"_MEDIA"] = self.test_df['NO_REGIAO_RESIDENCIA'].apply(
          lambda row: mean_score_per_reg[col][row]) 

      new_columns.append("REG_NOTA_"+col.split("_")[2]+"_MEDIA")
    ############################################################################################
  
    
    self.train_df['TP_MINORIA_RACIAL'] = ((self.train_df['TP_COR_RACA'] != 'Branca').astype(int) + (self.train_df['TP_COR_RACA'] != 'Amarela').astype(int)) -1
    self.test_df['TP_MINORIA_RACIAL'] = ((self.test_df['TP_COR_RACA'] != 'Branca').astype(int) + (self.test_df['TP_COR_RACA'] != 'Amarela').astype(int)) -1

    new_columns.append('TP_MINORIA_RACIAL')
    ############################################################################################

    cols = [col for col in self.train_df.columns if (("IN_" in col) and ('TREINEIRO' not in col))]

    self.train_df['TP_SITUACAO_ESPECIAL'] = self.train_df[cols].any(axis=1)
    self.test_df['TP_SITUACAO_ESPECIAL'] = self.test_df[cols].any(axis=1)

    new_columns.append('TP_SITUACAO_ESPECIAL')

    #############################################################################################


    self.train_df['TP_SOLTEIRO'] = self.train_df['TP_ESTADO_CIVIL'] == 'Solteiro(a)'
    self.test_df['TP_SOLTEIRO'] = self.test_df['TP_ESTADO_CIVIL'] == 'Solteiro(a)'

    new_columns.append('TP_SOLTEIRO')

    #############################################################################################

    median_train = self.train_df.loc[self.train_df['NU_IDADE'].notnull(), 'NU_IDADE'].median()
    
    self.train_df['NU_IDADE'] = self.train_df['NU_IDADE'].fillna(median_train)
    self.test_df['NU_IDADE'] = self.test_df['NU_IDADE'].fillna(median_train)
  
    filled_columns.append("NU_IDADE")
    #############################################################################################

    if verbose:
      print(f'[FEATURE ENGINEERING] Novas colunas: {new_columns}')
      print(f'[INPUTATION] Colunas com valores nulos preenchidos: {filled_columns}')
      


  
  def _map_values(self, verbose):
    #################################################################
    rename = {0:"0",#np.NaN,
      1:"Solteiro(a)",
      2:"Casado(a)/Mora com companheiro(a)",
      3:"Divorciado(a)/Desquitado(a)/Separado(a)",
      4:"Viúvo(a)"}

    self.train_df['TP_ESTADO_CIVIL'] = self.train_df['TP_ESTADO_CIVIL'].map(rename)
    self.test_df['TP_ESTADO_CIVIL'] = self.test_df['TP_ESTADO_CIVIL'].map(rename)

    #################################################################
    rename = {0:"0",#np.NaN,
      1:"Branca",
      2:"Preta",
      3:"Parda",
      4:"Amarela",
      5:"Indígena"}

    self.train_df['TP_COR_RACA'] = self.train_df['TP_COR_RACA'].map(rename)
    self.test_df['TP_COR_RACA'] = self.test_df['TP_COR_RACA'].map(rename)

    #################################################################
    rename = {0:"0",#np.NaN,
      1:"Brasileiro(a)",
      2:"Brasileiro(a) Naturalizado(a)",
      3:"Estrangeiro(a)",
      4:"Brasileiro(a) Nato(a), nascido(a) no exterior"
      }

    self.train_df['TP_NACIONALIDADE'] = self.train_df['TP_NACIONALIDADE'].map(rename)
    self.test_df['TP_NACIONALIDADE'] = self.test_df['TP_NACIONALIDADE'].map(rename)

    #################################################################
    rename = {1:"Já concluí o Ensino Médio",
      2:"Estou cursando e concluirei o Ensino Médio no ano corrente",
      3:"Estou cursando e concluirei o Ensino Médio após o ano corrente",
      4:"Não concluí e não estou cursando o Ensino Médio"
      }

    self.train_df['TP_ST_CONCLUSAO'] = self.train_df['TP_ST_CONCLUSAO'].map(rename)
    self.test_df['TP_ST_CONCLUSAO'] = self.test_df['TP_ST_CONCLUSAO'].map(rename)

    #################################################################
    rename = {0:"0",#np.NaN,
      1:"2018",
      2:"2017",
      3:"2016",
      4:"2015",
      5:"2014",
      6:"2013",
      7:"2012",
      8:"2011",
      9:"2010",
      10:"2009",
      11:"2008",
      12:"2007",
      13:"Antes de 2007"}

    self.train_df['TP_ANO_CONCLUIU'] = self.train_df['TP_ANO_CONCLUIU'].map(rename)
    self.test_df['TP_ANO_CONCLUIU'] = self.test_df['TP_ANO_CONCLUIU'].map(rename)

    #################################################################
    rename = {1:"0",#np.NaN,
      2:"Pública",
      3:"Privada",
      4:"Exterior"}

    self.train_df['TP_ESCOLA'] = self.train_df['TP_ESCOLA'].map(rename)
    self.test_df['TP_ESCOLA'] = self.test_df['TP_ESCOLA'].map(rename)

    #################################################################
    rename = {1:"Federal",
      2:"Estadual",
      3:"Municipal",
      4:"Privada"}

    self.train_df['TP_DEPENDENCIA_ADM_ESC'] = self.train_df['TP_DEPENDENCIA_ADM_ESC'].map(rename)
    self.test_df['TP_DEPENDENCIA_ADM_ESC'] = self.test_df['TP_DEPENDENCIA_ADM_ESC'].map(rename)

    #################################################################
    rename = {1:"Ensino Regular",
      2:"Educação Especial - Modalidade Substitutiva",
      3:"Educação de Jovens e Adultos"}

    self.train_df['TP_ENSINO'] = self.train_df['TP_ENSINO'].map(rename)
    self.test_df['TP_ENSINO'] = self.test_df['TP_ENSINO'].map(rename)

    #################################################################
    rename = {0:"Ausente",
      1:"Presente",
      2:"Eliminado"}

    for c in [col for col in self.train_df.columns if "TP_PRESENCA" in col]:
      self.train_df[c] = self.train_df[c].map(rename)
      self.test_df[c] = self.test_df[c].map(rename)

    #################################################################
    rename = {
        1:"Sem problemas",
        2:"Anulada",
        3:"Copiou texto motivador",
        4:"Em branco",
        6:"Fuga ao tema",
        7:"Não atende tipo textual",
        8:"Texto insuficiente",
        9:"Parte desconectada"
      }

    self.train_df['TP_STATUS_REDACAO'] = self.train_df['TP_STATUS_REDACAO'].map(rename)
    self.test_df['TP_STATUS_REDACAO'] = self.test_df['TP_STATUS_REDACAO'].map(rename)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5,
        'F':6,
        'G':7,
        'H':0
    }

    self.train_df['Q001'] = self.train_df['Q001'].map(rename).astype(int)
    self.test_df['Q001'] = self.test_df['Q001'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5,
        'F':6,
        'G':7,
        'H':0
    }

    self.train_df['Q002'] = self.train_df['Q002'].map(rename).astype(int)
    self.test_df['Q002'] = self.test_df['Q002'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5,
        'F':0,
    }

    self.train_df['Q003'] = self.train_df['Q003'].map(rename).astype(int)
    self.test_df['Q003'] = self.test_df['Q003'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5,
        'F':0,
    }

    self.train_df['Q004'] = self.train_df['Q004'].map(rename).astype(int)
    self.test_df['Q004'] = self.test_df['Q004'].map(rename).astype(int)

    # Q005 is already numeric

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5,
        'F':6,
        'G':7,
        'H':8,
        'I':9,
        'J':10,
        'K':11,
        'L':12,
        'M':13,
        'N':14,
        'O':15,
        'P':16,
        'Q':17
    }

    self.train_df['Q006'] = self.train_df['Q006'].map(rename).astype(int)
    self.test_df['Q006'] = self.test_df['Q006'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
    }

    self.train_df['Q007'] = self.train_df['Q007'].map(rename).astype(int)
    self.test_df['Q007'] = self.test_df['Q007'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q008'] = self.train_df['Q008'].map(rename).astype(int)
    self.test_df['Q008'] = self.test_df['Q008'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q009'] = self.train_df['Q009'].map(rename).astype(int)
    self.test_df['Q009'] = self.test_df['Q009'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q010'] = self.train_df['Q010'].map(rename).astype(int)
    self.test_df['Q010'] = self.test_df['Q010'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q011'] = self.train_df['Q011'].map(rename).astype(int)
    self.test_df['Q011'] = self.test_df['Q011'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q012'] = self.train_df['Q012'].map(rename).astype(int)
    self.test_df['Q012'] = self.test_df['Q012'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q013'] = self.train_df['Q013'].map(rename).astype(int)
    self.test_df['Q013'] = self.test_df['Q013'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q014'] = self.train_df['Q014'].map(rename).astype(int)
    self.test_df['Q014'] = self.test_df['Q014'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q015'] = self.train_df['Q015'].map(rename).astype(int)
    self.test_df['Q015'] = self.test_df['Q015'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q016'] = self.train_df['Q016'].map(rename).astype(int)
    self.test_df['Q016'] = self.test_df['Q016'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q017'] = self.train_df['Q017'].map(rename).astype(int)
    self.test_df['Q017'] = self.test_df['Q017'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':0,
        'B':1,
    }

    self.train_df['Q018'] = self.train_df['Q018'].map(rename).astype(int)
    self.test_df['Q018'] = self.test_df['Q018'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q019'] = self.train_df['Q019'].map(rename).astype(int)
    self.test_df['Q019'] = self.test_df['Q019'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':0,
        'B':1,
    }

    self.train_df['Q020'] = self.train_df['Q020'].map(rename).astype(int)
    self.test_df['Q020'] = self.test_df['Q020'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':0,
        'B':1,
    }

    self.train_df['Q021'] = self.train_df['Q021'].map(rename).astype(int)
    self.test_df['Q021'] = self.test_df['Q021'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q022'] = self.train_df['Q022'].map(rename).astype(int)
    self.test_df['Q022'] = self.test_df['Q022'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':0,
        'B':1,
    }

    self.train_df['Q023'] = self.train_df['Q023'].map(rename).astype(int)
    self.test_df['Q023'] = self.test_df['Q023'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':1,
        'B':2,
        'C':3,
        'D':4,
        'E':5
    }

    self.train_df['Q024'] = self.train_df['Q024'].map(rename).astype(int)
    self.test_df['Q024'] = self.test_df['Q024'].map(rename).astype(int)

    #################################################################
    rename = {
        'A':0,
        'B':1,
    }

    self.train_df['Q025'] = self.train_df['Q025'].map(rename).astype(int)
    self.test_df['Q025'] = self.test_df['Q025'].map(rename).astype(int)
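
The questionnaire mappings above repeat the same pattern: letters `A`, `B`, ... become 1, 2, ..., and in some questions one letter is sent to 0. A compact way to express that pattern is sketched below. `letters_to_ordinal` is a hypothetical helper that is not part of the notebook, and the sample answers are invented.

```python
import string
import pandas as pd

def letters_to_ordinal(series, n_levels, zero_letter=None):
    """Map 'A', 'B', ... to 1, 2, ...; optionally send one letter to 0."""
    letters = string.ascii_uppercase[:n_levels]
    mapping = {letter: i for i, letter in enumerate(letters, start=1)}
    if zero_letter is not None:
        mapping[zero_letter] = 0
    return series.map(mapping).astype(int)

# Same scheme as Q001/Q002 above: A..G -> 1..7 and H -> 0.
answers = pd.Series(["A", "C", "H", "B"])
encoded = letters_to_ordinal(answers, n_levels=8, zero_letter="H")
print(encoded.tolist())  # [1, 3, 0, 2]
```

The yes/no questions (`Q018`, `Q020`, `Q021`, `Q023`, `Q025`) use `A -> 0, B -> 1` instead, so they would need a separate mapping or an offset. As in the original code, `astype(int)` raises if a column contains missing or unmapped answers.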

In [25]:
model = Model()
model.load(path)

Quantidade inicial de elementos no treino: 40000
Quantidade inicial de elementos no teste: 10000


In [26]:
model.prepare()

Mapeando valores...
Criando novas colunas...
[FEATURE ENGINEERING] Novas colunas: ['NO_REGIAO_RESIDENCIA', 'REG_NOTA_CN_MEDIA', 'REG_NOTA_CH_MEDIA', 'REG_NOTA_LC_MEDIA', 'REG_NOTA_MT_MEDIA', 'REG_NOTA_REDACAO_MEDIA', 'TP_MINORIA_RACIAL', 'TP_SITUACAO_ESPECIAL', 'TP_SOLTEIRO']
[INPUTATION] Colunas com valores nulos preenchidos: ['NU_IDADE']
Eliminando colunas...
[NULLS] Colunas dropadas no treino: ['CO_ESCOLA', 'CO_MUNICIPIO_ESC', 'CO_UF_ESC', 'NO_MUNICIPIO_ESC', 'SG_UF_ESC', 'TP_DEPENDENCIA_ADM_ESC', 'TP_ENSINO', 'TP_LOCALIZACAO_ESC', 'TP_SIT_FUNC_ESC']
[NULLS] Colunas dropadas no teste: ['CO_ESCOLA', 'CO_MUNICIPIO_ESC', 'CO_UF_ESC', 'NO_MUNICIPIO_ESC', 'SG_UF_ESC', 'TP_DEPENDENCIA_ADM_ESC', 'TP_ENSINO', 'TP_LOCALIZACAO_ESC', 'TP_SIT_FUNC_ESC']
[DROP COLUMNS] Colunas retiradas por falta de relevânica:[['CO_MUNICIPIO_RESIDENCIA', 'NO_MUNICIPIO_RESIDENCIA', 'CO_UF_RESIDENCIA', 'CO_MUNICIPIO_NASCIMENTO', 'NO_MUNICIPIO_NASCIMENTO', 'CO_UF_NASCIMENTO', 'SG_UF_NASCIMENTO', 'TP_ANO_CONCLUIU',

In [27]:
m = model.tune()

ElasticNet
Fitting 5 folds for each of 9 candidates, totalling 45 fits


  coef_, l1_reg, l2_reg, X, y, max_iter, tol, rng, random, positive


[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-65.040, test=-64.246) total time=   1.9s


[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.905, test=-64.787) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.646, test=-65.804) total time=   1.9s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.626, test=-65.849) total time=   1.9s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.968, test=-64.511) total time=   1.9s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-65.038, test=-64.248) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-64.903, test=-64.786) total time=   1.8s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-64.644, test=-65.805) total time=   1.8s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-64.624, test=-65.850) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-64.966, test=-64.512) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-65.037, test=-64.250) total time=   1.6s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-64.901, test=-64.789) total time=   1.7s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-64.641, test=-65.813) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-64.622, test=-65.853) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-64.964, test=-64.518) total time=   1.7s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-66.127, test=-65.291) total time=   2.0s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-65.983, test=-65.915) total time=   1.9s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-65.737, test=-66.826) total time=   1.9s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-65.729, test=-66.788) total time=   2.0s


[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-66.116, test=-65.156) total time=   2.0s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.943, test=-65.076) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.794, test=-65.700) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.544, test=-66.629) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.530, test=-66.625) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-65.912, test=-64.995) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-65.520, test=-64.596) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-65.379, test=-65.232) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-65.128, test=-66.236) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-65.102, test=-66.294) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-65.466, t
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-66.504, test=-65.684) total time=   1.9s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-66.359, test=-66.342) total time=   1.9s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-66.121, test=-67.206) total time=   2.0s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-66.120, test=-67.138) total time=   2.0s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-66.520, test=-65.464) total time=   2.0s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-66.361, test=-65.525) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-66.214, test=-66.174) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-65.968, test=-67.045) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-65.956, test=-66.998) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-66.359, test=-65.335) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.916, test=-65.012) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.747, test=-65.635) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.476, test=-66.584) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.447, test=-66.564) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-65.856, t

Fitting 5 folds for each of 9 candidates, totalling 45 fits
[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-68.791, test=-69.427) total time=   1.9s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-69.029, test=-68.473) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-69.067, test=-68.291) total time=   1.9s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-68.705, test=-69.789) total time=   1.9s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-68.793, test=-69.404) total time=   1.9s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-68.789, test=-69.429) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-69.027, test=-68.471) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-69.065, test=-68.296) total time=   1.8s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-68.702, test=-69.793) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-68.791, test=-69.403) total time=   1.9s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-68.787, test=-69.431) total time=   1.7s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-69.026, test=-68.470) total time=   1.7s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-69.062, test=-68.304) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-68.700, test=-69.799) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-68.788, test=-69.405) total time=   1.7s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-69.804, test=-70.381) total time=   2.0s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-70.009, test=-69.637) total time=   2.1s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-70.121, test=-69.015) total time=   2.0s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-69.724, test=-70.583) total time=   2.0s
[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-69.828, test=-70.126) total time=   2.0s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-69.612, test=-70.178) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-69.827, test=-69.404) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-69.927, test=-68.813) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-69.528, test=-70.427) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-69.623, test=-69.970) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-69.227, test=-69.777) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-69.455, test=-68.895) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-69.521, test=-68.446) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-69.154, test=-70.135) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-69.215, t
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-70.166, test=-70.764) total time=   2.0s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-70.368, test=-70.082) total time=   2.0s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-70.497, test=-69.364) total time=   2.0s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-70.105, test=-70.903) total time=   2.0s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-70.214, test=-70.427) total time=   2.0s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-70.009, test=-70.603) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-70.215, test=-69.881) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-70.339, test=-69.193) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-69.933, test=-70.776) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-70.042, test=-70.293) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-69.571, test=-70.166) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-69.799, test=-69.288) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-69.893, test=-68.728) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-69.459, test=-70.400) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-69.575, t

Fitting 5 folds for each of 9 candidates, totalling 45 fits
[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-52.145, test=-53.370) total time=   3.1s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-52.721, test=-51.046) total time=   2.0s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-52.372, test=-52.483) total time=   1.9s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-52.257, test=-52.967) total time=   1.9s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-52.315, test=-52.717) total time=   1.9s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-52.142, test=-53.374) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-52.719, test=-51.046) total time=   1.8s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-52.369, test=-52.484) total time=   1.8s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-52.254, test=-52.965) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-52.313, test=-52.717) total time=   1.7s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-52.138, test=-53.384) total time=   1.7s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-52.717, test=-51.046) total time=   1.7s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-52.364, test=-52.496) total time=   1.6s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-52.251, test=-52.963) total time=   1.6s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-52.311, test=-52.716) total time=   1.6s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-53.125, test=-54.179) total time=   1.9s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-53.686, test=-51.900) total time=   2.0s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-53.337, test=-53.355) total time=   2.0s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-53.189, test=-53.932) total time=   2.0s
[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-53.296, test=-53.463) total time=   1.9s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-52.958, test=-54.029) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-53.524, test=-51.744) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-53.180, test=-53.165) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-53.028, test=-53.786) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-53.132, test=-53.311) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-52.601, test=-53.741) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-53.185, test=-51.418) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-52.838, test=-52.776) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-52.706, test=-53.426) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-52.777, t
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-53.488, test=-54.532) total time=   2.0s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-54.050, test=-52.266) total time=   2.0s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-53.696, test=-53.725) total time=   2.0s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-53.547, test=-54.268) total time=   2.0s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-53.665, test=-53.797) total time=   2.0s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-53.370, test=-54.425) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-53.931, test=-52.153) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-53.589, test=-53.581) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-53.430, test=-54.172) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-53.547, test=-53.693) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-52.958, test=-54.074) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-53.515, test=-51.758) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-53.194, test=-53.118) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-53.039, test=-53.790) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-53.132, t

Fitting 5 folds for each of 9 candidates, totalling 45 fits
[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-89.694, test=-89.485) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-89.418, test=-90.636) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-89.729, test=-89.359) total time=   1.9s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-89.385, test=-90.723) total time=   1.9s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-89.790, test=-89.105) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-89.692, test=-89.487) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-89.416, test=-90.634) total time=   1.8s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-89.727, test=-89.366) total time=   1.8s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-89.383, test=-90.725) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-89.787, test=-89.113) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-89.691, test=-89.490) total time=   1.7s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-89.416, test=-90.635) total time=   1.7s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-89.724, test=-89.379) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-89.381, test=-90.731) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-89.784, test=-89.133) total time=   1.7s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-91.827, test=-91.574) total time=   2.0s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-91.510, test=-92.946) total time=   1.9s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-91.858, test=-91.402) total time=   1.9s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-91.544, test=-92.602) total time=   1.9s
[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-92.029, test=-90.529) total time=   1.9s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-91.296, test=-91.021) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-90.980, test=-92.363) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-91.326, test=-90.853) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-90.995, test=-92.100) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-91.469, test=-90.064) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-90.235, test=-89.923) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-89.900, test=-91.171) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-90.272, test=-89.740) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-89.908, test=-91.126) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-90.335, t
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-92.575, test=-92.345) total time=   1.9s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-92.252, test=-93.766) total time=   1.9s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-92.612, test=-92.138) total time=   1.9s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-92.309, test=-93.315) total time=   1.9s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-92.812, test=-91.191) total time=   1.9s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-92.118, test=-91.874) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-91.806, test=-93.259) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-92.154, test=-91.667) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-91.838, test=-92.884) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-92.331, test=-90.787) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-90.723, test=-90.411) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-90.433, test=-91.718) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-90.753, test=-90.232) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-90.416, test=-91.588) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-90.866, t

Fitting 5 folds for each of 9 candidates, totalling 45 fits
[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-136.438, test=-139.652) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-136.953, test=-137.667) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-136.879, test=-137.968) total time=   1.8s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-137.439, test=-135.644) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-137.326, test=-136.128) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-136.432, test=-139.656) total time=   1.8s
[CV 2/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-136.947, test=-137.668) total time=   1.8s
[CV 3/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-136.873, test=-137.967) total time=   1.8s
[CV 4/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-137.433, test=-135.644) total time=   1.7s
[CV 5/5] END alpha=0.001, l1_ratio=0.5;, score=(train=-137.320, test=-136.125) total time=   1.8s
[CV 1/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-136.429, test=-139.661) total time=   1.7s
[CV 2/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-136.942, test=-137.675) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-136.867, test=-137.980) total time=   1.7s
[CV 4/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-137.429, test=-135.647) total time=   1.8s
[CV 5/5] END alpha=0.001, l1_ratio=1.0;, score=(train=-137.315, test=-136.129) total time=   2.0s
[CV 1/5] END alpha=0.5, l1_ratio=0;, score=(train=-139.479, test=-142.466) total time=   3.1s
[CV 2/5] END alpha=0.5, l1_ratio=0;, score=(train=-139.947, test=-140.728) total time=   1.9s
[CV 3/5] END alpha=0.5, l1_ratio=0;, score=(train=-139.970, test=-140.421) total time=   2.0s
[CV 4/5] END alpha=0.5, l1_ratio=0;, score=(train=-140.506, test=-138.319) total time=   1.9s
[CV 5/5] END alpha=0.5, l1_ratio=0;, score=(train=-140.374, test=-138.824) total time=   2.0s
[CV 1/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-138.812, test=-141.811) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-139.291, test=-140.015) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-139.271, test=-139.819) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-139.840, test=-137.656) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=0.5;, score=(train=-139.698, test=-138.202) total time=   0.3s
[CV 1/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-137.094, test=-140.082) total time=   0.3s
[CV 2/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-137.585, test=-138.151) total time=   0.3s
[CV 3/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-137.473, test=-138.519) total time=   0.3s
[CV 4/5] END alpha=0.5, l1_ratio=1.0;, score=(train=-138.067, test=-136.026) total time=   0.3s
[CV 5/5] END alpha=0.5, l1_ratio=1.0;, sco
[CV 1/5] END alpha=1.0, l1_ratio=0;, score=(train=-140.312, test=-143.264) total time=   2.0s
[CV 2/5] END alpha=1.0, l1_ratio=0;, score=(train=-140.767, test=-141.605) total time=   1.9s
[CV 3/5] END alpha=1.0, l1_ratio=0;, score=(train=-140.826, test=-141.196) total time=   1.9s
[CV 4/5] END alpha=1.0, l1_ratio=0;, score=(train=-141.339, test=-139.115) total time=   1.9s
[CV 5/5] END alpha=1.0, l1_ratio=0;, score=(train=-141.216, test=-139.613) total time=   1.9s
[CV 1/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-139.790, test=-142.759) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-140.257, test=-141.032) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-140.276, test=-140.699) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-140.820, test=-138.608) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=0.5;, score=(train=-140.679, test=-139.131) total time=   0.3s
[CV 1/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-137.695, test=-140.649) total time=   0.3s
[CV 2/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-138.178, test=-138.682) total time=   0.3s
[CV 3/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-138.080, test=-138.949) total time=   0.3s
[CV 4/5] END alpha=1.0, l1_ratio=1.0;, score=(train=-138.715, test=-136.588) total time=   0.3s
[CV 5/5] END alpha=1.0, l1_ratio=1.0;, sco

DecisionTree
Fitting 5 folds for each of 20 candidates, totalling 100 fits
[CV 1/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.000, test=-93.116) total time=   0.5s
[CV 2/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.363, test=-93.688) total time=   0.5s
[CV 3/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.363, test=-93.944) total time=   0.5s
[CV 4/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.363, test=-92.888) total time=   0.5s
[CV 5/5] END max_depth=100, min_samples_leaf=1;, score=(train=-0.000, test=-92.878) total time=   0.5s
[CV 1/5] END max_depth=100, min_samples_leaf=10;, score=(train=-55.363, test=-72.150) total time=   0.3s
[CV 2/5] END max_depth=100, min_samples_leaf=10;, score=(train=-55.463, test=-72.940) total time=   0.3s
[CV 3/5] END max_depth=100, min_samples_leaf=10;, score=(train=-54.944, test=-74.296) total time=   0.3s
[CV 4/5] END max_depth=100, min_samples_leaf=10;, score=(train=-55.240, test=-73.202) total tim
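
A note on reading the search log above: each `[CV i/5]` line is one fold of one hyperparameter candidate, and the ElasticNet values that appear (`alpha` in 0.001, 0.5, 1.0 and `l1_ratio` in 0, 0.5, 1.0) give the 9 candidates and 45 fits the log reports. The following is a minimal sketch, not the notebook's own code, of how those fold lines could be grouped into a mean score per candidate. The regular expression and the helper `mean_scores` are assumptions made for illustration; the five sample lines are copied from the first ElasticNet search. Truncated lines simply fail to match and are skipped.

```python
import re
from collections import defaultdict
from itertools import product
from statistics import mean

# Grid values exactly as they appear in the ElasticNet log lines.
param_grid = {"alpha": [0.001, 0.5, 1.0], "l1_ratio": [0, 0.5, 1.0]}
n_folds = 5

# Every combination of the grid values is one candidate.
candidates = list(product(*param_grid.values()))
total_fits = len(candidates) * n_folds

# Five fold lines copied from the first ElasticNet search in the log.
log = """\
[CV 1/5] END alpha=0.001, l1_ratio=0;, score=(train=-65.040, test=-64.246) total time=   1.9s
[CV 2/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.905, test=-64.787) total time=   1.9s
[CV 3/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.646, test=-65.804) total time=   1.9s
[CV 4/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.626, test=-65.849) total time=   1.9s
[CV 5/5] END alpha=0.001, l1_ratio=0;, score=(train=-64.968, test=-64.511) total time=   1.9s
"""

# Hypothetical pattern for one complete fold line.
LINE = re.compile(
    r"\[CV \d+/\d+\] END (?P<params>.*?);, "
    r"score=\(train=(?P<train>-?\d+\.\d+), test=(?P<test>-?\d+\.\d+)\)"
)


def mean_scores(text):
    """Group fold lines by parameter string and average train and test scores."""
    folds = defaultdict(list)
    for match in LINE.finditer(text):
        folds[match["params"]].append(
            (float(match["train"]), float(match["test"]))
        )
    return {
        params: (mean(tr for tr, _ in rows), mean(te for _, te in rows))
        for params, rows in folds.items()
    }


summary = mean_scores(log)
for params, (train_mean, test_mean) in summary.items():
    print(f"{params}: train={train_mean:.3f} test={test_mean:.3f}")
```

The scores are negative, which is consistent with a scorer where values closer to zero are better, so the candidate with the highest mean test score would be the preferred one.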

In [28]:
model.ranking()
model.predict()

Key NU_NOTA_CN was empty, so we will assign the ElasticNet algorithm
Key NU_NOTA_CH was empty, so we will assign the ElasticNet algorithm
Key NU_NOTA_LC was empty, so we will assign the ElasticNet algorithm
Key NU_NOTA_MT was empty, so we will assign the ElasticNet algorithm
Key NU_NOTA_REDACAO was empty, so we will assign the ElasticNet algorithm
{'NU_NOTA_CN': 'ElasticNet', 'NU_NOTA_CH': 'ElasticNet', 'NU_NOTA_LC': 'ElasticNet', 'NU_NOTA_MT': 'ElasticNet', 'NU_NOTA_REDACAO': 'ElasticNet'}
We will predict NU_NOTA_CN with the ElasticNet algorithm and hyperparameters {'alpha': 0.001, 'l1_ratio': 0}
We will predict NU_NOTA_CH with the ElasticNet algorithm and hyperparameters {'alpha': 0.001, 'l1_ratio': 0}
We will predict NU_NOTA_LC with the ElasticNet algorithm and hyperparameters {'alpha': 0.001, 'l1_ratio': 0}
We will predict NU_NOTA_MT with the ElasticNet algorithm and hyperparameters {'alpha': 0.001, 'l1_ratio': 0}
We will predict NU_NOTA_REDACAO with the ElasticNet algorithm and hyperparameters {'alpha': 0.001, 'l1_ratio': 0}
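
The ranking step maps each target column to one algorithm. As an illustration only, the sketch below shows one way such a mapping could be derived from a dictionary shaped like `model._results` (printed in the next cell). The helper `pick_best` is hypothetical and is not the notebook's `ranking()` implementation; the dictionary is a partial copy limited to the three targets whose `best_score` is fully visible for both algorithms. Because the scores are negative, the higher value (closer to zero) is treated as the better one.

```python
# Partial copy of `model._results`: only targets with a complete
# best_score for both algorithms in the printed output are included.
results = {
    "DecisionTree": {
        "NU_NOTA_CH": {"best_score": -71.36979527895173},
        "NU_NOTA_CN": {"best_score": -66.06008092254851},
        "NU_NOTA_LC": {"best_score": -54.38656482726235},
    },
    "ElasticNet": {
        "NU_NOTA_CH": {"best_score": -69.47536198096927},
        "NU_NOTA_CN": {"best_score": -64.47326888352583},
        "NU_NOTA_LC": {"best_score": -52.89958269748065},
    },
}


def pick_best(results):
    """Hypothetical helper: map each target to the algorithm with the highest best_score."""
    best = {}
    for algorithm, targets in results.items():
        for target, info in targets.items():
            current = best.get(target)
            # Replace the current choice only when the new score is strictly higher.
            if (
                current is None
                or info["best_score"] > results[current][target]["best_score"]
            ):
                best[target] = algorithm
    return best


ranking = pick_best(results)
print(ranking)
```

For these three targets the sketch selects ElasticNet, which agrees with the mapping printed by `model.ranking()` above.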

In [7]:
model._results

{'DecisionTree': {'NU_NOTA_CH': {'best_params': {'max_depth': 70,
    'min_samples_leaf': 100},
   'best_score': -71.36979527895173},
  'NU_NOTA_CN': {'best_params': {'max_depth': 100, 'min_samples_leaf': 100},
   'best_score': -66.06008092254851},
  'NU_NOTA_LC': {'best_params': {'max_depth': 100, 'min_samples_leaf': 100},
   'best_score': -54.38656482726235},
  'NU_NOTA_MT': {'best_params': {'max_depth': 100, 'min_samples_leaf': 100},
   'best_score': -93.01160204110991},
  'NU_NOTA_REDACAO': {'best_params': {'max_depth': 90,
    'min_samples_leaf': 100},
   'best_score': -140.58504289884291}},
 'ElasticNet': {'NU_NOTA_CH': {'best_params': {'alpha': 0.001, 'l1_ratio': 0},
   'best_score': -69.47536198096927},
  'NU_NOTA_CN': {'best_params': {'alpha': 0.001, 'l1_ratio': 0},
   'best_score': -64.47326888352583},
  'NU_NOTA_LC': {'best_params': {'alpha': 0.001, 'l1_ratio': 0},
   'best_score': -52.89958269748065},
  'NU_NOTA_MT': {'best_params': {'alpha': 0.001, 'l1_ratio': 0},
   'best