# <div style="color:#fff;display:fill;border-radius:10px;background-color:#000000;text-align:left;letter-spacing:0.1px;overflow:hidden;padding:20px;color:white;overflow:hidden;margin:0;font-size:100%"> - | Notebook summary</div>

<p style="font-size:15px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
Hello Kagglers! I wanted to share another implementation of a trainer class, like the one in my other Titanic notebook <a href="https://www.kaggle.com/code/maxdiazbattan/titanic-top-5-competition-class-v1-blending">[link]</a>. The idea came to me while I was studying PyTorch and using Tez (a PyTorch trainer): I wanted to build something similar. In this update I changed how the trainer class receives the models by creating a nested class that holds all of them. I really like the final result. Greetings to all! </p>


# <div style="color:#fff;display:fill;border-radius:10px;background-color:#000000;text-align:left;letter-spacing:0.1px;overflow:hidden;padding:20px;color:white;overflow:hidden;margin:0;font-size:100%"> - | Table of contents</div>

* [1-Libraries](#section-one)
* [2-Data loading](#section-two)
* [3-Folds creation](#section-three)
* [4-Exploratory data analysis (EDA)](#section-four)
* [5-Feature engineering](#section-five)
* [6-Feature selection](#section-six)
* [7-Modeling](#section-seven)

<a id="section-one"></a>
# <div style="color:#fff;display:fill;border-radius:10px;background-color:#000000;text-align:left;letter-spacing:0.1px;overflow:hidden;padding:20px;color:white;overflow:hidden;margin:0;font-size:100%"> 1 | Libraries</div>

In [None]:
import numpy as np 
import pandas as pd 
from sklearn import model_selection, preprocessing, pipeline, metrics, impute, compose
import matplotlib.pyplot as plt
import seaborn as sns

In [None]:
import warnings
warnings.filterwarnings('ignore')

<a id="section-two"></a>
# <div style="color:#fff;display:fill;border-radius:10px;background-color:#000000;text-align:left;letter-spacing:0.1px;overflow:hidden;padding:20px;color:white;overflow:hidden;margin:0;font-size:100%"> 2 | Data loading</div>

In [None]:
train = pd.read_csv('../input/titanic/train.csv')
test = pd.read_csv('../input/titanic/test.csv')
submission = pd.read_csv('../input/titanic/gender_submission.csv')

<a id="section-three"></a>
# <div style="color:#fff;display:fill;border-radius:10px;background-color:#000000;text-align:left;letter-spacing:0.1px;overflow:hidden;padding:20px;color:white;overflow:hidden;margin:0;font-size:100%"> 3 | Folds creation</div>

<p style="font-size:15px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
It is usually recommended to split the data into folds first. </p>


In [None]:
train.Survived.value_counts()

<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
The classes are imbalanced, so I'm going to use stratified k-fold.</p>

In [None]:
kf = model_selection.StratifiedKFold(n_splits=5) 
train['kfold'] = -1
test['kfold'] = -1
def kfold (df):
    df = df.copy()
    # Shuffling the data
    df = df.sample(frac=1.0, random_state=0).reset_index(drop=True)
    for fold, (train_idx, test_idx) in enumerate(kf.split(X = df, y=df.Survived)):
        df.loc[test_idx, 'kfold'] = fold
        
    return df

In [None]:
train = kfold(train)
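
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
A quick way to see what stratification buys is to compare the positive rate of each fold with the overall rate. The sketch below uses a synthetic target (the 38% positive rate and the <code>demo</code> frame are assumptions for illustration, not the Titanic data).</p>

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import StratifiedKFold

# Synthetic stand-in for the target: roughly 38% positives (assumed ratio)
rng = np.random.default_rng(0)
demo = pd.DataFrame({"Survived": (rng.random(1000) < 0.38).astype(int)})
demo["kfold"] = -1

# Assign a fold number to every row, keeping the class ratio in each fold
skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (_, valid_idx) in enumerate(skf.split(X=demo, y=demo["Survived"])):
    demo.loc[valid_idx, "kfold"] = fold

# Positive rate per fold; each one should sit very close to the overall rate
fold_rates = demo.groupby("kfold")["Survived"].mean()
overall_rate = demo["Survived"].mean()
print(fold_rates.round(3))
print(f"overall = {overall_rate:.3f}")
```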

<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
For a better analysis I'm going to concatenate the two dataframes.</p>

In [None]:
combined_df = pd.concat([train,test], axis = 0)

<a id="section-four"></a>
# <div style="color:#fff;display:fill;border-radius:10px;background-color:#000000;text-align:left;letter-spacing:0.1px;overflow:hidden;padding:20px;color:white;overflow:hidden;margin:0;font-size:100%"> 4 | Exploratory data analysis (EDA)</div>

In [None]:
combined_df.describe().T.style.bar(subset=['mean'], color='#205ff2')\
                            .background_gradient(subset=['std'], cmap='Reds')\
                            .background_gradient(subset=['50%'], cmap='coolwarm')

In [None]:
categoric_features = [feature for feature in train.columns if train[feature].dtype =='O']
numeric_features = [feature for feature in train.columns if feature not in categoric_features+['kfold']]

<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
Checking the number of categories in each feature</p>

In [None]:
{feature: len(train[feature].unique()) for feature in train.select_dtypes('object')}

<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
Besides these categorical features, we also have Pclass.</p>

In [None]:
sns.set_style ('darkgrid')
sns.palplot(sns.color_palette('rainbow'))
sns.set_palette('rainbow')

In [None]:
plt.figure (figsize = (20,15))
for i, feature in enumerate (numeric_features):
    plt.subplot (4,2, i*1 + 1 )
    sns.histplot (data = train, x = train[feature], hue='Survived')

<div class="alert alert-info" style="border-radius:5px; font-size:15px; font-family:verdana; line-height: 1.7em">
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px ">   
<b> Insights: </b> Almost half of the first-class passengers survived, whereas fewer than a third of the third-class passengers did. Traveling alone gives almost a 50% chance of survival. Age is slightly right-skewed and Fare much more so, which makes Fare a candidate for a log transformation.</p>
</div>
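
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
To illustrate why a log transformation helps with a right-skewed feature, here is a minimal sketch on synthetic log-normal values (the <code>fare_demo</code> series is made up, it is not the real Fare column). It compares the skewness before and after <code>np.log1p</code>.</p>

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic, heavily right-skewed stand-in for Fare (log-normal draws, not real data)
fare_demo = pd.Series(rng.lognormal(mean=2.5, sigma=1.0, size=2000), name="Fare")

# log1p computes log(1 + x), which stays defined when a value is exactly 0
fare_log = np.log1p(fare_demo)

skew_before = fare_demo.skew()
skew_after = fare_log.skew()
print(f"skew before = {skew_before:.2f}, after = {skew_after:.2f}")
```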

In [None]:
plt.figure (figsize = (20,15))
for i, feature in enumerate (numeric_features):
    plt.subplot (4,2, i*1 + 1 )
    sns.boxplot (data = train, x = train[feature])

<div class="alert alert-info" style="border-radius:5px; font-size:15px; font-family:verdana; line-height: 1.7em">
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
<b> Insights: </b> There are some outliers in the Age, SibSp, Parch, and Fare columns.</p>
</div>

In [None]:
missing = (combined_df.isna().mean() * 100).round(2).sort_values(ascending=False)

In [None]:
missing

In [None]:
plt.figure(figsize=(12,6))
sns.barplot(x = missing.index, y = missing.values, edgecolor='black',linewidth=2)
plt.ylabel('% of Missing' ,weight='bold', size=13)
plt.title('Missing data',weight='bold', size=14);

<div class="alert alert-info" style="border-radius:5px; font-size:15px; font-family:verdana; line-height: 1.7em">
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; border: ">   
<b> Insights: </b> Cabin is the feature with the most missing values: almost 78% of its data is missing. </p>
</div>

In [None]:
plt.figure(figsize=(12,6))
sns.countplot(x = train["Survived"], hue = "Sex", data=train, edgecolor='black',linewidth=2)
plt.ylabel('Number of people' ,weight='bold', size=13)
plt.xlabel('Survived' ,weight='bold', size=13)
plt.title('Survival count by Gender',weight='bold', size=14);

In [None]:
plt.figure (figsize = (12,6))
sns.barplot(x = 'Sex', y ='Survived', data = train, edgecolor='black',linewidth=2);
plt.ylabel('Survival Probability' ,weight='bold', size=13)
plt.xlabel('Sex' ,weight='bold', size=13)
plt.title('Survival Probability by Gender',weight='bold', size=14);

<div class="alert alert-info" style="border-radius:5px; font-size:15px; font-family:verdana; line-height: 1.7em">
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
<b> Insights: </b> Women are by far more likely to survive. </p>
</div>

In [None]:
plt.figure(figsize=(12,6))
sns.countplot(x = train["Survived"], hue = "Pclass", data=train, edgecolor='black',linewidth=2)
plt.ylabel('Number of people',weight='bold', size=13)
plt.title('Survival count by Passenger class',weight='bold', size=14);

<div class="alert alert-info" style="border-radius:5px; font-size:15px; font-family:verdana; line-height: 1.7em">
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
<b> Insights: </b> Rich people (first class) also have a better chance of surviving. </p>
</div>

In [None]:
plt.figure(figsize=(12,6))
sns.countplot(x = train["Survived"], hue = "SibSp", data=train, edgecolor='black',linewidth=2)
plt.ylabel('Number of people',weight='bold', size=13)
plt.title('Survival count by siblings and spouses',weight='bold', size=14);

In [None]:
plt.figure(figsize=(12,6))
sns.countplot(x = train["Survived"], hue = "Parch", data=train, edgecolor='black',linewidth=2)
plt.xlabel('Survived',weight='bold', size=13)
plt.ylabel('Number of people',weight='bold', size=13)
plt.title('Survival count by Parch',weight='bold', size=14);

In [None]:
plt.figure(figsize=(12,6))
sns.swarmplot(data=train, x=(train['SibSp'] + train['Parch']), y=train['Fare'], hue=train['Pclass'])
plt.xlabel('Family Size',weight='bold', size=13)
plt.ylabel('Fare amount',weight='bold', size=13)
plt.title('Fare by family size and passenger class',weight='bold', size=14);

<div class="alert alert-info" style="border-radius:5px; font-size:15px; font-family:verdana; line-height: 1.7em">
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
<b> Insights: </b> Traveling alone or with just one SibSp or Parch gives the highest chance of survival. This may also be because the smaller families include more first- and second-class passengers. </p>
</div>

In [None]:
plt.figure(figsize=(12,6))
sns.boxplot(x = combined_df["Pclass"], y = combined_df["Age"], data = combined_df)
plt.xlabel('Pclass',weight='bold', size=13)
plt.ylabel('Age',weight='bold', size=13)
plt.title('Age by Passenger class',weight='bold', size=14);

<div class="alert alert-info" style="border-radius:5px; font-size:15px; font-family:verdana; line-height: 1.7em">
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
<b> Insights: </b> The "oldest" passengers tend to be the richest (first class). </p>
</div>

<a id="section-five"></a>
# <div style="color:#fff;display:fill;border-radius:10px;background-color:#000000;text-align:left;letter-spacing:0.1px;overflow:hidden;padding:20px;color:white;overflow:hidden;margin:0;font-size:100%"> 5 | Feature engineering </div>

In [None]:
def preprocessing_inputs (df):
    df = df.copy()
    
    # Feature Engineering:
    
    # Name
    # Extracting the Name feature and creating a new feature just with the title 
    df['Title'] = df['Name'].apply(lambda x: x.split('.')[0]).apply(lambda x : x.split(',')[1])
    
    # Age
    # Creating a flag if the Age value is null
    df['AgeFlag'] = df['Age'].map(lambda x: 1 if pd.isnull(x) else 0)  
    
    # Filling the NA values (plain assignment instead of a chained inplace call)
    df['Age'] = df['Age'].fillna(df['Age'].median())
    # Creating bins from the Age feature
    df['AgeBin'] = pd.cut(df['Age'].astype(int), 5, labels=False)
    
    # SibSp & Parch
    # Math transform on Sib and Parch
    df['Family'] = df['SibSp'] + df['Parch']
    
    # Ticket
    # Extracting the first digit of the ticket number
    df['TicketNumber'] = df['Ticket'].apply(lambda x: x.split(' ')).apply(lambda x : x[1] if len (x) > 1 else x[0]).apply(lambda x: x[0])
    # Tickets without a usable number (e.g. 'LINE') end up as a letter; map those to -1
    df['TicketNumber'] = df['TicketNumber'].replace({'LINE': -1, 'SC/AH Basle 541': -1, 'L':-1, 'B':-1})
    df['TicketNumber'] = df['TicketNumber'].astype(int)
    # Creating a flag if the Ticket value is purely numeric (no letter prefix)
    df['TicketFlag'] = df['Ticket'].apply(lambda x: 1 if x.isnumeric() else 0)
    # Extracting the ticket prefix (everything before the number), without dots and slashes
    df['TicketCode'] = df['Ticket'].apply(lambda x: ''.join(x.split(' ')[:-1]).replace('.','').replace('/','') if len(x.split(' ')[:-1]) > 0 else 'None')
    
    # Fare
    # Creating a feature by splitting the Fare in 3 different classes
    # (the quartiles are computed once instead of once per row)
    fare_q25, fare_q75 = df['Fare'].quantile(0.25), df['Fare'].quantile(0.75)
    df['SocialClassByFare'] = df['Fare'].apply(lambda x : 'Rich' if x > fare_q75 else ( 'Poor' if x < fare_q25 else 'Midd' ))
    # Creating bins from the Fare feature
    df['FareBin'] = pd.qcut(df['Fare'], 4, labels=False)
    
    # Cabin
    # Extracting the first letter on the Cabin feature
    df['CabinCode'] = df['Cabin'].apply(lambda x : str(x)).apply(lambda x: 'U' if x == 'nan' else x[0])
    # Extracting the length of the Cabin feature
    df['CabinLen'] = df['Cabin'].apply(lambda x: 0 if pd.isna(x) else len(x.split(' ')))
    # Creating a flag if the Cabin value is null
    df['CabinFlag'] = df['Cabin'].map(lambda x: 1 if pd.isnull(x) else 0)
    #df.drop('Cabin', inplace=True)
    
    # Embarked
    # Replacing the null values on Embarked by U
    df['Embarked'] = df['Embarked'].apply(lambda x : str(x)).apply(lambda x: 'U' if x == 'nan' else x)
    
    # Split the dataframe
    train = df.query("kfold != -1").copy()
    train['Survived'] = train['Survived'].astype(int)
    
    test = df.query("kfold == -1").copy()
    test.drop(['Survived', 'kfold'], axis = 1, inplace=True)
    
    return train, test

In [None]:
train_df, test_df = preprocessing_inputs(combined_df)
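
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
As an illustration of the Title step, here is a small sketch that pulls the title out of a few sample names with a regular expression and trims the surrounding whitespace. The <code>names</code> series is a hand-written sample, and the regex is one possible alternative to the split-based approach, not the code used above.</p>

```python
import pandas as pd

# A few names in the "Surname, Title. Given names" layout of the Name column
names = pd.Series([
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    "Heikkinen, Miss. Laina",
])

# Capture the text between the comma and the first period, then trim spaces
titles = names.str.extract(r",\s*([^.]+)\.", expand=False).str.strip()
print(titles.tolist())  # ['Mr', 'Mrs', 'Miss']
```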

<a id="section-six"></a>
# <div style="color:#fff;display:fill;border-radius:10px;background-color:#000000;text-align:left;letter-spacing:0.1px;overflow:hidden;padding:20px;color:white;overflow:hidden;margin:0;font-size:100%"> 6 | Feature selection</div>

<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
 It's a very small dataset, so feature selection is not that important here, but I think it's good practice to apply it anyway for educational purposes. </p>

<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">  
Mutual Information </p>

In [None]:
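xm = train.copy()
# dropna() removes every row that has any missing value (mostly because of Cabin),
# so the scores below are estimated on a small subset of the training data
xm.dropna(inplace = True)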

ym = xm.pop("Survived").copy()

# Label encoding for categoricals
for colname in xm.select_dtypes("object"):
    xm[colname], _ = xm[colname].factorize()

discrete_features = xm.dtypes == int

In [None]:
from sklearn.feature_selection import mutual_info_classif

def make_mi_scores(X, y, discrete_features):
    mi_scores = mutual_info_classif(X, y, discrete_features=discrete_features, n_neighbors = 5)
    mi_scores = pd.Series(mi_scores, name="MI Scores", index=X.columns)
    mi_scores = mi_scores.sort_values(ascending=False)
    return mi_scores

In [None]:
mi_scores = make_mi_scores(xm, ym, discrete_features)

In [None]:
def plot_mi_scores(scores):
    scores = scores.sort_values(ascending=True)
    width = np.arange(len(scores))
    ticks = list(scores.index)
    plt.barh(width, scores)
    plt.yticks(width, ticks)
    plt.title("Mutual Information Scores")


plt.figure(dpi=100, figsize=(8, 5))
plot_mi_scores(mi_scores)

<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">  
Permutation feature importance </p>

In [None]:
from sklearn.ensemble import RandomForestClassifier

# Hold out a validation split from the label-encoded frame built for the MI scores
train_X, val_X, train_y, val_y = model_selection.train_test_split(xm, ym, random_state=1)
my_model = RandomForestClassifier(n_estimators=100,
                                  random_state=0).fit(train_X, train_y)

In [None]:
import eli5
from eli5.sklearn import PermutationImportance

perm = PermutationImportance(my_model, random_state=1).fit(val_X, val_y)
eli5.show_weights(perm, top=10, feature_names = val_X.columns.tolist())

<div class="alert alert-info" style="border-radius:5px; font-size:15px; font-family:verdana; line-height: 1.7em">
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
<b> Insights: </b> For some reason, both mutual information and permutation feature importance rank PassengerId as relevant; most likely there is some leakage. Sex, Name (which is related to sex and social class), and Age are the three most relevant features. </p>
</div>
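
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
One way to sanity-check an importance ranking is to run the same procedure on data where the answer is known. The sketch below is not the eli5 workflow used above: it swaps in scikit-learn's own <code>permutation_importance</code> and uses synthetic data with one informative column (<code>signal</code>) and one arbitrary row identifier (<code>row_id</code>), both made up for illustration.</p>

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 600
# Synthetic data: one informative column and one arbitrary row identifier
signal = rng.normal(size=n)
row_id = np.arange(n, dtype=float)
X_demo = np.column_stack([signal, row_id])
y_demo = (signal + 0.3 * rng.normal(size=n) > 0).astype(int)

X_tr, X_va, y_tr, y_va = train_test_split(X_demo, y_demo, random_state=1)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Shuffle each validation column several times and record the drop in accuracy
result = permutation_importance(forest, X_va, y_va, n_repeats=10, random_state=1)
for name, mean, std in zip(["signal", "row_id"],
                           result.importances_mean,
                           result.importances_std):
    print(f"{name}: {mean:.3f} +/- {std:.3f}")
```

<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
Looking at the spread across repeats, and not only the mean, helps to tell a real effect from noise, especially on a small validation set.</p>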

<a id="section-seven"></a>
# <div style="color:#fff;display:fill;border-radius:10px;background-color:#000000;text-align:left;letter-spacing:0.1px;overflow:hidden;padding:20px;color:white;overflow:hidden;margin:0;font-size:100%"> 7 | Modeling</div>

In [None]:
import xgboost as xgb
import catboost as cb
import lightgbm as lgb
from sklearn import linear_model

In [None]:
features = ['Pclass', 'Sex', 'Age', 'Ticket', 'Embarked', 'Title',
            'AgeFlag', 'AgeBin', 'Family', 'TicketNumber', 'TicketFlag','TicketCode', 
            'SocialClassByFare', 'FareBin', 'CabinCode', 'CabinLen', 'CabinFlag']

In [None]:
categoric_features = [feature for feature in train_df[features] if train_df[feature].dtype =='O']
numeric_features = [feature for feature in train_df[features] if feature not in categoric_features+['PassengerId','kfold','Survived']]  

In [None]:
ordinal_features = [feature for feature in train_df[categoric_features].columns if len(train_df[feature].unique()) <= 3 and feature not in ['Survived']]
high_card_features = [feature for feature in train_df[categoric_features].columns if len(train_df[feature].unique()) > 3 and feature not in ['Survived','PassengerId']+ordinal_features]


<p style="font-size:20px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
Trainer class </p>

In [None]:
class Trainer:
    
    """
    Args:
        - model: Any ML model to train.
        - model_name: The corresponding model name to be used to identify it in the training process.
        - fold: Fold number.
        - model_params: Hyperparameters of the respective model.
        
    """
    
    def __init__(self, model, model_name, fold, model_params=None):
     
        self.model_ = model
        self.model_name = model_name
        self.fold = fold
        self.model_params = model_params
        
        self.test_preds = []
              
    def fit(self, xtrain, ytrain, xvalid, yvalid):
        
        """
        Fits an instance of the model for a particular dataset.
        Args:
            - xtrain: Train data.
            - ytrain: Train target.
            - xvalid: Validation data.
            - yvalid: Validation target.
        """
        self.xtrain = xtrain
        self.ytrain = ytrain
        self.xvalid = xvalid
        self.yvalid = yvalid
        
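        # Forward the hyperparameters so that model_params is actually used
        self.model_.fit(self.xtrain, self.ytrain, self.xvalid, self.yvalid, params=self.model_params or {})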
        
        return self.model_
        
    def pred_evaluate(self, xtest):
        
        """
        Makes predictions for each model on the test data provided.
        Args:
            - xtest: Test data.
        """
        
        self.xtest = xtest
        
        self.preds_valid = self.model_.predict(self.xvalid)
        self.preds_test = self.model_.predict(self.xtest)
        
        self.test_preds.append(self.preds_test)  # list.append returns None, so there is nothing to assign
        score = metrics.accuracy_score(self.yvalid, self.preds_valid)      
        
        print(f'fold = {self.fold}, score = {score:.4f}')
    
    def blend(self, models):
        
        """
        Makes a blend of the trained models.
        Args:
            - models: Models to blend (dtype=list).
        """
        
        predictions = []
        for m in models:
            preds_test = m.predict(self.xtest)
            predictions.append(preds_test)
        
        # Majority vote: predict 1 when at least half of the models predict 1
        fin_preds = (np.mean(np.column_stack (predictions), axis=1) >= 0.5).astype(int)
        return fin_preds
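
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
When hard 0/1 predictions are averaged, the way the mean is turned back into a label matters. The sketch below uses three hypothetical prediction vectors (<code>preds_a</code>, <code>preds_b</code>, <code>preds_c</code>) to compare a plain integer cast with a 0.5 threshold.</p>

```python
import numpy as np

# Hard 0/1 predictions from three hypothetical models for four passengers
preds_a = np.array([1, 1, 0, 0])
preds_b = np.array([1, 0, 0, 1])
preds_c = np.array([1, 1, 0, 0])

# Per-passenger share of models voting 1: 1, 2/3, 0, 1/3
mean_vote = np.mean(np.column_stack([preds_a, preds_b, preds_c]), axis=1)

truncated = mean_vote.astype(int)          # 1 only when every model predicts 1
majority = (mean_vote >= 0.5).astype(int)  # 1 when at least half of the models predict 1

print(truncated)  # [1 0 0 0]
print(majority)   # [1 1 0 0]
```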

<p style="font-size:20px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
Models class</p>

In [None]:
class Models():
    
    """
    Nested class to wrap all the models.
    """
    
    class XGBModel():
        """
        XGBoost model implementation.
        """
        
        def fit(self, X_train, y_train, X_valid, y_valid, params={}):
            """
            Fits an instance of the model on the supplied data.
            Args:
                - X_train: Train data.
                - y_train: Train target.
                - X_valid: Validation data.
                - y_valid: Validation target.
            """

            self.model_ = xgb.XGBClassifier(objective="binary:logistic",  # classification objective
                                            eval_metric='logloss',
                                            use_label_encoder=False,
                                            random_state=42,
                                            **params
                                        ) 

            self.model_.fit(X_train, 
                            y_train, 
                            early_stopping_rounds=20, 
                            eval_set=[(X_valid, y_valid)], 
                            verbose=False
                            )

            return self.model_

        def predict(self, dataset):

            if self.model_ is None:
                return None

            return self.model_.predict(dataset)
    
    class LGBMModel():
        """
        LGBM model implementation.
        """
        def fit(self, X_train, y_train, X_valid, y_valid, params={}):
            """
            Fits an instance of the model on the supplied data.
            Args:
                - X_train: Train data.
                - y_train: Train target.
                - X_valid: Validation data.
                - y_valid: Validation target.
            """
                        
            self.model_ = lgb.LGBMClassifier(objective='binary', 
                                             random_state=42,
                                             **params
                                             )

            self.model_.fit(X_train, 
                            y_train, 
                            early_stopping_rounds=20, 
                            eval_set=[(X_valid, y_valid)], 
                            verbose=False
                           )

            return self.model_

        def predict(self, dataset):

            if self.model_ is None:
                return None

            return self.model_.predict(dataset)
    
    class CTBModel():
        """
        Catboost model implementation.
        """
        def fit(self, X_train, y_train, X_valid, y_valid, params={}):
            """
            Fits an instance of the model on the supplied data.
            Args:
                - X_train: Train data.
                - y_train: Train target.
                - X_valid: Validation data.
                - y_valid: Validation target.
            """
            
            self.model_ = cb.CatBoostClassifier(random_state=42, **params) 

            self.model_.fit(X_train, 
                            y_train, 
                            early_stopping_rounds=20, 
                            eval_set=[(X_valid, y_valid)], 
                            verbose=False
                           )

            return self.model_

        def predict(self, dataset):

            if self.model_ is None:
                return None

            return self.model_.predict(dataset)

    @staticmethod
    def __iter__():
        """
        Iterate over the class attributes (Models) 
        """
        return iter([[getattr(Models, attr), attr] for attr in dir(Models) if not attr.startswith("__")])

In [None]:
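# The training loop pairs parameters with models by position, so the keys follow
# the alphabetical order in which dir(Models) lists the nested classes
models_params = {
                 'CTBModel': {'iterations':1000},
                 'LGBMModel': {'n_estimators':1000, 'max_depth':5},
                 'XGBModel': {'n_estimators':1000, 'max_depth':5},
                }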
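
<p style="font-size:18px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
Pairing models and parameters with <code>zip</code> only works while both sequences are in the same order. A possible alternative is to look the parameters up by class name. The sketch below uses a hypothetical <code>Registry</code> class with empty placeholder models, so only the lookup logic is shown.</p>

```python
# Hypothetical stand-ins for the model wrappers; only their names matter here
class Registry:
    class CTBModel: ...
    class LGBMModel: ...
    class XGBModel: ...

# The order of the keys no longer matters
params_by_name = {
    "XGBModel": {"n_estimators": 1000, "max_depth": 5},
    "CTBModel": {"iterations": 1000},
    "LGBMModel": {"n_estimators": 1000, "max_depth": 5},
}

# Collect the nested classes, then look up the parameters of each one by name
model_classes = {name: obj for name, obj in vars(Registry).items()
                 if isinstance(obj, type)}
paired = {name: params_by_name.get(name, {}) for name in model_classes}
print(paired)
```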

<p style="font-size:20px; font-family:verdana; line-height: 1.7em; margin-left:20px">   
Training loop </p>

In [None]:
models_trained = []

for mdls, params  in zip(Models().__iter__(), models_params.items()):
    
    # Instantiating the main class
    models = mdls[0]()
    name = mdls[1]
    
    # Splitting the models_params dict in name and values to feed the models
    params_name = params[0]
    params_vals = params[1]

    print(f' Model {name}')
    for fold in range(5):
        
        X_train = train_df[train_df.kfold != fold].reset_index(drop=True)
        X_valid = train_df[train_df.kfold == fold].reset_index(drop=True)
        
        X_test = test_df.copy()

        y_train = X_train['Survived']
        y_valid = X_valid['Survived']
        
        # Scaling
        scl = preprocessing.StandardScaler()
        X_train[numeric_features] = scl.fit_transform(X_train[numeric_features])
        X_valid[numeric_features] = scl.transform(X_valid[numeric_features])
        X_test[numeric_features] = scl.transform(X_test[numeric_features])
        
        # Imputing
        imp = impute.SimpleImputer(strategy='mean')
        X_train[numeric_features] = imp.fit_transform(X_train[numeric_features])
        X_valid[numeric_features] = imp.transform(X_valid[numeric_features])
        X_test[numeric_features] = imp.transform(X_test[numeric_features])

        # Encoding
            # Ordinal
        ord_enc = preprocessing.OrdinalEncoder()
        X_train[ordinal_features] = ord_enc.fit_transform(X_train[ordinal_features])
        X_valid[ordinal_features] = ord_enc.transform(X_valid[ordinal_features])
        X_test[ordinal_features] = ord_enc.transform(X_test[ordinal_features])
        
            # OHE
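        # Written for scikit-learn >= 1.2; older versions use sparse=False and get_feature_names
        ohe = preprocessing.OneHotEncoder(sparse_output=False, handle_unknown='ignore').fit(X_train[high_card_features])
        encoded_cols = list(ohe.get_feature_names_out(high_card_features))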
    
        X_train[encoded_cols] = ohe.transform(X_train[high_card_features])
        X_valid[encoded_cols] = ohe.transform(X_valid[high_card_features])
        X_test[encoded_cols] = ohe.transform(X_test[high_card_features])
        
        # Preprocessed dataframes
        X_train = X_train[numeric_features+ordinal_features+encoded_cols]
        X_valid = X_valid[numeric_features+ordinal_features+encoded_cols]
        X_test = X_test[numeric_features+ordinal_features+encoded_cols]
        
        # Trainer class initialization
        trainer = Trainer(model=models, model_name=name,fold=fold, model_params=params_vals)
        
        # Fit the trainer
        model_trained = trainer.fit(X_train, y_train, X_valid, y_valid)
        trainer.pred_evaluate(X_test)
    print()
        
    models_trained.append(model_trained)
    blend = trainer.blend(models_trained)

In [None]:
submission.Survived = blend

In [None]:
submission.to_csv('submission.csv', index=False)

<p style="font-size:20px; font-family:verdana; line-height: 1.7em">   
Future work: try adding more features. Thanks for reading my notebook. Greetings! </p>