## Description <a name="introduction"></a>

Below is my solution. I used LightGBM and initially experimented with three different models: Random Forest, the default GBDT, and Gradient-based One-Side Sampling (GOSS). GBDT produced the best AUC, followed by GOSS and then Random Forest. Although GBDT alone yielded the best AUC, I decided to stack the three diverse models to achieve a higher AUC. The second-stage model is a logistic regression, which generates the submission predictions.
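
As a rough illustration of how such an AUC comparison can be made, the sketch below scores three sets of model outputs with `roc_auc_score`. The labels and scores are synthetic, and the names `noise`, `scores` and `auc` are hypothetical; it does not reproduce the competition numbers.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic binary labels standing in for the "claim" column
y_true = rng.integers(0, 2, size=2000)

# Hypothetical model scores: the true label plus Gaussian noise.
# More noise means a weaker model; these are NOT the notebook's real predictions.
noise = {"gbdt": 0.8, "goss": 1.0, "rf": 1.5}
scores = {name: y_true + rng.normal(0.0, sd, size=y_true.size)
          for name, sd in noise.items()}

# AUC depends only on the ranking of the scores, so they need not be probabilities
auc = {name: roc_auc_score(y_true, s) for name, s in scores.items()}
for name, value in sorted(auc.items(), key=lambda kv: -kv[1]):
    print(f"{name}: AUC = {value:.4f}")
```

In the notebook itself, the same call could be applied to the out-of-fold predictions of each base model once they are available.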

In [None]:
import pandas as pd
import numpy as np
import sklearn.model_selection as ms
from sklearn.metrics import accuracy_score, roc_auc_score
from lightgbm import LGBMClassifier
from collections import namedtuple
from sklearn.linear_model import LogisticRegression
from sklearn.linear_model import SGDClassifier


In [None]:
import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))


folds = 10
train = pd.read_csv("/kaggle/input/tabular-playground-series-sep-2021/train.csv")
test = pd.read_csv("/kaggle/input/tabular-playground-series-sep-2021/test.csv")
x = pd.DataFrame(train.drop(columns=["claim", "id"]))
y = train["claim"]


In [None]:

# x, weight_x, y, weight_y = ms.train_test_split(X, Y, test_size=.05, shuffle=True, random_state=0)

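# Test features (the "id" column is not a feature)
xval = test.drop(columns=["id"])
# Stratified folds keep the class ratio of "claim" the same in every split
df_split = ms.StratifiedKFold(n_splits=folds, shuffle=True)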

## Hyperparameters
The parameters were determined using randomized search and grid search. The best parameters found for the three models are listed below.
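
The search itself is not part of this notebook. A minimal sketch of a randomized search is shown here, under several assumptions: the data is synthetic, the search space is hypothetical, and scikit-learn's `GradientBoostingClassifier` is used as a stand-in for `LGBMClassifier` so that the example depends only on scikit-learn and SciPy.

```python
from scipy.stats import loguniform, randint
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold

# Small synthetic problem; the real notebook uses the competition's train.csv
X_demo, y_demo = make_classification(n_samples=400, n_features=10, random_state=0)

# Hypothetical search space, chosen only for illustration
param_distributions = {
    "n_estimators": randint(20, 60),
    "max_leaf_nodes": randint(4, 32),
    "min_samples_leaf": randint(5, 50),
    "learning_rate": loguniform(1e-2, 3e-1),
}

search = RandomizedSearchCV(
    estimator=GradientBoostingClassifier(random_state=0),  # stand-in estimator
    param_distributions=param_distributions,
    n_iter=5,            # number of sampled parameter settings
    scoring="roc_auc",   # the metric discussed in this notebook
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
    random_state=0,
)
search.fit(X_demo, y_demo)

print(search.best_params_)
print(f"best CV AUC: {search.best_score_:.4f}")
```

A grid search can then refine the result by evaluating a small grid around `search.best_params_`.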

In [None]:
dt = namedtuple("dt", "model_ best_para")
para = []

para.append(dt(model_="goss", best_para={"boosting_type":"goss", "objective": "cross_entropy", "n_estimators": 878,
                                         "lambda_l1": 0.02119367084330647, 
                                         "lambda_l2": 9.259284311814404e-05, 
                                         "num_leaves": 85, "min_child_samples": 42, "verbose": -1}))
para.append(dt(model_="rf", best_para={"boosting_type":"rf","n_estimators": 327,
                                       "lambda_l1": 0.00012043760866269098, 
                                       "lambda_l2": 6.649019338833096e-06, 
                                       "num_leaves": 246, "min_child_samples": 99,
                                       "feature_fraction": 0.7734184326473208,
                                       "bagging_fraction": 0.999835036473764, "bagging_freq": 3, "verbose": -1}))
para.append(dt(model_="gbdt", best_para={"boosting_type":"gbdt","n_estimators": 499, 
                                         "lambda_l1": 1.0450194511913434e-06,
                                         "lambda_l2": 2.2690854683431152e-07, 
                                         "num_leaves": 110, "min_child_samples": 14, 
                                         "feature_fraction": 0.7468626653258925,
                                         "bagging_fraction": 0.9944777742119832,
                                         "bagging_freq": 4, "verbose": -1}))

para = pd.DataFrame(para)

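# Arrays that hold the meta data
# meta_val: test-set predictions, one block of val_len rows per model, one column per fold
meta_val = np.zeros((len(xval.index) * len(para.index), folds))
# meta_val_ave: test-set predictions averaged over the folds, one column per model
meta_val_ave = np.zeros((len(xval.index), len(para.index)))
val_len = len(xval.index)

# train_meta: column 0 holds the target, columns 1.. hold the out-of-fold predictions
train_meta = np.zeros((len(x.index), len(para.index) + 1))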


## Stage 1 Base models
In stage 1, each base model is fitted on 9 of the 10 folds, and its predictions on the remaining fold become the meta data. Each fitted base model also predicts on the test set, which produces the meta predictions; these are averaged over the folds. Finally, the meta data is used to fit the meta model in stage 2, and the final predictions are generated from the averaged meta predictions of the base models.

In [None]:

start = 0
end = 0
for counter, (trn, val) in enumerate(df_split.split(x, y)):
    end += len(val)
    # Column 0 holds the true labels of the held-out fold
    train_meta[start:end, 0] = y.iloc[val].values

    for p in para.itertuples():
        model = LGBMClassifier(n_jobs=-1, **p.best_para)
        model.fit(x.iloc[trn, :], y.iloc[trn])
        # Out-of-fold predictions become the training features of the meta model
        train_meta[start:end, p.Index + 1] = model.predict_proba(x.iloc[val, :])[:, 1]
        # Test-set predictions of this fold's model, stored in the row block of model p
        meta_val[val_len * p.Index:val_len * (p.Index + 1), counter] = model.predict_proba(xval)[:, 1]
    start += len(val)

# Average each model's test-set predictions over all folds
for r in range(len(para.index)):
    mv = meta_val[val_len * r:val_len * (r + 1), :]
    meta_val_ave[:, r] = np.mean(mv, axis=1)
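
The row layout of `meta_val` is easy to misread, so here is a small NumPy-only sketch of the fold averaging. The sizes and the random values are hypothetical; only the array layout follows the cell above. It also shows a vectorized equivalent of the loop, offered as an alternative rather than as the notebook's method.

```python
import numpy as np

# Hypothetical small sizes, for illustration only
val_len, n_models, folds = 4, 3, 5

rng = np.random.default_rng(0)
# Same layout as meta_val: rows are stacked model by model, columns are folds
meta_val = rng.random((val_len * n_models, folds))

# Loop version: slice out each model's block of rows and average over the folds
meta_val_ave = np.zeros((val_len, n_models))
for r in range(n_models):
    block = meta_val[val_len * r:val_len * (r + 1), :]
    meta_val_ave[:, r] = block.mean(axis=1)

# Vectorized version: reshape to (model, test row, fold),
# average over the fold axis, then transpose to (test row, model)
vectorized = meta_val.reshape(n_models, val_len, folds).mean(axis=2).T

print(np.allclose(meta_val_ave, vectorized))
```

Both versions give one column of fold-averaged test predictions per base model, which is the shape the meta model expects.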



## Stage 2 Meta Model
The meta data is used to fit a logistic-regression-type model (an `SGDClassifier` with log loss). Predictions are then made from the base models' fold-averaged test-set predictions.

In [None]:

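# Logistic regression fitted with SGD; the "log" loss was renamed "log_loss" in scikit-learn 1.1
meta_model = SGDClassifier(max_iter=10000, loss='log_loss')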
meta_model.fit(train_meta[:, 1:], train_meta[:, 0])
pred = meta_model.predict_proba(meta_val_ave)[:, 1]
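
To make explicit what a logistic meta model does with the base predictions, the sketch below fits one on synthetic meta data and recomputes its output by hand. It uses `LogisticRegression` rather than the `SGDClassifier` from the cell above, and `y_meta`, `base_preds` and `manual` are hypothetical names with made-up values.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic stand-ins for the meta data: labels plus three columns of
# base-model probabilities (hypothetical values, not the notebook's output)
y_meta = rng.integers(0, 2, size=1000)
noise = rng.normal(0.0, [0.6, 0.8, 1.2], size=(1000, 3))
base_preds = 1.0 / (1.0 + np.exp(-(2.0 * y_meta[:, None] - 1.0 + noise)))

# Logistic regression used as the meta model in this sketch
meta = LogisticRegression()
meta.fit(base_preds, y_meta)

# The stacked probability is a sigmoid of a weighted sum of the base predictions
z = base_preds @ meta.coef_.ravel() + meta.intercept_[0]
manual = 1.0 / (1.0 + np.exp(-z))

print("weight per base model:", meta.coef_.ravel())
print(np.allclose(manual, meta.predict_proba(base_preds)[:, 1]))
```

The fitted weights indicate how much each base model contributes to the stacked prediction.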


In [None]:

# Submission file: one row per test id with the stacked probability
final = pd.DataFrame({"id": test["id"], "claim": pred})
final.to_csv("final.csv", index=False)

