# Tabular Playground - XGBoost

The go-to method for most tabular data problems is [XGBoost](https://xgboost.readthedocs.io/en/latest/), so let's see how it performs on this leaderboard.

# Dataset

In [None]:
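import numpy as np
import pandas as pd
from xgboost import XGBRegressor
# Import the submodules explicitly: a bare `import sklearn` does not guarantee
# that sklearn.model_selection and sklearn.metrics are loaded.
import sklearn.metrics
import sklearn.model_selection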

pd.options.display.max_rows = 6

In [None]:
train_df = pd.read_csv('../input/tabular-playground-series-jan-2021/train.csv', index_col='id')
test_df  = pd.read_csv('../input/tabular-playground-series-jan-2021/test.csv',  index_col='id')

columns = test_df.columns
X       = train_df[columns]
Y       = train_df['target']
X_train, X_valid, Y_train, Y_valid = sklearn.model_selection.train_test_split(X, Y, test_size=0.01, random_state=42)
X_test  = test_df[columns]

display('train_df')
display( train_df )
# display('test_df')
# display( test_df )

# XGBoost

To start with, let's just try out the XGBoost default settings.

Documentation:
- https://xgboost.readthedocs.io/en/latest/python/python_api.html#module-xgboost.sklearn
- https://xgboost.readthedocs.io/en/latest/parameter.html

In [None]:
xgb = XGBRegressor(
    n_jobs=-1,
    verbosity=0,
    random_state=42
)
xgb.fit(
    X_train, Y_train, 
    # eval_set=[(X_valid, Y_valid)]
)

In [None]:
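# Take the square root explicitly: the `squared=False` argument is deprecated
# in recent scikit-learn releases.
rmse = np.sqrt(sklearn.metrics.mean_squared_error(Y_valid, xgb.predict(X_valid)))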
print(rmse)
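
The score printed above is a root mean squared error. As a rough aid to interpreting it, the sketch below computes RMSE by hand in plain Python and compares it with a "predict the mean" baseline. The helper name `rmse_by_hand` and the toy values are hypothetical and are not taken from the competition data.

```python
import math

def rmse_by_hand(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y_true - y_pred) ** 2))."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical toy values for illustration only.
toy_true = [7.0, 8.0, 9.0]
toy_pred = [7.5, 8.0, 8.0]
model_rmse = rmse_by_hand(toy_true, toy_pred)  # sqrt(1.25 / 3), about 0.645

# Baseline: always predict the mean of the targets.
toy_mean = sum(toy_true) / len(toy_true)
baseline_rmse = rmse_by_hand(toy_true, [toy_mean] * len(toy_true))  # sqrt(2 / 3), about 0.816

print(f"model RMSE: {model_rmse:.3f}, mean-baseline RMSE: {baseline_rmse:.3f}")
```

A model is only useful if its validation RMSE is clearly below the mean baseline, which equals the (population) standard deviation of the validation targets.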

# Submission

In [None]:
predictions   = xgb.predict(X_test)

submission_df = pd.read_csv('../input/tabular-playground-series-jan-2021/sample_submission.csv', index_col='id')
submission_df['target'] = predictions
submission_df.to_csv('submission.csv')
!head submission.csv

# Further Reading

This notebook is part of a series exploring the [Tabular Playground](https://www.kaggle.com/c/tabular-playground-series-jan-2021):
- 0.72935 - [scikit-learn Ensemble](https://www.kaggle.com/jamesmcguigan/tabular-playground-scikit-learn-ensemble)
- 0.71423 - [Fast.ai Tabular Solver](https://www.kaggle.com/jamesmcguigan/fast-ai-tabular-solver)
- 0.70426 - [XGBoost](https://www.kaggle.com/jamesmcguigan/tabular-playground-xgboost)