# Introduction


**What?** Feature Interaction Constraints

**Reference[1]:** https://coderzcolumn.com/tutorials/machine-learning/xgboost-an-in-depth-guide-python#6<br>
**Reference[2]:** https://xgboost.readthedocs.io/en/latest/tutorials/feature_interaction_constraint.html<br>



# Import modules

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
import xgboost as xgb
import sklearn
from sklearn.datasets import load_boston  # removed in scikit-learn 1.2; this notebook needs an older release
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Import dataset


- Boston Housing dataset: a regression dataset that describes various attributes of housing in Boston-area towns together with the median home value (in thousands of dollars).
- It is used here for a regression task: predicting that median value from the 13 attributes.



In [2]:
boston = load_boston()

# Print only lines 5 through 28 of the description (the slice end, 29, is exclusive)
for line in boston.DESCR.split("\n")[5:29]:
    print(line)

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
        - B        1000(Bk - 0.63)^2 where Bk is the

In [3]:
boston_df = pd.DataFrame(data=boston.data, columns = boston.feature_names)
# Add one column with the target
boston_df["Price"] = boston.target
boston_df.head()

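|   | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | Price |
|---|------|----|-------|------|-----|----|-----|-----|-----|-----|---------|---|-------|-------|
| 0 | 0.00632 | 18.0 | 2.31 | 0.0 | 0.538 | 6.575 | 65.2 | 4.09 | 1.0 | 296.0 | 15.3 | 396.9 | 4.98 | 24.0 |
| 1 | 0.02731 | 0.0 | 7.07 | 0.0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2.0 | 242.0 | 17.8 | 396.9 | 9.14 | 21.6 |
| 2 | 0.02729 | 0.0 | 7.07 | 0.0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2.0 | 242.0 | 17.8 | 392.83 | 4.03 | 34.7 |
| 3 | 0.03237 | 0.0 | 2.18 | 0.0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3.0 | 222.0 | 18.7 | 394.63 | 2.94 | 33.4 |
| 4 | 0.06905 | 0.0 | 2.18 | 0.0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3.0 | 222.0 | 18.7 | 396.9 | 5.33 | 36.2 |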


In [6]:
# Splitting the dataset
X_train, X_test, Y_train, Y_test = train_test_split(boston.data, boston.target, train_size=0.90, random_state=42)

In [7]:
print("Train/Test Sizes : ", X_train.shape, X_test.shape, Y_train.shape, Y_test.shape, "\n")

Train/Test Sizes :  (455, 13) (51, 13) (455,) (51,) 



In [8]:
dmat_train = xgb.DMatrix(X_train, Y_train, feature_names=boston.feature_names)
dmat_test = xgb.DMatrix(X_test, Y_test, feature_names=boston.feature_names)

# Feature Interaction Constraints


- When XGBoost builds a tree during training, it is free to combine any features along a branch, so every feature interaction is allowed.
- By default, any feature can be used for the split at any node of the decision tree.
- We can restrict this by grouping features by their column index in the dataset: once a branch splits on a feature from one group, the nodes below it may only split on features from that same group.
- The groups are passed as a list of lists through the **interaction_constraints** parameter given to **train()**.
- For instance, the list [0,1,2,11,12] puts features 0, 1, 2, 11 and 12 into one group, so these features may interact with one another when a tree is built but not with the remaining features; a branch that uses one of them will contain only features from this group.
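Index lists are easy to get wrong, so it can help to write the groups with feature names and convert them to indices. The following is a minimal sketch: the helper `to_index_groups` and the variable `named_groups` are illustrative names that are not part of XGBoost, and the feature order is assumed to match `boston.feature_names` as shown in the table above.

```python
# Column order of the Boston dataset, as shown in the DataFrame above.
feature_names = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]

# The same five groups used in this notebook, written with names (illustrative).
named_groups = [
    ["CRIM", "ZN", "INDUS", "B", "LSTAT"],
    ["CHAS", "NOX"],
    ["AGE", "PTRATIO"],
    ["RM", "TAX"],
    ["DIS", "RAD"],
]

def to_index_groups(named_groups, feature_names):
    """Translate groups of feature names into groups of column indices."""
    position = {name: i for i, name in enumerate(feature_names)}
    return [[position[name] for name in group] for group in named_groups]

interaction_constraints = to_index_groups(named_groups, feature_names)
print(interaction_constraints)
# [[0, 1, 2, 11, 12], [3, 4], [6, 10], [5, 9], [7, 8]]
```

The resulting list of lists has the same form as the value passed to `interaction_constraints` in this notebook.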



In [10]:
tweedie_booster = xgb.train({'max_depth': 3, 'eta': 1, 'objective': 'reg:tweedie',
                             'tree_method':'hist', 'nthread':4,
                             'interaction_constraints':[[0,1,2,11,12], [3, 4],[6,10], [5,9], [7,8]]},
                    dmat_train,
                    evals=[(dmat_train, "train"), (dmat_test, "test")])

[0]	train-tweedie-nloglik@1.5:28.32970	test-tweedie-nloglik@1.5:26.66488
[1]	train-tweedie-nloglik@1.5:19.31425	test-tweedie-nloglik@1.5:18.56556
[2]	train-tweedie-nloglik@1.5:18.75812	test-tweedie-nloglik@1.5:18.15946
[3]	train-tweedie-nloglik@1.5:18.73722	test-tweedie-nloglik@1.5:18.18582
[4]	train-tweedie-nloglik@1.5:18.72045	test-tweedie-nloglik@1.5:18.18661
[5]	train-tweedie-nloglik@1.5:18.71539	test-tweedie-nloglik@1.5:18.18151
[6]	train-tweedie-nloglik@1.5:18.71035	test-tweedie-nloglik@1.5:18.16513
[7]	train-tweedie-nloglik@1.5:18.70579	test-tweedie-nloglik@1.5:18.15365
[8]	train-tweedie-nloglik@1.5:18.70398	test-tweedie-nloglik@1.5:18.15155
[9]	train-tweedie-nloglik@1.5:18.69945	test-tweedie-nloglik@1.5:18.15776
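One way to check that a trained model respects the constraints is to read the text dump of each tree and confirm that every root-to-leaf path uses features from a single group. The sketch below is pure Python and works on a hand-written dump string in the style returned by `Booster.get_dump()`; the helpers `paths_from_dump` and `violations` are illustrative names, and the check assumes disjoint groups that cover every feature, as in this notebook. The sample tree is made up and deliberately breaks the constraints.

```python
import re

# A split line in the text dump looks like "0:[RM<6.9] yes=1,no=2,missing=1";
# leaf lines ("3:leaf=0.1") do not match this pattern.
SPLIT_RE = re.compile(r"^\s*(\d+):\[([^<\]]+)<[^\]]*\] yes=(\d+),no=(\d+)")

def paths_from_dump(tree_dump):
    """Return the set of split features on each root-to-leaf path of one tree."""
    splits = {}
    for line in tree_dump.splitlines():
        match = SPLIT_RE.match(line)
        if match:
            node_id, feature, yes_id, no_id = match.groups()
            splits[int(node_id)] = (feature, int(yes_id), int(no_id))

    paths = []

    def walk(node_id, seen):
        if node_id in splits:
            feature, yes_id, no_id = splits[node_id]
            walk(yes_id, seen | {feature})
            walk(no_id, seen | {feature})
        else:  # any node that is not a split is treated as a leaf
            paths.append(seen)

    walk(0, frozenset())
    return paths

def violations(paths, groups):
    """Return the paths whose features do not all fit inside one group."""
    allowed = [frozenset(group) for group in groups]
    return [path for path in paths if not any(path <= g for g in allowed)]

# Hand-written example tree (not produced by the model in this notebook).
sample_dump = (
    "0:[RM<6.9] yes=1,no=2,missing=1\n"
    "\t1:[TAX<300] yes=3,no=4,missing=3\n"
    "\t\t3:leaf=0.1\n"
    "\t\t4:leaf=0.2\n"
    "\t2:[LSTAT<10] yes=5,no=6,missing=5\n"
    "\t\t5:leaf=0.3\n"
    "\t\t6:leaf=0.4\n"
)
groups = [["CRIM", "ZN", "INDUS", "B", "LSTAT"], ["CHAS", "NOX"],
          ["AGE", "PTRATIO"], ["RM", "TAX"], ["DIS", "RAD"]]

paths = paths_from_dump(sample_dump)
bad_paths = violations(paths, groups)
print(len(paths), "paths,", len(bad_paths), "violating the constraints")
```

In the sample, the RM/TAX branch is allowed, while the RM/LSTAT branch mixes two groups and is reported. For the booster trained above, the same functions could be applied to each string in `tweedie_booster.get_dump()`, where an empty result from `violations` would be expected.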


In [11]:
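# Booster.eval() reports the objective's default metric (Tweedie negative log-likelihood), not RMSE
print("\nTrain tweedie-nloglik : ",tweedie_booster.eval(dmat_train))
print("Test  tweedie-nloglik : ",tweedie_booster.eval(dmat_test))


Train tweedie-nloglik :  [0]	eval-tweedie-nloglik@1.5:18.699446
Test  tweedie-nloglik :  [0]	eval-tweedie-nloglik@1.5:18.157764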


In [12]:
print("\nTest  R2 Score : %.2f"%r2_score(Y_test, tweedie_booster.predict(dmat_test)))
print("Train R2 Score : %.2f"%r2_score(Y_train, tweedie_booster.predict(dmat_train)))


Test  R2 Score : 0.78
Train R2 Score : 0.94
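For reference, the R2 score printed above is the coefficient of determination, 1 - SS_res / SS_tot. The sketch below computes it by hand in plain Python. The function name `r2` is illustrative, the targets are the first four prices from the table above, and the predictions are made-up numbers, not output of the trained model.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1.0 - ss_res / ss_tot

y_true = [24.0, 21.6, 34.7, 33.4]   # first four prices from the table
y_pred = [25.0, 22.0, 33.0, 34.0]   # made-up predictions for illustration

print("R2 Score : %.2f" % r2(y_true, y_pred))
```

A perfect prediction gives 1, and always predicting the mean of the targets gives 0, which is why values closer to 1 indicate a better fit.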


# Using the scikit-learn API

In [14]:
xgb_regressor = xgb.XGBRegressor(max_depth=3, eta=1, objective='reg:tweedie',
                                 interaction_constraints=[[0,1,2,11,12], [3, 4],[6,10], [5,9], [7,8]])

xgb_regressor.fit(X_train, Y_train,
                  eval_set=[(X_test, Y_test)], eval_metric="rmse",
                  early_stopping_rounds=5, verbose=1)

[0]	validation_0-rmse:19.40547
Will train until validation_0-rmse hasn't improved in 5 rounds.
[1]	validation_0-rmse:9.42298
[2]	validation_0-rmse:3.54240
[3]	validation_0-rmse:4.78789
[4]	validation_0-rmse:5.06168
[5]	validation_0-rmse:5.65695
[6]	validation_0-rmse:5.04555
[7]	validation_0-rmse:4.97560
Stopping. Best iteration:
[2]	validation_0-rmse:3.54240



XGBRegressor(base_score=0.5, booster='gbtree', colsample_bylevel=1,
             colsample_bynode=1, colsample_bytree=1, eta=1, gamma=0, gpu_id=-1,
             importance_type='gain',
             interaction_constraints=[[0, 1, 2, 11, 12], [3, 4], [6, 10],
                                      [5, 9], [7, 8]],
             learning_rate=1, max_delta_step=0, max_depth=3, min_child_weight=1,
             missing=nan, monotone_constraints='()', n_estimators=100, n_jobs=0,
             num_parallel_tree=1, objective='reg:tweedie', random_state=0,
             reg_alpha=0, reg_lambda=1, scale_pos_weight=None, subsample=1,
             tree_method='exact', validate_parameters=1, verbosity=None)
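The log above stops at round 7 because the validation RMSE has not improved for 5 rounds since the best value at round 2. The stopping rule can be sketched in plain Python as follows; `early_stop` is an illustrative helper, not XGBoost's implementation, and the scores are copied from the log above.

```python
def early_stop(scores, patience):
    """Return (best_iteration, last_iteration_run) for a lower-is-better metric."""
    best_iter = 0
    for i, score in enumerate(scores):
        if score < scores[best_iter]:
            best_iter = i                  # new best score, reset the wait
        elif i - best_iter >= patience:
            return best_iter, i            # no improvement for `patience` rounds
    return best_iter, len(scores) - 1      # ran out of rounds before stopping

# validation_0-rmse values from the training log above
rmse_log = [19.40547, 9.42298, 3.54240, 4.78789,
            5.06168, 5.65695, 5.04555, 4.97560]

best, last = early_stop(rmse_log, patience=5)
print("Best iteration:", best, "| stopped after iteration:", last)
# Best iteration: 2 | stopped after iteration: 7
```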

In [15]:
print("Test  R2 Score : %.2f"%xgb_regressor.score(X_test, Y_test))
print("Train R2 Score : %.2f"%xgb_regressor.score(X_train, Y_train))

Test  R2 Score : 0.80
Train R2 Score : 0.79


# Conclusion


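- XGBoost offers two APIs, the native `xgb.train()` interface and the scikit-learn wrapper, and both accept **interaction_constraints**; it is up to you to decide which one to use.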

