# Customer Churn Prediction (Classification)
- Goal: Predict whether a customer will be a repeat buyer or not.
- Input features: average spend, frequency, basket size, country, product categories.
- Models: Logistic Regression, Decision Tree, Random Forest, XGBoost.

# Overview
- Build a machine learning pipeline that uses historical transaction data to predict whether a customer will become a repeat buyer.
- Repeat buyers drive a significant share of revenue. Identifying them early lets the business run targeted retention campaigns, send personalized offers, and allocate marketing spend more efficiently.

# How this will benefit the client
- Personalized offers: customers identified as likely to repeat can receive relevant loyalty offers, tailored product recommendations, or exclusive perks, improving their experience.
- Proactive retention: customers flagged as unlikely to repeat can be re-engaged with reminders, special discounts, or support outreach, reducing churn.

# Check
- For example, if the business typically sees 30% repeat buyers, re-engaging at-risk customers could raise that rate by an illustrative 5–10 percentage points (to 35–40%), which would directly increase customer lifetime value.
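
The arithmetic behind that example can be made explicit. The sketch below is only an illustration: the base of 10,000 customers is a hypothetical figure, not taken from the data, and the 5–10 uplift is read as percentage points on top of the 30% baseline.

```python
def repeat_buyers_after_uplift(customers, baseline_rate, uplift_points):
    """Return the new repeat rate and the number of additional repeat buyers."""
    new_rate = baseline_rate + uplift_points
    extra_buyers = round(customers * uplift_points)
    return new_rate, extra_buyers


# Hypothetical customer base; 30% baseline and 5-10 point uplift as in the example.
for uplift in (0.05, 0.10):
    rate, extra = repeat_buyers_after_uplift(10_000, 0.30, uplift)
    print(f"uplift {uplift:.0%} points: repeat rate {rate:.0%}, {extra} extra repeat buyers")
```

Under these assumptions the uplift corresponds to 500 to 1,000 additional repeat buyers per 10,000 customers.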

# Safe Pipeline

# Feature Engineering
- AverageOrderValue: Total spend ÷ number of orders.
- Frequency: Number of orders.
- BasketSize: Average quantity per order.
- Country: Categorical feature.
- Possible addition: top product category bought.
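
As a rough sketch of how the features listed above could be computed, the snippet below aggregates a small made-up transaction table that borrows the column names of the Online Retail data. The toy values and the intermediate names (`per_invoice`, `OrderValue`, `OrderQuantity`) are assumptions for illustration. Note that `Frequency` computed over a customer's full history carries the same information as the repeat-buyer label defined later, so a model that predicts from the first order would need these features computed only from data available at that point.

```python
import pandas as pd

# Toy transactions using the same column names as the Online Retail data.
tx = pd.DataFrame({
    "CustomerID": [1, 1, 1, 2],
    "InvoiceNo": ["A1", "A1", "A2", "B1"],
    "Quantity": [2, 3, 5, 4],
    "UnitPrice": [1.0, 2.0, 1.0, 2.5],
    "Country": ["France", "France", "France", "Spain"],
})
tx["TotalSpending"] = tx["Quantity"] * tx["UnitPrice"]

# Collapse line items to one row per invoice so that order-level averages
# are not skewed by how many lines an invoice has.
per_invoice = tx.groupby(["CustomerID", "InvoiceNo"], as_index=False).agg(
    OrderValue=("TotalSpending", "sum"),
    OrderQuantity=("Quantity", "sum"),
    Country=("Country", "first"),
)

# One row per customer with the features from the list above.
features = per_invoice.groupby("CustomerID").agg(
    AverageOrderValue=("OrderValue", "mean"),
    Frequency=("InvoiceNo", "nunique"),
    BasketSize=("OrderQuantity", "mean"),
    Country=("Country", "first"),
)
print(features)
```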

# Library

In [1]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix

##### Load the data 

In [3]:
df = pd.read_excel('Online Retail.xlsx')
df = df[df['CustomerID'].notnull()]
df = df[(df['Quantity'] > 0) & (df['UnitPrice'] > 0)]
df['TotalSpending'] = df['Quantity'] * df['UnitPrice']
df.head(10)

index,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country,TotalSpending
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,2010-12-01 08:26:00,2.55,17850.0,United Kingdom,15.3
1,536365,71053,WHITE METAL LANTERN,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,20.34
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,2010-12-01 08:26:00,2.75,17850.0,United Kingdom,22.0
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,20.34
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,2010-12-01 08:26:00,3.39,17850.0,United Kingdom,20.34
5,536365,22752,SET 7 BABUSHKA NESTING BOXES,2,2010-12-01 08:26:00,7.65,17850.0,United Kingdom,15.3
6,536365,21730,GLASS STAR FROSTED T-LIGHT HOLDER,6,2010-12-01 08:26:00,4.25,17850.0,United Kingdom,25.5
7,536366,22633,HAND WARMER UNION JACK,6,2010-12-01 08:28:00,1.85,17850.0,United Kingdom,11.1
8,536366,22632,HAND WARMER RED POLKA DOT,6,2010-12-01 08:28:00,1.85,17850.0,United Kingdom,11.1
9,536367,84879,ASSORTED COLOUR BIRD ORNAMENT,32,2010-12-01 08:34:00,1.69,13047.0,United Kingdom,54.08


In [4]:
# Parse InvoiceDate as datetime so each customer's purchases can be ordered chronologically
df['InvoiceDate'] = pd.to_datetime(df['InvoiceDate'])

In [5]:
# Sort by InvoiceDate
df_sorted = df.sort_values(['CustomerID', 'InvoiceDate'])

In [6]:
# Keep the first row per customer. Note: this is the first line item of the
# earliest invoice, not the whole invoice
first_orders = df_sorted.groupby('CustomerID').first().reset_index()
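
Because `groupby(...).first()` keeps a single line item, quantities and spend from the other lines of the same invoice are dropped. One possible alternative, sketched here on made-up data, is to collapse line items into invoices first and then keep each customer's earliest invoice. The names `invoices` and `first_invoices` are hypothetical, and this is not the approach the notebook's reported results are based on.

```python
import pandas as pd

tx = pd.DataFrame({
    "CustomerID": [1, 1, 1, 2],
    "InvoiceNo": ["A1", "A1", "A2", "B1"],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-01 09:00", "2011-01-01 09:00", "2011-02-01 10:00", "2011-01-05 12:00"]
    ),
    "Quantity": [2, 3, 5, 4],
    "TotalSpending": [2.0, 6.0, 5.0, 10.0],
    "Country": ["France", "France", "France", "Spain"],
})

# Collapse line items into one row per invoice.
invoices = tx.groupby(["CustomerID", "InvoiceNo"], as_index=False).agg(
    InvoiceDate=("InvoiceDate", "min"),
    Quantity=("Quantity", "sum"),
    TotalSpending=("TotalSpending", "sum"),
    Country=("Country", "first"),
)

# Keep the earliest invoice of each customer.
first_invoices = (
    invoices.sort_values(["CustomerID", "InvoiceDate"])
    .drop_duplicates("CustomerID", keep="first")
    .reset_index(drop=True)
)
print(first_invoices)
```

In this toy example customer 1's first invoice totals 5 units and 8.0 in spend, whereas taking only the first row would report 2 units and 2.0.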

##### Create the buyer flag

In [8]:
# Count total invoices per customer
invoice = df.groupby('CustomerID')['InvoiceNo'].nunique().reset_index()
invoice['RepeatBuyer'] = invoice['InvoiceNo'].apply(lambda x: 1 if x > 1 else 0)
invoice

index,CustomerID,InvoiceNo,RepeatBuyer
0,12346.0,1,0
1,12347.0,7,1
2,12348.0,4,1
3,12349.0,1,0
4,12350.0,1,0
...,...,...,...
4333,18280.0,1,0
4334,18281.0,1,0
4335,18282.0,2,1
4336,18283.0,16,1


In [10]:
# Merge the flag to the orders
orders = first_orders.merge(invoice[['CustomerID', 'RepeatBuyer']], on='CustomerID', how='left')
orders.head(10)


index,CustomerID,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,Country,TotalSpending,RepeatBuyer
0,12346.0,541431,23166,MEDIUM CERAMIC TOP STORAGE JAR,74215,2011-01-18 10:01:00,1.04,United Kingdom,77183.6,0
1,12347.0,537626,85116,BLACK CANDELABRA T-LIGHT HOLDER,12,2010-12-07 14:57:00,2.1,Iceland,25.2,1
2,12348.0,539318,84992,72 SWEETHEART FAIRY CAKE CASES,72,2010-12-16 19:09:00,0.55,Finland,39.6,1
3,12349.0,577609,23112,PARISIENNE CURIO CABINET,2,2011-11-21 09:51:00,7.5,Italy,15.0,0
4,12350.0,543037,21908,CHOCOLATE THIS WAY METAL SIGN,12,2011-02-02 16:01:00,2.1,Norway,25.2,0
5,12352.0,544156,21380,WOODEN HAPPY BIRTHDAY GARLAND,6,2011-02-16 12:33:00,2.95,Norway,17.7,1
6,12353.0,553900,37449,CERAMIC CAKE STAND + HANGING CAKES,2,2011-05-19 17:47:00,9.95,Bahrain,19.9,0
7,12354.0,550911,23201,JUMBO BAG ALPHABET,10,2011-04-21 13:11:00,2.08,Spain,20.8,0
8,12355.0,552449,22693,GROW A FLYTRAP OR SUNFLOWER IN TIN,24,2011-05-09 13:49:00,1.25,Bahrain,30.0,0
9,12356.0,541430,22138,BAKING SET 9 PIECE RETROSPOT,24,2011-01-18 09:50:00,4.25,Portugal,102.0,1


##### Feature Engineering

In [14]:
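# Create features from the first order
# Note: after .first(), 'Quantity' and 'TotalSpending' describe only the first
# line item of the first invoice, not the whole basket

# Encode Country as integer codes (fit on the same frame that receives the column)
le = LabelEncoder()
orders['CountryEncoded'] = le.fit_transform(orders['Country'])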

In [19]:
# Separate labels from features
X = orders[['Quantity', 'TotalSpending', 'CountryEncoded']]
y = orders['RepeatBuyer']

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)

In [20]:
# Picking the algorithm
rf = RandomForestClassifier(n_estimators=100, random_state=42)
rf.fit(X_train, y_train)



In [21]:
y_pred = rf.predict(X_test)

In [22]:
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.40      0.19      0.26       357
           1       0.68      0.86      0.76       728

    accuracy                           0.64      1085
   macro avg       0.54      0.52      0.51      1085
weighted avg       0.59      0.64      0.60      1085

[[ 67 290]
 [101 627]]
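
The figures in the report can be re-derived from the confusion matrix, which makes it easier to see where the errors fall. A minimal sketch using only the four counts printed above (rows are true classes, columns are predicted classes, following scikit-learn's convention):

```python
import numpy as np

# Confusion matrix from the output above.
cm = np.array([[67, 290],
               [101, 627]])

recall = cm.diagonal() / cm.sum(axis=1)     # correct / all true members of the class
precision = cm.diagonal() / cm.sum(axis=0)  # correct / all predictions of the class
accuracy = cm.diagonal().sum() / cm.sum()

print("recall   :", recall.round(2))
print("precision:", precision.round(2))
print("accuracy :", round(float(accuracy), 2))
```

Of the 357 one-time buyers in the test set, 290 are predicted as repeat buyers, which is why recall for class 0 is only 0.19 even though overall accuracy is 0.64.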


# Improvement Pipeline

In [32]:
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score
import xgboost as xgb  

### Part 1: Logistic Regression

In [35]:
logreg = LogisticRegression(class_weight='balanced', max_iter=1000)
logreg.fit(X_train, y_train)
y_pred_lr = logreg.predict(X_test)
print("\n🔵 Logistic Regression:")
print(f"Accuracy: {accuracy_score(y_test, y_pred_lr):.2f}")
print(classification_report(y_test, y_pred_lr))


🔵 Logistic Regression:
Accuracy: 0.45
              precision    recall  f1-score   support

           0       0.32      0.56      0.40       357
           1       0.65      0.40      0.50       728

    accuracy                           0.45      1085
   macro avg       0.48      0.48      0.45      1085
weighted avg       0.54      0.45      0.47      1085



### Part 2: Decision Tree

In [36]:
dtree = DecisionTreeClassifier(max_depth=5, class_weight='balanced', random_state=42)
dtree.fit(X_train, y_train)
y_pred_dt = dtree.predict(X_test)
print("\n🌳 Decision Tree:")
print(f"Accuracy: {accuracy_score(y_test, y_pred_dt):.2f}")
print(classification_report(y_test, y_pred_dt))


🌳 Decision Tree:
Accuracy: 0.56
              precision    recall  f1-score   support

           0       0.35      0.37      0.36       357
           1       0.68      0.66      0.67       728

    accuracy                           0.56      1085
   macro avg       0.51      0.51      0.51      1085
weighted avg       0.57      0.56      0.57      1085



### Part 3: Random Forest

In [37]:
rf = RandomForestClassifier(n_estimators=100, max_depth=5, class_weight='balanced', random_state=42)
rf.fit(X_train, y_train)
y_pred_rf = rf.predict(X_test)
print("\n🌲 Random Forest:")
print(f"Accuracy: {accuracy_score(y_test, y_pred_rf):.2f}")
print(classification_report(y_test, y_pred_rf))



🌲 Random Forest:
Accuracy: 0.53
              precision    recall  f1-score   support

           0       0.31      0.35      0.33       357
           1       0.66      0.62      0.64       728

    accuracy                           0.53      1085
   macro avg       0.49      0.49      0.49      1085
weighted avg       0.55      0.53      0.54      1085



### Part 4: XGBoost

In [38]:
# use_label_encoder is ignored by recent XGBoost releases, so it is omitted here
xgb_model = xgb.XGBClassifier(eval_metric='logloss')
xgb_model.fit(X_train, y_train)
y_pred_xgb = xgb_model.predict(X_test)
print("\n🚀 XGBoost:")
print(f"Accuracy: {accuracy_score(y_test, y_pred_xgb):.2f}")
print(classification_report(y_test, y_pred_xgb))


🚀 XGBoost:
Accuracy: 0.65
              precision    recall  f1-score   support

           0       0.38      0.11      0.18       357
           1       0.68      0.91      0.78       728

    accuracy                           0.65      1085
   macro avg       0.53      0.51      0.48      1085
weighted avg       0.58      0.65      0.58      1085





### Part 5: Voting Classifier

In [39]:
from sklearn.ensemble import VotingClassifier

In [40]:
voting = VotingClassifier(
    estimators=[
        ('lr', logreg),
        ('dt', dtree),
        ('rf', rf),
        ('xgb', xgb_model)
    ],
    voting='soft'  # 'soft' averages predicted probabilities; 'hard' counts majority votes
)
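
To make the difference between the two voting modes concrete, the sketch below reproduces both rules by hand with NumPy. The probabilities are made-up values for three hypothetical models and two customers, not output from the models in this notebook.

```python
import numpy as np

# Hypothetical class probabilities [P(class 0), P(class 1)] from three models
# for two customers; the numbers are invented for illustration.
probas = np.array([
    [[0.60, 0.40], [0.20, 0.80]],  # model A
    [[0.55, 0.45], [0.45, 0.55]],  # model B
    [[0.10, 0.90], [0.40, 0.60]],  # model C
])

# Soft voting: average the probabilities across models, then take the argmax.
soft_pred = probas.mean(axis=0).argmax(axis=1)

# Hard voting: each model votes for its own argmax, and the majority wins.
votes = probas.argmax(axis=2)  # shape (n_models, n_samples)
hard_pred = np.array([np.bincount(v, minlength=2).argmax() for v in votes.T])

print("soft:", soft_pred)
print("hard:", hard_pred)
```

For the first customer two models lean weakly towards class 0 while one is confident in class 1, so hard voting returns 0 and soft voting returns 1. Soft voting therefore depends on how well calibrated each model's probabilities are.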

In [41]:
# Train
voting.fit(X_train, y_train)


In [42]:
# Prediction
y_pred_voting = voting.predict(X_test)

In [43]:
print("\n🤝 Voting Classifier:")
print(f"Accuracy: {accuracy_score(y_test, y_pred_voting):.2f}")
print(classification_report(y_test, y_pred_voting))


🤝 Voting Classifier:
Accuracy: 0.65
              precision    recall  f1-score   support

           0       0.41      0.17      0.24       357
           1       0.68      0.88      0.77       728

    accuracy                           0.65      1085
   macro avg       0.54      0.52      0.50      1085
weighted avg       0.59      0.65      0.59      1085



### Part 6: Stacking Classifier

In [45]:
from sklearn.ensemble import StackingClassifier

In [46]:
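# Default max_iter=100: this meta-learner is the source of the convergence
# warning printed when the stack is fitted in In [50]
meta_learner = LogisticRegression()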

In [48]:
base_estimators = [
    ('lr', logreg),
    ('dt', dtree),
    ('rf', rf),
    ('xgb', xgb_model)
]

In [49]:
stacking = StackingClassifier(
    estimators=base_estimators,
    final_estimator=meta_learner,
    passthrough=True  # optional: pass original features to meta learner
)

In [50]:
# Train
stacking.fit(X_train, y_train)

STOP: TOTAL NO. OF ITERATIONS REACHED LIMIT

Increase the number of iterations to improve the convergence (max_iter=100).
You might also want to scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(




In [51]:
# Predict
y_pred_stacking = stacking.predict(X_test)

In [52]:
print(" Stacking Classifier:")
print(f"Accuracy: {accuracy_score(y_test, y_pred_stacking):.2f}")
print(classification_report(y_test, y_pred_stacking))

 Stacking Classifier:
Accuracy: 0.67
              precision    recall  f1-score   support

           0       0.00      0.00      0.00       357
           1       0.67      1.00      0.80       728

    accuracy                           0.67      1085
   macro avg       0.34      0.50      0.40      1085
weighted avg       0.45      0.67      0.54      1085
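
The report above shows zero recall for class 0, meaning the stacked model predicts "repeat buyer" for every test customer. Its 0.67 accuracy is therefore the share of repeat buyers in the test set. The sketch below checks this against a majority-class baseline. The labels are rebuilt from the support column of the report, and `X_dummy` is a placeholder because `DummyClassifier` ignores its features.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# Labels rebuilt from the support column: 357 one-time and 728 repeat buyers.
y_true = np.array([0] * 357 + [1] * 728)
X_dummy = np.zeros((len(y_true), 1))  # placeholder features, ignored by the baseline

# Always predict the most frequent class.
majority = DummyClassifier(strategy="most_frequent").fit(X_dummy, y_true)
y_major = majority.predict(X_dummy)

print(f"accuracy: {accuracy_score(y_true, y_major):.2f}")
print(f"balanced accuracy: {balanced_accuracy_score(y_true, y_major):.2f}")
```

Balanced accuracy (the mean of per-class recall) is 0.50 for this baseline, which exposes the collapse that plain accuracy hides.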



## Summary Message

In [25]:
from sklearn.metrics import accuracy_score

In [23]:
# Variables
# These cells were run right after In [22], so y_pred holds the baseline
# Random Forest predictions from In [21]
y_true = y_test

In [24]:
# Get the full classification report as a dict
report = classification_report(y_true, y_pred, output_dict=True)

In [26]:
# Pull the specific values
accuracy = accuracy_score(y_true, y_pred)
recall_0 = report['0']['recall']
recall_1 = report['1']['recall']

In [28]:
# Formatting
accuracy_pct = accuracy * 100
recall_0_pct = recall_0 * 100
recall_1_pct = recall_1 * 100

In [29]:
message = (
    f"The final repeat purchase classifier achieves ~{accuracy_pct:.0f}% accuracy. "
    f"The model performs well at identifying repeat buyers (Recall = {recall_1_pct:.0f}%), "
    f"but has lower performance on predicting one-time buyers (Recall = {recall_0_pct:.0f}%). "
    "This suggests the model can reliably flag customers with high repeat potential, "
    "but there is room to improve precision and detection of likely one-time buyers."
)

In [30]:
print("Final Model Summary:\n")
print(message)

Final Model Summary:

The final repeat purchase classifier achieves ~64% accuracy. The model performs well at identifying repeat buyers (Recall = 86%), but has lower performance on predicting one-time buyers (Recall = 19%). This suggests the model can reliably flag customers with high repeat potential, but there is room to improve precision and detection of likely one-time buyers.


# New Stacking pipeline

In [58]:
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
import xgboost as xgb

# New Features

In [59]:
orders['AvgUnitPrice'] = orders['TotalSpending'] / orders['Quantity']
orders['InvoiceDate'] = pd.to_datetime(orders['InvoiceDate'])
orders['DayOfWeek'] = orders['InvoiceDate'].dt.dayofweek  # 0 = Monday
orders['Hour'] = orders['InvoiceDate'].dt.hour


In [62]:
# Separate labels from features
X = orders[['Quantity', 'TotalSpending', 'AvgUnitPrice', 'CountryEncoded', 'DayOfWeek', 'Hour']]
y = orders['RepeatBuyer']

In [80]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42, stratify=y)
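
This split adds `stratify=y`, which keeps the class proportions the same in the training and test sets. A small check on synthetic labels (the 340/660 class counts are an assumption chosen to roughly resemble the imbalance in this notebook):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic labels: 34% class 0, 66% class 1.
y_demo = np.array([0] * 340 + [1] * 660)
X_demo = np.arange(len(y_demo)).reshape(-1, 1)

_, _, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

print("share of class 1 in train:", round(float(y_tr.mean()), 3))
print("share of class 1 in test: ", round(float(y_te.mean()), 3))
```

Without `stratify`, the two shares can drift apart by chance, which makes metrics on a small minority class noisier.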

In [81]:
estimators = [
    ('lr', LogisticRegression(max_iter=500, class_weight='balanced')),
    ('dt', DecisionTreeClassifier(max_depth=5, class_weight='balanced')),
    ('rf', RandomForestClassifier(n_estimators=100, max_depth=7, class_weight='balanced')),
    ('svc', SVC(probability=True)),
    ('xgb', xgb.XGBClassifier(eval_metric='logloss'))  # use_label_encoder dropped: ignored by recent XGBoost
]


In [82]:
meta = LogisticRegression(max_iter=1000)

In [83]:
stacking = StackingClassifier(
    estimators=estimators,
    final_estimator=meta,
    passthrough=True
)

In [84]:
stacking.fit(X_train, y_train)
y_pred_stack = stacking.predict(X_test)



In [85]:
print("Stacking Classifier (improved):")
print(f"Accuracy: {accuracy_score(y_test, y_pred_stack):.2f}")
print(classification_report(y_test, y_pred_stack))

Stacking Classifier (improved):
Accuracy: 0.66
              precision    recall  f1-score   support

           0       0.00      0.00      0.00       448
           1       0.66      1.00      0.79       854

    accuracy                           0.66      1302
   macro avg       0.33      0.50      0.40      1302
weighted avg       0.43      0.66      0.52      1302





# Grid Search for XGBoost

In [86]:
print(X_train.dtypes)

Quantity            int64
TotalSpending     float64
AvgUnitPrice      float64
CountryEncoded      int64
DayOfWeek           int32
Hour                int32
dtype: object


In [87]:
import xgboost as xgb
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import classification_report, accuracy_score


# ✅ Base XGB Classifier
xgb_clf = xgb.XGBClassifier(eval_metric='logloss', random_state=42)  # use_label_encoder dropped: ignored by recent XGBoost

# ✅ Sensible parameter grid
param_grid = {
    'n_estimators': [50, 100, 200],
    'max_depth': [3, 5, 7],
    'learning_rate': [0.01, 0.1, 0.2],
    'subsample': [0.8, 1],
    'colsample_bytree': [0.8, 1]
}

# ✅ GridSearchCV
grid_search = GridSearchCV(
    estimator=xgb_clf,
    param_grid=param_grid,
    scoring='accuracy',   # with imbalanced classes, consider 'f1_macro' or 'balanced_accuracy'
    cv=3,
    verbose=1,
    n_jobs=1
)

grid_search.fit(X_train, y_train)

print("\n✅ Best Parameters Found:")
print(grid_search.best_params_)

# ✅ Use best model
best_xgb = grid_search.best_estimator_

# Predict on test set
y_pred_xgb = best_xgb.predict(X_test)

print("\n✅ Final XGBoost Metrics:")
print(f"Accuracy: {accuracy_score(y_test, y_pred_xgb):.2f}")
print(classification_report(y_test, y_pred_xgb))

Fitting 3 folds for each of 108 candidates, totalling 324 fits



✅ Best Parameters Found:
{'colsample_bytree': 0.8, 'learning_rate': 0.01, 'max_depth': 3, 'n_estimators': 200, 'subsample': 1}

✅ Final XGBoost Metrics:
Accuracy: 0.65
              precision    recall  f1-score   support

           0       0.29      0.01      0.02       448
           1       0.66      0.99      0.79       854

    accuracy                           0.65      1302
   macro avg       0.47      0.50      0.40      1302
weighted avg       0.53      0.65      0.52      1302
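
The grid search above selects on accuracy, and the winning model has a recall of 0.01 on class 0, so its balanced accuracy is about (0.01 + 0.99) / 2 = 0.50. One option worth trying is to score the search on a metric that weighs both classes. The sketch below shows the mechanics on synthetic data with a scikit-learn decision tree standing in for XGBoost so that it runs without extra dependencies. The data, the grid, and the estimator are all assumptions for illustration, not a tuned replacement for the search above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic, moderately imbalanced data standing in for the real features.
X_demo, y_demo = make_classification(
    n_samples=600, n_features=6, weights=[0.35, 0.65], random_state=42
)

search = GridSearchCV(
    estimator=DecisionTreeClassifier(class_weight="balanced", random_state=42),
    param_grid={"max_depth": [2, 4, 6]},
    scoring="balanced_accuracy",  # mean of per-class recall
    cv=3,
)
search.fit(X_demo, y_demo)

print(search.best_params_)
print(round(search.best_score_, 2))
```

For `XGBClassifier` itself, the `scale_pos_weight` parameter is the usual way to reweight classes, since it does not accept `class_weight`.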



