**Competition Name:**  
*CIBMTR - Equity in post-HCT Survival Predictions*

**Objective:**  
The competition seeks to improve the prediction of transplant survival rates for allogeneic hematopoietic stem cell transplantation (HCT) patients. The emphasis is on generating predictions that are not only accurate but also equitable across diverse racial and demographic groups.

# Inference Code

**Libraries and Load Train Data**

In [23]:
import numpy as np
import pandas as pd
import joblib
import xgboost as xgb
import lightgbm as lgb
import catboost as cb
import os

# Load Test Data
test = pd.read_csv("/content/test.csv")

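# Directory holding the trained fold models; fail early if it is missing
MODEL_DIR = "/content/models"
if not os.path.isdir(MODEL_DIR):
    raise FileNotFoundError(f"Model directory not found: {MODEL_DIR}")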

**Feature Engineering**

In [24]:
# Feature Engineering (Same as Training)
def add_features(df):
    df['donor_age_hct_diff'] = df['donor_age'] - df['age_at_hct']
    df['comorbidity_karnofsky_ratio'] = df['comorbidity_score'] / (df['karnofsky_score'] + 1)
    df['year_hct_adjusted'] = df['year_hct'] - 2000
    df['is_cyto_score_same'] = (df['cyto_score'] == df['cyto_score_detail']).astype(int)
    return df

test = add_features(test)
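
To make the four derived columns concrete, here is a small worked example. It copies `add_features` from the cell above and applies it to a two-row toy frame; the row values are made up for illustration and are not taken from the competition data.

```python
import pandas as pd

def add_features(df):
    # Same definitions as the inference cell above.
    df['donor_age_hct_diff'] = df['donor_age'] - df['age_at_hct']
    df['comorbidity_karnofsky_ratio'] = df['comorbidity_score'] / (df['karnofsky_score'] + 1)
    df['year_hct_adjusted'] = df['year_hct'] - 2000
    df['is_cyto_score_same'] = (df['cyto_score'] == df['cyto_score_detail']).astype(int)
    return df

# Hypothetical rows, chosen so the arithmetic is easy to follow.
toy = pd.DataFrame({
    "donor_age": [30.0, 45.0],
    "age_at_hct": [50.0, 40.0],
    "comorbidity_score": [2.0, 0.0],
    "karnofsky_score": [99.0, 89.0],
    "year_hct": [2015, 2008],
    "cyto_score": ["Poor", "Poor"],
    "cyto_score_detail": ["Poor", "Intermediate"],
})

toy = add_features(toy)
print(toy[["donor_age_hct_diff", "comorbidity_karnofsky_ratio",
           "year_hct_adjusted", "is_cyto_score_same"]])
```

Note that `add_features` writes the new columns into the frame it receives, so the caller's DataFrame is modified in place as well as returned.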

**Encode Categorical Features**

In [25]:
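# Encode categorical features (same procedure as training).
# Caveat: cat.codes numbers the categories found in this frame, so the codes
# match training only if the test set contains the same set of category values.
categorical_cols = test.select_dtypes(include=['object', 'category']).columns
for col in categorical_cols:
    test[col] = test[col].astype('category').cat.codes  # Integer codes; missing values become -1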

FEATURES = [col for col in test.columns if col not in ["ID"]]
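
Because `cat.codes` is derived from whichever values appear in the frame being encoded, encoding the test set on its own can assign different integers than training did. One way to guard against this is to record the category order at training time and reuse it at inference. The following is a minimal sketch under that assumption; the `category_maps` dictionary, its contents, and the helper name are hypothetical and do not come from the training notebook.

```python
import pandas as pd

# Hypothetical mapping saved at training time: column -> ordered list of categories.
category_maps = {"cyto_score": ["Favorable", "Intermediate", "Poor"]}

def encode_with_training_categories(df, category_maps):
    """Encode each listed column using the category order recorded at training time."""
    df = df.copy()
    for col, cats in category_maps.items():
        # Missing values and values not in the recorded list are encoded as -1.
        df[col] = pd.Categorical(df[col], categories=cats).codes
    return df

toy_test = pd.DataFrame({"cyto_score": ["Poor", "Favorable", None, "Unknown"]})
encoded = encode_with_training_categories(toy_test, category_maps)
print(encoded["cyto_score"].tolist())
```

With this approach, a value such as `"Poor"` always maps to the same code regardless of which other values happen to be present in the test file.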

**Load Models & Make Predictions**

In [26]:
# Load Models & Make Predictions
N_FOLDS = 15
X_test = test[FEATURES]
final_preds = np.zeros(len(test))

for fold in range(N_FOLDS):
    print(f"Loading Models for Fold {fold}")

    model_xgb = joblib.load(f"{MODEL_DIR}/xgb_fold{fold}.pkl")
    model_lgb = joblib.load(f"{MODEL_DIR}/lgb_fold{fold}.pkl")
    model_cat = cb.CatBoostRegressor()
    model_cat.load_model(f"{MODEL_DIR}/cat_fold{fold}.cbm")

    # Weighted blend: 0.4 XGBoost, 0.4 LightGBM, 0.2 CatBoost
    fold_pred = (
        0.4 * model_xgb.predict(X_test)
        + 0.4 * model_lgb.predict(X_test)
        + 0.2 * model_cat.predict(X_test)
    )
    # Average the blended prediction over all folds
    final_preds += fold_pred / N_FOLDS


Loading Models for Fold 0
Loading Models for Fold 1
Loading Models for Fold 2
Loading Models for Fold 3
Loading Models for Fold 4
Loading Models for Fold 5
Loading Models for Fold 6
Loading Models for Fold 7
Loading Models for Fold 8
Loading Models for Fold 9
Loading Models for Fold 10
Loading Models for Fold 11
Loading Models for Fold 12
Loading Models for Fold 13
Loading Models for Fold 14
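
The loop above amounts to averaging each model's predictions over the 15 folds and then combining the three averages with weights 0.4, 0.4, and 0.2. The sketch below reproduces that arithmetic on made-up constant predictions so the result can be checked by hand; the `blend` helper and the toy arrays are illustrative assumptions, not part of the original pipeline.

```python
import numpy as np

# Blend weights as used in the inference loop; they should sum to 1.
WEIGHTS = {"xgb": 0.4, "lgb": 0.4, "cat": 0.2}
N_FOLDS = 15

def blend(preds_by_model, weights):
    """preds_by_model maps model name -> array of shape (n_folds, n_rows)."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("blend weights must sum to 1")
    n_rows = next(iter(preds_by_model.values())).shape[1]
    out = np.zeros(n_rows)
    for name, w in weights.items():
        # Average over folds, then apply the model's blend weight.
        out += w * preds_by_model[name].mean(axis=0)
    return out

# Hypothetical predictions: every fold of a given model predicts the same constant.
toy = {
    "xgb": np.full((N_FOLDS, 3), 1.0),
    "lgb": np.full((N_FOLDS, 3), 2.0),
    "cat": np.full((N_FOLDS, 3), 3.0),
}
blended = blend(toy, WEIGHTS)
print(blended)  # 0.4*1 + 0.4*2 + 0.2*3 = 1.8 for every row
```

Keeping the weights in one place and checking that they sum to 1 makes it harder to rescale the predictions by accident when a weight is changed.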


**Save Submission File**

In [27]:
# Save Submission File
submission = pd.DataFrame({"ID": test["ID"], "prediction": final_preds})
submission.to_csv("submission.csv", index=False)
print("Inference complete. Submission saved.")

Inference complete. Submission saved.
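
Before uploading, it can help to run a few structural checks on the submission. The sketch below is one possible set of checks, assuming the expected format is exactly the `ID` and `prediction` columns written above; the `check_submission` helper is hypothetical and the competition's own validation rules may differ.

```python
import numpy as np
import pandas as pd

def check_submission(sub, expected_ids):
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    if list(sub.columns) != ["ID", "prediction"]:
        problems.append("unexpected columns")
        return problems
    if len(sub) != len(expected_ids) or set(sub["ID"]) != set(expected_ids):
        problems.append("IDs do not match the test set")
    if sub["ID"].duplicated().any():
        problems.append("duplicate IDs")
    if not np.isfinite(sub["prediction"]).all():
        problems.append("non-finite predictions")
    return problems

# Toy frames using the three IDs shown in this notebook's output.
good = pd.DataFrame({"ID": [28800, 28801, 28802],
                     "prediction": [-2.780118, 1.590262, -2.795214]})
bad = good.copy()
bad.loc[1, "prediction"] = np.nan

print(check_submission(good, [28800, 28801, 28802]))
print(check_submission(bad, [28800, 28801, 28802]))
```

In the notebook itself, the call would be `check_submission(submission, test["ID"].tolist())`.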


**Submission File**

In [28]:
file = pd.read_csv('submission.csv')
file

|   | ID | prediction |
|---|-------|-----------|
| 0 | 28800 | -2.780118 |
| 1 | 28801 | 1.590262 |
| 2 | 28802 | -2.795214 |
