# 🔍 Cryptic Hotspot Prediction Using Optimized AdaBoost Model

This notebook loads a pre-trained AdaBoost model and predicts cryptic hotspots based on user-provided feature data.

## 📝 How to Use
1. Prepare your input feature file in CSV format.
2. Make sure the file has the **same column names, in the same order,** as the provided `sample_features.csv`, which you can use as a template for your own data.
3. Run all cells below to generate predictions.
4. The results are saved to `prediction_output.csv`.
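
Step 2 can be checked before running the notebook. The following is a minimal sketch, not part of the original workflow: `compare_columns` is a hypothetical helper, and the column lists in the demonstration are shortened examples rather than the full template.

```python
def compare_columns(template_cols, user_cols):
    """Return a list of problems; an empty list means the layouts match."""
    template_cols, user_cols = list(template_cols), list(user_cols)
    problems = []
    missing = [c for c in template_cols if c not in user_cols]
    extra = [c for c in user_cols if c not in template_cols]
    if missing:
        problems.append(f"missing columns: {missing}")
    if extra:
        problems.append(f"unexpected columns: {extra}")
    # Only report an ordering problem when the two column sets are identical
    if not missing and not extra and template_cols != user_cols:
        problems.append("same columns, different order")
    return problems


# In-memory demonstration with shortened, illustrative column lists
template = ["PatchID", "ProbeNumber", "gfe", "size"]
print(compare_columns(template, ["PatchID", "ProbeNumber", "gfe", "size"]))
print(compare_columns(template, ["PatchID", "gfe", "ProbeNumber", "size"]))
print(compare_columns(template, ["PatchID", "ProbeNumber", "size"]))
```

With real files, the headers alone can be read with `pd.read_csv(path, nrows=0).columns` and passed to the helper, once for `sample_features.csv` and once for your own file.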

In [21]:
#  Import libraries
import pandas as pd
import joblib

#  Optional: display all columns
pd.set_option('display.max_columns', None)

In [23]:
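# Load input feature file
input_file = "sample_features.csv"  # Replace with the path to your own file

df_input = pd.read_csv(input_file)
# Drop the identifier column so that only feature columns reach the model;
# errors="ignore" keeps this line working if the file has no PatchID column
X_input = df_input.drop(columns=["PatchID"], errors="ignore")
df_input.head()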

,PatchID,ProbeNumber,A00,A01,A20,A37,B71,E20,gfe,size,protrusion,convexity,compactness,hydrophobicity,charge_density,flexibility_A00,flexibility_A01,flexibility_A20,flexibility_A37,flexibility_B71,flexibility_E20,size_water,protrusion_water,convexity_water,compactness_water,hydrophobicity_water,charge_density_water,flexibility_water
0,Patch1,5,1,1,1,1,1,0,-2.2454,2291.9334,0.4592,0.0188,8.8962,0.2158,-0.0164,1.587,1.349,1.447,1.4,1.505,1.593,1619.207,0.514,0.019,7.556,0.314,-0.017,1.672
1,Patch2,1,0,0,0,0,1,0,-2.754,1419.269,0.611,0.021,7.009,0.347,-0.016,2.094,1.755,1.883,1.872,1.971,2.043,1021.397,0.675,0.02,5.647,0.369,-0.018,2.002
2,Patch3,6,1,1,1,1,1,1,-2.516833,3710.2625,0.412,0.018,11.135,0.5485,-0.0165,1.308,1.12,1.172,1.222,1.303,1.355,2210.005,0.49,0.017,10.089,0.541,-0.016,1.554
3,Patch4,5,1,1,1,1,1,0,-2.5166,1804.2706,0.1732,0.0188,9.031,0.3452,-0.0204,1.003,0.974,0.742,0.994,0.723,1.006,1085.87,0.22,0.019,8.028,0.308,-0.021,1.213
4,Patch5,3,0,1,1,0,1,0,-2.242,1983.486333,0.345,0.02,8.119667,0.361667,-0.016333,0.849,0.903,0.889,0.847,0.851,0.825,1142.965,0.321,0.019,6.502,0.335,-0.019,0.844


In [25]:
#  Load pre-trained model
model = joblib.load("cryptothml.pkl")
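
Because the model expects its features in the training order, an optional safeguard before predicting is to reorder the input according to the feature names stored on the model. The sketch below assumes the pickled estimator is a scikit-learn model that exposes `feature_names_in_`, which is only set when the model was fitted on a DataFrame (this notebook does not confirm that). `align_to_model` and the stand-in `fake_model` are hypothetical names used only for illustration.

```python
from types import SimpleNamespace

import pandas as pd


def align_to_model(model, X):
    """Reorder X to the model's stored feature names, if it has any."""
    expected = getattr(model, "feature_names_in_", None)
    if expected is None:
        # The model stores no feature names, so nothing can be checked
        return X
    expected = list(expected)
    missing = [c for c in expected if c not in X.columns]
    if missing:
        raise ValueError(f"input is missing model features: {missing}")
    # Selecting by name both reorders the columns and drops any extras
    return X[expected]


# Stand-in object (an assumption) that mimics a fitted model's attribute
fake_model = SimpleNamespace(feature_names_in_=["gfe", "size"])
X_demo = pd.DataFrame(
    {"size": [2291.9, 1419.3], "gfe": [-2.25, -2.75], "note": ["a", "b"]}
)
X_aligned = align_to_model(fake_model, X_demo)
print(list(X_aligned.columns))  # ['gfe', 'size']
```

In the notebook this would be called as `X_input = align_to_model(model, X_input)` before the prediction cell.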

In [27]:
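# Make predictions
df_input["predicted_label"] = model.predict(X_input)
# Column 1 of predict_proba is the probability of the second class in
# model.classes_ (assumed here to be the positive, cryptic-hotspot class)
df_input["predicted_proba"] = model.predict_proba(X_input)[:, 1]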

In [29]:
#  Save and display results
df_input.to_csv("prediction_output.csv", index=False)
df_input[["PatchID", "predicted_label", "predicted_proba"]].head()

,PatchID,predicted_label,predicted_proba
0,Patch1,0,0.126321
1,Patch2,0,0.163373
2,Patch3,0,0.428299
3,Patch4,0,0.149053
4,Patch5,0,0.129218
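
All five sample patches receive label 0, yet their probabilities differ noticeably. As a hedged illustration of how the saved output might be explored further, the sketch below ranks patches by `predicted_proba` and applies a custom cutoff. The `threshold` value of 0.4 and the `label_at_threshold` column are hypothetical choices for demonstration, not settings from the original model; the probabilities are copied from the table above.

```python
import pandas as pd

# Values copied from the sample output shown above
results = pd.DataFrame(
    {
        "PatchID": ["Patch1", "Patch2", "Patch3", "Patch4", "Patch5"],
        "predicted_proba": [0.126321, 0.163373, 0.428299, 0.149053, 0.129218],
    }
)

threshold = 0.4  # hypothetical cutoff, chosen only for this example
results["label_at_threshold"] = (results["predicted_proba"] >= threshold).astype(int)

# Rank patches from most to least likely according to the model
ranked = results.sort_values("predicted_proba", ascending=False)
print(ranked)
```

With this cutoff only Patch3 is flagged. Any threshold other than the model's default should be validated on labeled data before it is relied on.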
