# **Machine Learning Project**

Topic: A Machine Learning Framework for Predicting Adverse Drug Reactions Using Pharmacogenomics and Chemical Structure Embeddings

Team Members: Sakshi Balerao and Alekya Meka

Course Name: Data Science

Professor Name: Iqtidar

Date:   

## **Problem Statement**

Adverse Drug Reactions (ADRs) are unintended and harmful effects caused by medication. Predicting ADRs early can significantly improve patient safety and reduce healthcare costs.

This project aims to build a machine learning framework that predicts ADRs from drug-level features. In the current implementation, the features are drug names and the labels are reported reactions from the FAERS dataset; pharmacogenomic and chemical-structure features are not yet included. Using models such as Logistic Regression, XGBoost, and Neural Networks, we explore how well ADRs can be identified from this structured and textual drug data.

## **Dataset Description**

We use the **FAERS (FDA Adverse Event Reporting System)** dataset, which includes:

- `drugname`: Name of the administered drug
- `pt`: MedDRA Preferred Term for the reported reaction (the ADR)

We group ADRs by drug and apply multi-label encoding using `MultiLabelBinarizer`.

Total examples: 4,787 unique drug names after grouping (fewer once label filtering is applied)  
ADR labels: the 50 most frequent reactions (filtered)  
Type: Multi-label classification
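To make the label layout concrete, here is a dependency-free sketch of what multi-label binarization produces. The drug names and reactions are made-up toy values, not FAERS records, and the hand-rolled encoding is only meant to illustrate the kind of 0/1 matrix that `MultiLabelBinarizer` returns (one column per distinct label, in sorted order).

```python
# Toy drug -> ADR mapping (hypothetical values, not taken from FAERS)
drug_to_adrs = {
    "DRUG_A": ["Nausea", "Headache"],
    "DRUG_B": ["Headache"],
    "DRUG_C": ["Rash", "Nausea"],
}

# Sorted label vocabulary: one column per distinct ADR
classes = sorted({adr for adrs in drug_to_adrs.values() for adr in adrs})

# One row per drug; 1 where the drug has been reported with that ADR
label_matrix = [
    [1 if adr in adrs else 0 for adr in classes]
    for adrs in drug_to_adrs.values()
]

print(classes)
for drug, row in zip(drug_to_adrs, label_matrix):
    print(drug, row)
```

Each row can contain several 1s, which is what distinguishes this multi-label setup from ordinary multi-class classification.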

In [None]:
import pandas as pd
import zipfile
import os
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# Path to extracted files folder for one quarter (UPDATE THIS PATH)
data_folder = '/content/drive/MyDrive/faers_ascii_2024q3/ASCII'

# Load demo data
demo_df = pd.read_csv(os.path.join(data_folder, 'DEMO24Q3.txt'), sep='$', encoding='latin1', low_memory=False)

# Load drug data
drug_df = pd.read_csv(os.path.join(data_folder, 'DRUG24Q3.txt'), sep='$', encoding='latin1', low_memory=False)

# Load reaction data
reac_df = pd.read_csv(os.path.join(data_folder, 'REAC24Q3.txt'), sep='$', encoding='latin1', low_memory=False)
print(drug_df.columns)
# Keep primary-suspect drugs only (role_cod == 'PS')
suspect_drugs = drug_df[drug_df['role_cod'] == 'PS']

# Join suspect drugs with reactions on primaryid
drug_reac_df = pd.merge(suspect_drugs[['primaryid', 'drugname']], reac_df[['primaryid', 'pt']], on='primaryid')

# View head
print(drug_reac_df.head())

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Index(['primaryid', 'caseid', 'drug_seq', 'role_cod', 'drugname', 'prod_ai',
       'val_vbm', 'route', 'dose_vbm', 'cum_dose_chr', 'cum_dose_unit',
       'dechal', 'rechal', 'lot_num', 'exp_dt', 'nda_num', 'dose_amt',
       'dose_unit', 'dose_form', 'dose_freq'],
      dtype='object')
   primaryid drugname                             pt
0  100138392    EMEND               Drug interaction
1  100138392    EMEND                 Renal disorder
2  100138392    EMEND      Hepatic function abnormal
3  100138392    EMEND  Liver function test increased
4  100138392    EMEND                 Rhabdomyolysis


# **Preprocessing & Feature Engineering**

- Drugs were grouped with their associated ADRs.
- Multi-label binarization was used to transform side effects into binary labels.
- Drug names were converted into token-count features using `CountVectorizer`.

This simplified representation enabled us to train models using standard classification techniques.
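To show what "token features" means for drug names, the sketch below mimics `CountVectorizer`'s default behaviour with the standard library: lowercase the text, then keep tokens of two or more word characters (the regex is the one scikit-learn documents as the default `token_pattern`). The helper `tokenize` and the example names are illustrative assumptions; the real pipeline uses `CountVectorizer` itself.

```python
import re
from collections import Counter

# Regex documented by scikit-learn as CountVectorizer's default token_pattern
TOKEN_PATTERN = re.compile(r"(?u)\b\w\w+\b")

def tokenize(drug_name):
    """Lowercase the name and keep tokens of two or more word characters."""
    return TOKEN_PATTERN.findall(drug_name.lower())

names = [
    "ASPIRIN",
    "ASPIRIN\\CAFFEINE",
    ".DELTA.8-TETRAHYDROCANNABINOL\\HERBALS",
]

# Vocabulary = sorted set of all tokens; each row holds per-token counts
vocabulary = sorted({tok for name in names for tok in tokenize(name)})
rows = []
for name in names:
    counts = Counter(tokenize(name))
    rows.append([counts[tok] for tok in vocabulary])

print(vocabulary)
for name, row in zip(names, rows):
    print(name, row)
```

Two consequences are visible here: combination products share tokens with their components (`aspirin` appears in both of the first two rows), and single-character fragments such as the `8` in `.DELTA.8` are dropped.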

In [None]:
import pandas as pd
import zipfile
import os
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# Path to extracted files folder for one quarter (UPDATE THIS PATH)
data_folder = '/content/drive/MyDrive/faers_ascii_2024Q4/ASCII'

# Load demo data
demo_df = pd.read_csv(os.path.join(data_folder, 'DEMO24Q4.txt'), sep='$', encoding='latin1', low_memory=False)

# Load drug data
drug_df = pd.read_csv(os.path.join(data_folder, 'DRUG24Q4.txt'), sep='$', encoding='latin1', low_memory=False)

# Load reaction data
reac_df = pd.read_csv(os.path.join(data_folder, 'REAC24Q4.txt'), sep='$', encoding='latin1', low_memory=False)

print(drug_df.columns)

# Keep primary-suspect drugs only (role_cod == 'PS')
suspect_drugs = drug_df[drug_df['role_cod'] == 'PS']

# Join suspect drugs with reactions on primaryid
drug_reac_df = pd.merge(suspect_drugs[['primaryid', 'drugname']], reac_df[['primaryid', 'pt']], on='primaryid')

# View head
print(drug_reac_df.head())

adr_counts = drug_reac_df.groupby(['drugname', 'pt']).size().reset_index(name='count')

# Sort top ADRs for a given drug
print(adr_counts[adr_counts['drugname'] == 'ASPIRIN'].sort_values('count', ascending=False).head(10))

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Index(['primaryid', 'caseid', 'drug_seq', 'role_cod', 'drugname', 'prod_ai',
       'val_vbm', 'route', 'dose_vbm', 'cum_dose_chr', 'cum_dose_unit',
       'dechal', 'rechal', 'lot_num', 'exp_dt', 'nda_num', 'dose_amt',
       'dose_unit', 'dose_form', 'dose_freq'],
      dtype='object')
   primaryid    drugname                          pt
0  100100247  ADALIMUMAB  Malignant melanoma stage I
1  100100247  ADALIMUMAB             Keratoacanthoma
2  100100247  ADALIMUMAB    Blood pressure increased
3  100100247  ADALIMUMAB             Keratoacanthoma
4  100100247  ADALIMUMAB              Hyperkeratosis
      drugname                                pt  count
29038  ASPIRIN                       Haemorrhage     13
29028  ASPIRIN      Gastrointestinal haemorrhage     12
29016  ASPIRIN  Foetal exposure during pregnancy     11
29006  ASPIRIN         Exposure during p

In [None]:
import pandas as pd
import zipfile
import os
from google.colab import drive

# Mount Google Drive
drive.mount('/content/drive')

# Path to extracted files folder for one quarter (UPDATE THIS PATH)
data_folder = '/content/drive/MyDrive/faers_ascii_2025Q1/ASCII'

# Load demo data
demo_df = pd.read_csv(os.path.join(data_folder, 'DEMO25Q1.txt'), sep='$', encoding='latin1', low_memory=False)

# Load drug data
drug_df = pd.read_csv(os.path.join(data_folder, 'DRUG25Q1.txt'), sep='$', encoding='latin1', low_memory=False)

# Load reaction data
reac_df = pd.read_csv(os.path.join(data_folder, 'REAC25Q1.txt'), sep='$', encoding='latin1', low_memory=False)

print(drug_df.columns)

# Keep primary-suspect drugs only (role_cod == 'PS')
suspect_drugs = drug_df[drug_df['role_cod'] == 'PS']

# Join suspect drugs with reactions on primaryid
drug_reac_df = pd.merge(suspect_drugs[['primaryid', 'drugname']], reac_df[['primaryid', 'pt']], on='primaryid')

# View head
print(drug_reac_df.head())

adr_counts = drug_reac_df.groupby(['drugname', 'pt']).size().reset_index(name='count')

# Sort top ADRs for a given drug
print(adr_counts[adr_counts['drugname'] == 'ASPIRIN'].sort_values('count', ascending=False).head(10))

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
Index(['primaryid', 'caseid', 'drug_seq', 'role_cod', 'drugname', 'prod_ai',
       'val_vbm', 'route', 'dose_vbm', 'cum_dose_chr', 'cum_dose_unit',
       'dechal', 'rechal', 'lot_num', 'exp_dt', 'nda_num', 'dose_amt',
       'dose_unit', 'dose_form', 'dose_freq'],
      dtype='object')
   primaryid   drugname                                           pt
0  100294532  LETROZOLE                                     Asthenia
1  100294532  LETROZOLE                     Breast cancer metastatic
2  100294532  LETROZOLE  Palmar-plantar erythrodysaesthesia syndrome
3  100294532  LETROZOLE                          Metastases to liver
4  100294532  LETROZOLE                    Metastases to lymph nodes
      drugname                                    pt  count
28556  ASPIRIN            Toxicity to various agents     56
28431  ASPIRIN          Gastrointestinal haemorr
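The loading cells above repeat the same steps for each quarter, and each run overwrites `drug_reac_df`, so only the last quarter loaded reaches the later cells. A minimal sketch of a shared helper follows. `load_quarter` and `load_quarters` are hypothetical names, and the sketch assumes every quarter's folder follows the `DRUG<quarter>.txt` / `REAC<quarter>.txt` naming used above. It does not deduplicate cases that appear in more than one quarter.

```python
import os
import pandas as pd

def load_quarter(data_folder, quarter):
    """Return (primary-suspect drug, reaction) pairs for one quarter, e.g. '24Q3'."""
    def read(prefix):
        path = os.path.join(data_folder, f"{prefix}{quarter}.txt")
        return pd.read_csv(path, sep="$", encoding="latin1", low_memory=False)

    drug_df = read("DRUG")
    reac_df = read("REAC")
    # Keep primary-suspect drugs only, then attach every reaction of the same report
    suspects = drug_df.loc[drug_df["role_cod"] == "PS", ["primaryid", "drugname"]]
    return suspects.merge(reac_df[["primaryid", "pt"]], on="primaryid")

def load_quarters(folders_by_quarter):
    """Concatenate the pairs from several quarters into one DataFrame."""
    frames = [load_quarter(folder, q) for q, folder in folders_by_quarter.items()]
    return pd.concat(frames, ignore_index=True)

# Example call (folder paths are placeholders for your own layout):
# drug_reac_df = load_quarters({"24Q3": "/path/to/q3/ASCII", "24Q4": "/path/to/q4/ASCII"})
```

With this shape, the grouping and filtering cells that follow would operate on all loaded quarters rather than on whichever cell ran last.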

In [None]:
# Group by drugname and aggregate adverse reactions
grouped = drug_reac_df.groupby('drugname')['pt'].agg(lambda x: '|'.join(x.unique())).reset_index()
grouped.columns = ['Drug', 'ADR'] # Rename columns for clarity

print("Grouped data preview:")
print(grouped.head())
print(grouped.shape)


Grouped data preview:
                                                Drug  \
0                .ALPHA.1-PROTEINASE INHIBITOR HUMAN   
1  .DELTA.8-TETRAHYDROCANNABINOL\CANNABIDIOL\HERBALS   
2              .DELTA.8-TETRAHYDROCANNABINOL\HERBALS   
3       .DELTA.9-TETRAHYDROCANNABINOLIC ACID\HERBALS   
4  .DELTA.9-TETRAHYDROCANNABINOL\CANNABIDIOL\HERBALS   

                                                 ADR  
0  Dyspnoea|Insurance issue|Hypotension|Nasophary...  
1                                             Injury  
2  Malaise|Unresponsive to stimuli|Substance use|...  
3  Product complaint|Dysgeusia|Anxiety|Product la...  
4  Dysarthria|Erythema|Feeling hot|Pruritus|Skin ...  
(4787, 2)


In [None]:
from sklearn.preprocessing import MultiLabelBinarizer

# 'grouped' has 'Drug' and 'ADR' columns; ADRs are pipe-separated strings

# Convert pipe-separated ADRs to lists
grouped['ADR_list'] = grouped['ADR'].str.split('|')

# Flatten the lists; each ADR is counted once per drug it appears with
all_adrs = [item for sublist in grouped['ADR_list'] for item in sublist]

# Keep the 50 ADRs reported for the most drugs (adjust to your distribution)
top_adrs = set(pd.Series(all_adrs).value_counts().head(50).index)

# Filter ADRs in each row to keep only the top 50
grouped['ADR_list_filtered'] = grouped['ADR_list'].apply(lambda adrs: [adr for adr in adrs if adr in top_adrs])

# Remove drugs with no top-50 ADRs left
grouped = grouped[grouped['ADR_list_filtered'].map(len) > 0].copy()

# Display the head of the modified grouped DataFrame
print("Grouped data after filtering top ADRs:")
print(grouped.head())
print(grouped.shape)

Grouped data after filtering top ADRs:
                                                Drug  \
0                .ALPHA.1-PROTEINASE INHIBITOR HUMAN   
2              .DELTA.8-TETRAHYDROCANNABINOL\HERBALS   
3       .DELTA.9-TETRAHYDROCANNABINOLIC ACID\HERBALS   
4  .DELTA.9-TETRAHYDROCANNABINOL\CANNABIDIOL\HERBALS   
6                               7-HYDROXYMITRAGYNINE   

                                                 ADR  \
0  Dyspnoea|Insurance issue|Hypotension|Nasophary...   
2  Malaise|Unresponsive to stimuli|Substance use|...   
3  Product complaint|Dysgeusia|Anxiety|Product la...   
4  Dysarthria|Erythema|Feeling hot|Pruritus|Skin ...   
6  Brain fog|Depression|Adverse drug reaction|Ski...   

                                            ADR_list  \
0  [Dyspnoea, Insurance issue, Hypotension, Nasop...   
2  [Malaise, Unresponsive to stimuli, Substance u...   
3  [Product complaint, Dysgeusia, Anxiety, Produc...   
4  [Dysarthria, Erythema, Feeling hot, Pruritus, ...   
6  [Bra

# **Baseline Model**

## Logistic Regression

We used a `MultiOutputClassifier` wrapping Logistic Regression as our baseline. It gave us an initial benchmark for performance.

Evaluation Metric: **Micro F1-score**
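Micro-averaging pools true positives, false positives, and false negatives across all labels before computing a single F1 value, so frequent ADRs weigh more than rare ones. The following is a small pure-Python sketch of that calculation on toy matrices; the function name `micro_f1` is illustrative, and the notebook itself relies on `f1_score(..., average='micro')`.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for 0/1 label matrices given as lists of rows."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1          # label present and predicted
            elif t == 0 and p == 1:
                fp += 1          # label predicted but absent
            elif t == 1 and p == 0:
                fn += 1          # label present but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy example: 3 drugs x 3 ADR labels (made-up values)
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]

score = micro_f1(y_true, y_pred)
print(round(score, 4))
```

In this toy case there are 3 true positives, 1 false positive, and 2 false negatives, so the score is 2·3 / (2·3 + 1 + 2) = 6/9.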

In [None]:
mlb = MultiLabelBinarizer()
y = mlb.fit_transform(grouped['ADR_list_filtered'])  # multi-label binary matrix

In [None]:
from sklearn.feature_extraction.text import CountVectorizer

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(grouped['Drug'])

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)


In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier

model = MultiOutputClassifier(LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)


In [None]:
from sklearn.metrics import classification_report, f1_score

y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred, target_names=mlb.classes_))
print("F1 Score (micro):", f1_score(y_test, y_pred, average='micro'))


                                      precision    recall  f1-score   support

                Abdominal discomfort       0.00      0.00      0.00       165
                      Abdominal pain       0.00      0.00      0.00       178
                Abdominal pain upper       0.00      0.00      0.00       182
                             Anaemia       0.00      0.00      0.00       149
                             Anxiety       0.43      0.02      0.03       195
                          Arthralgia       0.33      0.01      0.01       196
                            Asthenia       0.25      0.01      0.02       234
                           Back pain       0.00      0.00      0.00       164
                            COVID-19       0.00      0.00      0.00       154
                          Chest pain       0.33      0.01      0.01       169
                Condition aggravated       0.38      0.02      0.04       249
                        Constipation       0.00      0.00      



# **Advanced Models**

## Additional Models

We implemented the following models beyond the baseline:

1. **XGBoost Classifier** (multi-output)
2. **Neural Network using Keras**
3. **CatBoost Classifier** (multi-output)
4. (Optional Bonus) **Naive Bayes** and **SVM**

These were chosen for their robustness with sparse/text data and their capability to handle multi-label classification.


In [None]:
from sklearn.feature_extraction.text import CountVectorizer

# Turn drug names into token-count features
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(grouped['Drug'])  # shape: (n_samples, n_unique_tokens)

# **Deep Learning Model**

We use a basic neural network (Multi-Layer Perceptron) to predict ADRs using only drug names converted to token-count features via `CountVectorizer`.

The model uses 2 hidden layers with dropout regularization and a sigmoid output layer for multi-label classification. Although it trains correctly, the results show limited predictive power from drug names alone, suggesting the need for structural or genomic features for better ADR detection.

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Input
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.model_selection import train_test_split

# Split dataset (Keras needs a dense array here)
X_train, X_test, y_train, y_test = train_test_split(X.toarray(), y, test_size=0.2, random_state=42)

# Define model with an explicit Input layer
model = Sequential([
    Input(shape=(X_train.shape[1],)),
    Dense(512, activation='relu'),
    Dropout(0.3),
    Dense(256, activation='relu'),
    Dropout(0.3),
    Dense(y.shape[1], activation='sigmoid')  # one independent probability per ADR label
])

# Compile; the logged 'accuracy' is hard to interpret for multi-label targets, so we rely on micro F1 later
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Train
early_stop = EarlyStopping(patience=5, monitor='val_loss', restore_best_weights=True)
model.fit(X_train, y_train, epochs=50, batch_size=128, validation_split=0.2, callbacks=[early_stop])




Epoch 1/50
20/20 ━━━━━━━━━━━━━━━━━━━━ 4s 92ms/step - accuracy: 0.0910 - loss: 0.6732 - val_accuracy: 0.0736 - val_loss: 0.5785
Epoch 2/50
20/20 ━━━━━━━━━━━━━━━━━━━━ 2s 59ms/step - accuracy: 0.0864 - loss: 0.5519 - val_accuracy: 0.0736 - val_loss: 0.5605
Epoch 3/50
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 50ms/step - accuracy: 0.0696 - loss: 0.5193 - val_accuracy: 0.0736 - val_loss: 0.5632
Epoch 4/50
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 57ms/step - accuracy: 0.0800 - loss: 0.4854 - val_accuracy: 0.0736 - val_loss: 0.5587
Epoch 5/50
20/20 ━━━━━━━━━━━━━━━━━━━━ 1s 64ms/step - accuracy: 0.0896 - loss: 0.4495 - val_accuracy: 0.0736 - val_loss: 0.5607
Epoch 6/50
20/20 ━━━━━━━━━━━━━━━━━━━━ 2s 59ms/step - accuracy: 0.0962 - loss: 0.4148 - val_accuracy: 0.0736 - val_loss: 0.5647
Epoch 7/50
20/20 ━━━━

# **Neural Network (Deep Learning)**

Here we evaluate the neural network trained above, which predicts multiple adverse drug reactions (ADRs) simultaneously. We apply a threshold of 0.2 to convert probability outputs into binary class labels, which raises recall at the cost of precision.

We then evaluate model performance using a classification report and the **F1 Score (micro-average)**. This model achieves the best overall performance in this notebook, with a micro F1 Score of **0.45**. For the labels shown in the report, recall is high (roughly 0.7 to 0.9) while precision stays low (roughly 0.2 to 0.35), so many of the predicted ADRs are false positives. The label set is the 50 most frequent ADRs.

In [None]:
from sklearn.metrics import classification_report, f1_score

y_pred_prob = model.predict(X_test)
y_pred = (y_pred_prob > 0.2).astype(int)

print(classification_report(y_test, y_pred, target_names=mlb.classes_))
print("F1 Score (micro):", f1_score(y_test, y_pred, average='micro'))


25/25 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step
                                      precision    recall  f1-score   support

                Abdominal discomfort       0.24      0.74      0.37       165
                      Abdominal pain       0.25      0.78      0.38       178
                Abdominal pain upper       0.26      0.73      0.38       182
                             Anaemia       0.23      0.78      0.35       149
                             Anxiety       0.28      0.76      0.41       195
                          Arthralgia       0.29      0.76      0.42       196
                            Asthenia       0.33      0.87      0.48       234
                           Back pain       0.26      0.79      0.40       164
                            COVID-19       0.25      0.79      0.38       154
                          Chest pain       0.26      0.75      0.38       169
                Condition aggravated       0.35      0.91      0.51 

# **XGBoost Classifier (Advanced Model)**

We use **XGBoost (Extreme Gradient Boosting)** as our most advanced model. It is known for high accuracy and works well with structured (tabular) data. We wrap it inside a `MultiOutputClassifier` to handle multi-label prediction of multiple adverse drug reactions (ADRs) simultaneously.

We also apply a **probability threshold of 0.3** (instead of 0.5) to increase recall for rare ADRs. Finally, we evaluate the model using a **classification report** and **F1 Score (micro average)** to measure overall prediction performance.
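The thresholds used in this notebook (0.2 for the neural network, 0.3 for XGBoost) were picked by hand. One hedged alternative is to sweep a few candidate thresholds on a held-out validation split and keep the one with the best micro F1. The sketch below uses made-up probabilities; `best_threshold`, the candidate grid, and the toy arrays are assumptions for illustration only.

```python
import numpy as np

def micro_f1(y_true, y_pred):
    """Micro-averaged F1 for 0/1 NumPy label matrices."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

def best_threshold(y_true, y_prob, candidates=(0.1, 0.2, 0.3, 0.4, 0.5)):
    """Return the candidate threshold with the highest micro F1, plus all scores."""
    scores = {t: micro_f1(y_true, (y_prob >= t).astype(int)) for t in candidates}
    return max(scores, key=scores.get), scores

# Toy validation data: 4 drugs x 2 ADR labels (made-up values)
y_val = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])
p_val = np.array([[0.45, 0.12], [0.22, 0.35], [0.60, 0.25], [0.05, 0.15]])

threshold, scores = best_threshold(y_val, p_val)
print(threshold, scores)
```

Choosing the threshold on validation data rather than on `X_test` keeps the reported test score from being tuned to the test set.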

In [None]:
from xgboost import XGBClassifier
from sklearn.multioutput import MultiOutputClassifier
import numpy as np

xgb = XGBClassifier(
    objective='binary:logistic',
    eval_metric='logloss',
    n_estimators=100,
    max_depth=6,
    learning_rate=0.1
)

multi_xgb = MultiOutputClassifier(xgb, n_jobs=-1)
multi_xgb.fit(X_train, y_train)

# Get probabilities for each label; this returns a list of arrays, one array per label
y_pred_proba_list = multi_xgb.predict_proba(X_test)

# Convert list of arrays (shape: n_samples x 2) to a numpy array of shape (n_samples, n_labels)
# We want the probability of class 1, which is column 1 in each array
y_pred_proba = np.array([proba[:, 1] for proba in y_pred_proba_list]).T

# Apply a lower threshold to convert probabilities to binary predictions
threshold = 0.3
y_pred = (y_pred_proba >= threshold).astype(int)

print(y_pred)
print(classification_report(y_test, y_pred, target_names=mlb.classes_))
print("F1 Score (micro):", f1_score(y_test, y_pred, average='micro'))


[[0 0 0 ... 1 0 0]
 [0 0 0 ... 1 0 0]
 [0 0 0 ... 0 0 0]
 ...
 [0 0 0 ... 1 0 0]
 [0 0 0 ... 1 0 0]
 [0 0 0 ... 1 0 0]]
                                      precision    recall  f1-score   support

                Abdominal discomfort       0.36      0.03      0.06       165
                      Abdominal pain       0.00      0.00      0.00       178
                Abdominal pain upper       0.00      0.00      0.00       182
                             Anaemia       0.50      0.01      0.01       149
                             Anxiety       0.40      0.03      0.06       195
                          Arthralgia       0.00      0.00      0.00       196
                            Asthenia       0.33      0.91      0.49       234
                           Back pain       0.00      0.00      0.00       164
                            COVID-19       0.00      0.00      0.00       154
                          Chest pain       0.20      0.01      0.01       169
                Condi



# **Multinomial Naive Bayes**

Multinomial Naive Bayes models token counts directly, which matches our `CountVectorizer` features. To adapt it for multi-label ADR (Adverse Drug Reaction) prediction, we used `MultiOutputClassifier`, which fits one classifier per target label (side effect). This allows us to predict multiple ADRs for each drug.
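The "one classifier per label" behaviour can be checked directly on a fitted wrapper. The sketch below uses a tiny made-up count matrix rather than the FAERS features; the variable names are illustrative.

```python
import numpy as np
from sklearn.naive_bayes import MultinomialNB
from sklearn.multioutput import MultiOutputClassifier

# Toy token-count features (4 drugs x 3 tokens) and 2 ADR labels; values are made up
X_toy = np.array([[2, 0, 1], [0, 3, 0], [1, 1, 0], [0, 0, 2]])
y_toy = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])

clf = MultiOutputClassifier(MultinomialNB()).fit(X_toy, y_toy)

n_models = len(clf.estimators_)      # one fitted MultinomialNB per label column
pred = clf.predict(X_toy)            # array of shape (n_samples, n_labels)
proba = clf.predict_proba(X_toy)     # list with one (n_samples, 2) array per label

print(n_models, pred.shape, len(proba), proba[0].shape)
```

Because the per-label models are independent, correlations between ADRs (for example, reactions that tend to co-occur) are not modelled by this wrapper.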


In [None]:
from sklearn.naive_bayes import MultinomialNB
from sklearn.multioutput import MultiOutputClassifier

nb = MultiOutputClassifier(MultinomialNB())
nb.fit(X_train, y_train)
y_pred = nb.predict(X_test)
print(classification_report(y_test, y_pred, target_names=mlb.classes_))
print("F1 Score (micro):", f1_score(y_test, y_pred, average='micro'))

                                      precision    recall  f1-score   support

                Abdominal discomfort       0.15      0.02      0.03       165
                      Abdominal pain       0.34      0.06      0.10       178
                Abdominal pain upper       0.21      0.02      0.04       182
                             Anaemia       0.26      0.04      0.07       149
                             Anxiety       0.48      0.10      0.17       195
                          Arthralgia       0.31      0.05      0.09       196
                            Asthenia       0.24      0.08      0.12       234
                           Back pain       0.16      0.02      0.03       164
                            COVID-19       0.20      0.03      0.05       154
                          Chest pain       0.30      0.04      0.06       169
                Condition aggravated       0.42      0.08      0.13       249
                        Constipation       0.25      0.03      

# **Support Vector Machine**


To further strengthen our analysis, we implemented a **Support Vector Machine** (SVM) using `LinearSVC` within a `OneVsRestClassifier` wrapper. SVMs are well suited to high-dimensional, sparse feature spaces such as the drug-name token counts used here.

We used the **multi-label classification strategy** and evaluated the model using a detailed classification report and the **micro-averaged F1-score**.
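Unlike the neural network and XGBoost, `LinearSVC` does not output probabilities, so the probability thresholds used elsewhere in this notebook do not carry over directly. A comparable recall/precision trade-off can be sketched by thresholding the signed margins from `decision_function`. The toy data and the cut-off of -0.5 below are illustrative assumptions, not tuned values.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier

# Toy features (4 drugs x 2 tokens) and 2 ADR labels; values are made up
X_toy = np.array([[2, 0], [0, 2], [2, 2], [0, 0]], dtype=float)
y_toy = np.array([[1, 0], [0, 1], [1, 1], [0, 0]])

ovr = OneVsRestClassifier(LinearSVC(max_iter=3000)).fit(X_toy, y_toy)

# Signed distance to each label's separating hyperplane: shape (n_samples, n_labels)
margins = ovr.decision_function(X_toy)

default_pred = (margins > 0).astype(int)      # same decision rule as ovr.predict
lenient_pred = (margins > -0.5).astype(int)   # lower cut-off: more positives, higher recall

print(margins.shape, default_pred.sum(), lenient_pred.sum())
```

Lowering the cut-off can only add predicted labels, never remove them, which mirrors the effect of lowering a probability threshold.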

**Result:**
- Micro F1-score: ~0.21



In [None]:
from sklearn.svm import LinearSVC
from sklearn.multiclass import OneVsRestClassifier

svc = OneVsRestClassifier(LinearSVC(max_iter=3000))
svc.fit(X_train, y_train)
y_pred = svc.predict(X_test)
print(classification_report(y_test, y_pred, target_names=mlb.classes_))
print("F1 Score (micro):", f1_score(y_test, y_pred, average='micro'))

                                      precision    recall  f1-score   support

                Abdominal discomfort       0.27      0.08      0.12       165
                      Abdominal pain       0.32      0.08      0.13       178
                Abdominal pain upper       0.40      0.09      0.14       182
                             Anaemia       0.29      0.08      0.13       149
                             Anxiety       0.51      0.15      0.23       195
                          Arthralgia       0.39      0.08      0.13       196
                            Asthenia       0.32      0.09      0.14       234
                           Back pain       0.28      0.07      0.11       164
                            COVID-19       0.24      0.05      0.08       154
                          Chest pain       0.33      0.09      0.15       169
                Condition aggravated       0.39      0.11      0.17       249
                        Constipation       0.22      0.07      

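# **CatBoost Classifier**

We also set up a CatBoost model wrapped in `MultiOutputClassifier`. The run shown here was interrupted before training finished, so no CatBoost scores are reported.

In [None]:
!pip install catboost
from catboost import CatBoostClassifier
from sklearn.multioutput import MultiOutputClassifier

# One binary CatBoost model per ADR label
cat = CatBoostClassifier(
    iterations=100,
    depth=4,
    learning_rate=0.1,
    loss_function='Logloss',  # standard binary objective for 0/1 labels
    task_type='CPU',
    verbose=False
)

# CatBoost is multi-threaded on its own, so the per-label models are fit one after another
multi_cat = MultiOutputClassifier(cat)
multi_cat.fit(X_train, y_train)

y_pred = multi_cat.predict(X_test)
print(classification_report(y_test, y_pred, target_names=mlb.classes_))
print("F1 Score (micro):", f1_score(y_test, y_pred, average='micro'))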


KeyboardInterrupt: 