**Therapeutics Data Commons (TDC) Drug-Drug Interaction (DDI) dataset**

Why this dataset? 
1. Comprehensive: It includes a large number of drug pairs with interaction information.
2. Well-Curated: It’s manually validated, ensuring high-quality data.
3. Multiple Interaction Types: You can classify interactions into Safe, Moderate, or Severe.
4. Easily Accessible: Freely available for research purposes.
5. Structured for ML: Suitable for training machine learning models like Decision Trees and Association Rules.


**BioSNAP Drug-Drug Interaction**

Why It’s a Good Fit?
- Graph-based dataset showing real-world drug interactions.
- Provides network relationships (useful for ML).

Best Use Case
- Great for Association Rule Mining (finding frequent drug combinations).


**DDInter Database**

- Well-annotated dataset with risk levels & mechanisms.
- Provides adverse effects for each drug pair.

- Useful for Decision Trees (to classify interactions by the dataset's Minor/Moderate/Major levels).

ddinter_downloads_code_A.csv (Alimentary Tract & Metabolism) ---> Why? Includes drugs for diabetes, digestion, and metabolism, which are widely used.

ddinter_downloads_code_B.csv (Blood & Blood Forming Organs) ---> Why? Covers anticoagulants and other antithrombotic agents, which frequently have interactions.

ddinter_downloads_code_L.csv (Antineoplastic & Immunomodulating Agents) ---> Why? Includes chemotherapy and immune-related drugs, critical for serious drug interactions.


**DrugBank Drug-Drug Interaction (TDC)**

- Curated dataset sourced from FDA & Health Canada.
- Includes interaction severity & detailed explanations.

- Can be used for training Decision Trees & validating model accuracy.

In [1]:
import pandas as pd

# Load DDInter CSV files
ddinter_a = pd.read_csv("ddinter_downloads_code_A.csv")
#ddinter_b = pd.read_csv("ddinter_downloads_code_B.csv")
#ddinter_l = pd.read_csv("ddinter_downloads_code_L.csv")

# Load DrugBank dataset
#drugbank_ddi = pd.read_csv("DDI_data.csv")

# Load BioSNAP dataset (TSV format)
#biosnap_ddi = pd.read_csv("biosnap.tsv", sep="\t")  # Ensure correct filename


In [2]:
# Display first few rows of each dataset
print("DDInter A:\n\n", ddinter_a.head())

DDInter A:

    DDInterID_A              Drug_A  DDInterID_B        Drug_B     Level
0  DDInter1263          Naltrexone     DDInter1      Abacavir  Moderate
1     DDInter1            Abacavir  DDInter1348      Orlistat  Moderate
2    DDInter58  Aluminum hydroxide   DDInter582  Dolutegravir     Major
3   DDInter112          Aprepitant   DDInter582  Dolutegravir     Minor
4   DDInter138         Attapulgite   DDInter582  Dolutegravir     Major


In [7]:
# Standardize column names
ddinter_a.rename(columns={"DDInterID_A": "Drug_A_ID", "Drug_A": "Drug_A_Name", 
                          "DDInterID_B": "Drug_B_ID", "Drug_B": "Drug_B_Name", 
                          "Level": "Interaction"}, inplace=True)


In [9]:
# Convert text fields to lowercase
ddinter_a["Drug_A_Name"] = ddinter_a["Drug_A_Name"].str.lower()
ddinter_a["Drug_B_Name"] = ddinter_a["Drug_B_Name"].str.lower()
ddinter_a["Interaction"] = ddinter_a["Interaction"].str.lower()

In [10]:
# Remove duplicates
ddinter_a.drop_duplicates(inplace=True)

In [11]:
# Handle missing values: every missing cell (IDs, names, and levels) is filled with the placeholder "Unknown"
ddinter_a.fillna("Unknown", inplace=True)

In [12]:
# Save the cleaned dataset
ddinter_a.to_csv("cleaned_drug_interactions.csv", index=False)

In [13]:
ddinter_a.head()

,Drug_A_ID,Drug_A_Name,Drug_B_ID,Drug_B_Name,Interaction
0,DDInter1263,naltrexone,DDInter1,abacavir,moderate
1,DDInter1,abacavir,DDInter1348,orlistat,moderate
2,DDInter58,aluminum hydroxide,DDInter582,dolutegravir,major
3,DDInter112,aprepitant,DDInter582,dolutegravir,minor
4,DDInter138,attapulgite,DDInter582,dolutegravir,major


In [14]:
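# Check that every Drug_A_ID has a corresponding Drug_A_Name.
# Note: missing names were already replaced with "Unknown" by fillna above, so isnull() cannot flag them here.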
inconsistent_rows = ddinter_a[ddinter_a['Drug_A_ID'].str.contains("DDInter") & ddinter_a['Drug_A_Name'].isnull()]
print("Inconsistent Rows in Drug_A:", inconsistent_rows)

Inconsistent Rows in Drug_A: Empty DataFrame
Columns: [Drug_A_ID, Drug_A_Name, Drug_B_ID, Drug_B_Name, Interaction]
Index: []


In [15]:
id_name_mismatch = ddinter_a.groupby("Drug_A_ID")["Drug_A_Name"].nunique()
print(id_name_mismatch[id_name_mismatch > 1])  # If a Drug_A_ID has multiple names, it's an issue

Series([], Name: Drug_A_Name, dtype: int64)


In [16]:
# Identify the most common name per Drug_A_ID
most_common_name = ddinter_a.groupby("Drug_A_ID")["Drug_A_Name"].agg(lambda x: x.value_counts().idxmax())

# Map the most common name to all instances
ddinter_a["Drug_A_Name"] = ddinter_a["Drug_A_ID"].map(most_common_name)

# Repeat for Drug_B
most_common_name_b = ddinter_a.groupby("Drug_B_ID")["Drug_B_Name"].agg(lambda x: x.value_counts().idxmax())
ddinter_a["Drug_B_Name"] = ddinter_a["Drug_B_ID"].map(most_common_name_b)

In [17]:
id_name_mismatch = ddinter_a.groupby("Drug_A_ID")["Drug_A_Name"].nunique()
print(id_name_mismatch[id_name_mismatch > 1])  # Should return an empty Series if fixed


Series([], Name: Drug_A_Name, dtype: int64)


In [18]:
id_name_mismatch = ddinter_a.groupby("Drug_B_ID")["Drug_B_Name"].nunique()
print(id_name_mismatch[id_name_mismatch > 1])  # If a Drug_B_ID has multiple names, it's an issue

Series([], Name: Drug_B_Name, dtype: int64)


In [19]:
ddinter_a.head()

,Drug_A_ID,Drug_A_Name,Drug_B_ID,Drug_B_Name,Interaction
0,DDInter1263,naltrexone,DDInter1,abacavir,moderate
1,DDInter1,abacavir,DDInter1348,orlistat,moderate
2,DDInter58,aluminum hydroxide,DDInter582,dolutegravir,major
3,DDInter112,aprepitant,DDInter582,dolutegravir,minor
4,DDInter138,attapulgite,DDInter582,dolutegravir,major


In [20]:
print(ddinter_a["Interaction"].unique())

['moderate' 'major' 'minor' 'unknown']


The 'unknown' label is not one of the three severity levels (minor, moderate, major), so it has to be mapped to one of them before the ML model can learn from a consistent label set.

In [21]:
# Define a mapping for standardization
def categorize_interaction(interaction):
    interaction = interaction.lower()  # Convert to lowercase for consistency
    if interaction in ["moderate", "major", "minor"]:
        return interaction  # Keep existing correct labels
    elif "bleeding" in interaction or "severe" in interaction or "toxicity" in interaction:
        return "major"  # Assign to major if it indicates severe effects
    elif "metabolism" in interaction or "therapeutic" in interaction or "moderate" in interaction:
        return "moderate"  # Assign moderate if it affects drug metabolism or efficacy
    else:
        # Everything else, including "unknown", defaults to minor.
        # This is a modeling assumption: an unknown level is not evidence of a minor effect.
        return "minor"

# Apply the function
ddinter_a["Interaction"] = ddinter_a["Interaction"].apply(categorize_interaction)

# Verify unique categories after cleaning
print(ddinter_a["Interaction"].unique())  # Should now only have ['moderate', 'major', 'minor']


['moderate' 'major' 'minor']


Feature Encoding: converting categorical data to numerical format

Use LabelEncoder to transform Drug_A_Name and Drug_B_Name into integers.

The interaction labels (minor, moderate, major) are also converted into numeric values. LabelEncoder assigns codes in alphabetical order, so the result is
major → 0, minor → 1, moderate → 2
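
A small sketch of why the code-to-label mapping deserves a check. It uses a hypothetical four-label series rather than the real data, and plain pandas instead of scikit-learn, to compare alphabetical coding (the order in which LabelEncoder assigns codes) with an explicit severity ordering. The name `severity_order` is an illustration, not part of the notebook.

```python
import pandas as pd

# Hypothetical labels, standing in for ddinter_a["Interaction"]
labels = pd.Series(["moderate", "major", "minor", "moderate"])

# Alphabetical coding: class names are sorted, then numbered from 0
alphabetical = {name: code for code, name in enumerate(sorted(labels.unique()))}
print(alphabetical)  # {'major': 0, 'minor': 1, 'moderate': 2}

# Explicit ordinal coding: the numbers follow increasing severity
severity_order = {"minor": 0, "moderate": 1, "major": 2}
encoded = labels.map(severity_order)
print(encoded.tolist())  # [1, 2, 0, 1]
```

With the explicit mapping, a larger number always means a more severe interaction, which makes later sums, comparisons, and printed codes easier to read.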

In [22]:
from sklearn.preprocessing import LabelEncoder

# Initialize LabelEncoder
drug_encoder = LabelEncoder()
interaction_encoder = LabelEncoder()

In [23]:
ddinter_a.head()

,Drug_A_ID,Drug_A_Name,Drug_B_ID,Drug_B_Name,Interaction
0,DDInter1263,naltrexone,DDInter1,abacavir,moderate
1,DDInter1,abacavir,DDInter1348,orlistat,moderate
2,DDInter58,aluminum hydroxide,DDInter582,dolutegravir,major
3,DDInter112,aprepitant,DDInter582,dolutegravir,minor
4,DDInter138,attapulgite,DDInter582,dolutegravir,major


In [24]:
# Get all unique drug names from both columns
all_drugs = pd.concat([ddinter_a["Drug_A_Name"], ddinter_a["Drug_B_Name"]]).unique()

# Fit the encoder on all unique drugs
drug_encoder.fit(all_drugs)

In [25]:
# Fit on all unique drug names from both columns
#drug_encoder = LabelEncoder()
#drug_encoder.fit(ddinter_a["Drug_A_Name"].tolist() + ddinter_a["Drug_B_Name"].tolist())

In [26]:
# The encoder was fitted on names from both columns, so both can be transformed safely
ddinter_a["Drug_A_Encoded"] = drug_encoder.transform(ddinter_a["Drug_A_Name"])
ddinter_a["Drug_B_Encoded"] = drug_encoder.transform(ddinter_a["Drug_B_Name"])


In [27]:
# Encode interaction severity
ddinter_a["Interaction"] = interaction_encoder.fit_transform(ddinter_a["Interaction"])

In [28]:
# Save encoded dataset
#ddinter_a.to_csv("encoded_drug_interactions.csv", index=False)

**Drug_A_Name	-> Readable drug name (for reference)**

**Drug_B_Name	-> Readable drug name (for reference)**

**Drug_A_Encoded	-> Numerical version of Drug_A_Name (for ML models)**

**Drug_B_Encoded	-> Numerical version of Drug_B_Name (for ML models)**

**Interaction	-> Numerical risk category for ML (major = 0, minor = 1, moderate = 2); the encoding step overwrites the original text labels, so there is no separate Interaction_Encoded column**

In [29]:
ddinter_a.head()

,Drug_A_ID,Drug_A_Name,Drug_B_ID,Drug_B_Name,Interaction,Drug_A_Encoded,Drug_B_Encoded
0,DDInter1263,naltrexone,DDInter1,abacavir,2,1120,0
1,DDInter1,abacavir,DDInter1348,orlistat,2,0,1196
2,DDInter58,aluminum hydroxide,DDInter582,dolutegravir,0,52,510
3,DDInter112,aprepitant,DDInter582,dolutegravir,1,102,510
4,DDInter138,attapulgite,DDInter582,dolutegravir,0,127,510


Imagine you work in a supermarket where customers buy different items. You notice that many people who buy bread also buy butter. 

Now, wouldn't it be smart if the store placed bread and butter together or recommended butter when someone picks up bread? This would help customers find what they need and make shopping easier! 

**How does this relate to medicine?**

Doctors and pharmacists need to be very careful when giving two medicines together. Some medicines work well together, like Vitamin D and Calcium (which help your bones). But some can be dangerous if taken together, like Aspirin and Warfarin, which can cause too much bleeding! 

**How does Apriori help?**

Apriori is a smart tool that looks for patterns in large amounts of data—just like spotting the "bread & butter" combo in shopping!

It scans medical records and finds which drugs are often taken together.
If a dangerous combination appears too often and leads to bad effects, it raises an alert!

Doctors can then avoid prescribing those risky drug pairs, keeping patients safe and healthy!

**In Short:**

Apriori finds patterns in data (shopping carts or drug interactions).
It helps stores recommend products OR helps doctors avoid dangerous medicine combinations!
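
The bread-and-butter idea can be sketched in a few lines. This is a toy illustration of Apriori's first two passes on four made-up baskets, not the mlxtend implementation used later; the baskets and the 0.5 threshold are assumptions chosen for the example.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
]
min_support = 0.5  # an itemset must appear in at least half of the baskets

# Pass 1: keep single items that are frequent enough
item_counts = Counter(item for basket in baskets for item in basket)
frequent_items = {i for i, c in item_counts.items() if c / len(baskets) >= min_support}

# Pass 2: build pairs only from frequent items (the Apriori pruning step)
pair_counts = Counter(
    pair
    for basket in baskets
    for pair in combinations(sorted(basket & frequent_items), 2)
)
frequent_pairs = {
    pair: count / len(baskets)
    for pair, count in pair_counts.items()
    if count / len(baskets) >= min_support
}
print(frequent_pairs)  # {('bread', 'butter'): 0.5}

# Confidence of the rule bread -> butter: share of bread baskets that also hold butter
confidence = pair_counts[("bread", "butter")] / item_counts["bread"]
```

Jam and eggs are dropped in the first pass, so no pair containing them is ever counted. That pruning is what keeps the search manageable on larger data.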

In [30]:
ddinter_a.count()

Drug_A_ID         56367
Drug_A_Name       56367
Drug_B_ID         56367
Drug_B_Name       56367
Interaction       56367
Drug_A_Encoded    56367
Drug_B_Encoded    56367
dtype: int64

In [31]:
from mlxtend.frequent_patterns import apriori, association_rules

# Step 1: Convert Data to a Transaction Format
# Rows are Drug_A, columns are Drug_B; each cell holds the summed severity code for that pair
# Caveat: "major" is encoded as 0, so major pairs sum to 0 and look the same as absent pairs
transactions = ddinter_a.groupby(["Drug_A_Encoded", "Drug_B_Encoded"])["Interaction"].sum().unstack().fillna(0)

In [32]:
transactions

Drug_B_Encoded,0,1,2,6,7,8,9,10,11,12,...,1747,1748,1749,1750,1751,1752,1753,1754,1755,1756
Drug_A_Encoded
0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,1.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0
5,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
7,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1752,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1753,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1754,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1755,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [33]:
# Binarize: True where the summed severity code is positive
transactions = transactions > 0


In [34]:
transactions

Drug_B_Encoded,0,1,2,6,7,8,9,10,11,12,...,1747,1748,1749,1750,1751,1752,1753,1754,1755,1756
Drug_A_Encoded
0,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
3,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
4,False,False,False,False,False,False,False,False,False,False,...,True,False,True,True,False,False,False,False,False,False
5,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
7,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1752,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1753,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1754,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
1755,False,False,False,False,False,False,False,False,False,False,...,False,False,False,False,False,False,False,False,False,False
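
The matrix above is built by summing encoded severity values, and LabelEncoder gives "major" the code 0, so a major pair sums to 0 and ends up as False. A minimal sketch of the effect and one possible alternative, using a hypothetical four-row table with the same column names: counting rows with `pd.crosstab` marks every recorded pair as present regardless of its severity code.

```python
import pandas as pd

# Hypothetical miniature of the encoded table; Interaction 0 stands for "major"
pairs = pd.DataFrame({
    "Drug_A_Encoded": [0, 0, 1, 2],
    "Drug_B_Encoded": [1, 2, 2, 3],
    "Interaction":    [0, 2, 1, 0],
})

# Summing severity codes: pairs coded 0 become indistinguishable from missing pairs
by_sum = (
    pairs.groupby(["Drug_A_Encoded", "Drug_B_Encoded"])["Interaction"]
    .sum()
    .unstack()
    .fillna(0)
    > 0
)

# Counting rows instead: every recorded pair is kept
by_count = pd.crosstab(pairs["Drug_A_Encoded"], pairs["Drug_B_Encoded"]) > 0

print(int(by_sum.values.sum()))    # 2 pairs survive
print(int(by_count.values.sum()))  # 4 pairs survive
```

Switching to the count-based matrix would change the itemsets and rules reported further down, so the outputs shown in this notebook correspond to the sum-based version.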


In [35]:
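# Keep only the first 5 rows (5 Drug_A entries) to limit runtime; all supports below are multiples of 1/5
top_5 = transactions.head()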

In [36]:
#batch_size = 10

#all_frequent_itemsets = []

#for i in range(0, len(transactions), batch_size):
#    batch = transactions.iloc[i:i + batch_size]
#    frequent_itemsets = apriori(batch, min_support=0.01, use_colnames=True)
#    all_frequent_itemsets.append(frequent_itemsets)

In [37]:
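# Step 2: Apply Apriori Algorithm
# With only 5 rows the smallest non-zero support is 0.2, so min_support=0.01 prunes nothing:
# every itemset that appears in at least one row is returned
frequent_itemsets = apriori(top_5, min_support=0.01, use_colnames=True)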


In [38]:
frequent_itemsets

,support,itemsets
0,0.2,(18)
1,0.2,(77)
2,0.4,(186)
3,0.2,(217)
4,0.2,(222)
...,...,...
1058830,0.2,"(274, 1302, 922, 1191, 1063, 1325, 1326, 1080,..."
1058831,0.2,"(274, 1302, 922, 1191, 1063, 1325, 1326, 1080,..."
1058832,0.2,"(1302, 922, 1191, 1063, 1325, 1326, 1080, 953,..."
1058833,0.2,"(274, 1302, 922, 1191, 1063, 1325, 1326, 1080,..."


In [39]:
frequent_itemsets.head(100)

,support,itemsets
0,0.2,(18)
1,0.2,(77)
2,0.4,(186)
3,0.2,(217)
4,0.2,(222)
...,...,...
95,0.2,"(274, 555)"
96,0.2,"(274, 626)"
97,0.4,"(274, 725)"
98,0.2,"(274, 732)"


In [40]:
frequent_itemsets.count()

support     1058835
itemsets    1058835
dtype: int64
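
One plausible reason for the size of this result: when the support threshold prunes nothing, a row with k True columns contributes every one of its 2^k - 1 non-empty subsets. The sketch below counts itemsets by brute force on a hypothetical five-row sample (the rows are made up), then prints the subset count for a single 20-item row.

```python
from itertools import combinations

def count_itemsets(rows):
    """Count distinct non-empty itemsets contained in at least one row."""
    seen = set()
    for row in rows:
        items = sorted(row)
        for size in range(1, len(items) + 1):
            seen.update(combinations(items, size))
    return len(seen)

# Hypothetical 5-row sample; each set lists the columns that are True in that row
rows = [{1, 2, 3}, {2, 3, 4, 5}, {6}, set(), {1, 6}]
n_itemsets = count_itemsets(rows)
print(n_itemsets)  # 21

# A single row with 20 True columns already yields this many itemsets
print(2**20 - 1)  # 1048575
```

That figure is the same order of magnitude as the 1,058,835 itemsets reported above. Raising `min_support` above 0.2, mining more rows, or capping itemset length with the `max_len` argument of mlxtend's `apriori` are ways to keep the result small.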

In [41]:
run_frequent_itemsets = frequent_itemsets.head(10000)

In [42]:
run_frequent_itemsets

,support,itemsets
0,0.2,(18)
1,0.2,(77)
2,0.4,(186)
3,0.2,(217)
4,0.2,(222)
...,...,...
9995,0.2,"(1191, 1326, 626, 953, 186)"
9996,0.2,"(1507, 1191, 626, 953, 186)"
9997,0.2,"(1227, 626, 1302, 953, 186)"
9998,0.2,"(1227, 1325, 626, 953, 186)"


In [43]:
# Step 3: Extract Association Rules
rules = association_rules(run_frequent_itemsets, metric="lift", min_threshold=1.2)


In [44]:
import pandas as pd

# Display top rules
pd.set_option("display.max_columns", None)  # Show all columns
print(rules.head())  # Print top rules

# Save rules to a CSV file 
rules.to_csv("apriori_rules.csv", index=False)

  antecedents consequents  antecedent support  consequent support  support  \
0       (458)        (18)                 0.2                 0.2      0.2   
1        (18)       (458)                 0.2                 0.2      0.2   
2        (18)       (631)                 0.2                 0.2      0.2   
3       (631)        (18)                 0.2                 0.2      0.2   
4        (18)      (1487)                 0.2                 0.2      0.2   

   confidence  lift  representativity  leverage  conviction  zhangs_metric  \
0         1.0   5.0               1.0      0.16         inf            1.0   
1         1.0   5.0               1.0      0.16         inf            1.0   
2         1.0   5.0               1.0      0.16         inf            1.0   
3         1.0   5.0               1.0      0.16         inf            1.0   
4         1.0   5.0               1.0      0.16         inf            1.0   

   jaccard  certainty  kulczynski  
0      1.0        1.0     

In [45]:
encoded_drugs = [458, 18, 631, 1487]  # Replace with actual encoded drug IDs
decoded_drugs = drug_encoder.inverse_transform(encoded_drugs)
print("Decoded drug names:", decoded_drugs)

Decoded drug names: ['dexfenfluramine' 'acetylsalicylic acid' 'fenfluramine' 'sibutramine']


In [46]:
print(frequent_itemsets.columns)


Index(['support', 'itemsets'], dtype='object')


In [47]:
frequent_itemsets

,support,itemsets
0,0.2,(18)
1,0.2,(77)
2,0.4,(186)
3,0.2,(217)
4,0.2,(222)
...,...,...
1058830,0.2,"(274, 1302, 922, 1191, 1063, 1325, 1326, 1080,..."
1058831,0.2,"(274, 1302, 922, 1191, 1063, 1325, 1326, 1080,..."
1058832,0.2,"(1302, 922, 1191, 1063, 1325, 1326, 1080, 953,..."
1058833,0.2,"(274, 1302, 922, 1191, 1063, 1325, 1326, 1080,..."


In [48]:
run_frequent_itemsets

,support,itemsets
0,0.2,(18)
1,0.2,(77)
2,0.4,(186)
3,0.2,(217)
4,0.2,(222)
...,...,...
9995,0.2,"(1191, 1326, 626, 953, 186)"
9996,0.2,"(1507, 1191, 626, 953, 186)"
9997,0.2,"(1227, 626, 1302, 953, 186)"
9998,0.2,"(1227, 1325, 626, 953, 186)"


In [49]:
# Decode antecedents and consequents
rules["antecedents_decoded"] = rules["antecedents"].apply(lambda x: [drug_encoder.inverse_transform([i])[0] for i in list(x)])
rules["consequents_decoded"] = rules["consequents"].apply(lambda x: [drug_encoder.inverse_transform([i])[0] for i in list(x)])



In [50]:
import pandas as pd

# Display top rules
pd.set_option("display.max_columns", None)  # Show all columns
print(rules.head())  # Print top rules


  antecedents consequents  antecedent support  consequent support  support  \
0       (458)        (18)                 0.2                 0.2      0.2   
1        (18)       (458)                 0.2                 0.2      0.2   
2        (18)       (631)                 0.2                 0.2      0.2   
3       (631)        (18)                 0.2                 0.2      0.2   
4        (18)      (1487)                 0.2                 0.2      0.2   

   confidence  lift  representativity  leverage  conviction  zhangs_metric  \
0         1.0   5.0               1.0      0.16         inf            1.0   
1         1.0   5.0               1.0      0.16         inf            1.0   
2         1.0   5.0               1.0      0.16         inf            1.0   
3         1.0   5.0               1.0      0.16         inf            1.0   
4         1.0   5.0               1.0      0.16         inf            1.0   

   jaccard  certainty  kulczynski     antecedents_decoded  \
0

In [53]:
# Create a mapping dictionary from (Drug_A, Drug_B) to the encoded Interaction
# Keys are ordered pairs, so a pair stored as (B, A) is not found when looked up as (A, B)
interaction_mapping = ddinter_a.set_index(["Drug_A_Name", "Drug_B_Name"])["Interaction"].to_dict()

# Function to get interaction type
def get_interaction(row):
    drug_a = row["antecedents_decoded"][0]  # First drug in the antecedent list
    drug_b = row["consequents_decoded"][0]  # First drug in the consequent list
    return interaction_mapping.get((drug_a, drug_b), "Unknown")  # Encoded level, or "Unknown" if this order is absent

# Apply function to each row in `rules`
rules["interaction_type"] = rules.apply(get_interaction, axis=1)

# Display updated rules with interactions
print(rules[["antecedents_decoded", "consequents_decoded", "interaction_type"]].head())




      antecedents_decoded     consequents_decoded interaction_type
0       [dexfenfluramine]  [acetylsalicylic acid]          Unknown
1  [acetylsalicylic acid]       [dexfenfluramine]                2
2  [acetylsalicylic acid]          [fenfluramine]          Unknown
3          [fenfluramine]  [acetylsalicylic acid]                2
4  [acetylsalicylic acid]           [sibutramine]                2
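
In the output above, each "Unknown" row has its reverse pair labeled 2, which suggests the lookup is sensitive to drug order. A minimal sketch of an order-independent lookup follows. It uses the three labeled pairs from the output as sample records, keys them by `frozenset`, and decodes the integer with the alphabetical LabelEncoder mapping; `pair_to_code`, `severity_names`, and `lookup` are hypothetical names.

```python
# Alphabetical LabelEncoder codes for the three severity labels
severity_names = {0: "major", 1: "minor", 2: "moderate"}

# Sample records taken from the output above: (drug_a, drug_b, encoded level)
records = [
    ("acetylsalicylic acid", "dexfenfluramine", 2),
    ("fenfluramine", "acetylsalicylic acid", 2),
    ("acetylsalicylic acid", "sibutramine", 2),
]

# A frozenset key ignores the order of the two drugs
pair_to_code = {frozenset((a, b)): code for a, b, code in records}

def lookup(drug_a, drug_b):
    """Return the severity label for a pair in either order, or 'unknown'."""
    code = pair_to_code.get(frozenset((drug_a, drug_b)))
    return "unknown" if code is None else severity_names[code]

print(lookup("dexfenfluramine", "acetylsalicylic acid"))  # moderate
print(lookup("acetylsalicylic acid", "fenfluramine"))     # moderate
```

Returning the label instead of the raw integer also avoids having to remember which code stands for which severity when reading the results.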


**Insights from the Association Rule Mining Results on Drug Interactions**

The association rules were mined from the first five rows of the transaction matrix, so they describe co-occurrence within that small sample rather than general patterns. Here are the key findings:

1️⃣ **Drug Associations with High Lift (5.0)**

Each of the rules shown has a lift of 5.0: within the sample, the consequent drug appears alongside the antecedent five times more often than it would if the two were independent.
In this matrix a row is a Drug_A entry and a column is a Drug_B entry, so two drugs "appear together" when both have a recorded interaction with the same Drug_A. The rules therefore describe shared interaction partners in DDInter, not prescriptions, purchases, or usage patterns.
Example:

**Antecedent:** dexfenfluramine

**Consequent:** acetylsalicylic acid

**Lift: 5.0** → Co-occurrence in one of the five sampled rows

**Interpretation:**
A support of 0.2 on five rows means the pair co-occurs in exactly one row, and each drug appears only in that row, which gives a confidence of 1.0 and a lift of 5.0. The rule says that dexfenfluramine and acetylsalicylic acid share an interaction partner in the sample; it does not show that patients taking one are likely to take the other. The full matrix would need to be mined before these rules could be read as general patterns.
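
The rule metrics reported above can be reproduced by hand. This sketch assumes what the supports imply: five rows, with the antecedent, the consequent, and the pair each present in exactly one row. Exact fractions are used to avoid floating-point noise.

```python
from fractions import Fraction

n_rows = 5                # size of the sample passed to apriori
rows_with_antecedent = 1  # implied by antecedent support 0.2
rows_with_consequent = 1  # implied by consequent support 0.2
rows_with_both = 1        # implied by rule support 0.2

antecedent_support = Fraction(rows_with_antecedent, n_rows)
consequent_support = Fraction(rows_with_consequent, n_rows)
support = Fraction(rows_with_both, n_rows)

confidence = support / antecedent_support                     # P(consequent | antecedent)
lift = confidence / consequent_support                        # ratio to the independence baseline
leverage = support - antecedent_support * consequent_support  # difference from the baseline

print(float(confidence), float(lift), float(leverage))  # 1.0 5.0 0.16
```

A lift of 5.0 is the maximum possible for an item with support 0.2, and here it rests on a single row, so it signals a small sample more than a strong association.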


2️⃣ **Potentially Harmful Drug Interactions Identified**

The interaction type column labels some associations as "2". Because the labels were encoded alphabetically (major = 0, minor = 1, moderate = 2), a value of 2 marks a documented interaction of moderate severity.

Examples of documented interactions:

Acetylsalicylic acid -> Dexfenfluramine -> 2 (Moderate)
Acetylsalicylic acid -> Sibutramine -> 2 (Moderate)
Fenfluramine -> Acetylsalicylic acid -> 2 (Moderate)

**Interpretation:**

Dexfenfluramine + Acetylsalicylic Acid: Acetylsalicylic acid (Aspirin) is a common blood thinner. Dexfenfluramine is an appetite suppressant that was withdrawn from the market due to cardiovascular risks. Potential Issue: DDInter records a moderate interaction for this pair. The table used here contains only the severity level, so any specific risk (for example bleeding or cardiovascular events) should be confirmed against the DDInter entry.

Sibutramine + Acetylsalicylic Acid: Sibutramine is another appetite suppressant that was linked to heart attacks and strokes. Potential Issue: this pair is also labeled moderate, and the mechanism should likewise be checked in the source database.


💡 **Actionable Takeaway:**

These pairs carry a documented moderate interaction. Healthcare professionals should review the source entry and exercise caution before combining acetylsalicylic acid (aspirin) with appetite suppressants such as dexfenfluramine or sibutramine.

3️⃣ **Unknown Interactions – Potential Research Opportunities**

Some rules are labeled "Unknown", meaning the lookup found no entry for that (antecedent, consequent) order. In the rows shown, every "Unknown" pair has its reverse order labeled 2, so these cases reflect the order-sensitive lookup rather than a missing record.

🔹 **Why This Matters:**

Just because a drug interaction is "unknown" doesn’t mean it’s safe.
Pairs that remain unknown after both orders are checked could be examined in future clinical studies.
Researchers and pharmacists should investigate these cases further.


4️⃣ **Applications & Use Cases**

🔹 **For Pharmacists & Medical Professionals:**

This data can be used in clinical decision support systems (CDSS) to warn doctors or pharmacists when risky drug combinations appear.
It could also be integrated into electronic health records (EHRs) to help prevent harmful prescriptions.

🔹 **For AI & Machine Learning in Healthcare:**

AI models can be trained using this dataset to predict new drug interactions based on historical data.
Unknown interactions could be validated through machine learning models that analyze patient records.

🔹 **For Drug Regulation & Policy Makers:**

Regulatory bodies (FDA, WHO) can flag these interactions for further review.
If new risks are discovered, they can issue safety guidelines to restrict certain drug combinations.
