
**Step 1: Install Necessary Libraries**

In [18]:
!pip install mlxtend -qqq
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

In [30]:
# Load the dropout dataset
df = pd.read_csv('/content/sample_data/DropOut.csv')  # Adjust path if needed

# Select categorical columns of interest
cols = ["Mother's occupation", "Target"]

# Convert all values to strings for clean formatting
df_filtered = df[cols].astype(str)

# Convert each row into a list of "attribute=value" items
transactions = df_filtered.apply(lambda row: [f"{col}={row[col]}" for col in df_filtered.columns], axis=1).tolist()

# Transaction Encoding
te = TransactionEncoder()
te_ary = te.fit(transactions).transform(transactions)
df_encoded = pd.DataFrame(te_ary, columns=te.columns_)

# Apply Apriori Algorithm
frequent_itemsets = apriori(df_encoded, min_support=0.01, use_colnames=True)

# Generate Association Rules
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)

# Sort and display top rules
rules.sort_values('lift', ascending=False).head()


| index | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | representativity | leverage | conviction | zhangs_metric | jaccard | certainty | kulczynski |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 4 | (Target=Dropout) | (Mother's occupation=12) | 0.321203 | 0.015823 | 0.011528 | 0.03589 | 2.268262 | 1.0 | 0.006446 | 1.020814 | 0.823712 | 0.035417 | 0.02039 | 0.382231 |
| 5 | (Mother's occupation=12) | (Target=Dropout) | 0.015823 | 0.321203 | 0.011528 | 0.728571 | 2.268262 | 1.0 | 0.006446 | 2.500833 | 0.568123 | 0.035417 | 0.600133 | 0.382231 |
| 1 | (Mother's occupation=1) | (Target=Dropout) | 0.03255 | 0.321203 | 0.022378 | 0.6875 | 2.140394 | 1.0 | 0.011923 | 2.172152 | 0.550722 | 0.067531 | 0.539627 | 0.378585 |
| 0 | (Target=Dropout) | (Mother's occupation=1) | 0.321203 | 0.03255 | 0.022378 | 0.069669 | 2.140394 | 1.0 | 0.011923 | 1.039899 | 0.784912 | 0.067531 | 0.038368 | 0.378585 |
| 7 | (Target=Enrolled) | (Mother's occupation=3) | 0.179476 | 0.071881 | 0.017631 | 0.098237 | 1.366665 | 1.0 | 0.00473 | 1.029227 | 0.326976 | 0.075435 | 0.028397 | 0.17176 |
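
The confidence and lift columns can be re-derived from the three support columns, which helps when reading the table. The sketch below does this for rules 4 and 5, using the standard definitions confidence(A -> B) = support(A and B) / support(A) and lift(A -> B) = support(A and B) / (support(A) * support(B)). The helper names `confidence` and `lift` are illustrative, not part of mlxtend, and the inputs are the rounded values shown in the table, so the results match only to about three decimal places.

```python
# Values copied from rules 4 and 5 in the table above (already rounded there).
support_both = 0.011528      # support(Target=Dropout AND Mother's occupation=12)
support_dropout = 0.321203   # support(Target=Dropout)
support_occ12 = 0.015823     # support(Mother's occupation=12)


def confidence(support_ab, support_a):
    """confidence(A -> B) = support(A and B) / support(A)."""
    return support_ab / support_a


def lift(support_ab, support_a, support_b):
    """lift(A -> B) = support(A and B) / (support(A) * support(B))."""
    return support_ab / (support_a * support_b)


# Rule 4: Target=Dropout -> Mother's occupation=12
conf_rule4 = confidence(support_both, support_dropout)
# Rule 5: Mother's occupation=12 -> Target=Dropout
conf_rule5 = confidence(support_both, support_occ12)
# Lift is symmetric in A and B, so both rules share one value.
lift_both = lift(support_both, support_dropout, support_occ12)

print(f"rule 4 confidence: {conf_rule4:.3f}")
print(f"rule 5 confidence: {conf_rule5:.3f}")
print(f"lift (both directions): {lift_both:.3f}")
```

Lift is the same for a rule and its reverse, while confidence depends on which side is the antecedent. That is why rules 4 and 5 share a lift of about 2.27 but have very different confidence values.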


Insights found:

4. Antecedents: Target=Dropout
Consequents: Mother's occupation=12
Lift: 2.268
Confidence: 0.0359 = 3.6%
Students who dropped out are about 2.27 times as likely as students overall to have a mother with occupation code 12 (3.6% versus 1.6%).
Even so, only 3.6% of dropouts have a mother with this occupation, so the rule describes a small group.

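5. Antecedents: Mother's occupation=12
Consequents: Target=Dropout
Lift: 2.268
Confidence: 0.729 = 72.9%
Among students whose mother's occupation is 12, 72.9% dropped out, compared with 32.1% of all students. This group makes up only 1.6% of the dataset, so the rate is based on a small group.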

1. Antecedents: Mother's occupation=1
Consequents: Target=Dropout
Lift: 2.140
Confidence: 0.688 = 68.8%
Among students whose mother's occupation is 1, 68.8% dropped out, about 2.14 times the overall dropout rate of 32.1%.
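
Two of the three insights above answer the question "which occupation codes go with dropping out?", that is, rules whose consequent is Target=Dropout. A minimal sketch of pulling out just those rules and ranking them by confidence follows. It uses plain Python dictionaries with rows copied from the displayed table; the keys `antecedent` and `consequent` are a simplification for illustration, since mlxtend stores these columns as frozensets in the `rules` DataFrame.

```python
# Rows copied from the displayed table; only the columns needed here.
# Plain strings stand in for the frozensets that mlxtend actually returns.
displayed_rules = [
    {"antecedent": "Target=Dropout", "consequent": "Mother's occupation=12",
     "confidence": 0.03589, "lift": 2.268262},
    {"antecedent": "Mother's occupation=12", "consequent": "Target=Dropout",
     "confidence": 0.728571, "lift": 2.268262},
    {"antecedent": "Mother's occupation=1", "consequent": "Target=Dropout",
     "confidence": 0.6875, "lift": 2.140394},
    {"antecedent": "Target=Dropout", "consequent": "Mother's occupation=1",
     "confidence": 0.069669, "lift": 2.140394},
    {"antecedent": "Target=Enrolled", "consequent": "Mother's occupation=3",
     "confidence": 0.098237, "lift": 1.366665},
]

# Overall dropout rate: the consequent support of Target=Dropout in the table.
BASELINE_DROPOUT = 0.321203

# Keep only rules that predict dropout, highest confidence first.
dropout_rules = [r for r in displayed_rules if r["consequent"] == "Target=Dropout"]
dropout_rules.sort(key=lambda r: r["confidence"], reverse=True)

for r in dropout_rules:
    print(f"{r['antecedent']}: {r['confidence']:.1%} dropped out "
          f"(baseline {BASELINE_DROPOUT:.1%}, lift {r['lift']:.2f})")
```

With the real `rules` DataFrame, the analogous filter would test whether `"Target=Dropout"` is a member of each `consequents` frozenset rather than comparing strings.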

