---
title: "Association Rule Mining"
---

## Association Rule Mining

- Association rule mining (ARM) is a data analysis technique for uncovering relationships among variables in large datasets. It works by identifying frequent patterns: combinations of items that regularly appear together in the dataset.

- The method builds on frequent pattern mining (FPM), a family of data mining algorithms rather than a single algorithm. FPM's core objective is to reveal correlations and associations between items that may not be immediately obvious. It surfaces patterns that indicate whether the presence of one set of items is related to the presence of another set.

- These patterns are typically expressed through association rules, a form of rule-based machine learning. Association rules are useful not only for uncovering anomalies and regularities within data but also for a variety of other applications. These applications range widely, from analysis in retail and marketing, such as basket analysis and cross-marketing strategies, to more technical realms like software bug tracking.

- The discovery of frequent itemsets using algorithms like Apriori is integral to many data mining initiatives. These itemsets form the foundation for more complex analysis, such as sequence discovery, interesting pattern recognition, and, most notably, the mining of association rules.

- A practical example of association rules in action is their use in understanding consumer behavior in retail settings. Such rules might reveal the frequency and conditions under which certain products are purchased together, offering valuable insights for marketing strategies and inventory management. 
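To make the ideas above concrete, here is a minimal pure-Python sketch of frequent itemsets and the usual rule metrics (support, confidence, lift). The baskets and the helper names `support`, `frequent_itemsets`, and `rule_metrics` are hypothetical and exist only for illustration. The enumeration is brute force, so it shows what Apriori computes, not how Apriori prunes its search.

```python
from itertools import combinations

# Hypothetical toy baskets, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Brute-force search for itemsets whose support is at least `min_support`."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            s = support(candidate, transactions)
            if s >= min_support:
                result[frozenset(candidate)] = s
    return result


def rule_metrics(antecedent, consequent, transactions):
    """Support, confidence, and lift for the rule antecedent -> consequent."""
    s_both = support(set(antecedent) | set(consequent), transactions)
    s_antecedent = support(antecedent, transactions)
    s_consequent = support(consequent, transactions)
    confidence = s_both / s_antecedent  # share of antecedent baskets that also hold the consequent
    lift = confidence / s_consequent    # values above 1 suggest a positive association
    return {"support": s_both, "confidence": confidence, "lift": lift}


itemsets = frequent_itemsets(transactions, min_support=0.4)
metrics = rule_metrics({"bread"}, {"milk"}, transactions)
```

In this toy data the rule bread -> milk has a lift slightly below 1, so buying bread does not make milk more likely than it already is across all baskets.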

## Importing Libraries

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder
```

## Reading in Data

```python
merged_data = pd.read_csv('../data/Final_Data.csv')
```

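```python
# Bin the numeric movement into two labels: values <= 0 become 'Down', values > 0 become 'Up'.
merged_data['Movement_binned'] = pd.cut(merged_data['Movement'], bins=[-float('inf'), 0, float('inf')], labels=['Down', 'Up'])

# Map the numeric sentiment code in the TRAIN column to a text label.
merged_data['Sentiment'] = merged_data['TRAIN'].map({1: 'Positive', 0: 'Neutral', -1: 'Negative'})

# Collect the labels observed on each date into lists (one list per column and date).
transactions = merged_data.groupby('Date')[['Movement_binned', 'Sentiment']].agg(lambda x: x.tolist())

# One-hot encode the transactions.
# Caveat: `transactions` is a DataFrame here, and iterating over a DataFrame
# yields its column labels, so the encoder may receive strings rather than
# the per-date lists of labels.
te = TransactionEncoder()
te_ary = te.fit(transactions).transform(transactions)
df = pd.DataFrame(te_ary, columns=te.columns_)

# Mine itemsets that appear in at least 1% of the transactions.
frequent_itemsets = apriori(df, min_support=0.01, use_colnames=True)

# Keep rules whose confidence is at least 0.1.
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.1)

print(rules)
```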


```
       antecedents                     consequents  antecedent support  \
0              (M)                             (_)                0.04
1              (_)                             (M)                0.04
2              (M)                             (b)                0.04
3              (b)                             (M)                0.04
4              (M)                             (d)                0.04
...            ...                             ...                 ...
173469         (o)  (e, n, d, t, v, i, M, _, m, b)                0.04
173470         (M)  (e, n, d, t, v, i, o, _, m, b)                0.04
173471         (_)  (e, n, d, t, v, i, o, M, m, b)                0.04
173472         (m)  (e, n, d, t, v, i, o, M, _, b)                0.08
173473         (b)  (e, n, d, t, v, i, o, M, _, m)                0.04

        consequent support  support  confidence  lift  leverage  conviction  \
0                     0.04     0
```
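The items in the printed rules are single characters such as `M`, `_`, and `b`, which is what one would expect if the encoder was given strings instead of lists of labels. The sketch below illustrates that mechanism with a hypothetical pure-Python stand-in for the encoding step (`unique_items` and `one_hot` are illustrative helpers, not mlxtend's code). It assumes only that the encoder walks each transaction item by item, and that a string iterates as individual characters. The label lists are invented for illustration.

```python
def unique_items(transactions):
    """Collect every distinct item by iterating over each transaction in turn."""
    items = set()
    for transaction in transactions:
        for item in transaction:
            items.add(item)
    return sorted(items)


def one_hot(transactions):
    """Return the item columns and one boolean row per transaction."""
    columns = unique_items(transactions)
    rows = [[col in set(t) for col in columns] for t in transactions]
    return columns, rows


# Strings iterate character by character, so each letter becomes an "item".
as_strings = ["Movement_binned", "Sentiment"]
char_items = unique_items(as_strings)

# Lists of labels (hypothetical values) produce one item per label instead.
as_lists = [["Up", "Positive"], ["Down", "Negative"], ["Up", "Neutral"]]
columns, rows = one_hot(as_lists)
```

With the string input every item is a single character, while the list input yields the five label columns one would want to mine. Passing an explicit list of lists to the encoder avoids the first case.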

## Conclusion

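- As the output shows, the dataset gathered for my project is not well suited to ARM. The items in the resulting rules are single characters rather than meaningful categories, and the underlying data consists of stock values and text rather than the categorical, transaction-style features that Apriori expects. As a result, the technique does not produce usable rules here.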
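For reference, Apriori operates on discrete items, so numeric values generally have to be turned into labels before mining. The following is a minimal sketch of that step, assuming the same cut-off at zero and the same sentiment codes used earlier in this notebook. The sample rows and the helper names `movement_label` and `to_transaction` are hypothetical, and whether such coarse labels would yield useful rules for this dataset is a separate question.

```python
# Sentiment codes as mapped earlier in the notebook.
SENTIMENT_LABELS = {1: "Positive", 0: "Neutral", -1: "Negative"}


def movement_label(movement):
    """Mirror the notebook's bins: values <= 0 are 'Down', values > 0 are 'Up'."""
    return "Up" if movement > 0 else "Down"


def to_transaction(movement, sentiment_code):
    """Turn one (movement, sentiment code) pair into a list of labeled items."""
    return [
        f"Movement={movement_label(movement)}",
        f"Sentiment={SENTIMENT_LABELS[sentiment_code]}",
    ]


# Hypothetical sample rows: (price movement, sentiment code).
rows = [(1.2, 1), (-0.4, -1), (0.0, 0), (0.7, 1)]
transactions = [to_transaction(m, s) for m, s in rows]
```

Prefixing each label with its column name keeps items from different columns distinct, and the resulting list of lists is the shape a transaction encoder expects.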