NAME:-SAURABH RAJENDRA SABLE
ROLL NO :- RBTL22CB076
SUBJECT:- MACHINE LEARNING
DATASET :- MONKEYPOX DATASET

Aim:
The aim of the Market Basket Analysis project is to employ the Apriori algorithm to discover interesting associations and 
patterns within a retail dataset. The project focuses on understanding the relationships between products frequently purchased 
together by customers. By analyzing these associations, the goal is to provide valuable insights that can be used to enhance 
marketing strategies, optimize product placements, and improve overall customer experience.

Objectives:

Implement the Apriori algorithm to identify frequent itemsets.
Extract meaningful association rules based on the identified itemsets.
Evaluate and interpret the discovered patterns to gain actionable insights.
Apply the insights to enhance business strategies, such as product bundling, cross-selling, and targeted marketing.

Problem Statement:
In retail and e-commerce, understanding customer purchasing behavior is crucial for optimizing business strategies. One effective approach is Market Basket Analysis, which aims to uncover associations and patterns among products that are frequently purchased together. The Apriori algorithm is widely used for this analysis because it identifies frequent itemsets efficiently and yields association rules that can be interpreted directly.

Theory:

The Apriori algorithm, proposed by Agrawal and Srikant in 1994, is a fundamental algorithm for association rule mining. The theory behind Apriori is based on the Apriori property, which states that if an itemset is frequent, then all of its subsets must also be frequent. The algorithm uses a breadth-first search strategy to discover frequent itemsets and generate association rules.

Support and Confidence: Central to the Apriori algorithm are two key metrics: support and confidence. The support of an itemset is the fraction of all transactions that contain it. The confidence of a rule A -> B is the support of A and B together divided by the support of A, which estimates how often B appears in transactions that already contain A.
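The two definitions above can be checked by hand on a small example. The following is a minimal sketch in plain Python, not the mlxtend implementation; the helper names `support` and `confidence` are illustrative, and the transactions are the same five baskets used later in this report.

```python
# Five toy baskets, stored as sets so subset tests are simple.
transactions = [
    {"bread", "milk", "eggs"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent).

    Assumes the antecedent occurs in at least one transaction.
    """
    joint = support(set(antecedent) | set(consequent), transactions)
    return joint / support(antecedent, transactions)

print("support(bread, milk)  =", support({"bread", "milk"}, transactions))
print("confidence(milk -> bread) =", confidence({"milk"}, {"bread"}, transactions))
```

Bread and milk occur together in 3 of the 5 baskets, and milk occurs in 4, so the rule milk -> bread has confidence 3/4 (up to floating-point rounding).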

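Pruning: To optimize the mining process, the Apriori algorithm prunes the candidate itemsets at each level. By the Apriori property, a candidate that has any infrequent subset cannot itself be frequent, so it is discarded before its support is counted. This reduces the search space and improves computational efficiency.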
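The pruning step can be sketched as follows. This is a simplified illustration, not mlxtend's internal code: the function name `generate_candidates` is hypothetical, and the join step simply unions pairs of frequent (k-1)-itemsets rather than using the ordered prefix join of the original paper.

```python
from itertools import combinations

def generate_candidates(frequent_prev, k):
    """Build k-item candidates from the frequent (k-1)-itemsets, then prune.

    A candidate is kept only if every one of its (k-1)-item subsets is
    frequent; otherwise the Apriori property rules it out.
    """
    frequent_prev = {frozenset(s) for s in frequent_prev}
    candidates = set()
    for a, b in combinations(frequent_prev, 2):
        union = a | b
        if len(union) != k:
            continue  # the two itemsets differ in more than one item
        subsets = (frozenset(sub) for sub in combinations(union, k - 1))
        if all(sub in frequent_prev for sub in subsets):
            candidates.add(union)
    return candidates

# All three pairs are frequent, so the triple survives pruning.
pairs = [{"bread", "milk"}, {"bread", "butter"}, {"milk", "butter"}]
print(generate_candidates(pairs, 3))

# {"milk", "butter"} is missing, so the triple is pruned without counting.
print(generate_candidates([{"bread", "milk"}, {"bread", "butter"}], 3))
```

A candidate that survives pruning is not yet known to be frequent; its support still has to be counted against the minimum support threshold.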

Association Rules: Once frequent itemsets are identified, association rules are generated. These rules express relationships between items, providing insights into customer behaviors and preferences.
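Rule generation from frequent itemsets can be sketched as below. This is a minimal illustration in plain Python rather than the `association_rules` function from mlxtend; the name `generate_rules` is hypothetical. The support values are the ones printed in the output section of this report, and the sketch assumes that every subset of a frequent itemset is present in the dictionary, which the Apriori property guarantees.

```python
from itertools import combinations

def generate_rules(supports, min_confidence):
    """Return (antecedent, consequent, confidence, lift) tuples.

    `supports` maps each frequent itemset (a frozenset) to its support.
    """
    rules = []
    for itemset, sup in supports.items():
        if len(itemset) < 2:
            continue  # a rule needs at least one item on each side
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                antecedent = frozenset(ante)
                consequent = itemset - antecedent
                conf = sup / supports[antecedent]
                if conf >= min_confidence:
                    # lift compares the confidence with the consequent's baseline support
                    lift = conf / supports[consequent]
                    rules.append((antecedent, consequent, conf, lift))
    return rules

supports = {
    frozenset({"bread"}): 0.8,
    frozenset({"butter"}): 0.6,
    frozenset({"milk"}): 0.8,
    frozenset({"butter", "bread"}): 0.4,
    frozenset({"milk", "bread"}): 0.6,
    frozenset({"butter", "milk"}): 0.4,
}

for antecedent, consequent, conf, lift in generate_rules(supports, 0.7):
    print(set(antecedent), "->", set(consequent), round(conf, 4), round(lift, 4))
```

A lift above 1 means the two sides occur together more often than independence would predict; a lift below 1 means they occur together less often.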

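Parameter Tuning: The performance of the Apriori algorithm is influenced by user-defined parameters, such as the minimum support threshold. A low threshold admits many itemsets, including ones backed by only a handful of transactions, and increases run time. A high threshold runs faster but can miss less common patterns that are still useful. The threshold is therefore chosen to balance discovering meaningful patterns against admitting spurious associations.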
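The effect of the minimum support threshold can be seen by counting how many itemsets pass at several thresholds. The sketch below uses brute-force enumeration of all item combinations instead of Apriori, which is only practical for a toy example; the function name `count_frequent_itemsets` is illustrative, and the transactions are the five baskets used later in this report.

```python
from itertools import combinations

transactions = [
    {"bread", "milk", "eggs"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
    {"bread", "milk"},
]

def count_frequent_itemsets(transactions, min_support):
    """Count the itemsets whose support is at least `min_support` (brute force)."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    frequent = 0
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            hits = sum(1 for t in transactions if set(combo) <= t)
            if hits / n >= min_support:
                frequent += 1
    return frequent

for threshold in (0.2, 0.3, 0.5, 0.7):
    print(threshold, count_frequent_itemsets(transactions, threshold))
```

On these five baskets the count falls from 11 itemsets at a threshold of 0.2 to 6 at 0.3, 4 at 0.5, and 2 at 0.7.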

In [2]:
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules
import pandas as pd

In [10]:
transactions = [
    ['bread', 'milk', 'eggs'],
    ['bread', 'butter'],
    ['milk', 'butter'],
    ['bread', 'milk', 'butter'],
    ['bread', 'milk'],
]

In [11]:
# One-hot encode the baskets: one row per transaction, one boolean column per item
te = TransactionEncoder()
te_ary = te.fit(transactions).transform(transactions)
df = pd.DataFrame(te_ary, columns=te.columns_)


In [12]:
# Keep itemsets that occur in at least 30% of the transactions
frequent_itemsets = apriori(df, min_support=0.3, use_colnames=True)

In [13]:
# Keep rules whose confidence is at least 0.7
rules = association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)

In [14]:
print("Frequent Itemsets:")
print(frequent_itemsets)

print("\nAssociation Rules:")
print(rules)

Frequent Itemsets:
   support         itemsets
0      0.8          (bread)
1      0.6         (butter)
2      0.8           (milk)
3      0.4  (butter, bread)
4      0.6    (milk, bread)
5      0.4   (butter, milk)

Association Rules:
  antecedents consequents  antecedent support  consequent support  support  \
0      (milk)     (bread)                 0.8                 0.8      0.6   
1     (bread)      (milk)                 0.8                 0.8      0.6   

   confidence    lift  leverage  conviction  zhangs_metric  
0        0.75  0.9375     -0.04         0.8          -0.25  
1        0.75  0.9375     -0.04         0.8          -0.25  


Conclusion:
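In conclusion, the Market Basket Analysis project applied the Apriori algorithm to a small sample of five transactions.
With a minimum support of 0.3 it found six frequent itemsets, and with a minimum confidence of 0.7 it produced two
association rules, (milk) -> (bread) and (bread) -> (milk), each with a confidence of 0.75. Both rules have a lift of
0.9375, slightly below 1, so in this sample bread and milk appear together slightly less often than they would if they
were purchased independently. The same workflow can be applied to a full retail dataset to obtain information about
customer purchasing behaviors.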