
Implement Apriori Algorithm for Association Rule Mining #21

@noahgift

Problem Statement

Apriori discovers frequent itemsets and association rules in transactional data and is commonly used for market basket analysis. It is currently missing from aprender.

Use Cases:

  • Market basket analysis ("customers who bought X also bought Y")
  • Recommendation systems
  • Cross-selling strategies
  • Web usage mining

Example Rules:

  • {milk, bread} → {butter} (support=0.3, confidence=0.8)
  • "30% of transactions contain milk, bread, butter"
  • "80% of transactions with milk and bread also have butter"
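For reference, the three metrics named in this issue follow the standard definitions below, with T the set of transactions and X, Y disjoint itemsets. The notation is illustrative and not taken from aprender.

```latex
\begin{align*}
\operatorname{supp}(X) &= \frac{|\{\, t \in T : X \subseteq t \,\}|}{|T|} \\
\operatorname{conf}(X \to Y) &= \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)} \\
\operatorname{lift}(X \to Y) &= \frac{\operatorname{conf}(X \to Y)}{\operatorname{supp}(Y)}
\end{align*}
```

Applied to the example rule, support=0.3 and confidence=0.8 together imply supp({milk, bread}) = 0.3 / 0.8 = 0.375, i.e. 37.5% of transactions contain both milk and bread. The lift cannot be derived from these two numbers alone, because it also needs supp({butter}).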

Proposed Solution

Implement the Apriori algorithm following EXTREME TDD.

Algorithm

Steps:

  1. Find frequent 1-itemsets (items above min_support)
  2. Generate candidate k-itemsets from frequent (k-1)-itemsets
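  3. Prune candidates using the Apriori principle:
    • If an itemset is infrequent, all of its supersets are infrequent
    • Count support for the surviving candidates, keep those that meet min_support, and repeat from step 2 until no frequent k-itemsets remain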
  4. Generate association rules from frequent itemsets
  5. Filter rules by min_confidence
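Steps 1-3 could be sketched roughly as follows. This is a minimal, unoptimized illustration and not the proposed aprender implementation: the function names `support` and `frequent_itemsets` are hypothetical, itemsets are kept as sorted `Vec<usize>`, and "frequent" is assumed to mean support >= min_support.

```rust
use std::collections::{BTreeSet, HashSet};

/// Fraction of transactions that contain every item of `itemset`.
fn support(itemset: &[usize], transactions: &[BTreeSet<usize>]) -> f32 {
    let hits = transactions
        .iter()
        .filter(|t| itemset.iter().all(|i| t.contains(i)))
        .count();
    hits as f32 / transactions.len() as f32
}

/// Level-wise search (hypothetical helper): returns each frequent itemset,
/// with items in ascending order, together with its support.
fn frequent_itemsets(transactions: &[Vec<usize>], min_support: f32) -> Vec<(Vec<usize>, f32)> {
    if transactions.is_empty() {
        return Vec::new();
    }
    // Deduplicate items within each transaction.
    let txs: Vec<BTreeSet<usize>> = transactions
        .iter()
        .map(|t| t.iter().copied().collect())
        .collect();

    // Step 1: frequent 1-itemsets, in ascending item order.
    let items: BTreeSet<usize> = txs.iter().flatten().copied().collect();
    let mut level: Vec<Vec<usize>> = items
        .into_iter()
        .map(|i| vec![i])
        .filter(|c| support(c, &txs) >= min_support)
        .collect();

    let mut result = Vec::new();
    while !level.is_empty() {
        for set in &level {
            result.push((set.clone(), support(set, &txs)));
        }
        let prev: HashSet<Vec<usize>> = level.iter().cloned().collect();
        let mut candidates = BTreeSet::new();
        // Step 2: join pairs of frequent (k-1)-itemsets sharing their first k-2 items.
        for (i, a) in level.iter().enumerate() {
            for b in &level[i + 1..] {
                let k = a.len();
                if a[..k - 1] != b[..k - 1] {
                    continue;
                }
                // `level` is sorted, so a's last item is smaller than b's last item.
                let mut c = a.clone();
                c.push(b[k - 1]);
                // Step 3: keep the candidate only if every (k-1)-subset is frequent.
                let all_frequent = (0..c.len()).all(|skip| {
                    let sub: Vec<usize> = c
                        .iter()
                        .enumerate()
                        .filter(|(j, _)| *j != skip)
                        .map(|(_, x)| *x)
                        .collect();
                    prev.contains(&sub)
                });
                if all_frequent {
                    candidates.insert(c);
                }
            }
        }
        // Count support for the surviving candidates; BTreeSet keeps them sorted.
        level = candidates
            .into_iter()
            .filter(|c| support(c, &txs) >= min_support)
            .collect();
    }
    result
}
```

For example, with transactions `[0,1,2]`, `[0,1]`, `[0,2]`, `[1,3]` and min_support 0.5, the candidate `{0,1,2}` is pruned without a counting pass because its subset `{1,2}` is infrequent. A production version would likely count all candidates of a level in a single scan rather than rescanning per candidate.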

Implementation

API Design:

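/// Apriori miner: thresholds plus the results populated by `fit`.
pub struct Apriori {
    min_support: f32,     // minimum fraction of transactions, in [0, 1]
    min_confidence: f32,  // minimum rule confidence, in [0, 1]
    frequent_itemsets: Option<Vec<ItemSet>>,  // None until `fit` has run
    rules: Option<Vec<AssociationRule>>,      // None until `fit` has run
}

/// A set of item IDs and the fraction of transactions containing all of them.
pub struct ItemSet {
    items: Vec<usize>,
    support: f32,
}

/// A rule "antecedent → consequent" with its quality metrics.
pub struct AssociationRule {
    antecedent: Vec<usize>,  // If
    consequent: Vec<usize>,  // Then
    support: f32,            // support of antecedent ∪ consequent
    confidence: f32,         // support / support(antecedent)
    lift: f32,               // confidence / support(consequent)
}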

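impl Apriori {
    /// Mines frequent itemsets and rules; each transaction is a list of item IDs.
    pub fn fit(&mut self, transactions: &[Vec<usize>]) -> Result<(), &'static str> {
        todo!("level-wise search and rule generation over `transactions`")
    }

    /// Frequent itemsets found by `fit` (sketch: empty before `fit` is called).
    pub fn frequent_itemsets(&self) -> &[ItemSet] {
        self.frequent_itemsets.as_deref().unwrap_or(&[])
    }

    /// Rules found by `fit` (sketch: empty before `fit` is called).
    pub fn association_rules(&self) -> &[AssociationRule] {
        self.rules.as_deref().unwrap_or(&[])
    }
}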
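Steps 4-5 of the algorithm (rule generation and confidence filtering) might look something like the sketch below. It is an illustration under stated assumptions, not the planned implementation: `generate_rules` is a hypothetical free function, the struct is a local copy of `AssociationRule` with public fields, and the `supports` map is assumed to hold every frequent itemset keyed by its items in ascending order.

```rust
use std::collections::HashMap;

/// Local copy of the proposed rule type, with public fields for the sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct AssociationRule {
    pub antecedent: Vec<usize>,
    pub consequent: Vec<usize>,
    pub support: f32,
    pub confidence: f32,
    pub lift: f32,
}

/// Hypothetical helper: builds all rules X → Y with X ∪ Y frequent and
/// confidence >= `min_confidence`. Intended for small itemsets (fewer than 32 items).
pub fn generate_rules(
    supports: &HashMap<Vec<usize>, f32>,
    min_confidence: f32,
) -> Vec<AssociationRule> {
    let mut rules = Vec::new();
    for (itemset, &support) in supports {
        let n = itemset.len();
        if n < 2 {
            continue; // a rule needs a non-empty antecedent and consequent
        }
        // Each mask in 1..2^n - 1 selects a non-empty proper subset as antecedent.
        for mask in 1u32..(1u32 << n) - 1 {
            let (mut antecedent, mut consequent) = (Vec::new(), Vec::new());
            for (bit, &item) in itemset.iter().enumerate() {
                if (mask & (1u32 << bit)) != 0 {
                    antecedent.push(item);
                } else {
                    consequent.push(item);
                }
            }
            // Subsets of a frequent itemset are frequent, so both lookups
            // should succeed; skip defensively if the map is incomplete.
            let (Some(&supp_a), Some(&supp_c)) =
                (supports.get(&antecedent), supports.get(&consequent))
            else {
                continue;
            };
            let confidence = support / supp_a;
            if confidence >= min_confidence {
                rules.push(AssociationRule {
                    antecedent,
                    consequent,
                    support,
                    confidence,
                    lift: confidence / supp_c,
                });
            }
        }
    }
    rules
}
```

Because `HashMap` iteration order is unspecified, the returned rules come back in no particular order; callers that need stable output may want to sort them, for instance by descending confidence.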

Success Criteria

  • ✅ Apriori with frequent itemset mining
  • ✅ Association rule generation
  • ✅ Support, confidence, lift metrics
  • ✅ 10+ tests (including retail dataset)
  • ✅ Zero clippy warnings
  • ✅ Example: examples/market_basket.rs

Estimated Effort

Timeline: 3-4 days
Complexity: Medium (combinatorial enumeration, pruning)

Labels: enhancement (New feature or request)
