# **LCOFI Algorithm Example**

This document explains the implementation of the **LCOFI (Logic Circuit Optimization Frequent Itemset)** algorithm for mining frequent itemsets from a transactional dataset using a graph-based approach.

---

## **Overview**
The **LCOFI algorithm** mines frequent itemsets from a bipartite graph of transactions and items. Candidates that fall below the minimum support threshold are pruned at each level, so fewer itemsets need their support counted at the next level; this matters most for large or sparse datasets.

---

## **Key Features**
1. **Graph Representation**:
   - Transactions and items are represented as nodes.
   - Edges connect transactions to the items they contain.

2. **Candidate Generation**:
   - Candidates are generated by combining frequent subsets iteratively.

3. **Support Counting**:
   - Support is calculated using graph traversal to count the occurrences of itemsets in transactions.

4. **Dynamic Pruning**:
   - Itemsets that fall below the minimum support threshold are pruned at each level, before larger candidates are generated.
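
The graph representation and traversal-based support counting described above can be sketched as follows. This is a minimal illustration, not the notebook's implementation: it assumes the bipartite graph is stored as a plain dictionary from each item to its neighbouring transactions rather than as a `networkx` graph, and the helper names `build_item_index` and `graph_support` are hypothetical.

```python
from collections import defaultdict

def build_item_index(transactions):
    """Map each item to the set of transaction indices it appears in.

    This is the item side of the bipartite graph: the neighbours of an
    item node are the transactions that contain it.
    """
    item_to_tids = defaultdict(set)
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            item_to_tids[item].add(tid)
    return dict(item_to_tids)

def graph_support(itemset, item_to_tids, num_transactions):
    """Support = share of transactions adjacent to every item in the itemset."""
    neighbour_sets = [item_to_tids.get(item, set()) for item in itemset]
    if not neighbour_sets:
        return 1.0  # every transaction contains the empty itemset
    common = set.intersection(*neighbour_sets)
    return len(common) / num_transactions

transactions = [
    {'A', 'B', 'C'},
    {'A', 'C'},
    {'B', 'C', 'D'},
    {'A', 'C', 'D'},
    {'B', 'C'},
]
index = build_item_index(transactions)
print(graph_support({'B', 'C'}, index, len(transactions)))  # 0.6
print(graph_support({'A', 'D'}, index, len(transactions)))  # 0.2
```

Intersecting neighbour sets touches only the transactions that contain the items in question, which is where a graph view can save work compared with scanning every transaction for every candidate.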

---

## **Implementation Details**
### **Steps of the Algorithm**
1. **Generate Candidate Itemsets**:
   - Combine frequent subsets to generate candidate itemsets for the next level.

2. **Count Support**:
   - Check how many transactions contain each candidate itemset and calculate its support.

3. **Prune Infrequent Itemsets**:
   - Remove candidates that do not meet the minimum support threshold.

4. **Iterate for Larger Itemsets**:
   - Repeat the process for itemsets of increasing sizes until no more frequent itemsets can be generated.
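
One common way to realise the "combine frequent subsets" step is the Apriori-style join and prune, sketched below under the assumption that all input itemsets have the same size k. The function name `apriori_candidates` is hypothetical, and this is a swapped-in illustration: the notebook's own `generate_candidates` instead enumerates every combination of the surviving items.

```python
from itertools import combinations

def apriori_candidates(frequent_prev):
    """Build size-(k+1) candidates from frequent size-k itemsets (join + prune)."""
    frequent_prev = set(frequent_prev)
    candidates = set()
    for a, b in combinations(frequent_prev, 2):
        union = a | b
        # Join: keep only unions that grow the itemset by exactly one item.
        if len(union) != len(a) + 1:
            continue
        # Prune: every size-k subset of the candidate must itself be frequent.
        if all(frozenset(sub) in frequent_prev for sub in combinations(union, len(a))):
            candidates.add(union)
    return candidates

level_1 = {frozenset('A'), frozenset('B'), frozenset('C')}
print(sorted(sorted(c) for c in apriori_candidates(level_1)))
# [['A', 'B'], ['A', 'C'], ['B', 'C']]

level_2 = {frozenset('AC'), frozenset('BC')}
print(apriori_candidates(level_2))
# set(): {A, B, C} is pruned because {A, B} is not frequent
```

The prune step relies on the fact that an itemset can only be frequent if all of its subsets are, so candidates with an infrequent subset never need their support counted.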

---

### **Code**
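The code below demonstrates:
- Building a bipartite graph of transactions and items.
- Generating candidate itemsets.
- Counting support and pruning candidates below the threshold.
- Iteratively mining frequent itemsets.

In this example the graph is used only to enumerate the items; support is counted by scanning the transactions directly.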

### **Input Dataset**
Sample transactions:
```plaintext
Transaction 1: {A, B, C}
Transaction 2: {A, C}
Transaction 3: {B, C, D}
Transaction 4: {A, C, D}
Transaction 5: {B, C}
```

### **Parameters**
- `transactions`: The list of transactions, where each transaction is a set of items.
- `min_support`: The minimum support threshold, as a fraction of all transactions (e.g., 0.6).
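
To make the threshold concrete, the sketch below converts a relative `min_support` into the smallest number of transactions an itemset must appear in. The helper name `min_support_count` is hypothetical and not part of the notebook; it goes through `Fraction` on the assumption that the threshold is given as a short decimal such as 0.6, so that binary floating-point rounding cannot push the result up by one.

```python
import math
from fractions import Fraction

def min_support_count(min_support, num_transactions):
    """Smallest number of transactions an itemset needs to reach min_support."""
    # Going through str() turns 0.6 into the exact fraction 3/5
    # instead of its binary floating-point approximation.
    threshold = Fraction(str(min_support))
    return math.ceil(threshold * num_transactions)

print(min_support_count(0.6, 5))  # 3
print(min_support_count(0.5, 5))  # 3
```

With the five sample transactions and `min_support = 0.6`, an itemset is frequent only if it occurs in at least three transactions.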



```python
import itertools
import networkx as nx

# Sample transactional dataset
transactions = [
    {'A', 'B', 'C'},
    {'A', 'C'},
    {'B', 'C', 'D'},
    {'A', 'C', 'D'},
    {'B', 'C'},
]

# Parameters
min_support = 0.6  # Minimum support threshold (fraction of transactions)

# Step 1: Generate candidate itemsets
def generate_candidates(frequent_itemsets, size):
    """Generate every combination of the given size from the items in the frequent itemsets."""
    items = set(itertools.chain.from_iterable(frequent_itemsets))
    return {frozenset(combo) for combo in itertools.combinations(items, size)}

# Step 2: Count support
def count_support(itemsets, transactions, min_support):
    """Count the transactions containing each itemset; keep those that meet min_support."""
    support_counts = {itemset: 0 for itemset in itemsets}
    for transaction in transactions:
        for itemset in itemsets:
            if itemset.issubset(transaction):
                support_counts[itemset] += 1
    # Prune itemsets whose relative support is below the threshold
    return {
        itemset: count
        for itemset, count in support_counts.items()
        if count / len(transactions) >= min_support
    }

# Step 3: Generate frequent itemsets
def lcofi(transactions, min_support):
    """LCOFI algorithm for frequent itemsets using graph representation."""
    # Step 3.1: Represent transactions as a bipartite graph
    G = nx.Graph()
    for i, transaction in enumerate(transactions):
        for item in transaction:
            G.add_edge(f"Transaction_{i}", item)

    # Step 3.2: Start with single-item itemsets
    # (assumes item labels are strings that do not start with "Transaction")
    single_items = {frozenset([node]) for node in G if G.degree[node] > 0 and not node.startswith("Transaction")}
    frequent_itemsets = count_support(single_items, transactions, min_support)

    all_frequent_itemsets = [frequent_itemsets]

    # Step 3.3: Iteratively generate larger itemsets until a level comes back empty
    k = 2
    while frequent_itemsets:
        candidates = generate_candidates(frequent_itemsets, k)
        frequent_itemsets = count_support(candidates, transactions, min_support)
        if frequent_itemsets:
            all_frequent_itemsets.append(frequent_itemsets)
        k += 1

    return all_frequent_itemsets

# Run the algorithm
frequent_itemsets = lcofi(transactions, min_support)

# Output results: one level per itemset size, with relative support
print("Frequent Itemsets:")
for k, itemsets in enumerate(frequent_itemsets, start=1):
    print(f"Level {k}:")
    for itemset, support in itemsets.items():
        print(f"  {set(itemset)}: {support / len(transactions):.2f}")
```

Output:
```plaintext
Frequent Itemsets:
Level 1:
  {'C'}: 1.00
  {'A'}: 0.60
  {'B'}: 0.60
Level 2:
  {'C', 'B'}: 0.60
  {'C', 'A'}: 0.60
```


```python
import pandas as pd
from mlxtend.frequent_patterns import association_rules

# Run the LCOFI algorithm to get frequent itemsets
# (lcofi, transactions, and min_support are defined in the previous cell)
frequent_itemsets = lcofi(transactions, min_support)

# Flatten the per-level dictionaries into a single {itemset: count} dictionary
flat_itemsets = {}
for level in frequent_itemsets:
    flat_itemsets.update(level)

# Total number of transactions
num_transactions = len(transactions)

# Prepare the frequent itemsets DataFrame in the layout association_rules expects
data = {
    'itemsets': list(flat_itemsets.keys()),
    'support': [count / num_transactions for count in flat_itemsets.values()]  # Relative support
}
frequent_itemsets_df = pd.DataFrame(data)

# Defensive filter: drop zero-support itemsets (lcofi should not return any)
frequent_itemsets_df = frequent_itemsets_df[frequent_itemsets_df['support'] > 0].copy()

# Add a small epsilon as a guard against division by zero in the rule metrics
epsilon = 1e-10
frequent_itemsets_df['support'] += epsilon

# Generate association rules; num_itemsets is the number of transactions
rules = association_rules(frequent_itemsets_df, metric="confidence", min_threshold=0.6, num_itemsets=num_transactions)

# Output rules
print("Association Rules:")
for _, rule in rules.iterrows():
    print(f"{set(rule['antecedents'])} => {set(rule['consequents'])} "
          f"(Support: {rule['support']:.2f}, Confidence: {rule['confidence']:.2f}, Lift: {rule['lift']:.2f})")
```

Output:
```plaintext
Association Rules:
{'C'} => {'B'} (Support: 0.60, Confidence: 0.60, Lift: 1.00)
{'B'} => {'C'} (Support: 0.60, Confidence: 1.00, Lift: 1.00)
{'C'} => {'A'} (Support: 0.60, Confidence: 0.60, Lift: 1.00)
{'A'} => {'C'} (Support: 0.60, Confidence: 1.00, Lift: 1.00)
```
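
As a cross-check that does not depend on `mlxtend`, the rule metrics can be recomputed by hand from the itemset supports, using the standard definitions: confidence is the support of the combined itemset divided by the support of the antecedent, and lift is confidence divided by the support of the consequent. The helper name `rule_metrics` is hypothetical; the support counts are taken from the frequent-itemset output for the five sample transactions.

```python
# Relative supports of the frequent itemsets (counts out of 5 transactions).
num_transactions = 5
support = {
    frozenset('A'): 3 / num_transactions,
    frozenset('B'): 3 / num_transactions,
    frozenset('C'): 5 / num_transactions,
    frozenset('AC'): 3 / num_transactions,
    frozenset('BC'): 3 / num_transactions,
}

def rule_metrics(antecedent, consequent, support):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    joint = support[antecedent | consequent]
    confidence = joint / support[antecedent]
    lift = confidence / support[consequent]
    return joint, confidence, lift

for a, c in [('C', 'B'), ('B', 'C'), ('C', 'A'), ('A', 'C')]:
    s, conf, lift = rule_metrics(a, c, support)
    print(f"{{{a}}} => {{{c}}}: support={s:.2f}, confidence={conf:.2f}, lift={lift:.2f}")
```

Every rule here has a lift of 1.00 because `C` appears in all five transactions: knowing that a transaction contains `C` says nothing about `A` or `B`, and `C` is equally likely whether or not `A` or `B` is present.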
