✨ NEW: Clean, organized structure with interactive terminal interface!
All results displayed in terminal - no file clutter!
A comprehensive data mining project implementing three algorithms for frequent itemset mining and association rule discovery.
- Three Mining Algorithms:
  - Brute Force (custom implementation)
  - Apriori (mlxtend library)
  - FP-Growth (mlxtend library)
- Five Transactional Databases:
  - Amazon (Technology)
  - BestBuy (Electronics)
  - Walmart (Groceries)
  - Target (Clothing)
  - Costco (Household)
- Interactive Interface:
  - User-friendly menu system
  - Real-time results display
  - Configurable parameters
  - No file clutter (results shown in terminal)
Data mining/
├── main.py              # Interactive interface (START HERE)
├── data/                # Transaction databases (CSV files)
│   ├── Amazon_transactions.csv
│   ├── BestBuy_transactions.csv
│   ├── Walmart_transactions.csv
│   ├── Target_transactions.csv
│   └── Costco_transactions.csv
├── src/                 # Source code
│   ├── generate_databases.py
│   ├── brute_force_mining.py
│   ├── library_based_mining.py
│   └── run_brute_force.py
└── docs/                # Documentation
    ├── README.md
    ├── DATA_CREATION_REPORT.md
    ├── BRUTE_FORCE_ALGORITHM_REPORT.md
    ├── ALGORITHM_COMPARISON_REPORT.md
    └── ... (other reports)
See GETTING_STARTED.md for a detailed walkthrough with examples!
Install the dependencies:

pip install mlxtend pandas numpy

Then launch the interactive interface:

python main.py

The interface walks you through six steps:
- Welcome Screen - Introduction to the tool
- Database Selection - Choose from 5 databases
- Algorithm Selection - Pick Brute Force, Apriori, or FP-Growth
- Configuration - Set minimum support and confidence
- Results Display - View frequent itemsets and association rules in terminal
- Repeat or Exit - Run another analysis or quit
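To make this flow concrete, here is a minimal sketch of how such a menu loop can be structured. The names used here (e.g. run_analysis) are illustrative placeholders, not the project's actual functions; see main.py for the real implementation.

```python
# Illustrative sketch only - run_analysis and the constants below are placeholders.
DATABASES = ["Amazon", "BestBuy", "Walmart", "Target", "Costco"]
ALGORITHMS = ["Brute Force", "Apriori", "FP-Growth"]

def prompt_choice(title, options):
    """Print a numbered menu and return the selected option."""
    print(f"\n{title}")
    for i, name in enumerate(options, 1):
        print(f"  {i}. {name}")
    return options[int(input("Enter number: ")) - 1]

def run_analysis(database, algorithm, min_support, min_confidence):
    """Placeholder: the real main.py dispatches to the mining code in src/."""
    print(f"Mining {database} with {algorithm} "
          f"(support={min_support}, confidence={min_confidence})")

def main():
    while True:
        database = prompt_choice("SELECT DATABASE", DATABASES)
        algorithm = prompt_choice("SELECT ALGORITHM", ALGORITHMS)
        min_support = float(input("Minimum support (0-1, default 0.2): ") or 0.2)
        min_confidence = float(input("Minimum confidence (0-1, default 0.6): ") or 0.6)
        run_analysis(database, algorithm, min_support, min_confidence)
        if input("Run another analysis? (y/n): ").strip().lower() != "y":
            break

if __name__ == "__main__":
    main()
```

A sample session looks like this: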
SELECT DATABASE
1. Amazon      - Technology & Electronics
2. BestBuy     - Consumer Electronics
3. Walmart     - Groceries
4. Target      - Clothing & Fashion
5. Costco      - Household Items
Enter database number: 1
SELECT ALGORITHM
1. Brute Force
2. Apriori
3. FP-Growth
Enter algorithm number: 2
CONFIGURATION
Minimum support (0-1, default 0.2): 0.2
Minimum confidence (0-1, default 0.6): 0.6
[Results displayed in terminal...]
FREQUENT ITEMSETS
  Total frequent itemsets found: 18
  📦 1-Itemsets (12 found):
     {HDMI_Cable} - Count: 10, Support: 0.4000
     {Router} - Count: 10, Support: 0.4000
     {USB_Cable} - Count: 10, Support: 0.4000
     ...
ASSOCIATION RULES
  Total rules generated: 15
  Top 15 rules:
  Rule                                                Supp     Conf     Lift
  {Laptop} → {Keyboard}                             0.2000   1.0000   5.0000
  {Mouse} → {Keyboard}                              0.2000   1.0000   5.0000
  ...
To regenerate the transaction databases:

cd src
python generate_databases.py

This creates new deterministic transaction data in the data/ folder.

To run the algorithms directly (outside the interactive interface):

cd src

# Brute Force only
python run_brute_force.py

# Apriori + FP-Growth
python library_based_mining.py

Minimum support:
- Range: 0.0 to 1.0 (percentage) or integer (absolute count)
- Default: 0.2 (20%)
- Effect: Lower = more itemsets found, slower execution
Minimum confidence:
- Range: 0.0 to 1.0
- Default: 0.6 (60%)
- Effect: Lower = more rules generated
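For reference, this toy sketch (not the project's data) shows how support and confidence are computed, and why lowering the thresholds admits more itemsets and rules:

```python
# Toy basket for illustration only.
transactions = [
    {"Laptop", "Keyboard", "Mouse"},
    {"Laptop", "Keyboard"},
    {"Mouse", "HDMI_Cable"},
    {"Laptop", "Keyboard", "HDMI_Cable"},
    {"Router"},
]

def support(itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in transactions) / len(transactions)

# {Laptop, Keyboard} appears in 3 of 5 transactions.
print(support({"Laptop", "Keyboard"}))                         # 0.6

# Confidence of {Laptop} -> {Keyboard} = support(both) / support({Laptop}).
print(support({"Laptop", "Keyboard"}) / support({"Laptop"}))   # 1.0
```

An itemset is kept only if its support meets min_support, and a rule only if its confidence meets min_confidence, so lower thresholds keep more candidates (at the cost of longer runtimes and noisier rules).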
Algorithm characteristics:

Brute Force:
- Shows the exhaustive search approach
- Demonstrates exponential complexity
- Educational baseline for comparison

Apriori:
- Illustrates pruning optimization
- Uses the downward closure property
- Standard industry algorithm

FP-Growth:
- Tree-based pattern growth
- Most efficient for large datasets
- Modern mining approach
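As a point of reference, the mlxtend-based algorithms are typically invoked along these lines (a sketch on a toy basket; library_based_mining.py may differ in its details):

```python
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, fpgrowth, association_rules

# Toy transactions; the project loads them from the CSV files in data/.
transactions = [
    ["Laptop", "Keyboard", "Mouse"],
    ["Laptop", "Keyboard"],
    ["Mouse", "HDMI_Cable"],
    ["Laptop", "Keyboard", "HDMI_Cable"],
    ["Router", "HDMI_Cable"],
]

# One-hot encode the transactions into a boolean DataFrame.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions), columns=te.columns_)

# Both algorithms return a DataFrame of frequent itemsets with their support.
frequent = apriori(onehot, min_support=0.2, use_colnames=True)
# frequent = fpgrowth(onehot, min_support=0.2, use_colnames=True)  # equivalent output

# Derive association rules filtered by minimum confidence.
rules = association_rules(frequent, metric="confidence", min_threshold=0.6)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]])
```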
Detailed reports are available in the docs/ folder:
- DATA_CREATION_REPORT.md - How databases were created
- BRUTE_FORCE_ALGORITHM_REPORT.md - Brute force implementation details
- ALGORITHM_COMPARISON_REPORT.md - Performance comparison of all algorithms
- QUICK_START_GUIDE.md - Detailed usage instructions
All three algorithms produce identical results (verified):
- ✅ Same frequent itemsets
- ✅ Same association rules
- ✅ Same support/confidence/lift values
Execution times (sample run):
- Brute Force: 0.050s
- Apriori: 0.013s (3.85x faster)
- FP-Growth: 0.017s (2.94x faster)
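Continuing the sketch above (reusing its `onehot` DataFrame), the equivalence check can be expressed by normalizing each algorithm's output before comparing. This is a sketch of the idea, not necessarily how the project's comparison report was produced:

```python
from mlxtend.frequent_patterns import apriori, fpgrowth

def itemset_signature(df, ndigits=6):
    """Reduce an mlxtend frequent-itemsets DataFrame to a comparable set of
    (frozen itemset, rounded support) pairs, ignoring row order."""
    return {(frozenset(s), round(v, ndigits))
            for s, v in zip(df["itemsets"], df["support"])}

ap = apriori(onehot, min_support=0.2, use_colnames=True)
fp = fpgrowth(onehot, min_support=0.2, use_colnames=True)
assert itemset_signature(ap) == itemset_signature(fp)  # same itemsets, same supports
```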
Practical applications of the mined rules include:
- Product bundling: find items frequently bought together to create bundles.
- Store layout: place related products near each other based on co-occurrence.
- Recommendations: suggest items based on shopping cart contents.
- Inventory planning: predict demand for complementary products.
If mlxtend is missing, install it:

pip install mlxtend

If data files cannot be found, make sure you're running main.py from the project root directory.
If the CSV databases are missing, run python src/generate_databases.py to recreate them.
Educational project for data mining coursework.
Data Mining Project - Frequent Itemset Mining Implementation
Start mining: python main.py