## Important Note:

#### Please ensure that your dataset is formatted as a list of items, where each transaction is represented as a line of comma-separated values.
For example:
- `olive oil, frozen smoothie, green tea, whole wheat flour, salmon`

Each transaction should be on a new line in the input file. 
<!-- This format is required for the FP-Growth and Apriori algorithms to function correctly. -->
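
As an illustration of the expected layout, the sketch below parses a few lines in this format using only the standard library. The sample text and the `parse_transactions` helper are hypothetical and are not part of this notebook; the helper only shows that each line becomes one list of item names, with the spaces after commas removed.

```python
import csv
import io

# Hypothetical sample input; in practice this would be the contents of your file.
sample_text = (
    "olive oil, frozen smoothie, green tea, whole wheat flour, salmon\n"
    "eggs, spinach\n"
    "green tea, eggs, honey\n"
)

def parse_transactions(text):
    """Return one list of item names per non-empty line."""
    # skipinitialspace=True drops the space that follows each comma.
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    transactions = []
    for row in reader:
        items = [item.strip() for item in row if item.strip()]
        if items:
            transactions.append(items)
    return transactions

transactions = parse_transactions(sample_text)
print(len(transactions))   # number of transactions
print(transactions[0])     # items of the first transaction
```

If the item names come out with leading spaces (for example `" salmon"`), the encoder will treat them as different items from `"salmon"`, so it is worth checking the parsed names before encoding.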


In [8]:
# All Imports
import pandas as pd
import numpy as np

import mlxtend
from mlxtend.frequent_patterns import apriori, association_rules
from mlxtend.preprocessing import TransactionEncoder

### Reading the Dataset
To read your own dataset, you can modify the file path in the following code:

In [None]:
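# skipinitialspace=True drops the space after each comma, so " salmon" and "salmon" are read as the same item
df = pd.read_csv("Example_Dataset.csv", header=None, skipinitialspace=True)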
df

# Pre-Processing the Data

In [10]:
preprocessed_df = df.copy()

# Creating a new column which contains all the different items as a list
list_of_items_col = 'List of Items'
if list_of_items_col not in preprocessed_df.columns:
    preprocessed_df[list_of_items_col] = preprocessed_df.apply(lambda x: x.dropna().tolist(), axis = 1)

# Dropping all other columns
preprocessed_df = preprocessed_df[[list_of_items_col]]
# preprocessed_df


# Encoding 
transactions = preprocessed_df[list_of_items_col].tolist()
transaction_encode = TransactionEncoder()
transaction_encode.fit(transactions)

encoded_transactions = transaction_encode.transform(transactions)

preprocessed_df = pd.DataFrame(encoded_transactions, columns=transaction_encode.columns_)

The `preprocessed_df` DataFrame now contains the encoded transactions where each item is represented as a column header. Each row indicates the presence (True) or absence (False) of items in that specific transaction. 

For example, the structure will look like this:

| avocado | honey  | mineral water | frozen smoothie | eggs  | spinach | burgers |
|---------|--------|---------------|-----------------|-------|---------|---------|
| True    | False  | True          | False           | True  | False   | True    |
| False   | True   | False         | True            | False | True    | False   |
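
To make the encoding concrete, here is a minimal pure-Python sketch of the same idea, written without `mlxtend`. It is not the `TransactionEncoder` implementation; the two baskets are made-up data matching the table above, and the alphabetical column order is an assumption about how the columns are arranged.

```python
# Two made-up transactions matching the example table.
transactions = [
    ["avocado", "mineral water", "eggs", "burgers"],
    ["honey", "frozen smoothie", "spinach"],
]

# One column per distinct item (sorted here; the ordering is an assumption).
columns = sorted({item for basket in transactions for item in basket})

# One row per transaction: True if the item is in the basket, else False.
encoded_rows = [[item in basket for item in columns] for basket in transactions]

for row in encoded_rows:
    print(dict(zip(columns, row)))
```

Each row has exactly one boolean per distinct item, which is the shape the Apriori step expects as input.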


In [None]:
preprocessed_df

# Implementing the Apriori Algorithm

### For More Information
Please refer to the README file for a comprehensive explanation of the **support** and **confidence** metrics.


## Inputting the Minimum Support Count

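When prompted to enter the **Minimum Support Count**, provide the value as a decimal fraction between 0 and 1 rather than a percentage. Despite the name, the value is the fraction of transactions that must contain an itemset, not an absolute number of transactions.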

For example:
- Enter **0.01** for 1% support, not **1%** or **1**.
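
As a rough illustration of what this value means, the sketch below computes support by hand on a tiny made-up dataset. The `support` and `count_to_fraction` helpers are hypothetical names introduced here for illustration; they are not part of `mlxtend` or of this notebook.

```python
# Made-up transactions, each stored as a set of item names.
transactions = [
    {"eggs", "green tea"},
    {"eggs", "spinach"},
    {"green tea", "honey"},
    {"eggs", "green tea", "honey"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for basket in transactions if itemset <= basket)
    return hits / len(transactions)

def count_to_fraction(min_count, n_transactions):
    """Convert an absolute count of transactions into a support fraction."""
    return min_count / n_transactions

print(support({"eggs"}, transactions))               # 3 of 4 transactions
print(support({"eggs", "green tea"}, transactions))  # 2 of 4 transactions
print(count_to_fraction(75, 7500))                   # 75 out of 7500 transactions
```

If you think in terms of "at least N transactions", dividing N by the total number of transactions gives the decimal to enter at the prompt.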

In [None]:
minimum_support_value = input("Input the Minimum Support Count: ")

frequent_items = apriori(preprocessed_df, min_support = float(minimum_support_value), use_colnames=True)
frequent_items

## Inputting the Confidence Threshold

When prompted to enter the **Confidence Threshold**, please provide the value as a decimal rather than a percentage. 

For example:
- Enter **0.75** for 75% confidence, not **75%** or **75**.

This ensures the application correctly interprets your input for accurate analysis.
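
One way to guard against a mistyped value is to validate the input before passing it on. The `parse_fraction` helper below is a hypothetical sketch, not something this notebook defines; it assumes a valid threshold lies in the range 0 (exclusive) to 1 (inclusive) and rejects percentages and out-of-range numbers.

```python
def parse_fraction(raw, name="threshold"):
    """Convert user input to a float in (0, 1], or raise ValueError."""
    text = raw.strip()
    if text.endswith("%"):
        raise ValueError(
            f"{name}: enter a decimal such as 0.75, not a percentage like {text!r}"
        )
    value = float(text)  # raises ValueError for non-numeric input
    if not 0 < value <= 1:
        raise ValueError(f"{name}: expected a value between 0 and 1, got {value}")
    return value

print(parse_fraction(" 0.75 ", name="confidence"))

# Inputs such as "75%" or "75" are rejected instead of being silently misread.
for bad_input in ("75%", "75"):
    try:
        parse_fraction(bad_input, name="confidence")
    except ValueError as err:
        print("rejected:", err)
```

The same helper could wrap the value returned by `input(...)` for both the support and the confidence prompts.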


In [None]:
confidence_threshold = input("Input the Confidence Threshold: ")

rules_df = association_rules(frequent_items, metric="confidence", min_threshold=float(confidence_threshold))
rules_df
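
For intuition about how rules are filtered by the threshold, the sketch below computes confidence by hand on a tiny made-up dataset, using the standard definition: the support of the antecedent and consequent together, divided by the support of the antecedent. The helper names and data are hypothetical and this is not how `mlxtend` is implemented internally.

```python
# Made-up transactions, each stored as a set of item names.
transactions = [
    {"eggs", "green tea"},
    {"eggs", "spinach"},
    {"green tea", "honey"},
    {"eggs", "green tea", "honey"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for basket in transactions if itemset <= basket) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

threshold = 0.75  # example value, as it would be entered at the prompt
candidate_rules = [({"eggs"}, {"green tea"}), ({"honey"}, {"green tea"})]

for antecedent, consequent in candidate_rules:
    value = confidence(antecedent, consequent, transactions)
    kept = value >= threshold
    print(sorted(antecedent), "->", sorted(consequent), round(value, 3), "kept" if kept else "dropped")
```

Here the rule from eggs to green tea holds in two of the three transactions containing eggs, so it falls below a 0.75 threshold, while the rule from honey to green tea holds in every transaction containing honey and is kept.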