<a href="https://colab.research.google.com/github/yoseforaz0990/ML-templates/blob/main/association_rule_learning/apriori.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

| Step                                              | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
|---------------------------------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Install apyori library                           | The apyori library provides the Apriori implementation used here for association rule mining. If it is not already installed, install it with !pip install apyori; the first line of the code cell below does this, so it must run before the imports. |
| Importing the libraries                         | Import the required libraries for data preprocessing and the Apriori algorithm. The code uses pandas for data manipulation and apyori for Apriori implementation.                                                                                                                                                                                                                                                                                                                                                                                                                                |
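| Importing the dataset                           | Load the dataset 'Market_Basket_Optimisation.csv', which contains market basket data. Each row represents one transaction (the items bought together in a single purchase), and each column holds one item of that transaction, so rows with fewer items leave their remaining columns empty. The file has no header row, which is why it is read with header=None. |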
| Data Preprocessing                              | Preprocess the dataset to prepare it for the Apriori algorithm. The code creates a list called transactions, where each element is a list of items purchased in a transaction. The transactions list will be used as input to the Apriori algorithm.                                                                                                                                                                                                                                                                                                                                               |
| Training the Apriori model on the dataset       | Apply the Apriori algorithm to the dataset to find frequent itemsets and association rules. The apyori.apriori function is used for this purpose. The parameters min_support, min_confidence, and min_lift set the minimum support, confidence, and lift a rule must reach, while min_length and max_length are meant to bound the number of items in a rule (both are 2 here, so each rule links one item to one other item). Note that min_length does not appear among the options apyori documents (min_support, min_confidence, min_lift, max_length), so it is likely ignored. The algorithm first finds all itemsets that meet the minimum support and then keeps the rules derived from them that meet the confidence and lift thresholds. |
| Displaying the first results                    | The apriori function returns a generator, so the code converts it into a list named results. Each element of that list describes one frequent itemset together with its support and the association rules derived from it. Keeping the output as a list lets it be inspected directly and reused in the following steps. |
| Putting the results into a Pandas DataFrame     | The function inspect is defined to organize the results obtained from the Apriori algorithm into rows for a Pandas DataFrame. The DataFrame will have columns for the left-hand side (antecedent), right-hand side (consequent), support, confidence, and lift of each association rule. The inspect function reads only the first rule of each result and the first item on each side, so it assumes two-item rules as configured above. |
| Displaying the results non-sorted               | The DataFrame resultsinDataFrame is created using the inspect function. This DataFrame contains the association rules and their corresponding support, confidence, and lift values. The code displays the DataFrame, showing the results in a tabular format.                                                                                                                                                                                                                                                                       |
| Displaying the results sorted by descending lifts | Finally, the code selects the 10 association rules with the highest lift, in descending order. Lift compares how often the two sides of a rule occur together with how often they would occur together if they were independent: a lift above 1 means the items are bought together more often than chance alone would predict, and larger values indicate a stronger association. The code uses the nlargest function on the DataFrame to pick and order these rules. |

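To make the three measures in the table concrete, here is a minimal pure-Python sketch that computes support, confidence, and lift for a single rule. The toy transactions and the helper names `support` and `rule_metrics` are made up for illustration; they are not part of apyori or of the dataset used in this notebook.

```python
# Toy transactions (invented for illustration; not taken from the dataset).
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread"},
    {"milk"},
    {"butter", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_metrics(lhs, rhs, transactions):
    """Return (support, confidence, lift) for the rule lhs -> rhs."""
    both = support(set(lhs) | set(rhs), transactions)
    confidence = both / support(lhs, transactions)   # P(rhs | lhs)
    lift = confidence / support(rhs, transactions)   # P(rhs | lhs) / P(rhs)
    return both, confidence, lift


s, c, l = rule_metrics({"bread"}, {"butter"}, transactions)
print(f"support={s:.2f} confidence={c:.2f} lift={l:.2f}")
# prints: support=0.40 confidence=0.67 lift=1.11
```

Here bread and butter each appear in 3 of 5 transactions and together in 2 of 5, so the lift is only slightly above 1. The notebook's min_lift=3 threshold asks for a much stronger association than this.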

In [None]:
# Install apyori if not already installed
!pip install apyori

# Importing the libraries
import pandas as pd
from apyori import apriori

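# Importing the dataset and data preprocessing
dataset = pd.read_csv('Market_Basket_Optimisation.csv', header=None)
# Build one list of item names per transaction. Empty (NaN) cells are skipped
# so that the string 'nan' is not counted as a purchased item, and the row and
# column counts come from the data instead of being hard-coded.
transactions = [[str(item) for item in row if pd.notna(item)] for row in dataset.values]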

# Training the Apriori model on the dataset
rules = apriori(transactions=transactions, min_support=0.003, min_confidence=0.2, min_lift=3, min_length=2, max_length=2)

# Displaying the first results coming directly from the output of the apriori function
results = list(rules)

# Putting the results well organized into a Pandas DataFrame
def inspect(results):
    lhs = [tuple(result[2][0][0])[0] for result in results]
    rhs = [tuple(result[2][0][1])[0] for result in results]
    supports = [result[1] for result in results]
    confidences = [result[2][0][2] for result in results]
    lifts = [result[2][0][3] for result in results]
    return list(zip(lhs, rhs, supports, confidences, lifts))

resultsinDataFrame = pd.DataFrame(inspect(results), columns=['Left Hand Side', 'Right Hand Side', 'Support', 'Confidence', 'Lift'])

# Displaying the results non-sorted
# (a bare expression is only rendered when it is the last line of a cell)
print(resultsinDataFrame)

# Displaying the results sorted by descending lifts
resultsinDataFrame.nlargest(n=10, columns='Lift')

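The inspect function above keeps only the first rule of each result and the first item on each side. If max_length were raised above 2, a variant along the following lines could flatten every rule and keep multi-item sides. This is a sketch under assumptions: the two namedtuples are stand-ins that mirror the field layout of apyori's records as I understand it (items, support, ordered_statistics; items_base, items_add, confidence, lift), the name `inspect_all` is hypothetical, and the sample record uses made-up numbers.

```python
from collections import namedtuple

# Stand-ins mirroring the assumed field layout of apyori's result records.
# Real records would come from list(apriori(...)).
RelationRecord = namedtuple(
    "RelationRecord", ["items", "support", "ordered_statistics"])
OrderedStatistic = namedtuple(
    "OrderedStatistic", ["items_base", "items_add", "confidence", "lift"])


def inspect_all(results):
    """Flatten every rule of every record into
    (lhs, rhs, support, confidence, lift) tuples."""
    rows = []
    for record in results:
        for stat in record.ordered_statistics:
            # Join all items on each side instead of taking only the first.
            lhs = ", ".join(sorted(stat.items_base))
            rhs = ", ".join(sorted(stat.items_add))
            rows.append((lhs, rhs, record.support, stat.confidence, stat.lift))
    return rows


# Hypothetical record with invented numbers, for illustration only.
sample = [
    RelationRecord(
        items=frozenset({"item_a", "item_b", "item_c"}),
        support=0.01,
        ordered_statistics=[
            OrderedStatistic(frozenset({"item_a", "item_b"}),
                             frozenset({"item_c"}), 0.5, 4.0),
            OrderedStatistic(frozenset({"item_c"}),
                             frozenset({"item_a", "item_b"}), 0.3, 4.0),
        ],
    )
]

rows = inspect_all(sample)
for row in rows:
    print(row)
```

The returned list has the same column order as the one produced by inspect, so it could be passed to pd.DataFrame with the same columns argument.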