<a href="https://colab.research.google.com/github/soumaya287/Unsupervised_Association-Rules/blob/main/unsupervised_learning_part2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Unsupervised Association Rules
As we said previously, unsupervised learning is all about exploring our data.

One way to do that is to extract the rules hidden within the data. This process is known as association rule learning.

Association rules are created by searching data for frequent if-then patterns.

Using these rules, we can extract valuable information.

The rules can be of the type “which items from a market basket are bought together” or “which videos from a playlist are chosen together”.

<img src="https://imgur.com/BFSplyY.png" width=400>

## Recommendation System
Association rules are widely used in recommendation systems and market basket analysis:

* Recommendation systems, such as the ones used by YouTube and Spotify, are a typical application of association rules.

* Using a set of rules, we can figure out which videos or playlists are often chosen together.

* Later on, we can recommend new videos to users based on the rules we created.

# Market Basket Analysis
**Market basket analysis**: Supermarkets also use **association rules** to develop successful marketing strategies.

Let's say we have a list of transactions from different customers. If we know which items are frequently bought together, we can **place** them next to each other or **sell** them together at a different price.

<img src="https://imgur.com/rwnbdzA.png">

# Apriori Concept
Apriori is one of the algorithms that we can use for market basket analysis.

Apriori is based on 3 metrics:

1. Support

2. Confidence

3. Lift

So let’s start by explaining those metrics.

# 1. Support
### **Support:** quantifies how often an item or an itemset appears in a set of transactions.

In other words, support quantifies the **frequency** of an itemset.

### For example:

Out of all transactions, how many contain apples?

Support for {Apple} = (Transactions containing Apple) / (Total Transactions) = 4/8 = 0.5

=> **50%** of transactions contain apples.

<img src="https://imgur.com/qxx6UXG.png" width=300>
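The support calculation can be sketched in a few lines of plain Python. The eight transactions below are hypothetical, chosen only so that the counts match the example (apples in 4 of 8 transactions), and the helper name `support` is an assumption made for illustration.

```python
# Hypothetical baskets: 8 transactions, apples appear in 4 of them.
transactions = [
    {"apple", "beer", "rice"},
    {"apple", "beer"},
    {"apple", "beer", "chicken"},
    {"apple", "milk"},
    {"beer", "milk"},
    {"beer", "rice"},
    {"beer", "chicken"},
    {"milk", "rice"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


print(support({"apple"}, transactions))          # 0.5
print(support({"apple", "beer"}, transactions))  # 0.375
```

The same function works for single items and for larger itemsets, because an itemset counts as present only when all of its items are in the basket.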

# 2. Confidence
### **Confidence:** the probability of buying item Y given that item X has been bought.

#### For example: Out of all transactions that contain apples, how many of them also contain beer?

This is the conditional probability P(Beer | Apple) = P(Apple ∩ Beer) / P(Apple).

Confidence for {Apple → Beer} = Transactions(Apple ∩ Beer) / Transactions(Apple) = 3/4 = 0.75

=> 75% of the transactions containing apples also contain beer.

This means that a customer who buys apples has a 0.75 probability of also buying beer.

Note here that **∩** means **"and"**.
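As a minimal sketch, confidence can be computed by first keeping only the transactions that contain the antecedent and then counting how many of those also contain the consequent. The baskets are hypothetical (built to give 4 transactions with apples, 3 of which include beer), and `confidence` is an illustrative helper name.

```python
# Hypothetical baskets: apples in 4 of 8, beer alongside apples in 3.
transactions = [
    {"apple", "beer", "rice"}, {"apple", "beer"},
    {"apple", "beer", "chicken"}, {"apple", "milk"},
    {"beer", "milk"}, {"beer", "rice"},
    {"beer", "chicken"}, {"milk", "rice"},
]


def confidence(antecedent, consequent, transactions):
    """Share of transactions with `antecedent` that also contain `consequent`."""
    antecedent, consequent = set(antecedent), set(consequent)
    with_antecedent = [t for t in transactions if antecedent <= t]
    with_both = [t for t in with_antecedent if consequent <= t]
    return len(with_both) / len(with_antecedent)


print(confidence({"apple"}, {"beer"}, transactions))  # 0.75
print(confidence({"beer"}, {"apple"}, transactions))  # 0.5
```

The two results differ because confidence depends on the direction of the rule: Apple → Beer is not the same rule as Beer → Apple.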



# 3. Lift
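### **Lift:** how much more often items X and Y are bought together than we would expect if they were bought independently of each other.

For example:

* Lift(Apple → Beer) = Support(Apple and Beer) / (Support(Apple) * Support(Beer))

* The lift of {Apple → Beer} = (3/8) / ((4/8) * (6/8)) = 1

=> A lift of 1 means that buying apples does not change the likelihood of buying beer: the two items are independent. A lift greater than 1 means X and Y are bought together more often than expected, and a lift below 1 means less often.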

<img src="https://imgur.com/JToFXeg.png">
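The lift formula can be checked with a short sketch that works directly from transaction counts. The function names `lift` and `interpret` are hypothetical, and the reading of the result relative to 1 follows the usual interpretation of lift.

```python
def lift(n_both, n_x, n_y, n_total):
    """Lift of X -> Y computed from raw transaction counts."""
    support_both = n_both / n_total
    support_x = n_x / n_total
    support_y = n_y / n_total
    return support_both / (support_x * support_y)


def interpret(value, tol=1e-9):
    """Describe a lift value relative to the independence baseline of 1."""
    if abs(value - 1.0) <= tol:
        return "independent"
    return "positive association" if value > 1.0 else "negative association"


# Counts from the example: 8 transactions, 4 with apples, 6 with beer, 3 with both.
apple_beer = lift(n_both=3, n_x=4, n_y=6, n_total=8)
print(apple_beer, interpret(apple_beer))  # 1.0 independent
```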

# Apriori Library
To understand the Apriori algorithm we are going to try it out together on a dataset, but first, we have to install the right library:

<img src="https://imgur.com/jsn434I.png">

**Mlxtend** is a Python library containing useful tools for day-to-day data science tasks.

To install it, execute:

* in the cmd: `pip install mlxtend`

* or in the Anaconda prompt: `conda install -c conda-forge mlxtend`

# Apriori Situation
Let’s say you are a machine learning engineer working for a supermarket.

Your objective is to explore the data and extract the most valuable information you can find.

Your employer gave you a dataset that looks like this:

<img src="https://imgur.com/6wY4fmp.png">

Every inner list represents a transaction, that is, a purchase made by a customer.

# Apriori Preparation
In order to use the `apriori` function, we need to transform our dataset into a **one-hot encoded DataFrame**.

**TransactionEncoder** creates a NumPy array from a list of transactions and one-hot encodes it, using True/False values rather than 1s and 0s.

In [None]:
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder

# Each inner list is one transaction (one customer's purchase)
dataset = [['Milk', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Dill', 'Onion', 'Nutmeg', 'Kidney Beans', 'Eggs', 'Yogurt'],
           ['Milk', 'Apple', 'Kidney Beans', 'Eggs'],
           ['Milk', 'Unicorn', 'Corn', 'Kidney Beans', 'Yogurt'],
           ['Corn', 'Onion', 'Onion', 'Kidney Beans', 'Ice cream', 'Eggs']]

te = TransactionEncoder()
te_ary = te.fit(dataset).transform(dataset)     # Apply one-hot encoding to our dataset
df = pd.DataFrame(te_ary, columns=te.columns_)  # Create a new DataFrame from the NumPy array
df

| | Apple | Corn | Dill | Eggs | Ice cream | Kidney Beans | Milk | Nutmeg | Onion | Unicorn | Yogurt |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | False | False | False | True | False | True | True | True | True | False | True |
| 1 | False | False | True | True | False | True | False | True | True | False | True |
| 2 | True | False | False | True | False | True | True | False | False | False | False |
| 3 | False | True | False | False | False | True | True | False | False | True | True |
| 4 | False | True | False | True | True | True | False | False | True | False | False |
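To make the encoding step less of a black box, here is a minimal sketch of the same idea in plain Python. It is not how `TransactionEncoder` is implemented, only an illustration: `one_hot_encode` is a hypothetical helper, and sorting the columns alphabetically is an assumption based on the column order in the output above.

```python
dataset = [
    ["Milk", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Dill", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Milk", "Apple", "Kidney Beans", "Eggs"],
    ["Milk", "Unicorn", "Corn", "Kidney Beans", "Yogurt"],
    ["Corn", "Onion", "Onion", "Kidney Beans", "Ice cream", "Eggs"],
]


def one_hot_encode(transactions):
    """Return (columns, rows): one True/False flag per item per transaction."""
    # One column per distinct item, in alphabetical order.
    columns = sorted({item for t in transactions for item in t})
    # A duplicated item (like 'Onion' in the last basket) still gives a single True.
    rows = [[col in set(t) for col in columns] for t in transactions]
    return columns, rows


columns, rows = one_hot_encode(dataset)
print(columns)
print(rows[2])  # flags for the third transaction (Milk, Apple, Kidney Beans, Eggs)
```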


### Support Code
Let’s select the itemsets with a minimum support of 60%.

By default, `apriori` returns the column indices of the items. For example, (3) means Eggs.

In [None]:
from mlxtend.frequent_patterns import apriori
apriori(df, min_support=0.6)

| | support | itemsets |
|---|---|---|
| 0 | 0.8 | (3) |
| 1 | 1.0 | (5) |
| 2 | 0.6 | (6) |
| 3 | 0.6 | (8) |
| 4 | 0.6 | (10) |
| 5 | 0.8 | (3, 5) |
| 6 | 0.6 | (8, 3) |
| 7 | 0.6 | (5, 6) |
| 8 | 0.6 | (8, 5) |
| 9 | 0.6 | (10, 5) |


In [None]:
# Support with column names
frequent_itemsets = apriori(df, min_support=0.6, use_colnames=True)  # Use column names instead of column indices
frequent_itemsets

| | support | itemsets |
|---|---|---|
| 0 | 0.8 | (Eggs) |
| 1 | 1.0 | (Kidney Beans) |
| 2 | 0.6 | (Milk) |
| 3 | 0.6 | (Onion) |
| 4 | 0.6 | (Yogurt) |
| 5 | 0.8 | (Kidney Beans, Eggs) |
| 6 | 0.6 | (Onion, Eggs) |
| 7 | 0.6 | (Kidney Beans, Milk) |
| 8 | 0.6 | (Kidney Beans, Onion) |
| 9 | 0.6 | (Kidney Beans, Yogurt) |
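As a sanity check, the supports in this table can be recomputed by hand. The sketch below counts transactions directly, without mlxtend; `support` is a hypothetical helper and not part of the library.

```python
dataset = [
    ["Milk", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Dill", "Onion", "Nutmeg", "Kidney Beans", "Eggs", "Yogurt"],
    ["Milk", "Apple", "Kidney Beans", "Eggs"],
    ["Milk", "Unicorn", "Corn", "Kidney Beans", "Yogurt"],
    ["Corn", "Onion", "Onion", "Kidney Beans", "Ice cream", "Eggs"],
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)


# A few itemsets taken from the table above.
for itemset in [{"Eggs"}, {"Kidney Beans"}, {"Onion", "Eggs"}, {"Kidney Beans", "Yogurt"}]:
    print(sorted(itemset), support(itemset, dataset))

# Corn appears in only 2 of 5 transactions, so it falls below min_support=0.6.
print(support({"Corn"}, dataset))  # 0.4
```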


# Confidence Code
If we want to extract rules based on other metrics such as confidence, we can use the `association_rules` function from the `mlxtend.frequent_patterns` module.

In [None]:
from mlxtend.frequent_patterns import association_rules
association_rules(frequent_itemsets, metric="confidence", min_threshold=0.7)  # Keep rules with a confidence of at least 70%

| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction |
|---|---|---|---|---|---|---|---|---|---|
| 0 | (Kidney Beans) | (Eggs) | 1.0 | 0.8 | 0.8 | 0.8 | 1.0 | 0.0 | 1.0 |
| 1 | (Eggs) | (Kidney Beans) | 0.8 | 1.0 | 0.8 | 1.0 | 1.0 | 0.0 | inf |
| 2 | (Onion) | (Eggs) | 0.6 | 0.8 | 0.6 | 1.0 | 1.25 | 0.12 | inf |
| 3 | (Eggs) | (Onion) | 0.8 | 0.6 | 0.6 | 0.75 | 1.25 | 0.12 | 1.6 |
| 4 | (Milk) | (Kidney Beans) | 0.6 | 1.0 | 0.6 | 1.0 | 1.0 | 0.0 | inf |
| 5 | (Onion) | (Kidney Beans) | 0.6 | 1.0 | 0.6 | 1.0 | 1.0 | 0.0 | inf |
| 6 | (Yogurt) | (Kidney Beans) | 0.6 | 1.0 | 0.6 | 1.0 | 1.0 | 0.0 | inf |
| 7 | (Onion, Kidney Beans) | (Eggs) | 0.6 | 0.8 | 0.6 | 1.0 | 1.25 | 0.12 | inf |
| 8 | (Kidney Beans, Eggs) | (Onion) | 0.8 | 0.6 | 0.6 | 0.75 | 1.25 | 0.12 | 1.6 |
| 9 | (Onion, Eggs) | (Kidney Beans) | 0.6 | 1.0 | 0.6 | 1.0 | 1.0 | 0.0 | inf |
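The table also reports leverage and conviction, which have not been defined so far. The sketch below uses the commonly cited definitions, leverage = support(A and C) - support(A) * support(C) and conviction = (1 - support(C)) / (1 - confidence), to reproduce row 3 (Eggs → Onion) from the three support values alone. The function name `rule_metrics` is hypothetical.

```python
import math


def rule_metrics(support_a, support_c, support_ac):
    """Compute rule metrics for A -> C from three support values."""
    confidence = support_ac / support_a
    lift = confidence / support_c
    leverage = support_ac - support_a * support_c
    # Conviction is infinite when the rule never fails (confidence of 1).
    if math.isclose(confidence, 1.0):
        conviction = math.inf
    else:
        conviction = (1 - support_c) / (1 - confidence)
    return {
        "confidence": confidence,
        "lift": lift,
        "leverage": leverage,
        "conviction": conviction,
    }


# Eggs -> Onion: support(Eggs) = 0.8, support(Onion) = 0.6, support(both) = 0.6
metrics = rule_metrics(0.8, 0.6, 0.6)
print({name: round(value, 2) for name, value in metrics.items()})
# {'confidence': 0.75, 'lift': 1.25, 'leverage': 0.12, 'conviction': 1.6}
```

Swapping the first two arguments gives the reverse rule, Onion → Eggs, whose confidence is 1 and whose conviction is therefore infinite, as in row 2.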


# Lift code
### Associating based on Lift

In [None]:
from mlxtend.frequent_patterns import association_rules 
association_rules(frequent_itemsets,metric="lift",min_threshold=1.25)

| | antecedents | consequents | antecedent support | consequent support | support | confidence | lift | leverage | conviction |
|---|---|---|---|---|---|---|---|---|---|
| 0 | (Onion) | (Eggs) | 0.6 | 0.8 | 0.6 | 1.0 | 1.25 | 0.12 | inf |
| 1 | (Onion, Kidney Beans) | (Eggs) | 0.6 | 0.8 | 0.6 | 1.0 | 1.25 | 0.12 | inf |
| 2 | (Onion) | (Kidney Beans, Eggs) | 0.6 | 0.8 | 0.6 | 1.0 | 1.25 | 0.12 | inf |


# Reinforcement Learning
* Reinforcement learning is an area of machine learning in which machines or software, called agents, learn to take the best decisions (actions) in a given situation in order to maximize a reward.

* Let’s say you are playing Mario. Your actions are “move left”, “move right”, and “jump”.

* Your objective is to reach the end of the map, which gives the maximum possible reward.

Let’s start playing:

* Your agent, Mario, starts by trying out actions to get a reward: he moves right until he reaches an obstacle.

* When he repeats the whole process, he tries a different combination of actions to avoid the obstacle and maximize the reward.

* He keeps doing this until he finishes the whole game.

<img src="https://i.postimg.cc/SNw0w568/pasted-image-0.png" width=400>
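The trial-and-error loop described above can be sketched with tabular Q-learning, one common reinforcement learning method chosen here purely for illustration (the text does not name a specific algorithm). Everything in this toy setup is an assumption: a track with positions 0 to 5, an obstacle at position 3 that can only be cleared by jumping from position 2, a reward of 10 at the goal, and a penalty of 1 for every other step.

```python
import random

ACTIONS = ["left", "right", "jump"]
GOAL, OBSTACLE = 5, 3  # hypothetical track layout


def step(pos, action):
    """Return (next_position, reward, done) for one move on the track."""
    if action == "right" and pos + 1 != OBSTACLE:
        pos += 1
    elif action == "left" and pos > 0 and pos - 1 != OBSTACLE:
        pos -= 1
    elif action == "jump" and pos + 1 == OBSTACLE:
        pos += 2  # leap over the obstacle
    done = pos == GOAL
    return pos, (10 if done else -1), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by repeatedly playing the level."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        pos = 0
        for _ in range(50):  # cap the length of one attempt
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)  # explore: try a random action
            else:
                action = max(ACTIONS, key=lambda a: q[(pos, a)])  # exploit
            nxt, reward, done = step(pos, action)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(pos, action)] += alpha * (reward + gamma * best_next - q[(pos, action)])
            pos = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=20):
    """Follow the highest-valued action from the start position."""
    pos, path = 0, []
    while pos != GOAL and len(path) < max_steps:
        action = max(ACTIONS, key=lambda a: q[(pos, a)])
        path.append(action)
        pos, _, _ = step(pos, action)
    return path


q = train()
print(greedy_path(q))
```

With these settings, the learned path should move right twice, jump over the obstacle, and then move right once more to reach the goal.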