<a href="https://colab.research.google.com/github/kurek0010/machine-learing-bootcamp/blob/main/unsupervised/03_association_rules/02_apriori.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### scikit-learn
Library website: [https://scikit-learn.org](https://scikit-learn.org)  

Documentation/User Guide: [https://scikit-learn.org/stable/user_guide.html](https://scikit-learn.org/stable/user_guide.html)

A core library for machine learning in Python.

To install scikit-learn, run the command below:
```
!pip install scikit-learn
```
To upgrade scikit-learn to the latest version, run the command below:
```
!pip install --upgrade scikit-learn
```
This course was developed using version `0.22.1`.

### Table of contents:
1. [Importing libraries](#0)
2. [Loading the data](#1)
3. [Preparing the data](#2)
4. [Encoding transactions](#3)
5. [Apriori algorithm](#4)




### <a name='0'></a> Importing libraries

In [None]:
import pandas as pd

pd.set_option('display.float_format', lambda x: f'{x:.2f}')

### <a name='1'></a> Loading the data

In [None]:
!wget https://storage.googleapis.com/esmartdata-courses-files/ml-course/products.csv
!wget https://storage.googleapis.com/esmartdata-courses-files/ml-course/orders.csv

In [None]:
products = pd.read_csv('products.csv', usecols=['product_id', 'product_name'])
products.head()

In [None]:
orders = pd.read_csv('orders.csv', usecols=['order_id', 'product_id'])
orders.head()

### <a name='2'></a> Preparing the data

In [None]:
data = pd.merge(orders, products, how='inner', on='product_id', sort=True)
data = data.sort_values(by='order_id')
data.head()

In [None]:
data.describe()

In [None]:
# distribution of products
data['product_name'].value_counts()

In [None]:
# number of transactions
data['order_id'].nunique()

In [None]:
transactions = data.groupby(by='order_id')['product_name'].apply(lambda name: ','.join(name))
transactions

In [None]:
transactions = transactions.str.split(',')
transactions
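
The join/split round trip above recovers the original product lists only if no product name contains a comma. The sketch below uses hypothetical rows (not taken from `orders.csv`) and plain Python to show what happens otherwise, next to a direct grouping that keeps names intact. In pandas, the analogous direct grouping would be `.apply(list)` in place of the `','.join` lambda.

```python
from collections import defaultdict

# Hypothetical (order_id, product_name) rows; one name contains a comma.
rows = [
    (1, "bread"),
    (1, "milk, 2% fat"),
    (2, "bread"),
]

# Group product names by order.
grouped = defaultdict(list)
for order_id, name in rows:
    grouped[order_id].append(name)

# Round trip: join with ',' and split on ',' again.
round_trip = {oid: ",".join(names).split(",") for oid, names in grouped.items()}

# Direct grouping: keep each order's names as a list.
direct = {oid: list(names) for oid, names in grouped.items()}

print(round_trip[1])  # ['bread', 'milk', ' 2% fat']
print(direct[1])      # ['bread', 'milk, 2% fat']
```

If the product names in the dataset never contain commas, both approaches give the same lists.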

### <a name='3'></a> Encoding transactions

In [None]:
from mlxtend.preprocessing import TransactionEncoder

encoder = TransactionEncoder()
encoder.fit(transactions)
# sparse=True returns a SciPy sparse matrix instead of a dense array
transactions_encoded = encoder.transform(transactions, sparse=True)
transactions_encoded

In [None]:
transactions_encoded_df = pd.DataFrame(transactions_encoded.toarray(), columns=encoder.columns_)
transactions_encoded_df
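
To make the encoded table easier to read, here is a minimal plain-Python sketch of the same idea: one column per distinct item and one boolean row per transaction. The toy transactions are hypothetical. The sketch illustrates the encoding only and is not how `TransactionEncoder` is implemented.

```python
# Hypothetical toy transactions (not taken from orders.csv).
transactions_toy = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["milk"],
]

# Columns: the sorted set of all distinct items.
columns = sorted({item for t in transactions_toy for item in t})

# One boolean row per transaction: True where the item is present.
encoded = [[item in t for item in columns] for t in transactions_toy]

print(columns)
for row in encoded:
    print(row)
```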

### <a name='4'></a> Apriori algorithm

In [None]:
from mlxtend.frequent_patterns import apriori, association_rules

# keep itemsets that occur in at least 1% of all transactions
supports = apriori(transactions_encoded_df, min_support=0.01, use_colnames=True)
supports = supports.sort_values(by='support', ascending=False)
supports.head(10)
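
As a rough illustration of what a level-wise frequent-itemset search does, the sketch below computes supports on hypothetical toy transactions in plain Python. It applies the Apriori pruning rule: a candidate is kept only if all of its smaller subsets are already frequent. The names `transactions_toy`, `support` and `frequent` are made up for this example, and the code is a simplified sketch rather than the mlxtend implementation.

```python
from itertools import combinations

# Hypothetical toy transactions.
transactions_toy = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
min_support = 0.5


def support(itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions_toy) / len(transactions_toy)


# Level 1: frequent single items.
items = sorted({i for t in transactions_toy for i in t})
frequent = {frozenset([i]): support({i}) for i in items if support({i}) >= min_support}

# Level k: join frequent (k-1)-itemsets, prune candidates that have an
# infrequent (k-1)-subset, then keep those meeting min_support.
current = list(frequent)
k = 2
while current:
    candidates = {a | b for a in current for b in current if len(a | b) == k}
    candidates = {
        c for c in candidates
        if all(frozenset(s) in frequent for s in combinations(c, k - 1))
    }
    current = [c for c in candidates if support(c) >= min_support]
    frequent.update({c: support(c) for c in current})
    k += 1

for itemset, value in sorted(frequent.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), value)
```

Here `{milk, butter}` appears in only one of four transactions, so it is dropped, and `{bread, milk, butter}` is pruned without being counted.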

In [None]:
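# min_threshold=0 keeps every rule that can be built from the frequent itemsets
rules = association_rules(supports, metric='confidence', min_threshold=0)
# select columns by name rather than by position
rules = rules[['antecedents', 'consequents', 'support', 'confidence', 'lift']]
rules = rules.sort_values(by='lift', ascending=False)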
rules.head(15)
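
The `support`, `confidence` and `lift` columns follow the standard definitions: support of a rule A → B is the share of transactions containing both A and B, confidence is that support divided by the support of A, and lift is confidence divided by the support of B. The sketch below works through one rule on hypothetical toy transactions. The helper `rule_metrics` is made up for this example.

```python
# Hypothetical toy transactions.
transactions_toy = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item of `itemset`."""
    return sum(itemset <= t for t in transactions_toy) / len(transactions_toy)


def rule_metrics(antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    s = support(antecedent | consequent)
    confidence = s / support(antecedent)
    lift = confidence / support(consequent)
    return s, confidence, lift


s, confidence, lift = rule_metrics({"butter"}, {"bread"})
print(f"support={s:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift above 1 means the consequent appears more often in transactions containing the antecedent than it does overall, which is why sorting by lift surfaces the strongest associations first.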