
# Recommendation Systems (Association Rule Learning)

# Business Problem

* A Turkish company, the country's largest online service platform, connects service providers with customers who need those services.
* It gives easy access to services such as cleaning, renovation, and transportation with a few taps on a computer or smartphone.
* Using a data set of service users and the services and categories they purchased, we want to build a service recommendation system with Association Rule Learning.

# Dataset

* The data set consists of the services received by customers and the categories of these services.
* It contains the date and time information of each service received.

* **UserId**: Customer number
* **ServiceId**: Anonymized services belonging to each category (Example: Sofa washing service under the cleaning category)
> * The same ServiceId can appear under several categories, and it refers to a different service in each one.
> * Example: CategoryId 7 with ServiceId 4 refers to honeycomb cleaning, while CategoryId 2 with ServiceId 4 refers to furniture assembly.
* **CategoryId**: Anonymized categories (Example: Cleaning, transportation, renovation category)
* **CreateDate**: Date the service was purchased


# TASK 1: Preparing the Data


* **Step 1: Read the data set file (`turkish_online_service_platform_data.csv`).**

In [1]:
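import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Show all columns and rows without wrapping when displaying DataFrames.
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.width', 1000)
pd.set_option('display.expand_frame_repr', False)

# Load the raw data and work on a copy so the original stays untouched.
df_ = pd.read_csv('/kaggle/input/online-service-platform/turkish_online_service_platform_data.csv')
df = df_.copy()

# A notebook cell displays only its last expression,
# so the output below is the number of unique customers.
df.head(10)
df.shape
df["UserId"].nunique()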

24826

* **Step 2: ServiceId represents a different service for each CategoryId.**

> * **Create a new variable to represent services by concatenating ServiceId and CategoryId with "_".**

In [2]:
df['Service'] = (df['ServiceId'].astype(str)) + '_' + (df['CategoryId'].astype(str))
df.head(10)

Unnamed: 0,UserId,ServiceId,CategoryId,CreateDate,Service
0,25446,4,5,2017-08-06 16:11:00,4_5
1,22948,48,5,2017-08-06 16:12:00,48_5
2,10618,0,8,2017-08-06 16:13:00,0_8
3,7256,9,4,2017-08-06 16:14:00,9_4
4,25446,48,5,2017-08-06 16:16:00,48_5
5,14354,15,1,2017-08-06 16:27:00,15_1
6,14162,21,5,2017-08-06 16:28:00,21_5
7,21230,46,4,2017-08-06 16:34:00,46_4
8,25446,6,7,2017-08-06 16:39:00,6_7
9,10659,4,5,2017-08-06 16:44:00,4_5


* **Step 3: The dataset records the date and time each service was received, but it has no basket definition (invoice etc.).**

> * To apply Association Rule Learning, a basket (invoice etc.) definition must be created.
> * Here, a basket is the set of services a customer received in one month. For example, services 9_4 and 46_4 received by customer 7256 in August 2017 form one basket, and services 9_4 and 38_4 received in October 2017 form another. Each basket must be identified by a unique ID.
> * To do this, first create a new date variable that contains only the year and month. Then combine UserId and the new date variable with "_" and assign the result to a new variable named BasketID.

In [3]:
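# Parse the timestamps once, then keep only year and month as a monthly period.
df["CreateDate"] = pd.to_datetime(df["CreateDate"])
df["New_Date"] = df["CreateDate"].dt.to_period('M')

# Basket key: one basket per customer per month, e.g. "25446_2017-08".
df['BasketID'] = df['UserId'].astype(str) + '_' + df['New_Date'].astype(str)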
df.head()

Unnamed: 0,UserId,ServiceId,CategoryId,CreateDate,Service,New_Date,BasketID
0,25446,4,5,2017-08-06 16:11:00,4_5,2017-08,25446_2017-08
1,22948,48,5,2017-08-06 16:12:00,48_5,2017-08,22948_2017-08
2,10618,0,8,2017-08-06 16:13:00,0_8,2017-08,10618_2017-08
3,7256,9,4,2017-08-06 16:14:00,9_4,2017-08,7256_2017-08
4,25446,48,5,2017-08-06 16:16:00,48_5,2017-08,25446_2017-08
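
To make the basket definition concrete, the sketch below rebuilds the example of customer 7256 from the step description on a toy DataFrame. Only the first timestamp comes from the data shown above; the other three dates are made up for illustration, and the `toy` and `baskets` names are hypothetical.

```python
import pandas as pd

# Toy purchases for customer 7256, mirroring the example in the text.
# Only the first timestamp appears in the real data; the rest are invented.
toy = pd.DataFrame({
    "UserId": [7256, 7256, 7256, 7256],
    "Service": ["9_4", "46_4", "9_4", "38_4"],
    "CreateDate": pd.to_datetime([
        "2017-08-06 16:14:00", "2017-08-20 10:00:00",
        "2017-10-03 09:30:00", "2017-10-15 18:45:00",
    ]),
})

# Same basket key as in the notebook: UserId + "_" + year-month.
toy["BasketID"] = (
    toy["UserId"].astype(str) + "_" + toy["CreateDate"].dt.to_period("M").astype(str)
)

# Collect the distinct services that fall into each basket.
baskets = toy.groupby("BasketID")["Service"].apply(lambda s: sorted(set(s))).to_dict()
print(baskets)
```

The two August purchases end up in one basket and the two October purchases in another, which is the grouping that the association rules are later mined from.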


# TASK 2: Generate Association Rules

* **Step 1: Create the basket service pivot table.**

In [4]:
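# count leaves NaN where a basket lacks a service; notnull() turns the table into True/False flags.
pivot = df.pivot_table(index="BasketID",columns="Service",values="UserId", aggfunc="count").notnull()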
pivot.head(10)

BasketID,0_8,10_9,11_11,12_7,13_11,14_7,15_1,16_8,17_5,18_4,19_6,1_4,20_5,21_5,22_0,23_10,24_10,25_0,26_7,27_7,28_4,29_0,2_0,30_2,31_6,32_4,33_4,34_6,35_11,36_1,37_0,38_4,39_10,3_5,40_8,41_3,42_1,43_2,44_0,45_6,46_4,47_7,48_5,49_1,4_5,5_11,6_7,7_3,8_5,9_4
0_2017-08,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,True,False,False,False,False,False,False,False
0_2017-09,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,True,False,False,False,False,False
0_2018-01,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,True,False,False
0_2018-04,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False
10000_2017-08,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False
10000_2017-12,True,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
10000_2018-03,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
10001_2017-09,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
10001_2018-05,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False
10001_2018-06,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,True,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False,False
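
As a cross-check on the pivot step, the same True/False basket table can probably be built with `pd.crosstab`. The sketch below compares both approaches on a small made-up DataFrame; the basket IDs, user IDs, and variable names are assumptions for illustration, not values from the data set.

```python
import pandas as pd

# Made-up purchases: basket "a" buys 9_4 twice and 46_4 once, basket "b" buys 9_4 once.
toy = pd.DataFrame({
    "BasketID": ["a_2017-08", "a_2017-08", "a_2017-08", "b_2017-08"],
    "Service": ["9_4", "46_4", "9_4", "9_4"],
    "UserId": [1, 1, 1, 2],
})

# Approach used in the notebook: count purchases, then flag non-missing cells.
via_pivot = toy.pivot_table(
    index="BasketID", columns="Service", values="UserId", aggfunc="count"
).notnull()

# Alternative: crosstab counts co-occurrences directly; > 0 gives the same flags.
via_crosstab = pd.crosstab(toy["BasketID"], toy["Service"]) > 0

# Compare column order and cell values of the two tables.
same_columns = list(via_pivot.columns) == list(via_crosstab.columns)
same_values = bool((via_pivot.to_numpy() == via_crosstab.to_numpy()).all())
print(via_crosstab)
print(same_columns and same_values)
```

Either form yields one row per basket and one boolean column per service, which is the input shape `apriori` expects.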


* **Step 2: Create association rules.**

In [5]:
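# Find every set of services that occurs in at least 1% of baskets.
frequent_itemsets = apriori(pivot,
                            min_support=0.01,
                            use_colnames=True)

# The most frequent itemsets (not displayed, because it is not the last expression in the cell).
frequent_itemsets.sort_values('support', ascending=False).head(10)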

rules = association_rules(frequent_itemsets, metric='support', min_threshold=0.01)
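
The rules table reports support, confidence, and lift for each rule. The sketch below computes these three metrics by hand for a single rule, using their standard definitions on five made-up baskets. It is plain Python and does not call mlxtend; the baskets and the `support` helper are hypothetical.

```python
# Five made-up baskets, each a set of service codes.
baskets = [
    {"2_0", "22_0"},
    {"2_0", "22_0", "15_1"},
    {"2_0"},
    {"15_1"},
    {"22_0"},
]


def support(itemset):
    """Share of baskets that contain every item in `itemset`."""
    return sum(itemset <= basket for basket in baskets) / len(baskets)


antecedent = {"2_0"}
consequent = {"22_0"}

support_a = support(antecedent)                 # 3 of 5 baskets contain 2_0
support_c = support(consequent)                 # 3 of 5 baskets contain 22_0
support_ac = support(antecedent | consequent)   # 2 of 5 baskets contain both

# Confidence: how often the consequent appears when the antecedent does.
confidence = support_ac / support_a
# Lift: confidence relative to the consequent's baseline frequency.
lift = confidence / support_c

print(support_ac, confidence, lift)
```

A lift above 1 suggests the two services appear together more often than they would if they were purchased independently, which is why the recommender later sorts rules by lift.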

* **Step 3: Use the arl_recommender function to recommend a service to a user who last received service 2_0.**

In [6]:
def arl_recommender(rules_df, product_id, rec=1):
    """Return up to `rec` services recommended for `product_id`, highest lift first."""
    sorted_rules = rules_df.sort_values("lift", ascending=False)
    recommendation_list = []
    for i, product in enumerate(sorted_rules["antecedents"]):
        # Use every rule whose antecedent contains the given service.
        if product_id in product:
            for k in list(sorted_rules.iloc[i]["consequents"]):
                # Skip services that a higher-lift rule already suggested.
                if k not in recommendation_list:
                    recommendation_list.append(k)

    return recommendation_list[0:rec]

# Three recommendations for users who purchased service 2_0

arl_recommender(rules, "2_0", 3)

['22_0', '25_0', '15_1']
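
To see how the lift ordering and de-duplication in the recommender behave, the sketch below runs a compact variant of the same logic on a hand-built rules table. The rules, the lift values, and the `recommend_from_rules` name are all made up for illustration. The table only mimics the `antecedents`, `consequents`, and `lift` columns of the real rules DataFrame.

```python
import pandas as pd

# Hand-built rules table with invented lift values.
toy_rules = pd.DataFrame({
    "antecedents": [
        frozenset({"2_0"}),
        frozenset({"2_0"}),
        frozenset({"15_1"}),
        frozenset({"2_0", "15_1"}),
    ],
    "consequents": [
        frozenset({"22_0"}),
        frozenset({"25_0"}),
        frozenset({"2_0"}),
        frozenset({"22_0"}),
    ],
    "lift": [4.0, 3.0, 2.0, 5.0],
})


def recommend_from_rules(rules_df, product_id, rec=1):
    """Compact variant of the recommender: highest-lift consequents first, no repeats."""
    ordered = rules_df.sort_values("lift", ascending=False)
    recommendations = []
    for antecedent, consequent in zip(ordered["antecedents"], ordered["consequents"]):
        if product_id in antecedent:
            for item in consequent:
                if item not in recommendations:
                    recommendations.append(item)
    return recommendations[:rec]


top_two = recommend_from_rules(toy_rules, "2_0", 2)
print(top_two)
```

Service 22_0 is reached through two rules but appears once, and the rule whose antecedent is only 15_1 is ignored because it does not contain 2_0.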