<div class="alert alert-primary" style="margin-top: 20px">


<h1><center>Armut ARL</center></h1>

</div>

![Description](https://www.girisimhaberleri.com/wp-content/uploads/2023/01/armut-com-15-milyon-euro-yatirim-1.jpg)

### Armut, Turkey’s largest online service platform, connects service providers with those seeking services.

### It enables easy access to services such as cleaning, renovation, and moving with a few taps on a computer or smartphone.

### The goal is to build a service recommendation system with Association Rule Learning, using a dataset of the users who received services and the services and categories they chose.


---

# Dataset

---

### The dataset lists the services customers received and the category of each service. Each record also carries the date and time of the purchase.

| Column Name | Description |
|-------------|-------------|
| UserId      | Customer number |
| ServiceId   | Anonymized services for each category. (Example: sofa cleaning service under the cleaning category). A ServiceId can be found under different categories and represents different services under different categories. (Example: The service with CategoryId 7 and ServiceId 4 is radiator cleaning, while the service with CategoryId 2 and ServiceId 4 is furniture assembly) |
| CategoryId  | Anonymized categories. (Example: cleaning, moving, renovation categories) |
| CreateDate  | The date the service was purchased |

---
## TASK 1: Data Preparation
---

In [1]:
#Libraries and Settings
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules
pd.set_option("display.width", 500)
pd.set_option("display.max_columns", None)
pd.set_option("display.expand_frame_repr", False)

In [2]:
df_ = pd.read_csv("/kaggle/input/armut-arl/armut_data.csv")
df = df_.copy()

In [3]:
df.head()
df.shape
df.describe().T
df.isnull().sum()

,UserId,ServiceId,CategoryId,CreateDate
0,25446,4,5,2017-08-06 16:11:00
1,22948,48,5,2017-08-06 16:12:00
2,10618,0,8,2017-08-06 16:13:00
3,7256,9,4,2017-08-06 16:14:00
4,25446,48,5,2017-08-06 16:16:00


(162523, 4)

,count,mean,std,min,25%,50%,75%,max
UserId,162523.0,13089.803862,7325.81606,0.0,6953.0,13139.0,19396.0,25744.0
ServiceId,162523.0,21.64114,13.774405,0.0,13.0,18.0,32.0,49.0
CategoryId,162523.0,4.325917,3.129292,0.0,1.0,4.0,6.0,11.0


UserId        0
ServiceId     0
CategoryId    0
CreateDate    0
dtype: int64

In [4]:
df["Service_in_Category"] = df["ServiceId"].astype(str) + "_" + df["CategoryId"].astype(str)
df.head()

,UserId,ServiceId,CategoryId,CreateDate,Service_in_Category
0,25446,4,5,2017-08-06 16:11:00,4_5
1,22948,48,5,2017-08-06 16:12:00,48_5
2,10618,0,8,2017-08-06 16:13:00,0_8
3,7256,9,4,2017-08-06 16:14:00,9_4
4,25446,48,5,2017-08-06 16:16:00,48_5
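
As the dataset description notes, the same ServiceId can denote different services under different categories, which is why the two IDs are joined into one label. The sketch below illustrates this on a few made-up rows; the `toy` frame and its values are assumptions for illustration, not records from the dataset.

```python
import pandas as pd

# Hypothetical rows: ServiceId 4 appears under two different categories.
toy = pd.DataFrame({"ServiceId": [4, 4, 4], "CategoryId": [7, 2, 7]})

# Same construction as in the cell above: "<ServiceId>_<CategoryId>".
toy["Service_in_Category"] = (
    toy["ServiceId"].astype(str) + "_" + toy["CategoryId"].astype(str)
)

# ServiceId alone cannot tell the services apart; the composite label can.
n_by_service_id = toy["ServiceId"].nunique()
n_by_composite = toy["Service_in_Category"].nunique()
print(n_by_service_id, n_by_composite)
```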


In [5]:
df["Date"] = pd.to_datetime(df["CreateDate"]).dt.strftime('%Y-%m')
df["ID"] = df["UserId"].astype(str) + "_" + df["Date"].astype(str)
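
The `ID` column acts as the basket key: everything one user bought in the same calendar month ends up in the same basket. A minimal sketch on invented rows (the `toy` frame is hypothetical) shows how the key behaves:

```python
import pandas as pd

# Hypothetical purchases: user 1 buys twice in August and once in September.
toy = pd.DataFrame({
    "UserId": [1, 1, 1, 2],
    "CreateDate": [
        "2017-08-06 16:11:00",
        "2017-08-20 09:00:00",
        "2017-09-01 10:30:00",
        "2017-08-07 12:00:00",
    ],
})

# Same steps as in the cell above: keep year-month, then prefix the user.
toy["Date"] = pd.to_datetime(toy["CreateDate"]).dt.strftime("%Y-%m")
toy["ID"] = toy["UserId"].astype(str) + "_" + toy["Date"]

basket_ids = toy["ID"].tolist()
print(basket_ids)
```

The two August purchases of user 1 share one ID, while the September purchase starts a new basket.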

---
## TASK 2: Generate Association Rules
---

In [6]:
# Basket matrix: one row per user-month ID, one column per service; 1 if bought, else 0
df_pivot = df.groupby(["ID", "Service_in_Category"]).size().unstack(fill_value=0).gt(0).astype(int)
df_pivot.head()

ID,0_8,10_9,11_11,12_7,13_11,14_7,15_1,16_8,17_5,18_4,19_6,1_4,20_5,21_5,22_0,23_10,24_10,25_0,26_7,27_7,28_4,29_0,2_0,30_2,31_6,32_4,33_4,34_6,35_11,36_1,37_0,38_4,39_10,3_5,40_8,41_3,42_1,43_2,44_0,45_6,46_4,47_7,48_5,49_1,4_5,5_11,6_7,7_3,8_5,9_4
0_2017-08,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0
0_2017-09,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0
0_2018-01,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0
0_2018-04,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
10000_2017-08,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
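
An alternative way to build the same kind of basket matrix is `pd.crosstab`, converted to booleans. This is only a sketch on invented rows (the `toy` frame is an assumption), not the approach used in the notebook. Recent mlxtend versions appear to recommend boolean input for `apriori`, which is the reason for the `.gt(0)` step here.

```python
import pandas as pd

# Hypothetical long-format purchases: basket ID and service label per row.
toy = pd.DataFrame({
    "ID": ["1_2017-08", "1_2017-08", "1_2017-08", "2_2017-08"],
    "Service_in_Category": ["2_0", "22_0", "2_0", "2_0"],
})

# crosstab counts purchases per (basket, service); gt(0) turns counts into
# True/False, so buying a service twice in one basket still counts once.
basket = pd.crosstab(toy["ID"], toy["Service_in_Category"]).gt(0)
print(basket)
```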


In [7]:
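# Itemsets that appear in at least 1% of the user-month baskets
frequent_itemsets = apriori(df_pivot,
                            min_support=0.01,
                            use_colnames=True)

frequent_itemsets.sort_values("support", ascending=False)

# Derive rules from the frequent itemsets, keeping those with support >= 0.01
rules = association_rules(frequent_itemsets,
                          metric="support",
                          min_threshold=0.01)

# Strongest associations first
sorted_rules = rules.sort_values("lift", ascending=False)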



,support,itemsets
8,0.238121,(18_4)
19,0.130286,(2_0)
5,0.120963,(15_1)
39,0.067762,(49_1)
28,0.066568,(38_4)
3,0.056627,(13_11)
12,0.047515,(22_0)
9,0.045563,(19_6)
15,0.042895,(25_0)
7,0.041533,(17_5)
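
To make the metrics behind the rules concrete, the sketch below computes support, confidence, and lift by hand for one rule on a few invented baskets. The `baskets` list and the helper names `support` and `rule_metrics` are hypothetical; they illustrate the standard definitions and are not part of mlxtend.

```python
# Hypothetical baskets: each set holds the services of one user-month.
baskets = [
    {"2_0", "22_0"},
    {"2_0", "22_0", "25_0"},
    {"2_0"},
    {"15_1"},
    {"22_0"},
]

def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= basket for basket in baskets) / len(baskets)

def rule_metrics(antecedent, consequent, baskets):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    s_a = support(antecedent, baskets)
    s_c = support(consequent, baskets)
    s_ac = support(set(antecedent) | set(consequent), baskets)
    confidence = s_ac / s_a      # P(consequent | antecedent)
    lift = confidence / s_c      # > 1 means bought together more than by chance
    return {"support": s_ac, "confidence": confidence, "lift": lift}

metrics = rule_metrics({"2_0"}, {"22_0"}, baskets)
print(metrics)
```

Here both services appear together in 2 of 5 baskets (support 0.4), and 2 of the 3 baskets containing `2_0` also contain `22_0` (confidence of about 0.67).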


In [8]:
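def arl_recommender(rules_df, service_id, rec_count=1):
    """Return up to rec_count services recommended for service_id, highest lift first."""
    sorted_rules = rules_df.sort_values("lift", ascending=False)
    recommendation_list = []
    for i, product in enumerate(sorted_rules["antecedents"]):
        # A rule applies when the queried service is among its antecedents
        if service_id in product:
            # Take the first item of the rule's consequent set
            recommendation_list.append(list(sorted_rules.iloc[i]["consequents"])[0])
    return recommendation_list[:rec_count]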

arl_recommender(rules, "2_0", 1)
arl_recommender(rules, "2_0", 2)
arl_recommender(rules, "2_0", 3)

['22_0']

['22_0', '25_0']

['22_0', '25_0', '15_1']
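
`arl_recommender` takes only the first item of each consequent and could return the same service twice if several matching rules point to it. One possible variant, sketched below under those assumptions, collects every consequent item and skips duplicates. The name `arl_recommender_unique` and the `toy_rules` frame are hypothetical; the frame only mimics the `antecedents`, `consequents`, and `lift` columns that `association_rules` returns.

```python
import pandas as pd

def arl_recommender_unique(rules_df, service_id, rec_count=1):
    """Variant sketch: highest-lift rules first, each service listed once."""
    sorted_rules = rules_df.sort_values("lift", ascending=False)
    recommendations = []
    for antecedents, consequents in zip(sorted_rules["antecedents"],
                                        sorted_rules["consequents"]):
        if service_id not in antecedents:
            continue
        # sorted() gives a stable order for multi-item consequents
        for item in sorted(consequents):
            if item not in recommendations:
                recommendations.append(item)
    return recommendations[:rec_count]

# Hypothetical rules with made-up lift values.
toy_rules = pd.DataFrame({
    "antecedents": [frozenset({"2_0"}), frozenset({"2_0"}),
                    frozenset({"22_0"}), frozenset({"2_0", "15_1"})],
    "consequents": [frozenset({"22_0"}), frozenset({"25_0"}),
                    frozenset({"2_0"}), frozenset({"22_0"})],
    "lift": [1.8, 1.5, 1.8, 2.0],
})

top = arl_recommender_unique(toy_rules, "2_0", 3)
print(top)
```

Two of the toy rules recommend `22_0`, yet it appears only once in the result, so fewer than `rec_count` items can come back when the rules offer few distinct services.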