# Armut Association Rule Based Recommender System

---------

## Business Problem

**Scenario**

Armut, Turkey's largest online service platform, connects service providers with people who need services.
With a few taps on a computer or smartphone, it gives easy access to services such as cleaning, renovation, and moving.
Using a data set of the users who purchased services and the services and categories they purchased, the goal is to build
a product recommendation system with Association Rule Learning.

## Data Set Story

*The data set consists of the services that customers purchased and the categories of those services. It includes the date and
time of each purchased service.*

In [68]:
### Variables

##### UserId : Customer number.
##### ServiceId : Anonymized services belonging to each category.
##### CategoryId : Anonymized categories.
##### CreateDate : Date the service was purchased.

## To-Do List:

**TASK 1**: Prepare the Data

**TASK 2**: Generate Association Rules and Make Recommendations


-------

In [69]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [25]:
# Import the libraries we will use

import pandas as pd
import numpy as np
from mlxtend.frequent_patterns import apriori,association_rules



# Configure the pandas display options

pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.float_format', lambda x: '%.3f' % x)


# Load the data set

df_ = pd.read_csv(r"/content/drive/MyDrive/recommendation_cases/case_armut_1/armut_data.csv")
df = df_.copy()
df.head()

Unnamed: 0,UserId,ServiceId,CategoryId,CreateDate
0,25446,4,5,2017-08-06 16:11:00
1,22948,48,5,2017-08-06 16:12:00
2,10618,0,8,2017-08-06 16:13:00
3,7256,9,4,2017-08-06 16:14:00
4,25446,48,5,2017-08-06 16:16:00


### Data Preparation

In [26]:
# Combine ServiceId and CategoryId into a single service label


df["hizmet"] = df["ServiceId"].astype("str") + "_" + df["CategoryId"].astype("str")

df.head()

Unnamed: 0,UserId,ServiceId,CategoryId,CreateDate,hizmet
0,25446,4,5,2017-08-06 16:11:00,4_5
1,22948,48,5,2017-08-06 16:12:00,48_5
2,10618,0,8,2017-08-06 16:13:00,0_8
3,7256,9,4,2017-08-06 16:14:00,9_4
4,25446,48,5,2017-08-06 16:16:00,48_5


In [27]:
# Create a new date variable that keeps only the year and month

df["CreateDate"] = pd.to_datetime(df["CreateDate"])
df["New_Date"] = df["CreateDate"].dt.strftime("%Y-%m")
df.head()


Unnamed: 0,UserId,ServiceId,CategoryId,CreateDate,hizmet,New_Date
0,25446,4,5,2017-08-06 16:11:00,4_5,2017-08
1,22948,48,5,2017-08-06 16:12:00,48_5,2017-08
2,10618,0,8,2017-08-06 16:13:00,0_8,2017-08
3,7256,9,4,2017-08-06 16:14:00,9_4,2017-08
4,25446,48,5,2017-08-06 16:16:00,48_5,2017-08


In [28]:
# To define a basket (invoice) ID, combine UserId with the year-month value

df["SepetId"] = df["UserId"].astype("str") + "_" + df["New_Date"].astype("str")

df.head()

Unnamed: 0,UserId,ServiceId,CategoryId,CreateDate,hizmet,New_Date,SepetId
0,25446,4,5,2017-08-06 16:11:00,4_5,2017-08,25446_2017-08
1,22948,48,5,2017-08-06 16:12:00,48_5,2017-08,22948_2017-08
2,10618,0,8,2017-08-06 16:13:00,0_8,2017-08,10618_2017-08
3,7256,9,4,2017-08-06 16:14:00,9_4,2017-08,7256_2017-08
4,25446,48,5,2017-08-06 16:16:00,48_5,2017-08,25446_2017-08
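
To see what the basket key does, the sketch below rebuilds the five rows printed above in a small frame (the name `sample` is only for illustration) and groups the services by `SepetId`. Under this definition, everything a user buys in the same calendar month lands in one basket.

```python
import pandas as pd

# The five rows shown in the head() output above
sample = pd.DataFrame({
    "UserId": [25446, 22948, 10618, 7256, 25446],
    "ServiceId": [4, 48, 0, 9, 48],
    "CategoryId": [5, 5, 8, 4, 5],
    "CreateDate": pd.to_datetime([
        "2017-08-06 16:11:00",
        "2017-08-06 16:12:00",
        "2017-08-06 16:13:00",
        "2017-08-06 16:14:00",
        "2017-08-06 16:16:00",
    ]),
})

# Same derived columns as in the cells above
sample["hizmet"] = sample["ServiceId"].astype(str) + "_" + sample["CategoryId"].astype(str)
sample["SepetId"] = sample["UserId"].astype(str) + "_" + sample["CreateDate"].dt.strftime("%Y-%m")

# One list of services per basket (user + month)
baskets = sample.groupby("SepetId")["hizmet"].agg(list).to_dict()
print(baskets)
```

User 25446 appears twice in August 2017, so both of that user's services end up in the same basket, while the other three users each get a single-item basket.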


### Generate Association Rules and Make Recommendations

In [29]:
# Build the basket-by-service pivot table (1 if the basket contains the service, else 0)

df_matrix = (df.groupby(["SepetId", "hizmet"])["hizmet"].count().unstack().fillna(0) > 0).astype(int)

df_matrix.head()


SepetId,0_8,10_9,11_11,12_7,13_11,14_7,15_1,16_8,17_5,18_4,19_6,1_4,20_5,21_5,22_0,23_10,24_10,25_0,26_7,27_7,28_4,29_0,2_0,30_2,31_6,32_4,33_4,34_6,35_11,36_1,37_0,38_4,39_10,3_5,40_8,41_3,42_1,43_2,44_0,45_6,46_4,47_7,48_5,49_1,4_5,5_11,6_7,7_3,8_5,9_4
0_2017-08,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0,0,0
0_2017-09,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,0,0,0,0
0_2018-01,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,1,0,0
0_2018-04,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0
10000_2017-08,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0
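
As an alternative way to get the same kind of basket-by-service matrix, `pd.crosstab` can count the purchases directly. This is only a sketch on hypothetical rows (the frame `rows` and its contents are made up for illustration); it produces a boolean matrix, which mlxtend's `apriori` also accepts.

```python
import pandas as pd

# Hypothetical long-format rows: one line per (basket, service) purchase
rows = pd.DataFrame({
    "SepetId": ["0_2017-08", "0_2017-08", "0_2017-09", "0_2017-09", "0_2017-09"],
    "hizmet": ["46_4", "48_5", "48_5", "4_5", "4_5"],
})

# crosstab counts purchases per basket and service;
# gt(0) turns the counts into True/False "basket contains service" flags
matrix = pd.crosstab(rows["SepetId"], rows["hizmet"]).gt(0)
print(matrix)
```

A service bought twice in the same basket still yields a single `True`, which matches the 0/1 encoding used above.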


In [30]:
# Run the Apriori algorithm to find frequent itemsets

df_apr = apriori(df_matrix,min_support = 0.01, use_colnames = True)

df_apr.head()

Unnamed: 0,support,itemsets
0,0.02,(0_8)
1,0.027,(11_11)
2,0.029,(12_7)
3,0.057,(13_11)
4,0.023,(14_7)
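
The `support` column is the share of baskets that contain an itemset, and `min_support = 0.01` keeps only itemsets found in at least 1% of baskets. The sketch below illustrates that definition with a brute-force count over hypothetical baskets; it is not mlxtend's implementation and skips Apriori's candidate pruning. The function name `frequent_itemsets` and the toy data are assumptions made for the example.

```python
from itertools import combinations

def frequent_itemsets(baskets, min_support, max_len=2):
    """Brute-force support counting for itemsets of up to max_len items."""
    n = len(baskets)
    items = sorted(set().union(*baskets))
    result = {}
    for size in range(1, max_len + 1):
        for candidate in combinations(items, size):
            candidate = frozenset(candidate)
            # Support = fraction of baskets that contain every item of the candidate
            support = sum(1 for b in baskets if candidate <= b) / n
            if support >= min_support:
                result[candidate] = support
    return result

# Hypothetical baskets, not taken from the Armut data
toy_baskets = [
    {"2_0", "15_1"},
    {"2_0", "15_1", "33_4"},
    {"2_0", "22_0"},
    {"15_1", "33_4"},
    {"2_0"},
]

itemsets = frequent_itemsets(toy_baskets, min_support=0.4)
for itemset, support in itemsets.items():
    print(sorted(itemset), support)
```

Apriori reaches the same result more efficiently: once an itemset falls below the threshold, none of its supersets need to be counted.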


In [31]:
# Derive the association rules from the frequent itemsets

arl = association_rules(df_apr,metric = "support" , min_threshold = 0.01)

arl.head()

Unnamed: 0,antecedents,consequents,antecedent support,consequent support,support,confidence,lift,leverage,conviction
0,(13_11),(2_0),0.057,0.13,0.013,0.226,1.738,0.005,1.124
1,(2_0),(13_11),0.13,0.057,0.013,0.098,1.738,0.005,1.046
2,(15_1),(2_0),0.121,0.13,0.034,0.281,2.154,0.018,1.209
3,(2_0),(15_1),0.13,0.121,0.034,0.261,2.154,0.018,1.189
4,(15_1),(33_4),0.121,0.027,0.011,0.093,3.4,0.008,1.072
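
The `confidence` and `lift` columns follow from the support values: confidence is the rule's support divided by the antecedent support, and lift is the confidence divided by the consequent support. For the rule (15_1) → (2_0) above, for instance, 0.034 / 0.121 ≈ 0.281. The sketch below computes the three metrics on hypothetical baskets; `rule_metrics` and the toy data are assumptions for illustration, not part of mlxtend.

```python
def rule_metrics(baskets, antecedent, consequent):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    n = len(baskets)
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    support_a = sum(antecedent <= b for b in baskets) / n
    support_c = sum(consequent <= b for b in baskets) / n
    support_ac = sum((antecedent | consequent) <= b for b in baskets) / n
    # Assumes both sides occur in at least one basket (otherwise division by zero)
    confidence = support_ac / support_a
    lift = confidence / support_c
    return {"support": support_ac, "confidence": confidence, "lift": lift}

# Hypothetical baskets, not taken from the Armut data
toy_baskets = [
    {"2_0", "15_1"},
    {"2_0", "15_1", "33_4"},
    {"2_0", "22_0"},
    {"15_1", "33_4"},
    {"2_0"},
]

metrics = rule_metrics(toy_baskets, {"33_4"}, {"15_1"})
print(metrics)
```

A lift above 1 means the two sides occur together more often than they would if they were independent, which is why the recommendation step ranks rules by lift.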


In [67]:
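# Recommendation function: return the consequents of the highest-lift rules
# whose antecedents contain the given service

def recom_func(df, service_id, recom_count=1):
  recom_list = []
  # Rank the rules from strongest to weakest association
  df = df.sort_values(by="lift", ascending=False)

  for index, products in enumerate(list(df["antecedents"])):
    for k in list(products):
      if k == service_id:
        recom_list.append(list(df.iloc[index]["consequents"]))

  # Keep only the first item of each consequent set
  recom_list = [i[0] for i in recom_list]
  return recom_list[:recom_count]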

response = recom_func(arl,"2_0",3)

response


['22_0', '25_0', '15_1']
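
The function above keeps only the first item of each consequent set and can return the same service more than once when several rules point to it. A possible variant, sketched below, walks through every consequent item and skips duplicates. The name `recommend_services` and the tuple-based input are assumptions for the example; the sample rules are the five rows shown in the `arl.head()` output, so the result differs from the full-table output above.

```python
def recommend_services(rules, service_id, count=1):
    """Return up to `count` distinct services recommended for `service_id`.

    `rules` is an iterable of (antecedents, consequents, lift) tuples.
    """
    ranked = sorted(rules, key=lambda rule: rule[2], reverse=True)
    recommendations = []
    for antecedents, consequents, _lift in ranked:
        if service_id not in antecedents:
            continue
        # Sort so that multi-item consequents are added in a stable order
        for item in sorted(consequents):
            if item != service_id and item not in recommendations:
                recommendations.append(item)
    return recommendations[:count]

# The five rules printed by arl.head() above
sample_rules = [
    (frozenset({"13_11"}), frozenset({"2_0"}), 1.738),
    (frozenset({"2_0"}), frozenset({"13_11"}), 1.738),
    (frozenset({"15_1"}), frozenset({"2_0"}), 2.154),
    (frozenset({"2_0"}), frozenset({"15_1"}), 2.154),
    (frozenset({"15_1"}), frozenset({"33_4"}), 3.4),
]

print(recommend_services(sample_rules, "2_0", 3))
print(recommend_services(sample_rules, "15_1", 2))
```

To run it on the full rule table, the tuples could be produced with `arl[["antecedents", "consequents", "lift"]].itertuples(index=False, name=None)`.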