# Restaurant Recommender System Based On GPS

This project is a restaurant recommender system built on the K-Nearest Neighbours (KNN) algorithm. The feature matrix combines each restaurant's location (latitude and longitude) with its cuisine type. The number of neighbours K is 5, so the 5 most similar entries are retrieved. Note that the code below passes `n_neighbors=10` to `NearestNeighbors`.

# Installation
Before running the code, install the required packages from the command prompt:

`pip install numpy`

`pip install pandas`

`pip install fuzzywuzzy`

`pip install scikit-learn`

`pip install geopy`

## Code

First, numpy, pandas, sklearn.neighbors, and fuzzywuzzy are imported. numpy and pandas handle the data processing, sklearn.neighbors provides the KNN model, and fuzzywuzzy performs the string matching used when searching the data. geopy's distance module and random are imported as well; they are used later to compute distances and to sample restaurants.


In [1]:
import numpy as np
import pandas as pd
from fuzzywuzzy import process, fuzz
from sklearn.neighbors import NearestNeighbors
from geopy import distance
import random



The code below loads the location of each restaurant, using its longitude and latitude.

In [2]:
restaurant_id = pd.read_csv("datasets/geoplaces2.csv",usecols=["placeID",'longitude','latitude'],dtype={"placeID":"int32","longitude":"float32","latitude":"float32"}).sort_values(by="placeID").reset_index(drop=True)
restaurant_id

Unnamed: 0,placeID,latitude,longitude
0,132560,23.752304,-99.166916
1,132561,23.726818,-99.126503
2,132564,23.730925,-99.145187
3,132572,22.141647,-100.992714
4,132583,18.922291,-99.234329
...,...,...,...
125,135088,18.876011,-99.219887
126,135104,23.752981,-99.168434
127,135106,22.149710,-100.976089
128,135108,22.136253,-100.933586


restaurant_record is a DataFrame containing the placeID, cuisine type, and GPS location of each restaurant.

In [3]:
restaurant_record = pd.read_csv("datasets/chefmozcuisine.csv",usecols=["placeID","Rcuisine"],dtype={"placeID":"int32","Rcuisine":"str"}).sort_values(by="placeID")
restaurant_record = restaurant_record[restaurant_record.placeID.isin(restaurant_id.placeID)].reset_index(drop=True)
restaurant_record = restaurant_record.merge(restaurant_id,on="placeID")
restaurant_record


Unnamed: 0,placeID,Rcuisine,latitude,longitude
0,132560,Regional,23.752304,-99.166916
1,132572,Cafeteria,22.141647,-100.992714
2,132583,Fast_Food,18.922291,-99.234329
3,132584,Mexican,23.752365,-99.165291
4,132594,Mexican,23.752167,-99.165710
...,...,...,...,...
106,135086,Burgers,22.141420,-101.013954
107,135088,Cafeteria,18.876011,-99.219887
108,135104,Mexican,23.752981,-99.168434
109,135106,Mexican,22.149710,-100.976089


The variable restaurant_features_all holds the pivot matrix that KNN needs: the cuisine type is converted from a string into integer indicator columns, one column per cuisine.

In [4]:
restaurant_features = pd.concat([restaurant_record["Rcuisine"].str.get_dummies(sep=",")]).set_index(restaurant_record["placeID"])

restaurant_features_all = restaurant_features.reset_index()
restaurant_features_all

Unnamed: 0,placeID,American,Armenian,Bakery,Bar,Bar_Pub_Brewery,Breakfast-Brunch,Burgers,Cafe-Coffee_Shop,Cafeteria,...,Fast_Food,International,Italian,Japanese,Mediterranean,Mexican,Pizzeria,Regional,Seafood,Vietnamese
0,132560,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,1,0,0
1,132572,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
2,132583,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,0
3,132584,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
4,132594,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
106,135086,0,0,0,0,0,0,1,0,0,...,0,0,0,0,0,0,0,0,0,0
107,135088,0,0,0,0,0,0,0,0,1,...,0,0,0,0,0,0,0,0,0,0
108,135104,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
109,135106,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,0,0
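
To make the one-hot step concrete, here is a minimal sketch on a made-up three-row frame. The names `toy_record` and `toy_features` and all values are illustrative assumptions, not taken from the dataset.

```python
import pandas as pd

# Toy data (hypothetical values) with the same columns as restaurant_record
toy_record = pd.DataFrame({
    "placeID": [1, 2, 3],
    "Rcuisine": ["Mexican", "Burgers", "Mexican"],
})

# One 0/1 indicator column per cuisine; columns come out in alphabetical order
toy_features = toy_record["Rcuisine"].str.get_dummies(sep=",")
toy_features.insert(0, "placeID", toy_record["placeID"])
print(toy_features)
```

Each row gets a 1 in the column of its own cuisine and 0 everywhere else, which is the numeric form a distance-based model such as KNN can work with.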


The variable restaurant_feature_gps merges restaurant_features_all with restaurant_id. This DataFrame is later used to fit the KNN model.

In [5]:
restaurant_feature_gps = restaurant_features_all.merge(restaurant_id,on="placeID").drop(columns="placeID")
restaurant_feature_gps

Unnamed: 0,American,Armenian,Bakery,Bar,Bar_Pub_Brewery,Breakfast-Brunch,Burgers,Cafe-Coffee_Shop,Cafeteria,Chinese,...,Italian,Japanese,Mediterranean,Mexican,Pizzeria,Regional,Seafood,Vietnamese,latitude,longitude
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,1,0,0,23.752304,-99.166916
1,0,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,22.141647,-100.992714
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,18.922291,-99.234329
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,23.752365,-99.165291
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,23.752167,-99.165710
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
106,0,0,0,0,0,0,1,0,0,0,...,0,0,0,0,0,0,0,0,22.141420,-101.013954
107,0,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,18.876011,-99.219887
108,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,23.752981,-99.168434
109,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,22.149710,-100.976089
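
The merge can be sketched on toy data as follows. The frames `toy_features` and `toy_locations` and their values are hypothetical; the point is only that the default inner merge keeps the restaurants present in both frames, and that placeID is dropped so it is not treated as a feature.

```python
import pandas as pd

# Hypothetical one-hot cuisine features for two restaurants
toy_features = pd.DataFrame({"placeID": [1, 2], "Burgers": [0, 1], "Mexican": [1, 0]})

# Hypothetical locations; placeID 3 has no cuisine record
toy_locations = pd.DataFrame({
    "placeID": [1, 2, 3],
    "latitude": [22.15, 22.14, 18.92],
    "longitude": [-100.98, -101.01, -99.23],
})

# Inner merge on placeID, then drop the identifier column
toy_feature_gps = toy_features.merge(toy_locations, on="placeID").drop(columns="placeID")
print(toy_feature_gps)
```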


The variable restaurant_full_record contains the ID, name, address, cuisine type, and coordinates of each restaurant.

In [6]:
restaurant_full_record = pd.read_csv("datasets/geoplaces2.csv",usecols=["placeID","name","address"],dtype={"placeID":"int32","name":"str","address":"str","price":"str"},encoding='utf_8').sort_values(by="placeID").reset_index(drop=True)
restaurant_full_record = restaurant_full_record.merge(restaurant_record,on="placeID")
restaurant_full_record

Unnamed: 0,placeID,name,address,Rcuisine,latitude,longitude
0,132560,puesto de gorditas,frente al tecnologico,Regional,23.752304,-99.166916
1,132572,Cafe Chaires,?,Cafeteria,22.141647,-100.992714
2,132583,McDonalds Centro,Rayon sn col. Centro,Fast_Food,18.922291,-99.234329
3,132584,Gorditas Dona Tota,?,Mexican,23.752365,-99.165291
4,132594,tacos de barbacoa enfrente del Tec,?,Mexican,23.752167,-99.165710
...,...,...,...,...,...,...
106,135086,Mcdonalds Parque Tangamanga,Lateral Salvador Nava Martinez 3145,Burgers,22.141420,-101.013954
107,135088,Cafeteria cenidet,Interior Internado Palmira SN,Cafeteria,18.876011,-99.219887
108,135104,vips,?,Mexican,23.752981,-99.168434
109,135106,El Rincon de San Francisco,Universidad 169,Mexican,22.149710,-100.976089


restaurant_rating_df is a DataFrame containing the mean rating of each restaurant, keyed by the restaurant's placeID.

In [7]:
restaurant_rating_record = pd.read_csv("datasets/rating_final.csv",usecols=["placeID","rating","userID"],dtype={"placeID":"int32","rating":"int32","userID":"str"})
restaurant_rating_record = restaurant_rating_record[restaurant_rating_record.placeID.isin(restaurant_id.placeID)]
restaurant_rating_df = restaurant_rating_record.groupby(by=["placeID"])["rating"].mean().reset_index()
restaurant_rating_df = restaurant_rating_df[restaurant_rating_df.placeID.isin(restaurant_record.placeID)]
restaurant_rating_df


Unnamed: 0,placeID,rating
0,132560,0.500000
3,132572,1.000000
4,132583,1.000000
5,132584,1.333333
6,132594,0.600000
...,...,...
124,135086,0.800000
125,135088,1.000000
126,135104,0.857143
127,135106,1.200000
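
As a small worked example of the aggregation, the sketch below averages ratings per restaurant on hypothetical data (`toy_ratings` and its values are made up). Selecting the rating column before calling `mean()` keeps the string userID column out of the aggregation.

```python
import pandas as pd

# Hypothetical ratings: three users rate place 10, one user rates place 20
toy_ratings = pd.DataFrame({
    "userID": ["U1", "U2", "U3", "U1"],
    "placeID": [10, 10, 10, 20],
    "rating": [0, 1, 2, 2],
})

# Mean rating per restaurant: (0 + 1 + 2) / 3 = 1.0 for place 10, 2.0 for place 20
toy_mean = toy_ratings.groupby("placeID")["rating"].mean().reset_index()
print(toy_mean)
```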


restaurant_rating_df is then merged with restaurant_full_record to form a table containing the ID, name, address, cuisine type, coordinates, and rating of each restaurant.

In [8]:
restaurant_full_record = restaurant_full_record.merge(restaurant_rating_df,on="placeID")
restaurant_full_record

Unnamed: 0,placeID,name,address,Rcuisine,latitude,longitude,rating
0,132560,puesto de gorditas,frente al tecnologico,Regional,23.752304,-99.166916,0.500000
1,132572,Cafe Chaires,?,Cafeteria,22.141647,-100.992714,1.000000
2,132583,McDonalds Centro,Rayon sn col. Centro,Fast_Food,18.922291,-99.234329,1.000000
3,132584,Gorditas Dona Tota,?,Mexican,23.752365,-99.165291,1.333333
4,132594,tacos de barbacoa enfrente del Tec,?,Mexican,23.752167,-99.165710,0.600000
...,...,...,...,...,...,...,...
106,135086,Mcdonalds Parque Tangamanga,Lateral Salvador Nava Martinez 3145,Burgers,22.141420,-101.013954,0.800000
107,135088,Cafeteria cenidet,Interior Internado Palmira SN,Cafeteria,18.876011,-99.219887,1.000000
108,135104,vips,?,Mexican,23.752981,-99.168434,0.857143
109,135106,El Rincon de San Francisco,Universidad 169,Mexican,22.149710,-100.976089,1.200000


model_knn is the fitted KNN model. The result of the KNN query is stored in indices (the row positions of each restaurant's nearest neighbours), while distances holds the distance from each row to those neighbours.

In [9]:
model_knn = NearestNeighbors(algorithm='auto', n_neighbors=10)
model_knn.fit(restaurant_feature_gps)
distances, indices = model_knn.kneighbors(restaurant_feature_gps,n_neighbors = 10)
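
To show what the two returned arrays look like, here is a minimal sketch on four made-up rows. The matrix `toy_X`, its values, and `n_neighbors=2` are assumptions chosen to keep the example small.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Columns: [is_burgers, is_mexican, latitude, longitude] (hypothetical values)
toy_X = np.array([
    [1, 0, 22.150, -100.980],
    [1, 0, 22.151, -100.981],
    [0, 1, 22.150, -100.980],
    [0, 1, 18.920,  -99.230],
])

# Each row is queried against the fitted data, so its first neighbour is itself
toy_model = NearestNeighbors(n_neighbors=2).fit(toy_X)
toy_distances, toy_indices = toy_model.kneighbors(toy_X)
print(toy_indices)
```

In this toy data the nearest neighbour of row 0 is row 1 (same cuisine, slightly different location) rather than row 2 (same location, different cuisine). With unscaled features, a difference of 1 in a cuisine flag counts as much as a difference of one degree in latitude or longitude, so cuisine tends to dominate among nearby restaurants.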

The function get_index_from_type finds the index of the nearest restaurant by computing the distance between the user's coordinates and each candidate restaurant's coordinates. If the user specifies a cuisine type, only restaurants serving that cuisine are considered; otherwise 20 restaurants are sampled at random. If the closest candidate is more than 0.5 km away, the function retries with a new random sample.

In [10]:
def get_index_from_type(coords,name=None):
    if name:
        # Restrict the candidates to the requested cuisine
        restaurant_result = restaurant_full_record[restaurant_full_record["Rcuisine"]==name].copy()
    else:
        # Sample 20 existing row labels; a hard-coded range could include a missing label
        index = random.sample(restaurant_full_record.index.tolist(), 20)
        restaurant_result = restaurant_full_record.loc[index].copy()
    # Geodesic distance in km from each candidate to the user; coords is (latitude, longitude)
    restaurant_result["Distance"] = [
        distance.distance((lat, lon), coords).km
        for lat, lon in zip(restaurant_result.latitude, restaurant_result.longitude)
    ]
    restaurant_result = restaurant_result.sort_values(by=['Distance'])
    # If there is no candidate, or the closest one is more than 0.5 km away, retry with a random sample.
    # Caution: this keeps recursing if no restaurant in the data lies within 0.5 km of coords.
    if restaurant_result.empty or restaurant_result.Distance.iloc[0] > 0.5:
        restaurant_result = get_index_from_type(coords)
    return restaurant_result
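
The nearest-restaurant lookup can be illustrated without any third-party package. The sketch below swaps geopy's geodesic distance for the simpler haversine (great-circle) formula, so results differ slightly from the code above; `haversine_km` and the candidate labels are hypothetical names, and the coordinates are taken from the example call and tables in this document.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # 6371.0088 km is the mean Earth radius
    return 2 * 6371.0088 * math.asin(math.sqrt(h))

user = (22.1506429, -100.9870148)
candidates = {
    "same_spot": (22.1506429, -100.9870148),
    "parque_tangamanga": (22.141420, -101.013954),
}

# Pick the candidate with the smallest distance to the user
nearest = min(candidates, key=lambda k: haversine_km(candidates[k], user))
print(nearest, round(haversine_km(candidates["parque_tangamanga"], user), 2))
```

Both coordinate tuples are ordered (latitude, longitude), which is also the order geopy expects.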

The function resto_search looks up the cuisine type entered by the user. If it matches a cuisine in the dataset with a similarity score of at least 75, that cuisine is selected; otherwise no cuisine is selected and None is returned.

In [11]:
def resto_search(name):
    resto_type = process.extractOne(name,restaurant_full_record['Rcuisine'])
    if(resto_type[1] < 75):
        res = None
    else:
        res = resto_type[0]
    return res
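
The thresholding idea can be sketched with the standard library alone. This example uses `difflib.SequenceMatcher` as a stand-in for fuzzywuzzy, so the scores are not the same as those from `process.extractOne`; `closest_cuisine` and the cuisine list are hypothetical.

```python
from difflib import SequenceMatcher

def closest_cuisine(query, cuisines, threshold=75):
    """Return the best-matching cuisine, or None if its score is below the threshold."""
    scored = [
        (100 * SequenceMatcher(None, query.lower(), c.lower()).ratio(), c)
        for c in cuisines
    ]
    score, best = max(scored)
    return best if score >= threshold else None

toy_cuisines = ["Burgers", "Mexican", "Cafeteria"]
print(closest_cuisine("burger", toy_cuisines))  # close match, accepted
print(closest_cuisine("sushi", toy_cuisines))   # no close match, rejected
```

A threshold of this kind tolerates small typos in the user's input while rejecting queries that do not resemble any cuisine in the data.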

main_function is the main entry point of the algorithm. It calls get_index_from_type to obtain the index of the restaurant nearest to the user, builds a DataFrame of that restaurant's KNN neighbours, and prints each recommendation. The value of resto_type is compared exactly against Rcuisine, so free-text input can first be normalised with resto_search.

In [12]:
def main_function(latitude,longitude,resto_type=None,jarak=None):
    # jarak is the maximum distance in km; None or a negative value disables the filter
    if jarak is None or jarak < 0:
        jarak = None
    # The first argument is the latitude (see the example calls); geopy expects (latitude, longitude)
    coords = (latitude, longitude)
    resto_index = get_index_from_type(coords,name=resto_type)
    # Take the KNN neighbours of the restaurant closest to the user
    index_arr = list(indices[resto_index.index.tolist()[0]])
    restaurant_result = restaurant_full_record.loc[index_arr].copy()
    restaurant_result["Distance"] = [
        round(distance.distance((lat, lon), coords).km, 2)
        for lat, lon in zip(restaurant_result.latitude, restaurant_result.longitude)
    ]
    # Apply the distance filter first so that it also works together with resto_type
    if jarak:
        restaurant_result = restaurant_result[restaurant_result["Distance"] <= jarak]
    if resto_type:
        # Restaurants with the requested cuisine come first, then the rest, each sorted by distance
        is_match = restaurant_result['Rcuisine']==resto_type
        restaurant_a_result = restaurant_result[is_match].sort_values(by="Distance")
        restaurant_b_result = restaurant_result[~is_match].sort_values(by="Distance")
        restaurant_result = pd.concat([restaurant_a_result,restaurant_b_result],ignore_index=True)
    else:
        restaurant_result = restaurant_result.sort_values(by='Distance',axis=0)
    # Print one block per recommended restaurant
    for i in restaurant_result.index.to_list():
        print("Rekomendasi Restoran")
        print("Nama   : "+restaurant_result.name.loc[i])
        print("Alamat : "+restaurant_result.address.loc[i])
        print("Jenis  : "+restaurant_result.Rcuisine.loc[i])
        print("Jarak  : "+str(restaurant_result.Distance.loc[i])+"Km")
        print("Rating : "+str(round(restaurant_result.rating.loc[i],2))+"\n")
        print("========================================\n")
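
The filtering and ordering logic can be tried in isolation on a toy result table. All names and values below are hypothetical. Note that pandas conditions are combined with the element-wise `&` operator; using Python's `and` between two Series raises a ValueError.

```python
import pandas as pd

# Hypothetical neighbour list with a precomputed distance in km
toy_result = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "Rcuisine": ["Mexican", "Burgers", "Burgers", "Italian"],
    "Distance": [0.26, 0.82, 0.10, 9.00],
})
wanted, max_km = "Burgers", 1.0

# Boolean masks for the two conditions
within = toy_result["Distance"] <= max_km
is_match = toy_result["Rcuisine"] == wanted

# Matching cuisine first, then everything else, each group sorted by distance
ordered = pd.concat([
    toy_result[is_match & within].sort_values(by="Distance"),
    toy_result[~is_match & within].sort_values(by="Distance"),
], ignore_index=True)
print(ordered)
```

Restaurant D is dropped because it is farther than `max_km`; C and B (the requested cuisine) are listed before A even though A is closer than B.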

In [15]:
main_function(float(22.1506429),float(-100.9870148))

Rekomendasi Restoran
Nama   : La Posada del Virrey
Alamat : Av. V. Carranza
Jenis  : International
Jarak  : 0.0Km
Rating : 1.39


Rekomendasi Restoran
Nama   : crudalia
Alamat : ?
Jenis  : Bar
Jarak  : 0.22Km
Rating : 1.24


Rekomendasi Restoran
Nama   : Preambulo Wifi Zone Cafe
Alamat : Anahuac 805
Jenis  : Cafeteria
Jarak  : 0.4Km
Rating : 1.58


Rekomendasi Restoran
Nama   : Preambulo Wifi Zone Cafe
Alamat : Anahuac 805
Jenis  : Cafe-Coffee_Shop
Jarak  : 0.4Km
Rating : 1.58


Rekomendasi Restoran
Nama   : La Virreina
Alamat : Av. Carranza 830
Jenis  : Mexican
Jarak  : 0.44Km
Rating : 1.53


Rekomendasi Restoran
Nama   : VIPS
Alamat : NICOLAS ZAPATA 300
Jenis  : American
Jarak  : 0.44Km
Rating : 1.0


Rekomendasi Restoran
Nama   : Tortas Locas Hipocampo
Alamat : Venustiano Carranza 719 Centro
Jenis  : Fast_Food
Jarak  : 0.45Km
Rating : 1.33


Rekomendasi Restoran
Nama   : Cabana Huasteca
Alamat : Cuauhtemoc 455
Jenis  : Mexican
Jarak  : 0.49Km
Rating : 1.46


Rekomendasi Restoran
Nam

In [17]:
main_function(float(22.1480965),float(-101.0173023),"Burgers")

Rekomendasi Restoran
Nama   : Carls Jr
Alamat : Av. V. Carranza
Jenis  : Burgers
Jarak  : 0.0Km
Rating : 1.43


Rekomendasi Restoran
Nama   : Mcdonalds Parque Tangamanga
Alamat : Lateral Salvador Nava Martinez 3145
Jenis  : Burgers
Jarak  : 0.82Km
Rating : 0.8


Rekomendasi Restoran
Nama   : Tortas y hamburguesas el gordo
Alamat : Ricardo B. Anaya
Jenis  : Burgers
Jarak  : 8.46Km
Rating : 0.6


Rekomendasi Restoran
Nama   : Hamburguesas Valle Dorado
Alamat : Av. Coral
Jenis  : Burgers
Jarak  : 8.47Km
Rating : 0.8


Rekomendasi Restoran
Nama   : El Mundo de la Pasta
Alamat : Rio Papaloapan 265 Lomas de San Luis (3)
Jenis  : Italian
Jarak  : 0.26Km
Rating : 1.5


Rekomendasi Restoran
Nama   : Gordas de morales
Alamat : ?
Jenis  : Mexican
Jarak  : 0.28Km
Rating : 1.42


Rekomendasi Restoran
Nama   : La Estrella de Dimas
Alamat : Av. de los Pintores
Jenis  : Mexican
Jarak  : 0.62Km
Rating : 1.8


Rekomendasi Restoran
Nama   : emilianos
Alamat : venustiano carranza
Jenis  : Bar_Pub_Brewery
