### COCKTAIL_RECOMMENDATIONS

In this notebook we implement a recommendation algorithm that computes the cosine similarity between cocktails based on the ingredients each cocktail contains.


First, we import the packages used in this notebook.

In [1]:
import pandas as pd
import numpy as np
import random
import time
import os
import json
import matplotlib.pyplot as plt
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import linear_kernel
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.feature_extraction.text import CountVectorizer
from ast import literal_eval

First we read the `cocktails.json` file. We will then extract the ingredients of each cocktail and build a new dataframe to use for the recommendation algorithm.

In [2]:
with open('cocktails.json', 'r') as reader:
    data = json.loads(reader.read())

From the `json` data we create a dataframe that pairs each cocktail with its ingredients.

In [19]:
values = []
for cocktail in data:
    name = cocktail.get('name')
    # Collect the ingredient names listed for this cocktail.
    ingredients = [
        ingredient.get('ingredient')
        for ingredient in cocktail.get('glass_and_ingredients').get('ingredients')
    ]
    # Join them into one lowercase, space-separated string for vectorization.
    values.append([name, ' '.join(ingredients).lower()])

df = pd.DataFrame(values, columns=['cocktail', 'ingredients'])
df.head()

| | cocktail | ingredients |
|---|---|---|
| 0 | Vesper | gin vodka lillet blonde |
| 1 | Bacardi | white rum lime juice syrup |
| 2 | Negroni | gin campari vermouth |
| 3 | Rose | kirsch vermouth syrup |
| 4 | Old Fashioned | whiskey angostura bitters sugar few dashes pla... |
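To make the expected input shape explicit, here is a minimal sketch of the same flattening step run on two in-memory records. The records are hypothetical and only mimic the keys the loop above reads (`name`, `glass_and_ingredients`, `ingredients`, `ingredient`); the real `cocktails.json` may contain more fields. The helper name `flatten` is also an assumption made for this illustration.

```python
import pandas as pd

# Hypothetical records that mimic the keys read by the loop above.
sample = [
    {
        "name": "Negroni",
        "glass_and_ingredients": {
            "ingredients": [
                {"ingredient": "Gin"},
                {"ingredient": "Campari"},
                {"ingredient": "Vermouth"},
            ]
        },
    },
    {
        "name": "Vesper",
        "glass_and_ingredients": {
            "ingredients": [
                {"ingredient": "Gin"},
                {"ingredient": "Vodka"},
                {"ingredient": "Lillet Blonde"},
            ]
        },
    },
]


def flatten(records):
    """Build one row per cocktail: its name and a lowercase ingredient string."""
    rows = []
    for record in records:
        # Fall back to empty containers so a record with missing keys does not raise.
        items = record.get("glass_and_ingredients", {}).get("ingredients", [])
        text = " ".join(item.get("ingredient", "") for item in items).lower()
        rows.append([record.get("name"), text])
    return pd.DataFrame(rows, columns=["cocktail", "ingredients"])


sample_df = flatten(sample)
print(sample_df)
```

Unlike the cell above, this sketch uses default values in the `.get` calls, so a record without ingredients produces an empty string instead of an `AttributeError`.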


Next we are going to save that dataframe as a `.csv` file.

In [21]:
df.to_csv('cocktails_ingredients.csv', index=False)
print("Saved!!")

Saved!!


Next we create a function that recommends cocktails with similar ingredients. The function, `get_recommendations_based_on_ingredients`, takes the name of a cocktail, the name of the file in which cocktails are paired with their ingredients, and the number of results `n` to return. Because a cocktail is most similar to itself, the requested cocktail appears first in the result.

In [23]:
def get_recommendations_based_on_ingredients(name: str, file_name: str, n: int = 11) -> list[str]:
    dataframe = pd.read_csv(file_name)
    # Drop duplicate cocktails before vectorizing and reset the index so that
    # row labels match the row positions of the similarity matrix.
    dataframe = dataframe.drop_duplicates(subset=['cocktail']).reset_index(drop=True)
    tfidf = TfidfVectorizer(stop_words='english')
    tfidf_matrix = tfidf.fit_transform(dataframe['ingredients'])
    # TF-IDF rows are L2-normalized by default, so the dot product is the cosine similarity.
    cosine_sim = linear_kernel(tfidf_matrix, tfidf_matrix)
    # Map each cocktail name to its row position.
    indices = pd.Series(dataframe.index, index=dataframe['cocktail'])
    idx = indices[name]
    # Rank all cocktails by similarity to the requested one, highest first.
    sim_scores = sorted(enumerate(cosine_sim[idx]), key=lambda x: x[1], reverse=True)
    # The requested cocktail normally ranks first, so n=11 yields it plus 10 others.
    recipe_indices = [i for i, _ in sim_scores[:n]]
    return dataframe['cocktail'].iloc[recipe_indices].tolist()

get_recommendations_based_on_ingredients("Screwdriver", 'cocktails_ingredients.csv')

['Screwdriver',
 'Harvey Wallbanger',
 'Sex on the Beach',
 'Mimosa',
 'Tequila Sunrise',
 'Monkey Gland',
 'Kamikaze',
 'Sea Breeze',
 'Casino',
 "Planter's Punch",
 'Golden Dream']
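The function calls `linear_kernel` rather than `cosine_similarity`. The small check below sketches why the two agree here: `TfidfVectorizer` L2-normalizes each row by default, so the plain dot product of two rows is already their cosine similarity. It uses the first four rows of the dataframe shown earlier; the variable names are chosen for this illustration only.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity, linear_kernel

# First four rows of the dataframe shown earlier in the notebook.
names = ["Vesper", "Bacardi", "Negroni", "Rose"]
docs = [
    "gin vodka lillet blonde",
    "white rum lime juice syrup",
    "gin campari vermouth",
    "kirsch vermouth syrup",
]

tfidf_matrix = TfidfVectorizer(stop_words="english").fit_transform(docs)

# Dot product of L2-normalized rows versus explicit cosine similarity.
dot = linear_kernel(tfidf_matrix, tfidf_matrix)
cos = cosine_similarity(tfidf_matrix, tfidf_matrix)
same = np.allclose(dot, cos)
print("linear_kernel matches cosine_similarity:", same)

# Rank the other cocktails by similarity to the Negroni, excluding itself.
negroni = names.index("Negroni")
ranked = [names[i] for i in np.argsort(-dot[negroni]) if i != negroni]
print(ranked)
```

The Negroni shares `gin` with the Vesper and `vermouth` with the Rose but no term with the Bacardi, so the Bacardi scores exactly zero and ranks last. On a large matrix `linear_kernel` skips the normalization step that `cosine_similarity` repeats, which is the usual reason for preferring it with TF-IDF input.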