# Data-Driven Cocktail Challenge
## A mystery recipe to solve with pandas
### Brought to you by Top Shelf Data Science, a new podcast from Alteryx. 

Top Shelf Data Science features top experts in lively and informative conversations that will change the way you do data science. We'll explore important topics in machine learning and AI with the innovators shaping the field, diving deep but keeping it light with happy hour beverages and snacks.

Find us in your favorite podcast player and at alteryx.com/topshelf.

***

You'll need [this dataset of cocktail recipes](https://www.kaggle.com/ai-first/cocktail-ingredients) to complete this challenge!

In [18]:
import pandas as pd
import numpy as np

In [19]:
# Update file location for your downloaded dataset

df = pd.read_csv("all_drinks.csv")

In [32]:
# Step 1a
# Find the cocktail in the dataset that includes raspberry vodka as its first ingredient. 
# What is the measurement for that first ingredient?

df.loc[df['strIngredient1'] == 'Raspberry vodka', 'strMeasure1']

428    2 oz 
Name: strMeasure1, dtype: object

In [21]:
# Step 1b
# Only one drink contains apricot brandy as its first ingredient and apple brandy as its second. 
# What’s the quantity for apricot brandy? 

df.loc[(df['strIngredient1'] == 'Apricot brandy') & (df['strIngredient2'] == 'Apple brandy'), 'strMeasure1']

87    1/2 oz 
Name: strMeasure1, dtype: object
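
The two-condition filter above can be sketched on a tiny frame. The rows below are invented for illustration and are not taken from the dataset. Each comparison is built as its own boolean mask and the masks are combined with `&`; when the comparisons are written inline, as in the cell above, each one needs its own parentheses because `&` binds more tightly than `==`. The sketch also pulls the single matching value out as a plain string with `.iloc[0]`.

```python
import pandas as pd

# Toy rows, invented for illustration only
toy = pd.DataFrame({
    "strIngredient1": ["Gin", "Apricot brandy", "Apricot brandy"],
    "strIngredient2": ["Tonic", "Apple brandy", "Gin"],
    "strMeasure1": ["2 oz", "1/2 oz", "1 oz"],
})

# One boolean mask per condition
first_is_apricot = toy["strIngredient1"] == "Apricot brandy"
second_is_apple = toy["strIngredient2"] == "Apple brandy"

# Rows where both masks are True, restricted to the measure column
both = toy.loc[first_is_apricot & second_is_apple, "strMeasure1"]

# Extract the single match as a scalar instead of a one-row Series
measure = both.iloc[0]
```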

In [22]:
# Step 1c
# Only one drink is served in a highball glass and has sugar as its first ingredient. 
# What’s the quantity for sugar?

df.loc[(df['strIngredient1'] == 'Sugar') & (df['strGlass'] == 'Highball glass'), 'strMeasure1']

160    1 tsp superfine 
Name: strMeasure1, dtype: object

In [23]:
# Step 1d
# Only one drink is served in a punch bowl (as its “glass”) and has “apple” in its name. 
# What’s the measurement for its first ingredient?

df.loc[(df['strGlass'] == 'Punch Bowl') & (df['strDrink'].str.contains("Apple")), 'strMeasure1']

92    3 oz 
Name: strMeasure1, dtype: object
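
The step asks for "apple" in the name, while the query matches the capitalized "Apple". `str.contains` is case sensitive by default, so the two are not the same search. A minimal sketch on made-up drink names (not from the dataset) shows the difference:

```python
import pandas as pd

# Toy data, invented for illustration only
toy = pd.DataFrame({
    "strDrink": ["Apple Punch", "Pineapple Fizz", "Cherry Cup"],
    "strGlass": ["Punch Bowl", "Punch Bowl", "Punch Bowl"],
})

# Case-sensitive: only names containing "Apple" with a capital A
exact = toy.loc[toy["strDrink"].str.contains("Apple"), "strDrink"].tolist()

# Case-insensitive: also matches the "apple" inside "Pineapple"
loose = toy.loc[toy["strDrink"].str.contains("apple", case=False), "strDrink"].tolist()
```

Which variant is appropriate depends on how the names are capitalized in your copy of the data.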

In [24]:
# Step 2a
# First ingredient
# Find the drink at index 462. What is its second ingredient?

df.iloc[462]['strIngredient2']

'Powdered sugar'
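
`iloc` selects by position and `loc` selects by index label. With the default 0, 1, 2, ... index that `read_csv` creates, the two agree, which is why either works in this challenge. A small sketch with an invented frame and a deliberately non-default index shows where they differ:

```python
import pandas as pd

# Toy frame with labels 10, 20, 30 instead of the default 0, 1, 2
toy = pd.DataFrame(
    {"strIngredient2": ["Lime", "Sugar", "Mint"]},
    index=[10, 20, 30],
)

by_position = toy.iloc[1]["strIngredient2"]  # second row, whatever its label
by_label = toy.loc[20, "strIngredient2"]     # the row whose label is 20
```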

In [33]:
# Step 2b
# Second ingredient
# Find the size of the cocktail dataframe. Divide the size by 93 and round the result to 0 decimal places.
# Find the cocktail at that index in the dataframe. What is its first ingredient?

# Round the quotient and cast it to int so it can be used as a row position
df.iloc[int(round(df.size / 93))]['strIngredient1']

'Gin'
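
`df.size` counts cells (rows times columns), not rows, which is why the step divides it before using the result as an index. A short sketch on an invented 4 x 3 frame, with an arbitrary divisor of 5 standing in for the 93 used above:

```python
import pandas as pd

# Toy frame: 4 rows, 3 columns
toy = pd.DataFrame({
    "a": [1, 2, 3, 4],
    "b": [5, 6, 7, 8],
    "c": [9, 10, 11, 12],
})

n_cells = toy.size   # rows * columns
n_rows = len(toy)    # rows only

# Divide, round, and cast to int so the value is a valid row position
position = int(round(n_cells / 5))
row = toy.iloc[position]
```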

In [26]:
# Step 2c
# Third ingredient
# How many drinks in the dataset are listed as "Alcoholic"?
# Find that number and subtract 29.
# Then identify the drink at that index in the dataset and find its fourth ingredient.

df.iloc[(df['strAlcoholic'] == 'Alcoholic').sum(axis=0) - 29]['strIngredient4']

'Lemon juice'

In [35]:
# Step 2d
# Fourth ingredient
# Three drinks have names 13 characters long that contain the word "Amaretto."
# Find the third ingredient of the drink whose name is second alphabetically among those three drinks.

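# Sort by name so "second alphabetically" does not depend on the row order of the file
amaretto_13 = df.loc[(df['strDrink'].str.len() == 13) & (df['strDrink'].str.contains("Amaretto"))]
amaretto_13.sort_values('strDrink')['strIngredient3'].iloc[1:2]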

76    Club soda
Name: strIngredient3, dtype: object

In [37]:
# Step 2e
# Fifth ingredient
# How many unique values are there for the first measurement column? 
# Use that count as a row index, find the drink there, and identify its third ingredient.
# You will need a slice of that ingredient as your final item for the recipe.

df.iloc[df['strMeasure1'].nunique()]['strIngredient3']

'Lemon'
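
`nunique()` ignores missing values by default, so the count used above does not include `NaN` as a measurement. A minimal sketch on an invented Series of measures:

```python
import numpy as np
import pandas as pd

# Toy measures with two missing entries, invented for illustration
measures = pd.Series(["1 oz", "2 oz", "1 oz", np.nan, "1/2 oz", np.nan])

without_nan = measures.nunique()           # NaN is ignored by default
with_nan = measures.nunique(dropna=False)  # NaN counts as one extra value
```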

In [39]:
# Step 3
# Shake or stir?
# How many drinks contain "stir" in their instructions? Be sure your query is not case sensitive.
# If the number is less than 200, use "shake" in your recipe; if 200 or more, use "stir" in the recipe.

df[df['strInstructions'].str.contains("Stir", case=False, na=False)].shape[0]

152
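
The count and the shake-or-stir rule can be sketched together. The instructions below are invented, and the threshold of 2 is a toy stand-in for the 200 used in the challenge. Summing the boolean mask gives the same count as filtering and reading `.shape[0]`.

```python
import pandas as pd

# Toy instructions, including one missing value
instructions = pd.Series([
    "Stir with ice and strain.",
    "Shake well, then stir gently.",
    "Pour over ice.",
    None,
])

# case=False ignores capitalization; na=False treats missing text as no match
mask = instructions.str.contains("stir", case=False, na=False)
n_stir = int(mask.sum())  # True counts as 1

threshold = 2  # toy value; the challenge uses 200
method = "shake" if n_stir < threshold else "stir"
```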

In [40]:
# Step 4
# Which kind of glass should this drink use?
# Find the memory in bytes used by the names of the drinks ('strDrink' column).
# What is the third digit of the resulting number? 
# Find the drink at that index; put the same kind of glass it uses into your recipe's instructions.

df.iloc[int(str(df['strDrink'].memory_usage())[2])]['strGlass']

'Collins Glass'
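
By default, `memory_usage()` on a Series includes the bytes used by the index and does not inspect the string contents. The exact byte count may differ between pandas versions and platforms, so the third digit you get may not match the one above. A sketch on an invented Series compares the variants without assuming specific byte counts; the number 12345 is a hypothetical value used only to show the digit extraction:

```python
import pandas as pd

# Toy drink names, invented for illustration
names = pd.Series(["Mojito", "Negroni", "Paloma"])

default_bytes = names.memory_usage()           # values plus index
values_only = names.memory_usage(index=False)  # values without the index
# deep=True also measures the stored strings when the column has object dtype
deep_bytes = names.memory_usage(index=False, deep=True)

# Third digit of a number: convert to text, take the character at position 2
third_digit = int(str(12345)[2])
```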