# Get data from the API

In this notebook we fetch the data needed for the project from the [spoonacular](https://spoonacular.com/food-api/docs) API. The raw data is saved to a CSV file; in a different notebook, this CSV is cleaned, split into multiple tables, and turned into a database.

In [98]:
import pandas as pd
import numpy as np
import requests
import os

In [99]:
# parameters
api_key = 'aaaf78de68d94b1dbd93ee06cc2551fb' # replace with your own API key
params = {
    'number': 100, # number of recipes to return (limited to 100 per call)
    'apiKey': api_key
}
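
Hardcoding the key in the notebook makes it easy to share by accident. A minimal sketch of an alternative, assuming the key is stored in an environment variable; the name `SPOONACULAR_API_KEY` and the helper `build_params` are hypothetical and not part of the spoonacular API:

```python
import os


def build_params(number=100, env_var='SPOONACULAR_API_KEY'):
    """Build the query parameters, reading the API key from the environment.

    Assumption: `env_var` is a hypothetical variable name chosen for this
    sketch; use whatever name fits your setup.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early with a clear message instead of sending a request without a key
        raise RuntimeError(f'Set the {env_var} environment variable first')
    return {'number': number, 'apiKey': api_key}
```

The returned dictionary has the same shape as `params` above, so it could be passed to `requests.get` in the same way.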

In [100]:
# API call
response = requests.get('https://api.spoonacular.com/recipes/random', params=params)

For each recipe in the response (if the status code is 200), we keep the following information:
* id: uniquely identifies a recipe
* image: link to an image of the recipe, which could be useful for the app
* sourceUrl: link to the recipe (for the app)
* title: name of the recipe
* instructions: instructions for the recipe (HTML format)
* summary: short description of the recipe (gluten free? vegan? prep time)
* extendedIngredients: ingredients with their quantity, ingredient id, aisle, ... This part will be cleaned in another notebook

In [101]:
df_list = []  # List to accumulate rows

if response.status_code == 200:
    data = response.json()
    
    for recipe in data['recipes']:
        # Skip the recipe if any required field is missing
        required = ('id', 'image', 'sourceUrl', 'title', 'instructions', 'summary', 'extendedIngredients')
        if any(key not in recipe for key in required):
            continue

        new_row = {
            'id': recipe['id'],
            'image': recipe['image'],
            'sourceUrl': recipe['sourceUrl'],
            'title': recipe['title'],
            'instructions': recipe['instructions'],
            'summary': recipe['summary'],
            'extendedIngredients': recipe['extendedIngredients']
        }
        df_list.append(new_row)

    df = pd.DataFrame(df_list)

    
else:
    print('The request failed with status code', response.status_code)

In [102]:
#look at the result
print(df.info())
print("duplicates:", df['id'].duplicated().sum())
df

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 100 entries, 0 to 99
Data columns (total 7 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   id                   100 non-null    int64 
 1   image                100 non-null    object
 2   sourceUrl            100 non-null    object
 3   title                100 non-null    object
 4   instructions         100 non-null    object
 5   summary              100 non-null    object
 6   extendedIngredients  100 non-null    object
dtypes: int64(1), object(6)
memory usage: 5.6+ KB
None
duplicates: 0


Unnamed: 0,id,image,sourceUrl,title,instructions,summary,extendedIngredients
0,640273,https://spoonacular.com/recipeImages/640273-55...,https://www.foodista.com/recipe/MMGP257M/crab-...,Crab Cake Stuffed Shrimp,Line a baking sheet with parchment paper.\nHea...,Crab Cake Stuffed Shrimp requires roughly <b>4...,"[{'id': 1145, 'aisle': 'Milk, Eggs, Other Dair..."
1,729532,https://spoonacular.com/recipeImages/729532-55...,http://www.pinkwhen.com/gluten-free-almond-blu...,Gluten Free Almond Blueberry Coffee Cake,<ol><li>Preheat your oven to 375 degrees F and...,"You can never have too many breakfast recipes,...","[{'id': 10112061, 'aisle': 'Baking', 'image': ..."
2,647679,https://spoonacular.com/recipeImages/647679-55...,http://www.foodista.com/recipe/BT8SH5YL/hydera...,Hyderabadi baghara Baingan,<ol><li>Wash the eggplants and pat them dry. S...,Hyderabadi baghara Baingan might be just the s...,"[{'id': 11304, 'aisle': 'Produce', 'image': 'p..."
3,638385,https://spoonacular.com/recipeImages/638385-55...,http://www.foodista.com/recipe/NJJ7QPLQ/chicke...,Chicken thighs wrapped in prosciutto,"<ol><li>Marinate the chicken with the thyme, b...","If you want to add more <b>gluten free, dairy ...","[{'id': 2044, 'aisle': 'Produce', 'image': 'fr..."
4,634972,https://spoonacular.com/recipeImages/634972-55...,https://www.foodista.com/recipe/8GHPPTH2/bigol...,Bigoli with smoked salmon,cup pine nuts\n250 g bigoli fresh pasta (or si...,Bigoli with smoked salmon requires around <b>4...,"[{'id': 12147, 'aisle': 'Produce;Baking', 'ima..."
...,...,...,...,...,...,...,...
95,658503,https://spoonacular.com/recipeImages/658503-55...,http://www.foodista.com/recipe/8JHF7S2G/roaste...,Roasted Beet Hummus,<ol><li>Place all ingredients except for the c...,Roasted Beet Hummus requires roughly <b>45 min...,"[{'id': 11080, 'aisle': 'Produce', 'image': 'b..."
96,664970,https://spoonacular.com/recipeImages/664970-55...,http://www.foodista.com/recipe/44X5W8GW/spinac...,Warm Spinach Artichoke Dip,"<ol><li>Place olive oil, artichoke hearts, gar...",Warm Spinach Artichoke Dip takes about <b>1 ho...,"[{'id': 99242, 'aisle': 'Produce', 'image': 'a..."
97,638557,https://spoonacular.com/recipeImages/638557-55...,http://www.foodista.com/recipe/X6YYBXTZ/chili-...,Chili Gobi,"<ol><li>Make a batter with chili powder, beate...",Chili Gobi is an American side dish. For <b>$1...,"[{'id': 11135, 'aisle': 'Produce', 'image': 'c..."
98,636552,https://spoonacular.com/recipeImages/636552-55...,https://www.foodista.com/recipe/LF56ZJ6Z/butte...,Buttermilk Cornbread and Sage Stuffing,Preheat oven to 325F.\nSpread all bread crumbs...,If you want to add more <b>Southern</b> recipe...,"[{'id': 18079, 'aisle': 'Pasta and Rice', 'ima..."


In [103]:
# File path
file_path = '../Data/raw_recipe.csv'

if os.path.isfile(file_path):
    # Append new result to the existing CSV file
    df.to_csv(file_path, mode='a', header=False, index=False)
else:
    # Create a new CSV file and save the result
    df.to_csv(file_path, index=False)


The previous cells have to be rerun until the number of rows minus the number of duplicates is large enough for your needs:

In [104]:
df2 = pd.read_csv('../Data/raw_recipe.csv')
df2.info()
df2['id'].duplicated().sum()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1293 entries, 0 to 1292
Data columns (total 7 columns):
 #   Column               Non-Null Count  Dtype 
---  ------               --------------  ----- 
 0   id                   1293 non-null   int64 
 1   image                1293 non-null   object
 2   sourceUrl            1293 non-null   object
 3   title                1293 non-null   object
 4   instructions         1271 non-null   object
 5   summary              1293 non-null   object
 6   extendedIngredients  1293 non-null   object
dtypes: int64(1), object(6)
memory usage: 70.8+ KB


323
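
With 1293 rows and 323 duplicated ids, the file above holds 970 unique recipes. Rerunning the cells by hand until that number is high enough could also be automated. A minimal sketch, assuming a hypothetical `fetch_batch` callable that returns a list of recipe dictionaries (for example a wrapper around the `recipes/random` call above); `collect_unique`, `target` and `max_calls` are names introduced here for illustration:

```python
def collect_unique(fetch_batch, target, max_calls=20):
    """Call fetch_batch() until `target` unique recipe ids are collected.

    Assumptions: `fetch_batch` is a hypothetical callable returning a list of
    dicts that each contain an 'id' key. `max_calls` caps the number of calls
    so that the loop cannot exhaust an API quota.
    """
    unique = {}  # maps recipe id -> recipe, keeping the first occurrence
    for _ in range(max_calls):
        if len(unique) >= target:
            break
        for recipe in fetch_batch():
            unique.setdefault(recipe['id'], recipe)
    # May hold more than `target` recipes (last batch overshoots)
    # or fewer (max_calls reached first)
    return list(unique.values())
```

Because duplicates are dropped while collecting, the result can be turned into a DataFrame and written once, instead of appending every batch to the CSV and counting duplicates afterwards.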