# Summary

- What it does
    - Recipe management/storage
        - Input recipe, output df, csv, txt
    - Recipe costing
    - ?Restaurant simulation
        - Point rewards
        - Blockchain payments
- Who it helps
- What tools/libraries does it use

## To Do:
- Set up Spoonacular API
    - Explore relevant endpoints
    - Create functions to get data from endpoints


## Jump to:
[Imports, Base Functions](#imports-base-functions)

[Spoonacular](#spoonacular)

[Google Sheets API](#google-sheets-api)

## Resources
[Python Dev Interface: Requests](https://docs.python-requests.org/en/latest/api/)

# Libraries

## Installations
[Google Sheets for Developers: Sheets API](https://developers.google.com/sheets/api/reference/rest)
- Install `google-auth-httplib2` and `google-auth-oauthlib` if using OAuth2. Their functions have largely been incorporated into the API client library.

[GSpread by burnash](https://github.com/burnash/gspread)

In [1]:
# FuzzyWuzzy
# !pip install fuzzywuzzy

# Docx
# !pip install python-docx

# Google API client library
# !pip install --upgrade google-api-python-client oauth2client # google-auth-httplib2 google-auth-oauthlib

# GSpread
# !pip install gspread



## Imports

In [2]:
from pathlib import Path
import csv
import pandas as pd
# import hvplot.pandas
# import panel as pn
# pn.extension('plotly')
# import plotly.express as px
# import plotly.io as pio
# pio.renderers.default = 'iframe_connected'  # Bypass mimetype 'renderer not found'
import matplotlib.pyplot as plt
# import numpy as np
# import seaborn as sns

import os
from dotenv import load_dotenv
load_dotenv('./tokens/token.env')
spoon_key = os.getenv('SPOONACULAR_KEY')
gcloud_key = os.getenv('GCLOUD_KEY')
gcloud_oauth_key = os.getenv('GCLOUD_OAUTH_KEY')
gcloud_oauth_secret = os.getenv('GCLOUD_OAUTH_SECRET')

import json, requests
# `json_normalize` is called as `pd.json_normalize` below; the `pandas.io.json` import path has been deprecated since pandas 1.0

# Docx
import docx
from docx import Document
from docx.shared import Inches
from os import listdir
from os.path import isfile, join

# Google Sheets API
from pprint import pprint
from googleapiclient import discovery
from oauth2client.service_account import ServiceAccountCredentials

# GSpread by burnash
import gspread

# `chatter` functions from chatter.py
import sys
# from chatter import *

# Spoonacular
Spoonacular is an API with food-related endpoints, including ingredient price and nutrition data, recipe cost breakdowns, product comparisons, and menu data from over 800 American restaurant chains.

## API and key URLs

In [3]:
# Requests should be formatted as "spoon_url + endpoint + '?query=' + query_params + key_url"
spoon_url = 'https://api.spoonacular.com'
key_url = '&apiKey=' + spoon_key
# requests.get(spoon_url + key_url + spoon_key)
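
As an alternative to concatenating strings by hand, the request URL can be assembled with the standard library so that spaces and special characters in a query are escaped. This is only a sketch: `build_spoon_url` is a hypothetical helper and `'DEMO_KEY'` is a placeholder, not a real key.

```python
from urllib.parse import urlencode

SPOON_URL = 'https://api.spoonacular.com'  # same base URL as `spoon_url` above

def build_spoon_url(endpoint, api_key, **params):
    """Build a request URL; urlencode() escapes spaces and other special characters."""
    query = urlencode({**params, 'apiKey': api_key})
    return f"{SPOON_URL}{endpoint}?{query}"

# 'DEMO_KEY' is a placeholder; the notebook would pass `spoon_key` instead
example_url = build_spoon_url('/food/ingredients/search', 'DEMO_KEY',
                              query='all purpose flour', number=2)
print(example_url)
```

`requests.get(url, params={...})` performs the same encoding internally, so passing a dict of parameters to `requests` is another option.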

## Endpoints
Learn more about [quotas](https://spoonacular.com/food-api/docs#Quotas).

### [Search Recipes](https://spoonacular.com/food-api/docs#Search-Recipes-Complex)

Search through hundreds of thousands of recipes using advanced filtering and ranking.

NOTE: This method combines searching by query, by ingredients, and by nutrients into one endpoint.

Calling this endpoint requires `1 point` and `0.01 points` per result returned. Since this endpoint combines the capabilities of four different endpoints into one, additional points may be required depending on the parameters you set.

- If `fillIngredients` is set to true, `0.025 points` will be added per recipe returned.
- If a nutrient filter is set, `1 point` will be added.
- If `addRecipeInformation` is set to true, `0.025 points` will be added per recipe returned.
- If `addRecipeNutrition` is set to true, `0.025 points` will be added per recipe returned and `addRecipeInformation` will automatically be set to true as well.
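
The point rules above can be turned into a rough quota estimate before making a call. The sketch below is an assumption-laden reading of those rules: `estimate_search_points` is a hypothetical helper, and it assumes that turning on `addRecipeNutrition` is charged for both the nutrition and the automatically enabled recipe information.

```python
def estimate_search_points(n_results, fill_ingredients=False, nutrient_filter=False,
                           add_recipe_information=False, add_recipe_nutrition=False):
    """Estimate the quota points for one Search Recipes call, per the rules listed above."""
    # addRecipeNutrition switches addRecipeInformation on automatically
    # (assumption: both per-recipe surcharges then apply)
    if add_recipe_nutrition:
        add_recipe_information = True
    points = 1 + 0.01 * n_results          # base cost plus per-result cost
    if fill_ingredients:
        points += 0.025 * n_results
    if nutrient_filter:
        points += 1
    if add_recipe_information:
        points += 0.025 * n_results
    if add_recipe_nutrition:
        points += 0.025 * n_results
    return round(points, 3)

basic_cost = estimate_search_points(10)
nutrition_cost = estimate_search_points(10, add_recipe_nutrition=True)
print(basic_cost, nutrition_cost)
```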

Parameters:
- `query`: the (natural language) recipe search query.
- `cuisine`: the cuisine(s) of the recipes. One or more, comma separated. [List of supported cuisines](https://spoonacular.com/food-api/docs#Cuisines)
- `excludeCuisine`: the cuisine(s) the recipes must not match. One or more, comma separated.
- `includeIngredients`: a comma-separated list of ingredients that should/must be used in the recipes.
- `excludeIngredients`: a comma-separated list of ingredients or ingredient types that the recipes must not contain.
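
Several of these parameters take comma-separated values, so it can be convenient to build them from Python lists. The following is a sketch only; `complex_search_params` is a hypothetical helper, and the cuisine and ingredient values are arbitrary examples.

```python
def complex_search_params(query, cuisine=None, exclude_cuisine=None,
                          include_ingredients=None, exclude_ingredients=None, number=10):
    """Build a params dict for /recipes/complexSearch, joining lists with commas."""
    raw = {
        'query': query,
        'cuisine': cuisine,
        'excludeCuisine': exclude_cuisine,
        'includeIngredients': include_ingredients,
        'excludeIngredients': exclude_ingredients,
    }
    params = {'number': number}
    for name, value in raw.items():
        if value is None:
            continue  # leave unset filters out of the request
        # accept either a ready-made string or a list/tuple of values
        params[name] = value if isinstance(value, str) else ','.join(value)
    return params

demo_params = complex_search_params('cookies', cuisine=['american', 'french'],
                                    exclude_ingredients=['raisins'])
print(demo_params)
```

The resulting dict could be passed to `requests.get(..., params=...)` together with the `apiKey` entry.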

In [4]:
# Search Recipes endpoint
search_recipe_url = spoon_url + '/recipes/complexSearch?query='

# Function that takes in a query, ingredient, or nutrient and returns up to 10 relevant recipes
# Example request: https://api.spoonacular.com/recipes/complexSearch?query=pasta&maxFat=25&number=2
def search_recipe(query):
    return requests.get(search_recipe_url + query + '&number=10' + key_url).json()

search_recipe('cookies')

{'results': [{'id': 654028,
   'title': 'Oreo Cookies & Cream No-Bake Cheesecake',
   'image': 'https://spoonacular.com/recipeImages/654028-312x231.jpg',
   'imageType': 'jpg'},
  {'id': 662151,
   'title': 'Sugar Cookies',
   'image': 'https://spoonacular.com/recipeImages/662151-312x231.jpg',
   'imageType': 'jpg'},
  {'id': 1095878,
   'title': 'Muesli Cookies',
   'image': 'https://spoonacular.com/recipeImages/1095878-312x231.jpg',
   'imageType': 'jpg'},
  {'id': 639865,
   'title': 'Coffee Cookies',
   'image': 'https://spoonacular.com/recipeImages/639865-312x231.jpg',
   'imageType': 'jpg'},
  {'id': 662786,
   'title': 'Tahini Cookies',
   'image': 'https://spoonacular.com/recipeImages/662786-312x231.jpg',
   'imageType': 'jpg'},
  {'id': 655353,
   'title': 'Peanut Cookies',
   'image': 'https://spoonacular.com/recipeImages/655353-312x231.jpg',
   'imageType': 'jpg'},
  {'id': 640246,
   'title': 'Cowboy Cookies With Pretzels and Raisinettes',
   'image': 'https://spoonacular.c

### [Search Recipes by Ingredients](https://spoonacular.com/food-api/docs#Search-Recipes-by-Ingredients)

Ever wondered what recipes you can cook with the ingredients you have in your fridge or pantry? This endpoint lets you find recipes that either maximize the usage of ingredients you have at hand (pre shopping) or minimize the ingredients that you don't currently have (post shopping).

Find recipes that use as many of the given ingredients as possible and require as few additional ingredients as possible. This is a "what's in your fridge" API endpoint.

Calling this endpoint requires `1 point` and `0.01 points` per recipe returned.

Parameters:
- `ingredients`: a comma-separated list of ingredients that the recipes should contain
- `number`: the maximum number of recipes to return (between 1 and 100). Default 10.

In [5]:
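recipe_by_ing_url = spoon_url + '/recipes/findByIngredients?ingredients='

# Function that takes a comma-separated string of ingredients and returns up to `number` recipes
# Sketch only: not yet run against the API
def search_recipe_by_ing(ingredients, number=10):
    return requests.get(f"{recipe_by_ing_url}{ingredients}&number={number}{key_url}").json()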

### [Ingredient Search](https://spoonacular.com/food-api/docs#Ingredient-Search)

Search for simple whole foods (e.g. fruits, vegetables, nuts, grains, meat, fish, dairy etc.).

Calling this endpoint requires `1 point` and `0.01 points` per result if `metaInformation` is set to true.

Parameters:
- `query`: the partial or full ingredient name.
- `number`: the number of expected results (between 1 and 100).

Necessary steps:
- Search ingredients
- Get ingredient ID
- Get ingredient info
- Create df of ingredient info
- Add ingredient info df to recipe df
- Extract specific data needed for evaluation
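
The steps above can be sketched offline with payloads copied (and trimmed) from the outputs in this notebook, so no API points are spent. `first_ingredient_id` and `info_to_row` are hypothetical helper names, and `Cost_$` is an illustrative column name.

```python
# Sample payloads trimmed from the Ingredient Search and Get Ingredient Information outputs
sample_search = {'results': [{'id': 20081, 'name': 'allpurpose flour', 'image': 'flour.png'}]}
sample_info = {'id': 20081, 'name': 'wheat flour', 'amount': 250.0, 'unit': 'g',
               'estimatedCost': {'value': 33.33, 'unit': 'US Cents'}}

def first_ingredient_id(search_response):
    """Steps 1-2: take the ID of the first search result."""
    return search_response['results'][0]['id']

def info_to_row(info):
    """Steps 3-6: keep only the fields needed for costing, with the cost in dollars."""
    return {
        'ID': info['id'],
        'Ingredient_Name': info['name'],
        'Recipe_Qty': info['amount'],
        'Recipe_Unit': info['unit'],
        'Cost_$': round(info['estimatedCost']['value'] / 100, 2),
    }

ing_id_demo = first_ingredient_id(sample_search)
recipe_rows = [info_to_row(sample_info)]   # one row per ingredient in the recipe
print(ing_id_demo, recipe_rows)
```

A list of such rows can be handed to `pd.DataFrame(recipe_rows)` to produce the recipe DataFrame.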

In [6]:
# Ingredient Search endpoint
search_ing_url = spoon_url + '/food/ingredients/search?query='

# Function that takes in an ingredient and returns up to 20 matches
# Example request: https://api.spoonacular.com/food/ingredients/search?query=banana&number=2&sort=calories&sortDirection=desc
def search_ing(ing):
    return requests.get(search_ing_url + ing + '&number=20' + key_url).json()

ing_search = search_ing('all purpose flour')
ing_search

{'results': [{'id': 20081, 'name': 'allpurpose flour', 'image': 'flour.png'},
  {'id': 20581, 'name': 'unbleached all purpose flour', 'image': 'flour.png'},
  {'id': 93620,
   'name': 'gluten free all purpose flour',
   'image': 'gluten-free-flour.jpg'}],
 'offset': 0,
 'number': 20,
 'totalResults': 3}

In [7]:
# Get results from `ing_search`
ing_result = ing_search['results']
ing_result

[{'id': 20081, 'name': 'allpurpose flour', 'image': 'flour.png'},
 {'id': 20581, 'name': 'unbleached all purpose flour', 'image': 'flour.png'},
 {'id': 93620,
  'name': 'gluten free all purpose flour',
  'image': 'gluten-free-flour.jpg'}]

In [8]:
# Select first result
ing_result[0]

{'id': 20081, 'name': 'allpurpose flour', 'image': 'flour.png'}

In [9]:
# Create df from results, dropping 'image' column
ing_df = pd.DataFrame(ing_result[0], index=[0]).drop(columns='image')
ing_df

| | id | name |
|---|---|---|
| 0 | 20081 | allpurpose flour |


In [10]:
# Get ingredient `id`
print(ing_df['id'])
print()
print(ing_df.loc[0])
print()
print(ing_df.iloc[0])
print()

ing_id = ing_df.iloc[0]['id']
ing_id

0    20081
Name: id, dtype: int64

id                 20081
name    allpurpose flour
Name: 0, dtype: object

id                 20081
name    allpurpose flour
Name: 0, dtype: object



20081

In [11]:
# Function to get ingredient `id`
def get_ing_id(ing):
    ingredient = search_ing(ing)['results'][0]
    print(f"Match for {ing}:")
    print(ingredient)

    ing_id = ingredient['id']
    return ing_id

get_ing_id('all purpose flour')

Match for all purpose flour:
{'id': 20081, 'name': 'allpurpose flour', 'image': 'flour.png'}


20081
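
`get_ing_id()` always takes the first result and raises an `IndexError` when the search returns nothing. One possible refinement, sketched here with the standard library's `difflib` as a stand-in for FuzzyWuzzy, is to pick the closest name and return `None` for an empty result list. `best_match` is a hypothetical helper; the sample results are copied from the output above.

```python
from difflib import SequenceMatcher

def best_match(query, results):
    """Return the result whose name is most similar to `query`, or None if there are no results."""
    if not results:
        return None

    def score(result):
        # Similarity ratio between 0 and 1, ignoring case
        return SequenceMatcher(None, query.lower(), result['name'].lower()).ratio()

    return max(results, key=score)

sample_results = [
    {'id': 20081, 'name': 'allpurpose flour', 'image': 'flour.png'},
    {'id': 20581, 'name': 'unbleached all purpose flour', 'image': 'flour.png'},
    {'id': 93620, 'name': 'gluten free all purpose flour', 'image': 'gluten-free-flour.jpg'},
]
match = best_match('all purpose flour', sample_results)
no_match = best_match('all purpose flour', [])
print(match['id'], match['name'], no_match)
```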

### [Get Ingredient Information](https://spoonacular.com/food-api/docs#Get-Ingredient-Information)
Use an ingredient id to get all available information about an ingredient, such as its image and supermarket aisle.

Calling this endpoint requires `1 point`.

Parameters:
- `id`: the ingredient ID#
- `amount`: the amount of this ingredient
- `unit`: the unit of measure for the given amount (e.g. grams)

In [12]:
# Get ingredient information
def get_ing_info(ing, ing_amount, unit):
    ing_id = get_ing_id(ing)
    return requests.get(f"{spoon_url}/food/ingredients/{ing_id}/information?amount={ing_amount}&unit={unit}{key_url}").json()

get_ing_info('all purpose flour', 250, 'g')

Match for all purpose flour:
{'id': 20081, 'name': 'allpurpose flour', 'image': 'flour.png'}


{'id': 20081,
 'original': 'wheat flour',
 'originalName': 'wheat flour',
 'name': 'wheat flour',
 'amount': 250.0,
 'unit': 'g',
 'unitShort': 'g',
 'unitLong': 'grams',
 'possibleUnits': ['g', 'oz', 'teaspoon', 'cup', 'serving', 'tablespoon'],
 'estimatedCost': {'value': 33.33, 'unit': 'US Cents'},
 'consistency': 'solid',
 'shoppingListUnits': ['ounces'],
 'aisle': 'Baking',
 'image': 'flour.png',
 'meta': [],
 'nutrition': {'nutrients': [{'title': 'Cholesterol',
    'name': 'Cholesterol',
    'amount': 0.0,
    'unit': 'mg'},
   {'title': 'Vitamin K', 'name': 'Vitamin K', 'amount': 0.75, 'unit': 'µg'},
   {'title': 'Calories', 'name': 'Calories', 'amount': 910.0, 'unit': 'kcal'},
   {'title': 'Caffeine', 'name': 'Caffeine', 'amount': 0.0, 'unit': 'mg'},
   {'title': 'Zinc', 'name': 'Zinc', 'amount': 1.75, 'unit': 'mg'},
   {'title': 'Vitamin B12',
    'name': 'Vitamin B12',
    'amount': 0.0,
    'unit': 'µg'},
   {'title': 'Choline', 'name': 'Choline', 'amount': 26.0, 'unit': 'mg'
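
The `nutrition` block is a list of dicts, so looking up one nutrient means scanning the list by name. A small sketch using a few entries copied from the output above (`get_nutrient` is a hypothetical helper):

```python
# A few entries copied from the 'nutrients' list in the output above
sample_nutrients = [
    {'title': 'Cholesterol', 'name': 'Cholesterol', 'amount': 0.0, 'unit': 'mg'},
    {'title': 'Calories', 'name': 'Calories', 'amount': 910.0, 'unit': 'kcal'},
    {'title': 'Zinc', 'name': 'Zinc', 'amount': 1.75, 'unit': 'mg'},
]

def get_nutrient(nutrients, name):
    """Return (amount, unit) for the named nutrient, or None if it is not listed."""
    for nutrient in nutrients:
        if nutrient['name'].lower() == name.lower():
            return nutrient['amount'], nutrient['unit']
    return None

calories = get_nutrient(sample_nutrients, 'calories')
missing = get_nutrient(sample_nutrients, 'Fiber')
print(calories, missing)
```

With a full API response the first argument would be `info['nutrition']['nutrients']`.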

In [13]:
# Create df from ingredient info
def ing_info_to_df(ing, ing_amount, unit):
    ing_info = get_ing_info(ing, ing_amount, unit)
    df = pd.json_normalize(ing_info)[['id','name','amount','unit','estimatedCost.value']]
    df['estimatedCost.value'] = (df['estimatedCost.value'] / 100).round(2)
    df.columns = ['ID','Ingredient_Name','Recipe_Qty','Recipe_Unit','Recipe_Cost_Per_Unit_$']
    return df

# ing_info_to_df('all purpose flour', 16, 'tbsp')     # 16 Tbsp = 1c
ing_info_to_df('all purpose flour', 1, 'c')

Match for all purpose flour:
{'id': 20081, 'name': 'allpurpose flour', 'image': 'flour.png'}


| | ID | Ingredient_Name | Recipe_Qty | Recipe_Unit | Recipe_Cost_Per_Unit_$ |
|---|---|---|---|---|---|
| 0 | 20081 | wheat flour | 1.0 | c | 0.17 |


#### Tests

In [14]:
# Test of get_ing_info()
test_chocolate = get_ing_info('milk chocolate chips',200,'g')
test_chocolate

Match for milk chocolate chips:
{'id': 10099278, 'name': 'milk chocolate chips', 'image': 'chocolate-chips.jpg'}


{'id': 10099278,
 'original': 'milk chocolate morsels',
 'originalName': 'milk chocolate morsels',
 'name': 'milk chocolate morsels',
 'amount': 200.0,
 'unit': 'g',
 'unitShort': 'g',
 'unitLong': 'grams',
 'possibleUnits': ['g', 'oz'],
 'estimatedCost': {'value': 135.71, 'unit': 'US Cents'},
 'consistency': 'solid',
 'aisle': 'Baking',
 'image': 'chocolate-chips.jpg',
 'meta': [],
 'nutrition': {'nutrients': [{'title': 'Cholesterol',
    'name': 'Cholesterol',
    'amount': 28.26,
    'unit': 'mg'},
   {'title': 'Vitamin K', 'name': 'Vitamin K', 'amount': 0.0, 'unit': 'µg'},
   {'title': 'Calories', 'name': 'Calories', 'amount': 494.7, 'unit': 'kcal'},
   {'title': 'Caffeine', 'name': 'Caffeine', 'amount': 0.0, 'unit': 'mg'},
   {'title': 'Zinc', 'name': 'Zinc', 'amount': 0.0, 'unit': 'mg'},
   {'title': 'Vitamin B12',
    'name': 'Vitamin B12',
    'amount': 0.0,
    'unit': 'µg'},
   {'title': 'Saturated Fat',
    'name': 'Saturated Fat',
    'amount': 17.66,
    'unit': 'g'},
   {

In [15]:
ing_info_to_df('milk chocolate chips',200,'g')

Match for milk chocolate chips:
{'id': 10099278, 'name': 'milk chocolate chips', 'image': 'chocolate-chips.jpg'}


| | ID | Ingredient_Name | Recipe_Qty | Recipe_Unit | Recipe_Cost_Per_Unit_$ |
|---|---|---|---|---|---|
| 0 | 10099278 | milk chocolate morsels | 200.0 | g | 1.36 |


In [16]:
# Extracting desired data from ingredient info
print(f"{test_chocolate['estimatedCost']['value']} {test_chocolate['estimatedCost']['unit']}")
pd.json_normalize(test_chocolate)[['id','name','amount','unit','possibleUnits','estimatedCost.value','estimatedCost.unit']]

135.71 US Cents


| | id | name | amount | unit | possibleUnits | estimatedCost.value | estimatedCost.unit |
|---|---|---|---|---|---|---|---|
| 0 | 10099278 | milk chocolate morsels | 200.0 | g | [g, oz] | 135.71 | US Cents |
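
Judging by the outputs in this notebook (33.33 cents for 250 g of flour, 0.17 dollars for 1 cup), `estimatedCost` appears to be the cost of the whole requested amount rather than a per-unit price. Under that assumption, a per-unit cost in dollars can be derived as sketched below; `cost_per_unit_dollars` is a hypothetical helper.

```python
def cost_per_unit_dollars(info):
    """Convert the estimated cost (US cents for the whole amount) to dollars per unit."""
    cost = info['estimatedCost']
    if cost['unit'] != 'US Cents':
        raise ValueError(f"Unexpected cost unit: {cost['unit']!r}")
    return cost['value'] / 100 / info['amount']

# Values copied from the milk chocolate chips output above
sample_chocolate = {'amount': 200.0, 'unit': 'g',
                    'estimatedCost': {'value': 135.71, 'unit': 'US Cents'}}
per_gram = cost_per_unit_dollars(sample_chocolate)
print(f"${per_gram:.4f} per {sample_chocolate['unit']}")
```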


In [17]:
# Test of ing_info_to_df()
ing_info_to_df('milk chocolate chips','200','g')

Match for milk chocolate chips:
{'id': 10099278, 'name': 'milk chocolate chips', 'image': 'chocolate-chips.jpg'}


| | ID | Ingredient_Name | Recipe_Qty | Recipe_Unit | Recipe_Cost_Per_Unit_$ |
|---|---|---|---|---|---|
| 0 | 10099278 | milk chocolate morsels | 200.0 | g | 1.36 |


### [Convert Amounts](https://spoonacular.com/food-api/docs#Convert-Amounts)
Convert unit measurements like "2.5 cups of flour to grams".

Parameters:
- `ingredientName`: the ingredient to convert.
- `sourceAmount`: the original amount you're converting **<u>from</u>** (e.g. the "2.5" in "2.5 cups of flour to grams").
- `sourceUnit`: the original unit of measure you're converting **<u>from</u>** (e.g. the "cups" in "2.5 cups of flour to grams").
- `targetUnit`: the new unit of measure you're converting **<u>to</u>** (e.g. the "grams" in "2.5 cups of flour to grams").

In [18]:
# Function to convert an ingredient's unit of measure
# Example request: https://api.spoonacular.com/recipes/convert?ingredientName=flour&sourceAmount=2.5&sourceUnit=cups&targetUnit=grams
def convert(ing, sourceAmount, sourceUnit, targetUnit):
    return requests.get(f"{spoon_url}/recipes/convert?ingredientName={ing}&sourceAmount={str(sourceAmount)}&sourceUnit={sourceUnit}&targetUnit={targetUnit}{key_url}").json()

convert('flour', 2.5, 'cups', 'grams')

{'sourceAmount': 2.5,
 'sourceUnit': 'cups',
 'targetAmount': 312.5,
 'targetUnit': 'grams',
 'answer': '2.5 cups flour translates to 312.5 grams.',
 'type': 'CONVERSION'}

In [19]:
# Test convert() function and separating data
test_convert = convert('flour', 2.5, 'cups', 'grams')
print(f"{test_convert['sourceAmount']} {test_convert['sourceUnit']}")
print(f"{test_convert['targetAmount']} {test_convert['targetUnit']}")

2.5 cups
312.5 grams


In [20]:
# Create a DataFrame from the convert() json data
test_df = pd.json_normalize(test_convert).drop(columns=['answer','type'])
test_df

| | sourceAmount | sourceUnit | targetAmount | targetUnit |
|---|---|---|---|---|
| 0 | 2.5 | cups | 312.5 | grams |
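
Since every call costs quota, one conversion result can be reused for other amounts of the same ingredient. The sketch below assumes the conversion is linear (a fixed number of grams per cup for a given ingredient); `conversion_factor` and `apply_factor` are hypothetical helpers, and the sample dict is copied from the `convert()` output above.

```python
# Copied from the convert('flour', 2.5, 'cups', 'grams') output above
sample_conversion = {'sourceAmount': 2.5, 'sourceUnit': 'cups',
                     'targetAmount': 312.5, 'targetUnit': 'grams'}

def conversion_factor(conversion):
    """Target units per one source unit (e.g. grams per cup)."""
    return conversion['targetAmount'] / conversion['sourceAmount']

def apply_factor(amount, factor):
    """Convert another amount of the same ingredient, assuming a linear conversion."""
    return amount * factor

grams_per_cup = conversion_factor(sample_conversion)
three_quarter_cup = apply_factor(0.75, grams_per_cup)
print(grams_per_cup, three_quarter_cup)
```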


### Other Endpoints

In [21]:
# # Ingredients endpoints
# search_ingredient = spoon_url + '/food/ingredients/search?query='

# # Products endpoints
# products = spoon_url + '/food/products'

# # Menu Items endpoints
# menu_item = spoon_url + '/food/menuItems'

# # Meal Planner endpoints
# meal_plan = spoon_url + '/mealplanner'

# # Wine endpoints
# wine = spoon_url + '/food/wine'

# # Detect Food in Text endpoint
# detect = spoon_url + '/food/detect'

In [22]:
# Get ID from search results


In [23]:
# # Dictionary containing endpoints
# ends = {
#     'recipes': '/recipes/complexSearch?',
#     'ingredient': '/food/ingredients?',
#     'products': '/food/products?',
#     'menu_items': '/food/menuItems?',
#     'meal_plan': '/mealplanner?',
#     'wine': '/food/wine?',
#     'detect': '/food/detect?',
# }

# def search(endpoint, query_item):
#     # user_ends = input("Select endpoint: ")
#     search_url = f"{spoon_url}{ends[endpoint]}query={query_item}{key_url}"
#     print(search_url)

In [24]:
# search("recipes", "mango")

In [25]:
# search_ing + 'banana' + api_key + '?format=json'
# requests.get(z)
# y = search_ing + 'mango' + api_key #+ '?format=json'
# print(y)
# requests.get(y)

In [26]:
# z = search_ing + '/search?' + api_key + '&query=mango'
# print(z)
# requests.get(z)

In [27]:
# def find_ing(ing):
#     search_ing + ing + api_key
# find_ing("banana")

In [28]:
request = requests.get(spoon_url, params={
    'grant_type': 'client_credentials',
    'client_id': spoon_key,
})
{request}

{<Response [404]>}

In [29]:
# request = requests.post(spoon_url + '?apiKey=' + spoon_key, {
#     'grant_type': 'client_credentials',
#     'client_id': api_key,
# })

In [30]:
# recipe_id = ''
# ingredient = ''
# x = ''
# class spoon:
#     class recipes():
#         recipes = api + 'recipes/'
#         search = recipes + 'complexSearch/'
#         price = recipes + recipe_id + 'priceBreakdownWidget.json'
#         ingredients = recipes + recipe_id + 'ingredientWidget.json'
#         nutrition = recipes + recipe_id + 'nutritionWidget.json'
#     class ingredient(x):
#         ingredient = api + 'food/ingredients/'
#         search = ingredient + 'search?query=' + x

# Google Sheets API
[Google Sheets for Developers: Sheets API](https://developers.google.com/sheets/api/reference/rest)

`gcloud_key`, `gcloud_oauth_key`, `gcloud_oauth_secret`

## Google API Client Library

OAuth scopes:
- https://www.googleapis.com/auth/drive
- https://www.googleapis.com/auth/drive.readonly
- https://www.googleapis.com/auth/drive.file
- https://www.googleapis.com/auth/spreadsheets
- https://www.googleapis.com/auth/spreadsheets.readonly

In [31]:
# Google Client Library Installation
# !pip install --upgrade google-api-python-client

# Imports for Google Cloud
# from googleapiclient import discovery
# from oauth2client.client import OAuth2Credentials as creds
# crm = discovery.build(
#     'cloudresourcemanager', 'v3', http=creds.authorize(httplib2.Http()))

# project = crm.projects().get(projectId=flags.projectId).execute()

In [32]:
# Sample script from Google API documentation
from __future__ import print_function
import os.path
from googleapiclient.discovery import build
from google_auth_oauthlib.flow import InstalledAppFlow
from google.auth.transport.requests import Request
from google.oauth2.credentials import Credentials

# If modifying these scopes, delete the file token.json.
SCOPES = ['https://www.googleapis.com/auth/spreadsheets.readonly']

# The ID and range of a sample spreadsheet.
SAMPLE_SPREADSHEET_ID = '1BxiMVs0XRA5nFMdKvBdBZjgmUUqptlbs74OgvE2upms'
SAMPLE_RANGE_NAME = 'Class Data!A2:E'

def main():
    """Shows basic usage of the Sheets API.
    Prints values from a sample spreadsheet.
    """
    creds = None
    # The file token.json stores the user's access and refresh tokens, and is
    # created automatically when the authorization flow completes for the first
    # time.
    if os.path.exists('tokens/token.json'):
        creds = Credentials.from_authorized_user_file('tokens/token.json', SCOPES)
    # If there are no (valid) credentials available, let the user log in.
    if not creds or not creds.valid:
        if creds and creds.expired and creds.refresh_token:
            creds.refresh(Request())
        else:
            flow = InstalledAppFlow.from_client_secrets_file(
                'tokens/credentials.json', SCOPES)
            creds = flow.run_local_server(port=0)
        # Save the credentials for the next run
        with open('tokens/token.json', 'w') as token:
            token.write(creds.to_json())

    service = build('sheets', 'v4', credentials=creds)

    # Call the Sheets API
    sheet = service.spreadsheets()
    result = sheet.values().get(spreadsheetId=SAMPLE_SPREADSHEET_ID,
                                range=SAMPLE_RANGE_NAME).execute()
    values = result.get('values', [])

    if not values:
        print('No data found.')
    else:
        print('Name, Major:')
        for row in values:
            # Print columns A and E, which correspond to indices 0 and 4.
            print('%s, %s' % (row[0], row[4]))

if __name__ == '__main__':
    main()

Name, Major:
Alexandra, English
Andrew, Math
Anna, English
Becky, Art
Benjamin, English
Carl, Art
Carrie, English
Dorothy, Math
Dylan, Math
Edward, English
Ellen, Physics
Fiona, Art
John, Physics
Jonathan, Math
Joseph, English
Josephine, Math
Karen, English
Kevin, Physics
Lisa, Art
Mary, Physics
Maureen, Physics
Nick, Art
Olivia, Physics
Pamela, Math
Patrick, Art
Robert, English
Sean, Physics
Stacy, Math
Thomas, Art
Will, Math


In [33]:
# # Google Sheets service endpoint for HTTP requests
# # https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/get
# sheets_url = 'https://sheets.googleapis.com/v4/spreadsheets/'
# sheet_id = '1-WrIRryaOObpAzJNzRKtc0lsdhDhkaVJRT7xyPN4coU'
# requests_url = f"{sheets_url}{sheet_id}?key={gcloud_key}"
# credentials = ['https://www.googleapis.com/auth/drive','https://www.googleapis.com/auth/spreadsheets']
# service = discovery.build('sheets', 'v4')#, credentials=credentials)
# ranges = ['A1:H28']
# include_grid_data = True
# requests = service.spreadsheets().get(
#     spreadsheetId=sheet_id,
#     ranges=ranges,
#     includeGridData=include_grid_data
#     )
# response = request.execute()
# pprint(response)

In [34]:
# r = requests.get(requests_url, params={
#     'client_id': gcloud_oauth_key,
#     'client_secret': gcloud_oauth_secret,
# })
# r.json()

In [35]:
# def request(url):
#     r = requests.get(url)
#     return r.json()
# request(requests_url)

In [36]:
# print(f"{sheets_url}{sheet_id}/values:batchGet?key={gcloud_key}")

In [37]:
# request(f"{sheets_url}{sheet_id}/values:batchGet?key={gcloud_key}")

In [38]:
# def req():
#     r = requests.get(
#         sheets_url,
#         headers = {
#             'Authorization': 'Bearer {gcloud_key}'.format(token=)},
#     )
#     return r.json()

# req()

In [39]:
# https://developers.google.com/sheets/api/reference/rest/v4/ValueRenderOption

## GSpread
[Read and Update Google Spreadsheets with Python!](https://www.analyticsvidhya.com/blog/2020/07/read-and-update-google-spreadsheets-with-python/)

[GSpread User Guide](https://github.com/burnash/gspread/blob/6d01d7b5bb79601d0eece20255eabed9d13aa5bf/docs/user-guide.rst)

[Authentication](https://github.com/burnash/gspread/blob/6d01d7b5bb79601d0eece20255eabed9d13aa5bf/docs/oauth2.rst)


### Methods

1. Opening a Spreadsheet
- a. by title:
    - `sh = gc.open('My poor gym results')`
- b. by key: 
    - `sht1 = gc.open_by_key('0BmgG6nO_6dprdS1MN3d3MkdPa142WFRrdnRRUWl1UFE')`
- c. by URL:
    - `sht2 = gc.open_by_url('https://docs.google.com/spreadsheet/ccc?key=0Bm...FE&hl')`

2. Creating a Spreadsheet
If you're using a `service account`, the new spreadsheet will be visible only to that account. To access a newly created spreadsheet from Google Sheets with your own Google account, you must share it with your email (see "Sharing a Spreadsheet" below).
- `sh = gc.create('A new spreadsheet')`

3. Sharing a Spreadsheet
- `sh.share('email@host.com', perm_type='user', role='writer')`

4. Selecting a Worksheet
- a. by index, starting from zero:
    - `worksheet = sh.get_worksheet(0)`
- b. by title:
    - `worksheet = sh.worksheet("January")`
- c. by the most common case: *Sheet1*:
    - `worksheet = sh.sheet1`
- Get a list of all worksheets:
    - `worksheet_list = sh.worksheets()`

5. Creating a Worksheet
- `worksheet = sh.add_worksheet(title="A worksheet", rows="100", cols="20")`

6. Deleting a Worksheet
- `sh.del_worksheet(worksheet)`

7. Getting a Cell Value
- a. Using A1 notation:
    - `val = worksheet.acell('B1').value`
- b. By row and column coordinates:
    - `val = worksheet.cell(1, 2).value`
- To get a cell formula:
    - `cell = worksheet.acell('B1', value_render_option='FORMULA').value`
    - OR
    - `cell = worksheet.cell(1, 2, value_render_option='FORMULA').value`


### Define Scopes and Add Credentials

In [40]:
# Scopes for Google APIs
scope = [
    'https://www.googleapis.com/auth/drive',
    'https://www.googleapis.com/auth/documents',
    'https://spreadsheets.google.com/feeds',
]

# Account credentials (pass the scopes so the service account is authorized for them)
creds = ServiceAccountCredentials.from_json_keyfile_name('tokens/easy-as-py-service-account.json', scope)

# Authorize the gspread client
client = gspread.authorize(creds)

### Open Spreadsheet and Sheet (Tab)

In [91]:
# Get instance of 'Cheesecake Cost' spreadsheet using its title
costing_sheet = client.open('Cheesecake Cost')
print(costing_sheet)

# Get second sheet from spreadsheet using its index number
costing_instance = costing_sheet.get_worksheet(1)
costing_instance

<Spreadsheet 'Cheesecake Cost' id:1njFFWbnUF0J57hD4iG0p4fVLCgh1jFRMVz6ZALQCYXE>


<Worksheet 'Menu Cost' id:242075997>

### Data Exploration

In [42]:
# Get total number of columns
print(f"Number of columns: {costing_instance.col_count}")

# Get the 'Ingredient_Cost_Total' cell (use `.value` to read its contents)
costing_instance.cell(col=8,row=13)

Number of columns: 26


<Cell R13C8 None>

In [43]:
# Get all records in the data
records_data = costing_instance.get_all_records()

# View data
records_data

[{'Item_Name': 'Cheesecake New York',
  'Menu_Category': 'Dessert',
  'Recipe_Yield': '',
  'Menu_Sale_Price': '',
  'Total_Recipe_Cost': '',
  'Food_Cost_Percentage': '',
  'Individual_Portion_Cost': '',
  'Total_Recipe_Sales': '',
  'Profit_Margin': ''}]

In [44]:
# Convert json to df
records_df = pd.DataFrame.from_dict(records_data)

# View top records
records_df

| | Item_Name | Menu_Category | Recipe_Yield | Menu_Sale_Price | Total_Recipe_Cost | Food_Cost_Percentage | Individual_Portion_Cost | Total_Recipe_Sales | Profit_Margin |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Cheesecake New York | Dessert | | | | | | | |


### Defining Functions

In [45]:
# Function to get instance of spreadsheet
def open_spreadsheet(title):
    return client.open(title)

# Wrapper function to get a specific sheet (tab) from a spreadsheet by its index
def open_sheet(title, tab_num):
    sheet = open_spreadsheet(title)
    return sheet.get_worksheet(tab_num)

# Convert a single column letter (A-Z, either case) to its 1-based column number
def letter_to_num(column):
    number = ord(column.lower()) - 96
    return number
print(letter_to_num('A'))
print(letter_to_num('a'))

# Function to get specific cell
def get_cell(sheet, column, row):
    col = letter_to_num(column)
    return sheet.cell(col=col,row=row)

# Get records data
def get_records(sheet):
    records = sheet.get_all_records()
    return records

# Get a sheet's records as a DataFrame
def get_df(title, tab_num):
    sheet = open_sheet(title, tab_num)
    records = sheet.get_all_records()
    df = pd.DataFrame.from_dict(records)
    return df

1
1
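
`letter_to_num()` handles single letters only, so a column such as `AA` would not convert correctly. A sketch of a multi-letter version in plain Python follows; `column_to_num` is a hypothetical name, and gspread also ships its own A1-notation utilities that may be preferable in practice.

```python
def column_to_num(column):
    """Convert a column label such as 'A', 'Z' or 'AA' to its 1-based column number."""
    label = column.strip().upper()
    if not label:
        raise ValueError("Column label must not be empty")
    number = 0
    for char in label:
        if not 'A' <= char <= 'Z':
            raise ValueError(f"Invalid column label: {column!r}")
        # Treat the label as a base-26 number with digits A=1 ... Z=26
        number = number * 26 + (ord(char) - ord('A') + 1)
    return number

print(column_to_num('A'), column_to_num('h'), column_to_num('AA'))
```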


In [46]:
df = get_df('Menu Engineering', 1)
print(df.head())

menu_engineering = df.drop(df[df['Item_Name'] == ''].index)
menu_engineering.head(10)

    Item_Name Item_Sold Menu_Cat  Sales_Mix_% Menu_Sale_Price_$ Food_Cost_$  \
0  Cheesecake        60  Dessert         48.0               7.0         1.1   
1   Apple Pie        20  Dessert         16.0               6.0        1.25   
2    Tiramisu        30  Dessert         24.0               8.0         3.0   
3     Cupcake        15  Dessert         12.0               4.0        0.75   
4                                         0.0                                 

   CM_$  Total_Sales_$  Total_Costs_$  Total_CM_$  CM_% Sales_Mix_Cat CM_Cat  
0  5.90          420.0          66.00      354.00  54.7             H      H  
1  4.75          120.0          25.00       95.00  14.7             L      H  
2  5.00          240.0          90.00      150.00  23.2             H      H  
3  3.25           60.0          11.25       48.75   7.5             L      L  
4  0.00            0.0           0.00        0.00   0.0           N/A    N/A  


| | Item_Name | Item_Sold | Menu_Cat | Sales_Mix_% | Menu_Sale_Price_$ | Food_Cost_$ | CM_$ | Total_Sales_$ | Total_Costs_$ | Total_CM_$ | CM_% | Sales_Mix_Cat | CM_Cat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Cheesecake | 60 | Dessert | 48.0 | 7.0 | 1.1 | 5.9 | 420.0 | 66.0 | 354.0 | 54.7 | H | H |
| 1 | Apple Pie | 20 | Dessert | 16.0 | 6.0 | 1.25 | 4.75 | 120.0 | 25.0 | 95.0 | 14.7 | L | H |
| 2 | Tiramisu | 30 | Dessert | 24.0 | 8.0 | 3.0 | 5.0 | 240.0 | 90.0 | 150.0 | 23.2 | H | H |
| 3 | Cupcake | 15 | Dessert | 12.0 | 4.0 | 0.75 | 3.25 | 60.0 | 11.25 | 48.75 | 7.5 | L | L |
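
The derived columns in this sheet are computed by spreadsheet formulas that are not shown here. The values are consistent with the usual menu-engineering arithmetic (contribution margin = price minus food cost, shares taken over the column totals), which the sketch below reproduces from the three input columns. The formulas are inferred from the numbers, not taken from the sheet.

```python
# (item, units sold, sale price, food cost) copied from the table above
menu = [
    ('Cheesecake', 60, 7.0, 1.1),
    ('Apple Pie', 20, 6.0, 1.25),
    ('Tiramisu', 30, 8.0, 3.0),
    ('Cupcake', 15, 4.0, 0.75),
]

total_sold = sum(sold for _, sold, _, _ in menu)
menu_rows = []
for item, sold, price, cost in menu:
    cm = price - cost  # contribution margin per unit
    menu_rows.append({
        'Item_Name': item,
        'Sales_Mix_%': round(100 * sold / total_sold, 1),
        'CM_$': round(cm, 2),
        'Total_Sales_$': round(sold * price, 2),
        'Total_Costs_$': round(sold * cost, 2),
        'Total_CM_$': round(sold * cm, 2),
    })

# Each item's share of the total contribution margin
total_cm = sum(row['Total_CM_$'] for row in menu_rows)
for row in menu_rows:
    row['CM_%'] = round(100 * row['Total_CM_$'] / total_cm, 1)
    print(row)
```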


### Test Functions with New Sheet

In [47]:
# Open specified sheet from spreadsheet
engineering_sheet = open_sheet('Menu Engineering', 1)
engineering_sheet

<Worksheet 'Menu Engineering' id:1614991599>

In [48]:
# Get column count and specified cell data
print(f"Number of columns: {engineering_sheet.col_count}")
print(f"{get_cell(engineering_sheet, 'B', 31)}")
print(f"{get_cell(engineering_sheet, 'h', 31)}")

Number of columns: 26
<Cell R31C2 None>
<Cell R31C8 None>


In [49]:
# Get all records in the data
engineering_data = engineering_sheet.get_all_records()
engineering_data

[{'Item_Name': 'Cheesecake',
  'Item_Sold': 60,
  'Menu_Cat': 'Dessert',
  'Sales_Mix_%': 48.0,
  'Menu_Sale_Price_$': 7.0,
  'Food_Cost_$': 1.1,
  'CM_$': 5.9,
  'Total_Sales_$': 420.0,
  'Total_Costs_$': 66.0,
  'Total_CM_$': 354.0,
  'CM_%': 54.7,
  'Sales_Mix_Cat': 'H',
  'CM_Cat': 'H'},
 {'Item_Name': 'Apple Pie',
  'Item_Sold': 20,
  'Menu_Cat': 'Dessert',
  'Sales_Mix_%': 16.0,
  'Menu_Sale_Price_$': 6.0,
  'Food_Cost_$': 1.25,
  'CM_$': 4.75,
  'Total_Sales_$': 120.0,
  'Total_Costs_$': 25.0,
  'Total_CM_$': 95.0,
  'CM_%': 14.7,
  'Sales_Mix_Cat': 'L',
  'CM_Cat': 'H'},
 {'Item_Name': 'Tiramisu',
  'Item_Sold': 30,
  'Menu_Cat': 'Dessert',
  'Sales_Mix_%': 24.0,
  'Menu_Sale_Price_$': 8.0,
  'Food_Cost_$': 3.0,
  'CM_$': 5.0,
  'Total_Sales_$': 240.0,
  'Total_Costs_$': 90.0,
  'Total_CM_$': 150.0,
  'CM_%': 23.2,
  'Sales_Mix_Cat': 'H',
  'CM_Cat': 'H'},
 {'Item_Name': 'Cupcake',
  'Item_Sold': 15,
  'Menu_Cat': 'Dessert',
  'Sales_Mix_%': 12.0,
  'Menu_Sale_Price_$': 4.0,
  

In [50]:
get_cell(engineering_sheet, 'a', 6)

<Cell R6C1 None>

In [51]:
engineering_df = pd.DataFrame.from_dict(engineering_data)
# engineering_df.replace('', float('NaN'), inplace=True)
# engineering_df.dropna(subset=['Item_Name'], inplace=True)
engineering_df.drop(engineering_df[engineering_df['Item_Name'] == ''].index, inplace=True)
engineering_df

| | Item_Name | Item_Sold | Menu_Cat | Sales_Mix_% | Menu_Sale_Price_$ | Food_Cost_$ | CM_$ | Total_Sales_$ | Total_Costs_$ | Total_CM_$ | CM_% | Sales_Mix_Cat | CM_Cat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Cheesecake | 60 | Dessert | 48.0 | 7.0 | 1.1 | 5.9 | 420.0 | 66.0 | 354.0 | 54.7 | H | H |
| 1 | Apple Pie | 20 | Dessert | 16.0 | 6.0 | 1.25 | 4.75 | 120.0 | 25.0 | 95.0 | 14.7 | L | H |
| 2 | Tiramisu | 30 | Dessert | 24.0 | 8.0 | 3.0 | 5.0 | 240.0 | 90.0 | 150.0 | 23.2 | H | H |
| 3 | Cupcake | 15 | Dessert | 12.0 | 4.0 | 0.75 | 3.25 | 60.0 | 11.25 | 48.75 | 7.5 | L | L |


### Clean Up DataFrame

In [52]:
def clean_df(df):
    df.drop(df[df['Item_Name'] == ''].index, inplace=True)
    return df

In [53]:
clean = clean_df(pd.DataFrame.from_dict(engineering_data))
clean

| | Item_Name | Item_Sold | Menu_Cat | Sales_Mix_% | Menu_Sale_Price_$ | Food_Cost_$ | CM_$ | Total_Sales_$ | Total_Costs_$ | Total_CM_$ | CM_% | Sales_Mix_Cat | CM_Cat |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Cheesecake | 60 | Dessert | 48.0 | 7.0 | 1.1 | 5.9 | 420.0 | 66.0 | 354.0 | 54.7 | H | H |
| 1 | Apple Pie | 20 | Dessert | 16.0 | 6.0 | 1.25 | 4.75 | 120.0 | 25.0 | 95.0 | 14.7 | L | H |
| 2 | Tiramisu | 30 | Dessert | 24.0 | 8.0 | 3.0 | 5.0 | 240.0 | 90.0 | 150.0 | 23.2 | H | H |
| 3 | Cupcake | 15 | Dessert | 12.0 | 4.0 | 0.75 | 3.25 | 60.0 | 11.25 | 48.75 | 7.5 | L | L |


In [54]:
clean.dtypes

Item_Name             object
Item_Sold             object
Menu_Cat              object
Sales_Mix_%          float64
Menu_Sale_Price_$     object
Food_Cost_$           object
CM_$                 float64
Total_Sales_$        float64
Total_Costs_$        float64
Total_CM_$           float64
CM_%                 float64
Sales_Mix_Cat         object
CM_Cat                object
dtype: object

In [55]:
clean.iloc[[1]].sum()

Item_Name            Apple Pie
Item_Sold                   20
Menu_Cat               Dessert
Sales_Mix_%               16.0
Menu_Sale_Price_$          6.0
Food_Cost_$               1.25
CM_$                      4.75
Total_Sales_$            120.0
Total_Costs_$             25.0
Total_CM_$                95.0
CM_%                      14.7
Sales_Mix_Cat                L
CM_Cat                       H
dtype: object

In [56]:
x = 5
def change_x():
    x = 1
    return x
print(x)
print(change_x())
print(x)
print()

y = 6
def change_y():
    global y
    y = 10
    return y
print(y)
print(change_y())
print(y)

5
1
5

6
10
10


### Order Sheet

In [57]:
order_sheet = open_spreadsheet('Order Sheet')

In [58]:
order_sheet1 = open_sheet('Order Sheet', 0)
order_sheet_records = get_records(order_sheet1)
order_sheet_records

[{'Item': 'Asparagus, 11lb',
  'Par': '3 M & Th',
  'Order Unit': 'cse/11ct',
  'Count Unit': 'ea',
  'Main Loca- tion': 3,
  'Walk- In': '',
  'Kitchen': '',
  'Dry Storage': '',
  'In House': 0,
  'Order': '',
  'Order From:': 'Reinhart',
  'Gordon': '',
  'Wainer': 52.25,
  'Rochs': 32.0,
  '': '',
  'Sysco': 25.55,
  'Reinhart': 23.97,
  'PFG': 25.61,
  'Best Price': 23.97,
  'Inventory Value': 0.0},
 {'Item': 'Arugula',
  'Par': '',
  'Order Unit': '3 lb cse',
  'Count Unit': '',
  'Main Loca- tion': '',
  'Walk- In': '',
  'Kitchen': '',
  'Dry Storage': '',
  'In House': 0,
  'Order': '',
  'Order From:': 'PFG',
  'Gordon': '',
  'Wainer': 19.96,
  'Rochs': 20.67,
  '': '',
  'Sysco': 22.25,
  'Reinhart': '',
  'PFG': 18.36,
  'Best Price': 18.36,
  'Inventory Value': 0.0},
 {'Item': 'Apples,granny smith',
  'Par': '',
  'Order Unit': 'ea',
  'Count Unit': 'ea',
  'Main Loca- tion': '',
  'Walk- In': '',
  'Kitchen': '',
  'Dry Storage': '',
  'In House': 0,
  'Order': '.',
  'O

In [92]:
# Create df for 'Order Sheet' with select columns
df = get_df('Order Sheet', 0)[['Item','Order Unit','Best Price']]

# Keep rows where the 'Best Price' IS NOT equal to 0
df = df[df['Best Price'] != 0]
order_sheet_df = df.rename(columns={
    'Item':'Ingredient_Name',
    'Order Unit':'Purchase_Unit',
    'Best Price':'Purchase_Price_$'
})
order_sheet_df

Unnamed: 0,Ingredient_Name,Purchase_Unit,Purchase_Price_$
0,"Asparagus, 11lb",cse/11ct,23.97
1,Arugula,3 lb cse,18.36
2,"Apples,granny smith",ea,0.89
3,"Avocado, Hass, Ripe",ea,1.60
4,Bananas ( 3x per week ),lb,0.67
...,...,...,...
75,"Tomatoes, 6x6",cs,46.86
76,"Tomatoes, Grape",cs/12/ea,20.86
77,"Tomatoes, Plum",cs,38.31
78,"Tomatoes, Tomatillo",lb,2.58
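
In the records shown above, `Best Price` and `Order From:` look like the lowest vendor quote and the vendor who offered it. A minimal sketch of that lookup on a single record follows. The `VENDORS` list and the `best_quote` name are assumptions for illustration; the sheet itself may compute these columns differently.

```python
# Assumption: these column names are the vendor quote columns in the sheet
VENDORS = ['Gordon', 'Wainer', 'Rochs', 'Sysco', 'Reinhart', 'PFG']


def best_quote(record, vendors=VENDORS):
    """Return (vendor, price) for the lowest numeric quote, or (None, None)."""
    quotes = {
        vendor: record[vendor]
        for vendor in vendors
        # Blank cells come back from the sheet as '' and are skipped
        if isinstance(record.get(vendor), (int, float))
    }
    if not quotes:
        return None, None
    vendor = min(quotes, key=quotes.get)
    return vendor, quotes[vendor]


asparagus = {'Item': 'Asparagus, 11lb', 'Gordon': '', 'Wainer': 52.25, 'Rochs': 32.0,
             'Sysco': 25.55, 'Reinhart': 23.97, 'PFG': 25.61}
arugula = {'Item': 'Arugula', 'Gordon': '', 'Wainer': 19.96, 'Rochs': 20.67,
           'Sysco': 22.25, 'Reinhart': '', 'PFG': 18.36}
print(best_quote(asparagus))
print(best_quote(arugula))
```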


# MS Word Docx

## [Python Docx Documentation](https://python-docx.readthedocs.io/en/latest/api/document.html)

In [60]:
# ## Endpoints
# docs_service = build('docs', 'v1', credentials=creds)
# edit_document(docs_service, )
# cheesecake_recipe_id = '1hGnXM62uyXAiFAYRr4NB5Mel-zciQ1ojCJnxV_D3dUY'

# docs_service.documents().get(documentId=cheesecake_recipe_id).execute()

In [61]:
# Python-Docx Library
import docx
from docx import Document
from docx.shared import Inches

# Open blank docx
document = Document()

# Add paragraph
paragraph = document.add_paragraph('This is a paragraph.')

# Insert a paragraph above a specific `paragraph`
prior_paragraph = paragraph.insert_paragraph_before('This paragraph goes above `paragraph`.')

# Add a heading (by default a top-level heading, or "Heading 1")
document.add_heading('The REAL meaning of the universe')

# Add a heading for a sub-section
# `level=0` adds a "Title"
document.add_heading('The role of dolphins', level=2)

# Add a page break
document.add_page_break()

# Apply a style when creating a paragraph (use the style name, 'List Bullet', not the style id)
document.add_paragraph('This is a bulleted list.', style='List Bullet')
# or
paragraph = document.add_paragraph('This is a bullet list.')
paragraph.style = 'List Bullet'

# Styles
styles = document.styles
styles

<docx.styles.styles.Styles at 0x29c9e12af48>

## [How to Extract Tabular Data from Doc files Using Python?](https://www.analyticsvidhya.com/blog/2021/09/how-to-extract-tabular-data-from-doc-files-using-python/)

In [62]:
docx_text = docx.Document('recipes/cheesecake_recipe.docx')
data = {}
paragraphs = docx_text.paragraphs
for i in range(2, len(docx_text.paragraphs)):
    data[i] = docx_text.paragraphs[i].text.split('\t')
data_values = list(data.values())
data_values

[['Cream cheese', '20 oz'],
 ['Granulated sugar', '17.5 oz'],
 ['Sour cream', '4 oz'],
 ['All purpose flour', '3 Tbsp'],
 ['Eggs', '5 each'],
 ['Egg yolks', '2 each'],
 ['Vanilla extract', '3 tsp'],
 ['Graham cracker crumbs', '6.75 oz'],
 ['Unsalted butter', '6 oz'],
 [''],
 ['Method of Prep'],
 ['Mix it.'],
 ['Bake it.'],
 ['Cool it.'],
 ['Eat it.']]

In [63]:
# Rows without a tab-separated quantity (blank lines, the "Method of Prep" heading and its steps)
# get None in 'Recipe_Qty', so dropna() removes them
doc_df = pd.DataFrame(data_values,columns=['Ingredient_Name','Recipe_Qty']).dropna()
doc_df

Unnamed: 0,Ingredient_Name,Recipe_Qty
0,Cream cheese,20 oz
1,Granulated sugar,17.5 oz
2,Sour cream,4 oz
3,All purpose flour,3 Tbsp
4,Eggs,5 each
5,Egg yolks,2 each
6,Vanilla extract,3 tsp
7,Graham cracker crumbs,6.75 oz
8,Unsalted butter,6 oz


### Function

In [64]:
def recipe_from_docx(docx_file):
    """Return each paragraph of the recipe as a list split on tabs."""
    paragraphs = docx.Document(docx_file).paragraphs
    # Skip the first two paragraphs (recipe title and "Ingredients" heading)
    return [paragraph.text.split('\t') for paragraph in paragraphs[2:]]

#### Tests

In [65]:
recipe_book = recipe_from_docx('recipe_book.docx')
recipe_book

[['Cream cheese', '20 oz'],
 ['Sugar, granulated', '17.5 oz'],
 ['Sour cream', '4 oz'],
 ['All purpose flour', '3 Tbsp'],
 ['Eggs', '5 each'],
 ['Egg yolks', '2 each'],
 ['Vanilla extract', '3 tsp'],
 ['Graham cracker crumbs', '6.75 oz'],
 ['Butter, unsalted', '6 oz'],
 ['Method of Prep'],
 ['Mix it.'],
 ['Bake it.'],
 ['Cool it.'],
 ['Eat it.'],
 ['\n'],
 ['Chicken Parm Meatballs'],
 ['Ingredients'],
 ['Ground chicken', '18 packs'],
 ['Eggs', '18 ea'],
 ['Panko', '14 c'],
 ['White onion (fine diced)', '4 ea'],
 ['Garlic (chopped)', '8 Tbsp'],
 ['Romano (grated)', '1 Qt'],
 ['Marinara', '2 c'],
 ['Tri-mix', '12 tsp'],
 ['Crushed red pepper', '3 Tbsp'],
 ['Dry thyme', '4 Tbsp'],
 ['Fresh parsley (chopped)', '15 Tbsp'],
 ['Fresh basil (chopped)', '5 Tbsp'],
 ['Method of Prep'],
 ['Mix all together.'],
 ['\n'],
 ['Basil Ricotta'],
 ['Ingredients'],
 ['Ricotta', '2 lbs'],
 ['Fresh basil', '1 c'],
 ['Lemon (juice and zest)', '1 ea'],
 ['Ground nutmeg', '1 tsp'],
 ['Tri-mix', '2 tsp'],
 ['EV

In [83]:
def clean_df(df):
    df.drop(df[df['Item_Name'] == ''].index, inplace=True)
    return df

clean_recipe_book = [x for x in recipe_book]# if x != '\n']
frame = pd.DataFrame(clean_recipe_book,columns=['Ingredient_Name','Recipe_Qty']).dropna()
frame

Unnamed: 0,Ingredient_Name,Recipe_Qty
0,Cream cheese,20 oz
1,"Sugar, granulated",17.5 oz
2,Sour cream,4 oz
3,All purpose flour,3 Tbsp
4,Eggs,5 each
5,Egg yolks,2 each
6,Vanilla extract,3 tsp
7,Graham cracker crumbs,6.75 oz
8,"Butter, unsalted",6 oz
17,Ground chicken,18 packs
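
The flattened frame above no longer records which recipe each ingredient belongs to. A rough sketch of one way to regroup the nested list, assuming (as in the `recipe_book` output) that each later recipe is introduced by a `['\n']` row followed by its title, and that ingredient rows are the two-field rows before "Method of Prep". The name `split_recipe_book` is hypothetical, and the first recipe's title has to be passed in because `recipe_from_docx` skips the first two paragraphs.

```python
def split_recipe_book(rows, first_recipe):
    """Group [name, qty] rows under the recipe title that precedes them."""
    recipes = {first_recipe: []}
    current = first_recipe
    expect_title = False
    in_method = False
    for row in rows:
        if row == ['\n']:
            # A lone newline paragraph separates two recipes
            expect_title = True
        elif expect_title:
            current = row[0]
            recipes[current] = []
            expect_title = False
            in_method = False
        elif row == ['Method of Prep']:
            in_method = True
        elif len(row) == 2 and not in_method:
            recipes[current].append(row)
    return recipes


sample_rows = [
    ['Cream cheese', '20 oz'],
    ['Method of Prep'],
    ['Mix it.'],
    ['\n'],
    ['Chicken Parm Meatballs'],
    ['Ingredients'],
    ['Ground chicken', '18 packs'],
    ['Eggs', '18 ea'],
    ['Method of Prep'],
    ['Mix all together.'],
    ['\n'],
    ['Basil Ricotta'],
    ['Ingredients'],
    ['Ricotta', '2 lbs'],
]
grouped = split_recipe_book(sample_rows, first_recipe='Cheesecake')
print(grouped)
```

Each value in `grouped` has the same shape as the nested lists used elsewhere in this notebook, so it could be passed on to a DataFrame constructor one recipe at a time.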


In [67]:
book = Document('recipe_book.docx')
sections = book.sections
print(len(sections))

1


In [81]:
def iterate_document_sections(document):
    """Generate a sequence of paragraphs for each headed section in document.

    Each generated sequence has a heading paragraph in its first position, 
    followed by one or more body paragraphs.
    """
    paragraphs = [document.paragraphs[0]]
    for paragraph in document.paragraphs[1:]:
        if is_heading(paragraph):
            yield paragraphs
            paragraphs = [paragraph]
            continue
        paragraphs.append(paragraph)
    yield paragraphs

split_book = iterate_document_sections(book)
split_book

<generator object iterate_document_sections at 0x0000029CA129B1C8>
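
`iterate_document_sections` calls an `is_heading` helper that is not defined in this section; the generator is lazy, so the missing name only surfaces once it is iterated. Below is a sketch of one possible helper, assuming recipe titles use Word's built-in "Heading N" or "Title" styles (the recipe book may not). It runs on stand-in objects that mimic the `.text` and `.style.name` attributes of python-docx paragraphs, so `make_paragraph` and `split_sections` are illustrative names only.

```python
from types import SimpleNamespace


def is_heading(paragraph):
    # Assumption: titles use Word's built-in "Heading N" or "Title" styles
    name = paragraph.style.name
    return name.startswith('Heading') or name == 'Title'


def make_paragraph(text, style_name='Normal'):
    # Stand-in exposing the same .text and .style.name attributes as a docx paragraph
    return SimpleNamespace(text=text, style=SimpleNamespace(name=style_name))


def split_sections(paragraphs):
    # Same logic as iterate_document_sections, applied to a plain list
    current = [paragraphs[0]]
    for paragraph in paragraphs[1:]:
        if is_heading(paragraph):
            yield current
            current = [paragraph]
            continue
        current.append(paragraph)
    yield current


demo = [
    make_paragraph('Cheesecake', 'Heading 1'),
    make_paragraph('Cream cheese\t20 oz'),
    make_paragraph('Basil Ricotta', 'Heading 1'),
    make_paragraph('Ricotta\t2 lbs'),
]
sections = [[p.text for p in section] for section in split_sections(demo)]
print(sections)
```

Wrapping the generator in `list(...)` or a `for` loop is what actually produces the sections; printing the generator object alone does not.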

#### [Reading and Writing MS Word Files in Python via Python-Docx Module](https://stackabuse.com/reading-and-writing-ms-word-files-in-python-via-python-docx-module/)

In [69]:
all_paras = book.paragraphs
print(f"Lines in recipe book: {len(all_paras)}")
for para in all_paras:
    print(para.text)
    print('------')

Lines in recipe book: 61
Cheesecake
------
Ingredients
------
Cream cheese	20 oz
------
Sugar, granulated	17.5 oz
------
Sour cream	4 oz
------
All purpose flour	3 Tbsp
------
Eggs	5 each
------
Egg yolks	2 each
------
Vanilla extract	3 tsp
------
Graham cracker crumbs	6.75 oz
------
Butter, unsalted	6 oz
------
Method of Prep
------
Mix it.
------
Bake it.
------
Cool it.
------
Eat it.
------


------
Chicken Parm Meatballs
------
Ingredients
------
Ground chicken	18 packs
------
Eggs	18 ea
------
Panko	14 c
------
White onion (fine diced)	4 ea
------
Garlic (chopped)	8 Tbsp
------
Romano (grated)	1 Qt
------
Marinara	2 c
------
Tri-mix	12 tsp
------
Crushed red pepper	3 Tbsp
------
Dry thyme	4 Tbsp
------
Fresh parsley (chopped)	15 Tbsp
------
Fresh basil (chopped)	5 Tbsp
------
Method of Prep
------
Mix all together.
------


------
Basil Ricotta
------
Ingredients
------
Ricotta	2 lbs
------
Fresh basil	1 c
------
Lemon (juice and zest)	1 ea
------
Ground nutmeg	1 tsp
------
Tri-m

In [70]:
single_para = book.paragraphs[2]
print(single_para.text)

Cream cheese	20 oz


In [71]:
book.styles

<docx.styles.styles.Styles at 0x29c9e2d0388>

### re.sub function to ignore text in brackets []

In [73]:
import re
def ignore_brackets(x):
    # `x` is the string to check for bracketed data
    # Empties the contents of (...) and [...] but keeps the brackets themselves
    return re.sub(r"([\(\[]).*?([\)\]])", r"\g<1>\g<2>", x)

# ignore_brackets(elem)
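
The pattern in `ignore_brackets` keeps the (now empty) brackets in the string. If the goal is a clean ingredient name for matching, a variant that drops the brackets and the space before them may be more useful. This is a sketch only; `strip_bracketed` is a hypothetical name.

```python
import re


def strip_bracketed(text):
    # Remove "(...)" or "[...]" groups along with any whitespace before them
    return re.sub(r"\s*[\(\[].*?[\)\]]", "", text).strip()


# The original pattern leaves the empty brackets behind
kept = re.sub(r"([\(\[]).*?([\)\]])", r"\g<1>\g<2>", 'Garlic (chopped)')
print(kept)                                          # Garlic ()
print(strip_bracketed('White onion (fine diced)'))   # White onion
print(strip_bracketed('Tri-mix'))                    # Tri-mix
```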

In [74]:
# `row` avoids shadowing the built-in `tuple`
for i, row in enumerate(recipe_book):
    elem_one = row[0]
    # elem_two = row[1]     # 'list index out of range' if "Qty" column empty
    print(elem_one)#, elem_two)

Cream cheese
Sugar, granulated
Sour cream
All purpose flour
Eggs
Egg yolks
Vanilla extract
Graham cracker crumbs
Butter, unsalted
Method of Prep
Mix it.
Bake it.
Cool it.
Eat it.


Chicken Parm Meatballs
Ingredients
Ground chicken
Eggs
Panko
White onion (fine diced)
Garlic (chopped)
Romano (grated)
Marinara
Tri-mix
Crushed red pepper
Dry thyme
Fresh parsley (chopped)
Fresh basil (chopped)
Method of Prep
Mix all together.


Basil Ricotta
Ingredients
Ricotta
Fresh basil
Lemon (juice and zest)
Ground nutmeg
Tri-mix
EVOO
Heavy cream
Method of Prep
Mix all together.


Marinara Sauce
Ingredients
Blended Oil
Garlic (chopped)
White onion (pureed)
Red pepper flakes
Crushed plum tomatoes
Tomato paste
Tri-mix
Sugar, granulated
Method of Prep
Saute onion and garlic in oil until just before browning.
Add tomatoes, tomato paste, spices, and sugar.
Simmer for 45 min, stirring occasionally to keep from burning or sticking.
Cool with cooling stick and transfer into 22 Qt Cambro.


### Get recipe docx files from directory

In [76]:
from os import listdir
from os.path import isfile, join
recipe_files = [f for f in listdir('./recipes/') if isfile(join('./recipes/', f))]
print(recipe_files, '\n')

# for i in recipe_files:
#     print(i)

# recipe_from_docx(f"recipes/{recipe_files[0]}")

for i in recipe_files:
    print(i)
    print(recipe_from_docx(f"./recipes/{i}"), '\n')

['basil_ricotta_recipe.docx', 'cheesecake_recipe.docx', 'chicken_parm_meatballs_recipe.docx', 'marinara_recipe.docx'] 

basil_ricotta_recipe.docx
[['Ricotta', '2 lbs'], ['Fresh basil', '1 c'], ['Lemon (juice and zest)', '1 ea'], ['Ground nutmeg', '1 tsp'], ['Tri-mix', '2 tsp'], ['EVOO', '⅛ c'], ['Heavy cream', '¼ c'], ['Method of Prep'], ['Mix all together.'], ['']] 

cheesecake_recipe.docx
[['Cream cheese', '20 oz'], ['Granulated sugar', '17.5 oz'], ['Sour cream', '4 oz'], ['All purpose flour', '3 Tbsp'], ['Eggs', '5 each'], ['Egg yolks', '2 each'], ['Vanilla extract', '3 tsp'], ['Graham cracker crumbs', '6.75 oz'], ['Unsalted butter', '6 oz'], [''], ['Method of Prep'], ['Mix it.'], ['Bake it.'], ['Cool it.'], ['Eat it.']] 

chicken_parm_meatballs_recipe.docx
[['Ground chicken', '18 packs'], ['Eggs', '18 ea'], ['Panko', '14 c'], ['White onion (fine diced)', '4 ea'], ['Garlic (chopped)', '8 Tbsp'], ['Romano (grated)', '1 Qt'], ['Marinara', '2 c'], ['Tri-mix', '12 tsp'], ['Crushed red p

#### Function

In [86]:
# from os import listdir
# from os.path import isfile, join

def get_recipes():
    recipe_files = [f for f in listdir('./recipes/') if isfile(join('./recipes/', f))]
    for i in recipe_files:
        print(i)
        print(recipe_from_docx(f"./recipes/{i}"), '\n')

get_recipes()

basil_ricotta_recipe.docx
[['Ricotta', '2 lbs'], ['Fresh basil', '1 c'], ['Lemon (juice and zest)', '1 ea'], ['Ground nutmeg', '1 tsp'], ['Tri-mix', '2 tsp'], ['EVOO', '⅛ c'], ['Heavy cream', '¼ c'], ['Method of Prep'], ['Mix all together.'], ['']] 

cheesecake_recipe.docx
[['Cream cheese', '20 oz'], ['Granulated sugar', '17.5 oz'], ['Sour cream', '4 oz'], ['All purpose flour', '3 Tbsp'], ['Eggs', '5 each'], ['Egg yolks', '2 each'], ['Vanilla extract', '3 tsp'], ['Graham cracker crumbs', '6.75 oz'], ['Unsalted butter', '6 oz'], [''], ['Method of Prep'], ['Mix it.'], ['Bake it.'], ['Cool it.'], ['Eat it.']] 

chicken_parm_meatballs_recipe.docx
[['Ground chicken', '18 packs'], ['Eggs', '18 ea'], ['Panko', '14 c'], ['White onion (fine diced)', '4 ea'], ['Garlic (chopped)', '8 Tbsp'], ['Romano (grated)', '1 Qt'], ['Marinara', '2 c'], ['Tri-mix', '12 tsp'], ['Crushed red pepper', '3 Tbsp'], ['Dry thyme', '4 Tbsp'], ['Fresh parsley (chopped)', '15 Tbsp'], ['Fresh basil (chopped)', '5 Tbsp'],
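
`get_recipes()` prints each parsed recipe but returns nothing, so the results cannot be reused. One possible alternative, sketched below, returns a dict keyed by file name. `load_recipes` is a hypothetical name, and the parser is passed in as an argument so the sketch does not depend on python-docx being installed.

```python
from pathlib import Path


def load_recipes(folder, parser):
    """Map each .docx file stem in `folder` to whatever `parser` returns for it."""
    recipes = {}
    for path in sorted(Path(folder).glob('*.docx')):
        # Pass a plain string path, which is what recipe_from_docx receives elsewhere
        recipes[path.stem] = parser(str(path))
    return recipes


# Usage in the notebook (assumes recipe_from_docx is defined as above):
# recipes = load_recipes('./recipes/', recipe_from_docx)
# recipes['basil_ricotta_recipe']
```

Filtering on the `.docx` suffix also skips any other files that end up in the folder.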

### Create DataFrame from nested list of ingredients

In [84]:
basil_ricotta_recipe = recipe_from_docx(f"./recipes/{recipe_files[0]}")
basil_ricotta_recipe

[['Ricotta', '2 lbs'],
 ['Fresh basil', '1 c'],
 ['Lemon (juice and zest)', '1 ea'],
 ['Ground nutmeg', '1 tsp'],
 ['Tri-mix', '2 tsp'],
 ['EVOO', '⅛ c'],
 ['Heavy cream', '¼ c'],
 ['Method of Prep'],
 ['Mix all together.'],
 ['']]

In [85]:
df = pd.DataFrame(basil_ricotta_recipe, columns=['Ingredient_Name','Recipe_Qty']).dropna()
# df['Recipe_Qty'], df['Recipe_Unit'] = df['Recipe_Qty'].str.split(' ', 1).str
# df['Recipe_Qty'].str.split(' ', n=1, expand=True)
# Pass `n` by keyword: newer pandas versions reject it as a positional argument
df = df.join(df['Recipe_Qty'].str.split(' ', n=1, expand=True)).drop(columns=['Recipe_Qty']).rename(columns={0:'Recipe_Qty',1:'Recipe_Unit'})
df

Unnamed: 0,Ingredient_Name,Recipe_Qty,Recipe_Unit
0,Ricotta,2,lbs
1,Fresh basil,1,c
2,Lemon (juice and zest),1,ea
3,Ground nutmeg,1,tsp
4,Tri-mix,2,tsp
5,EVOO,⅛,c
6,Heavy cream,¼,c


#### Function

In [89]:
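def recipe_to_df(recipe_name):
    # `recipe_name` is the nested list returned by recipe_from_docx()
    df = pd.DataFrame(recipe_name, columns=['Ingredient_Name','Recipe_Qty']).dropna()
    # Split e.g. '2 lbs' into quantity '2' and unit 'lbs' at the first space
    df = df.join(df['Recipe_Qty'].str.split(' ', n=1, expand=True)).drop(columns=['Recipe_Qty']).rename(columns={0:'Recipe_Qty',1:'Recipe_Unit'})
    return df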

In [90]:
recipe_to_df(basil_ricotta_recipe)

Unnamed: 0,Ingredient_Name,Recipe_Qty,Recipe_Unit
0,Ricotta,2,lbs
1,Fresh basil,1,c
2,Lemon (juice and zest),1,ea
3,Ground nutmeg,1,tsp
4,Tri-mix,2,tsp
5,EVOO,⅛,c
6,Heavy cream,¼,c


# Class


# Product List CSV

# Recipe Cost CSV


# News API
An API for searching articles and breaking news headlines from around the world, including sources in languages other than English.