# Data Collection for Impossible Foods

Summary: I will extract the tables from the survey PDF at https://foodinsight.org/wp-content/uploads/2020/01/IFIC-Plant-Alternative-to-Animal-Meat-Survey.pdf and load each table into its own pandas dataframe. Then I will use an API to pull nutritional information about fake meat, so that what people believe about the healthiness of Impossible meat can be compared with the nutrition facts of Impossible meat.

## PART 1 

In [49]:
# tabula-py can read tables in a PDF
!pip install tabula-py



In [1]:
# import the packages we need
# pandas converts the finalized data to a dataframe
import pandas as pd

# regular expressions, for matching text patterns that are not exact
import re

# random generates the random number of seconds to sleep between server requests
import random

# time provides the sleep function used to pause between requests
import time

# json loads the API response into a python dictionary
import json

# create_engine connects to the database; there are other ways, but we are using this one
from sqlalchemy import create_engine

# tabula reads tables from a PDF and converts them into a pandas DataFrame
import tabula

# requests makes HTTP requests, i.e. fetches a URL much like a browser does
import requests

In [2]:
# file that I will be reading / scraping data tables from
file = "https://foodinsight.org/wp-content/uploads/2020/01/IFIC-Plant-Alternative-to-Animal-Meat-Survey.pdf"

#### Dataframe 1: When thinking about all the food and beverages you consume, please select the statement that best describes you.

In [3]:
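# reading the table on page 16 of the PDF
# I had trouble reading more than one page of tables at a time,
# so I begin with just 1 table and will automate the process in part 3 of this assignment
# note: newer tabula-py releases return a list of DataFrames from read_pdf,
# in which case the table has to be taken out of the list (e.g. with [0])
diets_df = tabula.read_pdf(file, pages=[16], lattice=True, stream=True)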

In [4]:
# viewing the table read to check for what needs to be edited
diets_df

Unnamed: 0,"Diet Types\rQ1. When thinking about all the food and beverages you consume, please select the statement that best describes you.",Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6
0,,Omnivore,Vegetarian,Pescatarian,Vegan,Some Vegetarian,Other
1,Total,66%,6%,5%,5%,8%,11%
2,Men,59% ↓,7%,9% ↑,6%,9%,11%
3,Women,73% ↑,5%,2% ↓,4%,6%,11%
4,White,67%,6%,4%,4%,8%,10%
5,African American,66%,3%,5%,8%,2%,16%
6,Hispanic/ Latinx,65%,7%,9%,4%,11%,4% ↓
7,Under 45,57% ↓,8% ↑,6%,8% ↑,11% ↑,10%
8,45-64,69%,5%,4%,3%,6%,13%
9,65+,83% ↑,2% ↓,3%,0% ↓,3%,9%


In [9]:
# double checking it is viewed as a dataframe
type(diets_df)

pandas.core.frame.DataFrame

In [10]:
# setting the column names
diets_df.columns = ['Demographics','Omnivore','Vegetarian','Pescatarian','Vegan','Some Vegetarian','Other']

In [11]:
# dropping the repeated header row (row 0)
diets_df = diets_df.drop(0)

# keeping only the first 15 rows, which removes any excess rows
diets_df = diets_df[0:15]

In [12]:
# set index as demographics column
diets_df = diets_df.set_index(['Demographics'])

In [13]:
# checking dataframe for more edits
diets_df

Unnamed: 0_level_0,Omnivore,Vegetarian,Pescatarian,Vegan,Some Vegetarian,Other
Demographics,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1
Total,66%,6%,5%,5%,8%,11%
Men,59% ↓,7%,9% ↑,6%,9%,11%
Women,73% ↑,5%,2% ↓,4%,6%,11%
White,67%,6%,4%,4%,8%,10%
African American,66%,3%,5%,8%,2%,16%
Hispanic/ Latinx,65%,7%,9%,4%,11%,4% ↓
Under 45,57% ↓,8% ↑,6%,8% ↑,11% ↑,10%
45-64,69%,5%,4%,3%,6%,13%
65+,83% ↑,2% ↓,3%,0% ↓,3%,9%
Northeast,63%,5%,3%,8% ↑,10%,10%
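
The three cleaning steps above (rename the columns, drop the repeated header row, index by demographics) are repeated for every table in Part 1. A minimal sketch of a reusable helper is shown here. The function name `clean_survey_table` and the tiny `raw_example` table are made up for illustration; the real input would be a table returned by `tabula.read_pdf`.

```python
import pandas as pd


def clean_survey_table(raw, column_names, index_col="Demographics"):
    """Apply the rename / drop-header / set-index steps to one raw table."""
    cleaned = raw.copy()  # leave the raw table untouched
    cleaned.columns = column_names  # replace the "Unnamed: n" headers
    cleaned = cleaned.drop(index=cleaned.index[0])  # first row repeats the header
    return cleaned.set_index(index_col)


# hypothetical stand-in for a table read from the PDF
raw_example = pd.DataFrame(
    [
        [None, "Omnivore", "Vegetarian"],
        ["Total", "66%", "6%"],
        ["Men", "59% ↓", "7%"],
    ],
    columns=["Diet Types", "Unnamed: 1", "Unnamed: 2"],
)

example_df = clean_survey_table(
    raw_example, ["Demographics", "Omnivore", "Vegetarian"]
)
print(example_df)
```

With a helper like this, each table needs only its own list of column names, which should make the automation planned for part 3 easier.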


#### Dataframe 2: Why did you decide to eat a plant alternative to animal meat? (Select all that apply).

In [106]:
# importing pdf file and reading table
reasoning_df = tabula.read_pdf(file, pages=[18],lattice=True, stream=True)

In [107]:
# checking dataframe for edits
reasoning_df

Unnamed: 0,Reasons for consuming a plant alternative to animal meat\rQ3. Why did you decide to eat a plant alternative to animal meat? Please select all that apply.,Unnamed: 1,Unnamed: 2,Unnamed: 3,Unnamed: 4,Unnamed: 5,Unnamed: 6,Unnamed: 7
0,,Thought it would\rtaste good,Trying to eat less\rmeat,Believe plants are\rbetter for enviro,Believe plants are\rhealthier,Encouraged to try\rby friend/family,Heard about and\rcurious,Ingredients\rintrigued me
1,Total,26%,27%,27%,24%,20%,30%,23%
2,Men,22%,27%,29%,23%,20%,29%,20%
3,Women,32%,27%,25%,25%,20%,31%,25%
4,White,26%,30%,26%,26%,20%,30%,23%
5,African American,37%,15%,19%,9%,24%,27%,29%
6,Hispanic/ Latinx,22%,25%,34%,26%,26%,33%,14%
7,Under 45,29%,25%,29%,24%,23%,30%,23%
8,45-64,28%,33%,26%,25%,17%,28%,22%
9,65+,7%,25%,23%,25%,13%,34%,23%


In [108]:
# setting column heads
reasoning_df.columns = ['Demographics','Thought it would taste good','Trying to eat less meat','Believe plants are better for environment','Believe plants are healthier','Encouraged to try by friend/family','Heard about and curious','Ingredients intrigued me']


In [109]:
# dropping the repeated header row (row 0)
reasoning_df = reasoning_df.drop(0)

# keeping only the first 15 rows, which removes any excess rows
reasoning_df = reasoning_df[0:15]

In [110]:
# set index as demographics column
reasoning_df = reasoning_df.set_index(['Demographics'])

In [111]:
# checking dataframe in case more edits are necessary for a clean table
reasoning_df

Unnamed: 0_level_0,Thought it would taste good,Trying to eat less meat,Believe plants are better for environment,Believe plants are healthier,Encouraged to try by friend/family,Heard about and curious,Ingredients intrigued me
Demographics,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Total,26%,27%,27%,24%,20%,30%,23%
Men,22%,27%,29%,23%,20%,29%,20%
Women,32%,27%,25%,25%,20%,31%,25%
White,26%,30%,26%,26%,20%,30%,23%
African American,37%,15%,19%,9%,24%,27%,29%
Hispanic/ Latinx,22%,25%,34%,26%,26%,33%,14%
Under 45,29%,25%,29%,24%,23%,30%,23%
45-64,28%,33%,26%,25%,17%,28%,22%
65+,7%,25%,23%,25%,13%,34%,23%
Northeast,27%,25%,27%,23%,20%,31%,31%
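
The raw header row contains `\r` line breaks carried over from the PDF, which is why the column names were typed out by hand. As an alternative, the names could be derived from the header row itself. This is only a sketch: `to_snake_case` is a hypothetical helper, and the names it produces are longer than the hand-written ones used in this notebook.

```python
import re


def to_snake_case(label):
    """Turn a raw PDF header label into a snake_case column name."""
    # tabula keeps the PDF's in-cell line breaks as "\r"; treat them as spaces
    text = label.replace("\r", " ").lower()
    # drop apostrophes so "didn't" becomes "didnt" rather than "didn_t"
    text = re.sub(r"['’]", "", text)
    # collapse every run of non-alphanumeric characters into one underscore
    return re.sub(r"[^a-z0-9]+", "_", text).strip("_")


# header labels copied from the raw tables in this notebook
raw_labels = [
    "Liked the taste",
    "Texture similar to\rmeat",
    "Able to cook/\rtreat like meat",
]
column_names = ["Demographics"] + [to_snake_case(label) for label in raw_labels]
print(column_names)
```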


#### Dataframe 3: Positive reactions to plant alternative to meat.

In [19]:
# importing pdf file and reading table
pos_reactions_df = tabula.read_pdf(file, pages=[20],lattice=True, stream=True)

In [21]:
# setting column heads
pos_reactions_df.columns = ['Demographics','liked_taste','tasted_like_meat','texture_like_meat','cooks_like_meat','less_concerned_food_safety','nothin_liked','other']


In [24]:
# viewing table to see what I need to clean up
pos_reactions_df

Unnamed: 0,Demographics,liked_taste,tasted_like_meat,texture_like_meat,cooks_like_meat,less_concerned_food_safety,nothin_liked,other
0,,Liked the taste,Tasted like meat,Texture similar to\rmeat,Able to cook/\rtreat like meat,Less concerned\rabout food safety,Nothing I liked,Other
1,Total,53%,34%,35%,29%,29%,8%,4%
2,Men,51%,35%,34%,31%,29%,8%,4%
3,Women,54%,33%,36%,26%,29%,8%,5%
4,White,54%,34%,36%,30%,25%,8%,6%
5,African American,52%,40%,39%,18%,37%,7%,0%
6,Hispanic/ Latinx,47%,32%,32%,28%,42%,8%,2%
7,Under 45,55%,38%,33%,28%,34%,5%,3%
8,45-64,50%,28%,37%,31%,25%,11%,7%
9,65+,44%,32%,42%,27%,13%,18%,8%


In [25]:
# dropping the repeated header row (row 0)
pos_reactions_df = pos_reactions_df.drop(0)

# keeping only the first 15 rows, which removes any excess rows
pos_reactions_df = pos_reactions_df[0:15]

In [26]:
pos_reactions_df

Unnamed: 0,Demographics,liked_taste,tasted_like_meat,texture_like_meat,cooks_like_meat,less_concerned_food_safety,nothin_liked,other
1,Total,53%,34%,35%,29%,29%,8%,4%
2,Men,51%,35%,34%,31%,29%,8%,4%
3,Women,54%,33%,36%,26%,29%,8%,5%
4,White,54%,34%,36%,30%,25%,8%,6%
5,African American,52%,40%,39%,18%,37%,7%,0%
6,Hispanic/ Latinx,47%,32%,32%,28%,42%,8%,2%
7,Under 45,55%,38%,33%,28%,34%,5%,3%
8,45-64,50%,28%,37%,31%,25%,11%,7%
9,65+,44%,32%,42%,27%,13%,18%,8%
10,Northeast,52%,39%,33%,31%,30%,5%,5%


In [27]:
# set index as demographics column
pos_reactions_df = pos_reactions_df.set_index(['Demographics'])

In [28]:
pos_reactions_df.head()

Unnamed: 0_level_0,liked_taste,tasted_like_meat,texture_like_meat,cooks_like_meat,less_concerned_food_safety,nothin_liked,other
Demographics,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Total,53%,34%,35%,29%,29%,8%,4%
Men,51%,35%,34%,31%,29%,8%,4%
Women,54%,33%,36%,26%,29%,8%,5%
White,54%,34%,36%,30%,25%,8%,6%
African American,52%,40%,39%,18%,37%,7%,0%


#### Dataframe 4: Negative reactions to plant alternative to meat.

In [29]:
# importing pdf file and reading table
neg_reactions_df = tabula.read_pdf(file, pages=[21],lattice=True, stream=True)

In [30]:
# setting column heads
neg_reactions_df.columns = ['Demographics','did_not_like_taste','did_not_taste_like_meat','texture_not_like_meat','does_not_cooks_like_meat','more_concerned_food_safety','liked_everything','other']


In [31]:
# viewing table to see what I need to clean up
neg_reactions_df

Unnamed: 0,Demographics,did_not_like_taste,did_not_taste_like_meat,texture_not_like_meat,does_not_cooks_like_meat,more_concerned_food_safety,liked_everything,other
0,,Didn’t like the\rtaste,Didn’t taste like\rmeat,Texture not like\rmeat,Not able to cook/\rtreat like meat,More concerned\rabout food safety,Nothing I didn’t\rlike,Other
1,Total,19%,20%,31%,18%,25%,40%,5%
2,Men,19%,21%,36%,23%,34% ↑,31% ↓,5%
3,Women,20%,19%,26%,13%,16% ↓,50% ↑,5%
4,White,20%,21%,32%,18%,26%,38%,6%
5,African American,19%,16%,29%,30%,21%,39%,6%
6,Hispanic/ Latinx,15%,20%,30%,15%,23%,46%,5%
7,Under 45,19%,23%,35%,22%,33% ↑,32% ↓,3% ↓
8,45-64,22%,15%,26%,14%,15% ↓,49%,10% ↑
9,65+,17%,19%,25%,7%,6% ↓,61% ↑,5%


In [33]:
# dropping the repeated header row (row 0)
neg_reactions_df = neg_reactions_df.drop(0)

# keeping only the first 15 rows, which removes any excess rows
neg_reactions_df = neg_reactions_df[0:15]

In [34]:
# checking that edits went through properly
neg_reactions_df

Unnamed: 0,Demographics,did_not_like_taste,did_not_taste_like_meat,texture_not_like_meat,does_not_cooks_like_meat,more_concerned_food_safety,liked_everything,other
1,Total,19%,20%,31%,18%,25%,40%,5%
2,Men,19%,21%,36%,23%,34% ↑,31% ↓,5%
3,Women,20%,19%,26%,13%,16% ↓,50% ↑,5%
4,White,20%,21%,32%,18%,26%,38%,6%
5,African American,19%,16%,29%,30%,21%,39%,6%
6,Hispanic/ Latinx,15%,20%,30%,15%,23%,46%,5%
7,Under 45,19%,23%,35%,22%,33% ↑,32% ↓,3% ↓
8,45-64,22%,15%,26%,14%,15% ↓,49%,10% ↑
9,65+,17%,19%,25%,7%,6% ↓,61% ↑,5%
10,Northeast,17%,21%,35%,20%,20%,41%,5%


In [36]:
# set index as demographics column
neg_reactions_df = neg_reactions_df.set_index(['Demographics'])

In [37]:
neg_reactions_df.head()

Unnamed: 0_level_0,did_not_like_taste,did_not_taste_like_meat,texture_not_like_meat,does_not_cooks_like_meat,more_concerned_food_safety,liked_everything,other
Demographics,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1
Total,19%,20%,31%,18%,25%,40%,5%
Men,19%,21%,36%,23%,34% ↑,31% ↓,5%
Women,20%,19%,26%,13%,16% ↓,50% ↑,5%
White,20%,21%,32%,18%,26%,38%,6%
African American,19%,16%,29%,30%,21%,39%,6%
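
Every value in these tables is still a string such as `34% ↑`, so the columns cannot be compared numerically yet. One possible way to convert them is sketched below. The helper name `parse_percent` is hypothetical, and the sketch simply discards the arrow markers that the report attaches to some values.

```python
import re


def parse_percent(cell):
    """Return the number in a cell such as '59% ↓' as a float, or None if there is none."""
    match = re.search(r"(\d+(?:\.\d+)?)\s*%", str(cell))
    return float(match.group(1)) if match else None


# cell values copied from the tables above
examples = ["66%", "59% ↓", "0% ↓", "10% ↑"]
parsed = [parse_percent(cell) for cell in examples]
print(parsed)
```

A single column could then be converted with `neg_reactions_df['other'].map(parse_percent)`. If the arrows are needed for the analysis, they would have to be stored in a separate column before this step.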


## PART 2

### Pulling nutrition facts from Nutritionix API

In [112]:
# Set GET url
url = "https://api.nutritionix.com/v1_1/search/impossible?results=0:20&fields=item_name,brand_name,item_id,nf_calories,nf_total_fat,nf_total_carbohydrate,nf_sugars,nf_protein&appId=APP_ID&appKey=KEY"

api_response = requests.get(url)

In [113]:
# checking connection
api_response

<Response [200]>
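
The request URL above is written as one long string, which makes it easy to mistype a field name or leave a real key in the notebook. A small sketch of building the same URL from its parts is shown here, using only the standard library. The function name `build_search_url` and its `start`/`end` parameters are made up for illustration; `APP_ID` and `KEY` are the same placeholders as above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.nutritionix.com/v1_1/search/impossible"
FIELDS = [
    "item_name", "brand_name", "item_id", "nf_calories", "nf_total_fat",
    "nf_total_carbohydrate", "nf_sugars", "nf_protein",
]


def build_search_url(app_id, app_key, start=0, end=20):
    """Assemble the search URL from the base address and query parameters."""
    params = {
        "results": f"{start}:{end}",
        "fields": ",".join(FIELDS),
        "appId": app_id,
        "appKey": app_key,
    }
    # keep ":" and "," unescaped so the URL matches the hand-written form
    return f"{BASE_URL}?{urlencode(params, safe=':,')}"


example_url = build_search_url("APP_ID", "KEY")
print(example_url)
```

The result can be passed to `requests.get` exactly like the hand-written `url`. Since the response reports 45 total hits but only 20 are requested, a second call with a different `start` and `end` might fetch more results, assuming the `results` parameter works as a range over the hits.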

In [115]:
api_response.text

'{"total_hits":45,"max_score":6.1342916,"hits":[{"_index":"f762ef22-e660-434f-9071-a10ea6691c27","_type":"item","_id":"5ff864fdfec2a13721afd270","_score":6.1342916,"fields":{"item_id":"5ff864fdfec2a13721afd270","item_name":"Burger Made From Plants","brand_name":"Impossible","nf_calories":240,"nf_total_fat":14,"nf_total_carbohydrate":9,"nf_sugars":0.5,"nf_protein":19,"nf_serving_size_qty":1,"nf_serving_size_unit":"serving"}},{"_index":"f762ef22-e660-434f-9071-a10ea6691c27","_type":"item","_id":"5f68b45ebc11f51728f90693","_score":5.830667,"fields":{"item_id":"5f68b45ebc11f51728f90693","item_name":"Burger Patties Made From Plants","brand_name":"Impossible","nf_calories":240,"nf_total_fat":14,"nf_total_carbohydrate":9,"nf_sugars":0.5,"nf_protein":19,"nf_serving_size_qty":1,"nf_serving_size_unit":"serving"}},{"_index":"f762ef22-e660-434f-9071-a10ea6691c27","_type":"item","_id":"5d8f06903fd1dc45742c5221","_score":5.793393,"fields":{"item_id":"5d8f06903fd1dc45742c5221","item_name":"Burger, ma

In [116]:
type(api_response.text)

str

In [117]:
json.loads(api_response.text)

{'total_hits': 45,
 'max_score': 6.1342916,
 'hits': [{'_index': 'f762ef22-e660-434f-9071-a10ea6691c27',
   '_type': 'item',
   '_id': '5ff864fdfec2a13721afd270',
   '_score': 6.1342916,
   'fields': {'item_id': '5ff864fdfec2a13721afd270',
    'item_name': 'Burger Made From Plants',
    'brand_name': 'Impossible',
    'nf_calories': 240,
    'nf_total_fat': 14,
    'nf_total_carbohydrate': 9,
    'nf_sugars': 0.5,
    'nf_protein': 19,
    'nf_serving_size_qty': 1,
    'nf_serving_size_unit': 'serving'}},
  {'_index': 'f762ef22-e660-434f-9071-a10ea6691c27',
   '_type': 'item',
   '_id': '5f68b45ebc11f51728f90693',
   '_score': 5.830667,
   'fields': {'item_id': '5f68b45ebc11f51728f90693',
    'item_name': 'Burger Patties Made From Plants',
    'brand_name': 'Impossible',
    'nf_calories': 240,
    'nf_total_fat': 14,
    'nf_total_carbohydrate': 9,
    'nf_sugars': 0.5,
    'nf_protein': 19,
    'nf_serving_size_qty': 1,
    'nf_serving_size_unit': 'serving'}},
  {'_index': 'f762ef22-

In [118]:
type(json.loads(api_response.text))

dict

In [119]:
api_data = json.loads(api_response.text)

In [120]:
api_data

{'total_hits': 45,
 'max_score': 6.1342916,
 'hits': [{'_index': 'f762ef22-e660-434f-9071-a10ea6691c27',
   '_type': 'item',
   '_id': '5ff864fdfec2a13721afd270',
   '_score': 6.1342916,
   'fields': {'item_id': '5ff864fdfec2a13721afd270',
    'item_name': 'Burger Made From Plants',
    'brand_name': 'Impossible',
    'nf_calories': 240,
    'nf_total_fat': 14,
    'nf_total_carbohydrate': 9,
    'nf_sugars': 0.5,
    'nf_protein': 19,
    'nf_serving_size_qty': 1,
    'nf_serving_size_unit': 'serving'}},
  {'_index': 'f762ef22-e660-434f-9071-a10ea6691c27',
   '_type': 'item',
   '_id': '5f68b45ebc11f51728f90693',
   '_score': 5.830667,
   'fields': {'item_id': '5f68b45ebc11f51728f90693',
    'item_name': 'Burger Patties Made From Plants',
    'brand_name': 'Impossible',
    'nf_calories': 240,
    'nf_total_fat': 14,
    'nf_total_carbohydrate': 9,
    'nf_sugars': 0.5,
    'nf_protein': 19,
    'nf_serving_size_qty': 1,
    'nf_serving_size_unit': 'serving'}},
  {'_index': 'f762ef22-

In [121]:
nutrition_data = api_data['hits']

In [122]:
# viewing nutrition data to see field names and what I would like to pull
nutrition_data

[{'_index': 'f762ef22-e660-434f-9071-a10ea6691c27',
  '_type': 'item',
  '_id': '5ff864fdfec2a13721afd270',
  '_score': 6.1342916,
  'fields': {'item_id': '5ff864fdfec2a13721afd270',
   'item_name': 'Burger Made From Plants',
   'brand_name': 'Impossible',
   'nf_calories': 240,
   'nf_total_fat': 14,
   'nf_total_carbohydrate': 9,
   'nf_sugars': 0.5,
   'nf_protein': 19,
   'nf_serving_size_qty': 1,
   'nf_serving_size_unit': 'serving'}},
 {'_index': 'f762ef22-e660-434f-9071-a10ea6691c27',
  '_type': 'item',
  '_id': '5f68b45ebc11f51728f90693',
  '_score': 5.830667,
  'fields': {'item_id': '5f68b45ebc11f51728f90693',
   'item_name': 'Burger Patties Made From Plants',
   'brand_name': 'Impossible',
   'nf_calories': 240,
   'nf_total_fat': 14,
   'nf_total_carbohydrate': 9,
   'nf_sugars': 0.5,
   'nf_protein': 19,
   'nf_serving_size_qty': 1,
   'nf_serving_size_unit': 'serving'}},
 {'_index': 'f762ef22-e660-434f-9071-a10ea6691c27',
  '_type': 'item',
  '_id': '5d8f06903fd1dc45742c52

In [123]:
# creating a new dict to hold finalized nutrition info that I want to turn into a dataframe
# this takes about 30 seconds

#nutrition_facts will be finalized dictionary
nutrition_facts = {
    'item': [],
    'brand': [],
    'calories': [],
    'total_fat': [],
    'total_carbohydrates': [],
    'total_sugars': [],
    'protein': [],
    'serving_size': [],
    'serving_size_unit': []
}


# this for loop goes through the nutrition_data list, pulls specific values out of each
# item's 'fields' dict by key, and appends them to the nutrition_facts dict
for food in nutrition_data:
    name = food['fields']['item_name']
    nutrition_facts['item'].append(name)
    print('name: ',name)
    
    brand = food['fields']['brand_name']
    nutrition_facts['brand'].append(brand)
    print('brand: ',brand)
    
    calories = food['fields']['nf_calories']
    nutrition_facts['calories'].append(calories)
    print('calories: ',calories)
    
    fat = food['fields']['nf_total_fat']
    nutrition_facts['total_fat'].append(fat)
    print('fat: ',fat)
    
    carbs = food['fields']['nf_total_carbohydrate']
    nutrition_facts['total_carbohydrates'].append(carbs)
    print('carbs: ',carbs)
    
    sugar = food['fields']['nf_sugars']
    nutrition_facts['total_sugars'].append(sugar)
    print('sugar: ',sugar)
    
    protein = food['fields']['nf_protein']
    nutrition_facts['protein'].append(protein)
    print('protein: ',protein)
    
    servsiz = food['fields']['nf_serving_size_qty']
    nutrition_facts['serving_size'].append(servsiz)
    print('serving size: ',servsiz)
    
    servunit = food['fields']['nf_serving_size_unit']
    nutrition_facts['serving_size_unit'].append(servunit)
    print('serving size measurement: ',servunit)
    
    # pause for a random 1-5 seconds between iterations
    # (this loop only reads the response that was already downloaded above, so the pause is optional;
    # it would matter if each iteration sent its own request to the API server)
    sleep_duration = random.randint(1,5)
    print(f'sleeping for {sleep_duration} second(s)')
    time.sleep(sleep_duration)
    
    print('-'*50)

name:  Burger Made From Plants
brand:  Impossible
calories:  240
fat:  14
carbs:  9
sugar:  0.5
protein:  19
serving size:  1
serving size measurement:  serving
sleeping for 1 second(s)
--------------------------------------------------
name:  Burger Patties Made From Plants
brand:  Impossible
calories:  240
fat:  14
carbs:  9
sugar:  0.5
protein:  19
serving size:  1
serving size measurement:  serving
sleeping for 2 second(s)
--------------------------------------------------
name:  Burger, made from Plants
brand:  Impossible
calories:  240
fat:  14
carbs:  9
sugar:  0.5
protein:  19
serving size:  1
serving size measurement:  serving
sleeping for 2 second(s)
--------------------------------------------------
name:  Burger Patties Made from Plants
brand:  Impossible
calories:  240
fat:  14
carbs:  9
sugar:  0.5
protein:  19
serving size:  1
serving size measurement:  serving
sleeping for 1 second(s)
--------------------------------------------------
name:  Burger made from Plants
bran

In [124]:
# confirming what is inside
# takes about 6 seconds
nutrition_facts

{'item': ['Burger Made From Plants',
  'Burger Patties Made From Plants',
  'Burger, made from Plants',
  'Burger Patties Made from Plants',
  'Burger made from Plants',
  'Beef Made From Plants',
  'Burger Made From Plants',
  'Burger made from Plants',
  'Impossible Burger',
  'Impossible Burger',
  'Impossible Patty',
  'Impossible Burger',
  'Impossible Burger',
  'Impossible Burger',
  'Impossible Whopper',
  'Impossible Burger',
  'The Impossible Burger',
  'Impossible Burger',
  'Impossible Patty',
  'Impossible Burger'],
 'brand': ['Impossible',
  'Impossible',
  'Impossible',
  'Impossible',
  'Impossible',
  'Impossible',
  'Impossible',
  'Impossible',
  'Cheesecake Factory',
  'Claim Jumper',
  'Burger 21',
  'Earls',
  'Burger 21',
  'Ram Restaurant & Brewery',
  'Burger King',
  'Wahlburgers',
  'Wahlburgers',
  'Glory Days Grill',
  'Islands Fine Burgers & Drinks',
  'Firebirds Wood Fired Grill'],
 'calories': [240,
  240,
  240,
  240,
  240,
  238,
  240,
  240,
  1010

In [125]:
# convert dictionary to df
nutrition_df = pd.DataFrame(nutrition_facts)

In [126]:
# check df was properly made
nutrition_df.head()

#side note: this has taken so much work to create this beautiful df, I am so proud

Unnamed: 0,item,brand,calories,total_fat,total_carbohydrates,total_sugars,protein,serving_size,serving_size_unit
0,Burger Made From Plants,Impossible,240,14,9.0,0.5,19,1,serving
1,Burger Patties Made From Plants,Impossible,240,14,9.0,0.5,19,1,serving
2,"Burger, made from Plants",Impossible,240,14,9.0,0.5,19,1,serving
3,Burger Patties Made from Plants,Impossible,240,14,9.0,0.5,19,1,serving
4,Burger made from Plants,Impossible,240,14,9.0,0.5,19,1,serving
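
The loop above repeats the same three lines for each field. A more compact variant, sketched here under the assumption that every hit has a `fields` dict like the ones shown in the response, maps the API field names to the column names in one place. `FIELD_MAP` and `flatten_hits` are hypothetical names, and `sample_hits` is a one-item excerpt of the response above.

```python
# API field name -> dataframe column name
FIELD_MAP = {
    "item_name": "item",
    "brand_name": "brand",
    "nf_calories": "calories",
    "nf_total_fat": "total_fat",
    "nf_total_carbohydrate": "total_carbohydrates",
    "nf_sugars": "total_sugars",
    "nf_protein": "protein",
    "nf_serving_size_qty": "serving_size",
    "nf_serving_size_unit": "serving_size_unit",
}


def flatten_hits(hits, field_map=FIELD_MAP):
    """Return one flat dict per hit, keyed by the output column names."""
    return [
        {column: hit["fields"].get(api_key) for api_key, column in field_map.items()}
        for hit in hits
    ]


sample_hits = [
    {
        "_score": 6.1342916,
        "fields": {
            "item_name": "Burger Made From Plants",
            "brand_name": "Impossible",
            "nf_calories": 240,
            "nf_total_fat": 14,
            "nf_total_carbohydrate": 9,
            "nf_sugars": 0.5,
            "nf_protein": 19,
            "nf_serving_size_qty": 1,
            "nf_serving_size_unit": "serving",
        },
    }
]

rows = flatten_hits(sample_hits)
print(rows[0])
```

Calling `pd.DataFrame(flatten_hits(nutrition_data))` should give the same columns as `nutrition_df`. One difference from the loop: a missing field becomes `None` here instead of raising a `KeyError`.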


In [127]:
# export the dataframes to csv files, keeping the index column
# diets_df
# reasoning_df
# nutrition_df
diets_df.to_csv('diets.csv')
reasoning_df.to_csv('motivations.csv')
nutrition_df.to_csv('nutrition.csv')

## Writing DataFrames to AWS RDS Database

In [38]:
# Step 1: create DB connection 
engine = create_engine('mysql+mysqldb://USERNAME:PASSWORD@ENDPOINT:PORT/impossible_foods?charset=utf8')
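
The connection string above uses placeholders for the credentials. One way to keep real credentials out of the notebook is to read them from environment variables, as in this sketch. The variable names `DB_USERNAME`, `DB_PASSWORD`, `DB_ENDPOINT` and `DB_PORT` are assumptions, as is the helper name `build_db_url`; the example values are made up.

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=os.environ):
    """Build the SQLAlchemy connection string from environment-style settings."""
    user = env["DB_USERNAME"]
    # escape characters such as "@" or "/" that would otherwise break the URL
    password = quote_plus(env["DB_PASSWORD"])
    host = env["DB_ENDPOINT"]
    port = env.get("DB_PORT", "3306")  # 3306 is the default MySQL port
    return f"mysql+mysqldb://{user}:{password}@{host}:{port}/impossible_foods?charset=utf8"


# made-up settings, passed as a plain dict instead of the real environment
example_db_url = build_db_url(
    {
        "DB_USERNAME": "analyst",
        "DB_PASSWORD": "p@ss/word",
        "DB_ENDPOINT": "example-host",
        "DB_PORT": "3306",
    }
)
print(example_db_url)
```

With the variables set in the environment, the engine could be created with `create_engine(build_db_url())`.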

In [135]:
# writing df to database
diets_df.to_sql('diets', engine, if_exists='append') 
reasoning_df.to_sql('motivations', engine, if_exists='append')
nutrition_df.to_sql('nutrition', engine, if_exists='append', index = False)

In [42]:
pos_reactions_df.to_sql('positive_reactions', engine, if_exists='append')
neg_reactions_df.to_sql('negative_reactions', engine, if_exists='append')