# Nutritional Analysis of Foods
## Overview and Motivation

## Related work

## Initial Questions
* Is there a food that, if eaten on its own, could fulfill all the nutritional requirements of a 2000 Calorie diet?
* What are the foods that provide all the nutritional value you need for the least amount of Calories?
* What are some overall trends of the best foods?
* What are the trends of all the foods that provide you with enough nutrients?

## Exploratory Analysis

All of the data we needed was readily available in a CSV file.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
from bokeh.io import push_notebook,show,output_notebook
from bokeh.layouts import row
from bokeh.plotting import figure
from bokeh.charts import Bar, output_file, show
from bokeh.models import Range1d
from bokeh.charts.operations import blend
from bokeh import palettes
%matplotlib inline
output_notebook()
calfoods = pd.read_csv('ABBREV.csv', index_col=0)
calfoods.head(5)

NDB_No,Shrt_Desc,Water_(g),Energ_Kcal,Protein_(g),Lipid_Tot_(g),Ash_(g),Carbohydrt_(g),Fiber_TD_(g),Sugar_Tot_(g),Calcium_(mg),...,Vit_K_(µg),FA_Sat_(g),FA_Mono_(g),FA_Poly_(g),Cholestrl_(mg),GmWt_1,GmWt_Desc1,GmWt_2,GmWt_Desc2,Refuse_Pct
1001,"BUTTER,WITH SALT",15.87,717,0.85,81.11,2.11,0.06,0.0,0.06,24.0,...,7.0,51.368,21.021,3.043,215.0,5.0,"1 pat, (1"" sq, 1/3"" high)",14.2,1 tbsp,0.0
1002,"BUTTER,WHIPPED,W/ SALT",16.72,718,0.49,78.3,1.62,2.87,0.0,0.06,23.0,...,4.6,45.39,19.874,3.331,225.0,3.8,"1 pat, (1"" sq, 1/3"" high)",9.4,1 tbsp,0.0
1003,"BUTTER OIL,ANHYDROUS",0.24,876,0.28,99.48,0.0,0.0,0.0,0.0,4.0,...,8.6,61.924,28.732,3.694,256.0,12.8,1 tbsp,205.0,1 cup,0.0
1004,"CHEESE,BLUE",42.41,353,21.4,28.74,5.11,2.34,0.0,0.5,528.0,...,2.4,18.669,7.778,0.8,75.0,28.35,1 oz,17.0,1 cubic inch,0.0
1005,"CHEESE,BRICK",41.11,371,23.24,29.68,3.18,2.79,0.0,0.51,674.0,...,2.5,18.764,8.598,0.784,94.0,132.0,"1 cup, diced",113.0,"1 cup, shredded",0.0


After importing it, we did some light cleaning where we:
* Removed foods that don't contain all the essential nutrients
* Normalized all the nutrients to 1 Calorie
* Made the column names a bit easier to read

In [2]:
#Rename some of our columns to something a bit easier on the eyes
calfoods = calfoods.rename(index=str,columns={'Shrt_Desc':'Name','Protein_(g)':'Protein (g)','Lipid_Tot_(g)':'Total Fat(g)','Cholestrl_(mg)':'Cholesterol (mg)',
               'FA_Sat_(g)':'Saturated Fat (g)','Sodium_(mg)':'Sodium (mg)','Potassium_(mg)':'Potassium (mg)',
               'Carbohydrt_(g)':'Carbohydrates (g)','Fiber_TD_(g)':'Fiber (g)','Energ_Kcal':'Calories'})
# Look at a specific subset of nutrients
calnutrients = ['Name','Protein (g)','Total Fat(g)','Cholesterol (mg)',
               'Saturated Fat (g)','Sodium (mg)','Potassium (mg)',
               'Carbohydrates (g)','Fiber (g)','Calories','Weight (g)']
calfoods['Weight (g)'] = 100.0
calfoods = calfoods[calnutrients]
calfoods = calfoods.fillna(0)
# Drop foods with no Calories, since they cannot be normalized per Calorie
calfoods = calfoods[calfoods['Calories'] > 0]

# Normalizing all our foods to 1 calorie
def normalizeNutrientsCal(x):
    ratio = x['Calories']
    for nutrient in calnutrients:
        if(type(x[nutrient]) is str):
            continue
        x[nutrient] = x[nutrient]/ratio
    return x
calfoods = calfoods.apply(normalizeNutrientsCal,axis=1)
calfoods.head()

# We only want foods that have a chance to sustain our needs
def filterNutrientsCal(x):
    for nutrient in calnutrients:
        if(x[nutrient] <= 0):
            return False
    return True
calfoods = calfoods[calfoods.apply(filterNutrientsCal, axis=1)]
print("Foods found: %d" % len(calfoods))
calfoods.head(5)

Foods found: 1425


NDB_No,Name,Protein (g),Total Fat(g),Cholesterol (mg),Saturated Fat (g),Sodium (mg),Potassium (mg),Carbohydrates (g),Fiber (g),Calories,Weight (g)
1013,"CHEESE,COTTAGE,CRMD,W/FRUIT",0.110206,0.039691,0.134021,0.023825,3.546392,0.927835,0.047526,0.002062,1,1.030928
1043,"CHEESE,PAST PROCESS,PIMENTO",0.059013,0.0832,0.250667,0.052435,2.44,0.432,0.004613,0.000267,1,0.266667
1102,"MILK,CHOC,FLUID,COMM,WHL,W/ ADDED VIT A & VITA...",0.038193,0.040843,0.144578,0.025349,0.722892,2.012048,0.124578,0.009639,1,1.204819
1103,"MILK,CHOC,FLUID,COMM,RED FAT",0.039342,0.025,0.105263,0.015487,0.868421,2.223684,0.159605,0.009211,1,1.315789
1104,"MILK,CHOC,LOWFAT,W/ ADDED VIT A & VITAMIN D",0.055806,0.016129,0.080645,0.009419,1.048387,2.774194,0.159032,0.001613,1,1.612903
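The per-Calorie normalization can also be written as a single vectorized division instead of a row-wise `apply`. The following is a minimal sketch of that idea, assuming a tiny made-up frame (`FOOD A` and `FOOD B` and their values are hypothetical, not rows from `ABBREV.csv`):

```python
import pandas as pd

# Hypothetical per-100 g values, made up for illustration only.
foods = pd.DataFrame(
    {
        "Name": ["FOOD A", "FOOD B"],
        "Protein (g)": [10.0, 3.0],
        "Fiber (g)": [2.0, 0.5],
        "Calories": [200.0, 50.0],
        "Weight (g)": [100.0, 100.0],
    }
)

# Divide every numeric column (including Calories and Weight) by the
# food's own Calories, so each row describes a 1-Calorie portion.
numeric_cols = foods.select_dtypes(include="number").columns
per_calorie = foods.copy()
per_calorie[numeric_cols] = foods[numeric_cols].div(foods["Calories"], axis=0)

print(per_calorie)
```

After this step the `Calories` column is 1 for every food, and `Weight (g)` holds the number of grams that supply one Calorie.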


Once we had all the foods cleaned, we could scale them up to meet the recommended daily amounts of nutrients for a 2000 Calorie diet, which are:


| Nutrient              | Unit of Measure | Daily Values |
|-----------------------|-----------------|--------------|
| Total Fat             | grams (g)       | 65           |
| Saturated fatty acids | grams (g)       | 20           |
| Cholesterol           | milligrams (mg) | 300          |
| Sodium                | milligrams (mg) | 2400         |
| Potassium             | milligrams (mg) | 3500         |
| Total carbohydrate    | grams (g)       | 300          |
| Fiber                 | grams (g)       | 25           |
| Protein               | grams (g)       | 50           |

For the purposes of our analysis, we ignored vitamins, as they would disqualify too many foods and could be obtained without any calories through a multivitamin.

In [3]:
# http://www.netrition.com/rdi_page.html
recommended = [-1,50,65,300,20,2400,3500,300,25,-1,-1]
def findSatisfyingWeightCal(food):
    for x in range(0,len(calnutrients)):
        nutrient = calnutrients[x]
        rec = recommended[x]
        if(rec == -1 or food[nutrient] >= rec):
            continue
        ratio = rec/food[nutrient]
        for y in calnutrients:
            if(type(food[y]) is str):
                continue
            food[y] = food[y]*ratio
    return food   
calweighted_foods = calfoods.apply(findSatisfyingWeightCal,axis=1)
display = ['Name','Calories','Weight (g)','Protein (g)','Total Fat(g)','Cholesterol (mg)',
               'Saturated Fat (g)','Sodium (mg)','Potassium (mg)',
               'Carbohydrates (g)','Fiber (g)']
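
One way to sanity-check the scaling loop is to note that, as long as at least one nutrient starts below its target, scaling up nutrient by nutrient ends at the same place as multiplying the 1-Calorie portion once by the largest `recommended / amount` ratio. A minimal sketch of that shortcut, using the per-Calorie row for cottage cheese from the cleaned table (the variable names here are ours, not part of the notebook):

```python
import pandas as pd

# Daily values for a 2000 Calorie diet, from the table above.
daily_values = pd.Series({
    "Protein (g)": 50, "Total Fat(g)": 65, "Cholesterol (mg)": 300,
    "Saturated Fat (g)": 20, "Sodium (mg)": 2400, "Potassium (mg)": 3500,
    "Carbohydrates (g)": 300, "Fiber (g)": 25,
}, dtype=float)

# Per-Calorie amounts for "CHEESE,COTTAGE,CRMD,W/FRUIT" from the cleaned table.
cottage_cheese = pd.Series({
    "Protein (g)": 0.110206, "Total Fat(g)": 0.039691,
    "Cholesterol (mg)": 0.134021, "Saturated Fat (g)": 0.023825,
    "Sodium (mg)": 3.546392, "Potassium (mg)": 0.927835,
    "Carbohydrates (g)": 0.047526, "Fiber (g)": 0.002062,
})
grams_per_calorie = 1.030928

# How many 1-Calorie portions each nutrient needs to reach its target.
ratios = daily_values / cottage_cheese

# The scarcest nutrient sets the final portion size.
limiting_nutrient = ratios.idxmax()
calories_needed = ratios.max()
weight_needed = calories_needed * grams_per_calorie

scaled = cottage_cheese * calories_needed
print(limiting_nutrient, round(calories_needed), round(weight_needed))
```

Because every other nutrient needs a smaller multiplier than the limiting one, all of them end up at or above their daily value once the portion is scaled this way.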

### The Best Foods
Without further ado, here are the best foods that we found, sorted by the total Calories needed to reach your daily necessary nutrients.

In [4]:
calweighted_foods[display].sort_values(by='Calories').head(10)

NDB_No,Name,Calories,Weight (g),Protein (g),Total Fat(g),Cholesterol (mg),Saturated Fat (g),Sodium (mg),Potassium (mg),Carbohydrates (g),Fiber (g)
3127,"BABYFOOD,VEG,SPINACH,CRMD,STR",2220.0,6000.0,150.0,78.0,300.0,42.12,2940.0,11460.0,342.0,108.0
6082,"CAMPBELL'S RED & WHITE,CHICK NOODLEO'S SOUP,COND",2330.808081,3282.828283,78.131313,65.0,525.252525,26.065657,12507.575758,14083.333333,390.656566,26.262626
16059,"CHILI WITH BEANS,CANNED",2333.836858,2265.861027,138.670695,85.196375,385.196375,25.672205,9584.592145,8270.392749,300.0,74.773414
6986,"CAMPBELL'S HEALTHY REQUEST,CHICK NOODLE SOUP,COND",2350.0,5000.0,120.0,65.0,500.0,20.0,16700.0,17500.0,325.5,50.0
22982,"KASHI,STEAM MEAL,CHICK FETTUCCINE,FRZ ENTREE",2383.333333,2407.407407,149.259259,65.0,337.037037,21.666667,3996.296296,4044.444444,337.037037,48.148148
11372,"POTATOES,SCALLPD,HOME-PREPARED W/BUTTER",2448.979592,2782.931354,79.87013,102.411874,333.951763,62.755102,9322.820037,10519.480519,300.0,52.875696
11387,"POTATOES,SCALLPD,DRY MIX,PREP W/H2O,WHL MILK&B...",2536.363636,2727.272727,57.818182,117.272727,300.0,71.809091,9300.0,5536.363636,348.272727,30.0
3296,"BABYFOOD,TURKEY,RICE&VEG,TODD",2571.428571,4285.714286,162.857143,68.571429,300.0,21.428571,7842.857143,4585.714286,321.428571,34.285714
11385,"POTATOES,AU GRATIN,DRY MIX,PREP W/H2O,WHL MILK...",2583.333333,2777.777778,63.888889,114.444444,416.666667,71.833333,12194.444444,6083.333333,356.666667,25.0
6434,"CAMPBELL'S CHUNKY SOUPS,OLD FASHIONED VEG BF SOUP",2589.430894,5284.552846,166.99187,65.0,317.073171,21.560976,18284.552846,8613.821138,329.756098,68.699187


This looks a lot like what we would expect "healthy" foods to be, mostly soups, potatoes, and baby food.

*Wait did you just say baby food?*

While baby food may seem an odd pick for a "healthy" food at first, it's important to remember that babies often have a very singular diet, since they have yet to develop the ability to eat most foods. It would only make sense that one of the very few foods they eat would contain a good balance of essential nutrients.

As for our goal of finding a single food that fits the 2000 Calorie diet, the closest we can get to the ideal 2000 Calories is vegetable spinach baby food at 2220 Calories. So it seems that no single food in the USDA data can give you the right amount of nutrients for 2000 Calories or less.


After looking at the data, it was hard to judge just how closely these foods stuck to the recommended amount of nutrients, so we converted the raw numbers into their percentage above the recommended daily amount and plotted them.
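
The conversion itself is `(amount / recommended - 1) * 100`. As a quick check by hand, here is a minimal sketch that applies it to the scaled amounts for the top food in the table above (the variable names are ours and are not used elsewhere in the notebook):

```python
import pandas as pd

# Daily values for a 2000 Calorie diet.
daily_values = pd.Series({
    "Protein (g)": 50, "Total Fat(g)": 65, "Cholesterol (mg)": 300,
    "Saturated Fat (g)": 20, "Sodium (mg)": 2400, "Potassium (mg)": 3500,
    "Carbohydrates (g)": 300, "Fiber (g)": 25,
}, dtype=float)

# Scaled amounts for "BABYFOOD,VEG,SPINACH,CRMD,STR" from the table above.
spinach = pd.Series({
    "Protein (g)": 150.0, "Total Fat(g)": 78.0, "Cholesterol (mg)": 300.0,
    "Saturated Fat (g)": 42.12, "Sodium (mg)": 2940.0,
    "Potassium (mg)": 11460.0, "Carbohydrates (g)": 342.0,
    "Fiber (g)": 108.0,
})

# Percent above the recommended daily amount; 0 means exactly on target.
percent_over = (spinach / daily_values - 1) * 100

print(percent_over.sort_values(ascending=False).round(1))
```

The nutrient that set the portion size (cholesterol for this food) comes out at 0%, and everything else is expressed relative to its own daily value, so nutrients measured in different units can share one axis.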

In [5]:
def findOveragesCal(food):
    for x in range(0,len(calnutrients)):
        nutrient = calnutrients[x]
        rec = recommended[x]
        if(rec == -1):
            continue
        food[nutrient] -= rec
    return food   
caloverage_foods = calweighted_foods.apply(findOveragesCal,axis=1)
def findPercentOveragesCal(food):
    for x in range(0,len(calnutrients)):
        nutrient = calnutrients[x]
        rec = recommended[x]
        if(rec == -1):
            continue
        food[nutrient] = ((food[nutrient]/rec)*100)-100
    return food   
caloverage_foods = calweighted_foods.apply(findPercentOveragesCal,axis=1)
caloverage_foods[display].sort_values(by='Calories').head(10).rename(index=str,columns={'Protein (g)':'Protein (%)','Total Fat(g)':'Total Fat(%)','Cholesterol (mg)':'Cholesterol (%)',
               'Saturated Fat (g)':'Saturated Fat (%)','Sodium (mg)':'Sodium (%)','Potassium (mg)':'Potassium (%)',
               'Carbohydrates (g)':'Carbohydrates (%)','Fiber (g)':'Fiber (%)'})
df = caloverage_foods.sort_values(by='Calories').head(10).rename(index=str,columns={'Protein (g)':'Protein (%)','Total Fat(g)':'Total Fat(%)','Cholesterol (mg)':'Cholesterol (%)',
               'Saturated Fat (g)':'Saturated Fat (%)','Sodium (mg)':'Sodium (%)','Potassium (mg)':'Potassium (%)',
               'Carbohydrates (g)':'Carbohydrates (%)','Fiber (g)':'Fiber (%)'})
a = Bar(df, label='vars',group='Name', 
        values=blend('Protein (%)', 'Total Fat(%)','Cholesterol (%)',
                     'Saturated Fat (%)','Sodium (%)','Potassium (%)',
                     'Carbohydrates (%)','Fiber (%)',name='values', labels_name='vars'),
        title="Excess Nutrients (% above recommended daily intake)",width=900,palette=palettes.BrBG11)
a.xaxis.axis_label = ""
a.yaxis.axis_label = "% above recommended daily intake"
show(a)

Wow! That's a ton of sodium!

Looking at this graph makes it seem as though sodium and potassium are the largest overages by a huge margin but it's important to keep in mind that both of these nutrients are measured in milligrams and thus, changes to their content have a greater impact on these percentages.

We decided we needed to take another look at this graph without potassium and sodium to get a clearer picture of how the remaining nutrients stacked up against each other.

In [6]:
df = caloverage_foods.sort_values(by='Calories').head(10).rename(index=str,columns={'Protein (g)':'Protein (%)','Total Fat(g)':'Total Fat(%)','Cholesterol (mg)':'Cholesterol (%)',
               'Saturated Fat (g)':'Saturated Fat (%)','Sodium (mg)':'Sodium (%)','Potassium (mg)':'Potassium (%)',
               'Carbohydrates (g)':'Carbohydrates (%)','Fiber (g)':'Fiber (%)'})
a = Bar(df, label='vars',group='Name', 
        values=blend('Protein (%)', 'Total Fat(%)','Cholesterol (%)',
                     'Saturated Fat (%)',
                     'Carbohydrates (%)','Fiber (%)',name='values', labels_name='vars'),
        title="Excess Nutrients (% above recommended daily intake) (Excluding Sodium & Potassium)",width=900,height=1000,palette=palettes.BrBG11)
a.xaxis.axis_label = ""
a.yaxis.axis_label = "% above recommended daily intake"
show(a)

So second to sodium and potassium, our best foods have the highest overages in proteins and saturated fats. So while these foods may be great for giving you all the essential nutrients you need, you may get a bit more than you bargained for in the form of sodium, protein, and saturated fats.

Taking another look at our darling child, vegetable-spinach baby food, shows that, sodium and potassium aside, its most prominent overages are in fiber, protein, and saturated fat. The protein and fat, again, make sense if you think about the context of a baby's life, as it needs a lot of these nutrients to grow bigger and stronger.

The final thing to note is that, out of our best foods, the nutrient with the lowest overage overall seems to be carbohydrates. This may explain why low-carb diets often work for so many people, as they may be getting a much more balanced set of nutrients, which would improve their health overall.


### The Worst Foods
After looking at the best foods, we thought it was only fair to look at the worst by calorie as well, hoping that it would provide insight into what makes certain foods more nutritious than others.

In [7]:
calweighted_foods[display].sort_values(by='Calories',ascending=False).head(10)

NDB_No,Name,Calories,Weight (g),Protein (g),Total Fat(g),Cholesterol (mg),Saturated Fat (g),Sodium (mg),Potassium (mg),Carbohydrates (g),Fiber (g)
7979,"SAUSAGE,PORK,TURKEY,& BF,RED NA",774545.454545,272727.272727,29209.090909,73063.636364,190909.090909,29220.0,1851818.0,654545.454545,300.0,272.727273
21381,"MCDONALD'S,FRUIT 'N YOGURT PARFAIT (WITHOUT GR...",450000.0,500000.0,12350.0,5650.0,25000.0,20.0,190000.0,825000.0,88350.0,4500.0
7939,"FRANKFURTER,PORK",288214.285714,107142.857143,13725.0,25371.428571,70714.285714,9341.785714,874285.7,282857.142857,300.0,107.142857
19071,"CANDIES,CAROB,UNSWTND",162000.0,30000.0,2445.0,9408.0,300.0,8705.4,32100.0,189900.0,16887.0,1140.0
19086,"CANDIES,CONFECTIONER'S COATING,PNUT BUTTER",158700.0,30000.0,5490.0,8940.0,300.0,3936.0,75000.0,151500.0,14064.0,1500.0
28196,"MOTHER'S,CIRCUS ANIMAL COOKIES",154500.0,30000.0,1140.0,7590.0,300.0,7410.0,57600.0,29400.0,20490.0,240.0
28201,"MOTHER'S,HOLIDAY CIRCUS ANIMAL COOKIES",154500.0,30000.0,1140.0,7590.0,300.0,7410.0,57600.0,29400.0,20490.0,240.0
28200,"MOTHER'S,HALLOWEEN CIRCUS ANIMALS COOKIES",154200.0,30000.0,1140.0,7590.0,300.0,7410.0,57300.0,29400.0,20430.0,240.0
28204,"MOTHER'S,JUNGLE ANIMAL COOKIES",154200.0,30000.0,1140.0,7590.0,300.0,7290.0,57600.0,29400.0,20490.0,240.0
28080,"KEEBLER,FUDGE SHOPPE,DELUXE GRAHAMS COOKIES",153300.0,30000.0,1200.0,7740.0,300.0,5160.0,79500.0,80100.0,19980.0,660.0


More than half of that list is just candy and cookies, with Mother's Circus Animal cookies and their variations being the most prominent.

Sausage, however, is the clear leader here, though it's not clear by just how much until we look at the overages chart.

The most interesting entry on the list is McDonald's yogurt parfait. A parfait is usually seen as one of the healthier desserts you can have, and its appearance on our list exposes a small hole in our methodology. The food itself is fairly healthy overall, but it lacks many of the key nutrients we're looking for, so it has to be scaled up very high to bring those nutrients to where we want them and thus ends up on our worst foods list.
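
To see how a single scarce nutrient can dominate the scaling, here is a minimal sketch that works backwards from the parfait's row in the table above. It assumes the scaled weight of 500000 g corresponds to 5000 portions of 100 g (the base weight set during cleaning) and asks how many 100 g portions each nutrient would need on its own:

```python
# Daily values for a 2000 Calorie diet.
daily_values = {
    "Protein (g)": 50.0, "Total Fat(g)": 65.0, "Cholesterol (mg)": 300.0,
    "Saturated Fat (g)": 20.0, "Sodium (mg)": 2400.0,
    "Potassium (mg)": 3500.0, "Carbohydrates (g)": 300.0, "Fiber (g)": 25.0,
}

# Scaled amounts for the McDonald's parfait row in the table above.
parfait_scaled = {
    "Protein (g)": 12350.0, "Total Fat(g)": 5650.0,
    "Cholesterol (mg)": 25000.0, "Saturated Fat (g)": 20.0,
    "Sodium (mg)": 190000.0, "Potassium (mg)": 825000.0,
    "Carbohydrates (g)": 88350.0, "Fiber (g)": 4500.0,
}
scaled_weight_g = 500000.0

# Undo the scaling to recover the amounts in one 100 g portion.
portions = scaled_weight_g / 100.0
per_100g = {name: value / portions for name, value in parfait_scaled.items()}

# Number of 100 g portions needed to reach each daily value on its own.
portions_needed = {name: daily_values[name] / per_100g[name]
                   for name in daily_values}

limiting = max(portions_needed, key=portions_needed.get)
for name, count in sorted(portions_needed.items(),
                          key=lambda item: item[1], reverse=True):
    print("%-20s %10.1f" % (name, count))
```

Every nutrient except one is satisfied within a few dozen portions, so the parfait's place on this list is driven almost entirely by the single nutrient it barely contains.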

In [8]:
def findOveragesCal(food):
    for x in range(0,len(calnutrients)):
        nutrient = calnutrients[x]
        rec = recommended[x]
        if(rec == -1):
            continue
        food[nutrient] -= rec
    return food   
caloverage_foods = calweighted_foods.apply(findOveragesCal,axis=1)
def findPercentOveragesCal(food):
    for x in range(0,len(calnutrients)):
        nutrient = calnutrients[x]
        rec = recommended[x]
        if(rec == -1):
            continue
        food[nutrient] = ((food[nutrient]/rec)*100)-100
    return food   
caloverage_foods = calweighted_foods.apply(findPercentOveragesCal,axis=1)
caloverage_foods[display].sort_values(by='Calories',ascending=False).head(10).rename(index=str,columns={'Protein (g)':'Protein (%)','Total Fat(g)':'Total Fat(%)','Cholesterol (mg)':'Cholesterol (%)',
               'Saturated Fat (g)':'Saturated Fat (%)','Sodium (mg)':'Sodium (%)','Potassium (mg)':'Potassium (%)',
               'Carbohydrates (g)':'Carbohydrates (%)','Fiber (g)':'Fiber (%)'})
df = caloverage_foods.sort_values(by='Calories',ascending=False).head(10).rename(index=str,columns={'Protein (g)':'Protein (%)','Total Fat(g)':'Total Fat(%)','Cholesterol (mg)':'Cholesterol (%)',
               'Saturated Fat (g)':'Saturated Fat (%)','Sodium (mg)':'Sodium (%)','Potassium (mg)':'Potassium (%)',
               'Carbohydrates (g)':'Carbohydrates (%)','Fiber (g)':'Fiber (%)'})
a = Bar(df, label='vars',group='Name', 
        values=blend('Protein (%)', 'Total Fat(%)','Cholesterol (%)',
                     'Saturated Fat (%)','Sodium (%)','Potassium (%)',
                     'Carbohydrates (%)','Fiber (%)',name='values', labels_name='vars'),
        title="Excess Nutrients (% above recommended daily intake)",width=900,palette=palettes.BrBG11)
a.xaxis.axis_label = ""
a.yaxis.axis_label = "% above recommended daily intake"
show(a)

Pork sausage is far and away the winner here, destroying the competition in five of the eight nutrients we're looking at. This large scaling is due to its lack of carbohydrates and fiber, which accentuates its wealth of Saturated Fat, Protein, and Sodium.

Overall though, these foods are much lower in Sodium, Potassium, and Protein than our best foods.

### Looking At Everything

After looking at the best and worst foods, we wanted a better look at the trends across all our foods, so we took a look at the average overages across our entire corpus of data.


In [9]:
avg_nutrients = caloverage_foods.rename(index=str,columns={'Protein (g)':'Protein (%)','Total Fat(g)':'Total Fat(%)','Cholesterol (mg)':'Cholesterol (%)',
               'Saturated Fat (g)':'Saturated Fat (%)','Sodium (mg)':'Sodium (%)','Potassium (mg)':'Potassium (%)',
               'Carbohydrates (g)':'Carbohydrates (%)','Fiber (g)':'Fiber (%)'}).mean()
avg_nutrients = avg_nutrients.drop("Weight (g)")
avg_nutrients = avg_nutrients.drop("Calories")
p = Bar(avg_nutrients)
show(p)

So over our entire data set, we're seeing very high levels of Protein, Fats, and Sodium compared to other nutrients. Surprisingly, the lowest average was cholesterol, which indicates that many foods have a healthy proportion of cholesterol overall.

We also observed similar trends when looking at the median of our data, so these averages are not just the product of a few outliers.
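
Comparing the two statistics takes only a couple of lines. A minimal sketch, assuming a small hypothetical frame of percent overages (the numbers are made up for illustration and are not taken from our data set):

```python
import pandas as pd

# Hypothetical percent overages for four foods, for illustration only.
overages = pd.DataFrame({
    "Protein (%)": [200.0, 56.0, 177.0, 140.0],
    "Sodium (%)": [22.0, 421.0, 299.0, 596.0],
    "Cholesterol (%)": [0.0, 75.0, 28.0, 67.0],
})

# Put the mean and the median side by side for each nutrient.
summary = pd.DataFrame({
    "mean": overages.mean(),
    "median": overages.median(),
})

# If both statistics rank the nutrients in the same order, the trend is
# unlikely to be driven by a handful of extreme foods.
same_ranking = (list(summary["mean"].sort_values().index)
                == list(summary["median"].sort_values().index))

print(summary)
print("Same ranking:", same_ranking)
```

The median is worth checking here because foods like the ones on the worst list are scaled up by several orders of magnitude and can pull a mean far away from the typical food.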

## Final Analysis

## Appendix: What about grams?
Throughout this entire notebook, we looked at our food through the lens of calories, but we also had a few interesting findings when we looked at which foods meet nutritional guidelines for the least amount of weight.

In [10]:
calweighted_foods[display].sort_values(by='Weight (g)').head(10)

NDB_No,Name,Calories,Weight (g),Protein (g),Total Fat(g),Cholesterol (mg),Saturated Fat (g),Sodium (mg),Potassium (mg),Carbohydrates (g),Fiber (g)
11672,POTATO PANCAKES,2891.046386,1078.748652,65.587918,159.223301,1024.811219,26.925566,8241.639698,6709.816613,300.0,35.598706
6128,"SOUP,CHICK NOODLE,DRY,MIX",4312.091503,1143.79085,176.372549,74.460784,846.405229,22.521242,41668.300654,3500.0,712.810458,36.601307
21604,"SCHOOL LUNCH,PIZZA,SSAGE TOPPING,THIN CRST,WHL...",3000.0,1200.0,151.56,93.96,300.0,41.856,5484.0,4620.0,386.88,46.8
21605,"SCHOOL LUNCH,PIZZA,SAUSAGE TOP,THICK CRUST,WHL...",3189.716312,1241.134752,164.574468,112.570922,322.695035,52.363475,5374.113475,3500.0,379.539007,49.64539
6959,"GRAVY,INST TURKEY,DRY",5112.5,1250.0,146.5,183.25,300.0,61.6375,51125.0,3825.0,719.5,47.5
21602,"SCHOOL LUNCH,PIZZA,PEPPI TOPPING,THIN CRUST,WH...",3175.0,1250.0,159.75,107.75,300.0,35.125,6187.5,4775.0,390.5,53.75
21603,"SCHOOL LUNCH,PIZZA,PEPP TOPPING,THICK CRUST,WH...",3237.5,1250.0,179.375,122.75,300.0,52.7375,5925.0,3850.0,353.75,53.75
18300,"PANCAKES,WHOLE-WHEAT,DRY MIX,INCOMPLETE,PREP",2609.318996,1254.480287,106.630824,81.541219,765.232975,21.94086,7175.62724,3500.0,368.817204,35.125448
43260,"BEVERAGE,INST BRKFST PDR,CHOC,SUGAR-FREE,NOT R...",4562.745098,1274.509804,456.27451,65.0,560.784314,27.554902,9138.235294,21730.392157,522.54902,25.490196
21478,"DIGIORNO PIZZA,PEPPERONI TOPPING,THIN CRISPY C...",3695.895522,1305.970149,172.649254,167.947761,365.671642,64.945896,8658.58209,3500.0,374.421642,36.567164


There are a lot of pancakes, pizza, and school lunches on the list.
To put some of these foods into perspective:

* Potato Pancakes: 1078 grams = 49 pancakes
* Chicken Noodle Soup: 1143 grams = 15 packets
* DiGiorno Thin Crust Pizza: 1305 grams = 2.4 pizzas
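
The conversion behind these figures is just total weight divided by the weight of one unit. A minimal sketch of that arithmetic follows; the unit weights are assumptions back-derived from the counts in the list above, not values read from the data set:

```python
# (total grams from the list above, assumed grams per unit, unit name)
# The per-unit weights are assumptions implied by the counts in the list.
items = {
    "Potato Pancakes": (1078.0, 22.0, "pancakes"),
    "Chicken Noodle Soup": (1143.0, 76.2, "packets"),
    "DiGiorno Thin Crust Pizza": (1305.0, 544.0, "pizzas"),
}

# Units needed = total weight / weight of a single unit.
units_needed = {
    name: round(total_g / unit_g, 1)
    for name, (total_g, unit_g, _) in items.items()
}

for name, (total_g, unit_g, unit_name) in items.items():
    print("%s: %d grams = %s %s"
          % (name, total_g, units_needed[name], unit_name))
```

With real serving weights, such as the `GmWt_1` column in the raw file, the same division would give counts for any food on the list.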

These are usually the foods that people buy when they need to feed a lot of people, but what's interesting is that the companies seem to be maximizing the nutritional content of these foods for the least amount of weight possible.


In [13]:
df = caloverage_foods.sort_values(by='Weight (g)').head(10).rename(index=str,columns={'Protein (g)':'Protein (%)','Total Fat(g)':'Total Fat(%)','Cholesterol (mg)':'Cholesterol (%)',
               'Saturated Fat (g)':'Saturated Fat (%)','Sodium (mg)':'Sodium (%)','Potassium (mg)':'Potassium (%)',
               'Carbohydrates (g)':'Carbohydrates (%)','Fiber (g)':'Fiber (%)'})
a = Bar(df, label='vars',group='Name', 
        values=blend('Protein (%)', 'Total Fat(%)','Cholesterol (%)',
                     'Saturated Fat (%)','Sodium (%)','Potassium (%)',
                     'Carbohydrates (%)','Fiber (%)',name='values', labels_name='vars'),
        title="Excess Nutrients (% above recommended daily intake)",width=900)
a.xaxis.axis_label = ""
a.yaxis.axis_label = "% above recommended daily intake"
show(a)

When you look at the percent overages on the foods in the above graph, it's clear that they run into the same problems as most of the foods in our data set with high levels of Sodium, Protein, and Fats.