## USDA Food Info Dataset
- This notebook explores a dataset from the United States Department of Agriculture (USDA).
- This dataset contains nutritional information on the most common foods Americans consume.
- Each column in the dataset shows a different attribute of the foods and each row describes a different food item.

### Selected columns in the dataset:
* NDB_No - unique id of the food.
* Shrt_Desc - name of the food.
* Water_(g) - water content in grams.
* Energ_Kcal - energy measured in kilocalories (kcal).
* Protein_(g) - protein measured in grams.
* Cholestrl_(mg) - cholesterol in milligrams.

In [14]:
# read the dataset
import pandas as pd
food_info = pd.read_csv('food_info.csv')
food_info.head(2)

,NDB_No,Shrt_Desc,Water_(g),Energ_Kcal,Protein_(g),Lipid_Tot_(g),Ash_(g),Carbohydrt_(g),Fiber_TD_(g),Sugar_Tot_(g),...,Vit_A_IU,Vit_A_RAE,Vit_E_(mg),Vit_D_mcg,Vit_D_IU,Vit_K_(mcg),FA_Sat_(g),FA_Mono_(g),FA_Poly_(g),Cholestrl_(mg)
0,1001,BUTTER WITH SALT,15.87,717,0.85,81.11,2.11,0.06,0.0,0.06,...,2499.0,684.0,2.32,1.5,60.0,7.0,51.368,21.021,3.043,215.0
1,1002,BUTTER WHIPPED WITH SALT,15.87,717,0.85,81.11,2.11,0.06,0.0,0.06,...,2499.0,684.0,2.32,1.5,60.0,7.0,50.489,23.426,3.012,219.0


In [15]:
food_info.shape

(8618, 36)
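The tuple returned by `shape` is ordered as (rows, columns), so the output above means 8618 food items and 36 attributes. A minimal sketch of unpacking it, assuming a small stand-in DataFrame named `sample` (hypothetical, built from the two rows shown earlier rather than read from `food_info.csv`):

```python
import pandas as pd

# Hypothetical stand-in for food_info: two rows, three of the columns shown above.
sample = pd.DataFrame({
    "NDB_No": [1001, 1002],
    "Shrt_Desc": ["BUTTER WITH SALT", "BUTTER WHIPPED WITH SALT"],
    "Energ_Kcal": [717, 717],
})

# shape is a (rows, columns) tuple, so it can be unpacked directly.
n_rows, n_cols = sample.shape
print(n_rows)
print(n_cols)
```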

In [16]:
col_names = food_info.columns.tolist()  # return a list containing only the column names
print(col_names)

['NDB_No', 'Shrt_Desc', 'Water_(g)', 'Energ_Kcal', 'Protein_(g)', 'Lipid_Tot_(g)', 'Ash_(g)', 'Carbohydrt_(g)', 'Fiber_TD_(g)', 'Sugar_Tot_(g)', 'Calcium_(mg)', 'Iron_(mg)', 'Magnesium_(mg)', 'Phosphorus_(mg)', 'Potassium_(mg)', 'Sodium_(mg)', 'Zinc_(mg)', 'Copper_(mg)', 'Manganese_(mg)', 'Selenium_(mcg)', 'Vit_C_(mg)', 'Thiamin_(mg)', 'Riboflavin_(mg)', 'Niacin_(mg)', 'Vit_B6_(mg)', 'Vit_B12_(mcg)', 'Vit_A_IU', 'Vit_A_RAE', 'Vit_E_(mg)', 'Vit_D_mcg', 'Vit_D_IU', 'Vit_K_(mcg)', 'FA_Sat_(g)', 'FA_Mono_(g)', 'FA_Poly_(g)', 'Cholestrl_(mg)']
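Because `col_names` is a plain Python list, ordinary list operations apply to it. The sketch below filters for columns measured in grams and checks whether a column exists; it uses only a handful of the names printed above, and `gram_columns` and `has_sodium` are illustrative names, not part of the original notebook.

```python
# A few of the column names printed above (not the full list of 36).
col_names = ["NDB_No", "Shrt_Desc", "Water_(g)", "Energ_Kcal",
             "Protein_(g)", "Cholestrl_(mg)"]

# Keep only the columns whose name ends with the gram unit suffix.
gram_columns = [name for name in col_names if name.endswith("(g)")]
print(gram_columns)

# Membership test: is a given column present in this shortened list?
has_sodium = "Sodium_(mg)" in col_names
print(has_sodium)
```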


In [17]:
zinc_copper = food_info[['Zinc_(mg)', 'Copper_(mg)']]  # double brackets: pass a list of column names to get a DataFrame
print(zinc_copper.head(2))

   Zinc_(mg)  Copper_(mg)
0       0.09        0.000
1       0.05        0.016
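The difference between single and double brackets can be seen in a small sketch. It assumes a stand-in DataFrame named `sample` (hypothetical) holding the zinc and copper values printed above, so it runs without `food_info.csv`.

```python
import pandas as pd

# Hypothetical stand-in for food_info, using the two rows printed above.
sample = pd.DataFrame({
    "Shrt_Desc": ["BUTTER WITH SALT", "BUTTER WHIPPED WITH SALT"],
    "Zinc_(mg)": [0.09, 0.05],
    "Copper_(mg)": [0.000, 0.016],
})

zinc_series = sample["Zinc_(mg)"]    # single brackets with a string -> Series
zinc_frame = sample[["Zinc_(mg)"]]   # double brackets with a list -> one-column DataFrame
zinc_copper = sample[["Zinc_(mg)", "Copper_(mg)"]]  # list of two names -> two-column DataFrame

print(type(zinc_series).__name__)
print(type(zinc_frame).__name__)
print(zinc_copper.shape)
```

The outer brackets are the indexing operator and the inner brackets build a Python list, which is why selecting several columns needs both.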


In [18]:
# Performing Math With Multiple Columns
water_energy = food_info["Water_(g)"] * food_info["Energ_Kcal"]
print(water_energy[0:5])

0    11378.79
1    11378.79
2      210.24
3    14970.73
4    15251.81
dtype: float64
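Arithmetic between two columns is applied row by row, and arithmetic with a scalar is applied to every row. A minimal sketch, assuming a stand-in DataFrame named `sample` (hypothetical) with the water and energy values of the first two rows shown earlier; the column names `Water_(mg)` and `Water_x_Energ` are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for food_info, using values from the first two rows.
sample = pd.DataFrame({
    "Water_(g)": [15.87, 15.87],
    "Energ_Kcal": [717, 717],
})

# Column * column: values are multiplied pairwise, matched by row label.
water_energy = sample["Water_(g)"] * sample["Energ_Kcal"]

# Column * scalar: the scalar is applied to every row (grams to milligrams).
sample["Water_(mg)"] = sample["Water_(g)"] * 1000

# The result of column arithmetic is a Series and can be stored as a new column.
sample["Water_x_Energ"] = water_energy

print(water_energy.round(2).tolist())
print(list(sample.columns))
```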


In [19]:
# Sorting A DataFrame By A Column

In [20]:
# Sort the DataFrame in place (ascending by default); with inplace=True the call returns None instead of a new DataFrame.
food_info.sort_values("Sodium_(mg)", inplace=True)

# Sort in place again, this time in descending order, so the highest-sodium foods come first.
food_info.sort_values("Sodium_(mg)", inplace=True, ascending=False)
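As an alternative to `inplace=True`, `sort_values` can return a new, sorted DataFrame and leave the original untouched. The sketch below assumes a stand-in DataFrame named `sample` whose food names and sodium values are made up for illustration; they are not taken from the USDA data.

```python
import pandas as pd

# Hypothetical data: names and sodium values are illustrative only.
sample = pd.DataFrame({
    "Shrt_Desc": ["FOOD A", "FOOD B", "FOOD C"],
    "Sodium_(mg)": [643.0, 11.0, 1500.0],
})

# Without inplace=True, sort_values returns a sorted copy.
by_sodium = sample.sort_values("Sodium_(mg)", ascending=False)

print(sample["Shrt_Desc"].tolist())     # original row order is unchanged
print(by_sodium["Shrt_Desc"].tolist())  # sorted from highest to lowest sodium
print(by_sodium.index.tolist())         # rows keep their original index labels

# reset_index(drop=True) renumbers the rows 0..n-1 after sorting.
renumbered = by_sodium.reset_index(drop=True)
print(renumbered.index.tolist())
```

Note that sorting moves the index labels along with the rows, so positional access (`iloc`) and label access (`loc`) can give different rows on a sorted DataFrame unless the index is reset.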