# Catinfo.org cat food nutritional data

**Source:** https://catinfo.org/chart/index.php

**Description:**

> Typical nutrient analysis data provided by the respective companies 

**Files:**

- `foods.csv`: A list of cat foods

### 1. Open `foods.csv`

In [1]:
import pandas as pd
df = pd.read_csv("foods.csv")
df


Unnamed: 0,Company,Variety,Protein,Fat,Carbs,Phos,Notes
0,4HEALTH,Turkey/Salmon,31%,62%,7%,273mg,
1,4HEALTH,Grain-Free Turkey/Giblets,29%,70%,1%,274mg,
2,4HEALTH,Chicken/Beef,33%,59%,8%,278mg,
3,4HEALTH,Grain-Free Salmon in Gravy,36%,57%,7%,278mg,
4,4HEALTH,Grain-Free Chicken/Whitefish,30%,70%,0%,287mg,
...,...,...,...,...,...,...,...
1163,ZIWIPEAK,Venison,35%,62%,3%,252mg,
1164,ZIWIPEAK,Lamb,32%,63%,6%,254mg,
1165,ZIWIPEAK,Venison & Fish,36%,59%,5%,260mg,
1166,ZIWIPEAK,Rabbit & Lamb,35%,62%,3%,271mg,


### 2. Convert Protein, Fat, Carbs and Phosphorus columns to numbers

This will be four separate lines of code.

In [9]:
# Protein: strip the percent sign, then convert to float
df['Protein'] = df['Protein'].str.replace("%", "").astype(float)

In [12]:
# Fat
df['Fat'] = df['Fat'].str.replace("%", "").astype(float)

In [13]:
# Carbs
df['Carbs'] = df['Carbs'].str.replace("%", "").astype(float)

In [14]:
# Phos: values carry an "mg" suffix instead of "%"
df['Phos'] = df['Phos'].str.replace("mg", "").astype(float)

In [15]:
df

Unnamed: 0,Company,Variety,Protein,Fat,Carbs,Phos,Notes
0,4HEALTH,Turkey/Salmon,31.0,62.0,7.0,273.0,
1,4HEALTH,Grain-Free Turkey/Giblets,29.0,70.0,1.0,274.0,
2,4HEALTH,Chicken/Beef,33.0,59.0,8.0,278.0,
3,4HEALTH,Grain-Free Salmon in Gravy,36.0,57.0,7.0,278.0,
4,4HEALTH,Grain-Free Chicken/Whitefish,30.0,70.0,0.0,287.0,
...,...,...,...,...,...,...,...
1163,ZIWIPEAK,Venison,35.0,62.0,3.0,252.0,
1164,ZIWIPEAK,Lamb,32.0,63.0,6.0,254.0,
1165,ZIWIPEAK,Venison & Fish,36.0,59.0,5.0,260.0,
1166,ZIWIPEAK,Rabbit & Lamb,35.0,62.0,3.0,271.0,
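
The four conversions follow one pattern: strip a unit suffix, then parse the remainder as a number. As a minimal sketch of that pattern, the loop below runs it over a small made-up frame. The `toy` data and the `suffixes` mapping are invented for illustration and are not taken from `foods.csv`.

```python
import pandas as pd

# Toy stand-in for foods.csv; these values are invented for illustration.
toy = pd.DataFrame({
    "Protein": ["31%", "29%", None],
    "Phos": ["273mg", None, "287mg"],
})

# Hypothetical helper mapping: column name -> unit suffix to strip.
suffixes = {"Protein": "%", "Phos": "mg"}

for column, suffix in suffixes.items():
    # regex=False treats the suffix as a literal string.
    stripped = toy[column].str.replace(suffix, "", regex=False)
    # errors="coerce" turns anything unparseable into NaN instead of raising.
    toy[column] = pd.to_numeric(stripped, errors="coerce")

print(toy.dtypes)
```

Missing values stay missing after the conversion, which is usually what you want before computing means or medians.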


### 3. Does a cat food advertised as having gravy have a higher or lower amount of carbohydrates than the average can of cat food?

One command will summarize the carbohydrate content across all cans, one will summarize carbohydrate content of varieties that include gravy.

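In [38]:
# Mean carbs for varieties whose name mentions gravy
# (na=False treats a missing Variety as "no gravy")
df[df['Variety'].str.contains("Gravy", case=False, na=False)]['Carbs'].mean()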

In [24]:
df['Carbs'].median()

10.0

In [25]:
df['Carbs'].mean()

10.465753424657533
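
The boolean Series returned by `.str.contains` is most useful as a row filter: select the matching rows first, then summarize `Carbs` for just those rows. Here is a small sketch on made-up data (the `toy` frame and its values are assumptions for illustration only):

```python
import pandas as pd

# Invented rows; not taken from foods.csv.
toy = pd.DataFrame({
    "Variety": ["Chicken in Gravy", "Turkey Pate", "Beef Gravy Lovers", None],
    "Carbs": [12.0, 4.0, 16.0, 8.0],
})

# True where the variety name mentions gravy; missing names count as False.
has_gravy = toy["Variety"].str.contains("gravy", case=False, na=False)

overall_mean = toy["Carbs"].mean()               # all four rows
gravy_mean = toy.loc[has_gravy, "Carbs"].mean()  # only the two gravy rows

print(overall_mean, gravy_mean)
```

Comparing the two numbers answers the "higher or lower" question directly.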

### 4. How does the carbohydrate content of grain-free cat food compare to normal cat food?

You need to include both "Grain Free" and "Grain-Free".

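In [37]:
# Flag varieties labeled "Grain Free" or "Grain-Free", then compare mean carbs
grain_free = df['Variety'].str.contains('Grain[ -]Free', case=False, na=False)
df.groupby(grain_free)['Carbs'].mean()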
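
One way to catch both spellings is a single regular expression in which `[ -]` matches either a space or a hyphen. The sketch below tries it on made-up rows; the `toy` frame and the group labels are assumptions chosen for illustration.

```python
import pandas as pd

# Invented rows; not taken from foods.csv.
toy = pd.DataFrame({
    "Variety": ["Grain Free Chicken", "Grain-Free Salmon", "Tuna Dinner", "Beef Feast"],
    "Carbs": [2.0, 4.0, 10.0, 14.0],
})

# "[ -]" matches either a space or a hyphen between the two words.
is_grain_free = toy["Variety"].str.contains(r"Grain[ -]Free", na=False)

# Readable group labels instead of True/False.
labels = is_grain_free.map({True: "grain-free", False: "other"})

carbs_by_label = toy.groupby(labels)["Carbs"].mean()
print(carbs_by_label)
```

Grouping by the flag gives both averages in one result, so the two kinds of food can be compared side by side.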

### 5. Let's see some non-NaN entries from the notes column. If you can, please make it so we can read the *entire note*

There aren't notes in the first 5 or last 5, so you'll need to filter. I don't need to see all of them, just a few.

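In [None]:
# Notes: lift the column-width limit so each note prints in full,
# then show a few rows that actually have a note
pd.set_option('display.max_colwidth', None)
df[df['Notes'].notna()].head()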


### 6. Clean up the "Company" column

The ` - ` and everything after it should be removed, and saved in a new column called `company_cleaned`. The cleaning should work like this:

|Company|company_cleaned|
|---|---|
|9Lives - Hearty Cuts in Gravy|9Lives|
|DAVE's - Restricted Diet|DAVE's|
|FANCY FEAST - Delights with Cheddar|FANCY FEAST|

In [43]:
# Split on ' - ' (with spaces) so hyphenated names stay intact and no trailing space is left
df['company_cleaned'] = df['Company'].str.split(' - ').str[0]
df

Unnamed: 0,Company,Variety,Protein,Fat,Carbs,Phos,Notes,company_cleaned
0,4HEALTH,Turkey/Salmon,31.0,62.0,7.0,273.0,,4HEALTH
1,4HEALTH,Grain-Free Turkey/Giblets,29.0,70.0,1.0,274.0,,4HEALTH
2,4HEALTH,Chicken/Beef,33.0,59.0,8.0,278.0,,4HEALTH
3,4HEALTH,Grain-Free Salmon in Gravy,36.0,57.0,7.0,278.0,,4HEALTH
4,4HEALTH,Grain-Free Chicken/Whitefish,30.0,70.0,0.0,287.0,,4HEALTH
...,...,...,...,...,...,...,...,...
1163,ZIWIPEAK,Venison,35.0,62.0,3.0,252.0,,ZIWIPEAK
1164,ZIWIPEAK,Lamb,32.0,63.0,6.0,254.0,,ZIWIPEAK
1165,ZIWIPEAK,Venison & Fish,36.0,59.0,5.0,260.0,,ZIWIPEAK
1166,ZIWIPEAK,Rabbit & Lamb,35.0,62.0,3.0,271.0,,ZIWIPEAK
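
The choice of delimiter matters here. As a quick check against the examples in the table above, this sketch compares splitting on a bare `-` with splitting on ` - ` (hyphen surrounded by spaces). The `4HEALTH` entry is added only to show that names without a separator pass through unchanged.

```python
import pandas as pd

companies = pd.Series([
    "9Lives - Hearty Cuts in Gravy",
    "DAVE's - Restricted Diet",
    "FANCY FEAST - Delights with Cheddar",
    "4HEALTH",
])

# A bare hyphen leaves the space that preceded it.
on_hyphen = companies.str.split("-").str[0]

# Splitting on the full " - " separator removes that space as well.
on_spaced_hyphen = companies.str.split(" - ").str[0]

print(on_hyphen.tolist())
print(on_spaced_hyphen.tolist())
```

A bare hyphen would also cut any company name that itself contains a hyphen, so the spaced separator is the safer match for the rule stated above.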


### 7. Fix the ALWAYS-YELLING CASE ISSUE

I don't like how many of the companies are just SCREAMED OUT ALL-CAPS NAMES. Let's convert them to title case.

Title case in normal Python works like this:

In [68]:
"THIS IS ALL CAPS".title()

'This Is All Caps'

You can't just use plain `.title()` to fix up `company_cleaned`, but I'm sure you can figure out its pandas equivalent...

In [49]:
df['company_cleaned'] = df['company_cleaned'].str.title()
df

Unnamed: 0,Company,Variety,Protein,Fat,Carbs,Phos,Notes,company_cleaned
0,4HEALTH,Turkey/Salmon,31.0,62.0,7.0,273.0,,4Health
1,4HEALTH,Grain-Free Turkey/Giblets,29.0,70.0,1.0,274.0,,4Health
2,4HEALTH,Chicken/Beef,33.0,59.0,8.0,278.0,,4Health
3,4HEALTH,Grain-Free Salmon in Gravy,36.0,57.0,7.0,278.0,,4Health
4,4HEALTH,Grain-Free Chicken/Whitefish,30.0,70.0,0.0,287.0,,4Health
...,...,...,...,...,...,...,...,...
1163,ZIWIPEAK,Venison,35.0,62.0,3.0,252.0,,Ziwipeak
1164,ZIWIPEAK,Lamb,32.0,63.0,6.0,254.0,,Ziwipeak
1165,ZIWIPEAK,Venison & Fish,36.0,59.0,5.0,260.0,,Ziwipeak
1166,ZIWIPEAK,Rabbit & Lamb,35.0,62.0,3.0,271.0,,Ziwipeak
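
The `.str` accessor applies the string method to each element and lets missing values pass through. A small sketch on made-up names (the `names` Series is an assumption for illustration) also shows one quirk worth knowing: title casing capitalizes a letter that follows an apostrophe.

```python
import pandas as pd

# Invented sample of company names, including a missing value.
names = pd.Series(["4HEALTH", "FANCY FEAST", "DAVE's", None])

# Element-wise equivalent of Python's str.title().
titled = names.str.title()

print(titled.tolist())
```

If names such as `DAVE's` matter for your analysis, it may be worth checking them by eye after the conversion.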


### 8. What is the average protein content for each company's wet food?

You can use `figsize=(4, 15)` when plotting to make it look a little nicer.

Bonus points if you drop the ones missing protein content, but we haven't talked about that in a long time.

In [61]:
# Drop rows missing protein, then plot the mean protein per company
protein_by_company = df.dropna(subset=['Protein']).groupby('company_cleaned')['Protein'].mean()
protein_by_company.sort_values().plot(kind='barh', figsize=(4, 15))
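
`groupby` expects a column name or a Series with one label per row, not a single number. As a minimal sketch of the per-company average, the example below uses made-up rows; the `toy` frame is an assumption for illustration, and the plotting call is left as a comment so the snippet runs without matplotlib.

```python
import pandas as pd

# Invented rows; not taken from foods.csv.
toy = pd.DataFrame({
    "company_cleaned": ["4Health", "4Health", "Ziwipeak", "Ziwipeak", "Fancy Feast"],
    "Protein": [31.0, 29.0, 35.0, None, None],
})

# Drop rows with no protein value before grouping.
with_protein = toy.dropna(subset=["Protein"])

# One mean per company, sorted from lowest to highest.
mean_protein = with_protein.groupby("company_cleaned")["Protein"].mean().sort_values()
print(mean_protein)

# mean_protein.plot(kind="barh", figsize=(4, 15)) would chart it (needs matplotlib).
```

Dropping the missing rows first means a company with no protein values at all simply does not appear, rather than showing up as an empty bar.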