# Reading Data and Simple Manipulations

Data scientists rely heavily on the `pandas` library.  We tend to think of it as
our main workhorse for loading, inspecting, and summarizing tabular data.

In Python, we use the `import` statement to get access to libraries.  The following line
imports the `pandas` library and lets us refer to it as `pd` (data scientists are lazy and type poorly):

In [None]:
import pandas as pd

## Reading data

Next, let's read a CSV file into a DataFrame (which is basically a table).
This particular file contains nutritional information for the McDonald's menu.  It was released by McDonald's in 2015 and is
available at https://www.kaggle.com/mcdonalds/nutrition-facts

In [None]:
mcdonalds = pd.read_csv("menu.csv")
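
To see what `read_csv` does without needing `menu.csv` on disk, here is a minimal sketch that parses CSV text held in memory.  The rows below are made up for illustration and are not taken from the real file; only the general idea (the header row becomes the column names, and numeric columns are parsed as numbers) carries over.

```python
import io

import pandas as pd

# Hypothetical CSV text; these rows are invented for illustration and are
# not taken from menu.csv.
csv_text = """Category,Item,Calories
Breakfast,Example Muffin,300
Salads,Example Salad,140
Desserts,Example Cone,170
"""

# read_csv accepts a file path or any file-like object.
example = pd.read_csv(io.StringIO(csv_text))

print(example.shape)          # (rows, columns)
print(list(example.columns))  # column names come from the header row
print(example.dtypes)         # Calories is parsed as an integer column
```

If `pd.read_csv("menu.csv")` raises `FileNotFoundError`, the file is probably not in the notebook's working directory.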

Running the next block will "evaluate" the DataFrame.  In a notebook, that usually means
displaying it as the cell's output:

In [None]:
mcdonalds

## Simple manipulations

We can look at the different parts of a DataFrame: its column names, its shape (rows, columns),
its first and last few rows, and a single column:

In [None]:
mcdonalds.columns

In [None]:
mcdonalds.shape

In [None]:
mcdonalds.head()

In [None]:
mcdonalds.tail()

In [None]:
mcdonalds['Calories']
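
Selecting with a single column name, as above, returns a Series; selecting with a list of names returns a smaller DataFrame.  The sketch below shows the difference on a toy table whose items and values are hypothetical stand-ins, not real menu data.

```python
import pandas as pd

# Toy table with made-up values, used only to illustrate selection.
toy = pd.DataFrame({
    "Item": ["A", "B", "C"],
    "Calories": [300, 140, 170],
    "Sugars": [3, 6, 20],
})

one_column = toy["Calories"]             # single name -> Series
two_columns = toy[["Item", "Calories"]]  # list of names -> DataFrame

print(type(one_column).__name__)
print(type(two_columns).__name__)
print(two_columns.shape)
```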

## Simple statistics

In [None]:
mcdonalds['Calories'].mean()

In [None]:
mcdonalds['Calories'].sum()

In [None]:
mcdonalds['Calories'].count()

In [None]:
mcdonalds.describe()
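
These statistics are related: the mean is the sum divided by the count, and `count()` only counts non-missing values.  Here is a small sketch on a made-up Series (the numbers are hypothetical, and the missing value is included on purpose) to show how they fit together.

```python
import numpy as np
import pandas as pd

# Made-up values with one missing entry.
calories = pd.Series([300, 140, np.nan, 170], name="Calories")

total = calories.sum()   # missing values are skipped by default
n = calories.count()     # number of non-missing values
average = calories.mean()

print(len(calories), n)               # total length vs. non-missing count
print(abs(total / n - average) < 1e-9)
```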

## Split-apply-combine

In [None]:
mcdonalds['Category'].value_counts()

In [None]:
mcdonalds[mcdonalds["Category"] == "Salads"]

In [None]:
mcdonalds.query("Category == 'Salads'")

In [None]:
mcdonalds.groupby('Category')['Calories'].mean()
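
The `groupby` line above is the split-apply-combine pattern in one expression: split the rows by category, apply a function (here the mean) to each piece, and combine the results.  The sketch below spells out the three steps by hand on a toy table with made-up values and checks that they agree with `groupby`.

```python
import pandas as pd

# Toy table with made-up values.
toy = pd.DataFrame({
    "Category": ["Salads", "Salads", "Desserts", "Desserts", "Desserts"],
    "Calories": [140, 220, 170, 330, 250],
})

# Split: one sub-table per category.
pieces = {name: group for name, group in toy.groupby("Category")}

# Apply: compute the mean of Calories within each piece.
means = {name: group["Calories"].mean() for name, group in pieces.items()}

# Combine: gather the per-group results into a single Series.
by_hand = pd.Series(means, name="Calories").sort_index()

# groupby performs all three steps in one expression.
with_groupby = toy.groupby("Category")["Calories"].mean()

print(by_hand)
print(by_hand.to_dict() == with_groupby.to_dict())
```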

### Example: Which menu item has the most calories?

In [None]:
# Solution: sort by calories, highest first; the top row is the answer
mcdonalds.sort_values('Calories', ascending=False)
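
Sorting the whole table works, but if we only want the top item there are more direct options.  The sketch below uses `idxmax` and `nlargest` on a toy table; the item names and calorie values are hypothetical, and it assumes the name column is called `Item`.

```python
import pandas as pd

# Toy table with made-up values; "Item" is an assumed column name.
toy = pd.DataFrame({
    "Item": ["Example Salad", "Example Burger", "Example Cone"],
    "Calories": [140, 540, 170],
})

# idxmax returns the index label of the largest value; loc fetches that row.
top_row = toy.loc[toy["Calories"].idxmax()]
print(top_row["Item"], top_row["Calories"])

# nlargest returns the top n rows, sorted in descending order.
top_two = toy.nlargest(2, "Calories")
print(top_two)
```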

### Challenge: Come up with a question that could be answered with this data set