In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [2]:
menu = pd.read_csv('https://raw.githubusercontent.com/umsi-data-science/data/main/menu.csv')

In [None]:
menu.head()

In [None]:
menu.columns

In [None]:
plt.plot(menu["Calories"],menu["Saturated Fat"],'ro')
plt.show()

## Using specific types of plots via pyplot

In addition to scatterplots, pyplot offers a number of other plot types.  These can be accessed via convenience functions such as ```scatter()```, ```hist()```, ```bar()```, ```barh()```, and ```pie()```, amongst others:

In [None]:
plt.scatter(menu["Calories"],menu["Saturated Fat"])
plt.show()

In [None]:
plt.hist(menu['Calories'])
plt.show()
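The other convenience functions follow the same pattern. As a minimal sketch, here is `plt.bar()` on a small stand-in DataFrame; the `toy` frame and its values are made up for illustration and are not the real menu data.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical stand-in for the menu data (values are made up)
toy = pd.DataFrame({
    "Category": ["Breakfast", "Breakfast", "Desserts",
                 "Beverages", "Beverages", "Beverages"],
    "Calories": [300, 450, 250, 150, 0, 210],
})

# Count the items per category, sorted alphabetically by category name
counts = toy["Category"].value_counts().sort_index()

# plt.bar() takes the bar labels (x) and the bar heights
bars = plt.bar(list(counts.index), counts.values)
plt.ylabel("Number of items")
```

In a notebook the figure renders when the cell finishes; in a script you would add `plt.show()` as in the cells above.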

## Pandas and matplotlib integration

Cumbersome?  Yes.  A better way?  Use the matplotlib integration from pandas:

In [None]:
f = menu['Calories'].plot(kind='hist')  # .plot() returns a matplotlib Axes
type(f)

Here are the valid values for "kind":

kind :
    - 'line' : line plot (default)
    - 'bar' : vertical bar plot
    - 'barh' : horizontal bar plot
    - 'hist' : histogram
    - 'box' : boxplot
    - 'kde' : Kernel Density Estimation plot
    - 'density' : same as 'kde'
    - 'area' : area plot
    - 'pie' : pie plot
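As a quick sketch of how `kind` is used, the cell below draws a box plot and a histogram from the same Series. The `calories` values are made up for illustration. Each kind is also available as a method on the `.plot` accessor, so `.plot.hist()` is equivalent to `.plot(kind='hist')`.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical calorie values (made up for illustration)
calories = pd.Series([150, 300, 450, 250, 0, 210, 1880], name="Calories")

# kind='box' draws a box plot and returns the Axes it drew on
ax_box = calories.plot(kind="box")

# Start a new figure so the next plot is not drawn over the box plot
plt.figure()

# Accessor form: same result as calories.plot(kind='hist', bins=5)
ax_hist = calories.plot.hist(bins=5)
```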

## Bar plots with groupby()

In [None]:
categories = menu.groupby('Category').size()

In [None]:
categories

In [None]:
categories.plot(kind='barh')

In [None]:
categories_sorted = categories.sort_values(ascending=True)

In [None]:
categories_sorted.plot(kind='barh')

### <font color="magenta">Q3: Create a new column in the menu DataFrame called "Sugary" whose value is 1 if the value of "Sugars" is greater than 20, and 0 otherwise.</font>

    Hint: use np.where(...)

In [None]:
menu['Sugary'] = np.where(menu['Sugars'] > 20, 1 , 0)
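To see what `np.where(condition, a, b)` does row by row, here is the same pattern on a three-row stand-in frame (the `toy` items and sugar values are made up). Note that the comparison is strict, so a value of exactly 20 maps to 0.

```python
import numpy as np
import pandas as pd

# Hypothetical items and sugar values (made up for illustration)
toy = pd.DataFrame({
    "Item": ["Coffee", "Apple Pie", "Milkshake"],
    "Sugars": [0, 20, 63],
})

# 1 where Sugars > 20, otherwise 0 (element-wise over the column)
toy["Sugary"] = np.where(toy["Sugars"] > 20, 1, 0)

print(toy["Sugary"].tolist())  # [0, 0, 1]
```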

## Create a stacked bar plot by using a 2-level groupby() followed by an unstack():

In [None]:
menu.groupby(["Category","Sugary"]).size().unstack().plot(kind = "bar")

In [None]:
menu.groupby(["Category","Sugary"]).size().unstack().plot(kind = "bar", stacked = True)

In [None]:
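# group_keys=False keeps the original (Category, Sugary) index, so unstack() pivots on Sugary
menu.groupby(['Category','Sugary']).size().groupby(level='Category', group_keys=False).apply(
    lambda x: 100 * x / x.sum()
).unstack().plot(kind='bar',stacked=True)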
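One possible alternative for the percentage version is `pd.crosstab()` with `normalize='index'`, which returns row proportions directly. This is a sketch on a made-up `toy` frame, not the approach used in the cells above.

```python
import pandas as pd

# Hypothetical categories and Sugary flags (made up for illustration)
toy = pd.DataFrame({
    "Category": ["Breakfast", "Breakfast", "Desserts",
                 "Desserts", "Desserts", "Beverages"],
    "Sugary": [0, 1, 1, 1, 0, 1],
})

# normalize='index' divides each row by its row total; scale to percent
pct = pd.crosstab(toy["Category"], toy["Sugary"], normalize="index") * 100

# Each bar now stacks to 100
ax = pct.plot(kind="bar", stacked=True)
```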

### Pie Charts

There are many issues with pie charts, and the one below is a good example of what not to do, but everyone wants to know how to make them:

In [None]:
categories.plot(kind='pie',title='Menu Categories')

## Subplots (again)

In addition to the way we used subplots in the previous class, we can use the ```plt.subplots()``` function to generate multiple plots within a figure.  ```subplots()``` returns the figure together with a set of axes on which we can make plots.

To demonstrate how this works, let's fill in just one of the subplots:


In [None]:
f, (ax1, ax2) = plt.subplots(2) # if only 1 argument, we assume it's the number of rows
ax1.hist(menu['Calories'])
plt.show()

Now let's fill in both subplots:

In [None]:
f, (ax1, ax2) = plt.subplots(2)
ax1.hist(menu['Calories'])
ax2.plot(menu['Calories'],menu['Total Fat'],'bo')
plt.show()

Now let's make a 2x2 layout of 4 plots.  Note the structure of the return values from the subplots function:

In [None]:
f, ((ax1, ax2),(ax3,ax4)) = plt.subplots(2,2)
ax1.hist(menu['Calories'])
ax2.plot(menu['Calories'],menu['Total Fat'],'bo')
plt.show()

In [None]:
f, ((ax1, ax2),(ax3,ax4)) = plt.subplots(2,2)
ax1.hist(menu['Total Fat'])
ax2.plot(menu['Calories'],menu['Total Fat'],'bo')
ax3.hist(menu['Dietary Fiber'])
ax4.plot(menu['Calories'],menu['Total Fat'],'bo')
plt.show()
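Instead of unpacking the nested tuples, we can keep the return value as a single array of axes. A minimal sketch: for a 2x2 layout, `plt.subplots` returns a 2-D NumPy array, which we can index by `[row, col]` or loop over with `.flat`.

```python
import matplotlib.pyplot as plt

# axes is a 2x2 NumPy array of Axes objects
fig, axes = plt.subplots(2, 2)
print(axes.shape)  # (2, 2)

# .flat iterates row by row: top-left, top-right, bottom-left, bottom-right
for i, ax in enumerate(axes.flat):
    ax.set_title(f"Panel {i + 1}")

# Adjust spacing so titles and tick labels do not overlap
fig.tight_layout()
```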

Alternatively, we can use the pandas-matplotlib integration.  Note the use of the ```ax=``` keyword argument to tell pandas which subplot to draw on.

In [None]:
f, (ax1, ax2) = plt.subplots(2)
menu['Calories'].plot(ax=ax1, kind='hist')
menu['Dietary Fiber'].plot(ax = ax2,kind='hist')
plt.show()
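pandas can also create the subplots for us. As a sketch (using a made-up `toy` frame), passing `subplots=True` to `DataFrame.plot` draws one subplot per column and returns the axes as an array.

```python
import pandas as pd

# Hypothetical values standing in for two menu columns (made up)
toy = pd.DataFrame({
    "Calories": [150, 300, 450, 250, 0, 210],
    "Dietary Fiber": [0, 4, 3, 2, 0, 1],
})

# One histogram per column, each on its own subplot
axes = toy.plot(kind="hist", subplots=True, bins=5)
```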

### xkcd style plots (just for fun)

Note that some ```plt.``` functions, such as ```plt.xkcd()```, return a context manager, so we can limit their effect to a block of code by using a ```with``` statement.

Note also that we can save a figure as a file by using ```savefig(...)``` (as shown below).

In [None]:
with plt.xkcd():
    # This figure will be in XKCD-style
    plt.plot(menu["Calories"],menu["Saturated Fat"],'ro')
    plt.title("McD's Food")
    plt.xlabel('Calories')
    plt.ylabel('Saturated Fat')
    plt.annotate("Don't eat this",xytext=(1200,15),xy=(1880,20),arrowprops=dict(facecolor='black', shrink=0.1))
    # Save before plt.show(); alternatively, we could save as a pdf, svg, ps, or eps
    plt.savefig('xkcd.png',format='png')
    plt.show()
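The same scoping idea works for ordinary style settings. The sketch below uses `plt.rc_context()` to change the line width only inside the `with` block, and saves the figure to a temporary folder (the file name `scoped.png` is just an example).

```python
import os
import tempfile
import matplotlib.pyplot as plt

# Remember the setting so we can compare afterwards
before = plt.rcParams["lines.linewidth"]

with plt.rc_context({"lines.linewidth": 4}):
    # Inside the block, new lines are drawn with width 4
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
    inside = plt.rcParams["lines.linewidth"]

    # Save the figure as a PNG in a temporary directory
    out_path = os.path.join(tempfile.mkdtemp(), "scoped.png")
    fig.savefig(out_path, format="png")

# Outside the block, the original setting is restored
after = plt.rcParams["lines.linewidth"]
```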