### Plotting Data with Pandas and Matplotlib.Pyplot

For common plot types and settings, pandas provides plotting methods that can be
called directly on a DataFrame. For more control, plots can also be built
manually with matplotlib.pyplot or with other libraries such as seaborn.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt

a) Read the file "LaborSupply1988.csv" into a pandas dataframe.

In [None]:
df = pd.read_csv("LaborSupply1988.csv")
df.head()

b) Plot a histogram of the attribute "age". What is the most frequent age?

In [None]:
# pandas DataFrames and Series have basic plotting functionality built in
df["age"].plot.hist(bins=15)
df["age"].mode()  # the mode is the most common value in a dataset
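
A histogram with 15 bins groups several ages into each bar, so the tallest bar only narrows the answer down to a range, while `mode()` gives the exact value. The following is a minimal sketch on a small made-up Series (the values are illustrative assumptions, not taken from LaborSupply1988.csv) showing that `mode()` returns a Series and can hold more than one value when there is a tie.

```python
import pandas as pd

# Hypothetical ages, not taken from the LaborSupply1988.csv data
ages = pd.Series([30, 30, 41, 41, 52], name="age")

# mode() always returns a Series; ties yield several values, sorted ascending
modes = ages.mode()
print(modes.tolist())  # [30, 41]

# value_counts() shows how often each value occurs
counts = ages.value_counts()
print(counts.loc[30], counts.loc[52])  # 2 1
```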

c) Plot the average number of "kids" against "age" and interpret the resulting graph.  
Compute the correlation between "kids" and "age" to check your interpretation.

In [None]:
df.groupby("age")["kids"].mean().plot(style=".")

corr = df[["kids", "age"]].corr()
print(corr)

*The correlation between age and kids is negative, meaning the average number of kids decreases with increasing age.*
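
`DataFrame.corr()` returns the full correlation matrix: the diagonal is always 1 and the off-diagonal entry is the value of interest. As a sketch, the same coefficient can be read off as a single number with `Series.corr()`. The small DataFrame `toy` below is made up for illustration (its values are assumptions, not the labor supply data).

```python
import pandas as pd

# Hypothetical toy data in which kids falls as age rises
toy = pd.DataFrame({"age": [25, 30, 35, 40, 45],
                    "kids": [3, 3, 2, 1, 1]})

# Full 2x2 Pearson correlation matrix
matrix = toy[["kids", "age"]].corr()

# The same coefficient as a single float
r = toy["kids"].corr(toy["age"])

print(matrix.loc["kids", "age"] < 0)  # True
```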

d) Plot "log of hourly wage (lnwg)" against "age".

In [None]:
df.plot(x="age", y="lnwg", style=".")

e) Plot the mean of "log of hourly wages (lnwg)" against "age".  
Compute and discuss the type of correlation between "lnwg" and "age".

In [None]:
# The grouped result is a Series indexed by age, so x and y need not be passed
df.groupby("age")["lnwg"].mean().plot(style=".")
corr = df[["age", "lnwg"]].corr()
print(corr)


f) Plot "lnhr" against "age" with different colors for "disab=0" and "disab=1".

In [None]:
# Map the disability flag to a color: red for disab == 0, blue otherwise
colors = df["disab"].apply(lambda d: "red" if d == 0 else "blue")
# A single scatter call with one color per point replaces the per-point loop
plt.scatter(df["age"], df["lnhr"], s=10, c=colors.tolist())
plt.xlabel("age")
plt.ylabel("lnhr")
plt.show()
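
A possible variation, sketched here on made-up data, draws one scatter per group so that each value of "disab" gets its own legend entry. The DataFrame `toy` and the dictionary `palette` are hypothetical names introduced for this illustration; the numbers are not taken from LaborSupply1988.csv.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical toy data, not taken from LaborSupply1988.csv
toy = pd.DataFrame({"age":   [25, 32, 38, 41, 47, 53],
                    "lnhr":  [7.6, 7.7, 7.5, 7.2, 7.6, 7.1],
                    "disab": [0, 0, 0, 1, 0, 1]})

fig, ax = plt.subplots()
palette = {0: "red", 1: "blue"}  # same color choice as in the exercise

# One scatter call per group, so each group gets its own legend entry
for flag, group in toy.groupby("disab"):
    ax.scatter(group["age"], group["lnhr"], s=10,
               color=palette[flag], label=f"disab={flag}")

ax.set_xlabel("age")
ax.set_ylabel("lnhr")
ax.legend()
plt.show()
```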

g) Create a boxplot of "lnhr" (log of annual hours) grouped by the number of kids.  
What can be observed regarding median and variance?  
Is the observation meaningful for large values of kids?

In [None]:
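# The by= argument of plot.box needs pandas >= 1.4;
# df.boxplot(column="lnhr", by="kids") is an alternative for older versions
df.plot.box(column="lnhr", by="kids")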
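
Whether the medians and spreads for large families are meaningful depends on how many observations each box is based on. The following sketch shows one way to check this. The DataFrame `toy` is a made-up assumption for illustration, not the actual data.

```python
import pandas as pd

# Hypothetical toy data: many households with few kids, one with five
toy = pd.DataFrame({"kids": [0, 0, 0, 1, 1, 1, 2, 2, 5],
                    "lnhr": [7.6, 7.7, 7.5, 7.6, 7.4, 7.5, 7.3, 7.5, 6.9]})

# Sample size, median and variance of lnhr for each number of kids
summary = toy.groupby("kids")["lnhr"].agg(["count", "median", "var"])
print(summary)
```

A group with a single observation has an undefined sample variance (NaN), and boxes built from only a handful of rows should be read with caution.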