# Pandas and Plots

In this *notebook* we give an introduction to the [Pandas](https://pandas.pydata.org/) module, focusing on the **DataFrame** class and some of its functionality.

Later, we will work on plots using the [Matplotlib](https://matplotlib.org/) module.

The first step is loading the Pandas package:

In [None]:
import pandas as pd

Now, let's create our first DataFrame, an empty one, and see how Python responds:

In [None]:
df = pd.DataFrame()
print(df)
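
As a quick check, the **.empty** and **.shape** attributes confirm that nothing is stored yet. This is a minimal sketch; the name `empty_df` is only illustrative:

```python
import pandas as pd

empty_df = pd.DataFrame()
print(empty_df.empty)  # True when the DataFrame holds no data
print(empty_df.shape)  # (number of rows, number of columns)
```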

The output shows an empty DataFrame. Now, let's feed a DataFrame with some data:

In [None]:
data = [1,2,3,4,5]
df = pd.DataFrame(data)
df

Now, let's feed the DataFrame with data and column names:

In [None]:
data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df = pd.DataFrame(data, columns = ['Name', 'Age'])
df

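We can convert the numeric column to float:

In [None]:
data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
# Convert only 'Age': passing dtype = float to the constructor would also try
# to cast 'Name', which fails in recent pandas versions
df = pd.DataFrame(data, columns = ['Name', 'Age']).astype({'Age': float})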
df
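
To see which type each column ended up with, the **.dtypes** attribute can be inspected. The sketch below is illustrative (the name `df_typed` is not part of the notebook) and converts only the numeric column with **.astype**:

```python
import pandas as pd

rows = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
df_typed = pd.DataFrame(rows, columns=['Name', 'Age'])

# Cast a single column; the text column is left untouched
df_typed = df_typed.astype({'Age': float})
print(df_typed.dtypes)
```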

One of the easiest ways to create a DataFrame is from a dictionary:

In [None]:
data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28,34,29,42]}
df = pd.DataFrame(data)
df

We can also set the index labels:

In [None]:
data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28,34,29,42]}
df = pd.DataFrame(data, index = ['rank1','rank2','rank3','rank4'])
df

A list of dicts can also be used to create a DataFrame; a key that is missing from one of the dicts shows up as NaN:

In [None]:
data = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df = pd.DataFrame(data, index = ['first', 'second'])
df
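
Because the first dict has no `'c'` key, that cell is stored as NaN. The sketch below shows one possible way to locate and fill such gaps; filling with 0 is only an illustrative choice, and the names `df_records` and `filled` are not part of the notebook:

```python
import pandas as pd

records = [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}]
df_records = pd.DataFrame(records, index=['first', 'second'])

# Boolean table: True marks a missing value
print(df_records.isna())

# Return a copy with the missing values replaced by 0
filled = df_records.fillna(0)
print(filled)
```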

Selecting columns:

In [None]:
df['a']
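
Several columns can be selected at once by passing a list of names. In this small sketch (with the illustrative name `df_cols`), a single name returns a Series, while a list of names returns a DataFrame:

```python
import pandas as pd

df_cols = pd.DataFrame(
    [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}],
    index=['first', 'second'],
)

single = df_cols['a']         # one column name -> Series
subset = df_cols[['a', 'b']]  # list of column names -> DataFrame
print(type(single).__name__, type(subset).__name__)
```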

Adding columns:

In [None]:
df['d'] = df['a'] + df['b']
df

Deleting columns:

In [None]:
del df['d']
df

Row selection using **.loc** (by label) and **.iloc** (by integer position):

In [None]:
df.loc['first']

In [None]:
df.iloc[1]
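
Both accessors also take a second argument for the column, so a single cell can be read directly. This is a minimal sketch with the illustrative name `df_sel`:

```python
import pandas as pd

df_sel = pd.DataFrame(
    [{'a': 1, 'b': 2}, {'a': 5, 'b': 10, 'c': 20}],
    index=['first', 'second'],
)

value_by_label = df_sel.loc['second', 'b']  # row label, column label
value_by_position = df_sel.iloc[1, 1]       # row position, column position
print(value_by_label, value_by_position)
```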

Slicing rows:

In [None]:
data = {'a': [1, 5, 10, 20, 100, 200], 'b': [3, 30, 15, 27, 58, 37], 'c': [99, 43, 21, 5, 27, 66]}
df = pd.DataFrame(data)
df

In [None]:
df.iloc[2:5]
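
One detail worth keeping in mind: a slice with **.iloc** excludes its end position, like a Python list, whereas a slice with **.loc** includes its end label. A small sketch with the illustrative name `df_slice`:

```python
import pandas as pd

df_slice = pd.DataFrame({'a': [1, 5, 10, 20, 100, 200]})

by_position = df_slice.iloc[2:5]  # positions 2, 3, 4 (end excluded)
by_label = df_slice.loc[2:5]      # labels 2, 3, 4, 5 (end included)
print(len(by_position), len(by_label))
```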

Adding rows:

In [None]:
df2 = pd.DataFrame([[5, 6, 7], [7, 8, 9], [9, 10, 11]], columns = ['a', 'b', 'c'])
df2

In [None]:
df = pd.concat([df, df2])  # DataFrame.append was removed in pandas 2.0
df
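
Stacking two DataFrames keeps each one's original index labels, so labels can repeat in the result. If a fresh 0..n-1 index is preferred, `pd.concat` accepts `ignore_index=True`. A minimal sketch with the illustrative names `top` and `bottom`:

```python
import pandas as pd

top = pd.DataFrame({'a': [1, 5], 'b': [3, 30]})
bottom = pd.DataFrame({'a': [7, 9], 'b': [8, 10]})

stacked = pd.concat([top, bottom])                        # keeps the original labels
renumbered = pd.concat([top, bottom], ignore_index=True)  # builds a new 0..n-1 index
print(list(stacked.index), list(renumbered.index))
```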

Deleting rows with **.drop**, which removes rows by index label:

In [None]:
df = df.drop(0)
df
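
Since **.drop** works on labels, every row carrying the given label is removed when labels repeat. The sketch below illustrates this with hypothetical data and shows one way to remove only the first row, by renumbering the index first:

```python
import pandas as pd

stacked_rows = pd.concat([
    pd.DataFrame({'a': [1, 5, 10]}),
    pd.DataFrame({'a': [7, 8, 9]}),
])  # index labels: 0, 1, 2, 0, 1, 2

# Label 0 appears twice, so two rows are removed
without_zero = stacked_rows.drop(0)

# After renumbering, label 0 is unique and only the first row is removed
only_first_gone = stacked_rows.reset_index(drop=True).drop(0)
print(len(without_zero), len(only_first_gone))
```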

## Matplotlib

Now let's take a look at the plotting tools for Python. There are many plotting packages, but here the focus will be on the [Matplotlib](https://matplotlib.org/) module.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

In [None]:
x = np.arange(0, 1, 0.001)
y = np.sin(2*np.pi*x)

Making a simple plot using the **.plot** function:

In [None]:
plt.plot(x, y)

Setting the figure size:

In [None]:
plt.figure(figsize = (16, 8))
plt.plot(x, y)

We can customize the plot by changing the line color, title, labels, etc.

In [None]:
plt.figure(figsize = (16, 8))
plt.plot(x, y, 'r-', linewidth = 2)
plt.title(r'Plot of $\sin(2\pi x)$', fontsize = 24)  # raw string keeps the LaTeX backslashes intact
plt.xlabel('Xlabel', fontsize = 18)
plt.ylabel('Ylabel', fontsize = 18)

We can plot more than one dataset in one plot:

In [None]:
y2 = np.cos(2*np.pi*x)

In [None]:
plt.figure(figsize = (16, 8))
plt.plot(x, y, 'r-', linewidth = 3, label = r'$\sin(2\pi x)$')
plt.plot(x, y2, 'b-', linewidth = 3, label = r'$\cos(2\pi x)$')
plt.title(r'Plot of $\sin(2\pi x)$ and $\cos(2\pi x)$', fontsize = 24)
plt.xlabel('Xlabel', fontsize = 18)
plt.ylabel('Ylabel', fontsize = 18)
plt.legend()

Or use subplots:

In [None]:
fig, ax = plt.subplots(1, 2, figsize = (30, 8))
ax[0].plot(x, y, 'r-', linewidth = 3)
ax[0].set_title(r'Plot of $\sin(2\pi x)$', fontsize = 24)
ax[0].set_xlabel('Xlabel', fontsize = 18)
ax[0].set_ylabel('Ylabel', fontsize = 18)

ax[1].plot(x, y2, 'b-', linewidth = 3)
ax[1].set_title(r'Plot of $\cos(2\pi x)$', fontsize = 24)
ax[1].set_xlabel('Xlabel', fontsize = 18)
ax[1].set_ylabel('Ylabel', fontsize = 18)
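
When the panels show comparable quantities, it may help to share the y-axis and let Matplotlib adjust the spacing. The following is only a sketch of that idea, using `sharey = True` and `fig.tight_layout()`; the names `x_demo` and `axes` are illustrative:

```python
import numpy as np
import matplotlib.pyplot as plt

x_demo = np.arange(0, 1, 0.001)

# sharey=True gives both panels the same y-axis limits
fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharey=True)
axes[0].plot(x_demo, np.sin(2 * np.pi * x_demo), 'r-')
axes[0].set_title('Sine')
axes[1].plot(x_demo, np.cos(2 * np.pi * x_demo), 'b-')
axes[1].set_title('Cosine')

# Adjust the spacing so titles and tick labels do not overlap
fig.tight_layout()
```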

Other kinds of plots are also available, such as histograms:

In [None]:
np.random.seed(1)

mu, sigma = 100, 15
x = mu + sigma * np.random.randn(10000)

plt.figure(figsize = (12, 6))
plt.hist(x, 50, density = True, facecolor = 'b', alpha = .5, histtype = 'bar') # try histtype = 'step'
plt.xlabel('Smarts')
plt.ylabel('Probability')
plt.title('Histogram of IQ')
plt.text(60, .025, r'$\mu=100,\ \sigma=15$')
plt.axis([40, 160, 0, 0.03])
plt.grid(True)
plt.show()

The **Pandas** package has its own plotting functionality, **.plot**.

Now, here comes a quiz. Use the Pandas plot functionality to plot each column of the following DataFrame in its own subplot:
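
As a small illustration of the basic call, **.plot** on a DataFrame draws every column as a line against the index and returns the Matplotlib axes it drew on. The data and the name `demo` below are made up for this sketch:

```python
import numpy as np
import pandas as pd

t = np.linspace(0, 1, 200)
demo = pd.DataFrame({'linear': t, 'quadratic': t**2}, index=t)

# One line per column, all drawn on the same axes
ax_demo = demo.plot(figsize=(8, 4), title='Two columns on one set of axes')
ax_demo.set_xlabel('t')
```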

In [None]:
x = np.arange(0, 1, 0.001)
s = np.sin(2*np.pi*x)
c = np.cos(2*np.pi*x)
t = np.tan(2*np.pi*x)

data = {'Sin': s, 'Cos': c, 'Tan': t}
df = pd.DataFrame(data)
df.head()

Now, it's time for you to shine. Your code goes below:

In [None]:
# Your code now
