<a href="https://colab.research.google.com/github/BireNbarik/Metal-Forming-Lab/blob/main/activities/python-basics-11.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Python Basics 11: Basic Plots

The objective is to learn the basics of Python plotting.

We plot using the [matplotlib](https://matplotlib.org/) Python library.
Again, this is a *huge* topic and we cannot cover everything here.
In general, if you know the type of plot you want, a simple Google search of the form "X matplotlib" will probably lead you to an example that you can adjust to your needs.

Here is how to import `matplotlib`:

In [None]:
# Both lines are necessary in a Jupyter notebook
import matplotlib.pyplot as plt
# This line tells the Jupyter notebook that the plots should just appear
# within it
%matplotlib inline

## Plotting simple functions

Let's start by plotting simple 1D functions.
We are going to plot this function:
$$
f(x) = \frac{\sin(x)}{x},
$$
for $x$ between $-10$ and $10$.
Here you go:

In [None]:
# We are going to need numpy as well
import numpy as np
# First you generate the data that you want to plot
# Here is a dense set of x's:
xs = np.linspace(-10, 10, 50)
# linspace gives you 50 equidistant points between -10 and 10 (both endpoints included)
# you should remember this function:
print(xs)
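
To see what `np.linspace` does on a smaller scale, here is a minimal sketch with only five points. The names `grid` and `step` are illustrative and are not used elsewhere in this notebook.

```python
import numpy as np

# linspace includes both endpoints and spaces the points evenly
grid = np.linspace(-10, 10, 5)
print(grid)  # [-10.  -5.   0.   5.  10.]

# retstep=True also returns the spacing between neighboring points
grid, step = np.linspace(-10, 10, 5, retstep=True)
print(step)  # 5.0
```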

In [None]:
# Now evaluate the function values at each one of these points
ys = np.sin(xs) / xs
print(ys)
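
The grid above has an even number of points, so it never lands exactly on $x = 0$, where $\sin(x)/x$ becomes $0/0$. If your grid does contain zero, plain division returns `nan` at that point. One possible workaround, sketched below, is `np.sinc`, which computes $\sin(\pi t)/(\pi t)$ and returns $1$ at $t = 0$. The names `x_odd`, `naive`, and `safe` are illustrative.

```python
import numpy as np

# An odd number of points puts x = 0 exactly on the grid
x_odd = np.linspace(-10, 10, 5)

# Plain division gives 0/0 = nan at x = 0 (the warning is silenced here)
with np.errstate(invalid="ignore", divide="ignore"):
    naive = np.sin(x_odd) / x_odd
print(np.isnan(naive[2]))  # True

# np.sinc(t) is sin(pi t) / (pi t), so rescale the argument by pi
safe = np.sinc(x_odd / np.pi)
print(safe[2])  # 1.0
```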

In [None]:
# And now we can plot. The simplest way to do it is this:
plt.plot(xs, ys);

However, I typically use the following more extensive version because it allows me to specify certain details like the size or the quality of the plot:

In [None]:
fig, ax = plt.subplots()
ax.plot(xs, ys);
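
As an illustration of the extra control that `plt.subplots` gives you, the sketch below sets the figure size through the `figsize` argument. The particular size and the names `x_demo` and `y_demo` are arbitrary choices for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

x_demo = np.linspace(-10, 10, 50)
y_demo = np.sin(x_demo) / x_demo

# figsize is (width, height) in inches
fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(x_demo, y_demo)
# The figure remembers the size it was created with
print(fig.get_size_inches())
```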

Let's now add some more details in the plot.
Let's add x and y labels, and a title.

In [None]:
fig, ax = plt.subplots()
ax.plot(xs, ys)
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
ax.set_title('Some title');

You can increase the resolution of the plot by raising its `dpi` (dots per inch):

In [None]:
fig, ax = plt.subplots(dpi=150)
ax.plot(xs, ys)
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
ax.set_title('Some title');
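
The `dpi` setting also matters when you save a figure to disk. Here is a minimal sketch using `fig.savefig`; the file name `sinc_demo.png` is a hypothetical choice, and the file is written to the current working directory.

```python
import os
import numpy as np
import matplotlib.pyplot as plt

x_demo = np.linspace(-10, 10, 50)

fig, ax = plt.subplots()
ax.plot(x_demo, np.sin(x_demo) / x_demo)
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')

# Hypothetical output file; a higher dpi gives a sharper image
out_name = 'sinc_demo.png'
fig.savefig(out_name, dpi=300, bbox_inches='tight')
print(os.path.exists(out_name))  # True
```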

Let's now add one more function to the plot.
I will add this:
$$
g(x) = \frac{\sin^2(x)}{x}.
$$

In [None]:
fig, ax = plt.subplots(dpi=150)
ax.plot(xs, ys, label='$f(x)$')
ax.plot(xs, np.sin(xs) ** 2 / xs, '--', label='$g(x)$')
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
ax.set_title('Some title')
ax.legend(loc='best');

Notice that matplotlib automatically picks a different color for each curve.
The line style, however, does not change automatically: if you want a dashed line, you have to ask for it, as we did with `'--'`.
You can also choose the colors yourself:

In [None]:
fig, ax = plt.subplots(dpi=150)
ax.plot(xs, ys, 'g', label='$f(x)$')
ax.plot(xs, np.sin(xs) ** 2 / xs, 'r--', label='$g(x)$')
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
ax.set_title('Some title')
ax.legend(loc='best');

You don't have to memorize the details of the style right now. A simple Google search can reveal the info you need and you will eventually start to remember the details.
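
As a small reference, the sketch below shows how the shorthand style strings used above map to explicit keyword arguments. The curves and the names `line_short`, `line_long`, and `line_marks` are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

x_demo = np.linspace(-10, 10, 50)

fig, ax = plt.subplots()
# 'r--' is shorthand for a red dashed line
(line_short,) = ax.plot(x_demo, np.sin(x_demo), 'r--', label='shorthand')
# The same style spelled out with keyword arguments
(line_long,) = ax.plot(x_demo, np.cos(x_demo), color='r', linestyle='--',
                       label='keywords')
# 'xk' means black x markers with no connecting line
(line_marks,) = ax.plot(x_demo[::5], np.sin(x_demo[::5]), 'xk', label='markers')
ax.legend(loc='best')
```

The keyword form is longer, but it is easier to read when you come back to the code later.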

Finally, let me also throw some noisy data into the mix:

In [None]:
# This gives you ten random numbers between -10 and 10
X = 20.0 * np.random.rand(10) - 10.0
# Let's use the first function to generate the noisy y's
Y = np.sin(X) / X + 0.1 * np.random.randn(10)
# The last part added a bit of noise 
fig, ax = plt.subplots(dpi=150)
ax.plot(xs, ys, 'g', label='$f(x)$')
ax.plot(xs, np.sin(xs) ** 2 / xs, 'r--', label='$g(x)$')
ax.plot(X, Y, 'xk', label='Data')
ax.set_xlabel('$x$')
ax.set_ylabel('$y$')
ax.set_title('Some title')
ax.legend(loc='best');
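
The noisy data above change every time you run the cell. If you want the same points on every run, one option is a seeded random generator. This is a sketch of an alternative, not what the cell above does; the seed value and the names `rng`, `x_noisy`, and `y_noisy` are arbitrary.

```python
import numpy as np

# A seeded generator returns the same "random" numbers on every run
rng = np.random.default_rng(seed=0)
# Ten points drawn uniformly between -10 and 10
x_noisy = rng.uniform(-10.0, 10.0, size=10)
# Gaussian noise with standard deviation 0.1 around sin(x) / x
y_noisy = np.sin(x_noisy) / x_noisy + rng.normal(0.0, 0.1, size=10)
print(x_noisy.shape, y_noisy.shape)  # (10,) (10,)
```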

## Questions

+ Write code that plots this function:
$$
f(t) = e^{-0.5t}\sin(2\pi t),
$$
for $t$ between $0$ and $5$.

## Scatter plots

Let's load the data from the previous hands-on activity:

In [None]:
# Make sure you run this on Google Colab
import requests
import os
def download(url, local_filename=None):
    """
    Downloads the file in the ``url`` and saves it in the current working directory.
    """
    data = requests.get(url)
    # Fail early on HTTP errors instead of saving an error page to disk
    data.raise_for_status()
    if local_filename is None:
        local_filename = os.path.basename(url)
    with open(local_filename, 'wb') as fd:
        fd.write(data.content)
   
# The url of the file we want to download
url = 'https://raw.githubusercontent.com/PredictiveScienceLab/data-analytics-se/master/activities/temp_price.csv'
download(url)

In [None]:
import pandas as pd
# Reads a csv file into the pandas framework
temp_price = pd.read_csv('temp_price.csv')
temp_price.head()

We are going to clean them up as we did before:

In [None]:
clean_data = temp_price.dropna(axis=0).rename(columns={'Price per week': 'week_price',
                                                       'Price per day': 'daily_price'})
clean_data.head()
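
To see what `dropna` and `rename` do, here is a minimal sketch on a made-up table. The values in `toy` are hypothetical and are not taken from `temp_price.csv`.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table with one missing value
toy = pd.DataFrame({
    'Price per week': [10.0, np.nan, 12.5],
    't_unit': [70.1, 71.3, 69.8],
})
# axis=0 drops rows (not columns) that contain any NaN
toy_clean = toy.dropna(axis=0).rename(columns={'Price per week': 'week_price'})
print(toy_clean.shape)          # (2, 2)
print(list(toy_clean.columns))  # ['week_price', 't_unit']
```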

Let's try to visualize the data to gain some insight into them. In this section we will look at scatter plots and histograms. If you want to look at the documentation, click on the links.
+ [Scatter Plots](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.plot.scatter.html?highlight=scatter): the documentation explains how to build a scatter plot in pandas
+ [Histograms](https://pandas.pydata.org/docs/reference/api/pandas.Series.hist.html?highlight=hist#pandas.Series.hist): the documentation explains how to build a histogram in pandas
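
The links above describe the pandas plotting methods, while the cells in this section call matplotlib directly. For comparison, here is a minimal sketch of the pandas versions on a made-up table; the values in `toy` are hypothetical.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical toy data standing in for the real table
toy = pd.DataFrame({
    't_unit': [68.0, 70.5, 72.0, 74.5],
    'hvac': [20.0, 35.0, 30.0, 55.0],
})

fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))
# pandas draws onto the matplotlib axes passed through ax=
toy.plot.scatter(x='t_unit', y='hvac', ax=ax_left)
toy['t_unit'].hist(bins=3, ax=ax_right)
```

Both approaches end up drawing on matplotlib axes, so you can still call methods such as `ax_left.set_title` afterwards.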

In [None]:
# Building a scatter plot
fig, ax = plt.subplots()
ax.scatter(clean_data['t_unit'], clean_data['hvac'])
ax.set_xlabel('Unit Temperature (F)')
ax.set_ylabel('HVAC energy consumed (kWh)');

We observe that a higher unit temperature generally goes together with higher HVAC energy consumption.
However, the relation is not linear.
This is because the apartments in this building have different physical characteristics.
For example, an apartment at the corner of the building has more external surface exposed to the environment, and thus it needs more energy to maintain a given temperature than an apartment that is, say, in the middle of the building.

And here is a histogram of the unit temperature:

In [None]:
# Building a histogram
fig, ax = plt.subplots()
ax.hist(clean_data['t_unit'], bins=10, color='orange')
ax.set_xlabel('Temperature (F)')
ax.set_ylabel('Number of Apartments')
ax.set_title('Temperature Frequency in Apartments');
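
The `bins` argument controls how the data range is divided. As a sketch of what happens under the hood, `ax.hist` also returns the bin counts and the bin edges. The array `temps` below is made-up data.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical temperatures in degrees Fahrenheit
temps = np.array([68.0, 69.0, 70.0, 70.5, 71.0, 72.0, 74.0, 75.0])

fig, ax = plt.subplots()
# hist returns the counts per bin, the bin edges, and the drawn bars
counts, edges, patches = ax.hist(temps, bins=4)
print(counts.sum())  # 8.0
print(len(edges))    # 5
```

With `bins=4` there are five edges, because each bin needs a left and a right boundary.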

### Questions

Using the energy data, do the following:
+ Build a scatter plot of price (x-axis) vs. score and another scatter plot of price (x-axis) vs. HVAC energy consumed.

+ Build a histogram of the score and another histogram of the HVAC consumption of the apartments.

For each plot, include x and y labels and a title. For the histograms, also specify the number of bins you choose.

Once you are done plotting, write down your observations about each plot in a markdown cell.

In [None]:
# Your code here

Your comments in cells like this