# UCL AI Society Machine Learning Tutorials
### Session 01. Introduction to Numpy, Pandas and Matplotlib Libraries

### Contents
1. Numpy
2. Pandas
3. Matplotlib

### Aim
At the end of this session, you will be able to:
- Understand the basics of numpy.
- Understand the basics of pandas.
- Understand the basics of matplotlib.
- Carry out a simple EDA (Exploratory Data Analysis) using the above libraries.

## 3. Matplotlib
Matplotlib is a Python data visualisation library. Its plotting interface is similar to that of MATLAB.

### 3.1 Basics of Matplotlib

In [None]:
# run this cell if you haven't installed matplotlib
!pip install matplotlib

- `%matplotlib inline` is only available for the Jupyter Notebook and Jupyter QtConsole. With this backend, the output of plotting commands is displayed inline within the frontend, directly below the code cell that produces it.
- `%matplotlib tk` selects the Tk backend instead. With this backend, the output of plotting commands is displayed in a separate interactive window outside the notebook.
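If it is unclear which backend is currently active, Matplotlib can report it. The snippet below is a minimal sketch; the exact string it prints depends on the environment and on the Matplotlib version, so no particular output is assumed here.

```python
import matplotlib

# Name of the backend Matplotlib is currently configured to use.
# The value differs between a notebook, a terminal and a headless server.
backend = matplotlib.get_backend()
print(backend)
```

Running this cell before and after a `%matplotlib` magic command is one way to confirm that the switch took effect.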

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
# %matplotlib tk

Plot **y=sin(x)** graph.

In [None]:
# declare x and y 
x = np.arange(0, 3 * np.pi, 0.1)
y = np.sin(x)

In [None]:
# plot a graph of sin(x)
plt.plot(x, y)
plt.xlabel('x')
plt.ylabel('y')
plt.grid()
plt.title('y = sin(x)')
plt.show()

Plot multiple functions of **y = x**, **y = x^2**, and **y = x^3** in one graph.

In [None]:
# plot multiple graphs in one graph
x = np.arange(10)
x_linear = x
x_square = x ** 2
x_cubic = x ** 3

plt.plot(x, x_linear)
plt.plot(x, x_square)
plt.plot(x, x_cubic)
plt.xlabel('x axis')
plt.ylabel('y axis')
plt.grid()
plt.legend(['y = x', 'y = x^2', 'y = x^3'])
plt.title('y = x, y = x^2 and y = x^3')
plt.show()
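As a possible variation on the cell above, each line can carry its own `label` so that `plt.legend()` needs no list and the legend entries cannot fall out of order with the `plt.plot` calls. This is a sketch of the same plot, not a replacement for the approach shown above.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(10)

plt.figure()  # start a fresh figure so earlier plots do not leak into this one
# Attach the legend text to each line at the moment it is drawn
plt.plot(x, x, label='y = x')
plt.plot(x, x ** 2, label='y = x^2')
plt.plot(x, x ** 3, label='y = x^3')
plt.xlabel('x axis')
plt.ylabel('y axis')
plt.grid()
plt.legend()  # picks up the labels set above
plt.title('y = x, y = x^2 and y = x^3')
```

In a notebook with `%matplotlib inline` the figure is rendered at the end of the cell; elsewhere, finish with `plt.show()`.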

### 3.2 Various Types of Plots
Matplotlib supports many types of plots, such as bar graphs, histograms, scatter plots, area plots and pie plots. To have some data to plot, we reload the IMDB-Movie-Data file.

In [None]:
movie = pd.read_csv("./data/IMDB-Movie-Data.csv")
movie.columns
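Before turning to the movie data, here is a rough sketch of three of the plot types listed above (histogram, area plot and pie plot). All the data is made up for illustration, and the names `values`, `shares` and `t` are placeholders that do not come from the movie file.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)   # seeded so the figure is reproducible
values = rng.normal(size=200)    # made-up sample for the histogram
shares = [50, 30, 20]            # made-up shares for the pie plot
t = np.arange(10)                # made-up x values for the area plot

fig, axes = plt.subplots(1, 3, figsize=(12, 3))

# Histogram: counts how many values fall into each of 20 bins
counts, edges, _ = axes[0].hist(values, bins=20)
axes[0].set_title('Histogram')

# Area plot: fills the region between y = 0 and y = t^2
axes[1].fill_between(t, t ** 2, alpha=0.5)
axes[1].set_title('Area plot')

# Pie plot: one wedge per share
axes[2].pie(shares, labels=['A', 'B', 'C'])
axes[2].set_title('Pie plot')

fig.tight_layout()
```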

Let's see if there is a positive correlation between `Rating` and `Revenue (Millions)` using the `scatter` function. `s` controls the size of the markers (their area, in points squared) and `alpha` controls their opacity, from 0 (fully transparent) to 1 (fully opaque).

In [None]:
movie.plot.scatter(x = 'Rating', y = 'Revenue (Millions)', s = 10, alpha = 1)
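A scatter plot only shows the relationship visually. To put a number on it, one option is the Pearson correlation coefficient, which pandas computes with `Series.corr`. The sketch below uses a small made-up DataFrame (`toy`, a hypothetical stand-in) so that it runs without the CSV file; its values are not taken from the movie data.

```python
import pandas as pd

# Hypothetical stand-in for the movie data; these values are made up
toy = pd.DataFrame({
    'Rating': [5.5, 6.1, 6.8, 7.2, 7.9, 8.4],
    'Revenue (Millions)': [20.0, 95.0, 40.0, 150.0, 60.0, 210.0],
})

# Pearson correlation coefficient between the two columns, in [-1, 1]
r = toy['Rating'].corr(toy['Revenue (Millions)'])
print(round(r, 2))
```

With the real data loaded, the same call would be `movie['Rating'].corr(movie['Revenue (Millions)'])`. A value close to 1 indicates a strong positive linear relationship, and a value close to 0 indicates a weak one.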

There is a positive relationship, but it is not very strong. Next, let's look at the current share of programming languages that are widely used in the business sector.

Use `plt.bar` and `plt.xticks` to plot the percentage for each language.

In [None]:
# Data
language = ["JavaScript", "HTML/CSS", "SQL", "Python", "Java"]
percentage = [67.8, 63.5, 54.4, 41.7, 41.1]

# Generating the y positions
y_positions = range(len(percentage))

# TODO: Create bar plot
None
None
None
None
None
None
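One possible way to complete the exercise above is sketched below. The axis labels and the title are assumptions chosen for illustration; only `plt.bar` and `plt.xticks` are prescribed by the exercise.

```python
import matplotlib.pyplot as plt

# Data (same as in the exercise cell)
language = ["JavaScript", "HTML/CSS", "SQL", "Python", "Java"]
percentage = [67.8, 63.5, 54.4, 41.7, 41.1]

# Positions of the bars along the x axis
y_positions = range(len(percentage))

plt.figure()
# One bar per language, with the percentage as its height
bars = plt.bar(y_positions, percentage)
# Replace the numeric tick positions with the language names
plt.xticks(y_positions, language)
plt.xlabel('Language')                         # assumed label
plt.ylabel('Percentage (%)')                   # assumed label
plt.title('Share of programming languages')    # assumed title
```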

### 3.3 (Advanced) Matplotlib Exercise

In [None]:
# TODO: Draw 4 graphs in total in one figure
# TODO: For data to put in graphs, just create your own random data with numpy
fig, axes = None

# TODO: Scatter Graph
x = np.random.randn(50)
y = np.random.randn(50)
None
None
None

# TODO: Bar Graph
x = np.arange(10)
None

# TODO: Multi-Bar Graph
x = np.random.rand(3)
y = np.random.rand(3)
z = np.random.rand(3)
multi_data = [x, y, z]

None
None
None
None
None

# TODO: Histogram Graph
data = np.random.randn(1000)
None

# TODO: Show the image and save it to png file
None
None
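For comparison after attempting the exercise, here is one possible solution sketch. The 2 x 2 layout, the bar heights, the bin count and the output file name `matplotlib_exercise.png` are all assumptions made for illustration; other choices are equally valid.

```python
import numpy as np
import matplotlib.pyplot as plt

# Four graphs in one figure, arranged as a 2 x 2 grid (assumed layout)
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Scatter graph
x = np.random.randn(50)
y = np.random.randn(50)
axes[0, 0].scatter(x, y, s=20, alpha=0.7)
axes[0, 0].set_title('Scatter')

# Bar graph (heights are random; the exercise only fixes the x values)
x = np.arange(10)
axes[0, 1].bar(x, np.random.rand(10))
axes[0, 1].set_title('Bar')

# Multi-bar graph: shift each series sideways by one bar width
multi_data = [np.random.rand(3) for _ in range(3)]
positions = np.arange(3)
width = 0.25
for i, series in enumerate(multi_data):
    axes[1, 0].bar(positions + i * width, series, width=width)
axes[1, 0].set_xticks(positions + width)  # centre the ticks under each group
axes[1, 0].set_title('Multi-bar')

# Histogram graph
data = np.random.randn(1000)
axes[1, 1].hist(data, bins=30)
axes[1, 1].set_title('Histogram')

fig.tight_layout()
# Save to a PNG file (hypothetical file name)
fig.savefig('matplotlib_exercise.png')
```

If `plt.show()` is also called, it is safer to call it after `fig.savefig(...)`, because with some backends the figure is cleared once the window is closed and the saved file would then be empty.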

### What to do next?
The following websites are helpful for further study of Matplotlib:
- [DataCamp Matplotlib Tutorial: Python Plotting](https://www.datacamp.com/community/tutorials/matplotlib-tutorial-python)
- [Matplotlib official website](https://matplotlib.org/#)
- [Python Plotting With Matplotlib (Guide)](https://realpython.com/python-matplotlib-guide/)
- [Different plotting using pandas and matplotlib](https://www.geeksforgeeks.org/different-plotting-using-pandas-and-matplotlib/)
- [Matplotlib tutorial for beginner](https://github.com/rougier/matplotlib-tutorial)