In [None]:
from IPython.display import HTML  # public import path for the HTML display helper
table_css = 'table {align:left;display:block} '
HTML('<style>{}</style>'.format(table_css))

# **ASI WA Python Workshop**

## **Jupyter Notebook Tips**

Outside of a cell:

| key combo | action |
| :--: | ---- |
| shift+enter |  run a cell |
| b |  insert cell below | 
| a |  insert cell above |
| dd | delete selected cell |

<br>
Within a cell:

| key combo | action |
| :--: | ---- |
| Command + / |  comment out line(s) |
| Alt + Cursor | vertical selection |

**On Windows and Linux, use Control in place of Command.**
<br><br><br>

---
<a id='1.1'></a>

## **PIP to install external modules or packages**
PIP is a package manager for Python packages, or modules. 

**What is a Package?**
A package contains all the files you need for a module.

### Installing a package
Downloading a package is very easy. Open the command line interface and tell PIP to download the package you want. For example:
```
pip install numpy
```
### Uninstalling a package
To uninstall:
```
pip uninstall numpy
```
### Listing all packages installed
To list all packages currently installed:
```
pip list
```
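
If you would rather check from inside Python (for example, from a notebook cell) whether a package is installed, the standard library can look it up. The snippet below is a minimal sketch; the helper name `installed_version` is made up for this example and is not part of pip.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# "numpy" is the package used in the pip examples above
print(installed_version("numpy"))
```

A result of `None` means the package still needs to be installed with `pip install`.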

### Where to find Packages
Find more packages at https://pypi.org/.

---
<br>

**This content was adapted from https://github.com/rasbt/numpy-intro-blogarticle-2020/blob/master/scipython__blog.ipynb**

---

# **Matplotlib**

---

## What is Matplotlib?
Matplotlib is a plotting library for Python created by John D. Hunter in 2003. Unfortunately, John D. Hunter became ill and passed away in 2012. However, Matplotlib is still the most mature plotting library for Python and is actively maintained to this day.

In general, Matplotlib is a rather "low-level" plotting library, which means that it has a lot of room for customization. **The advantage of Matplotlib is that it is so customizable; the disadvantage of Matplotlib is that it is so customizable** -- some people find it a little bit too verbose due to all the different options.

I will introduce the seaborn package later on; it is much easier to use and is the closest package Python has to ggplot in R.

In any case, Matplotlib is among the most widely used plotting libraries and the go-to choice for many data scientists and machine learning researchers and practitioners.

In the original author's opinion, the best way to work with Matplotlib is to consult the Matplotlib gallery on the official website at [https://matplotlib.org/gallery/index.html](https://matplotlib.org/gallery/index.html) often. It contains code examples for creating many kinds of plots, which are useful as templates for creating your own plots. Also, if you are completely new to Matplotlib, the tutorials at [https://matplotlib.org/tutorials/index.html](https://matplotlib.org/tutorials/index.html) are recommended.

In this section, we will look at a few very simple examples, which should be very intuitive and shouldn't require much explanation.

In [None]:
# jupyter notebook command to show plots inline (in the notebook)
%matplotlib inline 

import matplotlib.pyplot as plt # pyplot makes everything a bit easier

The main plotting functions of Matplotlib are contained in the pyplot module, which we imported above. Note that `%matplotlib inline` is an "IPython magic" command. It is specific to Jupyter notebooks (which, in our case, use an IPython kernel) and shows the plots "inline," that is, in the notebook itself.

## Plotting Functions and Lines
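Once pyplot is imported, we can call plotting functions using the pattern below, where `plot_type` is a placeholder for a function such as `plot`, `scatter`, or `bar`: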
```
plt.plot_type()
plt.show()
```

In [None]:
# plot a sine wave
from math import sin

# generate the integers 0, 1, ..., 49
x = range(0, 50, 1)
# list comprehension: compute the sine of every value in x
y = [sin(ii) for ii in x]

# draw a line plot of y against x
plt.plot(x, y)
# tell matplotlib to produce the plot
plt.show()
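
The curve above looks jagged because sin(x) is evaluated only at the integers 0 to 49, and Matplotlib joins those points with straight lines. As a sketch of one way to get a smoother curve, the same range can be sampled in smaller steps; the names `x_fine` and `y_fine` and the step of 0.1 are choices made for this example.

```python
from math import sin

import matplotlib.pyplot as plt

# 500 points from 0 to 49.9 in steps of 0.1 (instead of 50 integer points)
x_fine = [ii * 0.1 for ii in range(500)]
y_fine = [sin(value) for value in x_fine]

plt.plot(x_fine, y_fine)
plt.xlabel('x')
plt.ylabel('sin(x)')
plt.show()
```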

Add axis ranges and labels:

In [None]:
plt.plot(x, y)

plt.xlim([2, 8])
plt.ylim([0, 0.75])

plt.xlabel('x-axis')
plt.ylabel('y-axis')

plt.show()

Add a marker type and a legend:

In [None]:
from math import cos

# generate the integers 0, 1, ..., 49
x = range(0, 50, 1)
# list comprehension: compute the cosine of every value in x
y1 = [cos(ii) for ii in x]
# list comprehension: compute the sine of every value in x
y2 = [sin(ii) for ii in x]

# label sets the legend text; marker sets the symbol drawn at each point
plt.plot(x, y1, label='cos(x)', linestyle='-', marker='o')
plt.plot(x, y2, label='sin(x)', linestyle='-', marker='^')

plt.ylabel('y')
plt.xlabel('x')

plt.legend(loc='lower left')
plt.show()

## Scatter Plots

In [None]:
import numpy as np # more on numpy in the next notebook

# don't worry about the data generation for now
rng = np.random.RandomState(123)
x = rng.normal(size=500)
y = rng.normal(size=500)


plt.scatter(x, y, c='green')
plt.show()
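
`plt.scatter()` can also color each point individually when `c` is given one number per point instead of a single color name. The sketch below colors the points by their distance from the origin; that quantity, the `viridis` colormap, and the marker size are arbitrary choices for illustration only.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(123)  # same seeded generator as in the cell above
x = rng.normal(size=500)
y = rng.normal(size=500)

# distance of each point from the origin, used here only to drive the colors
distance = np.sqrt(x**2 + y**2)

# c takes one value per point; cmap maps those values to colors
points = plt.scatter(x, y, c=distance, cmap='viridis', s=15, alpha=0.7)
plt.colorbar(points, label='distance from origin')
plt.xlabel('x')
plt.ylabel('y')
plt.show()
```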

#### *Task 4.1: Create a Scatter Plot*

1. Import the necessary modules: random and matplotlib.pyplot as plt.

2. Use the two lists of random data for the x and y coordinates. You can use the random.randint() function to generate random integers within a desired range.

3. Create the scatter plot using plt.scatter(), passing the x and y coordinates as arguments.

4. Add labels to the x and y axes using plt.xlabel() and plt.ylabel().

5. Set a title for the plot using plt.title().

6. Display the plot using plt.show().

In [66]:
# Task 4.1
import random  # needed for random.randint below

x = [random.randint(0,10) for _ in range(100)]
y = [random.randint(0,10) for _ in range(100)]

----
## Bar Plots

In [None]:
# input data
means = [5, 8, 10]
stddevs = [0.2, 0.4, 0.5]
bar_labels = ['bar 1', 'bar 2', 'bar 3']

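# plot bars with error bars (yerr) showing the standard deviations
x_pos = list(range(len(bar_labels)))
plt.bar(x_pos, means, yerr=stddevs)
# put the bar labels on the x-axis instead of the positions 0, 1, 2
plt.xticks(x_pos, bar_labels)
plt.title('Bar Plot')
plt.show()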

## Histograms
A histogram is one of the first ways you should look at your data.

In [None]:
# don't worry about the data generation
rng = np.random.RandomState(123)
x = rng.normal(0, 20, 1000) 

# fixed bin width of 5: bin edges at -100, -95, ..., 95
bins = np.arange(-100, 100, 5)

plt.hist(x, bins=bins)
plt.show()
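
To see the numbers behind the bars, the same binning can be done without plotting by using `np.histogram`. This is a small sketch under the same settings as the cell above; the variable names `bin_edges`, `counts`, and `edges` are chosen for this example. Note that `np.arange` excludes the stop value, so the last edge is 95, and samples outside the edges are not counted.

```python
import numpy as np

rng = np.random.RandomState(123)
x = rng.normal(0, 20, 1000)

bin_edges = np.arange(-100, 100, 5)  # edges at -100, -95, ..., 95
counts, edges = np.histogram(x, bins=bin_edges)

# there is always one more edge than there are bins
print(len(edges), len(counts))
# number of samples that fell between the first and last edge
print(counts.sum())
```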

In [None]:
rng = np.random.RandomState(123)
x1 = rng.normal(0, 20, 1000) 
x2 = rng.normal(15, 10, 1000)

plt.hist(x1, bins=50, alpha=0.5)
plt.hist(x2, bins=50, alpha=0.5)
plt.show()
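
When `bins=50` is passed to each call separately, every histogram gets its own 50 bins computed from its own data range, so the two sets of bars have different widths. One possible way to make the histograms directly comparable is to compute a single set of bin edges from both samples; the name `shared_edges` is made up for this sketch.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(123)
x1 = rng.normal(0, 20, 1000)
x2 = rng.normal(15, 10, 1000)

# one set of 50 equal-width bins spanning both samples
shared_edges = np.histogram_bin_edges(np.concatenate([x1, x2]), bins=50)

plt.hist(x1, bins=shared_edges, alpha=0.5, label='x1')
plt.hist(x2, bins=shared_edges, alpha=0.5, label='x2')
plt.legend()
plt.show()
```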

## Subplots

In [None]:
import random

# Generate random data
x = [random.randint(1, 10) for _ in range(10)]
y1 = [random.randint(1, 10) for _ in range(10)]
y2 = [random.randint(1, 10) for _ in range(10)]

# Create subplots
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(10, 5))

# Plot data on the first subplot
ax1.scatter(x, y1)
ax1.set_xlabel('X-axis')
ax1.set_ylabel('Y-axis')
ax1.set_title('Scatter Plot 1')

# Plot data on the second subplot
ax2.scatter(x, y2)
ax2.set_xlabel('X-axis')
ax2.set_ylabel('Y-axis')
ax2.set_title('Scatter Plot 2')

# Adjust the spacing between subplots
plt.tight_layout()

# Display the plots
plt.show()


In [None]:
# Generate random data
x1 = [random.randint(1, 10) for _ in range(10)]
y1 = [random.randint(1, 10) for _ in range(10)]

x2 = [random.randint(1, 10) for _ in range(10)]
y2 = [random.randint(1, 10) for _ in range(10)]

x3 = [random.randint(1, 10) for _ in range(10)]
y3 = [random.randint(1, 10) for _ in range(10)]

x4 = [random.randint(1, 10) for _ in range(10)]
y4 = [random.randint(1, 10) for _ in range(10)]

# Create subplots
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# Plot data on each subplot
axes[0, 0].scatter(x1, y1)
axes[0, 0].set_xlabel('X-axis')
axes[0, 0].set_ylabel('Y-axis')
axes[0, 0].set_title('Subplot 1')

axes[0, 1].scatter(x2, y2)
axes[0, 1].set_xlabel('X-axis')
axes[0, 1].set_ylabel('Y-axis')
axes[0, 1].set_title('Subplot 2')

axes[1, 0].scatter(x3, y3)
axes[1, 0].set_xlabel('X-axis')
axes[1, 0].set_ylabel('Y-axis')
axes[1, 0].set_title('Subplot 3')

axes[1, 1].scatter(x4, y4)
axes[1, 1].set_xlabel('X-axis')
axes[1, 1].set_ylabel('Y-axis')
axes[1, 1].set_title('Subplot 4')

# Adjust the spacing between subplots
plt.tight_layout()

# Display the subplots
plt.show()
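
The four nearly identical blocks in the cell above can also be written as a loop. The following sketch assumes it is acceptable to generate the random data inside the loop; `axes.flat` walks through the 2x2 array of subplots row by row.

```python
import random

import matplotlib.pyplot as plt

fig, axes = plt.subplots(2, 2, figsize=(10, 8))

# axes.flat visits the four subplots row by row
for number, ax in enumerate(axes.flat, start=1):
    x_values = [random.randint(1, 10) for _ in range(10)]
    y_values = [random.randint(1, 10) for _ in range(10)]
    ax.scatter(x_values, y_values)
    ax.set_xlabel('X-axis')
    ax.set_ylabel('Y-axis')
    ax.set_title(f'Subplot {number}')

# Adjust the spacing between subplots
plt.tight_layout()
plt.show()
```

The loop version is shorter, and changing the grid size only requires editing the `plt.subplots()` call.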


In [None]:
x = range(11)
y = range(11)

fig, ax = plt.subplots(nrows=2, ncols=3,
                       sharex=True, sharey=True)

for row in ax:
    for col in row:
        col.plot(x, y)
        
plt.show()

#### Task 4.2: Create Two Subplots

1. Generate random data for two sets of x and y coordinates.

2. Create a figure and two subplots using plt.subplots(). Set the number of rows and columns of subplots accordingly.

3. Plot the data on each subplot by calling the appropriate plotting method on that subplot's axes object (ax1.scatter(), ax1.plot(), etc.).

4. Customize the subplots by setting labels, titles, or other properties using the corresponding methods (set_xlabel(), set_ylabel(), set_title(), etc.).

5. Display the subplots using plt.show().

In [None]:
# Task 4.2

# 1. Generate random data
x = [random.randint(1, 10) for _ in range(10)]
y1 = [random.randint(1, 10) for _ in range(10)]
y2 = [random.randint(1, 10) for _ in range(10)]

---
## Colors and Markers

![matplotlib_markers.png](attachment:34d3f8f5-dfad-41cf-b272-214df622a249.png)

In [None]:
x = range(11)
y = range(11)

plt.plot(x, y,
         color='blue',
         marker='^',
         linestyle='')
plt.show()

In [None]:
x = range(11)
y = range(11)

plt.plot(x, y,
         c='blue',
         marker='1',
         linestyle='')
plt.show()
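
`plt.plot()` also accepts a short format string that combines color, marker, and line style in one argument, which saves typing for simple plots. The example below is a sketch of that shorthand; the variable names `blue_triangles` and `red_dashes` are only there to hold the returned line objects.

```python
import matplotlib.pyplot as plt

x = range(11)
y = range(11)

# plt.plot returns a list of line objects; [0] picks the single line drawn
# 'b^' = blue ('b') triangle-up markers ('^'), no connecting line
blue_triangles = plt.plot(x, y, 'b^')[0]
# 'r--' = red ('r') dashed line ('--'), no markers
red_dashes = plt.plot(x, [value + 2 for value in y], 'r--')[0]

plt.show()
```

The keyword form (`color=`, `marker=`, `linestyle=`) used in the cells above is more explicit and works with full color names such as `'blue'`.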

<br>

#### *Task 4.3: Create Subplots with Different Markers*

1. Generate two sets of x and y coordinates. You can use range() or anything else.

2. Create a figure with three subplots using plt.subplots(). Set the number of rows and columns of subplots accordingly.

3. Plot the data on each subplot by calling the appropriate plotting method on that subplot's axes object (for example, its scatter() or plot() method).

4. Customize the subplots by giving each a unique color and marker type.

5. Display the subplots using plt.show().

In [None]:
# Task 4.3




## Saving Plots

The file format for saving plots can be conveniently specified via the file suffix (.eps, .svg, .jpg, .png, .pdf, .tiff, etc.). Personally, I recommend using a vector graphics format (.eps, .svg, .pdf) whenever you can, which usually results in smaller file sizes than bitmap graphics (.jpg, .png, .bmp, .tiff) and does not have a limited resolution.

In [None]:
plt.plot(x, y)

plt.savefig('myplot.png', dpi=300)
plt.savefig('myplot.pdf')

plt.show()
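
Note that the cell above calls `plt.savefig()` before `plt.show()`. Outside of a notebook, the figure is typically closed once `plt.show()` returns, so saving afterwards can produce an empty file. The sketch below saves through the figure object and then checks that the file was written; the temporary folder and the `bbox_inches='tight'` option are choices made for this example, not requirements.

```python
import os
import tempfile

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(range(11), range(11))

# a temporary folder keeps the example from cluttering the working directory
output_dir = tempfile.mkdtemp()
png_path = os.path.join(output_dir, 'myplot.png')

# bbox_inches='tight' trims surplus white space around the plot
fig.savefig(png_path, dpi=300, bbox_inches='tight')

print(os.path.getsize(png_path), 'bytes written to', png_path)
plt.show()
```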

---
<br>

## **This was only meant to be a brief introduction that you can use as a reference.**

<br>

---

NumPy and Matplotlib reference material:

- [The official NumPy documentation](https://docs.scipy.org/doc/numpy/reference/index.html)
- [The official Matplotlib Gallery](https://matplotlib.org/gallery/index.html)
- [The official Matplotlib Tutorials](https://matplotlib.org/tutorials/index.html)


Optional reference books for using NumPy and SciPy:

- Rougier, N.P., 2016. [From Python to NumPy](http://www.labri.fr/perso/nrougier/from-python-to-numpy/).
- Oliphant, T.E., 2015. [A Guide to NumPy: 2nd Edition](https://www.amazon.com/Guide-NumPy-Travis-Oliphant-PhD/dp/151730007X). USA: Travis Oliphant, independent publishing.
- Varoquaux, G., Gouillart, E., Vahtras, O., Haenel, V., Rougier, N.P., Gommers, R., Pedregosa, F., Jędrzejewski-Szmek, Z., Virtanen, P., Combelles, C. and Pinte, D., 2015. [SciPy Lecture Notes](http://www.scipy-lectures.org/intro/numpy/index.html).
- Harris, C.R., Millman, K.J., van der Walt, S.J. et al. [Array Programming with NumPy](https://www.nature.com/articles/s41586-020-2649-2). Nature 585, 357–362 (2020). 