# Python Week 5

October 12, 2019

This week, we dig deeper into numpy and matplotlib, since these two packages are frequently used for processing and plotting data. We are slowly moving into data science.

For numpy, we need to understand reshaping and slicing. Reshaping means changing the shape of a dataset, for example from a 2D matrix to a 1D vector. Slicing is a convenient, simple notation for accessing part of the data. If our data is a 3D array shaped like a cube, how do we plot or handle the top layer as a 2D image? This kind of job is easy with slicing. We'll experiment with the 28x28 images in the MNIST dataset. A small portion of it is provided for your convenience in the GitHub directory.
For matplotlib, we visualize 2D data (images), just as we plotted time series data last week. For 2D plotting, we'll use plt.imshow and try many of its options to make our plots more beautiful and impressive.
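
As a minimal sketch of both ideas, assuming a small made-up array named `cube` (it is not part of the course data), the following takes the top layer of a 3D array with slicing and then reshapes that 2D layer into a 1D vector:

```python
import numpy as np

# hypothetical 3D data: 4 layers, each a 3x2 matrix
cube = np.arange(24).reshape(4, 3, 2)

# slicing: the first index picks the layer, ':' keeps every row and column
top_layer = cube[0, :, :]
print(top_layer.shape)   # (3, 2)

# reshaping: turn the 2D layer into a 1D vector holding the same 6 values
as_vector = top_layer.reshape(6)
print(as_vector)         # [0 1 2 3 4 5]
```

The cells below work through the same steps one at a time.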

Enjoy this week's example.

In [None]:
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
from matplotlib import cm

## slicing

In [None]:
my1d = np.random.randn(5)

In [None]:
my1d

In [None]:
# how do we access the first element?
print(my1d[0])

In [None]:
# how about first two elements?
print(my1d[0:2])

In [None]:
# how about the first, third, and fifth elements, i.e. every other element?
print(my1d[0::2]) # print(my1d[0:5:2]) will give you the same result
# please note that my1d[0:1] will give you only the first element,
# because the second index is not included.
# my1d[n1:n2] will show elements from n1 to n2-1.
# let's try it
print(my1d[0:5:2])
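
Because `my1d` holds random numbers, the effect of `start:stop:step` can be hard to see. Here is a small sketch with fixed values, assuming a hypothetical array named `demo`, that shows which elements each slice returns:

```python
import numpy as np

# fixed values (instead of random ones) so the result is easy to read
demo = np.array([10, 20, 30, 40, 50])

first_only = demo[0:1]     # stop index 1 is excluded, so only index 0
first_two = demo[0:2]      # indices 0 and 1
every_other = demo[0::2]   # indices 0, 2, 4
same_thing = demo[0:5:2]   # an explicit stop of 5 selects the same elements

print(first_only)    # [10]
print(first_two)     # [10 20]
print(every_other)   # [10 30 50]
print(same_thing)    # [10 30 50]
```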

In [None]:
# how do we access the last element?
# will this work? (my1d has 5 elements, so index 5 is out of range and raises an IndexError)
print(my1d[5])

In [None]:
# A correct way is
print(my1d[4]) # numpy arrays use 0-based indexing, meaning the index starts from 0 instead of 1

# this is *NOT CONVENIENT* because you need to know the length and subtract 1 from it
# A much more convenient way is
print(my1d[-1])

In [None]:
# how about the last two elements?
print(my1d[-2:]) # print(my1d[-2:-1]) does NOT do the job because the stop index -1 is excluded, so only the second-to-last element is returned
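
To see negative indices at work on readable values, here is a short sketch using a hypothetical fixed array `demo` rather than the random `my1d`:

```python
import numpy as np

demo = np.array([10, 20, 30, 40, 50])

last = demo[-1]                  # same element as demo[len(demo) - 1]
same_last = demo[len(demo) - 1]
last_two = demo[-2:]             # from the second-to-last element to the end
not_last_two = demo[-2:-1]       # stop index -1 is excluded

print(last, same_last)   # 50 50
print(last_two)          # [40 50]
print(not_last_two)      # [40]
```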

In [None]:
# Let's try 2D data
my2d = np.random.randn(3, 2)

In [None]:
my2d

In [None]:
# Let's print the first row.
print(my2d[0,:])

In [None]:
# Can you print the first TWO rows? Yes, you can.
print(my2d[0:2,:])

In [None]:
# How about the first column of the first two rows?
print(my2d[0:2, 0])

In [None]:
# Let's make it into a 1D vector
my1dfrom2d = my2d.flatten()

In [None]:
my1dfrom2d

In [None]:
my2d.reshape(6)
# can you tell the difference between my2d.reshape(6,1) and my2d.reshape(6)?
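
One way to explore that question is to compare shapes. The sketch below uses a hypothetical array `grid` with known values; it also illustrates that `flatten` returns a copy, while `reshape` returns a view when it can (as it does here, because `grid` is stored contiguously):

```python
import numpy as np

grid = np.arange(6).reshape(3, 2)

vec = grid.reshape(6)      # 1D array with 6 elements
col = grid.reshape(6, 1)   # 2D array with 6 rows and 1 column
print(vec.shape, col.shape)   # (6,) (6, 1)

# flatten always copies, so editing the copy leaves grid untouched
flat_copy = grid.flatten()
flat_copy[0] = 99
print(grid[0, 0])   # 0

# vec is a view of the same memory here, so editing it changes grid too
vec[0] = -1
print(grid[0, 0])   # -1
```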

In [None]:
# we can use negative indices like -1, -2 as we did for 1D arrays.
my2d[-1,:]

In [None]:
my2d[-1,-1]

## plotting a histogram

In [None]:
## Experimenting with axis

import matplotlib
mydata = np.random.randn(10000, 16, 16)

#matplotlib.rc('axes', linewidth=2.0, edgecolor='black', facecolor='green')

# let's check the histogram of mydata
fig = plt.figure(figsize=(10,8), linewidth=4)
#rect = fig.patch
#rect.set_facecolor('lightgoldenrodyellow')

#ax1 = fig.add_axes([0.1, 0.3, 0.4, 0.4])
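ax1 = plt.gca()
ax2 = fig.get_axes() # returns a list of all Axes in the figure
print(ax1 is ax2[0]) # compare with the first element of the list, not the list itself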

#rect = ax1.patch
#rect.set_facecolor('lightslategray')
#rect.set_linewidth(9)

for axis in ['top', 'bottom', 'left', 'right']:
    ax1.spines[axis].set_linewidth(5)
    
plt.hist(mydata.ravel(), bins=64, range=(-2.95, 2.95), fc='g', ec='k', linewidth=4);
plt.title('Data histogram', fontsize=28)
plt.xlabel('data', fontsize=24)
plt.ylabel('frequency', fontsize=24)
plt.tick_params(labelsize=20)
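
To inspect the numbers behind such a plot, `np.histogram` can compute bin counts without drawing anything. This is a minimal sketch, assuming a smaller seeded array named `sample` in place of `mydata`; it shows that values outside `range` are simply not counted:

```python
import numpy as np

rng = np.random.default_rng(0)               # seeded so the run is repeatable
sample = rng.standard_normal((100, 16, 16))  # smaller stand-in for mydata

# same bins and range as in the plt.hist call above
counts, edges = np.histogram(sample.ravel(), bins=64, range=(-2.95, 2.95))
print(counts.shape, edges.shape)   # (64,) (65,): 64 bins need 65 edges

# values outside the range are dropped, so the counts sum to the in-range total
in_range = np.count_nonzero((sample >= -2.95) & (sample <= 2.95))
print(counts.sum() == in_range)    # True
```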




## plotting 2D data

In [None]:

# let's plot one of them

plt.figure(figsize=(8,6)) #if you want to change the size of the figure
imgplt = plt.imshow(mydata[0,:,:], cmap = cm.RdBu, alpha=1.0, clim=(-2, 2)) # alpha for transparency
#plt.set_cmap('jet') This is a way to change its colormap to a different one.

# in order to find a list of colormaps
cms = plt.colormaps()

# what's the type of 'cms'?
#  
imgplt.set_cmap('cool')

# you can change the range of data to map to the colormap
imgplt.set_clim(-5,5)

plt.title('plot using ' + 'cool')

#plt.imshow(mydata[0,:,:], vmin=-0.5, vmax=0.5)
#plt.colorbar() # plot(add) a vertical scale bar on the right
# if you want a horizontal scale bar at the bottom of the main figure
plt.colorbar(orientation='horizontal', fraction = 0.1)
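
The effect of `clim` can be pictured as a linear rescaling of the data before colors are looked up. The helper below, `normalize_for_colormap`, is a hypothetical name and only a rough numpy sketch of that idea, not matplotlib's own implementation: the lower limit maps to 0, the upper limit to 1, and values outside the limits are clipped to the ends of the colormap.

```python
import numpy as np

def normalize_for_colormap(values, vmin, vmax):
    """Linearly rescale so vmin -> 0 and vmax -> 1, clipping everything else."""
    scaled = (np.asarray(values, dtype=float) - vmin) / (vmax - vmin)
    return np.clip(scaled, 0.0, 1.0)

pixels = np.array([-6.0, -5.0, 0.0, 2.5, 5.0, 7.0])

# wide limits: most values land strictly between the two ends
wide = normalize_for_colormap(pixels, -5, 5)     # 0, 0, 0.5, 0.75, 1, 1
# narrow limits: more values saturate at the ends of the colormap
narrow = normalize_for_colormap(pixels, -2, 2)   # 0, 0, 0.5, 1, 1, 1

print(wide)
print(narrow)
```

A narrower range spends the colormap on fewer values, which increases contrast but saturates the extremes.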


In [None]:
plt.imshow?

In [None]:
## How can we plot the last image of size 16x16 ?
## Can you complete the following command, plt.imshow(mydata[   ])?

In [None]:
## Since it is more fun to play with numbers, let's use MNIST datasets
xtest = np.load('xtest.pickle', allow_pickle=True)
ytest = np.load('ytest.pickle', allow_pickle=True)
## Question: what's the dimension of xtest?


In [None]:
xtest.shape

In [None]:
## Let's plot the first image of 28 x 28.
plt.imshow(xtest[0, :,:])

In [None]:
## Now, sometimes we need to convert 2D images into 1D vectors.
## For example, my2d = xtest[0,:,:] can be put into a 1D vector of length 28x28
my2dImage  = xtest[0, :, :]
my1d_flattened = my2dImage.flatten()
# or
my1d_reshaped = my2dImage.reshape(28*28,)

In [None]:
## Let's check the shape (dimension) of "my1d_flattened"
my1d_flattened.shape

In [None]:
my1d_reshaped.shape

In [None]:
# keep one row per image; -1 lets numpy work out the row length (28*28 = 784)
xtest2d = np.reshape(xtest, (xtest.shape[0], -1))

In [None]:
xtest2d.shape

In [None]:
#Let's make sure that the first row is the image of number "7" we saw above
plt.imshow(xtest2d[0,:].reshape(-1,28))
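
The round trip from images to rows and back can be checked without the MNIST files. This sketch assumes a hypothetical stand-in array `images` with 5 fake 28x28 "images"; `-1` asks numpy to infer the remaining dimension:

```python
import numpy as np

# stand-in for xtest: 5 fake images of 28x28 with known values
images = np.arange(5 * 28 * 28).reshape(5, 28, 28)

# one row per image; -1 is inferred as 28*28 = 784
rows = images.reshape(images.shape[0], -1)
print(rows.shape)       # (5, 784)

# back to 2D: 28 columns are given, the number of rows is inferred
restored = rows[0].reshape(-1, 28)
print(restored.shape)   # (28, 28)
print(np.array_equal(restored, images[0]))   # True
```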

## Plotting using subplots

In [None]:

# Let's plot four MNIST images, with random noise added, in a 2x2 grid of subplots

xtest = np.load('xtest.pickle', allow_pickle=True)
ytest = np.load('ytest.pickle', allow_pickle=True)

fig, axs = plt.subplots(2,2, figsize=(10,10))

fig.suptitle('MNIST figures', fontsize=28)
fig.subplots_adjust(hspace=0.5, wspace=0.5)
for i in range(2):
    for j in range(2):
        axs[i,j].imshow(xtest[i+2*j,:,:]+30*np.random.randn(28,28), cmap='gnuplot2')
        axs[i,j].set_xlabel('i', fontsize=20)
        axs[i,j].set_ylabel('j', fontsize=20)
        axs[i,j].set_title(str(ytest[i+2*j]), fontsize=30)
        axs[i,j].tick_params(labelsize=20)
        
#plt.subplots_adjust(wspace=1, hspace=1)
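
The expression `i+2*j` decides which image lands in which panel. The plain-Python sketch below (the names `layout` and `row_major` are illustrative only) prints that mapping and, for comparison, a row-by-row alternative:

```python
n_rows, n_cols = 2, 2

# mapping used in the loop above: the image index grows down each column first
layout = [[i + n_rows * j for j in range(n_cols)] for i in range(n_rows)]
for row in layout:
    print(row)
# [0, 2]
# [1, 3]

# alternative: fill the grid row by row instead
row_major = [[n_cols * i + j for j in range(n_cols)] for i in range(n_rows)]
for row in row_major:
    print(row)
# [0, 1]
# [2, 3]
```

So with `i+2*j`, the top row shows images 0 and 2, and the bottom row shows images 1 and 3.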
    

## Colors

In [None]:
import matplotlib.colors as mcolors

In [None]:
colormapping = mcolors.get_named_colors_mapping()
colormapping

In [None]:
mcolors.BASE_COLORS
## color name and rgb values

In [None]:
mcolors.CSS4_COLORS
## color name and hex value