# Matplotlib - All your plotting functions under one roof (almost!)



Matplotlib is the standard plotting library for Python. Most people use it, and it makes simple plotting easy.

So let's have a try!



In [None]:
# This will plot a simple scatter graph of points.
# The points vary in size and colour, purely for visual effect
%matplotlib inline
# Import the required libraries
import numpy as np
import matplotlib.pyplot as plt

# Let's say we want to plot 50 points
N = 50

# Generate some random data
x = np.random.rand(N)
y = np.random.rand(N)

# Add a random colour field
colors = np.random.rand(N)

# Alter the size of the particles randomly
area = np.pi * (15 * np.random.rand(N))**2  # 0 to 15 point radii

# Let's plot it! Simple right? Plot x, y. Alpha = the transparency of the points - very useful! 
plt.scatter(x, y, s=area, c=colors, alpha=0.5)

# Now for some labels....and a title (they always help)
plt.xlabel('Random x')
plt.ylabel('Random y')
plt.title('Awesome!')

# To see the figure we need this...
plt.show()

# Source: http://matplotlib.org/examples/shapes_and_collections/scatter_demo.html 



# Now let's have a look at some real data!


In [None]:
# First of all, let's read some data from a CSV file.
# Dataset: "Figure 1. Average Global Sea Surface Temperature, 1880-2015"
# Source: EPA's Climate Change Indicators in the United States: www.epa.gov/climate-indicators
# Data source: NOAA, 2016
# Units: temperature anomaly (°F)
# https://www.epa.gov/sites/production/files/2016-08/sea-surface-temp_fig-1.csv

# Import our libraries
import csv
import matplotlib.pyplot as plt
import numpy as np
import os

In [None]:
file_name = os.path.join(os.path.pardir, 'data', 'sea-surface-temp_fig-1.csv')

In [None]:
!head {file_name}

Reading CSV files can be tricky, especially if they have complex headers and formatting. Another source of problems is encoding. For example, in the file used in this exercise, the header contains a degree symbol on line 5. When Python tries to read the file as UTF-8, it stumbles upon this symbol and cannot decode it. Luckily, the rest of the file is UTF-8 compliant.
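
To see what goes wrong, here is a minimal sketch. It assumes the degree symbol is stored as the single byte `0xB0`, as it would be in Latin-1 or Windows-1252. The actual encoding of the file is not stated here, so treat that as an assumption.

```python
# Assumption: the header line stores the degree symbol as the single byte
# 0xB0 (Latin-1 / Windows-1252). The real file's encoding may differ.
header_line = b"Units: temperature anomaly (\xb0F),,,"

try:
    header_line.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as err:
    utf8_ok = False
    print("UTF-8 decoding failed:", err)

# The same bytes decode cleanly once a matching encoding is named
decoded = header_line.decode("latin-1")
print(decoded)
```

If that assumption holds, passing `encoding='latin-1'` to `open()` in text mode would apply the same decoding to the whole file.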

So what should you do? You might just open the file with a text editor and delete that line. But then you will lose valuable information about the contents of the file. And what if there are hundreds of these files to process?

So let's keep the header and try to skip it while reading.

### 1. "Brute force": open the file in binary mode, skip the header and decode the rest

In [None]:
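# Create some empty lists to collect the columns
years = []
anoms = []
# The header is 6 lines long, followed by one line of column names
skip_rows = 6
# Open the file in binary mode so that the header is never decoded
with open(file_name, 'rb') as csvfile:
    raw_lines = csvfile.readlines()
# Decode only the data lines (skip the header and the column names),
# then go through them row by row with the csv reader
data_lines = [line.decode() for line in raw_lines[skip_rows + 1:]]
for row in csv.reader(data_lines, delimiter=','):
    years.append(row[0])
    anoms.append(row[1])
# The csv reader returns strings, so convert them to numbers
years = np.array(years, dtype=int)
anoms = np.array(anoms, dtype=float)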

### 2. `numpy.genfromtxt()`

In [None]:
data = np.genfromtxt(file_name, delimiter=',', skip_header=skip_rows, names=True,
                     dtype=None)
years = data['Year']
anoms = data['Annual_anomaly']

Note: `pandas.read_csv()` offers similar functionality and returns a comparable, table-like data structure (a `DataFrame`).
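
As a rough sketch of the pandas route, the example below reads a small in-memory stand-in instead of the real file. The six header lines and the two data rows are placeholders made up for illustration, the column name `Annual anomaly` is inferred from the field name used above, and `encoding='latin-1'` is an assumption about how the degree symbol is stored.

```python
import io
import pandas as pd

# Stand-in for the real file: six header lines, one line of column names,
# then data. The data rows are placeholder values, not real measurements.
sample = b"\n".join([
    b"Title line",
    b"Source line",
    b"Data source line",
    b"Update line",
    b"Units: temperature anomaly (\xb0F)",
    b"Spacer line",
    b"Year,Annual anomaly",
    b"1880,-0.5",
    b"1881,-0.25",
])

# skiprows drops the header; the next line supplies the column names
df = pd.read_csv(io.BytesIO(sample), skiprows=6, encoding="latin-1")

years = df["Year"].to_numpy()
anoms = df["Annual anomaly"].to_numpy()
print(df)
```

With the real file, `io.BytesIO(sample)` would be replaced by `file_name`.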

### 3. `numpy.recfromcsv()` (almost the same, slightly different array type)

In [None]:
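# Note: np.recfromcsv was removed from the main namespace in NumPy 2.0;
# on newer versions, use np.genfromtxt with delimiter=',' instead
data2 = np.recfromcsv(file_name, skip_header=skip_rows)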
years = data2['year']
anoms = data2['annual_anomaly']
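
The field names differ between the two approaches (`Annual_anomaly` versus `annual_anomaly`) because of how column names are cleaned up. The sketch below shows this with `np.genfromtxt()` alone, using its `case_sensitive` option. The in-memory text is a stand-in for the file with its header already skipped, and the two data rows are placeholder values.

```python
import io
import numpy as np

# Stand-in for the file after the header has been skipped; the two data
# rows are placeholder values, not real measurements.
text = "Year,Annual anomaly\n1880,-0.5\n1881,-0.25\n"

# Default: spaces in column names become underscores, case is kept
a = np.genfromtxt(io.StringIO(text), delimiter=",", names=True, dtype=None)
print(a.dtype.names)

# case_sensitive='lower' also converts the names to lower case
b = np.genfromtxt(io.StringIO(text), delimiter=",", names=True, dtype=None,
                  case_sensitive="lower")
print(b.dtype.names)
```

The lower-case variant matches the keys used with `np.recfromcsv()` above, which suggests that function applies the same lower-casing.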

In [None]:
# Time to create our plot and give it a red colour because it is alarming! 
plt.plot(years, anoms, c='red')

# How about adding some labels? 
plt.xlabel('Year')
plt.ylabel('Temperature anomaly (°F)')
plt.title('Annual Anomaly')
plt.show()

# Now let's see if we can make some subplots

In [None]:
# Here we create a 1x2 grid of subplots that share the y axis (sharey=True)
f, (ax1, ax2) = plt.subplots(1, 2, sharey=True)
# Plot a line graph
ax1.plot(x, y)

# Plot a scatter graph
ax2.scatter(x, y)

# Remove the horizontal gap between the two side-by-side plots
f.subplots_adjust(wspace=0)

# Show me the plots!
plt.show()

In [None]:
import numpy as np
import matplotlib.pyplot as plt
 
# Create some random numbers
n = 100000
x = np.random.randn(n)
y = (1.5 * x) + np.random.randn(n)
 
# Plot data
fig1 = plt.figure()
plt.plot(x,y,'.r')
plt.xlabel('x')
plt.ylabel('y')
 
# Estimate the 2D histogram
nbins = 200
H, xedges, yedges = np.histogram2d(x,y,bins=nbins)
 
# H needs to be transposed (same result as rotating and then flipping it):
# histogram2d puts x along the first axis, pcolormesh expects y along rows
H = H.T
 
# Mask zeros
Hmasked = np.ma.masked_where(H==0,H) # Mask pixels with a value of zero
 
# Plot 2D histogram using pcolormesh
fig2 = plt.figure()
plt.pcolormesh(xedges,yedges,Hmasked)
plt.xlabel('x')
plt.ylabel('y')
cbar = plt.colorbar()
cbar.ax.set_ylabel('Counts')

plt.show()

# Source: https://oceanpython.org/2013/02/25/2d-histogram/ 
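
Why does `H` have to be reoriented before plotting? The small sketch below, with three made-up points, illustrates the axis order that `np.histogram2d()` uses: x runs along the first axis of `H`, whereas `pcolormesh` treats rows as y and columns as x.

```python
import numpy as np

# Three made-up points that all fall in the same corner: small x, large y
x = np.array([0.1, 0.1, 0.1])
y = np.array([0.9, 0.9, 0.9])

H, xedges, yedges = np.histogram2d(x, y, bins=2, range=[[0, 1], [0, 1]])

# histogram2d indexes H as H[x_bin, y_bin]
print(H[0, 1])    # count in (first x bin, second y bin)

# pcolormesh reads its array as C[row = y, column = x],
# so the array has to be transposed before plotting
print(H.T[1, 0])

# rot90 followed by flipud is just another way to write the transpose
same = np.array_equal(np.flipud(np.rot90(H)), H.T)
print(same)
```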