## Minimal example of Ordinary Least Squares regression

To run the example code, click in the first code cell below to select it, then press `Shift+Enter` or click the `Run` button on the toolbar above. This runs the code in that cell and moves to the next cell. Repeat the process to work through the example. Note that the cells MUST be run in sequence, from first to last.

### Import the packages we need

In [1]:
'''
---------------------------------------------------------------------
Minimal example of Ordinary Least Squares regression

                                           Roderick Brown, 31/3/2020
 --------------------------------------------------------------------
'''
# This line enables interactive plots
%matplotlib notebook

# The line below is not an ordinary comment: it declares the source encoding (UTF-8)
# -*- coding: utf-8 -*-

# Import required modules
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt


### Create some data

Use the `numpy.array` routine to create arrays containing the x values and y values. We set the data type to _float_ with `dtype=float`, which stores the values as floating-point (decimal) numbers.

In [2]:
# Create two arrays with the x and y values
x = np.array([1.1,2.3,3.1,3.8,5.1], dtype=float)
y = np.array([3.2,6.5,6.8,9.2,10.9], dtype=float)

# Print the arrays to the screen
print('x values:', x)
print('y values:', y)

x values: [1.1 2.3 3.1 3.8 5.1]
y values: [ 3.2  6.5  6.8  9.2 10.9]
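As a small aside on the `dtype=float` argument used above, the sketch below shows what it changes. The arrays `a` and `b` are illustrative names only and are not part of the example data.

```python
import numpy as np

# With dtype=float, integer inputs are stored as floating-point numbers
a = np.array([1, 2, 3], dtype=float)

# Without dtype, NumPy infers the type from the inputs (here an integer type)
b = np.array([1, 2, 3])

print(a, a.dtype)  # prints: [1. 2. 3.] float64
print(b.dtype)     # an integer type; the exact width depends on the platform
```

The example data already contain decimal points, so NumPy would infer a floating-point type anyway; setting `dtype=float` simply makes the intent explicit.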


### Plot the data on a graph

In [3]:
# Plot the x and y values
fig = plt.figure(1)  # This line creates a figure object to plot to
plt.scatter(x,y) # This line draws a scatter plot in the current figure object
plt.show() # This command draws the figure on the screen


### Run the regression

In [10]:
# Call the scipy.stats.linregress routine
slope, intercept, r_value, p_value, std_err = stats.linregress(x,y)

# Print some of the returned values of interest
# Note: r_value is the correlation coefficient R, so r_value**2 is R-squared
print(f'OLS slope: {slope:04.2f} +/- {std_err:04.2f}')
print(f'OLS intercept: {intercept:04.2f}')
print(f'R-squared: {r_value**2:04.2f}')


OLS slope: 1.90 +/- 0.21
OLS intercept: 1.46
R-squared: 0.96
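To see where these numbers come from, the sketch below recomputes the slope, intercept, correlation coefficient and slope standard error from the standard textbook OLS formulas and compares them with `stats.linregress`. The `*_manual` names and the intermediate sums `sxx`, `sxy` and `syy` are introduced here for illustration only; this is a check under those assumptions, not a description of how SciPy computes the fit internally.

```python
import numpy as np
from scipy import stats

x = np.array([1.1, 2.3, 3.1, 3.8, 5.1], dtype=float)
y = np.array([3.2, 6.5, 6.8, 9.2, 10.9], dtype=float)
n = x.size

# Deviations from the means and their sums of squares / cross products
dx = x - x.mean()
dy = y - y.mean()
sxx = np.sum(dx**2)
sxy = np.sum(dx * dy)
syy = np.sum(dy**2)

# Textbook OLS estimates for a straight line y = intercept + slope * x
slope_manual = sxy / sxx
intercept_manual = y.mean() - slope_manual * x.mean()
r_manual = sxy / np.sqrt(sxx * syy)

# Standard error of the slope from the residual variance (n - 2 degrees of freedom)
residuals = y - (intercept_manual + slope_manual * x)
slope_err_manual = np.sqrt(np.sum(residuals**2) / (n - 2) / sxx)

# Compare with the values returned by scipy.stats.linregress
result = stats.linregress(x, y)
print(f'slope:     {slope_manual:.4f} (manual) vs {result.slope:.4f} (linregress)')
print(f'intercept: {intercept_manual:.4f} (manual) vs {result.intercept:.4f} (linregress)')
print(f'R:         {r_manual:.4f} (manual) vs {result.rvalue:.4f} (linregress)')
print(f'slope err: {slope_err_manual:.4f} (manual) vs {result.stderr:.4f} (linregress)')
```

The two columns should agree to within floating-point rounding.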


### Plot the data and the fitted straight line

In [13]:
# Create a new plot with the x and y data and show the fitted line
fig = plt.figure(2)

plt.xlim(0,6)   # Set the limits of the x axis
plt.ylim(0,12)  # Set the limits of the y axis

plt.scatter(x,y,label='data',color='b') # Plot the data

# Set end points for plotting the fitted line, from x=0 to the largest x value
# Index -1 selects the last element of x, which is the maximum here because x is sorted in ascending order
xline = [0,x[-1]]
yline = [intercept,(x[-1]*slope+intercept)]

plt.plot(xline, yline,'r--',label='OLS regression extended') # Plot the fitted line

# Just plot fitted line using fitted parameters directly
plt.plot(x,((x*slope) + intercept), 'r-',label='OLS regression line') # Plot the fitted line

plt.ylabel('Y values')
plt.xlabel('X values')

plt.legend(loc='upper left')

# Write the plot to a PDF file. To save as PNG or TIFF instead, change the file extension and the format argument
#plt.savefig('ols_example.pdf', format='pdf')

# Show the plot on screen
plt.show()


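A quick numerical complement to the plot is to look at the residuals, i.e. the differences between the observed y values and the fitted line. The sketch below is a minimal check, assuming the same data as above; the names `y_fit`, `residuals` and `rmse` are introduced here for illustration. For an OLS fit with an intercept, the residuals should sum to zero within rounding error.

```python
import numpy as np
from scipy import stats

x = np.array([1.1, 2.3, 3.1, 3.8, 5.1], dtype=float)
y = np.array([3.2, 6.5, 6.8, 9.2, 10.9], dtype=float)

slope, intercept, r_value, p_value, std_err = stats.linregress(x, y)

# Fitted values on the regression line and the residuals about it
y_fit = slope * x + intercept
residuals = y - y_fit

# Root-mean-square size of the residuals, a rough measure of scatter about the line
rmse = np.sqrt(np.mean(residuals**2))

print('Residuals:', np.round(residuals, 2))
print(f'Sum of residuals: {residuals.sum():.2e}')
print(f'RMS residual: {rmse:.2f}')
```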