**The Zen of Python**

- Beautiful is better than ugly.
- Explicit is better than implicit.
- Simple is better than complex.
- Complex is better than complicated.
- Flat is better than nested.
- Sparse is better than dense.
- Readability counts.
- Special cases aren't special enough to break the rules.
- Although practicality beats purity.
- Errors should never pass silently.
- Unless explicitly silenced.
- In the face of ambiguity, refuse the temptation to guess.
- There should be one-- and preferably only one --obvious way to do it.
- Although that way may not be obvious at first unless you're Dutch.
- Now is better than never.
- Although never is often better than *right* now.
- If the implementation is hard to explain, it's a bad idea.
- If the implementation is easy to explain, it may be a good idea.
- Namespaces are one honking great idea -- let's do more of those!

**Common Imports**

The code below shows some commonly used imports for working with data in Python.

NumPy is a library for working with large arrays of data quickly and easily.

SciPy is a scientific library for data analysis. We'll only be using one function from it, but it offers many more for both basic and complex types of analysis.

Matplotlib is a plotting library for generating graphs of various kinds. If you install matplotlib locally, you can zoom and pan around a graph before saving it, and graphs can be saved to many different file types.

In [None]:
import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

**Some Numpy Basics**  
When using NumPy arrays, operations are performed elementwise where it makes sense to do so.  
This allows for some neat code, as shown below.

In [None]:
import numpy as np

x = np.arange(10)
print(x)
print(10*x)
print(x+10)

When an operation involves two arrays, it is applied to each pair of corresponding elements.

In [None]:
import numpy as np

x = np.arange(10)
y = np.arange(10)

print(x*y)
print(x+y)
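
The "where it makes sense" part matters: elementwise operations need compatible shapes. Here is a minimal sketch, using nothing beyond NumPy itself, that contrasts arrays with plain Python lists and shows what happens when the lengths don't match:

```python
import numpy as np

x = np.arange(5)           # array holding 0, 1, 2, 3, 4
as_list = list(range(5))   # the same values as a plain Python list

# Multiplying an array scales every element...
scaled = 2 * x
# ...whereas multiplying a list repeats it.
repeated = 2 * as_list

# Arrays of different lengths cannot be combined elementwise.
try:
    x + np.arange(3)
    mismatch_error = None
except ValueError as err:
    mismatch_error = err

print(scaled)
print(len(repeated))
print(type(mismatch_error).__name__)
```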

**Starting Out Graphing**

We'll start with a basic plot using matplotlib, specifically pyplot.
The imports are hidden inside the grapher module to keep things a little neater.

I've also made some helper functions for generating data.
The linearData function returns three things:
 - An array of x co-ordinates
 - An array of y co-ordinates
 - The function that this data is based on

The returned data has some random noise added so that it isn't just a perfect straight line (and to give us a reason for modelling later).

In [None]:
from grapher import *

x, y, f = linearData()

plt.scatter(x, y)
plt.plot(x, f(x), 'g')

The above code should have generated a scatter plot of the linear data, as well as a green line showing the function that generated this data without noise. Try changing the 'g' in the above code to an 'r' if you prefer the colour red to green.  
Your options here are b, g, r, c, m, y, k and w, which stand for blue, green, red, cyan, magenta, yellow, black and white.

If you want to play around with the above code, the linearData function accepts a few parameters, with the defaults shown below.

**Linear Data information**

linearData(n=100, m=2, c=5):
 - n is the number of data points to generate 
 - m is the slope of the function (Increase this number to make the graph steeper) 
 - c is the y offset of the function (Increase this number to lift the graph up)

You can also swap out the linear data for quadratic data if you'd prefer; it also has a few options for parameters.

**Quadratic Data information**  
quadraticData(n=100, a=1, b=0, c=0):
 - n is the number of data points to generate 
 - a, b and c are the coefficients of the quadratic function shown below
 
 $a\cdot x^2 + b\cdot x + c$
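
For reference, here is a rough sketch of what a quadratic generator and its matching fit function could look like. This is not the grapher module's actual code: the noise model and the seeded `rng` generator are assumptions, and the names carry a `_sketch` suffix to mark them as hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # assumed: seeded generator so runs are repeatable

def quadratic_function_sketch(x, a=1, b=0, c=0):
    # Evaluates a*x^2 + b*x + c elementwise on a NumPy array
    return a * x**2 + b * x + c

def quadratic_data_sketch(n=100, a=1, b=0, c=0):
    x = np.arange(n)
    # Assumed noise model: uniform offsets in [-10, 10]
    y = quadratic_function_sketch(x, a, b, c) + rng.uniform(-10, 10, n)
    # Noise-free function that the data is based on
    f = lambda x: quadratic_function_sketch(x, a, b, c)
    return x, y, f

x, y, f = quadratic_data_sketch(n=5, a=2, b=1, c=3)
print(f(x))  # noise-free values
print(y)     # noisy values
```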

**Basic modelling**  
Now we'll move on to a basic model.  
We're going to generate some linear data, and then try to fit a linear model to it.  
To anyone who has done stats, this may seem like a solved problem, but we're going to start here as a basic example and move to more complex functions later.  

We'll first generate our data set, then use curve_fit to recover the parameter values used to generate it, then plot the results.

In [None]:
from grapher import *

n = 100
m = 2
c = 5

x, y, f = linearData(n,m,c)
fitFunction = linearFunction

parameters, __ = curve_fit(fitFunction, x, y)

print("Fits:")
print("m: %.2f" % parameters[0])
print("c: %.2f" % parameters[1])
print("Actual values:")
print("m: %.2f" % m)
print("c: %.2f" % c)

plt.title('Data vs fit')
plt.scatter(x,y)
plt.plot(x, fitFunction(x, *parameters), 'r')
plt.show()

plt.title('Function vs fit')
plt.plot(x, fitFunction(x, *parameters), 'r')
plt.plot(x, f(x))

To adjust what's being fit in the above, just change the data and the function.
Like earlier, I've also made a quadraticFunction for fitting quadratic data.
You can play around with the variables as well.
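
curve_fit also returns a covariance matrix, which the code above discards into `__`. As a hedged, self-contained sketch (using its own synthetic data and a hypothetical `line` function rather than the grapher helpers), the square roots of that matrix's diagonal give a rough one-standard-deviation uncertainty for each fitted parameter:

```python
import numpy as np
from scipy.optimize import curve_fit

def line(x, m, c):
    # Hypothetical model function: x first, then the parameters to fit
    return m * x + c

rng = np.random.default_rng(1)
x = np.arange(100)
y = line(x, 2, 5) + rng.normal(0, 3, x.size)  # assumed Gaussian noise

parameters, covariance = curve_fit(line, x, y)
uncertainties = np.sqrt(np.diag(covariance))

for name, value, sigma in zip("mc", parameters, uncertainties):
    print("%s: %.2f +/- %.2f" % (name, value, sigma))
```

If the true values fall well outside the reported ranges, that is a hint that the model or the noise assumption doesn't suit the data.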

**Slightly more complex fitting**  
Now to a more interesting fit: a sine function.

sinData(n=100, a=1, b=1, c=0):
 - n is the number of data points to generate 
 - a, b and c are the coefficients of the sine function shown below
 
 $a\cdot\sin(b\cdot x + c)$
 
You can play around with the parameters again; an interesting one to adjust here is n. Note that n can't be less than 3, as we're trying to fit 3 parameters. (Try it at a low number, around 5-10.)

In [None]:
from grapher import *

n = 10
a = 5
c = 0.5

# It's not very good at fitting this one :)
b = 1

x, y, f = sinData(n,a,b,c)
fitFunction = sinFunction

parameters, __ = curve_fit(fitFunction, x, y)

print("Fits:")
print("a: %.2f" % parameters[0])
print("b: %.2f" % parameters[1])
print("c: %.2f" % parameters[2])
print("Actual values:")
print("a: %.2f" % a)
print("b: %.2f" % b)
print("c: %.2f" % c)

# Replace x with altx in the plotting calls if you want to see a smooth sine curve even with few data points
altx = np.linspace(-4,10,1000)

plt.title('Data vs fit')
plt.scatter(x,y)
plt.plot(x, fitFunction(x, *parameters), 'r')
plt.show()

plt.title('Function vs fit')
plt.plot(x, fitFunction(x, *parameters), 'r')
plt.plot(x, f(x))

This one is a lot more finicky, and stretches the limits of how curve_fit is being used here.  
There are ways to improve the fit, but I think they're out of scope here as they require a better data setup and some knowledge of experimental data processing. I'd be glad to chat about them if anyone is interested!
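
One of the simpler improvements is to give curve_fit a starting guess through its `p0` argument; without one, every parameter starts at 1. The sketch below is self-contained and hedged: the `sine` function, the sample points, the noise level and the guess values are all assumptions for illustration, not the grapher module's code.

```python
import numpy as np
from scipy.optimize import curve_fit

def sine(x, a, b, c):
    # Hypothetical model: a*sin(b*x + c)
    return a * np.sin(b * x + c)

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 50)
y = sine(x, 5, 1, 0.5) + rng.normal(0, 0.2, x.size)  # assumed Gaussian noise

# Rough starting guess, e.g. read off a plot: amplitude, frequency, phase
guess = [4, 0.95, 0.4]
parameters, __ = curve_fit(sine, x, y, p0=guess)

print("a: %.2f, b: %.2f, c: %.2f" % tuple(parameters))
```

The frequency guess tends to matter most, because a sine wave with the wrong frequency drifts out of step with the data and the fit can settle on a poor local solution.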

You can use this to fit any function you want, as shown throughout this workbook.  
The code below shows the format of the functions used to generate data and to fit it throughout.

In [None]:
from grapher import *

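def linearData(n=100, m=2, c=5):
    # x co-ordinates: 0, 1, ..., n-1
    x = np.arange(n)

    # Scale each ideal y value by a random factor from randomArray(n, 0.9, 1.1)...
    y = randomArray(n, 0.9, 1.1) * (x * m + c)
    # ...then add a random offset from randomArray(n, -10, 10)
    y = y + randomArray(n, -10, 10)

    # The noise-free function that the data is based on
    f = lambda x: x * m + c

    return x, y, f

def linearFunction(x, m=2, c=5):
    # Fit function: x first, followed by the parameters curve_fit should find
    return x * m + c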
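
Following the same format, a new pair of functions might look like the sketch below. The square-root model, the `rootData` and `rootFunction` names, and the uniform noise are all hypothetical and are not part of the grapher module; the point is only the shape of the functions: the fit function takes x first and then its parameters, and the data function returns x, y and the noise-free function.

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(3)  # assumed: seeded generator so runs are repeatable

def rootFunction(x, a=2, c=1):
    # Fit function: independent variable first, then the parameters
    return a * np.sqrt(x) + c

def rootData(n=100, a=2, c=1):
    x = np.arange(n)
    # Assumed noise model: uniform offsets in [-0.5, 0.5]
    y = rootFunction(x, a, c) + rng.uniform(-0.5, 0.5, n)
    f = lambda x: rootFunction(x, a, c)
    return x, y, f

x, y, f = rootData(n=100, a=3, c=4)
parameters, __ = curve_fit(rootFunction, x, y)

print("a: %.2f" % parameters[0])
print("c: %.2f" % parameters[1])
```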