# Machine Learning From First Principles
### A Workshop by Jack Fraser-Govil


Hello, and welcome to this notebook. Along with the notes, presentation and discussions which we will be having throughout the day, this document will walk you through a series of tasks and exercises which will culminate, by the end of the workshop, in you possessing a fully working Feedforward Neural Network which you have designed and written from scratch. 

In doing so, we hope that you will gain a degree of insight into how these machines are 'thinking' under the hood, pull back some of the mystery and folklore that surrounds ML, and get a peek inside the black box. We hope that this knowledge will help inform the discussions and presentations for the rest of this conference. 

## What Do I Need?

In order to run this notebook, all you need is the ability to run IPython/Jupyter notebooks and some standard python libraries: mostly ```numpy``` and ```matplotlib```. You do *not* need to have ```tensorflow``` or ```pytorch```: the entire point of this exercise is to create a learning framework without relying on pre-built tools.

You *will* need to have read, or be confident that you already know, the mathematics contained within [the provided notes](../Notes/Notes.pdf). If you haven't read them yet (and are not confident that your skillset includes the vector chain rule and matrix algebra), then I suggest you speed-read Chapters 2, 3 and 4.

The final thing you need is a willingness to engage with some of the murkier, more difficult and mathematically complex components of the workshop: teasing out what these things *mean* is why you are here; so if something sticks out to you as something you don't understand, don't just let it wash over you -- please ask and we can discuss. 


## What if I get stuck?

The first port of call is always to ask me, one of my helpers, or simply someone near to you, for guidance. 

If there's a concept you just can't grasp during the day, I will point you to some resources that you can look at afterwards, and hopefully things will become clearer in retrospect. 

If you get stuck on one of the coding exercises, or cannot complete it before it is time to move on, you will find a [second notebook in this same directory](notebook_solved.ipynb). This notebook will have all of the coding exercises solved for you. That is not to say that it is 'correct', only that this notebook will be *functional*. If you get really, really, really stuck, you can start stealing code from that document. 

##### However

Whilst it is certainly possible (and permitted) for you to do this at any time, the entire purpose of this workshop is for you to come away with the satisfaction of having written your own learning networks from scratch. We are, by definition, reinventing the wheel here: everything we make will be orders of magnitude worse than pytorch or tensorflow; the principle is that **you learn best by doing it yourself**.

If you're going to use the 'cheat sheet', then try, first of all, to simply look at it to understand what it is doing, and then implement your own version. Don't directly copy whole chunks of code you don't understand, as that sort of defeats the point of today's efforts.


## Jack, Why is this Python Written Like Garbage?

That's a very good and very important question. **I am not a python developer**. As a member of the High Performance Algorithms group, I spend most of my time buried in C++ and C code. I have written this workshop in python on the assumption that, since the majority of existing ML products are pythonic, it would be the most accessible language. There are often times where my python code is written like someone used to C syntax, and blithely unaware of the beautiful, elegant, pythonic way of doing things.

I'm afraid you'll just have to deal with my slightly sub-par python. 

What was **not** a result of subpar python is the choice not to use ```pandas``` for data. I made a conscious decision to keep things as 'mathematical' as possible, rather than abstracted into dataframes. You get a list of vectors (and maybe some labels), and have to deal with them manually: that's part of the bare-bones experience!

I also use ```CamelCase```, not ```snake_case```. Fight me.

## Exercise 1: The Pretrained Perceptron

Just to get the juices flowing, let's get started with an easy project.

**In the ```Data``` directory is a file ```cuteness.dat```, which contains (non-dimensional) measures of size and furriness of some animals, as well as some human-curated labels (1 for cute, 0 for not cute). Please write a Perceptron function to evaluate (without the labelled data) which of these creatures the model currently mislabels.**  

I've provided some basic functions and a class framework for you to work from.

Recall that for an input vector $\mathbf{x}$ and weights $\mathbf{w}$, the Perceptron algorithm is:

\begin{equation}
P(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{w}\cdot \begin{bmatrix} 1 \\ \mathbf{x} \end{bmatrix} > 0 \\ 0 & \text{else}\end{cases}
\end{equation}
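To make the rule concrete before you write your own version, here is a minimal sketch that evaluates it for two made-up points. The function name ```PerceptronDecision``` and both example vectors are assumptions for illustration only (they are not taken from the dataset); the weights are the ones pre-provided in the class further down.

```python
import numpy as np

def PerceptronDecision(weights, x):
	"""Return 1 if w . [1, x] > 0, else 0 (the rule in the equation above)."""
	augmented = np.concatenate(([1.0], x))  # prepend the constant 1 to the input vector
	return int(np.dot(weights, augmented) > 0)

w = np.array([1, 1, -1])           # the pre-provided weights used in this exercise
exampleA = np.array([2.0, 0.5])    # hypothetical point: 1 + 2.0 - 0.5 = 2.5 > 0
exampleB = np.array([0.5, 3.0])    # hypothetical point: 1 + 0.5 - 3.0 = -1.5, not > 0

print(PerceptronDecision(w, exampleA))  # 1
print(PerceptronDecision(w, exampleB))  # 0
```

With these weights, the first component of $\mathbf{w}$ acts as an offset, and the remaining two weigh the two input measurements against each other.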

In [27]:
#If you are running this in Google Colab, then you need to uncomment these lines; they clone the workshop git repository locally so that you have access to the data we will be using.
# !git clone https://github.com/wtsi-hpag/xAIWorkshop.git
# %cd xAIWorkshop/Code

# This is the extent of fancy libraries we will be using
import numpy as np
import matplotlib.pyplot as pt

##load and parse the data manually. No pandas to be seen here! This function will be used repeatedly to load in our test data. 
def LoadCategoricalData(file):
	names = []
	labels = []
	data = []
	with open(file) as f: ## use a separate name for the handle so it doesn't shadow the filename argument
		for line in f:
			entries = line.rstrip().split(' ')
			names.append(entries[0])
			labels.append(int(entries[1]))
			data.append(np.array([ float(entries[i]) for i in range(2,len(entries))]))
	return names, labels, data

names, labels, data = LoadCategoricalData("Data/cuteness.dat")

In [None]:
## Here's a basic plotting function, feel free to write your own
def PlotData(positions, labels,names=None):
	labels = [int(b) for b in labels] ## just to make sure the data is ints we can index in, not bools
	cols = ["red","blue"]
	x,y = zip(*positions)
	pt.scatter(x,y,color=np.array(cols)[labels])
	if names is not None:
		for i,pos in enumerate(positions):
			pt.annotate(names[i],pos)
	pt.show()

pt.title("True Assignments")
pt.xlabel("Size")
pt.ylabel("Fur")
PlotData(data,labels,names)

In [29]:
class Perceptron:

	def __init__(self):
		self.Weights = np.array([1,1,-1]) ## we'll use these pre-provided weights for now

	def Train(self,data,labels):
		##ignore this for now -- you'll fill it in later.
		return None

	def Predict(self,input):
		
		##this should return a value of 0 or 1, depending on how the Perceptron classifies the input vector
		## If you're feeling advanced, you might write this function to operate both on a single vector and a list-of-vectors. 
		return None ##??

## Initialise, then call your Perceptron on data. 

## Exercise 2: Training The Perceptron

You should (hopefully) now have a working Perceptron; the only issues are that a) it's absolutely terrible and b) I had to give you the weights. 

The next challenge is to write up a training algorithm for the Perceptron. 

**Fill in the ```Train``` method of the Perceptron, so that it iterates over the dataset a number of times (you choose - when is 'enough'?) and updates the weights.**

Remember, given a datum $\mathbf{x}$, a prediction $P$ and the correct label $L$ the update formula is:

\begin{equation}
	\mathbf{w} \to \mathbf{w} + r \times \left(L - P(\mathbf{x}) \right)\mathbf{x}
\end{equation}
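where $r$ is your learning rate. Here $\mathbf{x}$ stands for the augmented vector $\begin{bmatrix} 1 \\ \mathbf{x} \end{bmatrix}$ from Exercise 1, so that it has the same length as $\mathbf{w}$. This formula nudges the weights along the direction of $\mathbf{x}$ whenever the classifier makes a false prediction (when $L \neq P$): towards it when the prediction was too low, and away from it when the prediction was too high. When the prediction is correct, $L - P = 0$ and the weights are left alone.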
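As a sanity check on the formula, here is a sketch of a single update step on one made-up datum. The name ```UpdateWeights```, the example vector and the rate of 0.1 are all assumptions for illustration; the sketch also assumes that $\mathbf{x}$ is augmented with a leading 1, so that it matches the three-component weights. It is one step only, not a full training loop.

```python
import numpy as np

def UpdateWeights(weights, x, label, rate=0.1):
	"""Apply one step of w -> w + r * (L - P) * [1, x]."""
	augmented = np.concatenate(([1.0], x))             # same augmentation as in the prediction rule
	prediction = int(np.dot(weights, augmented) > 0)   # P(x), either 0 or 1
	return weights + rate * (label - prediction) * augmented

w = np.array([1.0, 1.0, -1.0])
x = np.array([2.0, 0.5])   # hypothetical datum; these weights predict 1 for it

wWrong = UpdateWeights(w, x, label=0)  # prediction was wrong, so the weights move
wRight = UpdateWeights(w, x, label=1)  # prediction was right, so nothing changes

print(wWrong)  # [ 0.9   0.8  -1.05]
print(wRight)  # [ 1.  1. -1.]
```

Notice that the function returns a new array rather than modifying the weights in place; inside a class you would more likely assign the result back to ```self.Weights```.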

In [30]:
P = Perceptron()

P.Train(data,labels) ## fill in the Train method above!

## then test!

## Exercise 3: The Wrench in the Works

If everything has gone to plan, you now have an algorithm that can correctly identify whether a given animal is cute or not, based on measures of their size and amount of fur. 

Now try running your Perceptron training & testing routines on a new dataset: ```cuteness_augmented```.

**What goes wrong?**

In [31]:
names, labels, data = LoadCategoricalData("Data/cuteness_augmented.dat")
P = Perceptron()

P.Train(data,labels) 

## then test it, and find out what has gone wrong....

### Optional Task: The Nonlinear Perceptron

It is perfectly possible to generate a Perceptron which can deal with these difficult and non-linear cases: the trick is that you have to put the nonlinearity in yourself. 

If you have time, try the following:

**Write a nonlinear perceptron which transforms the input vector (x,y) into a nonlinear vector (such as (x,y,x^2,y^3 sin(x))), and then uses this as the vector in a standard Perceptron algorithm. Does it show any improvements? How much work does it take to make the algorithm perfectly separate our cute animals from the uggos?**
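To see why transforming the input can help at all, here is a sketch on a toy problem that is unrelated to the cuteness data: one point near the origin, surrounded on all sides by points of the other class. No straight line separates them in $(x,y)$, but after mapping to $(x, y, x^2, y^2)$ a single set of weights does the job. The transform, the points and the hand-picked weights are all assumptions chosen for illustration, not the transform you should necessarily use.

```python
import numpy as np

def QuadraticFeatures(x):
	"""Map (x, y) to (x, y, x^2, y^2): one hypothetical choice of nonlinear transform."""
	return np.array([x[0], x[1], x[0]**2, x[1]**2])

def Classify(weights, features):
	"""Standard Perceptron rule on the transformed vector: 1 if w . [1, features] > 0."""
	augmented = np.concatenate(([1.0], features))
	return int(np.dot(weights, augmented) > 0)

## hand-picked weights encoding 1 - x^2 - y^2 > 0, i.e. 'inside the unit circle'
circleWeights = np.array([1.0, 0.0, 0.0, -1.0, -1.0])

points = [np.array([0.2, 0.1]),   # inside the circle
	np.array([1.5, 0.0]), np.array([-1.5, 0.0]),   # outside, left and right
	np.array([0.0, 1.5]), np.array([0.0, -1.5])]   # outside, above and below

predictions = [Classify(circleWeights, QuadraticFeatures(p)) for p in points]
print(predictions)  # [1, 0, 0, 0, 0]
```

The classifier is still linear in the transformed vector, which is why the usual training rule can still be applied to it; the curvature of the boundary comes entirely from the features you chose to add.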



In [32]:
class NonLinearPerceptron:

	def __init__(self):
		self.Weights = None
	
	def Train(self, data, labels):
		return None
	def Predict(self, input):
		nonLinear = self.ConvertLinearVector(input) ## this is the magic trick. Why does it work?

		return None
	def ConvertLinearVector(self, input):
		transformedVector = input ## you should do something here to add in non-linearity!
		return transformedVector


### Optional Task: The XOR Problem

If you've managed to get a non-linear Perceptron working, you might be feeling pretty pleased with yourself! You've managed to (with a single node, in modern terminology) fit a highly nonlinear classifier. Let's see how far we can push it...

**In the ```Data``` directory is a series of datasets called ```XOR_n``` (where ```n``` ranges from 10 to 320). Try fitting these with your non-linear classifier. What goes wrong?**
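As a warm-up, here is a sketch using the textbook four-point XOR as a stand-in; it makes no assumption about what the ```XOR_n``` files themselves contain. The four points cannot be separated by a straight line, but adding the product $xy$ as an extra feature is enough. The function names and the hand-picked weights are assumptions for illustration.

```python
import numpy as np

def ProductFeatures(x):
	"""Map (x, y) to (x, y, x*y): a hypothetical transform chosen for this toy case."""
	return np.array([x[0], x[1], x[0] * x[1]])

def Classify(weights, features):
	"""Standard Perceptron rule: 1 if w . [1, features] > 0, else 0."""
	augmented = np.concatenate(([1.0], features))
	return int(np.dot(weights, augmented) > 0)

## the classic XOR truth table: label is 1 when exactly one input is 1
xorPoints = [np.array([0.0, 0.0]), np.array([0.0, 1.0]),
	np.array([1.0, 0.0]), np.array([1.0, 1.0])]
xorLabels = [0, 1, 1, 0]

## hand-picked weights encoding -0.5 + x + y - 2xy > 0
handWeights = np.array([-0.5, 1.0, 1.0, -2.0])

xorPredictions = [Classify(handWeights, ProductFeatures(p)) for p in xorPoints]
print(xorPredictions == xorLabels)  # True
```

One product term is enough for four points. Whether the same hand-crafted approach keeps up as the pattern gets more intricate is what this task asks you to find out.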

In [33]:
names, labels, data = LoadCategoricalData("Data/XOR_10.dat")
P = NonLinearPerceptron()
P.Train(data,labels)
## then test -- and repeat with XOR_20, 40, 80 and higher
## What goes wrong?