# Chapter Zero: Facts about sets

## Create a simple if then statement

In logic, `if... then...` is represented by `->`.
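To see the logical `->` in code, here is a minimal sketch. The helper name `implies` is hypothetical, and it assumes the usual material-implication reading, where `p -> q` is false only when `p` is true and `q` is false.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material implication: p -> q is equivalent to (not p) or q."""
    return (not p) or q

# Print the full truth table for p -> q.
for p, q in product([True, False], repeat=2):
    print(f"{p!s:5} -> {q!s:5} : {implies(p, q)}")
```

Only the row with `p` true and `q` false comes out false.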


In Python we use an `if... else...` statement, as seen below:

```
if condition:
    do something
else:
    do something else
```

The `if` branch plays the role of the `if... then...`, and the `else` branch encompasses everything the `if... then...` does not cover.


Below is a simple `if... else...` that checks whether one quantity is greater than another:

In [1]:
a = 2
b = 3

if a > b:
    print(True)
else:
    print(False)

False


What if we want to compare using all of the operators, not just greater than?

We would want to make something like a general comparison function.

Something that takes two quantities and an operator, then evaluates the truthiness of the comparison.

Like this pseudocode below:

```
PSEUDOCODE
if a "operator" b then do something

where "operator" exists in set {'>','<','>=','<=','=='}
```

In [2]:
import operator

def get_truth(a, op, b):
    ops = {'>': operator.gt,
           '<': operator.lt,
           '>=': operator.ge,
           '<=': operator.le,
           '==': operator.eq}
    # look up the function for this symbol and apply it to a and b
    return ops[op](a, b)

print(f"{a}> {b}:  {get_truth(a, '>', b)}")
print(f"{a}< {b}:  {get_truth(a, '<', b)}")
print(f"{a}>={b}:  {get_truth(a, '>=', b)}")
print(f"{a}<={b}:  {get_truth(a, '<=', b)}")
print(f"{a}=={b}:  {get_truth(a, '==', b)}")

2> 3:  False
2< 3:  True
2>=3:  False
2<=3:  True
2==3:  False


Here you can see that instead of an `if... else...` statement we use a more general function, `get_truth(quantity1, operator, quantity2)`.
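One thing `get_truth` does not handle is an operator symbol outside the set, which would surface as a bare `KeyError`. Below is a minimal sketch of a stricter variant. The names `OPS` and `get_truth_checked` are hypothetical and not part of the cell above.

```python
import operator

# Same lookup table as in get_truth, hoisted to module level.
OPS = {
    '>': operator.gt,
    '<': operator.lt,
    '>=': operator.ge,
    '<=': operator.le,
    '==': operator.eq,
}

def get_truth_checked(a, op, b):
    """Like get_truth, but with a readable error for unknown symbols."""
    if op not in OPS:
        raise ValueError(f"unsupported operator {op!r}; expected one of {sorted(OPS)}")
    return OPS[op](a, b)

print(get_truth_checked(2, '<', 3))

# '!=' is not in the operator set, so this raises ValueError.
try:
    get_truth_checked(2, '!=', 3)
except ValueError as err:
    print(err)
```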

What we can do with this is start to teach computers things about quantities.

Really just about numbers, since in this setting we only apply the operators to numerical quantities.

This is where the magic happens: we can figure out the logic without even having to evaluate the operand.

## Introduction to simple neural networks

A neural network is a simple device that takes an input and produces an output.

In this case our network would take an input vector of 2 numbers, say `[1,2]`, and produce an output vector of booleans (`True` or `False`), one for each operator in our operator set `{'>','<','>=','<=','=='}`.

So essentially, if our input is `[1,2]`, our output vector would be `[False, True, False, True, False]`. However, in order for our simple neural network to read our data, we will need to convert our boolean vector to `[0,1,0,1,0]` instead.
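As a quick check of that encoding, here is a small sketch that evaluates the five comparisons for the pair `[1,2]` and converts the booleans to integers. The variable names are illustrative only.

```python
import operator

# Same order as the operator set: '>', '<', '>=', '<=', '=='.
comparisons = [operator.gt, operator.lt, operator.ge, operator.le, operator.eq]

left, right = 1, 2
truth_vector = [compare(left, right) for compare in comparisons]
target_vector = [int(value) for value in truth_vector]

print(truth_vector)   # [False, True, False, True, False]
print(target_vector)  # [0, 1, 0, 1, 0]
```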

Since the input will be a 2x1 vector of dense inputs and the output will be a 5x1 vector of sparse outputs, it should be easier to learn dense to sparse than the other way around.

To train a network to learn this code without actually programming the logic, we will need to produce a set of examples.

Our input data will be 10k samples of pairs of randomly generated numbers and our output data will be 10k vectors each of length 5.

First let's generate our input, then let's generate our output.

### Generating our training data

In [3]:
import numpy as np

# generate an array of random numbers with shape (10000, 2)
input_vec = np.random.randn(10000, 2)

# Create a function that takes an input pair and produces a length-5 vector
def generate_output(input_vec):
    a, b = input_vec
    ops = ['>','<','>=','<=','==']
    return np.array(
        [int(get_truth(a, op, b)) for op in ops]
    )

# Let's look at the first example that we generated
input_vec[0], generate_output(input_vec[0])

(array([-0.85529728, -0.48681754]), array([0, 1, 0, 1, 0]))

In [4]:
# Now create the output that we are interested in learning
# by applying the output function along the axis of our input

output_vec = np.apply_along_axis(generate_output, 1, input_vec)

input_vec[0:5], output_vec[0:5]

(array([[-0.85529728, -0.48681754],
        [ 1.48658393, -0.98458064],
        [-1.12342034,  0.45699529],
        [-0.59698888, -0.01480697],
        [-0.37187381,  0.58280404]]), array([[0, 1, 0, 1, 0],
        [1, 0, 1, 0, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 0, 1, 0],
        [0, 1, 0, 1, 0]]))
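`np.apply_along_axis` calls `generate_output` once per row in Python, which is fine for 10k rows. For larger datasets, the same targets can be built with whole-array comparisons. This is a sketch of one alternative, not a replacement for the cell above. The function name `generate_output_vectorized` and the seeded generator are assumptions.

```python
import numpy as np

def generate_output_vectorized(pairs):
    """Compare column 0 against column 1 for every row at once."""
    left, right = pairs[:, 0], pairs[:, 1]
    # Column order matches ['>', '<', '>=', '<=', '=='].
    return np.stack(
        [left > right, left < right, left >= right, left <= right, left == right],
        axis=1,
    ).astype(int)

# Seeded generator so the sample is reproducible (an assumption).
rng = np.random.default_rng(0)
sample_pairs = rng.standard_normal((10000, 2))
sample_targets = generate_output_vectorized(sample_pairs)
print(sample_targets.shape)
```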

## Building a simple neural network

Now that we have generated our training data we can work on:

1. putting the data into the proper format for torch
2. creating our simple test-case network, 2x10x5
3. experimenting with larger networks if the simple network is not enough
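For step 1, a minimal sketch of one way to get NumPy arrays into torch is shown below. It builds stand-in arrays with the same shapes as `input_vec` and `output_vec` so that it runs on its own. The batch size of 64 and the names `features`, `labels`, `X`, `y`, and `loader` are assumptions.

```python
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

# Stand-in arrays with the same shapes as input_vec and output_vec.
rng = np.random.default_rng(0)
features = rng.standard_normal((10000, 2))
left, right = features[:, 0], features[:, 1]
labels = np.stack(
    [left > right, left < right, left >= right, left <= right, left == right],
    axis=1,
)

# nn.Linear weights are float32 by default, so cast before converting.
X = torch.from_numpy(features.astype(np.float32))
y = torch.from_numpy(labels.astype(np.float32))

# Hypothetical batch size of 64; shuffle so each epoch sees a new order.
loader = DataLoader(TensorDataset(X, y), batch_size=64, shuffle=True)

xb, yb = next(iter(loader))
print(xb.shape, yb.shape)
```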

One of the issues I see arising from our early network experimentation is a very unbalanced class situation: there will be very few or no examples of equality, since the pairs are randomly drawn.
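One way to look at the class balance is to count the positive examples in each output column. The sketch below regenerates a seeded sample so it runs on its own. The seed and variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded for reproducibility (an assumption)
pairs = rng.standard_normal((10000, 2))
left, right = pairs[:, 0], pairs[:, 1]
targets = np.stack(
    [left > right, left < right, left >= right, left <= right, left == right],
    axis=1,
).astype(int)

# Number of positive examples per operator column.
counts = targets.sum(axis=0)
for name, count in zip(['>', '<', '>=', '<=', '=='], counts):
    print(f"{name:>2}: {count}")
```

Two continuous random draws are almost never exactly equal, so the `==` column is expected to be empty or nearly so.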

Also, because the inputs are sampled from a standard normal distribution, the range of values the network sees will be small, so it will likely be useless for larger numbers.
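To get a feel for the range concern, the sketch below prints the spread of a standard normal sample next to a hypothetical alternative: integer pairs drawn from -1000 to 1000. The alternative is an assumption, not something this chapter uses. It widens the range and also produces some exact ties.

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard normal draws cluster within a few units of zero.
normal_pairs = rng.standard_normal((10000, 2))
print(normal_pairs.min(), normal_pairs.max())

# Hypothetical alternative: integers in [-1000, 1000], stored as floats.
wide_pairs = rng.integers(-1000, 1001, size=(10000, 2)).astype(float)
print(wide_pairs.min(), wide_pairs.max())

# Each pair ties with probability 1/2001, so a few equal pairs are expected.
print((wide_pairs[:, 0] == wide_pairs[:, 1]).sum(), "equal pairs")
```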

However, let's experiment and see if any of this is true.

Here is our simple neural network:

[insert picture here]

### Introduction to Pytorch

[PyTorch](https://pytorch.org) is cool.

We'll use it here to create neural networks.

#### Create models

First let's check which device we are using.

In [6]:
import torch

# Get cpu or gpu device for training.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Using {device} device")

Using cpu device


The output says `cpu` because `torch.cuda.is_available()` returned `False` on this machine. That is fine for now, because we don't have any networks to accelerate yet.

In [9]:
import torch.nn as nn

# Define model
class NeuralNetwork(nn.Module):
    def __init__(self):
        super().__init__()
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(2,10),
            nn.ReLU(),
            nn.Linear(10, 5)
        )

    def forward(self, x):
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork().to(device)

This is the simplest class-based approach. Personally I am used to the functional method, as I used in [this repo](https://github.com/rcgalbo/xor_example).

It is still pretty self-explanatory. We define our class, `NeuralNetwork`, which extends the `torch.nn.Module` class. All that means is that there is a thing called `torch.nn.Module` and now our class `NeuralNetwork` is that thing too. We could have called it anything; this is just copied straight from the torch docs.

Then we do this thing `super().__init__()`, which initializes the parent class `torch.nn.Module` so that it can keep track of our layers and their parameters. Don't hurt your brain thinking about it too much, just remember you have to type that first in your `__init__(self)`. `nn.Flatten()` does exactly what it sounds like: it flattens every dimension after the first (the batch dimension) into one. Our inputs are already shaped `(batch, 2)`, so here it leaves them unchanged.
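Here is a quick sketch of what `nn.Flatten()` does with its default arguments. The tensors are made-up examples, and the 3x3 case is hypothetical and only there for contrast.

```python
import torch
import torch.nn as nn

flatten = nn.Flatten()  # defaults: start_dim=1, end_dim=-1

batch_of_pairs = torch.zeros(4, 2)      # already (batch, features)
batch_of_grids = torch.zeros(4, 3, 3)   # hypothetical 3x3 inputs

print(flatten(batch_of_pairs).shape)    # torch.Size([4, 2])
print(flatten(batch_of_grids).shape)    # torch.Size([4, 9])
```

The batch dimension is kept, and everything after it is merged into one axis.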

Here is where it gets interesting. We pack all of our network code inside of `self.linear_relu_stack` by calling `nn.Sequential()`, which says that we will pass our data through the layers one after another, in order. Inside of that `nn.Sequential()` is our layer definition:

```
  nn.Linear(2,10),                # input 2 dims, output 10 dims
  nn.ReLU(),                      # Rectified Linear Unit Activation 
  nn.Linear(10, 5),               # input 10 dims, output 5 dims
```

In [10]:
model

NeuralNetwork(
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (linear_relu_stack): Sequential(
    (0): Linear(in_features=2, out_features=10, bias=True)
    (1): ReLU()
    (2): Linear(in_features=10, out_features=5, bias=True)
  )
)
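Before training, it can help to push a small batch through the network and confirm the output shape. The sketch below uses a stand-in `nn.Sequential` with the same three layers as the model printed above, so it runs on its own. The name `net` and the random batch are assumptions.

```python
import torch
import torch.nn as nn

# Stand-in with the same layers as the model printed above.
net = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 5))

batch = torch.randn(4, 2)    # 4 random input pairs
with torch.no_grad():        # no gradients needed for a shape check
    logits = net(batch)

print(logits.shape)          # torch.Size([4, 5])

# Logits are unbounded; a sigmoid squashes each one into (0, 1).
probs = torch.sigmoid(logits)
print(probs)
```

Each row holds five values, one per operator. Until the network is trained, they carry no meaning.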

#### Optimizing parameters

We need something called a loss function that specifies how we will penalize the model for being wrong.

We will use `nn.CrossEntropyLoss()` as our loss function.

We will also use SGD (stochastic gradient descent) as our optimizer.

In [11]:
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
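A note on the loss choice: `nn.CrossEntropyLoss` is built for targets where one class is correct per example (or where the target is a probability distribution over classes). Our targets can contain several 1s at once, for example `[0,1,0,1,0]`. A common choice for that kind of multi-label target is `nn.BCEWithLogitsLoss`, which treats each of the five outputs as its own yes/no question. The sketch below is an alternative to consider, not the choice made in the cell above. The name `multi_label_loss` is an assumption.

```python
import math

import torch
import torch.nn as nn

# Alternative loss (an assumption, not the choice made in the cell above).
multi_label_loss = nn.BCEWithLogitsLoss()

logits = torch.zeros(1, 5)                      # an untrained "no idea" output
target = torch.tensor([[0., 1., 0., 1., 0.]])   # encoding for the pair [1, 2]

loss = multi_label_loss(logits, target)

# With all logits at 0, every sigmoid is 0.5, so the loss is ln(2).
print(loss.item(), math.log(2))
```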

#### Training Loop

Here is where the magic happens: the model parameters are updated by our optimizer using gradient descent.
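A minimal sketch of such a loop is shown below, under several assumptions that differ from the cells above so that it runs quickly on its own. It uses 1,000 stand-in samples, `nn.BCEWithLogitsLoss` (because the targets are multi-label), a learning rate of `0.1`, and 20 epochs. All names prefixed with `demo_` are hypothetical.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in data: 1,000 random pairs and their five comparison labels.
X = torch.randn(1000, 2)
left, right = X[:, 0], X[:, 1]
y = torch.stack(
    [left > right, left < right, left >= right, left <= right, left == right],
    dim=1,
).float()

demo_model = nn.Sequential(nn.Linear(2, 10), nn.ReLU(), nn.Linear(10, 5))
demo_loss_fn = nn.BCEWithLogitsLoss()  # assumption: multi-label loss
demo_optimizer = torch.optim.SGD(demo_model.parameters(), lr=0.1)

with torch.no_grad():
    loss_before = demo_loss_fn(demo_model(X), y).item()

for epoch in range(20):
    for start in range(0, len(X), 100):      # mini-batches of 100
        xb, yb = X[start:start + 100], y[start:start + 100]

        loss = demo_loss_fn(demo_model(xb), yb)  # forward pass

        demo_optimizer.zero_grad()  # clear gradients from the last step
        loss.backward()             # backpropagate
        demo_optimizer.step()       # update the parameters

with torch.no_grad():
    loss_after = demo_loss_fn(demo_model(X), y).item()

print(f"loss before: {loss_before:.3f}, after: {loss_after:.3f}")
```

Each step follows the same pattern: compute the loss, clear the old gradients, backpropagate, then let the optimizer update the parameters.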