In [64]:
import numpy
import random

## Data defined

Each row is `[length, width, colour]`. The colour is the label the network should learn; the last row, marked `"?"`, is a flower whose colour is unknown. The colours are encoded as:

0 = blue,
1 = red

In [166]:
flowers = [
    [3, 1.5, 1],
    [2, 1, 0],
    [4, 1.5, 1],
    [3, 1, 0],
    [3.5, 0.5, 1],
    [2, 0.5, 0],
    [5.5, 1, 1], 
    [4.4, 1, "?"]
]

## NN is our Neural Network function

`feed` takes two measurements (`m1`, `m2`; M = measurement), multiplies each by its weight, adds the bias, and passes the result through `sigmoid`. The weights and bias are generated later with numpy's random number generator. `sigmoid` is an exponential function that squashes any real number, negative or positive, into a value between 0 and 1.

In [103]:
def feed(m1, m2, w1, w2, b):
    z = m1 * w1 + m2 * w2 + b
    return sigmoid(z)

def sigmoid(x):
    return 1/(1 + numpy.exp(-x))
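
As a quick illustration of the squashing behaviour described above, the sketch below evaluates `sigmoid` at a few hand-picked inputs. The inputs are arbitrary examples, not part of the data set.

```python
import numpy

def sigmoid(x):
    return 1 / (1 + numpy.exp(-x))

# arbitrary sample inputs, chosen only for illustration
samples = [-10, -1, 0, 1, 10]
outputs = [sigmoid(x) for x in samples]

for x, y in zip(samples, outputs):
    print("sigmoid({0}) = {1}".format(x, round(float(y), 4)))
```

Large negative inputs land close to 0, large positive inputs close to 1, and an input of 0 lands exactly on 0.5, which is why 0.5 serves as the red/blue threshold.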

In [169]:
w1, w2, b = 7.526382310815231, 3.4870577682723902, -26.253997846696628

## Running an example, with previously trained weights and bias

We run the model on every labelled flower in the data set and print the predicted colour next to the actual one. For example, the first row is:

**A red flower (val 1) which has a length of 3 and a width of 1.5**

In [172]:
# only the labelled rows can be scored; the "?" row has no target
labelled = [f for f in flowers if f[2] != "?"]

def colour(value):
    return "red" if value >= 0.5 else "blue"

def run_model_predictions():
    # print "predicted = actual" for every labelled flower
    for length, width, target in labelled:
        prediction = feed(length, width, w1, w2, b)
        print("{0} = {1}".format(colour(prediction), colour(target)))

def predict_from_m(m1, m2):
    # predict the colour of a flower whose label is unknown
    prediction = feed(m1, m2, w1, w2, b)
    print("{0} = {1}".format(colour(prediction), "?"))

def model():
    # count how many labelled flowers are classified correctly
    correct = 0
    for length, width, target in labelled:
        prediction = 1 if feed(length, width, w1, w2, b) >= 0.5 else 0
        if target == prediction:
            correct += 1
    print("Model: {0}% effective, overall {1}/{2}".format(round(correct/len(labelled)*100, 2), correct, len(labelled)))

run_model_predictions()
model()


red = red
blue = blue
red = red
blue = blue
red = red
blue = blue
red = red
Model: 100.0% effective, overall 7/7
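
The same weights can also be applied to the unlabelled flower, the `[4.4, 1, "?"]` row. Below is a minimal, self-contained sketch that reuses the `w1`, `w2`, `b` values assigned earlier; the names `unknown` and `label` are introduced here only for illustration.

```python
import numpy

def sigmoid(x):
    return 1 / (1 + numpy.exp(-x))

# weights and bias copied from the assignment cell earlier in the notebook
w1, w2, b = 7.526382310815231, 3.4870577682723902, -26.253997846696628

# the unlabelled flower: length 4.4, width 1
unknown = [4.4, 1]
prediction = sigmoid(unknown[0] * w1 + unknown[1] * w2 + b)
label = "red" if prediction >= 0.5 else "blue"
print("{0} = ?".format(label))
```

With these weights the weighted sum is roughly 10.3, well above 0, so the model classifies this flower as red.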


## Cost Function (Squared Error Cost Function)

Squaring the difference between the prediction and the target (the real value) gives a cost that is never negative and that grows the further off the prediction is. Minimizing this cost gives better predictions.

**When the slope is positive**, the prediction is above the target, so we subtract a positive number to move it down toward the minimum. **When the slope is negative**, the prediction is below the target, so we subtract a negative number, which moves it up. At the minimum the slope is 0 and the cost function is at its equilibrium.
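
The sign rule above can be sketched on the cost alone, treating the prediction as a free number that we nudge directly. This is a simplification for illustration: the helper `descend`, the step size of `0.1`, and the starting values are assumptions, not part of the notebook.

```python
def slope_of_cost(prediction, target):
    # derivative of (prediction - target) ** 2 with respect to prediction
    return 2 * (prediction - target)

def descend(prediction, target, step=0.1, iterations=100):
    # repeatedly subtract a fraction of the slope from the prediction
    for _ in range(iterations):
        prediction -= step * slope_of_cost(prediction, target)
    return prediction

from_above = descend(0.9, 0.5)  # positive slope: a positive number is subtracted
from_below = descend(0.1, 0.5)  # negative slope: a negative number is subtracted
print(from_above, from_below)
```

Both runs end up next to the target of 0.5, one by moving down and the other by moving up, even though the update line is identical in both cases.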

### Note that the target is fixed; the prediction is the only value that changes, so it is the value we adjust

In [86]:
# cost function to tell the computer how far off it was, lower = better
def prediction_cost(prediction, target):
    return (prediction - target) ** 2

# slope (derivative) of the cost with respect to the prediction,
# needed later for backpropagation: d/dp (p - t)^2 = 2 * (p - t)
def slope_cost(prediction, target):
    return 2 * (prediction - target)

# prediction and target are assumed to be set by an earlier cell
e1 = prediction_cost(prediction, target)
s1 = slope_cost(prediction, target)

print("The error for prediction is: {0}".format(round(e1, 4)))
print(", With a slope of {0}".format(round(s1, 4)))

The error for prediction is: 0.2399
, With a slope of 0.9796


## Using the slope to move the prediction closer to the target

To do this, we subtract a fraction of the slope from each parameter: a positive slope means subtracting a positive float, and a negative slope means subtracting a negative float. To recap, the slope function is the derivative (the gradient) of our cost function, and the **cost function** is our main measure of how well the computer guesses the colour from the given data.

Below is the actual training of the model, which uses backpropagation to find the best w1/w2/b values.

**Feed forward** means combining the weights and bias with the input measurements to produce a prediction.
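
Written out, the chain rule applied during training looks roughly as follows. The symbols are notation chosen here for illustration: $m_1, m_2$ are the measurements, $p$ the prediction, $t$ the target, $C$ the cost, and $\eta$ the learning rate.

```latex
\begin{aligned}
z &= m_1 w_1 + m_2 w_2 + b, \qquad p = \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad C = (p - t)^2 \\[4pt]
\frac{\partial C}{\partial w_1} &= \frac{\partial C}{\partial p}\,\frac{\partial p}{\partial z}\,\frac{\partial z}{\partial w_1}
  = 2\,(p - t)\;\sigma(z)\bigl(1 - \sigma(z)\bigr)\; m_1 \\[4pt]
\frac{\partial C}{\partial w_2} &= 2\,(p - t)\;\sigma(z)\bigl(1 - \sigma(z)\bigr)\; m_2 \\[4pt]
\frac{\partial C}{\partial b} &= 2\,(p - t)\;\sigma(z)\bigl(1 - \sigma(z)\bigr)\cdot 1 \\[4pt]
w_1 &\leftarrow w_1 - \eta\,\frac{\partial C}{\partial w_1}, \qquad
w_2 \leftarrow w_2 - \eta\,\frac{\partial C}{\partial w_2}, \qquad
b \leftarrow b - \eta\,\frac{\partial C}{\partial b}
\end{aligned}
```

The first two factors are shared by all three parameters; only the last factor ($m_1$, $m_2$, or $1$) differs.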

In [168]:
def random_weights():
    return [numpy.random.randn(), numpy.random.randn(), numpy.random.randn()]

def train(weights=None):
    # start from the given [w1, w2, b], or from random values
    w1, w2, b = random_weights() if weights is None else weights

    print(w1, w2, b)
    # step size: how strongly each gradient changes the parameters
    learning_rate = 0.2

    # only labelled flowers can be trained on; the "?" row has no target
    labelled = [f for f in flowers if f[2] != "?"]

    for i in range(50000):
        # picking a random labelled flower
        point  = labelled[random.randint(0, len(labelled)-1)]
        target = point[2]

        # feeding forward, getting the prediction
        z = point[0] * w1 + point[1] * w2 + b
        prediction = sigmoid(z)

        # slope of the squared-error cost (prediction - target) ** 2
        # with respect to the prediction
        dcost = 2 * (prediction - target)

        # derivative of the prediction with respect to z,
        # i.e. the derivative of sigmoid: sigmoid(z) * (1 - sigmoid(z))
        dprediction = prediction * (1 - prediction)

        # partial derivatives of z with respect to each parameter:
        # z = point[0] * w1 + point[1] * w2 + b, so
        # dz/dw1 = point[0], dz/dw2 = point[1], dz/db = 1
        dw1 = point[0]
        dw2 = point[1]
        db  = 1

        # chain rule (backpropagation): carry the cost slope back through
        # the square, then the sigmoid, then through whatever multiplies
        # the parameter of interest (length, width, or 1 for the bias)
        dcost_dz = dcost * dprediction

        dcost_dw1 = dcost_dz * dw1
        dcost_dw2 = dcost_dz * dw2
        dcost_db  = dcost_dz * db

        # updating our parameters by stepping against the gradient
        w1 -= learning_rate * dcost_dw1
        w2 -= learning_rate * dcost_dw2
        b  -= learning_rate * dcost_db

    print(w1, w2, b)

train()

1.2291775507539453 0.2850064793472833 -0.8719092321483759
7.526382310815231 3.4870577682723902 -26.253997846696628
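
One way to sanity-check the chain-rule gradient used in training is to compare it with a finite-difference estimate of the same cost. The sketch below is an assumption-laden illustration, not part of the notebook: the helpers `cost`, `analytic_gradient`, and `numeric_gradient` and the step `eps` are introduced here, and the check uses the starting weights printed above together with the first flower in the data.

```python
import numpy

def sigmoid(x):
    return 1 / (1 + numpy.exp(-x))

def cost(params, point, target):
    # squared-error cost for one flower; params = [w1, w2, b]
    w1, w2, b = params
    return (sigmoid(point[0] * w1 + point[1] * w2 + b) - target) ** 2

def analytic_gradient(params, point, target):
    # chain-rule product: cost slope * sigmoid slope * input
    w1, w2, b = params
    prediction = sigmoid(point[0] * w1 + point[1] * w2 + b)
    dcost_dz = 2 * (prediction - target) * prediction * (1 - prediction)
    return [dcost_dz * point[0], dcost_dz * point[1], dcost_dz * 1]

def numeric_gradient(params, point, target, eps=1e-6):
    # central difference: (C(x + eps) - C(x - eps)) / (2 * eps)
    grads = []
    for i in range(len(params)):
        up, down = list(params), list(params)
        up[i] += eps
        down[i] -= eps
        grads.append((cost(up, point, target) - cost(down, point, target)) / (2 * eps))
    return grads

# starting weights from the first line of output above; first flower in the data
params = [1.2291775507539453, 0.2850064793472833, -0.8719092321483759]
point, target = [3, 1.5], 1

analytic = analytic_gradient(params, point, target)
numeric = numeric_gradient(params, point, target)
max_diff = max(abs(a - n) for a, n in zip(analytic, numeric))
print(analytic, numeric, max_diff)
```

If the two gradients agree to within a tiny tolerance, the hand-derived partials are very likely correct; a large gap would point to a mistake in one of the chain-rule factors.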
