In [1]:
%matplotlib inline
from utils import *

Using gpu device 0: Tesla K80 (CNMeM is disabled, cuDNN 5103)
Using Theano backend.


### Linear models
A linear model is a model where each row's output is calculated as `sum(row * weights)`, where `weights` is learned from the data and is the same for every row.
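To make that definition concrete, here is a minimal NumPy sketch. The `rows` and `weights` values are made up for illustration and are not taken from this notebook.

```python
import numpy as np

# Hypothetical values, chosen only to illustrate the definition.
weights = np.array([2.0, 3.0])   # one weight per input column, shared by every row
rows = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [0.5, 0.5]])

# "sum(row * weights)", applied to each row in turn...
by_hand = np.array([np.sum(row * weights) for row in rows])

# ...gives the same numbers as a single matrix-vector product.
with_dot = np.dot(rows, weights)

print(by_hand)
print(with_dot)
```

Both lines print the same three values (2, 3 and 2.5), which is why the rest of the notebook can use `np.dot` instead of a loop.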

In [2]:
x = random((30,2))

In [3]:
y = np.dot(x, [2, 3]) + 1

This is _not_ our model - this is our "data" - a set of relationships between the input x and output y. 

We're going to pretend that we don't know the transformation function (`y = np.dot(x, [2, 3]) + 1`, that is, 2 times the first column plus 3 times the second column plus 1) and try to get the same y outputs from x using the linear model we're about to build.
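As a quick sanity check of what that hidden function does to a single row, here is a small sketch. The row `[0.5, 0.25]` is a hand-picked example, not one of the 30 random rows in `x`.

```python
import numpy as np

row = np.array([0.5, 0.25])        # hypothetical input row
y_row = np.dot(row, [2, 3]) + 1    # 2*0.5 + 3*0.25 + 1
print(y_row)                       # 2.75
```

The `+ 1` is a constant added to every row. A model that wants to reproduce `y` needs somewhere to store it, which is what a bias term is for.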

In [10]:
lm = Sequential([Dense(1, input_dim=2)])

# lm is the linear model

# (n_out and n_in below are placeholders for the output and input sizes,
#  not the data arrays x and y from earlier)

# lm = Sequential()
# lm.add(Dense(n_out, input_dim=n_in))
# is the same as
# lm = Sequential([Dense(n_out, input_shape=(n_in,))])
# is the same as
# lm = Sequential([Dense(n_out, input_dim=n_in)])
# 'dim' stands for 'dimensions' btw because it took me a while

# the docs describe Dense() as 'just your fully connected NN layer'

# the model will take as input arrays of shape (*, n_in)
# and output arrays of shape (*, n_out)

1. lm is actually our linear model
2. it takes as input arrays of shape (\*, 2)
3. it outputs arrays of shape (\*, 1)
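The shapes above can be checked with plain NumPy. This is a sketch of the arithmetic only, not of how Keras implements `Dense`. It assumes a `(2, 1)` weight matrix and a `(1,)` bias, which matches what `lm.get_weights()` returns at the end of this notebook. The random starting values are illustrative and do not follow Keras's own initialization scheme.

```python
import numpy as np

rng = np.random.RandomState(0)
x_demo = rng.random_sample((30, 2))    # stand-in for the notebook's x

# Assumed parameter shapes for Dense(1, input_dim=2).
W = rng.uniform(-1, 1, size=(2, 1))    # illustrative random start, not Keras's exact init
b = np.zeros(1)

out = np.dot(x_demo, W) + b            # (30, 2) times (2, 1) gives (30, 1)
print(out.shape)                       # (30, 1)
```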

In [7]:
print(x.shape)
print(y.shape)

(30, 2)
(30,)


Oh look - what a coincidence!

Okay, so we have a linear model that takes the right-sized inputs and produces the right-sized outputs. What exactly is it doing? At this point, just multiplying our inputs by a randomly initialized weight matrix and adding a bias that starts at zero. Nothing has been learned yet.

Sometimes there's an activation step here, but not this time.

```
lm.add(Dense(?, activation='softmax'))
```

In [11]:
lm.compile(optimizer=SGD(lr=0.1), loss="mse")

Next, we tell our model how we want it to improve its guesses: use stochastic gradient descent with a learning rate of 0.1 (we'll cover this later).

We also need some way to keep score of whether our model is getting better or worse, and for that we say: use mean squared error (we'll cover this later too).
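Mean squared error is simple enough to write by hand. The helper below is a hypothetical stand-in for illustration, not the function Keras calls internally.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean((y_true - y_pred) ** 2)

print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # 0.0 (perfect predictions)
print(mse([0.0, 0.0], [1.0, 3.0]))            # 5.0 (errors of 1 and 3: (1 + 9) / 2)
```

Because the errors are squared, the score is never negative, and a few large misses cost more than many small ones.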

**Ok model, give us an error score**

In [12]:
lm.evaluate(x, y, verbose=1)



29.163684844970703

Is this good or bad? The Wikipedia page for mean squared error says "values closer to zero are better" so I'm going to take that as "could be improved".

This shouldn't be surprising though - if you've been paying attention you'll realize *the model hasn't actually been trained*.

In [13]:
lm.fit(x, y, nb_epoch=5, batch_size=1)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f2d3ba795d0>

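We told our model to go over the data 5 times. Each full pass over all 30 rows is one epoch, and with `batch_size=1` the weights get updated after every single row, so that's 150 updates in total. Notice the loss shrinking from epoch to epoch in the training output.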
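Here is a rough NumPy sketch of what five epochs with a batch size of 1 amounts to. It is a hand-rolled stand-in, not Keras's implementation: the data is regenerated with a fixed seed, and the weights start from zero for simplicity rather than from a random initialization.

```python
import numpy as np

rng = np.random.RandomState(0)
x_demo = rng.random_sample((30, 2))        # stand-in for the notebook's x
y_demo = np.dot(x_demo, [2, 3]) + 1

w = np.zeros(2)   # assumption: start from zeros to keep the sketch simple
b = 0.0
lr = 0.1

def loss(w, b):
    return np.mean((np.dot(x_demo, w) + b - y_demo) ** 2)

loss_before = loss(w, b)
for epoch in range(5):                          # five full passes over the data
    for i in rng.permutation(len(x_demo)):      # one row per update, in shuffled order
        err = np.dot(x_demo[i], w) + b - y_demo[i]
        w -= lr * 2 * err * x_demo[i]           # gradient of err**2 with respect to w
        b -= lr * 2 * err                       # gradient of err**2 with respect to b
    print(epoch + 1, loss(w, b))
loss_after = loss(w, b)
```

The exact numbers will differ from the Keras run, but the pattern is the same: every row nudges the weights a little, and the loss after training is lower than before.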

In [14]:
lm.evaluate(x, y, verbose=1)



0.023221131414175034

**Ok model, do it five more times.**

In [16]:
lm.fit(x, y, nb_epoch=5, batch_size=1)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f2d3ba763d0>

In [18]:
lm.evaluate(x, y, verbose=1)



0.0015310090966522694

Hey, that's a lot closer to 0!

**Ok model, what function did you use to produce this output?**

In [19]:
lm.get_weights()

[array([[ 1.9352],
        [ 2.9023]], dtype=float32), array([ 1.1161], dtype=float32)]

Remember our original y function?
```
y = np.dot(x, [2, 3]) + 1
```
The weights used by our model (about 1.94 and 2.90, with a bias of about 1.12) turned out pretty close to the original 2, 3 and 1!
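To put a number on "pretty close", the learned parameters can be compared against the true ones directly. The values below are copied from the `get_weights()` output above. The test input `[0.5, 0.5]` is a made-up example.

```python
import numpy as np

# Learned parameters, copied from the get_weights() output above.
W_learned = np.array([[1.9352], [2.9023]])
b_learned = np.array([1.1161])

true_w = np.array([2.0, 3.0])
true_b = 1.0

weight_gap = np.abs(W_learned.ravel() - true_w)   # how far each weight is from 2 and 3
bias_gap = abs(b_learned[0] - true_b)             # how far the bias is from 1
print(weight_gap, bias_gap)

x_new = np.array([[0.5, 0.5]])                    # hypothetical input row
pred = np.dot(x_new, W_learned) + b_learned       # what the model would predict
truth = np.dot(x_new, true_w) + true_b            # what the original function gives
print(pred, truth)
```

Each parameter is off by roughly 0.1 or less, and for this input the prediction lands within about 0.04 of the true value of 3.5.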