# Introduction to Regression with Neural Networks in TensorFlow

There are many definitions of a regression model, but here I'll simplify it: predicting a numerical variable based on some combination of other variables. Even shorter: predicting a number.

In [None]:
import tensorflow as tf
from keras.utils import plot_model
print(tf.__version__)

# Creating Data to view and fit

In [None]:
import numpy as np
import matplotlib.pyplot as plt

# Create features
x = np.array([-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0])

# Create labels
y = np.array([3.0, 6.0, 9.0, 12.0, 15.0, 18.0, 21.0, 24.0])

plt.scatter(x, y)


# Input and Output shapes

In [None]:
# Create demo tensors for a housing price prediction problem
house_info = tf.constant(["bedroom", "bathroom", "kitchen"])
house_price = tf.constant([939700])
house_info.shape, house_price.shape

In [None]:
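# Turn the NumPy arrays into tensors
x_ten = tf.constant(x)
y_ten = tf.constant(y)

# Reshape each to (1, 8): a single row holding all eight values
x_rs = tf.reshape(x_ten, [1, 8])
y_rs = tf.reshape(y_ten, [1, 8])
x_rs.shape, y_rs.shape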
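
The reshape above produces a single row of eight values. To make the two possible layouts concrete, here is a minimal sketch that uses plain Python lists as stand-ins for tensors. `shape_2d` is a hypothetical helper written for this illustration, not a TensorFlow function. Keras-style layers read the first dimension as the number of samples, so the layout decides whether the model sees one sample or eight.

```python
x = [-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0]


def shape_2d(rows):
    """Return (number of rows, number of columns) of a nested list."""
    return (len(rows), len(rows[0]))


# Layout used above: one sample that holds all eight values
one_sample = [x]

# Alternative layout: eight samples with one feature each
eight_samples = [[value] for value in x]

print(shape_2d(one_sample))     # (1, 8)
print(shape_2d(eight_samples))  # (8, 1)
```

If each value of `x` is meant to be its own training example, the second layout is likely the one you want.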

# Steps in modelling with TensorFlow

1. **Creating a model** - define the input and output layers, as well as the hidden layers of a deep learning model.
2. **Compiling a model** - define a loss function (the function that tells our model how wrong it is), an optimizer (tells our model how to improve the patterns it is learning) and evaluation metrics (what we can use to interpret the performance of our model).
3. **Fitting the model** - letting the model try to find patterns between X & Y (features and labels).
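
To show what "how wrong it is" means in numbers, here is a minimal sketch of mean absolute error (MAE), the loss used in the cells below, written in plain Python. The function name and the sample values are made up for illustration. Keras computes its own version internally.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |y_true - y_pred| over all samples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [3.0, 6.0, 9.0]   # labels
y_pred = [2.0, 6.0, 11.0]  # hypothetical model outputs

# Absolute errors are 1, 0 and 2, so the mean is 1.0
print(mean_absolute_error(y_true, y_pred))
```

A lower MAE means the predictions sit closer to the labels on average.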

In [None]:
# Set a random seed
tf.random.set_seed(42)

# Create a model using the Sequential API
model = tf.keras.Sequential([
  tf.keras.layers.Dense(1)
])

# Compile the model
model.compile(loss="mae", optimizer="sgd", metrics=['mae'])

# Fit the model
model.fit(x_rs, y_rs, epochs=10)

In [None]:
# Try to make predictions with our model
# (use a new name so the training inputs in x_rs are not overwritten)
x_new = tf.constant([[17.0, 20.0, 23.0, 26.0, 29.0, 32.0, 35.0, 38.0]])
model.predict(x_new)

# Improving our model

We can improve our model by altering the steps we took to create it:

1. **Creating a model** - Here we might add more layers, increase the number of hidden units (also called neurons) within each of the hidden layers, or change the activation function of each layer.
2. **Compiling a model** - Here we might change the optimization function or perhaps the learning rate of the optimization function.
3. **Fitting the model** - Here we might fit the model for more epochs (leave it training for longer) or on more data (give the model more examples to learn from).
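
To see why the learning rate and the number of epochs matter, here is a minimal sketch that fits `y = w * x + b` to the data from earlier with plain gradient descent. It makes several assumptions for the sake of illustration: it uses mean squared error rather than MAE (for a smooth gradient), it updates on the full dataset each epoch, and `fit_line` is a hypothetical helper, not how Keras optimizers are implemented.

```python
def fit_line(xs, ys, learning_rate, epochs):
    """Fit y = w * x + b with full-batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction errors with the current parameters
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2.0 * sum(errors) / n
        # Step against the gradient, scaled by the learning rate
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    loss = sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / n
    return w, b, loss


xs = [-7.0, -4.0, -1.0, 2.0, 5.0, 8.0, 11.0, 14.0]
ys = [x + 10.0 for x in xs]  # same relationship as the labels above

# Train for longer and watch w approach 1 and b approach 10
for epochs in (10, 100, 1000):
    w, b, loss = fit_line(xs, ys, learning_rate=0.01, epochs=epochs)
    print(f"epochs={epochs:4d}  w={w:.3f}  b={b:.3f}  mse={loss:.6f}")
```

Try changing `learning_rate` as well: a much smaller value needs far more epochs, while a much larger one can make the updates overshoot and diverge.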

In [None]:
# Let's rebuild our model

# 1. Create the model
model = tf.keras.Sequential([
    tf.keras.layers.Dense(150, activation="relu"),
    tf.keras.layers.Dense(150, activation="relu"),
    tf.keras.layers.Dense(150, activation="relu"),
    tf.keras.layers.Dense(1)    
])

# 2. Compile the model
model.compile(loss=tf.keras.losses.mae,
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),  # `lr` is a deprecated alias
              metrics=["mae"])

# 3. Fit the model (this time we'll train for longer)
model.fit(x_rs, y_rs, epochs=100)

# Evaluating a model

In practice, a typical workflow you'll go through when building neural networks is:

```
Build a model --> fit it --> evaluate it --> tweak the model --> fit it --> evaluate it --> ...
```

In [None]:
# Make a bigger dataset
x = tf.range(-100, 100, 4)
x

# Make labels for the dataset
y = x + 10
y

# Visualize the data
plt.scatter(x, y)

## The 3 sets...

* **Training set** - The model learns from this data, which is typically 70-80% of the total data you have available.
* **Validation set** - The model gets tuned on this data, which is typically 10-15% of the total data you have available.
* **Test set** - The model gets evaluated on this data to test what it has learned. This set is typically 10-15% of the total data you have available.
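
As a rough sketch of a three-way split, the snippet below shuffles a dataset and slices it into training, validation and test sets. The 70/15/15 fractions, the seed and the `split_dataset` helper are assumptions chosen for illustration. The cell that follows uses a simpler, unshuffled 80/20 train/test split instead.

```python
import random


def split_dataset(samples, train_frac=0.7, val_frac=0.15, seed=42):
    """Shuffle samples and split them into train, validation and test lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains
    return train, val, test


data = list(range(-100, 100, 4))  # 50 samples, same values as x above
train, val, test = split_dataset(data)
print(len(train), len(val), len(test))
```

Every sample lands in exactly one of the three sets, so the model is never evaluated on data it was trained or tuned on.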

In [None]:
# Check how many samples we have
len(x)

# Split the data into train and test sets
x_train = x[:40] # First 40 are training samples (80% of the data)
y_train = y[:40] # First 40 are training samples (80% of the data)

x_test = x[40:] # last 10 are test samples (20% of the data)
y_test = y[40:] # last 10 are test samples (20% of the data)


## Visualizing the data

Now that we've got training and test sets, let's visualize them.

In [None]:
plt.figure(figsize=(10, 7))

# Plot training data in blue
plt.scatter(x_train, y_train, c="b", label="Training data")

# Plot test data in green
plt.scatter(x_test, y_test, c="g", label="Test data")

# Show a legend
plt.legend()

In [None]:
# Building a neural network for our data

tf.random.set_seed(42)

# 1. Create a model 
model = tf.keras.Sequential([
    tf.keras.layers.Dense(10, input_shape=[1])
])

# 2. Compile the model
model.compile(loss = tf.keras.losses.mae,
              optimizer = tf.keras.optimizers.Adam(),
              metrics = ["mae"])

model.summary()

* Total params - the total number of parameters in the model.
* Trainable params - these are the parameters (patterns) the model can update as it trains.
* Non-trainable params - these parameters aren't updated during training (this is typical when we bring in already learned patterns or parameters from other models during **transfer learning**).
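
The counts reported by `model.summary()` can be checked by hand. For a fully connected layer, each unit has one weight per input plus one bias. The sketch below applies that rule; the helper names are hypothetical, and the second example assumes a single input feature feeding a 150-150-150-1 stack.

```python
def dense_param_count(n_inputs, n_units, use_bias=True):
    """Parameters in a dense layer: one weight per input per unit, plus biases."""
    return n_inputs * n_units + (n_units if use_bias else 0)


def stack_param_count(layer_sizes):
    """Total parameters for consecutive dense layers, e.g. [1, 150, 1]."""
    return sum(
        dense_param_count(n_in, n_out)
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Dense(10) on a single input feature: 1 * 10 weights + 10 biases
print(dense_param_count(1, 10))  # 20

# One input feature, three hidden layers of 150 units, one output unit
print(stack_param_count([1, 150, 150, 150, 1]))  # 45751
```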

📖 **Resource:** For a more in-depth overview of the trainable parameters within a layer, check out [MIT's Introduction to Deep Learning](https://www.youtube.com/watch?v=7sB052Pz0sQ&list=PLtBw6njQRU-rwp5__7C0oIVt26ZgjG9NI) video.

![image](https://3.bp.blogspot.com/-JYbNelUE66k/XhOdLxanW5I/AAAAAAAACVI/fD375RZr0BAtoosh-e5Zbzz-gEll6bmxwCLcBGAsYHQ/s1600/intro.png)

🛠 **Exercise:** Try playing around with the number of hidden units in the dense layer and see how that affects the number of parameters (total and trainable) by calling `model.summary()`.

In [None]:
# Fitting the model
model.fit(x_train, y_train, epochs=100, verbose=0)

plot_model(model=model, show_shapes=True)