In [2]:
import tensorflow as tf

# Linear Model Complexity
![Parameter Count](Lesson7/parameters.png)
If you have **N** inputs and **k** outputs, you have **(N+1)k parameters**.  Because the model is linear, the kinds of interactions it can represent are limited.  It can model $y=X_1+X_2$ but not $y=X_1\cdot X_2$.

Linear models:
* Are efficient
* Are stable
    * Small changes in the input can never yield big changes in the output
    * Derivatives are very nice: **a constant**

We would like to keep the parameters inside big linear functions, but we also want the entire model to be non-linear.
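
To see why a purely linear model stays linear no matter how many layers are stacked, here is a minimal sketch in plain Python (no TensorFlow). The matrices `W1`, `W2` and the input `x` are hypothetical values chosen only for illustration, and biases are left out to keep the check short. Applying two linear layers in sequence gives the same result as applying the single matrix `W2 @ W1`.

```python
# Toy check: two stacked linear layers are equivalent to one linear layer.

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

# Hypothetical weights and input, for illustration only.
W1 = [[1.0, 2.0], [0.0, -1.0]]
W2 = [[0.5, 0.0], [1.0, 3.0]]
x = [2.0, -4.0]

stacked = matvec(W2, matvec(W1, x))      # apply W1, then W2
collapsed = matvec(matmul(W2, W1), x)    # apply the single matrix W2 @ W1

print(stacked)    # [-3.0, 6.0]
print(collapsed)  # [-3.0, 6.0]
```

Because the two results agree, extra linear layers alone add parameters but no new expressive power, which is the motivation for inserting a non-linearity between them.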

# Rectified Linear Units
![relu](Lesson7/relu.png)
![nn with relus](Lesson7/nn-relu.png)

Insert a ReLU in the middle.  We now have two matrices: one from the inputs to the ReLUs, and another connecting the ReLUs to the classifier.
* Function is now non-linear
* Have a new parameter to tune (the number H of ReLU units)

We now have a "2-layer" neural network:
1. The first layer consists of the set of weights and biases applied to X and passed through ReLUs.  The output of this layer is fed to the next one, but is not observable outside the network, hence it is known as a *hidden layer.*
2. The second layer consists of the weights and biases applied to these intermediate outputs, followed by the softmax function to generate probabilities.
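
The two layers described above can be sketched in plain Python. This is only an illustration under stated assumptions: the helper names (`dense`, `relu`, `softmax`, `two_layer_forward`) and all weight values are hypothetical, with 2 inputs, H = 3 hidden units and 2 classes.

```python
import math

def relu(values):
    """Element-wise max(0, x)."""
    return [max(0.0, v) for v in values]

def softmax(logits):
    """Turn logits into probabilities (shifted by the max for stability)."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """inputs: length n; weights: n rows x m columns; biases: length m."""
    return [sum(i * w for i, w in zip(inputs, column)) + b
            for column, b in zip(zip(*weights), biases)]

def two_layer_forward(x, w_hidden, b_hidden, w_out, b_out):
    hidden = relu(dense(x, w_hidden, b_hidden))   # layer 1: the hidden layer
    logits = dense(hidden, w_out, b_out)          # layer 2: linear classifier
    return hidden, softmax(logits)                # probabilities via softmax

# Hypothetical parameters, for illustration only.
w_hidden = [[1.0, -1.0, 0.5],
            [0.5, 2.0, -1.0]]
b_hidden = [0.0, 0.0, 0.0]
w_out = [[1.0, 0.0],
         [0.0, 1.0],
         [1.0, 1.0]]
b_out = [0.0, 0.0]

hidden, probs = two_layer_forward([1.0, 2.0], w_hidden, b_hidden, w_out, b_out)
print(hidden)  # [2.0, 3.0, 0.0]  (the third unit was negative, so ReLU zeroed it)
print(probs)
```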

# TensorFlow ReLUs
A rectified linear unit (ReLU) is a type of activation function that is defined as $f(x) = \max(0, x)$.  In TensorFlow it is **tf.nn.relu()**.
```python
# Hidden Layer with ReLU activation function
hidden_layer = tf.add(tf.matmul(features, hidden_weights), hidden_biases)
hidden_layer = tf.nn.relu(hidden_layer)

output = tf.add(tf.matmul(hidden_layer, output_weights), output_biases)
```

The code above applies the **tf.nn.relu()** function to the **hidden_layer**, effectively turning off any negative values and acting like an on/off switch.  Adding additional layers, like the **output** layer, after an activation function turns the model into a nonlinear function.  This nonlinearity allows the network to solve more complex problems.
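
The on/off behaviour can be checked by hand with a small plain-Python sketch. The helper names are hypothetical and the numbers are illustrative: an input with all-negative entries, positive weights and zero biases. Every hidden pre-activation comes out negative, so the ReLU switches all hidden units off and the output is zero.

```python
def relu(values):
    """Element-wise max(0, x)."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """inputs: length n; weights: n rows x m columns; biases: length m."""
    return [sum(i * w for i, w in zip(inputs, column)) + b
            for column, b in zip(zip(*weights), biases)]

# Illustrative values: 4 inputs, 3 hidden units, 2 outputs, zero biases.
hidden_weights = [[0.1, 0.2, 0.4],
                  [0.4, 0.6, 0.6],
                  [0.5, 0.9, 0.1],
                  [0.8, 0.2, 0.8]]
out_weights = [[0.1, 0.6],
               [0.2, 0.1],
               [0.7, 0.9]]
features = [-1.0, -2.0, -3.0, -4.0]

pre_activation = dense(features, hidden_weights, [0.0] * 3)  # all negative
hidden = relu(pre_activation)                                # all switched off
output = dense(hidden, out_weights, [0.0] * 2)

print(hidden)  # [0.0, 0.0, 0.0]
print(output)  # [0.0, 0.0]
```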

In [None]:
# Remove the previous weights and bias
tf.reset_default_graph()

# Solution is available in the other "solution.py" tab
import tensorflow as tf

output = None
hidden_layer_weights = [
    [0.1, 0.2, 0.4],
    [0.4, 0.6, 0.6],
    [0.5, 0.9, 0.1],
    [0.8, 0.2, 0.8]]
out_weights = [
    [0.1, 0.6],
    [0.2, 0.1],
    [0.7, 0.9]]

# Weights and biases
weights = [
    tf.Variable(hidden_layer_weights),
    tf.Variable(out_weights)]
biases = [
    tf.Variable(tf.zeros(3)),
    tf.Variable(tf.zeros(2))]

# Input
features = tf.Variable([[1.0, 2.0, 3.0, 4.0], [-1.0, -2.0, -3.0, -4.0], [11.0, 12.0, 13.0, 14.0]])

# TODO: Create Model
hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)

output_layer = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

init = tf.global_variables_initializer()

# TODO: Print session results
with tf.Session() as sess:
    sess.run(init)
    output = sess.run(output_layer)
    print(output)
    
## Solution.py
hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

# TODO: Print session results
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(logits))

# Chain Rule
The benefit of building a network by stacking simple operations is that it makes the math very simple.  This matters when taking the derivatives of the functions, because of the chain rule: the derivative of a composition is the product of the derivatives of its parts.
![Chain Rule](Lesson7/chain-rule.png)
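
As a small worked example of the chain rule, $[g(f(x))]' = g'(f(x)) \cdot f'(x)$, the sketch below uses two hypothetical functions, $f(x) = 3x + 1$ and $g(u) = u^2$, and compares the chain-rule result with a finite-difference estimate.

```python
def f(x):
    return 3.0 * x + 1.0

def g(u):
    return u ** 2

def df(x):
    return 3.0          # derivative of f

def dg(u):
    return 2.0 * u      # derivative of g

def chain_rule_derivative(x):
    """d/dx g(f(x)) = g'(f(x)) * f'(x)."""
    return dg(f(x)) * df(x)

def numerical_derivative(fn, x, eps=1e-6):
    """Central finite difference, used only as a sanity check."""
    return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)

x0 = 2.0
analytic = chain_rule_derivative(x0)                    # 2 * 7 * 3
numeric = numerical_derivative(lambda x: g(f(x)), x0)

print(analytic)  # 42.0
print(numeric)   # approximately 42
```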

# Backprop
A network is a stack of simple operations.  Some have parameters and some don't (linear transforms vs. ReLUs).  Data flows from the input to the output.  To compute the derivatives, you create a new graph that flows backwards through the network; combined with the chain rule, it produces the gradients.  This graph can be derived completely from the individual operations in the network, and most frameworks will do it for you.  This makes computing the derivatives of complex functions very efficient, as long as the function is made up of simple blocks with simple derivatives.

For every batch, you run the forward prop and then the backprop, which gives you the gradients for each of the weights.  You then apply the gradients, scaled by a learning rate, to the original weights to update them.  Repeating this many, many times is how the model gets optimized.  **Each block in the backprop often takes twice the memory and compute of the forward prop.**
![Backprop](Lesson7/backprop.png)
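
The forward/backward/update cycle can be sketched by hand for a tiny network with one input, one ReLU hidden unit and one output. Everything here is a hypothetical illustration: the squared-error loss, the starting parameters and the learning rate are assumptions, and a framework such as TensorFlow would derive the backward pass automatically.

```python
def forward(x, params):
    w1, b1, w2, b2 = params
    z = w1 * x + b1          # linear block (has parameters)
    h = max(0.0, z)          # ReLU block (no parameters)
    y = w2 * h + b2          # linear block (has parameters)
    return z, h, y

def backward(x, target, params):
    """Return the loss and the gradients, flowing from output back to input."""
    w1, b1, w2, b2 = params
    z, h, y = forward(x, params)
    loss = (y - target) ** 2
    dy = 2.0 * (y - target)              # d loss / d y
    dw2, db2 = dy * h, dy                # gradients of the output block
    dh = dy * w2                         # chain rule into the hidden unit
    dz = dh * (1.0 if z > 0 else 0.0)    # ReLU passes gradient only if z > 0
    dw1, db1 = dz * x, dz                # gradients of the first block
    return loss, (dw1, db1, dw2, db2)

def sgd_step(params, grads, learning_rate):
    """Apply the gradients, scaled by the learning rate, to the weights."""
    return tuple(p - learning_rate * g for p, g in zip(params, grads))

params = (0.5, 0.1, -0.3, 0.0)   # hypothetical starting values
x, target = 2.0, 1.0
losses = []
for step in range(50):
    loss, grads = backward(x, target, params)
    losses.append(loss)
    params = sgd_step(params, grads, learning_rate=0.05)

print(losses[0], losses[-1])  # the loss shrinks as the updates are repeated
```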

# Deep Neural Networks in TensorFlow
How to use the logistic classifier to build a deep neural network

## TensorFlow MNIST
```python
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)
```

The MNIST dataset is provided by TensorFlow, which batches and one-hot encodes the data.
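
For reference, one-hot encoding turns a class index into a vector with a single 1 at that index. A minimal sketch follows; the `one_hot` helper is hypothetical and is not part of the TensorFlow MNIST loader.

```python
def one_hot(label, n_classes):
    """Return a list of length n_classes with a 1.0 at position `label`."""
    if not 0 <= label < n_classes:
        raise ValueError('label must be in [0, n_classes)')
    encoded = [0.0] * n_classes
    encoded[label] = 1.0
    return encoded

# The digit 3 out of the 10 MNIST classes (0-9).
print(one_hot(3, 10))
# [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
```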

## Learning Parameters
```python
import tensorflow as tf

# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 128  # Decrease batch size if you don't have enough memory
display_step = 1

n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)
```
Focus on architecture of multilayer NN, not parameter tuning

## Hidden layer parameters
```python
n_hidden_layer = 256  # layer number of features
```
Determines the size of the hidden layer in the NN.  Also known as the width of the layer.

## Weights and Biases
```python
# Store layers weight and bias
weights = {
    'hidden_layer': tf.Variable(tf.random_normal([n_input, n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_hidden_layer, n_classes]))
}
biases = {
    'hidden_layer': tf.Variable(tf.random_normal([n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}
```

Deep NNs use multiple layers, with each layer requiring its own weights and bias.

## Input
```python
# tf Graph input
x = tf.placeholder("float", [None, 28, 28, 1])
y = tf.placeholder("float", [None, n_classes])

x_flat = tf.reshape(x, [-1, n_input])
```

The tf.reshape() function above reshapes the 28px by 28px matrices in x into row vectors of 784px.
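
What the reshape does to a single image can be sketched in plain Python. The `flatten_image` helper is hypothetical; it walks the 28 x 28 x 1 nested list in row-major order, which is the order **tf.reshape()** uses.

```python
def flatten_image(image):
    """Flatten a height x width x channels nested list into one row vector."""
    return [value for row in image for pixel in row for value in pixel]

# A dummy 28 x 28 x 1 "image" whose pixel value encodes its own position.
image = [[[float(r * 28 + c)] for c in range(28)] for r in range(28)]

flat = flatten_image(image)
print(len(flat))   # 784
print(flat[29])    # 29.0  (row 1, column 1)
```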

## Multilayer Perceptron
```python
# Hidden layer with RELU activation
layer_1 = tf.add(tf.matmul(x_flat, weights['hidden_layer']), \
    biases['hidden_layer'])
layer_1 = tf.nn.relu(layer_1)
# Output layer with linear activation
logits = tf.add(tf.matmul(layer_1, weights['out']), biases['out'])
```

## Optimizer
```python
# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)
```
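
To make the cost above concrete, here is a plain-Python sketch of what "mean softmax cross-entropy" computes. The function names are hypothetical, and TensorFlow's own op is implemented differently (fused and more numerically careful); this only shows the arithmetic.

```python
import math

def softmax(logits):
    """Turn logits into probabilities (shifted by the max for stability)."""
    largest = max(logits)
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot_label):
    """-sum(label * log(prob)); only the true class contributes."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probs) if t > 0)

def mean_softmax_cross_entropy(batch_logits, batch_labels):
    """Average the per-example cross-entropy over the batch."""
    losses = [cross_entropy(softmax(logits), label)
              for logits, label in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)

# Uniform logits over 4 classes: the loss is log(4), i.e. "no idea".
uniform_cost = mean_softmax_cross_entropy([[0.0, 0.0, 0.0, 0.0]], [[0, 0, 1, 0]])
# A confident, correct prediction: the loss is close to 0.
confident_cost = mean_softmax_cross_entropy([[10.0, 0.0, 0.0]], [[1, 0, 0]])
print(uniform_cost, confident_cost)
```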

## Session
```python
# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        total_batch = int(mnist.train.num_examples/batch_size)
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
    
```
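
The batch count used in the training loop is simple arithmetic. The sketch below assumes 55,000 training examples (the usual size of `mnist.train` after the default validation split; treat this number as an assumption) and compares rounding down with `int()` against rounding up with `math.ceil()`.

```python
import math

num_examples = 55000   # assumed size of mnist.train
batch_size = 128

floor_batches = int(num_examples / batch_size)        # rounds down
ceil_batches = math.ceil(num_examples / batch_size)   # rounds up
leftover = num_examples - floor_batches * batch_size  # examples not covered by full batches

print(floor_batches)  # 429
print(ceil_batches)   # 430
print(leftover)       # 88
```

Rounding down means the loop runs only full batches per epoch; rounding up adds one more step so that the count covers every example.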

# Training a Deep Learning Network
Now we have a 2-layer NN.  We can make it bigger and more complex by increasing the size of the layer in the middle, but increasing H is not very efficient in general: you have to make it very big.  This is where deep learning comes into play.  Instead of increasing H, we add layers.

**Wider vs Deeper**
* Parameter efficiency.  You typically get much more performance with fewer parameters by going deeper rather than wider.
* A lot of the natural phenomena that we are interested in tend to have a hierarchical structure that deep models naturally capture.  For example, lines at the lowest layer are combined into geometric shapes, and then into objects.
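
The parameter side of the wider-vs-deeper trade-off can be counted with the (N+1)k rule applied layer by layer. The layer sizes below are hypothetical examples, and the sketch only counts parameters; it says nothing about which network would actually perform better.

```python
def mlp_param_count(layer_sizes):
    """Sum (n_in + 1) * n_out over consecutive layers (weights plus biases)."""
    return sum((n_in + 1) * n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

baseline = mlp_param_count([784, 256, 10])         # one hidden layer, H = 256
wide = mlp_param_count([784, 2048, 10])            # one much wider hidden layer
deep = mlp_param_count([784, 256, 256, 256, 10])   # three hidden layers of 256

print(baseline)  # 203530
print(wide)      # 1628170
print(deep)      # 335114
```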

# Save and Restore TensorFlow Models
Training a model can take hours.  Once the session is closed, you lose all the trained weights and biases.  Fortunately, we can save our progress using **tf.train.Saver()**.

## Saving Variables
If you're using TensorFlow 0.11.0RC1 or newer, a file called "model.ckpt.meta" will also be created. This file contains the TensorFlow graph.

In [4]:
# Remove the previous weights and bias
tf.reset_default_graph()

# The file path to save the data
save_file = './model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

# Class used to save and/or restore Tensor Variables
saver = tf.train.Saver()

with tf.Session() as sess:
    # Initialize all the Variables
    sess.run(tf.global_variables_initializer())
    
    # Show the values of weights and bias
    print('Weights:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))
    
    # Save the model
    saver.save(sess, save_file)

Weights:
[[-0.21627854 -1.14732623 -0.25619784]
 [-0.03114265  0.08963913 -0.18392591]]
Bias:
[ 0.06306206  1.43337607 -0.20846258]


## Loading Variables

In [5]:
# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2,3]))
bias = tf.Variable(tf.truncated_normal([3]))

# Class used to save and/or restore Tensor Variables
saver = tf.train.Saver()

with tf.Session() as sess:
    # Load the weights and bias
    saver.restore(sess, save_file)
    
    # Show the values of weights and bias
    print('Weights:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))


Weights:
[[-0.21627854 -1.14732623 -0.25619784]
 [-0.03114265  0.08963913 -0.18392591]]
Bias:
[ 0.06306206  1.43337607 -0.20846258]


## Save a Trained Model

In [4]:
# Remove previous Tensors and Operations
tf.reset_default_graph()

from tensorflow.examples.tutorials.mnist import input_data
import numpy as np

learning_rate = 0.001
n_input = 784
n_classes = 10

# Import MNIST data
mnist = input_data.read_data_sets('.', one_hot=True)

# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# Weights and bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits
logits = tf.add(tf.matmul(features, weights), bias)

# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)
    
# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Train the model, then save the weights

import math

save_file = './train_model.ckpt'
batch_size = 128
n_epochs = 100

saver = tf.train.Saver()

# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    # Training Cycle
    for epoch in range(n_epochs):
        total_batch = math.ceil(mnist.train.num_examples / batch_size)
        
        # Loop over all batches
        for i in range(total_batch):
            batch_features, batch_labels = mnist.train.next_batch(batch_size)
            sess.run(
                optimizer,
                feed_dict={features:batch_features, labels: batch_labels})
            
        # Print status for every 10 epochs
        if epoch % 10 == 0:
            valid_accuracy = sess.run(
                accuracy,
                feed_dict={
                    features: mnist.validation.images,
                    labels: mnist.validation.labels})
            print('Epoch {:<3} - Validation Accuracy: {}'.format(
                epoch,
                valid_accuracy))
        
    saver.save(sess, save_file)
    print('Trained Model Saved.')

Extracting .\train-images-idx3-ubyte.gz
Extracting .\train-labels-idx1-ubyte.gz
Extracting .\t10k-images-idx3-ubyte.gz
Extracting .\t10k-labels-idx1-ubyte.gz
Epoch 0   - Validation Accuracy: 0.11999998986721039
Epoch 10  - Validation Accuracy: 0.29259997606277466
Epoch 20  - Validation Accuracy: 0.43119996786117554
Epoch 30  - Validation Accuracy: 0.5143998861312866
Epoch 40  - Validation Accuracy: 0.5794000029563904
Epoch 50  - Validation Accuracy: 0.6195999383926392
Epoch 60  - Validation Accuracy: 0.6505999565124512
Epoch 70  - Validation Accuracy: 0.6721998453140259
Epoch 80  - Validation Accuracy: 0.694399893283844
Epoch 90  - Validation Accuracy: 0.7111998200416565
Trained Model Saved.


In [6]:
saver = tf.train.Saver()

# Launch the graph
with tf.Session() as sess:
    saver.restore(sess, save_file)
    
    test_accuracy = sess.run(
    accuracy,
    feed_dict={features: mnist.test.images, labels: mnist.test.labels})
    
print('Test Accuracy: {}'.format(test_accuracy))

Test Accuracy: 0.726699948310852


# Loading the Weights and Biases into a New Model
Sometimes you might want to adjust, or "finetune" a model that you have already trained and saved.

However, loading saved Variables directly into a modified model can generate errors. Let's go over how to avoid these problems.

## Naming Error
TF uses a string identifier for Tensors and Operations called **name**.  If a name is not given, TF will create one automatically.  TF will give the first node the name **< Type>**, and then the name **< Type>_< number>** for subsequent nodes.  How can this affect loading a model with a different order of **weights** and **bias**?

In [1]:
import tensorflow as tf

# Remove the previous weights and bias
tf.reset_default_graph()

save_file = './model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Save Weights: {}'.format(weights.name))
print('Save Bias: {}'.format(bias.name))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)
    
# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias
bias = tf.Variable(tf.truncated_normal([3]))
weights = tf.Variable(tf.truncated_normal([2, 3]))

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Load Weights: {}'.format(weights.name))
print('Load Bias: {}'.format(bias.name))

with tf.Session() as sess:
    # Load the weights and bias - ERROR
    saver.restore(sess, save_file)

Save Weights: Variable:0
Save Bias: Variable_1:0
Load Weights: Variable_1:0
Load Bias: Variable:0


InvalidArgumentError: Assign requires shapes of both tensors to match. lhs shape= [2,3] rhs shape= [3]
	 [[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Variable_1, save/RestoreV2_1/_1)]]

Caused by op 'save/Assign_1', defined at:
  File "C:\Anaconda3\envs\carnd-term1\lib\runpy.py", line 184, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Anaconda3\envs\carnd-term1\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\ipykernel\__main__.py", line 3, in <module>
    app.launch_new_instance()
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\traitlets\config\application.py", line 658, in launch_instance
    app.start()
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\ipykernel\kernelapp.py", line 474, in start
    ioloop.IOLoop.instance().start()
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\zmq\eventloop\ioloop.py", line 177, in start
    super(ZMQIOLoop, self).start()
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tornado\ioloop.py", line 887, in start
    handler_func(fd_obj, events)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\zmq\eventloop\zmqstream.py", line 440, in _handle_events
    self._handle_recv()
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\zmq\eventloop\zmqstream.py", line 472, in _handle_recv
    self._run_callback(callback, msg)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\zmq\eventloop\zmqstream.py", line 414, in _run_callback
    callback(*args, **kwargs)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tornado\stack_context.py", line 275, in null_wrapper
    return fn(*args, **kwargs)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\ipykernel\kernelbase.py", line 276, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\ipykernel\kernelbase.py", line 228, in dispatch_shell
    handler(stream, idents, msg)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\ipykernel\kernelbase.py", line 390, in execute_request
    user_expressions, allow_stdin)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\ipykernel\ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\ipykernel\zmqshell.py", line 501, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\IPython\core\interactiveshell.py", line 2717, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\IPython\core\interactiveshell.py", line 2821, in run_ast_nodes
    if self.run_code(code, result):
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\IPython\core\interactiveshell.py", line 2881, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-1-57737654c782>", line 29, in <module>
    saver = tf.train.Saver()
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tensorflow\python\training\saver.py", line 1040, in __init__
    self.build()
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tensorflow\python\training\saver.py", line 1070, in build
    restore_sequentially=self._restore_sequentially)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tensorflow\python\training\saver.py", line 675, in build
    restore_sequentially, reshape)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tensorflow\python\training\saver.py", line 414, in _AddRestoreOps
    assign_ops.append(saveable.restore(tensors, shapes))
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tensorflow\python\training\saver.py", line 155, in restore
    self.op.get_shape().is_fully_defined())
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tensorflow\python\ops\gen_state_ops.py", line 47, in assign
    use_locking=use_locking, name=name)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tensorflow\python\framework\ops.py", line 2327, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "C:\Anaconda3\envs\carnd-term1\lib\site-packages\tensorflow\python\framework\ops.py", line 1226, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): Assign requires shapes of both tensors to match. lhs shape= [2,3] rhs shape= [3]
	 [[Node: save/Assign_1 = Assign[T=DT_FLOAT, _class=["loc:@Variable_1"], use_locking=true, validate_shape=true, _device="/job:localhost/replica:0/task:0/gpu:0"](Variable_1, save/RestoreV2_1/_1)]]


You'll notice that the name properties for weights and bias are different from when you saved the model. This is why the code produces the "Assign requires shapes of both tensors to match" error: saver.restore(sess, save_file) is trying to load weight data into bias and bias data into weights.  Instead of letting TF set the name property, set it manually.
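
The mix-up can be reproduced with a toy model in plain Python. This is not how TensorFlow is implemented; the helpers `auto_names` and `restore_by_name` are hypothetical and only mimic the behaviour described above: names are assigned in creation order, and restoring matches saved values to variables by name.

```python
def auto_names(shapes_in_creation_order):
    """Mimic default naming: 'Variable', then 'Variable_1', 'Variable_2', ..."""
    names = {}
    for index, shape in enumerate(shapes_in_creation_order):
        name = 'Variable' if index == 0 else 'Variable_{}'.format(index)
        names[name] = shape
    return names

def restore_by_name(checkpoint, graph):
    """Return (name, expected shape, saved shape) for every mismatch."""
    mismatches = []
    for name, shape in graph.items():
        saved_shape = checkpoint.get(name)
        if saved_shape != shape:
            mismatches.append((name, shape, saved_shape))
    return mismatches

# Saved with weights first; loaded with bias first: the auto names swap.
saved = auto_names([(2, 3), (3,)])
loaded = auto_names([(3,), (2, 3)])
problems = restore_by_name(saved, loaded)
print(problems)  # both variables receive data of the wrong shape

# With explicit names, the creation order no longer matters.
named_saved = {'weights_0': (2, 3), 'bias_0': (3,)}
named_loaded = {'bias_0': (3,), 'weights_0': (2, 3)}
print(restore_by_name(named_saved, named_loaded))  # []
```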

In [3]:
import tensorflow as tf

tf.reset_default_graph()

save_file = './model.ckpt'

weights = tf.Variable(tf.truncated_normal([2, 3]), name='weights_0')
bias = tf.Variable(tf.truncated_normal([3]), name='bias_0')

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Save Weights: {}'.format(weights.name))
print('Save Bias: {}'.format(bias.name))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)
    
tf.reset_default_graph()

bias = tf.Variable(tf.truncated_normal([3]), name='bias_0')
weights = tf.Variable(tf.truncated_normal([2, 3]), name='weights_0')

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Load Weights: {}'.format(weights.name))
print('Load Bias: {}'.format(bias.name))

with tf.Session() as sess:
    saver.restore(sess, save_file)
    
print('Loaded Weights and Bias successfully.')

Save Weights: weights_0:0
Save Bias: bias_0:0
Load Weights: weights_0:0
Load Bias: bias_0:0
Loaded Weights and Bias successfully.


# Regularization
Deep models only shine when we have enough data to train them.  The **skinny jeans** problem: a network that is just the right size for the data is very, very hard to optimize.  In practice, we always train networks that are way too big for our data, and then try our best to prevent them from overfitting.

The first way to prevent overfitting is to look at the performance on the validation set and stop training as soon as it stops improving.  This is called early termination, and it is still the best way to prevent the network from over-optimizing on the training set.
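
Early termination can be sketched as a loop over per-epoch validation scores. The `patience` parameter (how many epochs without improvement to tolerate) and the score values are hypothetical; the notes above only say to stop once validation performance stops improving.

```python
def early_stopping_epoch(validation_scores, patience=3):
    """Return (best_epoch, stop_epoch) for a list of per-epoch scores.

    Training stops once `patience` epochs have passed without a new best
    validation score. Higher scores are assumed to be better (accuracy).
    """
    best_score = float('-inf')
    best_epoch = -1
    for epoch, score in enumerate(validation_scores):
        if score > best_score:
            best_score, best_epoch = score, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch, epoch
    return best_epoch, len(validation_scores) - 1

# Hypothetical validation accuracies: the best value is reached at epoch 2.
scores = [0.60, 0.68, 0.72, 0.71, 0.72, 0.70, 0.69, 0.75]
result = early_stopping_epoch(scores, patience=3)
print(result)  # (2, 5)
```

In practice you would keep the weights saved at the best epoch rather than the ones from the epoch where training stopped.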

The second way is to apply regularization: applying artificial constraints on the network that implicitly reduce the number of free parameters, while not making it more difficult to optimize.  This is like having stretchy pants!  One such technique is $L_2$ regularization, which adds a term to the loss that penalizes large weights.

The nice thing about this is that it is very simple.  It only adds a term to the loss, so the structure of the network does not have to change, and you can compute its derivative by hand.  The (squared) $L_2$ norm is the sum of the squares of the individual elements of the vector.

![L2 Regularization](Lesson7/L2-regularization.png)
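
A minimal sketch of the penalty, assuming the commonly used form $\mathcal{L}' = \mathcal{L} + \beta \cdot \frac{1}{2}\lVert w \rVert_2^2$. The constant `beta`, the factor of one half and the example weights are assumptions for illustration. With this form the derivative of the penalty with respect to each weight is simply that weight, which is why it can be computed by hand.

```python
def l2_penalty(weights):
    """Half the sum of the squares of the individual weights."""
    return 0.5 * sum(w * w for w in weights)

def l2_penalty_gradient(weights):
    """d/dw of 0.5 * w^2 is w, element by element."""
    return list(weights)

def regularized_loss(data_loss, weights, beta):
    """Original loss plus the scaled penalty; the network itself is unchanged."""
    return data_loss + beta * l2_penalty(weights)

weights = [0.5, -1.0, 2.0]          # hypothetical weight vector
penalty = l2_penalty(weights)
total = regularized_loss(1.0, weights, beta=0.01)

print(penalty)                       # 2.625
print(l2_penalty_gradient(weights))  # [0.5, -1.0, 2.0]
print(total)                         # approximately 1.02625
```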

# Dropout
Dropout is a relatively recent technique.  For every example that we train our network on, kill off random activations: for example, take half of the data that is flowing through the network and destroy it.  The NN can never rely on any given activation being present, because it may be squashed at any time.  This forces the NN to learn redundant representations, which makes things more robust and prevents overfitting.  The network acts as if it is taking a consensus.  If dropout does not work, you should probably be using a bigger network.

![Dropout](Lesson7/dropout.png)

We want something deterministic when we evaluate the network; we don't want the randomness.  We want the consensus of this redundant model, which we get by averaging the activations.  The trick to make sure this expectation holds: during training, zero out the dropped activations and scale the remaining activations by a factor of two (when half of them are dropped).

![Dropout 2](Lesson7/dropout-2.png)
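
The scaling trick can be sketched in plain Python. The `dropout` helper below is hypothetical (it is not TensorFlow's implementation): during training each activation is kept with probability `keep_prob` and scaled by `1 / keep_prob`, which is the factor of two when `keep_prob` is 0.5; at evaluation time the activations pass through unchanged.

```python
import random

def dropout(activations, keep_prob, rng, training=True):
    """Zero each activation with probability 1 - keep_prob and rescale the rest."""
    if not training:
        return list(activations)          # deterministic at evaluation time
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

rng = random.Random(0)                    # fixed seed so the run is repeatable
activations = [1.0] * 10000

dropped = dropout(activations, keep_prob=0.5, rng=rng)
mean_train = sum(dropped) / len(dropped)
evaluated = dropout(activations, keep_prob=0.5, rng=rng, training=False)

print(sorted(set(dropped)))  # [0.0, 2.0]
print(mean_train)            # close to 1.0, so the expected activation is preserved
```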

# TensorFlow Dropout
![Dropout Diagram](Lesson7/dropout-node.jpeg)
source: [*Dropout: A Simple Way to Prevent Neural Networks from Overfitting*](https://www.cs.toronto.edu/~hinton/absps/JMLRdropout.pdf)

Dropout temporarily drops units, including their incoming and outgoing connections.  Use **tf.nn.dropout()** in TensorFlow.
```python
keep_prob = tf.placeholder(tf.float32) # probability to keep units

hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)

logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])
```
**tf.nn.dropout()** parameters
1. **hidden_layer**: the tensor to which to apply dropout
2. **keep_prob**: the probability of keeping any given unit

In order to compensate for dropped units, it multiplies all units that are kept by 1/keep_prob.

For training, a good starting value for **keep_prob** is 0.5. During testing it should be 1.

### quiz.py

In [3]:
# Solution is available in the other "solution.py" tab
import tensorflow as tf

hidden_layer_weights = [
    [0.1, 0.2, 0.4],
    [0.4, 0.6, 0.6],
    [0.5, 0.9, 0.1],
    [0.8, 0.2, 0.8]]
out_weights = [
    [0.1, 0.6],
    [0.2, 0.1],
    [0.7, 0.9]]

# Weights and biases
weights = [
    tf.Variable(hidden_layer_weights),
    tf.Variable(out_weights)]
biases = [
    tf.Variable(tf.zeros(3)),
    tf.Variable(tf.zeros(2))]

# Input
features = tf.Variable([[0.0, 2.0, 3.0, 4.0], [0.1, 0.2, 0.3, 0.4], [11.0, 12.0, 13.0, 14.0]])

# TODO: Create Model with Dropout
keep_prob = tf.placeholder(tf.float32)

hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)

logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

# TODO: Print logits from a session
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    output = sess.run(logits, feed_dict={keep_prob:0.5})
    
    print(output)

[[  6.57999945   8.45999908]
 [  0.71400005   0.91800004]
 [  4.72000027  28.3200016 ]]
