# Convolutional Neural Network Applications

# Deep Learning Libraries

<img src="https://www.houseofbots.com/images/news/4015/cover.png" width=500>



<img src="https://images.ctfassets.net/sc14p6l3fnnh/3TxuyeJrWZcIShYarhqV3i/0e1fb927536970856fe3c8bb4c65ba7a/pytorch-1-.png">

<img src="https://github.com/jordanott/DeepLearning/blob/master/Figures/tf_vs_pytorch.png?raw=true">

In [21]:
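# The TensorFlow cells below use the 1.x graph API (tf.placeholder, tf.Session).
# Under TensorFlow 2.x, use `import tensorflow.compat.v1 as tf` and call
# `tf.disable_v2_behavior()` instead.
import tensorflow as tf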
import numpy as np
import torch

N, D_in, H, D_out = 64, 1000, 100, 10

# TensorFlow

1. Build a computational graph describing the computation (including the paths needed for backprop)
2. Reuse the same graph on every iteration

In [22]:
# Create placeholders for data; these get filled when we execute the graph.
x = tf.placeholder(tf.float32, shape=(None, D_in))
y = tf.placeholder(tf.float32, shape=(None, D_out))

# Create Variables for the weights and initialize them with random data.
w1 = tf.Variable(tf.random_normal((D_in, H)))
w2 = tf.Variable(tf.random_normal((H, D_out)))

# Forward pass: these lines only add nodes to the graph; nothing is computed until the graph is run
h = tf.matmul(x, w1)
h_relu = tf.maximum(h, tf.zeros(1))
y_pred = tf.matmul(h_relu, w2)

# Define the loss using operations on TensorFlow Tensors
loss = tf.reduce_sum((y - y_pred) ** 2.0)

# Compute gradient of the loss with respect to w1 and w2
grad_w1, grad_w2 = tf.gradients(loss, [w1, w2])

# Define the weight updates as graph operations; they only run when evaluated in a session
learning_rate = 1e-6
new_w1 = w1.assign(w1 - learning_rate * grad_w1)
new_w2 = w2.assign(w2 - learning_rate * grad_w2)

# Now we have built our computational graph; enter a TensorFlow session to execute it
with tf.Session() as sess:
    # Run the graph once to initialize the Variables w1 and w2.
    sess.run(tf.global_variables_initializer())

    # Create NumPy arrays holding the actual data for the inputs and targets
    x_value = np.random.randn(N, D_in)
    y_value = np.random.randn(N, D_out)
    for step in range(500):
        # On every run, feed x_value into x and y_value into y; evaluating
        # new_w1 and new_w2 applies the weight updates
        loss_value, _, _ = sess.run([loss, new_w1, new_w2],
                feed_dict={x: x_value, y: y_value})

# TensorFlow Keras Wrapper?!?


In [3]:
import tensorflow.keras

* There's already a whole standalone Keras library
* Why is it also included in TensorFlow?

* In summary **WTF**


# PyTorch: Fundamental Concepts 

* Tensor: Like a numpy array, but can run on GPU
* Autograd: Package for building computational graphs out of Tensors, and automatically computing gradients
* Module: A neural network layer; may store state or learnable weights
* I like it because the syntax is almost identical to numpy but with GPU support
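
The three concepts above can be sketched in a few lines. This is a minimal illustration, not part of the original notebook; the variable names (`a`, `b`, `w`, `layer`) and the toy shapes are assumptions chosen for the example.

```python
import torch

# Tensor: created and manipulated much like a NumPy array.
a = torch.ones(2, 3)
b = a * 2 + 1                      # elementwise arithmetic, as in NumPy
device = "cuda" if torch.cuda.is_available() else "cpu"
b = b.to(device)                   # moves to the GPU only if one is available

# Autograd: operations on Tensors with requires_grad=True are recorded,
# and backward() fills in .grad for them.
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = (w ** 2).sum()
loss.backward()                    # d(loss)/dw = 2 * w

# Module: a layer that owns its learnable weights.
layer = torch.nn.Linear(3, 2)      # weight has shape (2, 3), bias has shape (2,)
out = layer(torch.randn(4, 3))     # batch of 4 inputs -> output of shape (4, 2)
```

After `backward()`, `w.grad` holds the gradient of `loss` with respect to `w`, and `layer.parameters()` yields the weight and bias that an optimizer would update.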

In [19]:
# Create random Tensors to hold input and outputs.
x = torch.randn(N, D_in, dtype=torch.float)
y = torch.randn(N, D_out, dtype=torch.float)

# Create random Tensors for weights.
w1 = torch.randn(D_in, H, dtype=torch.float, requires_grad=True)
w2 = torch.randn(H, D_out, dtype=torch.float, requires_grad=True)

# lr matches the TensorFlow example above; a much larger step is likely to make this unnormalized loss diverge.
optimizer = torch.optim.SGD([w1, w2], lr=1e-6)

for t in range(500):
    # Forward pass: compute predicted y using operations on Tensors.
    y_pred = x.mm(w1).clamp(min=0).mm(w2)

    # Compute the loss using operations on Tensors.
    loss = (y_pred - y).pow(2).sum()

    # Clear gradients left over from the previous iteration; backward() accumulates into .grad.
    optimizer.zero_grad()

    # Use autograd to compute the backward pass. This call will compute the gradient of loss with respect to all Tensors with requires_grad=True.
    # After this call w1.grad and w2.grad will be Tensors holding the gradient of the loss with respect to w1 and w2 respectively.
    loss.backward()

    # Update w1 and w2 using the gradients just computed.
    optimizer.step()

PyTorch autograd looks a lot like TensorFlow: in both frameworks we define a computational graph, and use automatic differentiation to compute gradients. The biggest difference between the two is that TensorFlow’s computational graphs are static and PyTorch uses dynamic computational graphs.

In TensorFlow, we define the computational graph once and then execute the same graph over and over again, possibly feeding different input data to the graph. In PyTorch, each forward pass defines a new computational graph.
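
A small sketch of what "each forward pass defines a new computational graph" means in practice. The names `w` and `graph_roots` and the toy loss are assumptions made for illustration; the point is only that the `grad_fn` at the root of the graph is a different object on every iteration.

```python
import torch

w = torch.randn(3, requires_grad=True)

graph_roots = []
for step in range(2):
    x = torch.randn(3)
    loss = (w * x).sum()              # the forward pass records a fresh graph
    graph_roots.append(loss.grad_fn)  # keep a reference to the root node of that graph
    loss.backward()                   # walks this iteration's graph to compute w.grad
    w.grad.zero_()                    # reset so gradients do not accumulate

# The two iterations produced two distinct graph objects.
different = graph_roots[0] is not graph_roots[1]
```

Here `different` is `True`: nothing is reused between iterations, which is the opposite of the TensorFlow session loop, where the same graph is run 500 times.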

Static graphs are nice because the graph can be optimized up front: for example, a framework might fuse some graph operations for efficiency, or come up with a strategy for distributing the graph across many GPUs or many machines. If the same graph is reused on every iteration, the cost of this potentially expensive up-front optimization is amortized across all of the runs.

One aspect where static and dynamic graphs differ is control flow. For some models we may wish to perform different computation for each data point. For example, a recurrent network might be unrolled for a different number of time steps for each data point, and this unrolling can be implemented as a loop. With a static graph, the loop construct needs to be part of the graph; for this reason TensorFlow provides operators such as tf.scan for embedding loops into the graph. With dynamic graphs the situation is simpler: since we build graphs on-the-fly for each example, we can use normal imperative flow control to perform computation that differs for each input.
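
The recurrent-network case can be sketched with an ordinary Python `for` loop. This is a hypothetical minimal cell, not a model from the notebook: `run_rnn`, `w_x`, `w_h`, the hidden size, and the sequence lengths are all assumptions chosen for illustration.

```python
import torch

torch.manual_seed(0)
input_size, hidden_size = 3, 4
w_x = torch.randn(input_size, hidden_size, requires_grad=True)
w_h = torch.randn(hidden_size, hidden_size, requires_grad=True)

def run_rnn(sequence):
    """Unroll a simple recurrent cell once per time step of `sequence`."""
    h = torch.zeros(hidden_size)
    for x_t in sequence:                      # plain Python loop, one step per row
        h = torch.tanh(x_t @ w_x + h @ w_h)
    return h

# Two inputs with different numbers of time steps (2 and 5).
sequences = [torch.randn(2, input_size), torch.randn(5, input_size)]
for seq in sequences:
    loss = run_rnn(seq).pow(2).sum()
    loss.backward()                           # gradients accumulate in w_x.grad and w_h.grad
```

The graph built for the second sequence is simply longer than the one built for the first; no special loop operator is needed because the graph is recorded as the Python code runs.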

# Summary

* If you want something high level that works fast: **Keras**
* Need to write custom layers and do fancy computations in your network: **PyTorch**


* If you hate yourself: **TensorFlow**


# S


* Resnet
* Inception

# Image Segmentation

* Transposed Convolution
* FCN
* U-Net
* PSPNet

# Object Detection

* YOLO
* SSD

# Image Captioning

* Show, Attend and Tell