# Introduction

- PyTorch and TensorFlow are two major libraries
  - TensorFlow is developed by Google
    - Developed for Python and has the Keras library built in
  - PyTorch is backed by Facebook
- Keras wraps the numerical computing complexity of TensorFlow
- "TensorFlow absorbed Keras as part of its library"

# Background

## Overview of some deep learning libraries 

- Machine learning is a broad topic
- Deep learning, in particular, is a way of using neural networks for machine learning
- The concept of a neural network is probably older than machine learning, going back to the 1950s
- Below are some famous libraries for neural networks and deep learning
  - The differences between PyTorch and TensorFlow
- Deep learning has gained attention in the last decade
- Before that, there was little confidence in how to train a multilayer perceptron with many layers
- Before deep learning, we had a neural network library called **libann**
- It was a library for C++
- One of the earliest libraries for deep learning was Caffe, developed at UC Berkeley for computer vision problems
- Developed in C++ but with a Python interface
- We can build projects in Python with the network defined in a JSON-like syntax
- Chainer is another Python library with easy syntax; it is less common these days, but the APIs of Keras and PyTorch are similar to it
- Theano was a major library for deep learning but is now obsolete
- Earlier versions of Keras allowed you to choose between Theano and TensorFlow for the backend
  - Neither Theano nor TensorFlow is precisely a deep learning library; rather, they are tensor libraries that make matrix operations and differentiation handy
  - Deep learning operations can be built on top of them
  - Hence Theano and TensorFlow are considered replacements for each other from Keras' perspective
- Other libraries include Apache MXNet and Microsoft CNTK

# PyTorch and TensorFlow 

- The common way of defining a deep learning model in PyTorch is to create a class

In [1]:
import torch

ModuleNotFoundError: No module named 'torch'

In [2]:
from tensorflow.keras.models import Sequential

- One major difference between PyTorch and Keras syntax is in the training loop. 

# Chap 2 - Intro to TensorFlow
- It is a library created by Google
- It allows for fast numerical computing 
- Foundation library that can be used for deep learning
- Or a wrapper library can be used to simplify using TensorFlow
- This chapter covers:
  - About the TensorFlow library
  - How to define, compile, and evaluate simple symbolic expressions in TensorFlow
- Open-source library for fast numerical computing, created and maintained by Google, released under the Apache 2.0 open source license
- The API is Python based, but you can access the underlying C++ API
- Unlike other deep learning libraries like Theano, TensorFlow is designed for use both in research and development and in production systems, e.g. RankBrain and DeepDream
- Can run on PC, GPUs, mobile devices and large scale distributed systems

- **How to install**
  - pip install tensorflow

- **My first example in TensorFlow**
- Computation in TensorFlow is described in terms of data flow and operations in the structure of a directed graph
- Terms in describing a directed graph:
  - **Nodes**: perform computation and can have inputs and outputs
    - Data that moves between nodes is known as **tensors**, which are multi-dimensional arrays of real values
  - **Edges**: show the flow of data, branching, looping, and updating. Special edges can be used to synchronize behavior in the graph, e.g. waiting for a computation to complete
  - **Operation**: an abstract term for a computation, e.g. an add or multiply operation
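
The graph terms above can be made concrete with a toy example. This is only a sketch in plain Python: `Node`, `constant`, and `evaluate` are hypothetical names invented for illustration and are not part of the TensorFlow API. It shows constant nodes, operation nodes, and edges as references between nodes.

```python
# Illustrative sketch only: a toy dataflow graph in plain Python.
# Node, constant, and evaluate are hypothetical names, not TensorFlow API.
import operator


class Node:
    """A graph node: an operation plus the nodes that feed it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # callable applied to the input values, or None
        self.inputs = inputs  # incoming edges: nodes whose outputs flow in
        self.value = value    # fixed value for constant nodes


def constant(value):
    """Create a node with no inputs that always yields `value`."""
    return Node(op=None, value=value)


def evaluate(node):
    """Evaluate a node by first evaluating everything it depends on."""
    if node.op is None:
        return node.value
    args = [evaluate(parent) for parent in node.inputs]
    return node.op(*args)


a = constant(10)
b = constant(32)
total = Node(operator.add, inputs=(a, b))       # an "add" operation node
scaled = Node(operator.mul, inputs=(total, a))  # reuses `a` through a second edge

print(evaluate(total))   # 42
print(evaluate(scaled))  # 420
```

Nothing is computed when the nodes are created; values only flow along the edges when `evaluate` is called, which is the basic idea behind describing computation as a directed graph.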

- **Computation with TensorFlow**
  - The code below shows how to define values as **tensors** and execute an operation on them

In [5]:
import tensorflow as tf

In [6]:
a = tf.constant(10)
b = tf.constant(32)

In [8]:
print(a + b)

tf.Tensor(42, shape=(), dtype=int32)


## Linear Regression with TensorFlow

In [9]:
import numpy as np

In [14]:
X_data = np.random.rand(100).astype(np.float32)
y_data = X_data * 0.1 + 0.3

In [13]:
W = tf.Variable(tf.random.normal([1]))
b = tf.Variable(tf.zeros([1]))

In [15]:
def mse_loss():
    y = W * X_data + b
    loss = tf.reduce_mean(tf.square(y - y_data))
    return loss   

In [18]:
optimizer = tf.keras.optimizers.Adam()
for step in range(5000):
    optimizer.minimize(mse_loss, var_list=[W,b])
    if step % 500 == 0:
        print(step, W.numpy(), b.numpy())

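AttributeError: 'Adam' object has no attribute 'minimize'

- The installed Keras optimizer has no `minimize` method, so the next cell computes the gradients with `tf.GradientTape` and applies them with `apply_gradients`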

In [20]:
# Training loop
for step in range(5000):
    with tf.GradientTape() as tape:
        # Record the operations for gradient computation
        loss_value = mse_loss()

    # Compute gradients with respect to W and b
    gradients = tape.gradient(loss_value, [W, b])

    # Apply gradients to variables using the optimizer
    optimizer.apply_gradients(zip(gradients, [W, b]))

    if step % 500 == 0:
        print(step, W.numpy(), b.numpy())

0 [1.4301988] [-0.00099999]
500 [1.0829537] [-0.19162789]
1000 [0.8327605] [-0.07242619]
1500 [0.5913301] [0.05106386]
2000 [0.3908629] [0.15289497]
2500 [0.24704511] [0.22569285]
3000 [0.1604788] [0.2694466]
3500 [0.11899895] [0.29040247]
4000 [0.10420582] [0.29787543]
4500 [0.1005907] [0.29970163]
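
To see what the gradient tape and optimizer are doing in the loop above, here is a rough sketch of the same regression with the gradients of the mean squared error written out by hand. It uses plain gradient descent rather than Adam, and the evenly spaced data, starting point, learning rate, and step count are assumptions chosen for illustration.

```python
# Sketch: plain gradient descent with hand-derived gradients (not Adam).
# The data, learning rate, and step count below are illustrative assumptions.
xs = [i / 100 for i in range(100)]    # deterministic stand-in for X_data
ys = [0.1 * x + 0.3 for x in xs]      # same target line as in the notes

w, b = 1.0, 0.0                       # arbitrary starting point
learning_rate = 0.5
n = len(xs)

for step in range(2000):
    # Residuals of the current fit: (w * x + b) - y
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    # Gradients of mean(error^2) with respect to w and b
    grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
    grad_b = 2.0 / n * sum(errors)
    # Step against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b)  # both should end up very close to 0.1 and 0.3
```

The two `grad_` lines are what `tape.gradient(loss_value, [W, b])` works out automatically, and the two update lines play the role of `optimizer.apply_gradients`.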


# Chap 3 - AutoGrad to solve Regression problem
- More gradient descent
- TensorFlow is usually used for building neural networks
- TensorFlow can also be used as an optimization tool using AutoGrad and gradient descent
- TensorFlow is a library with automatic differentiation capability
- I can use it to solve numerical optimization problems with gradient descent (GD)
- Gradient descent with automatically computed gradients is also the algorithm used to train neural networks
- This chapter covers:
  - How TensorFlow's automatic differentiation engine "AutoGrad" works.
  - How to make use of AutoGrad and an "optimizer" to solve an optimization problem
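
As a rough picture of how an automatic differentiation engine can work, the sketch below records each operation together with its local derivatives and then applies the chain rule backwards. It is a simplified reverse-mode illustration in plain Python; `Value` is a hypothetical class made up for this example and is not how TensorFlow is implemented internally.

```python
# Sketch of reverse-mode automatic differentiation in plain Python.
# `Value` is a hypothetical class for illustration, not a TensorFlow type.
class Value:
    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0
        self.parents = parents  # pairs of (parent Value, local derivative)

    def __add__(self, other):
        return Value(self.data + other.data, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Value(self.data * other.data,
                     ((self, other.data), (other, self.data)))

    def backward(self):
        # Order nodes so each one is processed after everything that uses it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad  # chain rule


x = Value(3.0)
c = Value(2.0)
y = x * x + c * x     # y = x^2 + 2x
y.backward()
print(x.grad)         # dy/dx = 2x + 2 = 8.0
```

The forward pass plays the role of the recording done inside a gradient tape, and `backward()` corresponds to asking the tape for a gradient.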

## AutoGrad in TensorFlow

In [None]:
# Creating a constant matrix
import tensorflow as tf

In [24]:
# Well, more like a vector, but apparently similar to NumPy vectors
x = tf.constant([1,2,3]) # Immutable: values cannot be changed
# x = tf.Variable([1,2,3]) # Mutable: values can be updated with assign()
# !! Important when you run a gradient tape: variables are watched automatically, constants are not


In [23]:
print(x)
print(x.shape)
print(x.dtype)
print(x**2)

tf.Tensor([1 2 3], shape=(3,), dtype=int32)
(3,)
<dtype: 'int32'>
tf.Tensor([1 4 9], shape=(3,), dtype=int32)


In [28]:
x = tf.Variable(3.6)
with tf.GradientTape() as tape:
    y = x*x
dy = tape.gradient(y, x)
print(dy)    

tf.Tensor(7.2, shape=(), dtype=float32)
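
The value 7.2 is the derivative of x*x at x = 3.6, that is 2 * 3.6. One way to sanity-check a gradient like this without TensorFlow is a numerical approximation. The sketch below uses a central difference; the helper name and the step size `h` are assumptions for illustration.

```python
# Sketch: numerical check of the derivative reported by the gradient tape.
def f(x):
    return x * x


def central_difference(func, x, h=1e-5):
    """Approximate d(func)/dx at x; h is an assumed step size."""
    return (func(x + h) - func(x - h)) / (2 * h)


estimate = central_difference(f, 3.6)
print(abs(estimate - 7.2) < 1e-6)  # True
```

Unlike this approximation, automatic differentiation evaluates the derivative from the recorded operations and does not depend on a step size.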


## AutoGrad for Polynomial Regression

In [4]:
import tensorflow as tf
import numpy as np 

N = 20
polynomial = np.poly1d([1,2,3])
X = np.random.uniform(-10, 10, size=(N,1))
Y = polynomial(X)
XX = np.hstack([X*X, X, np.ones_like(X)])
w = tf.Variable(tf.random.normal((3,1))) # The three coefficients
x = tf.constant(XX, dtype=tf.float32) # Input sample
y = tf.constant(Y, dtype=tf.float32) # Output sample
optimizer = tf.keras.optimizers.Nadam(learning_rate=0.01)
print(w)

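for _ in range(5000):
    with tf.GradientTape() as tape:
        y_pred = x @ w  # Predictions: design matrix times coefficients
        # Note: reduce_sum gives the sum of squared errors, not the mean, despite the name
        mse = tf.reduce_sum(tf.square(y - y_pred))
    grad = tape.gradient(mse, w)
    optimizer.apply_gradients([(grad, w)])
print(w)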

<tf.Variable 'Variable:0' shape=(3, 1) dtype=float32, numpy=
array([[-1.0751817 ],
       [ 1.3354455 ],
       [ 0.20098118]], dtype=float32)>
<tf.Variable 'Variable:0' shape=(3, 1) dtype=float32, numpy=
array([[1.0001751],
       [1.9998642],
       [2.9871678]], dtype=float32)>
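
The reason `x @ w` works as a polynomial model is that each row of `XX` holds `[x^2, x, 1]`, so a dot product with the three coefficients evaluates the polynomial. A small sketch in plain Python, with hypothetical helper names, cross-checks this against direct evaluation:

```python
# Sketch: why the columns [x^2, x, 1] let a matrix product evaluate a polynomial.
coefficients = [1.0, 2.0, 3.0]   # x^2 + 2x + 3, as in np.poly1d([1, 2, 3])


def design_row(x):
    """One row of the XX matrix for a single input x."""
    return [x * x, x, 1.0]


def predict(x, w):
    """Dot product of a design row with the coefficient vector."""
    return sum(feature * weight for feature, weight in zip(design_row(x), w))


def horner(x, w):
    """Direct polynomial evaluation, used here as a cross-check."""
    result = 0.0
    for weight in w:
        result = result * x + weight
    return result


for x in (-2.0, 0.0, 1.5):
    print(x, predict(x, coefficients), horner(x, coefficients))
```

Because the model is linear in the coefficients, fitting the polynomial reduces to a linear regression on these three columns, which is why the learned `w` ends up close to 1, 2, and 3.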
