
Implementation of Linear Regression Using the High-Level APIs of Deep Learning Frameworks

In [None]:
pip install -U d2l

In [7]:
from d2l import tensorflow as d2l
import numpy as np
import tensorflow as tf

In [8]:
true_w = tf.constant([2, -3.4])
true_b = 4.2
features, labels = d2l.synthetic_data(true_w, true_b, 1000)
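
For readers without the `d2l` helper at hand, the sketch below shows roughly what a synthetic-data generator of this kind does: draw standard-normal features, apply the linear model $y = Xw + b$, and add a small amount of Gaussian noise. The function name `make_synthetic_data`, the noise level of 0.01, and the fixed seed are assumptions made for illustration, not the `d2l` implementation.

```python
import random

def make_synthetic_data(w, b, num_examples, noise_std=0.01, seed=0):
    """Generate (features, labels) with labels = features . w + b + noise.

    Hypothetical stand-in for `d2l.synthetic_data`; the noise level is assumed.
    """
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(num_examples):
        x = [rng.gauss(0.0, 1.0) for _ in w]  # one feature per weight
        y = sum(wi * xi for wi, xi in zip(w, x)) + b + rng.gauss(0.0, noise_std)
        features.append(x)
        labels.append([y])  # one label per example, kept as a column
    return features, labels

toy_features, toy_labels = make_synthetic_data([2.0, -3.4], 4.2, 1000)
print(len(toy_features), len(toy_features[0]), len(toy_labels[0]))
```

Because the noise is small relative to the signal, a correctly trained model should recover weights and a bias very close to the values used for generation.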

### Reading the Dataset
Rather than rolling our own iterator, we can call upon the existing API in a framework to read data. We pass `features` and `labels` as arguments and specify `batch_size` when instantiating a data iterator object. The boolean `is_train` indicates whether the data iterator should shuffle the data on each epoch (each pass through the dataset).

In [10]:
def load_array(data_arrays, batch_size, is_train=True):
  """Construct a TensorFlow data iterator."""
  dataset = tf.data.Dataset.from_tensor_slices(data_arrays)
  if is_train:
    # A buffer smaller than the dataset (1000 examples here) shuffles only approximately.
    dataset = dataset.shuffle(buffer_size=100)
  dataset = dataset.batch(batch_size)
  return dataset

batch_size = 10
data_iter = load_array((features, labels), batch_size)

In [11]:
next(iter(data_iter))

(<tf.Tensor: shape=(10, 2), dtype=float32, numpy=
 array([[-0.48509452, -1.0031143 ],
        [-1.2455248 , -1.0678812 ],
        [-0.4060649 , -1.3355876 ],
        [ 0.61740464, -0.78729147],
        [ 0.00265126,  0.49059355],
        [ 2.4839485 , -0.2647907 ],
        [-0.19049111, -0.7339733 ],
        [-1.0771749 , -0.33752418],
        [-0.38383046,  1.2489277 ],
        [ 0.7332773 , -1.1825895 ]], dtype=float32)>,
 <tf.Tensor: shape=(10, 1), dtype=float32, numpy=
 array([[ 6.6360393],
        [ 5.3382854],
        [ 7.9428115],
        [ 8.126984 ],
        [ 2.5201936],
        [10.0890665],
        [ 6.317945 ],
        [ 3.2030745],
        [-0.8054451],
        [ 9.695987 ]], dtype=float32)>)
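
To make the behaviour of the iterator concrete, here is a framework-free sketch of what shuffling and batching amount to: permute the example indices once per epoch, then yield consecutive slices of `batch_size` examples. The generator name `minibatches` is hypothetical. Unlike `tf.data`, which shuffles through a fixed-size buffer, this sketch permutes the whole dataset at once.

```python
import random

def minibatches(features, labels, batch_size, shuffle=True, seed=None):
    """Yield (X, y) minibatches that together cover the dataset exactly once."""
    indices = list(range(len(features)))
    if shuffle:
        random.Random(seed).shuffle(indices)
    for start in range(0, len(indices), batch_size):
        batch = indices[start:start + batch_size]
        # Features and labels are selected with the same indices, so they stay aligned.
        yield [features[i] for i in batch], [labels[i] for i in batch]

# Toy data: 25 examples, so the last batch is smaller than the rest.
toy_X = [[float(i), float(-i)] for i in range(25)]
toy_y = [[float(i)] for i in range(25)]
batch_sizes = [len(X) for X, _ in minibatches(toy_X, toy_y, batch_size=10, seed=0)]
print(batch_sizes)  # [10, 10, 5]
```

The shorter final batch mirrors the default behaviour of `dataset.batch`, which keeps the remainder unless `drop_remainder=True` is passed.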

### Defining the Model

In [12]:
# Keras is the high-level API for TensorFlow
net = tf.keras.Sequential()
net.add(tf.keras.layers.Dense(1))

### Initializing Model Parameters
Before using `net`, we need to initialize the model parameters, such as the weights and bias in the linear regression model. Deep learning frameworks often have a predefined way to initialize the parameters. Here we specify that each weight parameter should be randomly sampled from a normal distribution with mean 0 and standard deviation 0.01. The bias parameter will be initialized to zero.

The `initializers` module in TensorFlow provides various methods for model parameter initialization. The easiest way to specify the initialization method in Keras is to pass `kernel_initializer` when creating the layer. Here we create `net` again.






In [13]:
initializer = tf.initializers.RandomNormal(stddev=0.01)
net = tf.keras.Sequential()
net.add(tf.keras.layers.Dense(1, kernel_initializer=initializer))

The code above may look straightforward, but something subtle is happening. We are specifying how to initialize the parameters of a network even though Keras does not yet know how many dimensions the input will have. It might be 2, as in our example, or it might be 2000. Keras lets us get away with this because initialization is deferred behind the scenes: the parameters are actually created only when we first pass data through the network. Because the parameters do not exist until then, we cannot access or manipulate them beforehand.
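
The idea of deferred initialization can be illustrated without any framework. The toy class below, `LazyDense`, is a hypothetical stand-in and not how Keras is implemented: it stores only the number of output units at construction time and allocates its weight matrix the first time it sees an input, when the input width becomes known.

```python
import random

class LazyDense:
    """Toy dense layer (hypothetical) that creates its weights on first call."""

    def __init__(self, units, stddev=0.01, seed=0):
        self.units = units
        self.stddev = stddev
        self.rng = random.Random(seed)
        self.weights = None  # unknown until the input width is seen
        self.bias = None

    def __call__(self, X):
        if self.weights is None:
            num_inputs = len(X[0])  # infer the input width from the data
            self.weights = [[self.rng.gauss(0.0, self.stddev) for _ in range(self.units)]
                            for _ in range(num_inputs)]
            self.bias = [0.0] * self.units
        # Compute X . W + b row by row.
        return [[sum(x[i] * self.weights[i][j] for i in range(len(x))) + self.bias[j]
                 for j in range(self.units)]
                for x in X]

layer = LazyDense(1)
print(layer.weights)  # None: nothing has been allocated yet
out = layer([[1.0, 2.0], [3.0, 4.0]])
print(len(layer.weights), len(layer.weights[0]))  # 2 1: shape fixed by the first input
```

In Keras the analogous step happens when the layer is built, either implicitly on the first call or explicitly through `net.build(input_shape)`.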

### Defining the Loss Function
The `MeanSquaredError` class computes the mean squared error, that is, the squared $L_2$ norm of the prediction error averaged over examples. By default it returns a single scalar, the average loss over the examples in the minibatch.

In [14]:
loss = tf.keras.losses.MeanSquaredError()
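
As a reference point, the quantity being computed is $\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2$. The plain-Python sketch below evaluates it on three made-up values; the function name `mean_squared_error` is an illustrative choice, and the definition carries no factor of $1/2$.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences over all examples."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared) / len(squared)

y_true = [1.0, 2.0, 3.0]
y_pred = [1.0, 2.5, 2.0]
# Squared errors are 0, 0.25 and 1, so the mean is 1.25 / 3 (about 0.4167).
print(mean_squared_error(y_true, y_pred))
```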

### Defining the Optimization Algorithm
Minibatch stochastic gradient descent is a standard tool for optimizing neural networks, so Keras supports it alongside a number of variations in the `optimizers` module. Minibatch stochastic gradient descent only requires that we set `learning_rate`, which we set to 0.03 here.

In [15]:
trainer = tf.keras.optimizers.SGD(learning_rate=0.03)
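
To show what a single optimizer update does, the sketch below performs one minibatch SGD step by hand for a linear model under mean squared error: compute the gradient on the minibatch, then move each parameter against its gradient, scaled by the learning rate. The helper names `sgd_step` and `mse` and the four-example minibatch are made up for illustration.

```python
def sgd_step(w, b, X, y, learning_rate):
    """One minibatch SGD update for y_hat = X . w + b under mean squared error."""
    n = len(X)
    errors = [sum(wj * xj for wj, xj in zip(w, x)) + b - yi for x, yi in zip(X, y)]
    # Gradients of (1/n) * sum(error^2) with respect to w and b.
    grad_w = [2.0 / n * sum(e * x[j] for e, x in zip(errors, X)) for j in range(len(w))]
    grad_b = 2.0 / n * sum(errors)
    new_w = [wj - learning_rate * gj for wj, gj in zip(w, grad_w)]
    new_b = b - learning_rate * grad_b
    return new_w, new_b

def mse(w, b, X, y):
    """Mean squared error of the linear model on (X, y)."""
    return sum((sum(wj * xj for wj, xj in zip(w, x)) + b - yi) ** 2
               for x, yi in zip(X, y)) / len(X)

# Toy minibatch generated from w = [2, -3.4], b = 4.2 without noise.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 2.0]]
y = [2.0 * x0 - 3.4 * x1 + 4.2 for x0, x1 in X]
w, b = [0.0, 0.0], 0.0
before = mse(w, b, X, y)
w, b = sgd_step(w, b, X, y, learning_rate=0.03)
after = mse(w, b, X, y)
print(before > after)  # True: the step reduces the loss on this minibatch
```

A larger learning rate takes bigger steps and can overshoot, while a smaller one converges more slowly; 0.03 is simply the value used in this notebook.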

### Training
You might have noticed that expressing our model through the high-level APIs of a deep learning framework requires comparatively few lines of code. We did not have to individually allocate parameters, define our loss function, or implement minibatch stochastic gradient descent. Once we start working with much more complex models, the advantages of high-level APIs will grow considerably. However, once we have all the basic pieces in place, the training loop itself is strikingly similar to what we did when implementing everything from scratch.

To refresh your memory: for some number of epochs, we make a complete pass over the dataset (`data_iter`), iteratively grabbing one minibatch of inputs and the corresponding ground-truth labels. For each minibatch, we go through the following ritual:

- Generate predictions by calling `net(X)` and calculate the loss `l` (the forward propagation).
- Calculate gradients by running backpropagation.
- Update the model parameters by invoking our optimizer.

For good measure, we compute the loss after each epoch and print it to monitor progress.

In [18]:
num_epochs = 3
for epoch in range(num_epochs):
  for X, y in data_iter:
    with tf.GradientTape() as tape:
      # Keras losses take (y_true, y_pred).
      l = loss(y, net(X, training=True))
    grads = tape.gradient(l, net.trainable_variables)
    trainer.apply_gradients(zip(grads, net.trainable_variables))
  # Evaluate on the full dataset after each epoch.
  l = loss(labels, net(features))
  print(f'epoch {epoch + 1}, loss {float(l):f}')


Below, we compare the model parameters learned by training on finite data with the actual parameters that generated our dataset. To access the parameters, we call `net.get_weights()`, which returns the layer's weights followed by its bias. As in our from-scratch implementation, note that our estimated parameters are close to their ground-truth counterparts.

In [19]:
w = net.get_weights()[0]
print('error in estimating w', true_w - tf.reshape(w, true_w.shape))
b = net.get_weights()[1]
print('error in estimating b', true_b - b)

error in estimating w tf.Tensor([-2.4557114e-04  2.6226044e-05], shape=(2,), dtype=float32)
error in estimating b [0.00099897]
