In [1]:
!pip install d2l


In [2]:
%matplotlib inline
import random
import tensorflow as tf
from d2l import tensorflow as d2l

In [3]:
import numpy as np

# Ground-truth parameters used to generate the synthetic dataset
true_w = tf.constant([2, -3.4])
true_b = 4.2
# 1000 examples with 2 features each; labels are X @ true_w + true_b plus noise
features, labels = d2l.synthetic_data(true_w, true_b, 1000)

**3.3.2. Reading the Dataset**

Rather than rolling our own iterator, we can call upon the existing API in a framework to read data. We pass in features and labels as arguments and specify batch_size when instantiating a data iterator object. In addition, the boolean is_train indicates whether we want the data iterator object to shuffle the data on each epoch (one pass through the dataset).

In [4]:
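def load_array(data_arrays, batch_size, is_train=True):
    """Construct a TensorFlow data iterator."""
    # Slice along the first axis so each element is one (features, label) pair
    dataset = tf.data.Dataset.from_tensor_slices(data_arrays)
    if is_train:
        # A buffer at least as large as the dataset (1000 examples here) gives a full shuffle
        dataset = dataset.shuffle(buffer_size=1000)
    dataset = dataset.batch(batch_size)
    return dataset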

batch_size = 10
data_iter = load_array((features, labels), batch_size)

In [5]:
next(iter(data_iter))

(<tf.Tensor: shape=(10, 2), dtype=float32, numpy=
 array([[ 0.03118525,  0.3282368 ],
        [-1.0496702 , -1.4271742 ],
        [-0.80557007,  0.07771834],
        [-0.7114149 ,  0.6930696 ],
        [ 1.356667  , -0.00182305],
        [-0.9881606 ,  0.8983895 ],
        [ 0.7274418 ,  0.33707333],
        [ 1.5584396 , -1.6080852 ],
        [-1.2263938 , -1.1571891 ],
        [-0.8424664 , -0.18973094]], dtype=float32)>,
 <tf.Tensor: shape=(10, 1), dtype=float32, numpy=
 array([[ 3.1511965 ],
        [ 6.956218  ],
        [ 2.3142188 ],
        [ 0.42951968],
        [ 6.9306803 ],
        [-0.84538233],
        [ 4.512798  ],
        [12.783806  ],
        [ 5.697266  ],
        [ 3.1667323 ]], dtype=float32)>)
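
To make the roles of batch_size and is_train concrete, here is a minimal NumPy sketch of what a shuffled, batched iterator does conceptually. The function name `minibatches`, its `seed` parameter, and the toy arrays are hypothetical and exist only for illustration; tf.data shuffles through a buffer and is implemented differently.

```python
import numpy as np

def minibatches(features, labels, batch_size, is_train=True, seed=0):
    """Yield (features, labels) minibatches; a hypothetical stand-in for a data iterator."""
    num_examples = len(features)
    indices = np.arange(num_examples)
    if is_train:
        # Shuffle the example order so batches differ from the stored order
        np.random.default_rng(seed).shuffle(indices)
    for start in range(0, num_examples, batch_size):
        batch = indices[start:start + batch_size]
        yield features[batch], labels[batch]

# Toy data: 25 examples with 2 features each (values are arbitrary)
X_demo = np.arange(50, dtype=np.float32).reshape(25, 2)
y_demo = np.arange(25, dtype=np.float32).reshape(25, 1)

batch_shapes = [(Xb.shape, yb.shape) for Xb, yb in minibatches(X_demo, y_demo, 10)]
print(batch_shapes)
```

Because 25 is not a multiple of 10, the last minibatch in this sketch holds only the 5 remaining examples.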

**3.3.3. Defining the Model**

When we implemented linear regression from scratch in Section 3.2, we defined our model parameters explicitly and coded up the calculations to produce output using basic linear algebra operations. You should know how to do this. But once your models get more complex, and once you have to do this nearly every day, you will be glad for the assistance. The situation is similar to coding up your own blog from scratch: doing it once or twice is rewarding and instructive, but you would be a lousy web developer if you spent a month reinventing the wheel every time you needed a blog. For linear regression, the model is a Sequential container holding a single fully connected (Dense) layer with one output.

In [6]:
# `keras` is the high-level API for TensorFlow
net = tf.keras.Sequential()
net.add(tf.keras.layers.Dense(1))
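
For reference, a fully connected layer with a single output computes the affine map Xw + b. The following NumPy sketch spells that out; the names `dense_forward`, `X_demo`, `w_demo`, and `b_demo` and the input values are hypothetical, and this is not how Keras implements the layer internally.

```python
import numpy as np

def dense_forward(X, w, b):
    """Affine map of a single-output fully connected layer: X @ w + b."""
    return X @ w + b

X_demo = np.array([[1.0, 2.0], [3.0, 4.0]])  # 2 examples, 2 features
w_demo = np.array([[2.0], [-3.4]])           # one weight per input feature
b_demo = np.array([4.2])                     # a single bias

y_hat = dense_forward(X_demo, w_demo, b_demo)
print(y_hat.shape)  # (2, 1): one prediction per example
```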

**3.3.4. Initializing Model Parameters**

Before using net, we need to initialize the model parameters, such as the weights and bias in the linear regression model. Deep learning frameworks often have a predefined way to initialize the parameters. Here we specify that each weight parameter should be randomly sampled from a normal distribution with mean 0 and standard deviation 0.01. The bias parameter will be initialized to zero, which is the default for a Dense layer.

In [7]:
# Re-create the model so that the initializer can be passed to the layer
initializer = tf.initializers.RandomNormal(stddev=0.01)
net = tf.keras.Sequential()
net.add(tf.keras.layers.Dense(1, kernel_initializer=initializer))
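
As a rough illustration of what this initialization amounts to, the NumPy sketch below draws the weights from a normal distribution with mean 0 and standard deviation 0.01 and sets the bias to zero. The variable names, the seed, and the explicit input dimension of 2 are assumptions made for the example; Keras infers the input dimension itself when the layer is first called.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility
num_inputs = 2                  # assumption: two features, as in the synthetic dataset

# Weights sampled from a normal distribution with mean 0 and standard deviation 0.01
w_init = rng.normal(loc=0.0, scale=0.01, size=(num_inputs, 1))
# Bias starts at zero
b_init = np.zeros(1)

print(w_init.shape, b_init)
```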

**3.3.5. Defining the Loss Function**

In [8]:
loss = tf.keras.losses.MeanSquaredError()
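
MeanSquaredError averages the squared differences between labels and predictions, with no factor of 1/2. A minimal NumPy sketch of that computation, using a hypothetical helper `mean_squared_error` and made-up values:

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences over all examples."""
    return np.mean((y_true - y_pred) ** 2)

y_true_demo = np.array([[1.0], [2.0], [3.0]])
y_pred_demo = np.array([[1.0], [2.0], [5.0]])

mse = mean_squared_error(y_true_demo, y_pred_demo)
print(mse)  # (0 + 0 + 4) / 3
```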

**3.3.6. Defining the Optimization Algorithm**

In [9]:
trainer = tf.keras.optimizers.SGD(learning_rate=0.03)
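
With its default settings (no momentum), SGD moves each parameter a small step against its gradient: parameter minus learning rate times gradient. The sketch below shows one such step in NumPy; `sgd_step` and the parameter and gradient values are hypothetical and only illustrate the update rule.

```python
import numpy as np

def sgd_step(params, grads, learning_rate=0.03):
    """Return updated parameters after one plain SGD step (no momentum)."""
    return [p - learning_rate * g for p, g in zip(params, grads)]

w_demo = np.array([[1.0], [-1.0]])
b_demo = np.array([0.5])
grad_w = np.array([[10.0], [-20.0]])  # made-up gradients for illustration
grad_b = np.array([1.0])

w_new, b_new = sgd_step([w_demo, b_demo], [grad_w, grad_b])
print(w_new.ravel(), b_new)
```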

**3.3.7. Training**

You might have noticed that expressing our model through the high-level APIs of a deep learning framework requires comparatively few lines of code. We did not have to individually allocate parameters, define our loss function, or implement minibatch stochastic gradient descent. Once we start working with much more complex models, the advantages of high-level APIs will grow considerably. However, once we have all the basic pieces in place, the training loop itself is strikingly similar to what we did when implementing everything from scratch.

In [10]:
num_epochs = 3
for epoch in range(num_epochs):
    for X, y in data_iter:
        with tf.GradientTape() as tape:
            # Keras losses expect arguments in the order (y_true, y_pred)
            l = loss(y, net(X, training=True))
        grads = tape.gradient(l, net.trainable_variables)
        trainer.apply_gradients(zip(grads, net.trainable_variables))
    # Evaluate on the full dataset after each epoch
    l = loss(labels, net(features))
    print(f'epoch {epoch + 1}, loss {l:f}')

epoch 1, loss 0.000280
epoch 2, loss 0.000100
epoch 3, loss 0.000100
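
To see what each iteration of the loop above does under the hood, here is a framework-free NumPy sketch of the same procedure: forward pass, mean squared error, gradient, and SGD update. All names are hypothetical, the noise scale of 0.01 and the seed are assumptions, and the gradient is written out by hand rather than recorded by a tape, so the printed losses will not match the TensorFlow run exactly.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Synthetic data: y = Xw + b + small Gaussian noise (noise scale 0.01 is an assumption)
true_w = np.array([[2.0], [-3.4]])
true_b = 4.2
X = rng.normal(size=(1000, 2))
y = X @ true_w + true_b + rng.normal(scale=0.01, size=(1000, 1))

# Same initialization and hyperparameters as in the text
w = rng.normal(scale=0.01, size=(2, 1))
b = np.zeros(1)
lr, batch_size, num_epochs = 0.03, 10, 3

for epoch in range(num_epochs):
    order = rng.permutation(len(X))  # reshuffle on every epoch
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        err = Xb @ w + b - yb               # prediction minus label
        grad_w = 2 * Xb.T @ err / len(idx)  # gradient of the mean squared error w.r.t. w
        grad_b = 2 * err.mean(axis=0)       # gradient of the mean squared error w.r.t. b
        w -= lr * grad_w                    # SGD update
        b -= lr * grad_b
    epoch_loss = np.mean((X @ w + b - y) ** 2)
    print(f'epoch {epoch + 1}, loss {epoch_loss:f}')
```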


In [11]:
# get_weights() returns [kernel, bias] for the single Dense layer
w = net.get_weights()[0]
print('error in estimating w', true_w - tf.reshape(w, true_w.shape))
b = net.get_weights()[1]
print('error in estimating b', true_b - b)

error in estimating w tf.Tensor([-0.00122023  0.00021625], shape=(2,), dtype=float32)
error in estimating b [0.00046062]
