* The default Colab environment comes with TensorFlow and Keras preinstalled.
* Any other library can be installed with the pip command, prefixed with the **!** symbol so that it runs as a shell command.

```python
!pip install package_name
```

* To use the GPU runtime with Colab, select Runtime > Change Runtime Type in the menu and select GPU for the Hardware Accelerator.
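Once the runtime has been switched, a quick check along these lines should confirm whether TensorFlow can see the accelerator. This is a minimal sketch: the variable name `gpus` is only illustrative, and on a CPU-only runtime the list is simply empty.

```python
import tensorflow as tf

# List the GPUs visible to TensorFlow; the list is empty on a CPU-only runtime.
gpus = tf.config.list_physical_devices("GPU")
print("GPUs available:", len(gpus))
```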

***

# Constant Tensors and Variables

* Everything in TensorFlow operates on tensors, and a tensor must be created with an initial value.

* For instance, a tensor can be created with all zeros, all ones, or random values.

In [None]:
import tensorflow as tf

## All-Zero Tensors

In [None]:
x = tf.zeros(shape=(2,1))
print(x)

tf.Tensor(
[[0.]
 [0.]], shape=(2, 1), dtype=float32)


## All-One Tensors

In [None]:
y = tf.ones(shape=(2,1))
print(y)

tf.Tensor(
[[1.]
 [1.]], shape=(2, 1), dtype=float32)


## Random Value Tensors

In [None]:
z = tf.random.normal(shape=(2,1), mean=0., stddev=1.)
print(z)

tf.Tensor(
[[-0.4876375]
 [-2.4249372]], shape=(2, 1), dtype=float32)


This creates a tensor of random values drawn from a normal distribution with mean 0 and standard deviation 1.

In [None]:
w = tf.random.uniform(shape=(2,1), minval=0., maxval=1.)
print(w)

tf.Tensor(
[[0.02193677]
 [0.06778741]], shape=(2, 1), dtype=float32)


This creates a tensor of random values drawn from a uniform distribution between 0 and 1.

* A significant difference between NumPy arrays and TensorFlow tensors is that TensorFlow tensors **aren’t assignable**.
* **tf.Variable** is the class meant to manage modifiable state in TensorFlow.
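To make the first point concrete, the sketch below tries the same item assignment on a NumPy array and on a TensorFlow tensor. The names `a`, `t`, and `assignment_failed` are illustrative only.

```python
import numpy as np
import tensorflow as tf

# NumPy arrays support item assignment.
a = np.ones(shape=(2, 2))
a[0, 0] = 0.

# TensorFlow tensors do not: the same statement raises a TypeError.
t = tf.ones(shape=(2, 2))
try:
    t[0, 0] = 0.
    assignment_failed = False
except TypeError as err:
    assignment_failed = True
    print("Assignment failed:", err)
```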

In [None]:
q = tf.Variable(initial_value=tf.random.normal(shape=(2,1)))
print(q)

<tf.Variable 'Variable:0' shape=(2, 1) dtype=float32, numpy=
array([[-1.6440791 ],
       [-0.29095608]], dtype=float32)>


This creates a variable; an initial value, here a random tensor, must be provided.

In [None]:
q.assign(tf.ones(shape=(2,1)))
print(q)

<tf.Variable 'Variable:0' shape=(2, 1) dtype=float32, numpy=
array([[1.],
       [1.]], dtype=float32)>


The state of a variable can be overwritten in place with its **assign** method.

In [None]:
q[0,0].assign(7.)
print(q)

<tf.Variable 'Variable:0' shape=(2, 1) dtype=float32, numpy=
array([[7.],
       [1.]], dtype=float32)>


**assign** also works on a subset of the variable's values, such as a single element.

In [None]:
q.assign_add(tf.ones((2,1)))
print(q)

<tf.Variable 'Variable:0' shape=(2, 1) dtype=float32, numpy=
array([[8.],
       [2.]], dtype=float32)>


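**assign_add** is the in-place equivalent of += (and **assign_sub** of -=). TensorFlow also offers basic arithmetic operations such as **square**, **sqrt**, and **matmul**.

**matmul** computes the matrix product of two tensors, not their element-wise product.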
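The operations named above can be combined as in the following sketch. The tensor names are illustrative, and all-ones inputs are used so that the results are easy to verify by hand.

```python
import tensorflow as tf

a = tf.ones((2, 2))
b = tf.square(a)       # element-wise square: all ones
c = tf.sqrt(a)         # element-wise square root: all ones
d = b + c              # element-wise sum: all twos
e = tf.matmul(a, b)    # matrix product: each entry is 1*1 + 1*1 = 2
e *= d                 # element-wise product: all fours
print(e)
```

Each operation returns a new tensor; `e *= d` rebinds the name `e` rather than modifying a tensor in place.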

***

# GradientTape

* So far, TensorFlow looks a lot like NumPy. But here’s something NumPy can’t do: retrieve the gradient of any differentiable expression with respect to any of its inputs.

In [None]:
input_var = tf.Variable(initial_value=3.)
print(input_var)

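with tf.GradientTape() as tape:
  # Operations executed inside the tape's scope are recorded.
  result = tf.square(input_var)

# Query the gradient after leaving the scope: d(x^2)/dx at x = 3 is 6.
gradient = tape.gradient(result, input_var)

print(result)
print(gradient)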

<tf.Variable 'Variable:0' shape=() dtype=float32, numpy=3.0>
tf.Tensor(9.0, shape=(), dtype=float32)
tf.Tensor(6.0, shape=(), dtype=float32)


* This is most commonly used to retrieve the gradients of the loss of a model with respect to its weights: **gradients = tape.gradient(loss, weights)**.
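A minimal sketch of that pattern is shown below, assuming a tiny linear model. The names `W`, `b`, `x`, and the mean-of-squares loss are made up for illustration and do not come from a real training setup.

```python
import tensorflow as tf

# Hypothetical weights of a tiny linear model (names chosen for illustration).
W = tf.Variable(tf.random.uniform((2, 2)))
b = tf.Variable(tf.zeros((2,)))
x = tf.random.uniform((3, 2))  # made-up batch of 3 inputs

with tf.GradientTape() as tape:
    y = tf.matmul(x, W) + b
    loss = tf.reduce_mean(tf.square(y))

# Passing a list of variables returns a list of gradients in the same order.
grad_W, grad_b = tape.gradient(loss, [W, b])
print(grad_W.shape, grad_b.shape)
```

Each gradient has the same shape as the variable it corresponds to, which is what lets an optimizer apply it as an update.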

> So far, you’ve only seen the case where the input tensors in _tape.gradient()_ were TensorFlow variables. It’s actually possible for these inputs to be any arbitrary tensor. However, only trainable variables are tracked by default. With a constant tensor, you’d have to manually mark it as being tracked by calling _tape.watch()_ on it.



In [None]:
input_const = tf.constant(3.)
print(input_const)

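with tf.GradientTape() as tape:
  # Constants are not tracked by default, so mark this one explicitly.
  tape.watch(input_const)
  result = tf.square(input_const)

# Query the gradient after leaving the tape's scope.
gradient = tape.gradient(result, input_const)

print(result)
print(gradient)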

tf.Tensor(3.0, shape=(), dtype=float32)
tf.Tensor(9.0, shape=(), dtype=float32)
tf.Tensor(6.0, shape=(), dtype=float32)


> Why is this necessary? Because it would be too expensive to preemptively store the information required to compute the gradient of anything with respect to anything. To avoid wasting resources, the tape needs to know what to watch.
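The effect of skipping _tape.watch()_ can be seen in a short sketch like the one below (variable names are illustrative): the tape has recorded nothing about the constant, so the gradient comes back as `None`.

```python
import tensorflow as tf

unwatched_const = tf.constant(3.)

with tf.GradientTape() as tape:
    # No tape.watch() call here, so the constant is not tracked.
    result = tf.square(unwatched_const)

gradient = tape.gradient(result, unwatched_const)
print(gradient)  # None
```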

* The gradient tape is a powerful utility, even capable of computing second-order gradients.
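Second-order gradients can be obtained by nesting two tapes, as in the sketch below. The function y = x³ and the variable names are chosen only for illustration; at x = 3 the first derivative 3x² is 27 and the second derivative 6x is 18.

```python
import tensorflow as tf

x = tf.Variable(3.)

with tf.GradientTape() as outer_tape:
    with tf.GradientTape() as inner_tape:
        y = x ** 3
    # Computed inside the outer tape's scope so that this step is recorded too.
    dy_dx = inner_tape.gradient(y, x)

# Differentiating the first derivative gives the second derivative.
d2y_dx2 = outer_tape.gradient(dy_dx, x)

print(dy_dx)
print(d2y_dx2)
```

Here the inner gradient is deliberately computed inside the outer tape's scope: the outer tape has to record that computation in order to differentiate through it.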