## 🔮 Deep Learning in Practice


<img src="https://i.imgur.com/wjTxjxS.png">


### TensorFlow So Far


So far we have considered TensorFlow only as a tensor library (i.e., like NumPy) that also offers GPU support and automatic differentiation.
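As a quick reminder of what "tensor library with automatic differentiation" means in practice, here is a minimal sketch using `tf.GradientTape`. The function `y = x**2 + 2x` and the value `x = 3` are arbitrary choices made only for illustration.

```python
import tensorflow as tf

# Arbitrary illustrative input: a trainable scalar
x = tf.Variable(3.0)

# Record the operations applied to x so gradients can be computed afterwards
with tf.GradientTape() as tape:
    y = x**2 + 2.0 * x

# dy/dx = 2x + 2, which equals 8 at x = 3
grad = tape.gradient(y, x)
print(grad.numpy())

# Lists the GPUs TensorFlow can see (an empty list means CPU only)
print(tf.config.list_physical_devices("GPU"))
```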


<img src="https://i.imgur.com/mFrR2eg.png" />


<table>
  <tr> <th>tf</th> <th>tf.data</th> <th>tf.keras</th> <th>Deployment</th> </tr>
  <tr>
  <td> Wraps all other modules and offers tensor functionality on GPU and automatic differentiation </td>
  <td> Functionality related to reading and handling data </td>
  <td> Mainly exports layers, losses, etc. (similar to torch.nn) but actually does much more than that </td>
  <td> TensorFlow.js to train and deploy models in JavaScript, TFLite to deploy on edge devices (e.g., mobile), and other deployment tooling </td>
  </tr>
</table>

Notice that unlike PyTorch:
- There is much better support for deployment on different platforms
- TensorFlow has historically been more widely used than PyTorch in production, although PyTorch is now very common in research
- A large part of the high-level functionality is delegated to `tf.keras` (and we will see later that `keras` can operate completely independently of TensorFlow)

### 📊 Reading and Handling Data

Like PyTorch, we wrap our data in a Dataset object. 

Usually you will use `tf.data.Dataset.from_tensor_slices`. It takes a tensor (or anything that can be converted to one) of shape `(N1, N2, ...)` and splits it along the first axis into `N1` tensors of shape `(N2, N3, ...)`.

In [16]:
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices([[1, 2], [3, 4], [5, 6]])  # iterable but not indexable
print(dataset.element_spec)  # the dataset has no .shape; element_spec describes one element

print(f"It's an instance of Dataset: {isinstance(dataset, tf.data.Dataset)}")
for element in dataset:
    print(element)

TensorSpec(shape=(2,), dtype=tf.int32, name=None)
It's an instance of Dataset: True
tf.Tensor([1 2], shape=(2,), dtype=int32)
tf.Tensor([3 4], shape=(2,), dtype=int32)
tf.Tensor([5 6], shape=(2,), dtype=int32)
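The same slicing rule applies to higher-rank inputs. The sketch below uses an arbitrary all-zeros tensor of shape `(4, 2, 3)`, chosen only for illustration, to check that the first axis becomes the number of elements and the remaining axes become the element shape.

```python
import tensorflow as tf

# Illustrative tensor of shape (N1, N2, N3) = (4, 2, 3)
data = tf.zeros([4, 2, 3])
sliced = tf.data.Dataset.from_tensor_slices(data)

num_elements = len(sliced)                            # N1
element_shape = sliced.element_spec.shape.as_list()   # [N2, N3]
print(num_elements, element_shape)
```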


Let's look at some basic operations we can do with our dataset object:

In [17]:
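# filtering: keep only the elements whose first entry is greater than 2
filtered_dataset = dataset.filter(lambda x: x[0] > 2)
# mapping: square every entry of every element
mapped_dataset = dataset.map(lambda x: x**2)
# shuffling: a buffer as large as the dataset gives a uniform shuffle
shuffled_dataset = mapped_dataset.shuffle(buffer_size=len(mapped_dataset))
for element in shuffled_dataset:
    print(element)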

tf.Tensor([25 36], shape=(2,), dtype=int32)
tf.Tensor([1 4], shape=(2,), dtype=int32)
tf.Tensor([ 9 16], shape=(2,), dtype=int32)
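The filtered dataset above is created but never printed. The following sketch, assuming the same three-element toy dataset, shows its contents. It also illustrates that after `filter` the number of elements is no longer known statically, so `len()` would not work on the filtered dataset.

```python
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices([[1, 2], [3, 4], [5, 6]])
filtered_dataset = dataset.filter(lambda x: x[0] > 2)

# Materialize the surviving elements as plain Python lists
filtered_rows = [row.numpy().tolist() for row in filtered_dataset]
print(filtered_rows)

# filter() cannot know in advance how many elements pass the predicate,
# so the cardinality is reported as unknown
cardinality = int(filtered_dataset.cardinality().numpy())
print(cardinality == tf.data.UNKNOWN_CARDINALITY)
```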


Zipping is also possible (and can be handled with `from_tensor_slices`, as we will see later):

In [18]:
import tensorflow as tf

# Dummy features and labels
x_data = tf.data.Dataset.from_tensor_slices([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y_data = tf.data.Dataset.from_tensor_slices([0, 1, 0])  # Binary classification labels

# Combine features and labels
dataset = tf.data.Dataset.zip((x_data, y_data))

# Iterate over the combined dataset
for x, y in dataset:
    print("Feature:", x.numpy(), "Label:", y.numpy())

Feature: [1. 2.] Label: 0
Feature: [3. 4.] Label: 1
Feature: [5. 6.] Label: 0


Batching groups consecutive elements of the dataset into batches:

In [19]:
# Batch the dataset
batch_size = 2
batched_dataset = dataset.batch(batch_size, drop_remainder=True).prefetch(1)
# prefetch(1): while the GPU is processing the current batch, the CPU
# prepares the next batch so that it is immediately ready.

# Iterate over the batched dataset
for xb, yb in batched_dataset:
    print("Batch of Features:", xb.numpy())
    print("Batch of Labels:", yb.numpy())

# Unlike PyTorch, there is no separate DataLoader concept: a batched dataset is still a Dataset
print(f"It's an instance of Dataset: {isinstance(batched_dataset, tf.data.Dataset)}")

Batch of Features: [[1. 2.]
 [3. 4.]]
Batch of Labels: [0 1]
It's an instance of Dataset: True
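Only one batch is printed above because `drop_remainder=True` discards the final, incomplete batch. As a small sketch on the same kind of three-element toy data, this is what happens with `drop_remainder=False` (the default):

```python
import tensorflow as tf

toy = tf.data.Dataset.from_tensor_slices([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# With drop_remainder=False, the last (smaller) batch is kept
batches = toy.batch(2, drop_remainder=False)
batch_sizes = [int(batch.shape[0]) for batch in batches]
print(batch_sizes)
```

Dropping the remainder keeps every batch the same shape, at the cost of ignoring a few samples.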


Splitting is possible but requires more manual work:

In [20]:
import tensorflow as tf

# Example dataset
x_data, y_data = tf.random.uniform([10, 5, 5]), tf.random.uniform([10, 1])
dataset = tf.data.Dataset.from_tensor_slices((x_data, y_data))

# Batch the dataset
batch_size = 2
batched_dataset = dataset.batch(batch_size, drop_remainder=True)

# Determine the number of batches for training and testing
num_batches = len(list(batched_dataset))
train_size = int(0.8 * num_batches)
test_size = num_batches - train_size

# Split the dataset into training and testing
train_dataset = batched_dataset.take(train_size)            # take the first train_size elements (batches)
test_dataset = batched_dataset.skip(train_size)             # skip the first train_size elements then take rest

# Verify the split
print(f"Total batches: {num_batches}")
print(f"Training batches: {len(list(train_dataset))}")
print(f"Testing batches: {len(list(test_dataset))}")

Total batches: 5
Training batches: 4
Testing batches: 1
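The split above takes the first batches for training and the rest for testing, in their original order. If you want a random split, one option is to shuffle before calling `take` and `skip`. The sketch below is one way to do this, not the only one. It uses a toy `Dataset.range(10)` and an arbitrary seed, and it assumes `reshuffle_each_iteration=False` so that both halves see the same shuffled order and stay disjoint.

```python
import tensorflow as tf

# Toy dataset of ten integer "sample ids" (illustrative only)
dataset = tf.data.Dataset.range(10)

# Fix the shuffled order so take() and skip() operate on the same permutation
shuffled = dataset.shuffle(buffer_size=10, seed=0, reshuffle_each_iteration=False)

train_dataset = shuffled.take(8)
test_dataset = shuffled.skip(8)

train_ids = [int(v) for v in train_dataset]
test_ids = [int(v) for v in test_dataset]
print("train:", train_ids)
print("test:", test_ids)
```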


This is all we will need for this tutorial, but you can dive deeper in the official [TF tutorial](https://www.tensorflow.org/guide/data).

<img src="https://i.imgur.com/lEo6rtk.png">

Now let's shift our focus to Keras:

<img src="https://i.imgur.com/SbcHrMK.png">

As can be seen, it looks like a standalone deep learning library. In reality, it uses TensorFlow under the hood to implement its functionality (e.g., layers, losses, optimizers), just as `torch.nn` uses `torch` and its other modules to implement the same constructs.

To be precise, this was true for a long time, but as we will see at the end of the tutorial, `Keras 3` (the most recent major release) is no longer limited to supporting these constructs for TensorFlow alone. For now, let's ignore this.
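To make the relationship concrete, here is a minimal sketch, assuming Keras is accessed through `tf.keras` with the TensorFlow backend. A Keras layer accepts a plain TensorFlow tensor and returns one; the layer size and input shape are arbitrary.

```python
import tensorflow as tf

# A Keras layer applied directly to a plain TensorFlow tensor
layer = tf.keras.layers.Dense(units=1)
inputs = tf.ones([2, 3])
outputs = layer(inputs)

# The result is an ordinary TensorFlow tensor of shape (batch, units)
print(isinstance(outputs, tf.Tensor))
print(outputs.shape)
```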