
## TensorFlow 2.7.0 input shape breaking changes

If you're running [notebook 02](https://github.com/mrdbourke/tensorflow-deep-learning/blob/main/02_neural_network_classification_in_tensorflow.ipynb) of the TensorFlow Deep Learning course, you might run into some issues when working with `model_3`.

When [TensorFlow 2.7.0 was released](https://github.com/tensorflow/tensorflow/releases/tag/v2.7.0), one of the major breaking changes was:

> `tf.keras:`
> * The methods `Model.fit()`, `Model.predict()`, and `Model.evaluate()` will no longer uprank input data of shape `(batch_size,)` to become `(batch_size, 1)`. This enables Model subclasses to process scalar data in their `train_step()`/`test_step()`/`predict_step()` methods.
> Note that this change may break certain subclassed models. You can revert back to the previous behavior by adding upranking yourself in the `train_step()`/`test_step()`/`predict_step()` methods, e.g. `if x.shape.rank == 1: x = tf.expand_dims(x, axis=-1)`. Functional models as well as Sequential models built with an explicit input shape are not affected.

This means that inputs with shape `(batch_size,)` won't automatically be upranked to become `(batch_size, 1)`.
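
The release note's suggestion for subclassed models can be sketched roughly as follows. This is a minimal illustration, not code from the course notebook: the class name `UprankingModel`, its single `Dense` layer, and the toy data are all assumptions made for the example.

```python
import numpy as np
import tensorflow as tf


class UprankingModel(tf.keras.Model):
    """Hypothetical subclassed model that upranks rank-1 inputs itself."""

    def __init__(self):
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)

    def call(self, inputs):
        return self.dense(inputs)

    def train_step(self, data):
        x, y = data
        # Restore the pre-2.7.0 behaviour: (batch_size,) -> (batch_size, 1)
        if x.shape.rank == 1:
            x = tf.expand_dims(x, axis=-1)
        return super().train_step((x, y))


# Toy data (assumed for illustration): rank-1 features, rank-2 labels
X = np.arange(0, 40, 5, dtype=np.float32)  # shape (8,)
y = (X + 100).reshape(-1, 1)               # shape (8, 1)

model = UprankingModel()
model.compile(loss="mae", optimizer="adam")
history = model.fit(X, y, epochs=1, verbose=0)
```

The same guard would also be needed in `test_step()` and `predict_step()` if the model is evaluated or used for prediction on rank-1 inputs.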

In notebook 02, `X_reg_train` is of shape `(150, )` which is like `(batch_size=150, )`.

```python
X_reg_train.shape
# Output: (150,)
```

So for the model to work, we have to uprank it on the last dimension to be `(batch_size=150, 1)`:

```python
tf.expand_dims(X_reg_train, axis=-1).shape
# Output: TensorShape([150, 1])
```
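
Because `X_reg_train` is a NumPy array, the same upranking can also be done before the data ever reaches TensorFlow. A minimal sketch of three equivalent NumPy spellings, using the same `np.arange` construction as the notebook's regression data (the `via_*` variable names are illustrative only):

```python
import numpy as np

# Same construction as the notebook's regression features: 150 values, rank 1
X_reg_train = np.arange(0, 1000, 5)[:150]

# Three equivalent ways to add a trailing feature dimension
via_expand_dims = np.expand_dims(X_reg_train, axis=-1)
via_newaxis = X_reg_train[:, np.newaxis]
via_reshape = X_reg_train.reshape(-1, 1)

print(X_reg_train.shape)      # (150,)
print(via_expand_dims.shape)  # (150, 1)
```

All three produce the same `(150, 1)` array, so which one to use is mostly a matter of taste.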

If you hit shape errors while working through the notebook and are running TensorFlow 2.7.0 or later, this change is a likely cause.

See more in this GitHub discussion: https://github.com/mrdbourke/tensorflow-deep-learning/discussions/278 

## Before with `tf.keras.losses.BinaryCrossentropy()` (errors after TensorFlow 2.7.0)

You'll see something like: 

```
ValueError: Exception encountered when calling layer "sequential" (type Sequential).
    
    Input 0 of layer "dense" is incompatible with the layer: expected min_ndim=2, found ndim=1. Full shape received: (None,)
    
    Call arguments received:
      • inputs=tf.Tensor(shape=(None,), dtype=int64)
      • training=True
      • mask=None
```



In [None]:
import numpy as np
import tensorflow as tf
print(tf.__version__)

2.7.0


In [None]:
# Set random seed
tf.random.set_seed(42)

# Create some regression data
X_regression = np.arange(0, 1000, 5)
y_regression = np.arange(100, 1100, 5)

# Split it into training and test sets
X_reg_train = X_regression[:150]
X_reg_test = X_regression[150:]
y_reg_train = y_regression[:150]
y_reg_test = y_regression[150:]

In [None]:
# Set random seed
tf.random.set_seed(42)

# 1. Create the model (this time 3 layers)
model_3 = tf.keras.Sequential([
  tf.keras.layers.Dense(100), # add 100 dense neurons
  tf.keras.layers.Dense(10), # add another layer with 10 neurons
  tf.keras.layers.Dense(1)
])

# 2. Compile the model
model_3.compile(loss=tf.keras.losses.BinaryCrossentropy(),
                optimizer=tf.keras.optimizers.Adam(), # use Adam instead of SGD
                metrics=['accuracy'])

# 3. Fit our model to the data
model_3.fit(X_reg_train, y_reg_train, epochs=5)

Epoch 1/5


ValueError: ignored

## After with `tf.keras.losses.BinaryCrossentropy()` (no error)

The two main fixes here are:
1. Explicitly define the input shape to the model with `input_shape=(1,)`. `input_shape` describes a single sample and leaves out the batch dimension, so `(1,)` means one feature per sample.
2. Expand the dimensions of the input to `model.fit()` with `tf.expand_dims(x, axis=-1)` (where `x` is the variable you're fitting on).

In [None]:
# Set random seed
tf.random.set_seed(42)

# 1. Create the model (this time 3 layers)
model_3 = tf.keras.Sequential([
  # Before TensorFlow 2.7.0
  # tf.keras.layers.Dense(100), # add 100 dense neurons

  ## After TensorFlow 2.7.0 ##
  tf.keras.layers.Dense(100, input_shape=(1,)), # add 100 dense neurons with input_shape defined, (1,) = 1 feature per sample (batch dimension excluded)
  tf.keras.layers.Dense(10), # add another layer with 10 neurons
  tf.keras.layers.Dense(1)
])

# 2. Compile the model
model_3.compile(loss=tf.keras.losses.BinaryCrossentropy(),
                optimizer=tf.keras.optimizers.Adam(), # use Adam instead of SGD
                metrics=['accuracy'])

# 3. Fit our model to the data
model_3.fit(tf.expand_dims(X_reg_train, axis=-1), # <- expand dimension on final axis 
            y_reg_train, 
            epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7fc888f6d2d0>

## Before with `tf.keras.losses.mae` (errors after TensorFlow 2.7.0)

You'll see the same issue if you're using MAE as a loss function.

The cause is again the input data, not the loss function.

Does it satisfy the new requirement of being `(batch_size, 1)`? Here it doesn't: `X_reg_train` is still `(150,)`.

Remember, one of the most common issues with TensorFlow and deep learning is mismatched input and output shapes.

Run this code below and you'll see an error like:

```
ValueError: Exception encountered when calling layer "sequential" (type Sequential).
    
    Input 0 of layer "dense" is incompatible with the layer: expected min_ndim=2, found ndim=1. Full shape received: (None,)
    
    Call arguments received:
      • inputs=tf.Tensor(shape=(None,), dtype=int64)
      • training=True
      • mask=None
```

In [None]:
# Setup random seed
tf.random.set_seed(42)

# Recreate the model
model_3 = tf.keras.Sequential([
  tf.keras.layers.Dense(100),
  tf.keras.layers.Dense(10),
  tf.keras.layers.Dense(1)
])

# Change the loss and metrics of our compiled model
model_3.compile(loss=tf.keras.losses.mae, # change the loss function to be regression-specific
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['mae']) # change the metric to be regression-specific

# Fit the recompiled model
model_3.fit(X_reg_train, 
            y_reg_train, 
            epochs=5)

Epoch 1/5


ValueError: ignored

## After with `tf.keras.losses.mae` (no error)

The fix comes in the input to `model.fit()`.

The input was previously `X_reg_train`; now it's `tf.expand_dims(X_reg_train, axis=-1)`.

In [None]:
# Setup random seed
tf.random.set_seed(42)

# Recreate the model
model_3 = tf.keras.Sequential([
  tf.keras.layers.Dense(100),
  tf.keras.layers.Dense(10),
  tf.keras.layers.Dense(1)
])

# Change the loss and metrics of our compiled model
model_3.compile(loss=tf.keras.losses.mae, # change the loss function to be regression-specific
                optimizer=tf.keras.optimizers.Adam(),
                metrics=['mae']) # change the metric to be regression-specific

# Fit the recompiled model
model_3.fit(tf.expand_dims(X_reg_train, axis=-1), # <- Expand the final dimension of the input 
            y_reg_train, 
            epochs=5)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7fc8899619d0>

## Why this happens

This happens because `model.fit()` no longer upranks one-dimensional tensors from `(batch_size,)` to `(batch_size, 1)`.

So we have to either:
* Do it ourselves with `tf.expand_dims()`
* Define the `input_shape` to the model

Both of these were seen above.

To illustrate once more, let's check the dimensions of the input data `X_reg_train`.

In [None]:
# Original input dimensions
X_reg_train.ndim, X_reg_train.shape

(1, (150,))

In [None]:
# Upranked input dimensions
tf.expand_dims(X_reg_train, axis=-1).ndim, tf.expand_dims(X_reg_train, axis=-1).shape

(2, TensorShape([150, 1]))
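
If the same data might arrive either as `(batch_size,)` or already as `(batch_size, 1)`, the upranking can be made conditional on the rank. A minimal sketch, assuming NumPy inputs; the helper name `ensure_2d` is hypothetical and not part of TensorFlow or the course code:

```python
import numpy as np


def ensure_2d(features):
    """Hypothetical helper: uprank (batch_size,) to (batch_size, 1), leave other ranks alone."""
    features = np.asarray(features)
    if features.ndim == 1:
        return np.expand_dims(features, axis=-1)
    return features


# Same construction as the notebook's regression features
X_reg_train = np.arange(0, 1000, 5)[:150]

print(ensure_2d(X_reg_train).shape)             # (150, 1)
print(ensure_2d(ensure_2d(X_reg_train)).shape)  # (150, 1), already rank 2 so unchanged
```

Calling the helper on every input before `model.fit()` keeps the fix in one place instead of repeating `tf.expand_dims()` at each call site.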