# Background

## Introduction

CAPTCHAs are a common verification tool for blocking automated bots. In this notebook, we are going to build a convolutional neural network (CNN) to break one.

## Loading Prerequisites and Dependencies

In [155]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
import pathlib
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
from generalUtils.generalUtils import LogFile
from generalUtils.mlUtils import KFoldCV
from generalUtils.notification import slack_message
from PIL import Image

In [156]:
DATA_PATH = pathlib.Path('captcha')
LABEL_PATH = DATA_PATH / 'captcha_code.csv'
PHOTO_PATH = DATA_PATH / 'captcha'

In [157]:
LABEL_FILE_COLUMN = 'LabelFile'
LABEL_TARGET_COLUMN = 'Target'

---

# Loading Data and Processing

## Checking the label dataframe

In [158]:
labelDf = pd.read_csv(LABEL_PATH, header = None, names = [LABEL_FILE_COLUMN, LABEL_TARGET_COLUMN])

In [159]:
labelDf.sample(5)

|      | LabelFile        | Target |
|------|------------------|--------|
| 1526 | 1527_captcha.png | 5161   |
| 1765 | 1766_captcha.png | 9716   |
| 329  | 330_captcha.png  | 1857   |
| 723  | 724_captcha.png  | 2426   |
| 391  | 392_captcha.png  | 882    |


In [160]:
labelDf.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2001 entries, 0 to 2000
Data columns (total 2 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   LabelFile  2001 non-null   object
 1   Target     2001 non-null   int64 
dtypes: int64(1), object(1)
memory usage: 31.4+ KB


We can see that our "Target" column has been inferred as the type `int`, which is not what we want because the **leading** zeros are part of the captcha code (for example, `0882` is read as `882`). Thus we have to parse the target column into zero-padded text.

In [161]:
def parse_int_to_four_digit_text(integer):
    length_of_targeted_text = 4
    stringInteger = str(integer)
    length_of_string_integer = len(stringInteger)
    if length_of_string_integer < length_of_targeted_text:
        return (length_of_targeted_text - length_of_string_integer)*"0" + stringInteger
    return stringInteger

In [162]:
parse_int_to_four_digit_text(1)

'0001'

Our parsing function works. Now, we will use the `Series.apply` method to parse every value in the target column into our targeted format.

In [163]:
labelDf['Target'] = labelDf['Target'].astype(str)

In [164]:
labelDf['Target'] = labelDf.Target.apply(parse_int_to_four_digit_text)

In [165]:
labelDf.sample(5)

|      | LabelFile        | Target |
|------|------------------|--------|
| 564  | 565_captcha.png  | 4700   |
| 1084 | 1085_captcha.png | 636    |
| 1336 | 1337_captcha.png | 9534   |
| 950  | 951_captcha.png  | 5379   |
| 333  | 334_captcha.png  | 3004   |


In [166]:
labelDf.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2001 entries, 0 to 2000
Data columns (total 2 columns):
 #   Column     Non-Null Count  Dtype 
---  ------     --------------  ----- 
 0   LabelFile  2001 non-null   object
 1   Target     2001 non-null   object
dtypes: object(2)
memory usage: 31.4+ KB


Now we can see that the `Target` column is stored as text (dtype `object`) rather than `int64`.
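As an alternative to converting after loading, the column can be read as text in the first place. The sketch below assumes the `dtype` argument of `pd.read_csv` and the `Series.str.zfill` string method; the in-memory CSV text is a hypothetical stand-in for `captcha_code.csv`, not the real file.

```python
import io

import pandas as pd

# Hypothetical stand-in for captcha_code.csv (file name, captcha code).
csv_text = "1_captcha.png,0042\n2_captcha.png,5161\n3_captcha.png,7\n"

label_df = pd.read_csv(
    io.StringIO(csv_text),
    header=None,
    names=["LabelFile", "Target"],
    dtype={"Target": str},  # read the codes as text so leading zeros survive
)

# Pad any code shorter than 4 characters with zeros on the left.
label_df["Target"] = label_df["Target"].str.zfill(4)
targets = label_df["Target"].tolist()
print(targets)  # ['0042', '5161', '0007']
```

Reading the column as text avoids the round trip through `int64`, and `zfill` only matters if the file itself stores codes without their leading zeros.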

## Checking our data

As each row of `labelDf` gives the file name of an image, we are going to split the file names and targets into training and testing data first, and load the images afterwards.

In [167]:
X = labelDf.LabelFile.values
y = labelDf.Target.values

In [168]:
X

array(['1_captcha.png', '2_captcha.png', '3_captcha.png', ...,
       '1999_captcha.png', '2000_captcha.png', '2001_captcha.png'],
      dtype=object)

In [169]:
_X_train, _X_test, _y_train, _y_test = train_test_split(X, y, test_size = 0.2)

Now our data is split into training and testing data. We need to write a function to **parse** an input image into a NumPy array.

In [170]:
def image_file_path_to_nd_array(filepath):
    image = Image.open(filepath)
    data = np.asarray(image)
    print(type(data))
    print(data.shape)
    return data

Let's test the function and have a look at the data.

In [171]:
image_file_path_to_nd_array(PHOTO_PATH / _X_train[0])

<class 'numpy.ndarray'>
(40, 135, 4)


array([[[204, 170, 204, 255],
        [204, 170, 204, 255],
        [204, 180, 204, 255],
        ...,
        [204, 199, 204, 255],
        [204, 213, 204, 255],
        [204, 213, 204, 255]],

       [[204, 175, 204, 255],
        [204, 175, 204, 255],
        [205, 186, 205, 255],
        ...,
        [205, 199, 201, 255],
        [204, 207, 197, 255],
        [204, 207, 197, 255]],

       [[204, 196, 204, 255],
        [204, 196, 204, 255],
        [211, 207, 211, 255],
        ...,
        [213, 197, 192, 255],
        [204, 186, 172, 255],
        [204, 186, 172, 255]],

       ...,

       [[204, 196, 204, 255],
        [204, 196, 204, 255],
        [211, 207, 207, 255],
        ...,
        [200, 210, 207, 255],
        [184, 196, 204, 255],
        [184, 196, 204, 255]],

       [[204, 178, 204, 255],
        [204, 178, 204, 255],
        [206, 188, 196, 255],
        ...,
        [178, 191, 194, 255],
        [162, 178, 204, 255],
        [162, 178, 204, 255]],

       [[204

We can see that the image has a height of **40** pixels, a width of **135** pixels, and 4 channels. In the output above, the last channel is always 255; it is the alpha channel, i.e. the **opacity** of the image. Since it is constant, it carries no information about the digits, so we can drop it.

In [172]:
def image_file_path_to_nd_array(filepath):
    image = Image.open(filepath)
    data = np.asarray(image)
    data = data[..., :3]  # keep every leading dimension; in the last (channel) dimension take only the first 3 values (RGB), dropping alpha
    return data

In [173]:
image_file_path_to_nd_array(PHOTO_PATH / _X_train[0])

array([[[204, 170, 204],
        [204, 170, 204],
        [204, 180, 204],
        ...,
        [204, 199, 204],
        [204, 213, 204],
        [204, 213, 204]],

       [[204, 175, 204],
        [204, 175, 204],
        [205, 186, 205],
        ...,
        [205, 199, 201],
        [204, 207, 197],
        [204, 207, 197]],

       [[204, 196, 204],
        [204, 196, 204],
        [211, 207, 211],
        ...,
        [213, 197, 192],
        [204, 186, 172],
        [204, 186, 172]],

       ...,

       [[204, 196, 204],
        [204, 196, 204],
        [211, 207, 207],
        ...,
        [200, 210, 207],
        [184, 196, 204],
        [184, 196, 204]],

       [[204, 178, 204],
        [204, 178, 204],
        [206, 188, 196],
        ...,
        [178, 191, 194],
        [162, 178, 204],
        [162, 178, 204]],

       [[204, 170, 204],
        [204, 170, 204],
        [204, 180, 191],
        ...,
        [168, 183, 188],
        [153, 170, 204],
        [153, 170, 204]]

Now we can see that our data has the expected shape. Next, we loop over the file names to load every image as a NumPy array, so that the model can later apply convolution and pooling to them.
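The alpha-dropping step can be checked without the dataset. The sketch below builds a synthetic RGBA image (an assumption standing in for a real captcha file) and compares slicing the array with Pillow's `Image.convert("RGB")`, which is another way to discard the alpha channel.

```python
import numpy as np
from PIL import Image

# Synthetic 40x135 RGBA image standing in for a captcha file (assumption).
rgba = np.full((40, 135, 4), 204, dtype=np.uint8)
rgba[..., 3] = 255  # fully opaque alpha channel
image = Image.fromarray(rgba)  # 4-channel uint8 data is interpreted as RGBA

# Option 1: slice the array, keeping only the first 3 channels.
sliced = np.asarray(image)[..., :3]

# Option 2: let Pillow convert the image to RGB before creating the array.
converted = np.asarray(image.convert("RGB"))

print(image.mode, sliced.shape, np.array_equal(sliced, converted))
```

Both routes give a `(40, 135, 3)` array here; the conversion route also works if some files turn out to be stored in a different mode.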

## Getting the NumPy array representation of the training and testing datasets

In [174]:
X_train = []
X_test = []

In [175]:
for trainDataPath in _X_train:
    X_train.append(image_file_path_to_nd_array(PHOTO_PATH / trainDataPath))
for testDataPath in _X_test:
    X_test.append(image_file_path_to_nd_array(PHOTO_PATH / testDataPath))

In [176]:
X_train[:3]

[array([[[204, 170, 204],
         [204, 170, 204],
         [204, 180, 204],
         ...,
         [204, 199, 204],
         [204, 213, 204],
         [204, 213, 204]],
 
        [[204, 175, 204],
         [204, 175, 204],
         [205, 186, 205],
         ...,
         [205, 199, 201],
         [204, 207, 197],
         [204, 207, 197]],
 
        [[204, 196, 204],
         [204, 196, 204],
         [211, 207, 211],
         ...,
         [213, 197, 192],
         [204, 186, 172],
         [204, 186, 172]],
 
        ...,
 
        [[204, 196, 204],
         [204, 196, 204],
         [211, 207, 207],
         ...,
         [200, 210, 207],
         [184, 196, 204],
         [184, 196, 204]],
 
        [[204, 178, 204],
         [204, 178, 204],
         [206, 188, 196],
         ...,
         [178, 191, 194],
         [162, 178, 204],
         [162, 178, 204]],
 
        [[204, 170, 204],
         [204, 170, 204],
         [204, 180, 191],
         ...,
         [168, 183, 188],
  

In [177]:
X_train = np.array(X_train)
X_test = np.array(X_test)

In [178]:
X_train

array([[[[204, 170, 204],
         [204, 170, 204],
         [204, 180, 204],
         ...,
         [204, 199, 204],
         [204, 213, 204],
         [204, 213, 204]],

        [[204, 175, 204],
         [204, 175, 204],
         [205, 186, 205],
         ...,
         [205, 199, 201],
         [204, 207, 197],
         [204, 207, 197]],

        [[204, 196, 204],
         [204, 196, 204],
         [211, 207, 211],
         ...,
         [213, 197, 192],
         [204, 186, 172],
         [204, 186, 172]],

        ...,

        [[204, 196, 204],
         [204, 196, 204],
         [211, 207, 207],
         ...,
         [200, 210, 207],
         [184, 196, 204],
         [184, 196, 204]],

        [[204, 178, 204],
         [204, 178, 204],
         [206, 188, 196],
         ...,
         [178, 191, 194],
         [162, 178, 204],
         [162, 178, 204]],

        [[204, 170, 204],
         [204, 170, 204],
         [204, 180, 191],
         ...,
         [168, 183, 188],
        

Now we have created the set of training data and testing data.
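Converting a list of images into one array only works cleanly when every image has the same shape. A minimal sketch of a guarded stacking step follows; `stack_images` is a hypothetical helper, and the zero-filled arrays stand in for loaded captcha images.

```python
import numpy as np


def stack_images(images, expected_shape):
    """Stack equally sized image arrays into one (N, H, W, C) array."""
    for index, image in enumerate(images):
        if image.shape != expected_shape:
            raise ValueError(
                f"image {index} has shape {image.shape}, expected {expected_shape}"
            )
    return np.stack(images)


# Synthetic stand-ins for loaded captcha images (assumption).
fake_images = [np.zeros((40, 135, 3), dtype=np.uint8) for _ in range(5)]
batch = stack_images(fake_images, expected_shape=(40, 135, 3))
print(batch.shape)  # (5, 40, 135, 3)
```

Checking the shape up front reports which image is the odd one out, which is easier to debug than a failure later in the pipeline.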

## Defining the model constants

In this model, we will eventually predict 4 digits, each of which is one of the 10 integers `0-9`. Thus, the target has 4 rows of 10 labels each.<br>
Also, as we are doing multi-class classification, we will use `softmax` in the last layer.<br>
In the previous investigation, we found that the dimension of a captcha image is $135\times 40$ (width $\times$ height).

In [179]:
WIDTH = 135
HEIGHT = 40
NUM_CHANNELS = 3
NUM_LABELS = 10
NUM_DIGITS = 4
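
With 4 digits of 10 classes each, it matters which axis the softmax is taken over. The NumPy sketch below (random numbers standing in for model outputs, and a hand-written `softmax` helper, both assumptions) contrasts one softmax over all 40 values with one softmax per digit row.

```python
import numpy as np

NUM_DIGITS, NUM_LABELS = 4, 10


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


rng = np.random.default_rng(0)
logits = rng.normal(size=NUM_DIGITS * NUM_LABELS)  # stand-in for raw model outputs

# One softmax over all 40 values: the whole 4x10 matrix sums to 1.
flat = softmax(logits).reshape(NUM_DIGITS, NUM_LABELS)

# One softmax per digit: each row of 10 values sums to 1.
per_digit = softmax(logits.reshape(NUM_DIGITS, NUM_LABELS), axis=-1)

print(flat.sum())               # total probability over all 40 entries
print(per_digit.sum(axis=-1))   # one total per digit
```

The predicted digit (the argmax of each row) is the same in both cases, but only the per-digit version gives each digit its own probability distribution.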

## Reshaping our data

The `Conv2D` input layer expects our data as a 4-dimensional tensor: (number of samples, height, width, number of channels). We `reshape` the arrays to that shape and cast them to `float32` so that they can be fed to the model correctly.

In [180]:
X_train = X_train.reshape(X_train.shape[0], HEIGHT, WIDTH, NUM_CHANNELS).astype("float32")
X_test = X_test.reshape(X_test.shape[0], HEIGHT, WIDTH, NUM_CHANNELS).astype("float32")

In [181]:
X_test.shape

(401, 40, 135, 3)

In [182]:
X_train.shape

(1600, 40, 135, 3)

Now our data has the correct dimensions to be fed into our model.

## Data Normalization

We can see that all of our data are **non-negative integers** between 0 and 255, most of them much larger than 1. Inputs on this scale can make gradient-based training slow or unstable, so the model does not learn effectively. Thus, we need to **normalize** our data from the range 0 to 255 to the range 0 to 1.

In [183]:
X_train /= 255
X_test /= 255

In [184]:
X_train[:3]

array([[[[0.8       , 0.6666667 , 0.8       ],
         [0.8       , 0.6666667 , 0.8       ],
         [0.8       , 0.7058824 , 0.8       ],
         ...,
         [0.8       , 0.78039217, 0.8       ],
         [0.8       , 0.8352941 , 0.8       ],
         [0.8       , 0.8352941 , 0.8       ]],

        [[0.8       , 0.6862745 , 0.8       ],
         [0.8       , 0.6862745 , 0.8       ],
         [0.8039216 , 0.7294118 , 0.8039216 ],
         ...,
         [0.8039216 , 0.78039217, 0.7882353 ],
         [0.8       , 0.8117647 , 0.77254903],
         [0.8       , 0.8117647 , 0.77254903]],

        [[0.8       , 0.76862746, 0.8       ],
         [0.8       , 0.76862746, 0.8       ],
         [0.827451  , 0.8117647 , 0.827451  ],
         ...,
         [0.8352941 , 0.77254903, 0.7529412 ],
         [0.8       , 0.7294118 , 0.6745098 ],
         [0.8       , 0.7294118 , 0.6745098 ]],

        ...,

        [[0.8       , 0.76862746, 0.8       ],
         [0.8       , 0.76862746, 0.8       ]

Now our data is also normalized, and our feature data is ready for the model.
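
The order of the cast and the division matters: image arrays start out as `uint8`, and NumPy refuses to write the floating-point result of an in-place division back into an integer array. A small sketch with a made-up three-value pixel array:

```python
import numpy as np

# A single made-up pixel with the smallest, a typical, and the largest value.
pixels = np.array([[[0, 204, 255]]], dtype=np.uint8)

try:
    pixels_copy = pixels.copy()
    pixels_copy /= 255  # in-place true division cannot store floats in uint8
    in_place_failed = False
except TypeError:
    in_place_failed = True

# Casting to float32 first makes the division well defined.
scaled = pixels.astype("float32") / 255
print(in_place_failed, scaled.min(), scaled.max())
```

This is why the arrays are cast to `float32` before `X_train /= 255` is applied.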

## Processing the target data

In [185]:
_y_train

array(['3907', '7262', '6576', ..., '9128', '5819', '1307'], dtype=object)

In [186]:
_y_test

array(['7379', '6747', '4604', '4786', '9386', '6400', '4386', '8302',
       '5016', '1685', '8334', '9736', '7887', '3196', '9434', '2334',
       '9981', '3771', '7026', '8557', '1560', '7318', '6755', '1879',
       '4245', '6844', '2637', '6684', '4988', '0542', '8861', '2128',
       '7165', '9693', '7221', '8186', '0555', '3575', '9149', '9460',
       '1801', '8775', '7435', '3148', '0810', '5587', '3910', '2966',
       '6564', '1655', '7182', '3883', '8432', '6413', '3366', '5463',
       '1621', '8658', '0084', '5531', '3760', '1327', '4014', '1964',
       '0807', '3953', '5244', '8484', '9183', '9715', '6322', '1951',
       '8919', '6291', '8382', '9461', '0780', '2138', '7677', '0510',
       '3483', '1230', '2178', '4956', '5754', '0319', '4200', '8115',
       '2499', '9169', '6751', '7792', '6208', '8874', '3125', '4173',
       '7160', '0791', '4983', '8423', '3008', '5423', '8615', '4665',
       '7989', '8917', '3163', '0459', '3042', '9768', '4575', '2286',
      

Our target data is in the form of **4-digit text**, since we need to keep the leading zeros. From each text, we can then produce a $4\times 10$ one-hot matrix to match the **softmax** output.

In [187]:
def parse_digit_text_to_category(digitText):
    digitOneHotList = []
    for digit in digitText:  # split the text into single characters and one-hot encode each digit
        digitOneHotList.append(to_categorical(int(digit), num_classes = NUM_LABELS))
    return np.array(digitOneHotList)

Let's have a look at the result of the parsing.

In [188]:
parse_digit_text_to_category('0245')

array([[1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.]], dtype=float32)

We can see that it is producing the correct result.

In [189]:
y_train = []
y_test = []

In [190]:
for trainTargetData in _y_train:
    y_train.append(parse_digit_text_to_category(trainTargetData))

In [191]:
y_train = np.array(y_train)

In [192]:
y_train[:3]

array([[[0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.],
        [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]],

       [[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 1., 0., 0., 0., 0., 0., 0., 0.]],

       [[0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.]]], dtype=float32)

The first 3 `y_train` data are `3907`, `7262` and `6576`, which matches the first three entries of `_y_train`.
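
Reading a label back from its one-hot rows is a matter of taking the argmax of each row. The sketch below uses two hypothetical helpers, `encode_digits` and `decode_digits`, written in plain NumPy so that it runs without TensorFlow.

```python
import numpy as np

NUM_LABELS = 10


def encode_digits(text):
    """One-hot encode a digit string into a (len(text), 10) float32 array."""
    return np.eye(NUM_LABELS, dtype="float32")[[int(ch) for ch in text]]


def decode_digits(one_hot):
    """Turn a (num_digits, 10) array of one-hot rows or scores back into text."""
    return "".join(str(index) for index in one_hot.argmax(axis=-1))


encoded = encode_digits("0245")
print(encoded.shape)           # (4, 10)
print(decode_digits(encoded))  # 0245
```

The same `decode_digits` call can be applied to a model prediction of shape `(4, 10)`, since argmax picks the most probable class in each row.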

In [193]:
for testTargetData in _y_test:
    y_test.append(parse_digit_text_to_category(testTargetData))

In [194]:
y_test = np.array(y_test)

In [195]:
y_test[:3]

array([[[0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 1., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0., 0., 1.]],

       [[0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 1., 0., 0.]],

       [[0., 0., 0., 0., 1., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 1., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 1., 0., 0., 0., 0., 0.]]], dtype=float32)

In [196]:
y_train.shape

(1600, 4, 10)

In [197]:
y_test.shape

(401, 4, 10)

Finally, we have done all the data pre-processing and we can now proceed to the model training section.

---

# Model Selection and Building

## CNN Sequential Model

In this project, we will use a CNN with two `Conv2D` layers, each followed by a max-pooling layer, and then fully connected layers for classification.
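
The size of the feature maps in such a stack can be worked out by hand. The sketch below assumes the settings of Model 1 defined below: stride-1 convolutions with `'same'` padding (which keep height and width), $2\times 2$ max pooling with the default `'valid'` padding (which halves each side and rounds down), and 16 then 32 filters. `conv_pool_output` is a hypothetical helper.

```python
def conv_pool_output(height, width, pool_size):
    """Spatial size after a 'same'-padded stride-1 conv and a 'valid' max pool."""
    # The convolution keeps height and width; pooling floors the division.
    return height // pool_size, width // pool_size


height, width = 40, 135
filters = [16, 32]

for num_filters in filters:
    height, width = conv_pool_output(height, width, pool_size=2)
    print(num_filters, height, width)

# Number of values entering the first Dense layer after Flatten.
flatten_size = height * width * filters[-1]
print(flatten_size)  # 10560
```

The feature maps shrink from $40\times 135$ to $20\times 67$ and then $10\times 33$, so `Flatten` passes $10 \times 33 \times 32 = 10560$ values to the first dense layer.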

In [205]:
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Reshape, Softmax

### Model 1

In [206]:
kernelSize1 = 5
poolSize1 = 2

In [208]:
model1 = Sequential(
    [
        Conv2D(16, kernel_size=(kernelSize1,kernelSize1), padding = 'same', input_shape = (HEIGHT, WIDTH, NUM_CHANNELS), activation = 'relu'),
        MaxPooling2D(pool_size = (poolSize1,poolSize1)),
        Conv2D(32, kernel_size = (kernelSize1, kernelSize1), padding = 'same', activation = 'relu'),
        MaxPooling2D(pool_size=(poolSize1, poolSize1)),
        Dropout(0.5),
        Flatten(),  # flatten the pooled feature maps into a 1D feature vector for the fully connected layers
        Dense(128, activation = 'relu'),
        Dropout(0.5),
        Dense(NUM_LABELS*NUM_DIGITS),  # raw scores for all 4 digits x 10 classes
        Reshape((NUM_DIGITS, NUM_LABELS)),
        Softmax(axis = -1)  # softmax per digit, so that each row of 10 probabilities sums to 1
    ]
)

2022-01-05 01:43:46.337604: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:939] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2022-01-05 01:43:46.787823: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags

In [209]:
model1.compile(loss = 'categorical_crossentropy', metrics= ['accuracy'], optimizer = 'adam')

In [210]:
history = model1.fit(X_train, y_train, validation_split=0.2, epochs = 100, batch_size = 32)

Epoch 1/100


2022-01-05 01:44:55.882805: E tensorflow/stream_executor/cuda/cuda_dnn.cc:359] Loaded runtime CuDNN library: 8.0.5 but source was compiled with: 8.1.0.  CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library.  If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.


UnknownError:  Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
	 [[node sequential/conv2d_8/Relu
 (defined at /venv/lib/python3.9/site-packages/keras/backend.py:4867)
]] [Op:__inference_train_function_942]

Errors may have originated from an input operation.
Input Source operations connected to node sequential/conv2d_8/Relu:
In[0] sequential/conv2d_8/BiasAdd (defined at /venv/lib/python3.9/site-packages/keras/layers/convolutional.py:264)

Operation defined at: (most recent call last)
>>>   File "/usr/lib/python3.9/runpy.py", line 197, in _run_module_as_main
>>>     return _run_code(code, main_globals, None,
>>> 
>>>   File "/usr/lib/python3.9/runpy.py", line 87, in _run_code
>>>     exec(code, run_globals)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/ipykernel_launcher.py", line 16, in <module>
>>>     app.launch_new_instance()
>>> 
>>>   File "/venv/lib/python3.9/site-packages/traitlets/config/application.py", line 846, in launch_instance
>>>     app.start()
>>> 
>>>   File "/venv/lib/python3.9/site-packages/ipykernel/kernelapp.py", line 677, in start
>>>     self.io_loop.start()
>>> 
>>>   File "/venv/lib/python3.9/site-packages/tornado/platform/asyncio.py", line 199, in start
>>>     self.asyncio_loop.run_forever()
>>> 
>>>   File "/usr/lib/python3.9/asyncio/base_events.py", line 596, in run_forever
>>>     self._run_once()
>>> 
>>>   File "/usr/lib/python3.9/asyncio/base_events.py", line 1890, in _run_once
>>>     handle._run()
>>> 
>>>   File "/usr/lib/python3.9/asyncio/events.py", line 80, in _run
>>>     self._context.run(self._callback, *self._args)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 457, in dispatch_queue
>>>     await self.process_one()
>>> 
>>>   File "/venv/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 446, in process_one
>>>     await dispatch(*args)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 353, in dispatch_shell
>>>     await result
>>> 
>>>   File "/venv/lib/python3.9/site-packages/ipykernel/kernelbase.py", line 648, in execute_request
>>>     reply_content = await reply_content
>>> 
>>>   File "/venv/lib/python3.9/site-packages/ipykernel/ipkernel.py", line 353, in do_execute
>>>     res = shell.run_cell(code, store_history=store_history, silent=silent)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/ipykernel/zmqshell.py", line 533, in run_cell
>>>     return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2914, in run_cell
>>>     result = self._run_cell(
>>> 
>>>   File "/venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2960, in _run_cell
>>>     return runner(coro)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/IPython/core/async_helpers.py", line 78, in _pseudo_sync_runner
>>>     coro.send(None)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3185, in run_cell_async
>>>     has_raised = await self.run_ast_nodes(code_ast.body, cell_name,
>>> 
>>>   File "/venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3377, in run_ast_nodes
>>>     if (await self.run_code(code, result,  async_=asy)):
>>> 
>>>   File "/venv/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3457, in run_code
>>>     exec(code_obj, self.user_global_ns, self.user_ns)
>>> 
>>>   File "/tmp/ipykernel_242/1839943381.py", line 1, in <module>
>>>     history = model1.fit(X_train, y_train, validation_split=0.2, epochs = 100, batch_size = 32)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/training.py", line 1216, in fit
>>>     tmp_logs = self.train_function(iterator)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/training.py", line 878, in train_function
>>>     return step_function(self, iterator)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/training.py", line 867, in step_function
>>>     outputs = model.distribute_strategy.run(run_step, args=(data,))
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/training.py", line 860, in run_step
>>>     outputs = model.train_step(data)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/training.py", line 808, in train_step
>>>     y_pred = self(x, training=True)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/base_layer.py", line 1083, in __call__
>>>     outputs = call_fn(inputs, *args, **kwargs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 92, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/sequential.py", line 373, in call
>>>     return super(Sequential, self).call(inputs, training=training, mask=mask)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/functional.py", line 451, in call
>>>     return self._run_internal_graph(
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/functional.py", line 589, in _run_internal_graph
>>>     outputs = node.layer(*args, **kwargs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 64, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/engine/base_layer.py", line 1083, in __call__
>>>     outputs = call_fn(inputs, *args, **kwargs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/utils/traceback_utils.py", line 92, in error_handler
>>>     return fn(*args, **kwargs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/layers/convolutional.py", line 273, in call
>>>     return self.activation(outputs)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/activations.py", line 311, in relu
>>>     return backend.relu(x, alpha=alpha, max_value=max_value, threshold=threshold)
>>> 
>>>   File "/venv/lib/python3.9/site-packages/keras/backend.py", line 4867, in relu
>>>     x = tf.nn.relu(x)
>>> 