TensorFlow Certification
   2. TensorFlow 2.15 and Keras 3.0 conflict at installation time. PyTorch doesn't have that problem. *Couldn't get VS Code to work, so used Colab.*
   3. Lots of inconsistent and incoherent features. PyTorch's features are coherent.
   4. Side effect of both PyTorch and TensorFlow: lots of boilerplate / glue code. `Glue code or boilerplate code: lots of lines for only minor functionality.` Maintain a standard workflow cookbook.
   5. **Annoying to make some things work**
   6. Personal opinion.
      1. Use PyTorch for the training loop; if you want to use TensorFlow, use it only for the model architecture.
      2. Future: when using JAX or TensorFlow for speedup, use Keras 3.0: same API, multiple backends. No need to learn different libraries.
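
Point 6.2 can be sketched as follows. Keras 3 picks its backend from the `KERAS_BACKEND` environment variable, which has to be set before `keras` is imported for the first time. The helper `select_keras_backend` and the constant `KERAS_BACKENDS` are hypothetical names for this sketch, and the `import keras` line is left commented out so the snippet runs even where Keras is not installed.

```python
import os

# The three main Keras 3 backends (hypothetical constant for this sketch).
KERAS_BACKENDS = ("tensorflow", "jax", "torch")

def select_keras_backend(name):
    """Set KERAS_BACKEND; call this before the first `import keras`."""
    if name not in KERAS_BACKENDS:
        raise ValueError(f"unknown backend {name!r}, expected one of {KERAS_BACKENDS}")
    os.environ["KERAS_BACKEND"] = name
    return name

select_keras_backend("jax")   # or "tensorflow" / "torch"

# import keras                # the same model code would now run on the chosen backend
print(os.environ["KERAS_BACKEND"])
```

Changing the backend after `keras` has already been imported has no effect in the same process, so in a notebook this belongs in the very first cell.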

```python
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Set the seed
tf.random.set_seed(42)

# Preprocess data (scale all pixel values to between 0 and 1, also called scaling/normalization)
train_datagen = ImageDataGenerator(rescale=1./255)
valid_datagen = ImageDataGenerator(rescale=1./255)

# Set up the train and test directories
train_dir = "pizza_steak/train/"
test_dir = "pizza_steak/test/"

# Import data from directories and turn it into batches
train_data = train_datagen.flow_from_directory(train_dir,
                                               batch_size=32, # number of images to process at a time 
                                               target_size=(224, 224), # convert all images to be 224 x 224
                                               class_mode="binary", # type of problem we're working on
                                               seed=42)

valid_data = valid_datagen.flow_from_directory(test_dir,
                                               batch_size=32,
                                               target_size=(224, 224),
                                               class_mode="binary",
                                               seed=42)

```

# 1. Load Image Data

In [9]:
import datasets as huggingface_datasets

dataset_splits = huggingface_datasets.load_dataset("ajinkyakolhe112/pizza_vs_steak_classification")
dataset_training, dataset_validation = dataset_splits['train'], dataset_splits['test']

huggingface_datasets.get_dataset_split_names("ajinkyakolhe112/pizza_vs_steak_classification")
dataset_splits.shape, dataset_splits['train'].features

Found cached dataset parquet (/Users/ajinkya/.cache/huggingface/datasets/ajinkyakolhe112___parquet/ajinkyakolhe112--pizza_vs_steak_classification-41ef506b044d97dc/0.0.0/2a3b91fbd88a2c90d1dbbb32b460cf621d31bd5b05b934492fdef7d8d6f236ec)



({'train': (1500, 2), 'test': (500, 2)},
 {'image': Image(decode=True, id=None),
  'label': ClassLabel(names=['pizza', 'steak'], id=None)})

# 2. Process Images

In [None]:
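def processing_func(examples_batch):
    # Assumption: the unfinished loop was meant to resize each image to 224 x 224,
    # the target_size used with ImageDataGenerator above.
    examples_batch['new'] = []
    for image in examples_batch['image']:      # iterate over the images, not the dict keys
        examples_batch['new'].append(image.convert("RGB").resize((224, 224)))
    return examples_batch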
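
A note on what such a function receives: when it is applied with `Dataset.map(..., batched=True)`, each call gets a dict mapping column names to lists of values, so looping over the dict itself yields column names rather than images. The sketch below uses a hand-built batch with placeholder strings instead of real images to show the difference.

```python
# Hand-built stand-in for one batch; the strings are placeholders for PIL images.
examples_batch = {"image": ["<image 0>", "<image 1>"], "label": [0, 1]}

# Iterating over the dict walks its keys (the column names).
column_names = [item for item in examples_batch]

# Iterating over one column walks the actual examples.
images = [item for item in examples_batch["image"]]

print(column_names)  # ['image', 'label']
print(images)        # ['<image 0>', '<image 1>']
```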

# 3. Convert to Tensors

In [10]:
dataset_training_tf   = dataset_training.with_format("tf")
dataset_validation_tf = dataset_validation.with_format("tf")

training_dataset_tf   = dataset_training.to_tf_dataset(columns=["image"], label_cols=["label"], batch_size=4, shuffle=True)
validation_dataset_tf = dataset_validation.to_tf_dataset(columns=["image"], label_cols=["label"], batch_size=4, shuffle=False)  # validation data does not need shuffling

Old behaviour: columns=['a'], labels=['labels'] -> (tf.Tensor, tf.Tensor)  
             : columns='a', labels='labels' -> (tf.Tensor, tf.Tensor)  
New behaviour: columns=['a'],labels=['labels'] -> ({'a': tf.Tensor}, {'labels': tf.Tensor})  
             : columns='a', labels='labels' -> (tf.Tensor, tf.Tensor) 
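
As a quick sanity check on the batching in this cell, the number of steps Keras reports per epoch is the example count divided by the batch size, rounded up. The counts come from the `dataset_splits.shape` output in section 1. The helper name `steps_per_epoch` is hypothetical, and the rounding assumes the final partial batch is kept rather than dropped.

```python
import math

def steps_per_epoch(num_examples, batch_size):
    """Batches needed to see every example once, keeping the last partial batch."""
    return math.ceil(num_examples / batch_size)

train_examples, test_examples = 1500, 500    # from dataset_splits.shape

print(steps_per_epoch(train_examples, 4))    # 375
print(steps_per_epoch(test_examples, 4))     # 125
print(steps_per_epoch(train_examples, 32))   # 47
print(steps_per_epoch(test_examples, 32))    # 16
```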


In [15]:
dataset_training_tf[0]

{'image': <tf.Tensor: shape=(384, 512, 3), dtype=uint8, numpy=
 array([[[ 38,  41,  22],
         [ 34,  37,  18],
         [ 41,  44,  25],
         ...,
         [ 68,  72,  57],
         [ 63,  67,  52],
         [ 58,  62,  47]],
 
        ...

In [None]:
# import torch

# dataset_training, dataset_validation = dataset_splits['train'], dataset_splits['test']
# dataset_training, dataset_validation = dataset_training.with_format("torch"), dataset_validation.with_format("torch")

# training_dataloader   = torch.utils.data.DataLoader(dataset_training, batch_size=32, shuffle=True)
# validation_dataloader = torch.utils.data.DataLoader(dataset_validation, batch_size=32, shuffle=False)


Rest
   5. Don't try to memorize the library. (It can't be memorized anyway. Get used to reading the documentation.)
   6. Do remember deep learning concepts. (These need to be mastered.)
   7. Essential checklist concepts. (Must be mastered thoroughly if you want to be a master.)

### TODO

- [ ] Downloading **datasets** and starting training. -> Answer: Hugging Face `datasets`, a standard way of uploading a dataset. Go from dataset to dataloader in both PyTorch and TensorFlow via Hugging Face.
- [ ] **Model**. Sequential is easy but can't be debugged properly. -> Answer: the functional API or subclassing `keras.Model` (2 ways).
- [ ] **Training** loop from that dataset, in both PyTorch and TensorFlow. ->

In [11]:
import keras

# Sequential makes it hard to debug and inspect intermediate layer outputs. Subclassing keras.Model does not.
class BaselineModel (keras.Model):
    def __init__(self, num_classes=1000):
        super().__init__()
        self.layers_list = [
            keras.layers.Conv2D        (filters= 10, kernel_size=(3,3), activation="relu", input_shape=(224, 224, 3)), 
            keras.layers.Conv2D        (filters= 10, kernel_size=(3,3), activation="relu"),
            keras.layers.MaxPool2D     (pool_size=(2,2), padding="valid"),

            keras.layers.Conv2D        (filters= 10, kernel_size=(3,3), activation="relu"),
            keras.layers.Conv2D        (filters= 10, kernel_size=(3,3), activation="relu"), 
            keras.layers.MaxPool2D     (pool_size=(2,2)),
            
            keras.layers.Flatten(),
            keras.layers.Dense         (units= 1, activation="sigmoid")
        ]

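    def call(self, single_batch):
        # Pass the batch through each layer in order. Keeping the loop explicit
        # makes it easy to inspect any intermediate output while debugging.
        current_output = single_batch
        for current_layer in self.layers_list:
            current_output = current_layer(current_output)

        return current_output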
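
To see what this layer stack does to a 224 x 224 x 3 input, the spatial size and parameter count can be worked out by hand. This sketch assumes the Keras defaults for the layers above (`padding="valid"` and stride 1 for `Conv2D`, stride equal to `pool_size` for `MaxPool2D`); the helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output size of a stride-1 'valid' convolution along one dimension."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of a 'valid' max pool whose stride equals the pool size."""
    return (size - pool) // pool + 1

def conv_params(in_channels, filters=10, kernel=3):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

size, sizes = 224, []
for kind in ["conv", "conv", "pool", "conv", "conv", "pool"]:
    size = conv_out(size) if kind == "conv" else pool_out(size)
    sizes.append(size)

flat_features = size * size * 10                 # Flatten output length
total_params = (conv_params(3) + 3 * conv_params(10)
                + flat_features + 1)             # Dense(1): weights + bias

print(sizes)          # [222, 220, 110, 108, 106, 53]
print(flat_features)  # 28090
print(total_params)   # 31101
```

The fixed `flat_features` value is also why the final `Dense` layer needs every image to arrive at the same height and width.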

In [12]:
model = BaselineModel()
# Compile the model
model.compile(
    loss="binary_crossentropy",
    optimizer=keras.optimizers.Adam(),
    metrics=["accuracy"])
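
For reference, `binary_crossentropy` on a sigmoid output is the batch mean of `-(y * log(p) + (1 - y) * log(1 - p))`. The pure-Python sketch below illustrates the formula only and is not Keras's implementation; the clipping constant `eps` and the sample probabilities are made up for illustration. Labels follow the `ClassLabel` order from section 1 (0 = pizza, 1 = steak).

```python
import math

def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy over a batch of labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)          # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

labels = [0, 1, 1, 0]            # 0 = pizza, 1 = steak
probs = [0.1, 0.9, 0.6, 0.4]     # made-up sigmoid outputs, P(steak)

loss = binary_crossentropy(labels, probs)
print(round(loss, 4))            # 0.3081
```

A model that always outputs 0.5 scores `log(2)`, about 0.693, so a training loss stuck near that value suggests the model is not learning.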



In [14]:
# Fit the model
history_1 = model.fit(training_dataset_tf, epochs=5, validation_data=validation_dataset_tf)

Epoch 1/5


2024-01-09 14:40:35.128109: I tensorflow/core/common_runtime/executor.cc:1197] [/device:CPU:0] (DEBUG INFO) Executor start aborting (this does not indicate an error and you can ignore this message): INVALID_ARGUMENT: You must feed a value for placeholder tensor 'Placeholder/_0' with dtype int64 and shape [1500]
	 [[{{node Placeholder/_0}}]]
2024-01-09 14:40:35.168619: W tensorflow/tsl/platform/profile_utils/cpu_utils.cc:128] Failed to get CPU frequency: 0 Hz


TypeError: Exception encountered when calling Conv2D.call().

Value passed to parameter 'input' has DataType int64 not in list of allowed values: float16, bfloat16, float32, float64, int32

Arguments received by Conv2D.call():
  • inputs=tf.Tensor(shape=(None, None, None, 3), dtype=int64)
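
The error says `Conv2D` received int64 images with unknown height and width. One possible fix, sketched here and not verified against the real dataset, is to map a small preprocessing function over the `tf.data` pipeline that casts to float32, resizes to 224 x 224, and scales to [0, 1]. The function name `to_model_input`, the constant `IMAGE_SIZE`, and the synthetic stand-in dataset are assumptions for illustration.

```python
import tensorflow as tf

IMAGE_SIZE = (224, 224)   # assumed: matches input_shape in BaselineModel

def to_model_input(images, labels):
    """Cast an integer image batch to float32, resize it, and scale to [0, 1]."""
    images = tf.cast(images, tf.float32)            # Conv2D rejects int64 input
    images = tf.image.resize(images, IMAGE_SIZE)    # gives every batch a fixed height and width
    return images / 255.0, labels

# Synthetic stand-in for training_dataset_tf: int64 images in batches of 4.
fake_images = tf.random.uniform((8, 96, 128, 3), maxval=256, dtype=tf.int64)
fake_labels = tf.constant([0, 1, 0, 1, 1, 0, 1, 0], dtype=tf.int64)
fake_dataset = tf.data.Dataset.from_tensor_slices((fake_images, fake_labels)).batch(4)

prepared = fake_dataset.map(to_model_input)
batch_images, batch_labels = next(iter(prepared))
print(batch_images.shape, batch_images.dtype)
```

With the real data this would presumably be used as `model.fit(training_dataset_tf.map(to_model_input), ...)`. If the source images differ in size, the resize likely has to happen before batching (for example in the processing step of section 2), since images of different shapes cannot be stacked into one batch.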