# Implementation Guidelines for the Sample Code (TensorFlow)

    See the notes in the markdown block above each code block, as well as the # TODO comments. :D

# Jupyter Notebook Usage Guide (If needed)

    Installation   : https://jupyter.org/install  
    User Document  : https://jupyter-notebook.readthedocs.io/en/latest/user-documentation.html

# Test Environment (Recommended)

    At test time, we will evaluate your submitted code with the library versions listed below.  
    We therefore strongly recommend using exactly these package versions.

    test environment : tensorflow2

### Packages
    python      : 3.9.15  
    tensorflow  : 2.8.0  
    keras       : 2.8.0
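
    A minimal sketch for comparing the installed packages with the versions listed above.  
    The helper name `find_version_mismatches` is hypothetical, and the check assumes the  
    packages are installed under the distribution names `tensorflow` and `keras`  
    (a differently named build would be reported as not installed).

```python
from importlib import metadata

# Versions listed in the Packages section above.
EXPECTED_VERSIONS = {"tensorflow": "2.8.0", "keras": "2.8.0"}


def find_version_mismatches(expected):
    """Return {package: (expected, installed)} for every package that differs.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


# Print one line per package that does not match the recommended version.
for name, (wanted, installed) in find_version_mismatches(EXPECTED_VERSIONS).items():
    print(f"{name}: expected {wanted}, found {installed or 'not installed'}")
```

    An empty result means both packages match the recommended test environment.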

# Import python libraries (Do not change!)

In [3]:
import os
import tensorflow as tf
from tensorflow import keras
from keras.callbacks import ModelCheckpoint
from keras import layers
import sys
import argparse
import numpy as np
import random
import cv2
from skimage import io
import pandas as pd
import matplotlib.pyplot as plt
import math
import copy
import time
import PIL
import pickle

# Define your model and hyperparameter (You can modify this part!)

    This is the pivotal part of the competition.
    We provide a simple CNN model as an example.
    Go make your own model!

### Notice
    We resize all input images to a fixed size of 128x128.  
    Until further notice, use this fixed size.

In [10]:
# TODO : Set your hyperparameters (Note that the image height and width must stay fixed at 128x128)

batch_size = 32
validation_split = 0.1
random_seed = 123
EPOCH = 2
learning_rate = 0.01


# TODO : Define your own network!

def basic_cnn(input_shape):
    cnn = keras.models.Sequential([
        # Declare the input shape on the first layer; it has no effect on later layers.
        layers.Rescaling(1./255.0, input_shape=input_shape),
        layers.Conv2D(32, kernel_size=3, activation='relu'),
        layers.MaxPool2D(),

        layers.Conv2D(32, kernel_size=3, activation='relu'),
        layers.MaxPool2D(),   

        layers.Conv2D(64, kernel_size=3, activation='relu'),
        layers.MaxPool2D(),

        layers.Flatten(),
        layers.Dense(1000),
        layers.Softmax(),
    ])
    return cnn

model = basic_cnn(input_shape=(128, 128, 3))

# TODO : Define your own Optimizer!

model.compile(tf.keras.optimizers.SGD(learning_rate=learning_rate),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(),
              metrics=['accuracy'])
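
    The feature-map sizes of `basic_cnn` can be traced by hand, which helps when you change  
    the architecture. The sketch below is plain Python, not Keras code. It assumes the Keras  
    defaults used above: stride-1 'valid' convolutions and 2x2 max pooling with stride 2.  
    The helper names (`conv_out`, `pool_out`, `trace_basic_cnn`) are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output side length of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output side length of non-overlapping max pooling with 'valid' padding."""
    return size // pool


def trace_basic_cnn(side=128, in_channels=3, filters=(32, 32, 64), num_classes=1000):
    """Return (flattened feature count, trainable parameter count)."""
    params = 0
    channels = in_channels
    for f in filters:
        params += (3 * 3 * channels + 1) * f  # kernel weights + one bias per filter
        side = pool_out(conv_out(side))       # 128 -> 63 -> 30 -> 14
        channels = f
    flat = side * side * channels
    params += (flat + 1) * num_classes        # Dense weights + biases
    return flat, params


flat, params = trace_basic_cnn()
print(flat, params)  # 12544 12573640
```

    Almost all of the parameters sit in the final Dense layer (12544 x 1000 weights),  
    so shrinking the feature map before `Flatten` is the quickest way to reduce model size.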

# Data load with tensorflow dataset library (Do not change!)

    data_dir : 'YOUR DIR'  

    You can build your own data loader with the tf.keras.utils.image_dataset_from_directory API.  
    This usually reduces the computational burden when dealing with high-dimensional data such as images.  

    reference url : https://www.tensorflow.org/api_docs/python/tf/keras/utils/image_dataset_from_directory
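
    With its default settings, image_dataset_from_directory expects one subdirectory per class  
    and assigns integer labels in alphanumeric order of the subdirectory names.  
    The sketch below imitates that ordering in plain Python so you can see which label each  
    folder would receive. It is an illustration, not the library's code. The folder names  
    and the helper `infer_class_indices` are hypothetical.

```python
import os
import tempfile


def infer_class_indices(data_dir):
    """Map each class subdirectory name to an integer label, in sorted-name order."""
    class_names = sorted(
        entry for entry in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, entry))
    )
    return {name: index for index, name in enumerate(class_names)}


# Hypothetical folder names, created in a throwaway directory for the demo.
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("class_2", "class_10", "class_1"):
        os.mkdir(os.path.join(demo_dir, name))
    demo_labels = infer_class_indices(demo_dir)

print(demo_labels)  # {'class_1': 0, 'class_10': 1, 'class_2': 2}
```

    Note that string sorting places "class_10" before "class_2". This matters for incremental  
    learning, because the groups of 100 classes are selected by label value.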

In [11]:
# TODO : set dataset path
# TODO : We recommend placing your code and training dataset in the same location.

data_dir = './Koh_Young_AI_data/'

train_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=validation_split,
  subset="training",
  seed=random_seed,
  image_size=(128, 128),
  batch_size=batch_size)

val_ds = tf.keras.utils.image_dataset_from_directory(
  data_dir,
  validation_split=validation_split,
  subset="validation",
  seed=random_seed,
  image_size=(128, 128),
  batch_size=batch_size)


Found 150000 files belonging to 1000 classes.
Using 135000 files for training.
Found 150000 files belonging to 1000 classes.
Using 15000 files for validation.
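
    The counts in the log above follow from validation_split = 0.1 applied to 150000 files.  
    The sketch below reproduces that arithmetic and the resulting number of batches for  
    batch_size = 32. It is plain Python with hypothetical helper names. The use of round()  
    is an assumption made here and may differ from how the library truncates in edge cases.

```python
import math


def split_counts(total_files, validation_split):
    """Return (training count, validation count) for a fractional split."""
    val_count = round(total_files * validation_split)
    return total_files - val_count, val_count


def batches_per_pass(sample_count, batch_size):
    """Number of batches needed to cover every sample once; the last may be partial."""
    return math.ceil(sample_count / batch_size)


train_count, val_count = split_counts(150000, 0.1)
print(train_count, val_count)  # 135000 15000
print(batches_per_pass(train_count, 32), batches_per_pass(val_count, 32))  # 4219 469
```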


# Incremental Learning (You can modify this part!)

### WARNING:
    The training and validation datasets MUST each be prepared properly beforehand.  
    If they are not, your submitted code will be rejected immediately.

### Notice
    This code splits your dataset of 1000 classes into 10 groups of 100 classes each.  
    You only need to run it once, at the start, to split your dataset for continual learning.  
    *Again, you don't need to run it every time you train if you have already split your dataset into 10 groups.

    Note the code below. (The same lines appear in the training cell.)

```python
        train = train_ds.unbatch().filter(lambda x, y: tf.greater_equal(y, int(0 + 100 * iteration)))
        train = train.filter(lambda x, y: tf.greater(int(100 + 100 * iteration), y)).batch(batch_size)
        val = val_ds.unbatch().filter(lambda x, y: tf.greater(int(100 + 100 * iteration), y)).batch(batch_size)
```
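
    The filters above select labels by range. The following plain-Python sketch spells out  
    which labels each iteration uses: training sees only the current group of 100 classes,  
    while validation covers every class seen so far. The helper names are hypothetical.

```python
CLASSES_PER_GROUP = 100  # 1000 classes split into 10 groups


def train_label_range(iteration):
    """Half-open label range [low, high) used for training at this iteration."""
    low = CLASSES_PER_GROUP * iteration
    return low, low + CLASSES_PER_GROUP


def val_label_range(iteration):
    """Half-open label range [0, high) used for validation at this iteration."""
    return 0, CLASSES_PER_GROUP * (iteration + 1)


def in_range(label, label_range):
    """True when the label falls inside the half-open range."""
    low, high = label_range
    return low <= label < high


for iteration in (0, 1, 9):
    print(iteration, train_label_range(iteration), val_label_range(iteration))
# 0 (0, 100) (0, 100)
# 1 (100, 200) (0, 200)
# 9 (900, 1000) (0, 1000)
```

    Because validation is cumulative, its accuracy at later iterations also reflects how much  
    the model has forgotten about earlier groups.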
### WARNING
    The final submission is a weight file that can classify all 1000 classes.   
    You may modify the code, but make sure you submit the final weight file.
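
    The training cell below creates a ModelCheckpoint with monitor='val_loss' and  
    save_best_only=True only in the last iteration, so the weight file is written only when  
    the validation loss improves within that final fit call. The sketch below imitates this  
    rule in plain Python. It is not the Keras implementation, the helper name is  
    hypothetical, and the loss values are made up for illustration.

```python
def epochs_that_save(val_losses):
    """Return the 1-based epochs at which a 'save best only' rule would write a file."""
    best = float("inf")
    saved = []
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:  # strictly better than anything seen so far
            best = loss
            saved.append(epoch)
    return saved


# Made-up validation losses, for illustration only.
print(epochs_that_save([6.2, 6.5, 5.9, 5.9, 5.1]))  # [1, 3, 5]
```

    If you change the loop, check that a weight file is still written after the last group  
    of classes has been trained.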

In [1]:
### Model training

for iteration in range(0, 10):
    ### Data split for continual learning
    train = train_ds.unbatch().filter(lambda x, y: tf.greater_equal(y, int(0 + 100 * iteration)))
    train = train.filter(lambda x, y: tf.greater(int(100 + 100 * iteration), y)).batch(batch_size)
    val = val_ds.unbatch().filter(lambda x, y: tf.greater(int(100 + 100 * iteration), y)).batch(batch_size)

    if iteration != 9:
        model.fit(train, validation_data=val, epochs=EPOCH)
    else:
        MODEL_SAVE_FOLDER_PATH = './model_save/'
        os.makedirs(MODEL_SAVE_FOLDER_PATH, exist_ok=True)  # create the folder if it is missing
        model_path = MODEL_SAVE_FOLDER_PATH + 'continual_model.h5'
        cb_checkpoint = ModelCheckpoint(filepath=model_path, monitor='val_loss',
                                        verbose=1, save_best_only=True)
        
        model.fit(train, validation_data=val, epochs=EPOCH, callbacks=[cb_checkpoint])
    print(f'{str(iteration+1)} Iteration Done.')
