# 1) Import Libraries
This cell imports the libraries required for the entire pipeline.

In [None]:
! pip install transformers

In [6]:
import tensorflow as tf
import tensorflow_datasets as tfds

# Hugging Face libraries
from transformers import (ViTImageProcessor, ViTForImageClassification,
                          TFViTModel, create_optimizer, TrainingArguments, Trainer,
                          DefaultDataCollator)

# Weights & Biases
import wandb
# WandbCallback is used by the fit() calls below, so it must be imported too
from wandb.integration.keras import WandbCallback, WandbMetricsLogger, WandbModelCheckpoint

# Others
import numpy as np
import os
from datasets import Dataset

# 2) Initialize W&B
This cell initializes a W&B project. Set the project name and run name as desired.

In [7]:
wandb.init(
    project="ViT_vs_CNN_flowers", 
    name="Initial_Run", 
    reinit=True
)

wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit:
wandb: Appending key for api.wandb.ai to your netrc file: C:\Users\User\_netrc
wandb: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.


# 3) Load and Preprocess the Dataset
Load the `tf_flowers` dataset as (image, label) pairs, resize each image to 224×224, scale pixel values to [0, 1], split the data 80/10/10 into train, validation, and test sets, and batch each split.

In [None]:
# Load dataset
dataset, info = tfds.load("tf_flowers", as_supervised=True, with_info=True)
num_classes = info.features['label'].num_classes
total_examples = info.splits['train'].num_examples

train_size = int(total_examples * 0.8)
val_size = int(total_examples * 0.1)
test_size = int(total_examples * 0.1)

# Preprocessing function
def format_image(image, label, img_size=224):
    image = tf.image.resize(image, (img_size, img_size))
    image = tf.cast(image, tf.float32) / 255.0
    return image, label

ds_all = dataset['train'].map(lambda x, y: format_image(x, y, 224))

# Split dataset
train_ds = ds_all.take(train_size)
val_ds = ds_all.skip(train_size).take(val_size)
test_ds = ds_all.skip(train_size + val_size).take(test_size)

batch_size = 32
train_batches = train_ds.shuffle(1000).batch(batch_size).prefetch(tf.data.AUTOTUNE)
val_batches = val_ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)
test_batches = test_ds.batch(batch_size).prefetch(tf.data.AUTOTUNE)

print(f"Number of classes: {num_classes}")
print(f"Total samples: {total_examples}")
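
The split above truncates each size with `int(...)`, which can silently drop a few examples when the total is not evenly divisible. A minimal sketch of one alternative, assuming the test split simply takes whatever remains after train and validation. `split_sizes` is a hypothetical helper, not part of the notebook, and 3670 is used as an illustrative total (the size TFDS reports for `tf_flowers`).

```python
def split_sizes(total, train_frac=0.8, val_frac=0.1):
    """Return (train, val, test) sizes that always sum to `total`."""
    train = int(total * train_frac)
    val = int(total * val_frac)
    test = total - train - val  # remainder goes to test, so nothing is dropped
    return train, val, test


# Illustrative total; in the notebook this would be `total_examples`.
sizes = split_sizes(3670)
print(sizes, sum(sizes))
```

The three sizes could then feed the same `take`/`skip` calls used above in place of the separately truncated `train_size`, `val_size`, and `test_size`.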

# 4) Define a Simple CNN Model
Define a simple CNN (stacked Conv2D and MaxPool2D layers followed by global average pooling and a dense head) as a baseline for comparison with the ViT model.

In [None]:
def create_simple_cnn(num_classes):
    inputs = tf.keras.Input(shape=(224, 224, 3))
    x = tf.keras.layers.Conv2D(32, 3, activation='relu')(inputs)
    x = tf.keras.layers.MaxPool2D()(x)
    x = tf.keras.layers.Conv2D(64, 3, activation='relu')(x)
    x = tf.keras.layers.MaxPool2D()(x)
    x = tf.keras.layers.Conv2D(128, 3, activation='relu')(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(256, activation='relu')(x)
    outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
    model = tf.keras.Model(inputs, outputs)
    return model
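
To see how small this baseline is, the spatial sizes and parameter counts can be traced by hand. The sketch below assumes the Keras defaults used in `create_simple_cnn` (unpadded 3×3 convolutions with stride 1, 2×2 max pooling) and five output classes, which is what `tf_flowers` provides. The helper names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1):
    """Spatial size after a 'valid' (unpadded) convolution."""
    return (size - kernel) // stride + 1


def pool_out(size, pool=2):
    """Spatial size after 2x2 max pooling with stride 2 and no padding."""
    return (size - pool) // pool + 1


def conv_params(c_in, c_out, kernel=3):
    """Weights plus biases of a Conv2D layer."""
    return kernel * kernel * c_in * c_out + c_out


def dense_params(n_in, n_out):
    """Weights plus biases of a Dense layer."""
    return n_in * n_out + n_out


# Trace the spatial size through conv -> pool -> conv -> pool -> conv.
size = 224
trace = []
for layer in ["conv", "pool", "conv", "pool", "conv"]:
    size = conv_out(size) if layer == "conv" else pool_out(size)
    trace.append(size)

num_classes = 5  # assumption: tf_flowers has five classes
total_params = (
    conv_params(3, 32)
    + conv_params(32, 64)
    + conv_params(64, 128)
    + dense_params(128, 256)  # GlobalAveragePooling2D leaves 128 features
    + dense_params(256, num_classes)
)
print(trace, total_params)
```

Global average pooling collapses the final feature map to one value per channel, so the dense head sees only 128 features regardless of the input resolution.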

# 5) Train the CNN Model
Train the simple CNN and log its metrics to W&B so its performance can be compared with the ViT model later.

In [None]:
# CNN model
cnn_model = create_simple_cnn(num_classes)

wandb.init(
    project="ViT_vs_CNN_flowers", 
    name="CNN_flowers_run", 
    reinit=True
)

cnn_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

cnn_model.fit(
    train_batches,
    validation_data=val_batches,
    epochs=3,  # 3 epochs as an example
    callbacks=[WandbCallback()]
)

# Evaluate on test set
cnn_eval = cnn_model.evaluate(test_batches)
print("CNN Test Evaluation:", cnn_eval)
wandb.finish()

# 6) Build the ViT Model with the Functional API
Load a pretrained ViT and wrap it in a build function that adds Dense, Dropout, and BatchNorm layers according to its hyperparameters.

In [None]:
def build_vit_model(
    num_classes,
    num_dense_layers=1,
    dense_units=128,
    dropout_rate=0.3,
    use_batchnorm=True,
    base_trainable=False
):
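    # Pretrained ViT backbone (e.g. google/vit-base-patch16-224-in21k)
    base_vit = TFViTModel.from_pretrained("google/vit-base-patch16-224-in21k")
    base_vit.trainable = base_trainable

    inputs = tf.keras.Input(shape=(224, 224, 3), name="input_images")
    # The input pipeline yields channels-last images in [0, 1]; this checkpoint's
    # image processor normalizes with mean 0.5 and std 0.5, i.e. to [-1, 1].
    x = tf.keras.layers.Rescaling(scale=2.0, offset=-1.0)(inputs)
    # TFViTModel expects channels-first pixel_values: [batch, 3, 224, 224]
    x = tf.keras.layers.Permute((3, 1, 2))(x)
    vit_outputs = base_vit(pixel_values=x).last_hidden_state  # [batch, seq_len, hidden_dim]

    # Sequence dimension pooling
    x = tf.keras.layers.GlobalAveragePooling1D()(vit_outputs)

    # Dynamically stack Dense layers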
    for _ in range(num_dense_layers):
        x = tf.keras.layers.Dense(dense_units, activation='relu')(x)
        if use_batchnorm:
            x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.Dropout(dropout_rate)(x)

    outputs = tf.keras.layers.Dense(num_classes, activation='softmax')(x)
    model = tf.keras.Model(inputs, outputs)
    return model
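
The `[batch, seq_len, hidden_dim]` shape noted in the build function follows from how ViT tokenizes an image. As a rough sketch, assuming the 16×16 patch size implied by the checkpoint name and one extra classification token, a 224×224 image yields 197 tokens. `GlobalAveragePooling1D` then averages over that token axis. Both helpers below are hypothetical illustrations in plain Python, not the library's implementation.

```python
def vit_token_count(image_size=224, patch_size=16, cls_token=True):
    """Number of tokens a ViT produces for a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side + (1 if cls_token else 0)


def mean_pool(tokens):
    """Average a [seq_len][hidden_dim] nested list over the sequence axis."""
    seq_len = len(tokens)
    return [sum(column) / seq_len for column in zip(*tokens)]


seq_len = vit_token_count()  # 14 * 14 patches + 1 classification token
pooled = mean_pool([[1.0, 2.0], [3.0, 4.0]])  # toy 2-token, 2-dim example
print(seq_len, pooled)
```

Averaging over all tokens is one design choice; taking only the classification token is a common alternative that the notebook does not use.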

# 7) Train the ViT Model (Single Experiment Example)
Use `build_vit_model` to create one ViT model, then train it and log metrics to W&B as a single-run example.

In [None]:
# Example single-run training
wandb.init(
    project="ViT_vs_CNN_flowers",
    name="ViT_single_run",
    reinit=True
)

# Build a ViT model with some chosen hyperparameters
vit_model = build_vit_model(
    num_classes=num_classes,
    num_dense_layers=2,
    dense_units=128,
    dropout_rate=0.3,
    use_batchnorm=True,
    base_trainable=False
)

vit_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

vit_model.fit(
    train_batches,
    validation_data=val_batches,
    epochs=3,
    callbacks=[WandbCallback()]
)

# Evaluate on test set
test_loss, test_acc = vit_model.evaluate(test_batches)
wandb.log({"test_loss": test_loss, "test_accuracy": test_acc})
print("ViT Test Accuracy:", test_acc)
wandb.finish()

# 8) W&B Sweep Configuration Example (sweep.yaml)
An example sweep configuration for exploring hyperparameters such as the number of Dense layers, the dropout rate, and whether to use batch normalization.

In [None]:
sweep_config = '''
method: bayes
metric:
  name: val_accuracy
  goal: maximize
parameters:
  num_dense_layers:
    values: [1, 2, 3]
  dense_units:
    values: [64, 128, 256]
  dropout_rate:
    values: [0.2, 0.3, 0.5]
  use_batchnorm:
    values: [true, false]
  base_trainable:
    values: [false, true]
  learning_rate:
    values: [1e-4, 1e-5]
  batch_size:
    values: [16, 32]
  epochs:
    values: [5, 10]
'''

print(sweep_config)

method: bayes
metric:
  name: val_accuracy
  goal: maximize
parameters:
  num_dense_layers:
    values: [1, 2, 3]
  dense_units:
    values: [64, 128, 256]
  dropout_rate:
    values: [0.2, 0.3, 0.5]
  use_batchnorm:
    values: [true, false]
  base_trainable:
    values: [false, true]
  learning_rate:
    values: [1e-4, 1e-5]
  batch_size:
    values: [16, 32]
  epochs:
    values: [5, 10]
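
For a sense of scale, the number of distinct combinations in this configuration can be counted directly. The sketch below mirrors the parameter lists above in a plain dictionary (a hand copy made for illustration, not something W&B requires) and multiplies their lengths. With `method: bayes` the agent samples from this space rather than enumerating it.

```python
from math import prod

# Hand-copied from the sweep configuration above, for counting only.
search_space = {
    "num_dense_layers": [1, 2, 3],
    "dense_units": [64, 128, 256],
    "dropout_rate": [0.2, 0.3, 0.5],
    "use_batchnorm": [True, False],
    "base_trainable": [False, True],
    "learning_rate": [1e-4, 1e-5],
    "batch_size": [16, 32],
    "epochs": [5, 10],
}

# An exhaustive grid would need one run per element of the Cartesian product.
num_combinations = prod(len(values) for values in search_space.values())
print(num_combinations)
```

An exhaustive grid over these values would need several hundred runs, which is why a sampling-based method is a reasonable fit here.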


# 9) Sweep Training Function
The sweep agent calls this function once per experiment; each call trains one hyperparameter combination and logs the results to W&B.

In [None]:
def train_sweep(config=None):
    with wandb.init(config=config):
        config = wandb.config

        # Re-batch the splits using this run's config.batch_size
        # (the training split is shuffled, as in the single-run pipeline)
        current_train_batches = train_ds.shuffle(1000).batch(config.batch_size).prefetch(tf.data.AUTOTUNE)
        current_val_batches = val_ds.batch(config.batch_size).prefetch(tf.data.AUTOTUNE)
        current_test_batches = test_ds.batch(config.batch_size).prefetch(tf.data.AUTOTUNE)

        # Build the model
        model = build_vit_model(
            num_classes=num_classes,
            num_dense_layers=config.num_dense_layers,
            dense_units=config.dense_units,
            dropout_rate=config.dropout_rate,
            use_batchnorm=config.use_batchnorm,
            base_trainable=config.base_trainable
        )

        # Cast defensively: PyYAML loads "1e-4" (no decimal point) as a string
        optimizer = tf.keras.optimizers.Adam(learning_rate=float(config.learning_rate))
        model.compile(
            optimizer=optimizer,
            loss='sparse_categorical_crossentropy',
            metrics=['accuracy']
        )

        model.fit(
            current_train_batches,
            validation_data=current_val_batches,
            epochs=config.epochs,
            callbacks=[WandbCallback()]
        )

        test_loss, test_acc = model.evaluate(current_test_batches)
        wandb.log({"test_loss": test_loss, "test_accuracy": test_acc})
        print("Test Accuracy:", test_acc)

# 10) Launch the Sweep
Use the code below (or the equivalent CLI commands) to create a sweep and run an agent, which then explores architecture and hyperparameter combinations automatically.

In [None]:
# (Note) Uncomment the lines below to actually run the sweep
# import yaml
# sweep_config_dict = yaml.safe_load(sweep_config)
# sweep_id = wandb.sweep(sweep_config_dict, project="ViT_vs_CNN_flowers")
# wandb.agent(sweep_id, function=train_sweep)

# Summary
This notebook trains both a traditional CNN and a pretrained ViT model, then uses W&B Sweeps to search architectures and hyperparameters automatically.