<a href="https://colab.research.google.com/github/iteba15/Project-Sote/blob/main/Fine_Tuning(ViT).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Fine-Tuning Vision Transformer for HCC and PAR Diagnosis
In this notebook, we will fine-tune a pre-trained Vision Transformer (ViT) model for the classification of Hepatocellular Carcinoma (HCC) and Primary Aldosteronism (PAR) using a combination of real and synthetic ultrasound images.

The code relies on PyTorch and the Hugging Face Transformers and Datasets libraries. TensorFlow is installed in Step 1 as well, although none of the cells below call it.

Step-by-Step Guide
# Step 1: Setup Environment
First, install the required libraries. Run the following cell to install PyTorch, torchvision, TensorFlow, Hugging Face Transformers and Datasets, scikit-learn, and Pillow.

In [1]:
pip install torch torchvision transformers datasets tensorflow scikit-learn pillow

In [2]:
import torch
from torch import nn
# Newer transformers releases deprecate ViTFeatureExtractor in favor of ViTImageProcessor.
from transformers import ViTForImageClassification, ViTFeatureExtractor
from transformers import TrainingArguments, Trainer
from datasets import load_dataset, Dataset, DatasetDict
from sklearn.model_selection import train_test_split
import numpy as np
from PIL import Image
import os

# Step 2: Load and Pre-process Data
**Loading our Dataset**

We will load our dataset of real and synthetic ultrasound images. Our data is organized in the following folder structure:

*   data/hcc/real
*   data/hcc/synthetic

*   data/par/real
*   data/par/synthetic


The following function will load the images and their corresponding labels.

In [None]:
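def load_images_from_folder(folder):
    """Load every image file in `folder` as an RGB array and label it by class."""
    images = []
    labels = []
    # The label depends only on the folder path: 1 for HCC, 0 for PAR.
    label = 1 if 'hcc' in folder else 0
    for filename in sorted(os.listdir(folder)):  # sorted for a reproducible order
        img_path = os.path.join(folder, filename)
        if os.path.isfile(img_path):
            # Convert to RGB so every image has three channels, as ViT expects.
            with Image.open(img_path) as img:
                images.append(np.array(img.convert("RGB")))
            labels.append(label)
    return images, labels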

hcc_real_images, hcc_real_labels = load_images_from_folder('data/hcc/real')
hcc_synthetic_images, hcc_synthetic_labels = load_images_from_folder('data/hcc/synthetic')
par_real_images, par_real_labels = load_images_from_folder('data/par/real')
par_synthetic_images, par_synthetic_labels = load_images_from_folder('data/par/synthetic')

images = hcc_real_images + hcc_synthetic_images + par_real_images + par_synthetic_images
labels = hcc_real_labels + hcc_synthetic_labels + par_real_labels + par_synthetic_labels

dataset = Dataset.from_dict({"image": images, "label": labels})
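
The loader above assigns labels with a substring test on the folder path, which would also match unrelated directory names that happen to contain "hcc". As a minimal sketch of a stricter alternative, the helper below matches whole path components instead. The names `CLASS_TO_LABEL` and `label_from_folder` are hypothetical and not part of the original notebook.

```python
from pathlib import PurePosixPath

# Hypothetical mapping; the class names match the folder names used in this notebook.
CLASS_TO_LABEL = {"hcc": 1, "par": 0}

def label_from_folder(folder):
    """Return the label for the first path component that names a known class."""
    for part in PurePosixPath(folder).parts:
        if part.lower() in CLASS_TO_LABEL:
            return CLASS_TO_LABEL[part.lower()]
    raise ValueError(f"No known class name in path: {folder}")

print(label_from_folder("data/hcc/real"))       # 1
print(label_from_folder("data/par/synthetic"))  # 0
```

With this approach a path such as "scans_hcc_excluded/par/real" resolves to PAR, whereas a substring test would label it HCC.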


# Train-Validation Split
Split the dataset into training and validation sets.

In [None]:
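# Hold out 20% of the images for validation. The result is stored under a new
# name so it does not shadow the train_test_split function imported from sklearn.
split = dataset.train_test_split(test_size=0.2)
train_dataset = split['train']
val_dataset = split['test']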
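
A purely random split does not guarantee that HCC and PAR appear in the validation set in the same proportion as in the full dataset. The sketch below shows one way to build a stratified split by hand, using only the standard library. The function name `stratified_split_indices` and the toy label counts are assumptions for illustration, not values from this notebook.

```python
import random
from collections import defaultdict

def stratified_split_indices(labels, test_size=0.2, seed=0):
    """Split indices so each class contributes about `test_size` of its items to validation."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = round(len(indices) * test_size)
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)

# Toy labels: 10 HCC (1) and 30 PAR (0).
toy_labels = [1] * 10 + [0] * 30
train_idx, val_idx = stratified_split_indices(toy_labels)
print(len(train_idx), len(val_idx))         # 32 8
print(sum(toy_labels[i] for i in val_idx))  # 2 HCC images in validation
```

The two index lists could then be passed to `dataset.select(...)` to build the training and validation sets.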


# Step 3: Fine-tune Vision Transformer (ViT)
**Load Pre-trained ViT and Feature Extractor**

Load the pre-trained Vision Transformer model and its feature extractor from the Hugging Face Hub.

In [None]:
model_name = "google/vit-base-patch16-224-in21k"
feature_extractor = ViTFeatureExtractor.from_pretrained(model_name)
model = ViTForImageClassification.from_pretrained(model_name, num_labels=2)
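
The checkpoint name encodes the patch size (16) and the input resolution (224). As a small worked example, the snippet below computes how many tokens the transformer processes per image under those settings. The function name `vit_sequence_length` is hypothetical.

```python
def vit_sequence_length(image_size=224, patch_size=16):
    """Number of tokens ViT sees: one per patch plus the [CLS] token."""
    patches_per_side = image_size // patch_size
    return patches_per_side * patches_per_side + 1

print(vit_sequence_length())  # 14 * 14 + 1 = 197
```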


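# Define Image Transformation
Define the transformation function that preprocesses the images. Images need no tokenization; the feature extractor resizes and normalizes each image into the pixel_values tensor the model expects.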

In [None]:
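def transform(example_batch):
    # Rebuild PIL images from the stored arrays (which may come back as nested
    # lists rather than NumPy arrays) and turn them into pixel values.
    images = [Image.fromarray(np.asarray(image, dtype=np.uint8)) for image in example_batch['image']]
    inputs = feature_extractor(images, return_tensors='pt')
    inputs['labels'] = example_batch['label']
    return inputs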

# Transform dataset
train_dataset.set_transform(transform)
val_dataset.set_transform(transform)
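
To make the preprocessing less of a black box, here is a rough NumPy sketch of the rescaling and normalization that the feature extractor applies after resizing. It assumes a mean and standard deviation of 0.5 per channel, which is my understanding of this checkpoint's preprocessing config; the actual values can be checked via `feature_extractor.image_mean` and `feature_extractor.image_std`. The resizing step is omitted, and `to_pixel_values` is a hypothetical name.

```python
import numpy as np

def to_pixel_values(image, mean=0.5, std=0.5):
    """Rescale a uint8 HxWx3 image to [0, 1], normalize, and move channels first."""
    scaled = image.astype(np.float32) / 255.0
    normalized = (scaled - mean) / std
    return normalized.transpose(2, 0, 1)  # (H, W, C) -> (C, H, W)

# Stand-in for an ultrasound frame that is already 224x224.
fake_image = np.random.default_rng(0).integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
pixel_values = to_pixel_values(fake_image)
print(pixel_values.shape)  # (3, 224, 224)
```

With these assumed settings, every value ends up in the range [-1, 1].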


# Define Training Arguments

Set up the training arguments.

In [None]:
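training_args = TrainingArguments(
    output_dir="./results",
    evaluation_strategy="epoch",  # renamed to eval_strategy in recent transformers releases
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=5,
    weight_decay=0.01,
    logging_dir="./logs",
    # Keep the 'image' column: transform() needs it, and Trainer would otherwise drop it.
    remove_unused_columns=False,
)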
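
The batch size and epoch count above determine how many optimizer steps the run will take. The sketch below works through that arithmetic for a single device without gradient accumulation. The dataset sizes are hypothetical, since the notebook does not state how many images it uses.

```python
import math

def total_optimizer_steps(num_examples, batch_size=8, epochs=5):
    """Optimizer steps on a single device without gradient accumulation."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * epochs

# 800 and 805 are hypothetical training-set sizes, not figures from this notebook.
print(total_optimizer_steps(800))  # 100 steps per epoch * 5 epochs = 500
print(total_optimizer_steps(805))  # 101 steps per epoch * 5 epochs = 505
```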


# Initialize Trainer
Initialize the Trainer with the model, training arguments, and datasets.

In [None]:
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=val_dataset,
    tokenizer=feature_extractor,
)


# Fine-tune the Model
Train the model on the dataset.

In [None]:
trainer.train()
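
The training arguments do not set a learning-rate scheduler, so the Trainer falls back to its default. In the transformers versions I am aware of, that default is a linear decay from the initial rate to zero with no warmup. The sketch below illustrates that schedule under this assumption. The function name and the total step count are hypothetical.

```python
def linear_decay_lr(step, total_steps, base_lr=2e-5):
    """Learning rate after `step` optimizer steps under linear decay with no warmup."""
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps)

total = 500  # hypothetical total number of optimizer steps
for step in (0, 250, 500):
    print(step, linear_decay_lr(step, total))
```

The rate starts at 2e-5, is halved at the midpoint, and reaches zero at the final step.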


# Evaluate the Model
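Evaluate the model on the validation set. Because no compute_metrics function was passed to the Trainer, this reports the validation loss and runtime statistics only.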

In [None]:
trainer.evaluate()
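
To report classification metrics in addition to the loss, a function can be passed to the Trainer through its `compute_metrics` argument. The following is a minimal sketch, assuming the 1 = HCC, 0 = PAR labeling used when loading the data. The choice of metrics and the toy logits are illustrative only and are not results from this notebook.

```python
import numpy as np

def compute_metrics(eval_pred):
    """Accuracy, sensitivity, and specificity from (logits, labels), with HCC = 1."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    labels = np.asarray(labels)

    tp = int(np.sum((preds == 1) & (labels == 1)))
    tn = int(np.sum((preds == 0) & (labels == 0)))
    fp = int(np.sum((preds == 1) & (labels == 0)))
    fn = int(np.sum((preds == 0) & (labels == 1)))

    return {
        "accuracy": (tp + tn) / max(1, len(labels)),
        "sensitivity": tp / max(1, tp + fn),  # recall on HCC
        "specificity": tn / max(1, tn + fp),  # recall on PAR
    }

# Toy check with made-up logits for four images.
toy_logits = np.array([[0.1, 2.0], [1.5, 0.2], [0.3, 0.9], [2.2, 0.1]])
toy_labels = np.array([1, 0, 0, 0])
print(compute_metrics((toy_logits, toy_labels)))
```

Passing `compute_metrics=compute_metrics` when constructing the Trainer would add these values to the output of `trainer.evaluate()`.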


# Step 4: Save the Model
Save the fine-tuned model and feature extractor for future use.

In [None]:
model.save_pretrained("./fine_tuned_vit_model")
feature_extractor.save_pretrained("./fine_tuned_vit_model")
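
Once saved, the model can be reloaded with `ViTForImageClassification.from_pretrained("./fine_tuned_vit_model")`, and its output logits converted to a class name and probability. The sketch below covers only that last conversion step, using NumPy. The `ID2LABEL` mapping and the example logits are assumptions based on the labeling convention in Step 2.

```python
import numpy as np

# Label names follow the 1 = HCC, 0 = PAR convention used when loading the data.
ID2LABEL = {0: "PAR", 1: "HCC"}

def logits_to_prediction(logits):
    """Turn one row of logits into (label name, probability) via a stable softmax."""
    logits = np.asarray(logits, dtype=np.float64)
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    best = int(np.argmax(probs))
    return ID2LABEL[best], float(probs[best])

# Made-up logits for a single image.
print(logits_to_prediction([0.2, 1.8]))
```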
