This notebook largely follows the official Hugging Face [fine-tuning guide](https://huggingface.co/docs/transformers/training).

#### References:
- https://pypi.org/project/datasets/
- https://huggingface.co/docs/transformers/training

# 0 Imports, Requirements, Etc.

In [1]:
# Select the device to use
import torch
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

print("Device: ", device)

Device:  cuda


In [2]:
# Install dependencies
!pip -q install datasets evaluate transformers wandb huggingface_hub

# Log in to Hugging Face and W&B to track model training.
!huggingface-cli login
!wandb login


    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.
    To log in, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible): 
Add token as git credential? (Y/n) n
Token is valid (permission: write

## 0.1 Track run results on W&B

In [3]:
import wandb

# Start a new wandb run to track this script
wandb.init(
    # Set the project where this run will be logged
    project="jiu-jitsu-deformable-detr",

    # Track hyperparameters and run metadata
    config={
        "learning_rate": 0.001,
        "architecture": "DeformableDetr",
        "dataset": "Harmony4D",
        "epochs": 10,
    }
)

[34m[1mwandb[0m: Currently logged in as: [33mmarcelo-ponce-ardon[0m ([33mjiu-jitsu-auto-scoring[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Using wandb-core as the SDK backend.  Please refer to https://wandb.me/wandb-core for more information.


# 1 Prepare Dataset

## 1.1 Download Harmony4D

In [4]:
from huggingface_hub import hf_hub_download
import zipfile
import os

REPO_ID = "Jyun-Ting/Harmony4D"
TRAIN_FILENAME = "train/01_hugging.zip"

train_path = hf_hub_download(repo_id=REPO_ID, filename=TRAIN_FILENAME, repo_type="dataset")
print(f"Downloaded train zip file to {train_path}")

01_hugging.zip:   0%|          | 0.00/2.26G [00:00<?, ?B/s]

Downloaded train zip file to /root/.cache/huggingface/hub/datasets--Jyun-Ting--Harmony4D/snapshots/3fedb23fd9d1a92541d98ccbce025c695bd752e4/train/01_hugging.zip


In [5]:
import os
import shutil
from google.colab import drive
drive.mount('/content/drive')

FOLDER_NAME = "datasets/"
drive_dir = "/content/drive/MyDrive/" + FOLDER_NAME
os.makedirs(drive_dir, exist_ok=True)

# Copy the zip to Drive only if it isn't already there.
# The copy lands directly in drive_dir, so check for the file's base name.
drive_copy = os.path.join(drive_dir, os.path.basename(TRAIN_FILENAME))
if not os.path.exists(drive_copy):
  shutil.copy(train_path, drive_copy)

Mounted at /content/drive


## 1.2 Convert to Dataset object

In [6]:
import datasets
from datasets import load_dataset

train_dataset = load_dataset('imagefolder', data_files=train_path, streaming=True)
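
Before relying on `imagefolder` to interpret the archive, it can help to check how the zip is laid out. The helper below is a minimal sketch, not part of the original notebook: `summarize_zip` is a hypothetical name, and it only uses the standard-library `zipfile` module to count file extensions and list a few member names.

```python
import zipfile
from collections import Counter
from pathlib import PurePosixPath


def summarize_zip(zip_path, max_examples=5):
    """Return (extension counts, first few member names) for a zip archive.

    Hypothetical helper for illustration; it reads only the archive index,
    so nothing is extracted to disk.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Directory entries end with "/"; keep only real files.
        names = [n for n in archive.namelist() if not n.endswith("/")]
    extensions = Counter(PurePosixPath(n).suffix.lower() for n in names)
    return extensions, names[:max_examples]


# Usage (train_path comes from the download cell in section 1.1):
# extensions, examples = summarize_zip(train_path)
# print(extensions)
# print(examples)
```

If the archive turns out to contain annotation files next to the images, they would likely need to be loaded separately, since this notebook only reads the `image` column.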

## 1.3 Make tiny datasets for testing

In [7]:
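# Optional: make tiny datasets for testing.
# train_dataset is streamed, so each split is an IterableDataset:
# use take()/skip() rather than select(), and index the 'train' split first.
#tiny_train_dataset = train_dataset['train'].take(100)
#tiny_eval_dataset = train_dataset['train'].skip(100).take(100)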

# 2 Fine-tuning a pre-trained Deformable DETR (D-DETR) model on Harmony4D

## 2.1 Select the model

In [8]:
from transformers import AutoImageProcessor, DeformableDetrForObjectDetection
# Load the model
# processor = AutoImageProcessor.from_pretrained("SenseTime/deformable-detr")
model = DeformableDetrForObjectDetection.from_pretrained("SenseTime/deformable-detr")

The cache for model files in Transformers v4.22.0 has been updated. Migrating your old cache. This is a one-time only operation. You can interrupt this and resume the migration later on by calling `transformers.utils.move_cache()`.


0it [00:00, ?it/s]

config.json:   0%|          | 0.00/4.54k [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/161M [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/102M [00:00<?, ?B/s]

Some weights of the model checkpoint at SenseTime/deformable-detr were not used when initializing DeformableDetrForObjectDetection: ['model.backbone.conv_encoder.model.layer1.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer2.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer3.0.downsample.1.num_batches_tracked', 'model.backbone.conv_encoder.model.layer4.0.downsample.1.num_batches_tracked']
- This IS expected if you are initializing DeformableDetrForObjectDetection from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DeformableDetrForObjectDetection from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


## 2.2 Training Hyperparameters

In [13]:
from transformers import TrainingArguments

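# TODO: Add more arguments here if needed (e.g. learning_rate, which is not set
# here and so falls back to the default rather than the 0.001 logged to W&B)

training_args = TrainingArguments(
    output_dir="./deformable-detr-harmony4d",
    eval_strategy="epoch",
    remove_unused_columns=False,
    save_strategy="epoch",
    push_to_hub=True, # Push the model to the Hub whenever it is saved
    max_steps=500, # Needed for a streamed dataset of unknown length; overrides num_train_epochs
    num_train_epochs=10, # Ignored while max_steps is positive
    )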
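
Because `max_steps` takes precedence over `num_train_epochs`, the amount of data seen during training is set by the step count and the effective batch size. The arithmetic can be sketched as below; `samples_seen` is a hypothetical helper, and the defaults mirror the `TrainingArguments` defaults (`per_device_train_batch_size=8`, `gradient_accumulation_steps=1`) under the assumption of a single device.

```python
def samples_seen(max_steps, per_device_batch_size=8, num_devices=1, grad_accum_steps=1):
    """Number of training examples consumed after max_steps optimizer steps.

    Hypothetical helper: each optimizer step processes
    per_device_batch_size * num_devices * grad_accum_steps examples.
    """
    return max_steps * per_device_batch_size * num_devices * grad_accum_steps


# With the settings in this notebook (500 steps, default batch size, one GPU):
print(samples_seen(500))  # 500 * 8 = 4000 examples
```

Comparing this number with the size of the training split gives a rough idea of how many passes over the data the run actually makes.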

## 2.3 Evaluate

In [10]:
import numpy as np
import evaluate

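# NOTE: accuracy is a classification metric and serves only as a placeholder here;
# object detection is usually scored with mean average precision (mAP).
# compute_metrics is not passed to the Trainer in section 2.4.
metric = evaluate.load("accuracy")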

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)

Downloading builder script:   0%|          | 0.00/4.20k [00:00<?, ?B/s]
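
Detection metrics such as mAP are built on the intersection over union (IoU) between predicted and ground-truth boxes. As a minimal sketch of that building block (not the notebook's evaluation code), the hypothetical `box_iou` below assumes boxes in `(x_min, y_min, x_max, y_max)` format.

```python
def box_iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax0, ay0, ax1, ay1 = box_a
    bx0, by0, bx1, by1 = box_b

    # Width and height of the overlap; clamped to zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h

    area_a = (ax1 - ax0) * (ay1 - ay0)
    area_b = (bx1 - bx0) * (by1 - by0)
    union = area_a + area_b - inter

    return inter / union if union > 0 else 0.0


# Two 2x2 boxes overlapping in a 1x1 square: IoU = 1 / (4 + 4 - 1)
print(box_iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

A prediction is typically counted as correct when its IoU with a ground-truth box exceeds a chosen threshold; the threshold itself is a design choice and is not specified in this notebook.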

## 2.4 Trainer

In [14]:
from transformers import Trainer, default_data_collator, AutoImageProcessor

processor = AutoImageProcessor.from_pretrained("SenseTime/deformable-detr")

def preprocess_images(examples):
  images = [image.convert("RGB") for image in examples["image"]]
  inputs = processor(images=images, return_tensors="pt")
  return {"pixel_values": inputs.pixel_values}

processed_train_dataset = train_dataset.map(preprocess_images, batched=True, remove_columns=["image"])


trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=processed_train_dataset['train'],
    eval_dataset=processed_train_dataset['train'], # NOTE: evaluates on the training split; no separate eval split is loaded
    data_collator=default_data_collator,
    # compute_metrics=compute_metrics,
)

In [15]:
trainer.train()



AttributeError: 'dict' object has no attribute 'convert'
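
The error suggests that the streamed examples reach `preprocess_images` undecoded, i.e. each `image` is a dict rather than a PIL image. A possible workaround, sketched below, is to decode such dicts before calling `convert`. The helper name `to_pil_rgb` is hypothetical, and the `{"bytes": ..., "path": ...}` layout is an assumption about how the undecoded image is stored; it should be checked against an actual example from the stream.

```python
import io

from PIL import Image


def to_pil_rgb(image):
    """Return an RGB PIL image from a PIL image or an undecoded image dict.

    Assumes undecoded images look like {"bytes": ..., "path": ...}.
    """
    if isinstance(image, Image.Image):
        return image.convert("RGB")
    if isinstance(image, dict):
        if image.get("bytes") is not None:
            # Decode from the raw encoded bytes held in memory.
            return Image.open(io.BytesIO(image["bytes"])).convert("RGB")
        if image.get("path"):
            # Fall back to reading the file from disk.
            return Image.open(image["path"]).convert("RGB")
    raise TypeError(f"Unsupported image type: {type(image)!r}")


# Possible use inside preprocess_images:
# images = [to_pil_rgb(image) for image in examples["image"]]
```

This only addresses the decoding error. The model also needs bounding-box annotations passed as `labels` to compute a training loss, and the dataset as loaded here provides only images.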

# 3 Finish

In [None]:
# Push the model to the Hugging Face Hub
trainer.push_to_hub()

In [None]:
# End W&B session

import wandb

wandb.finish()