# AI CUP 2024 Spring

Official hands-on workshop for [AI CUP 2024 Spring - GenAI UAV](https://tbrain.trendmicro.com.tw/Competitions/Details/34).

- Team ID: TEAM_5333
- Place: 18 (Public), 13 (Private)
- Members:
    - Chen-Yang Yu, NCKU (Leader)
    - Yuan-Chun Chiang, NTU
    - Yu-Hao Chiang, NCKU 
    - Xin-Xian Lin, NCKU

# Roadmap

We will cover the following topics in this workshop to reproduce our best result.

0. Install Environment
1. Data Preparation
2. Train Model (Optional, pre-trained weights are provided)
3. Test Model (Inference)
4. Transform the results into AI CUP format

# 0. Install Environment

In [None]:
%pip install -q -r requirements.txt

## Clone pytorch-CycleGAN-and-pix2pix

In [None]:
import os
import shutil

In [None]:
# check if `pytorch-CycleGAN-and-pix2pix` is already cloned
if not os.path.exists('pytorch-CycleGAN-and-pix2pix'):
    !git clone https://github.com/junyanz/pytorch-CycleGAN-and-pix2pix
else:
    print('pytorch-CycleGAN-and-pix2pix is already cloned.')

In [None]:
!pwd

# 1. Data Preparation

We will take the following steps to prepare the data.

- 1.1 Download the dataset
- 1.2 Prepare Raw Training Data
- 1.3 Prepare Raw Testing Data
- 1.4 Prepare 2 domain-specific datasets (Road and River)
- 1.5 Copy the dataset to the pytorch-CycleGAN-and-pix2pix dataset folder

In [None]:
import os
import shutil

In [None]:
# change directory to `./dataset`
os.chdir('./dataset')

In [None]:
!pwd

## Download Dataset
The script will download the dataset if you haven't downloaded it yet.

In [None]:
!bash ../scripts/download_official_dataset.sh

## Prepare Raw Training Data
The training dataset contains two subfolders:
- label_img: contains the draft images
- img: contains the corresponding ground truth images (drone image)

In [None]:
import os
import shutil

In [None]:
import zipfile

train_dataset_zip = '34_Competition 1_Training dataset.zip'

# unzip the train_dataset_zip
with zipfile.ZipFile(train_dataset_zip, 'r') as zip_ref:
    zip_ref.extractall()

In [None]:
train_dir = 'training_dataset'
# rename the extracted folder
os.rename('Training dataset', train_dir)

In [None]:
train_dir = './training_dataset'
print(os.listdir(train_dir))

### Rename the subfolders as trainA and trainB
Map the folder names to the model inputs:
- `training_dataset/label_img` -> `training_dataset/trainA`
- `training_dataset/img` -> `training_dataset/trainB`

In [None]:
# rename the subfolders
os.rename(train_dir + '/label_img', train_dir + '/trainA')
os.rename(train_dir + '/img', train_dir + '/trainB')
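
Since pix2pix trains on paired images, every draft in `trainA` should have a matching ground truth image in `trainB`. Below is a minimal sketch of a check that could be run before aligning. `unpaired_stems` is a hypothetical helper, not part of the official scripts, and it compares file names without their extensions on the assumption that the two folders may use different image formats.

```python
import os

def unpaired_stems(names_a, names_b):
    """Return the file stems that appear in only one of the two name lists."""
    stems_a = {os.path.splitext(name)[0] for name in names_a}
    stems_b = {os.path.splitext(name)[0] for name in names_b}
    # Symmetric difference: stems missing from either side
    return sorted(stems_a ^ stems_b)

# Example usage with the folders created above (assumed paths):
# print(unpaired_stems(os.listdir('training_dataset/trainA'),
#                      os.listdir('training_dataset/trainB')))
```

An empty list means every draft has a counterpart.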

### Align trainA and trainB

In [None]:
!python align_dataset.py --source_folder training_dataset

## Prepare Raw Public and Private Testing Data
Each testing zip file contains only a `label_img` folder, so we extract both archives into a parent folder, `testing_dataset`, which then holds a single merged `label_img` folder.

In [None]:
import zipfile

public_testing_dataset_zip = '34_Competition 1_public testing dataset.zip'
private_testing_dataset_zip = '34_Competition 1_Private Test Dataset.zip'
test_dir = 'testing_dataset'

# unzip the public testing dataset
with zipfile.ZipFile(public_testing_dataset_zip, 'r') as zip_ref:
    zip_ref.extractall(test_dir)

# unzip the private testing dataset
with zipfile.ZipFile(private_testing_dataset_zip, 'r') as zip_ref:
    zip_ref.extractall(test_dir)

In [None]:
test_dir = 'testing_dataset'
print(os.listdir(test_dir))
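
To illustrate why extracting both archives into the same folder merges their contents, here is a small self-contained sketch. It builds two tiny placeholder archives (the archive names and empty files are made up for the demo) that share a top-level `label_img` folder, then extracts them into one destination. `extract_into` and `demo_merge` are hypothetical helper names.

```python
import os
import tempfile
import zipfile

def extract_into(zip_paths, dest):
    """Extract every archive in zip_paths into the same destination folder."""
    for path in zip_paths:
        with zipfile.ZipFile(path, 'r') as zip_ref:
            zip_ref.extractall(dest)

def demo_merge():
    """Build two placeholder archives sharing a top-level folder and merge them."""
    with tempfile.TemporaryDirectory() as tmp:
        specs = {'public.zip': 'label_img/PUB_RI_1000000.png',
                 'private.zip': 'label_img/PRI_RI_1000000.png'}
        paths = []
        for zip_name, member in specs.items():
            path = os.path.join(tmp, zip_name)
            with zipfile.ZipFile(path, 'w') as zf:
                zf.writestr(member, b'')  # empty placeholder file
            paths.append(path)
        dest = os.path.join(tmp, 'testing_dataset')
        extract_into(paths, dest)
        return sorted(os.listdir(os.path.join(dest, 'label_img')))

print(demo_merge())
```

Both placeholder files end up side by side in the same `label_img` folder.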

### Rename the subfolder as testA

Since the ground truth images are not provided, we only need to rename the `label_img` folder to `testA`.

Map the folder name to the model input:
- `testing_dataset/label_img` -> `testing_dataset/testA`

In [None]:
os.rename(test_dir + '/label_img', test_dir + '/testA')

## Prepare 2 domain-specific datasets (Road and River)

We are going to create 4 folders in this section:
```
dataset
├── test_ROAD
│   └── testA
├── test_RIVER
│   └── testA
├── train_ROAD
│   ├── train
│   ├── trainA
│   └── trainB
└── train_RIVER
    ├── train
    ├── trainA
    └── trainB
```

### Extract the Images from Raw Training Data
Each of the `trainA` and `trainB` subfolders contains 2 types of images:
- River images (e.g. TRA_RI_1000000.png)
- Road images (e.g. TRA_RO_1000000.png)

So we need to create 2 folders:
- `train_RIVER` (contains `trainA` and `trainB` subfolders, each holding river images)
- `train_ROAD` (contains `trainA` and `trainB` subfolders, each holding road images)

In [None]:
train_dir = 'training_dataset'
train_river_dir = 'train_RIVER'
train_road_dir = 'train_ROAD'

# create the folders
if not os.path.exists(train_river_dir):
	os.makedirs(train_river_dir)
if not os.path.exists(train_road_dir):
	os.makedirs(train_road_dir)

for subdir in os.listdir(train_dir):
	# create the subfolders if not exist
	if not os.path.exists(train_river_dir + '/' + subdir):
		os.makedirs(train_river_dir + '/' + subdir)
	if not os.path.exists(train_road_dir + '/' + subdir):
		os.makedirs(train_road_dir + '/' + subdir)
	
	# copy each file into the folder of its domain
	for file in os.listdir(train_dir + '/' + subdir):
		if '_RI_' in file:
			shutil.copy(train_dir + '/' + subdir + '/' + file, train_river_dir + '/' + subdir + '/' + file)
		elif '_RO_' in file:
			shutil.copy(train_dir + '/' + subdir + '/' + file, train_road_dir + '/' + subdir + '/' + file)
		else:
			print('ERROR: file name not recognized: ' + file)
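
The routing rule used above depends only on the `_RI_` and `_RO_` markers in the file name. As a sketch, it could be factored into a small function that is easy to test on its own. `domain_of` is a hypothetical helper name.

```python
def domain_of(file_name):
    """Return 'RIVER' or 'ROAD' based on the file name, or None if unrecognized."""
    if '_RI_' in file_name:
        return 'RIVER'
    if '_RO_' in file_name:
        return 'ROAD'
    return None

for name in ['TRA_RI_1000000.png', 'TRA_RO_1000000.png', 'notes.txt']:
    print(name, '->', domain_of(name))
```

Because the rule ignores the prefix, the same function also covers the `PUB_` and `PRI_` testing files.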

### Extract the Images from Raw Testing Data
The `testing_dataset/testA` subfolder contains 2 types of images:
- River images (e.g. PUB_RI_1000000.png or PRI_RI_1000000.png)
- Road images (e.g. PUB_RO_1000459.png or PRI_RO_1000459.png)

So we need to create 2 folders:
- test_RIVER (contains a `testA` subfolder holding only river images)
- test_ROAD (contains a `testA` subfolder holding only road images)

In [None]:
import os
import shutil

test_dir = 'testing_dataset'
test_river_dir = 'test_RIVER'
test_road_dir = 'test_ROAD'

# create the folders
if not os.path.exists(test_river_dir):
	os.makedirs(test_river_dir)
if not os.path.exists(test_road_dir):
	os.makedirs(test_road_dir)
 

for subdir in os.listdir(test_dir):
	# create the subfolders if not exist
	if not os.path.exists(test_river_dir + '/' + subdir):
		os.makedirs(test_river_dir + '/' + subdir)
	if not os.path.exists(test_road_dir + '/' + subdir):
		os.makedirs(test_road_dir + '/' + subdir)
	
	# copy each file into the folder of its domain
	for file in os.listdir(test_dir + '/' + subdir):
		if '_RI_' in file:
			shutil.copy(test_dir + '/' + subdir + '/' + file, test_river_dir + '/' + subdir + '/' + file)
		elif '_RO_' in file:
			shutil.copy(test_dir + '/' + subdir + '/' + file, test_road_dir + '/' + subdir + '/' + file)
		else:
			print('ERROR: file name not recognized: ' + file)

## Copy the dataset to the pytorch-CycleGAN-and-pix2pix dataset folder

### Copy Train

Copy the `train_RIVER` and `train_ROAD` folders to `../pytorch-CycleGAN-and-pix2pix/datasets`

In [None]:
# copy the folder to target folder
target_dir = '../pytorch-CycleGAN-and-pix2pix/datasets'
shutil.copytree(train_river_dir, target_dir + '/' + train_river_dir)
shutil.copytree(train_road_dir, target_dir + '/' + train_road_dir)

### Copy Test

Copy the `test_RIVER` and `test_ROAD` folders to `../pytorch-CycleGAN-and-pix2pix/datasets`

In [None]:
# copy the folder to target folder
target_dir = '../pytorch-CycleGAN-and-pix2pix/datasets'
shutil.copytree(test_river_dir, target_dir + '/' + test_river_dir)
shutil.copytree(test_road_dir, target_dir + '/' + test_road_dir)

In [None]:
# reset the working directory
os.chdir('..')
!pwd

# 2. Train Model (Optional)

In [None]:
import os
import shutil

In [None]:
try:
    os.chdir('./pytorch-CycleGAN-and-pix2pix')
except FileNotFoundError:
    print("Already in the correct directory")

In [None]:
!pwd

## Train 2 domain-specific models
- one for RIVER
- one for ROAD

## Datasets

Put the dataset in the `pytorch-CycleGAN-and-pix2pix/datasets` folder

(We finished this part in the previous section.)

Each dataset should have the following directory structure:

```
datasets
├── train_ROAD
│   ├── train
│   ├── trainA
│   └── trainB
└── train_RIVER
    ├── train
    ├── trainA
    └── trainB
```

## Training Arguments

- `--n_epochs`: 200 (default 100)
- `--n_epochs_decay`: 200 (default 100)
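
To make the effect of these two arguments concrete, here is a minimal sketch of the learning rate multiplier. It assumes the repository's default linear learning rate policy as we understand it: the rate stays constant for `n_epochs` epochs and then decays linearly towards zero over `n_epochs_decay` epochs. `lr_multiplier` is a hypothetical name, and the formula is our reading of the scheduler, so please check it against the repository before relying on it.

```python
def lr_multiplier(epoch, n_epochs=200, n_epochs_decay=200, epoch_count=1):
    """Factor applied to the initial learning rate at a 0-based epoch index."""
    decay_progress = max(0, epoch + epoch_count - n_epochs)
    return 1.0 - decay_progress / float(n_epochs_decay + 1)

# Constant phase, start of decay, and the final epoch of a 400-epoch run
for epoch in (0, 199, 200, 399):
    print(epoch, round(lr_multiplier(epoch), 4))
```

With both values set to 200, the first half of the 400 epochs runs at the full learning rate and the second half gradually lowers it.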

In [None]:
# n_epochs and n_epochs_decay add up to 400 epochs in total for each model
# note: `--continue_train` resumes from an existing checkpoint; remove it when training from scratch
! python train.py --dataroot ./datasets/train_ROAD --name ROAD_pix2pix --model pix2pix --direction AtoB --n_epochs 200 --n_epochs_decay 200 --display_id 0 --continue_train
! python train.py --dataroot ./datasets/train_RIVER --name RIVER_pix2pix --model pix2pix --direction AtoB --n_epochs 200 --n_epochs_decay 200 --display_id 0 --continue_train

# use nohup to run the training in the background
# ! nohup python train.py ... > road.log &	
# ! nohup python train.py ... > river.log &

## Training Results
After training, you can find the results in:
- `pytorch-CycleGAN-and-pix2pix/checkpoints/ROAD_pix2pix` folder
- `pytorch-CycleGAN-and-pix2pix/checkpoints/RIVER_pix2pix` folder

Each folder contains:
- `web/index.html` for the visualization of the results
- `latest_net_G.pth` for the latest model

# 3. Test Model (Inference)

In [None]:
import os
import shutil

In [None]:
try:
    os.chdir('./pytorch-CycleGAN-and-pix2pix')
except FileNotFoundError:
    print("Already in the correct directory")

In [None]:
!pwd

## Datasets

Put the dataset in the `pytorch-CycleGAN-and-pix2pix/datasets` folder

(We finished this part in the previous section.)

Each dataset should have the following directory structure:

```
datasets
├── test_ROAD
│   └── testA
└── test_RIVER
    └── testA
```

## Check the weights
After training, you should have a folder with the weights of the model. 

It should be located in the `pytorch-CycleGAN-and-pix2pix/checkpoints` folder.

For example, in our previous training, we have the following weights:
- `checkpoints/ROAD_pix2pix/latest_net_G.pth`
- `checkpoints/RIVER_pix2pix/latest_net_G.pth`
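
As a sketch, the presence of the generator weights can also be checked file by file rather than only by folder. `expected_weight_paths` and `missing_weights` are hypothetical helpers. The file name `latest_net_G.pth` is the one listed above.

```python
import os

def expected_weight_paths(model_names, checkpoints_dir='./checkpoints'):
    """Build the expected generator weight path for each model name."""
    return [os.path.join(checkpoints_dir, name, 'latest_net_G.pth')
            for name in model_names]

def missing_weights(model_names, checkpoints_dir='./checkpoints'):
    """Return the expected weight files that do not exist on disk."""
    return [path for path in expected_weight_paths(model_names, checkpoints_dir)
            if not os.path.isfile(path)]

print(missing_weights(['ROAD_pix2pix', 'RIVER_pix2pix']))
```

An empty list means both generators are ready for inference. Checking the file also catches the case where a checkpoint folder exists but training never saved a model.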

In [None]:
!ls checkpoints/

In [None]:
# if there is no pre-trained model, use our pre-trained model
if not os.path.exists('./checkpoints/ROAD_pix2pix') or not os.path.exists('./checkpoints/RIVER_pix2pix'):
    !bash ../scripts/download_pretrained_road_river_weight.sh

## Load testing data folder

In [None]:
test_road_dir = './datasets/test_ROAD/testA'
test_river_dir = './datasets/test_RIVER/testA'

road_model = 'ROAD_pix2pix'
river_model = 'RIVER_pix2pix'

## Inference with domain-specific models in single mode

Use the trained models to run inference on the `testA` images.

This converts the `test_RIVER/testA` and `test_ROAD/testA` images into domain B images.

- `--dataroot`: the folder where the testing data is located
- `--name`: the name of the model
- `--model`: the model mode
- `--netG`: the backbone architecture of the generator
- `--direction`: the direction of the model
- `--dataset_mode`: `single` (so we don't need to prepare paired data)
- `--num_test`: the number of testing data (default is 50)
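
The arguments listed above can be assembled programmatically, which keeps the two domain runs consistent. This is a minimal sketch. `build_test_command` is a hypothetical helper, and the flag values are simply the ones used in this notebook.

```python
import shlex

def build_test_command(dataroot, name, num_test=10000):
    """Assemble the test.py command line for one domain-specific model."""
    args = ['python', 'test.py',
            '--dataroot', dataroot,
            '--name', name,
            '--model', 'test',
            '--netG', 'unet_256',
            '--direction', 'AtoB',
            '--dataset_mode', 'single',
            '--norm', 'batch',
            '--num_test', str(num_test)]
    # shlex.join quotes any argument that contains shell-sensitive characters
    return shlex.join(args)

print(build_test_command('./datasets/test_ROAD/testA', 'ROAD_pix2pix'))
print(build_test_command('./datasets/test_RIVER/testA', 'RIVER_pix2pix'))
```

Setting `--num_test` well above the dataset size ensures every test image is processed instead of only the default 50.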

In [None]:
# test the 2 dataset in single mode
! python test.py --dataroot {test_road_dir} --name {road_model} --model test --netG unet_256 --direction AtoB --dataset_mode single --norm batch --num_test 10000
! python test.py --dataroot {test_river_dir} --name {river_model} --model test --netG unet_256 --direction AtoB --dataset_mode single --norm batch --num_test 10000

# 4. Transform the results into AI CUP format



In [None]:
import os
import shutil

The results are stored in 
- `./results/ROAD_pix2pix/test_latest/images/`
- `./results/RIVER_pix2pix/test_latest/images/`

And there are 2 types of results in each folder:
- `{Prefix}_real` (domainA)
- `{Prefix}_fake` (domainB)

1. Store each `{Prefix}_fake.png` as `{Prefix}.jpg` in `./results/{domain_type}_pix2pix/test_latest/submission/`.

2. Resize the images to 428x240 (width x height) to match the AI CUP format.

3. Store the results of `ROAD` and `RIVER` in the `ROAD_RIVER_combined` folder.

4. Finally, zip the `./results/ROAD_RIVER_combined/submission/` folder and submit it to AI CUP.

❗️For resizing, we use `INTER_CUBIC` to preserve the quality of the images, since we are enlarging them.
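
The renaming rule in step 1 can be sketched as a small function. `submission_name` is a hypothetical helper. It matches on the full `_fake.png` suffix rather than on the substring `fake`, which is slightly stricter than the cells below.

```python
def submission_name(image_name):
    """Return the submission file name for a generated image, or None otherwise."""
    suffix = '_fake.png'
    if not image_name.endswith(suffix):
        return None  # e.g. the `_real` input images are skipped
    return image_name[:-len(suffix)] + '.jpg'

print(submission_name('PUB_RI_1000000_fake.png'))
print(submission_name('PUB_RI_1000000_real.png'))
```

Note that copying a file under a `.jpg` name does not re-encode it. In this notebook the actual JPEG encoding happens in the resize step, where `cv2.imwrite` picks the format from the file extension.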

### ROAD_pix2pix

In [None]:
# ROAD_pix2pix
# store the fake images to the `./results/ROAD_pix2pix/test_latest/submission/` folder
import os
import shutil

source_folder = './results/ROAD_pix2pix/test_latest/images'
target_folder = './results/ROAD_pix2pix/test_latest/submission'

if not os.path.exists(target_folder):
    os.makedirs(target_folder)

for image_name in os.listdir(source_folder):
    if 'fake' in image_name:
        new_name = image_name.replace('_fake.png', '.jpg')
        shutil.copy(os.path.join(source_folder, image_name), os.path.join(target_folder, new_name))

In [None]:
# resize the image as 428x240
import os
import cv2

for image_name in os.listdir(target_folder):
    img = cv2.imread(os.path.join(target_folder, image_name))
    img = cv2.resize(img, (428, 240), interpolation=cv2.INTER_CUBIC)
    cv2.imwrite(os.path.join(target_folder, image_name), img)
print("Finished resizing images")
print(f"Size: {len(os.listdir(target_folder))}")
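
`cv2.resize` takes the target size as `(width, height)`, which is why the call above passes `(428, 240)`. As a quick sanity check of the scaling involved, the sketch below computes the per-axis scale factors. It assumes the generator outputs 256x256 images, which we believe is the repository default but may differ for your run. `scale_factors` is a hypothetical helper.

```python
def scale_factors(src_size, dst_size=(428, 240)):
    """Return (width_factor, height_factor) for resizing src_size to dst_size."""
    src_w, src_h = src_size
    dst_w, dst_h = dst_size
    return dst_w / src_w, dst_h / src_h

fx, fy = scale_factors((256, 256))
print(fx, fy)  # 1.671875 0.9375
```

Under this assumption the width is enlarged by about 1.67x while the height shrinks slightly, so the resize also changes the aspect ratio back to that of the competition images.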

### RIVER_pix2pix

In [None]:
# RIVER_pix2pix
# store the fake images to the `./results/RIVER_pix2pix/test_latest/submission/` folder
import os
import shutil

source_folder = './results/RIVER_pix2pix/test_latest/images'
target_folder = './results/RIVER_pix2pix/test_latest/submission'

if not os.path.exists(target_folder):
    os.makedirs(target_folder)

for image_name in os.listdir(source_folder):
    if 'fake' in image_name:
        new_name = image_name.replace('_fake.png', '.jpg')
        shutil.copy(os.path.join(source_folder, image_name), os.path.join(target_folder, new_name))

In [None]:
# resize the image as 428x240
import os
import cv2

for image_name in os.listdir(target_folder):
    img = cv2.imread(os.path.join(target_folder, image_name))
    img = cv2.resize(img, (428, 240), interpolation=cv2.INTER_CUBIC)
    cv2.imwrite(os.path.join(target_folder, image_name), img)
print("Finished resizing images")
print(f"Size: {len(os.listdir(target_folder))}")

## Combine the ROAD and RIVER submission

Combine the ROAD and RIVER submission into the `./results/ROAD_RIVER_combined/submission` folder

In [None]:
source_road_dir = './results/ROAD_pix2pix/test_latest/submission'
source_river_dir = './results/RIVER_pix2pix/test_latest/submission'

target_dir = './results/ROAD_RIVER_combined/submission'

if not os.path.exists(target_dir):
    os.makedirs(target_dir)

for image_name in os.listdir(source_road_dir):
    shutil.copy(os.path.join(source_road_dir, image_name), os.path.join(target_dir, image_name))
for image_name in os.listdir(source_river_dir):
    shutil.copy(os.path.join(source_river_dir, image_name), os.path.join(target_dir, image_name))

In [None]:
print(f"Size: {len(os.listdir(target_dir))}")
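
Because both submissions are copied into a single folder, a file name shared by the two sources would be silently overwritten. The `_RI_` and `_RO_` markers make this unlikely, but a check is cheap. The sketch below uses a hypothetical helper, `name_collisions`.

```python
def name_collisions(road_names, river_names):
    """Return the file names that appear in both submissions."""
    return sorted(set(road_names) & set(river_names))

def expected_combined_size(road_names, river_names):
    """Number of files the combined folder should hold after copying."""
    return len(set(road_names) | set(river_names))

# Example usage with the folders from the cell above:
# road = os.listdir(source_road_dir)
# river = os.listdir(source_river_dir)
# print(name_collisions(road, river), expected_combined_size(road, river))
```

If there are no collisions, the printed size of the combined folder should equal the sum of the two individual sizes.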

In [None]:
# zip the fake images
shutil.make_archive(target_dir, 'zip', target_dir)
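
For readers unfamiliar with `shutil.make_archive`: the first argument is the archive path without its extension, and the third is the folder whose contents are archived. The self-contained sketch below reproduces the call on a temporary folder with one placeholder file. `demo_archive` and the placeholder file are made up for the demo.

```python
import os
import shutil
import tempfile
import zipfile

def demo_archive():
    """Zip a placeholder submission folder and report the archive contents."""
    with tempfile.TemporaryDirectory() as tmp:
        folder = os.path.join(tmp, 'submission')
        os.makedirs(folder)
        # Empty placeholder standing in for a generated image
        open(os.path.join(folder, 'PUB_RI_1000000.jpg'), 'wb').close()
        archive_path = shutil.make_archive(folder, 'zip', folder)
        with zipfile.ZipFile(archive_path) as zf:
            return os.path.basename(archive_path), zf.namelist()

print(demo_archive())
```

The images sit at the root of the archive, and the zip file is written next to the `submission` folder rather than inside it.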

### Submit the results to AI CUP
You should now have the `submission.zip` file in the `./results/ROAD_RIVER_combined` folder.

Submit the zip file to AI CUP.