Bachelor Thesis: De-occlusion of occluded vehicle images from drone video

An adventure through inpainting the hidden parts of vehicles with Deep Learning!
Explore the report »

View Mid-semester results · View End-semester results

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contact
  6. Acknowledgments
  7. Report
  8. Results sample

About The Project

Inpainting Screen Shot

In urban traffic analysis, the accurate detection of vehicles plays a crucial role in generating reliable statistics for various city management applications. However, occlusions occurring in densely populated city environments pose significant challenges to vehicle detection algorithms, leading to reduced detection rates and compromised data accuracy. To address this issue, we compared two recent models that leverage machine learning techniques for inpainting occluded vehicles, with the goal of improving the overall detection rate and enhancing the reliability of city statistics.

De-occlusion involves a two-step process:

  • Occlusion detection (segmentation)
  • Inpainting (image completion)

First, an occlusion detection algorithm is employed to identify regions within the traffic scene that contain occluded vehicles.
Second, an inpainting model, given the occluded region, completes the hidden fraction of the image.

For this project, we focused exclusively on the inpainting step.
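The overall flow can be illustrated with a short, purely conceptual sketch. The function names detect_occlusion_mask and inpaint are hypothetical placeholders, not part of this repository; in practice, the inpainting step is carried out by RePaint or AOT-GAN as described below.

import numpy as np

def deocclude(image: np.ndarray, detect_occlusion_mask, inpaint) -> np.ndarray:
    # Step 1: segment the occluded region (binary mask, 1 = hidden pixels).
    mask = detect_occlusion_mask(image)
    # Step 2: the inpainting model completes the hidden fraction of the image.
    return inpaint(image, mask)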

One machine learning model, namely RePaint, is trained using only a dataset of non-occluded vehicle images; the second, namely AOT-GAN, also needs masks of the occluded parts. The models learn to inpaint the occluded regions based on the available visual information. By leveraging contextual cues and vehicle appearance patterns, the models should effectively generate plausible completions of the occluded regions, restoring the missing vehicle details.

We wanted to evaluate the proposed method on a comprehensive dataset of single vehicles in urban traffic scenes seen from a UAV point of view, fine-tune it with another dataset from the LUTS lab, and then compare the ground-truth images with the inpainted images to assess the results. An interesting future evaluation could be comparing the detection performance before and after applying the inpainting technique.

(back to top)

Built With

Here are the major frameworks/libraries used in our project.

  • PyTorch
  • Numpy
  • OpenCV
  • Cairo

This project was based on the following GitHub repositories.

Main Machine Learning Models used:

Evaluation Metrics:

2D Shape Generator:

(back to top)

Getting Started

To get a local copy up and running, follow these simple steps.
For both models, we need to set up the environment, download the datasets and the models, and then run the code.

Prerequisites

Get the latest version of Python, PyTorch and pip installed on your machine.
We also highly suggest the use of conda to manage the virtual environments.
We recommend using a virtual environment to run the code of each model individually.
To fetch the code from GitHub, you need to have git installed on your machine.
You can clone this repo with the following command:

  git clone https://github.com/2Tricky4u/Bachelor-Thesis-De-occlusion-of-occluded-vehicle-images-from-drone-video

Installation

Repaint (inference only)

  1. Go to the Repaint folder
    cd Models/Repaint
  2. When in the appropriate virtual environment, install the requirements:
    pip install numpy torch blobfile tqdm pyYaml pillow  
    You should be ready to use Repaint if your machine has a GPU with CUDA support.
    Go to the Repaint Usage section to see how to use it.
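Before running inference, you can quickly verify that PyTorch actually sees a CUDA GPU. This is a minimal, generic check, independent of RePaint itself:

import torch

# Prints True and the GPU name if PyTorch can use CUDA;
# without a CUDA GPU, RePaint inference will be impractically slow.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))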

Guided Diffusion (Training Repaint)

  1. Go to the Guided Diffusion folder
    cd Models/Guided-Diffusion
  2. When in the appropriate virtual environment, install the requirements:
    pip install -e .
  3. You also need the following Message Passing Interface (MPI) library:
    pip install mpi4py
    You should be ready to use Guided Diffusion and train a model for RePaint (if your machine has a GPU with CUDA support).
    Go to the Guided Diffusion Usage section to see how to use it and launch training.
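Before launching a training run, you can also confirm that mpi4py imports correctly in your environment. This is a minimal, generic check:

from mpi4py import MPI

# Should print 1 when run directly, or the number of ranks when launched via mpirun.
print("MPI world size:", MPI.COMM_WORLD.Get_size())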

AOT-GAN

  1. Go to the AOT-GAN folder

    cd Models/AOT-GAN
  2. With conda, create a new virtual environment and install the requirements:

    conda env create -f environment.yml
  3. Activate the environment

    conda activate inpainting

     You should be ready to use AOT-GAN for inference and training (if your machine has a GPU with CUDA support).
    Go to the AOT-GAN Usage section to see how to use it.

Evaluation Metrics

  1. Go to the Evaluation Metrics folder
    cd Metrics
  2. When in the appropriate virtual environment, install the requirements:
    pip install piq
    You should be ready to use the Evaluation Metrics.
    Go to the Evaluation Metrics Usage section to see how to use it.

(back to top)

If any of the above steps fail, please refer to the official documentation of the respective model for more information.

Usage

Dataset Creation

Our generated datasets for the occlusion problem are available Here.

main.py

To create a dataset, you need to run the script main.py in the Dataset_creation\scripts folder:

  1. Go to the Dataset_creation folder
    cd Dataset_creation\scripts
  2. Run the script
     python main.py [options -> see below]

You need to specify the input, a folder containing the images of your future dataset, and the output, a folder where the dataset will be created.
The script will create two folders in the output folder: one named train for the training split and one named test for the test split of the dataset.
Each of these folders contains a folder named gt for the ground-truth images and a folder named mask for the masks.
For more information about the options, see the dataset creation section of our report.
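After generation, a quick sanity check of the output layout can save time before training. The sketch below assumes the masks use the same file names as their ground-truth images, which may differ from the script's actual naming scheme; dataset_root is a placeholder for your output folder.

import os

dataset_root = "./output"  # placeholder: the output folder passed to main.py

for split in ("train", "test"):
    gt_files = set(os.listdir(os.path.join(dataset_root, split, "gt")))
    mask_files = set(os.listdir(os.path.join(dataset_root, split, "mask")))
    # Assumption: masks share file names with their ground-truth images.
    missing = gt_files - mask_files
    print(f"{split}: {len(gt_files)} gt images, {len(missing)} without a matching mask")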

We highly suggest setting the resize option to false, as resizing the images on the fly takes a lot of RAM and can cause unexpected results.
We provide a standalone script to address this problem.

resize.py

To resize the images of a folder using the standalone script, copy the script resize.py from the Dataset_creation\scripts\standalone_script folder and paste it into the folder containing the images you want to resize. The script expects the folder it is pasted into to contain two other folders named gt and mask.
You can edit the beginning of the script to change the size of the images and the background color.

import cv2
import numpy as np
import pathlib
import os

bg_color = [0, 0, 0]   # Background color
square_dim = 128       # Size of the (square) output images

def get_files_from_folder(path):
    files = os.listdir(path)
    return np.asarray(files)
...

Then run the script with the following command:

python resizer.py

This will create a folder named new containing the resized images in the same folder.

Green Splitter

This script is used to split off images that contain too much green.
To use it, copy the script green_splitter.py from the Dataset_creation\scripts\standalone_script folder and paste it into the folder containing the images you want to split. The script expects the folder it is pasted into to contain two other folders named gt and mask.
You need to edit the beginning of the script to change the input and output paths.

import math
import os
import cv2
import numpy as np

original_path = "./v_patches/"  # Path to the folder containing the images
path = "./resized/"             # Path for the outputs

Then you need to run the script green_splitter.py with the following command:

python green_splitter.py

The script will create a folder named green containing the images with too much green in them, and a folder named after the path variable containing the remaining images.
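The exact splitting criterion is defined inside green_splitter.py; as an illustration of the idea only, here is a minimal sketch that measures the fraction of green pixels with an HSV threshold. The hue band and the 40% cut-off are assumptions for the example, not the script's actual logic.

import cv2
import numpy as np

def green_fraction(image_path, lower=(35, 40, 40), upper=(85, 255, 255)):
    # Fraction of pixels whose HSV value falls in a rough "green" band.
    img = cv2.imread(image_path)
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv, np.array(lower), np.array(upper))
    return float(np.count_nonzero(mask)) / mask.size

# Example: flag an image as "too green" above an arbitrary 40% threshold.
print(green_fraction("example.png") > 0.4)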

Models usage

Here we will show how to use each model and how to train them.

RePaint

To launch the inference of the model, you need to run the script test.py in the Models\RePaint folder:

python test.py --conf_path confs/face_train.yml

You can change the configuration file to use another model or to change the parameters.
Here is the configuration file with some comments on how to configure it:

# Copyright (c) 2022 Huawei Technologies Co., Ltd.
# Licensed under CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International) (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://creativecommons.org/licenses/by-nc-sa/4.0/legalcode
#
# The code is released for academic research use only. For commercial use, please contact Huawei Technologies Co., Ltd.
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# This repository was forked from https://github.com/openai/guided-diffusion, which is under the MIT license

attention_resolutions: 16,8
class_cond: false # Use conditioner
diffusion_steps: 4000 # Do not adapt
learn_sigma: true
noise_schedule: linear #Choose the noise schedule
num_channels: 128
num_head_channels: -1
num_heads: 4
num_res_blocks: 2
resblock_updown: false
use_fp16: false
use_scale_shift_norm: true
classifier_scale: 4.0
lr_kernel_n_std: 2
num_samples: 100
show_progress: true
timestep_respacing: '250'
use_kl: false
predict_xstart: false
rescale_timesteps: false
rescale_learned_sigmas: false
classifier_use_fp16: false
classifier_width: 128
classifier_depth: 2
classifier_attention_resolutions: 16,8
classifier_use_scale_shift_norm: true
classifier_resblock_updown: false
classifier_pool: attention
num_heads_upsample: -1
channel_mult: ''
dropout: 0.0
use_checkpoint: false
use_new_attention_order: false
clip_denoised: true
use_ddim: false
latex_name: RePaint
method_name: Repaint
image_size: 128
model_path: ~/occlusion_removal/guided-diffusion/model000000.pt #~/occlusion_removal/guided-diffusion/openai-2023-06-07-22-46-09-696639/ema_0.9999_000000.pt #model000000.pt #./data/pretrained/256x256_diffusion.pt #./data/pretrained/256x256_diffusion_uncond.pt #./data/pretrained/celeba256_250000.pt
name: face_example
inpa_inj_sched_prev: true
n_jobs: 1
print_estimated_vars: true
inpa_inj_sched_prev_cumnoise: false
schedule_jump_params:
  t_T: 200 # 250 # YT : Reduce the total number of steps (w/o resampling) to remove more noise per step
  # when t_T = 50 -> 385 iterations / t_T = 200 -> 3710
  n_sample: 1
  jump_length: 5 # 10 # YT : Reduce it to resample fewer times
  jump_n_sample: 10 # YT : Apply resampling not from the beginning but only after a specific time
data:
  eval:
    paper_face_mask:
      mask_loader: true
      gt_path: ./data/datasets/gts/face
      mask_path: ./data/datasets/gt_keep_masks/face
      image_size: 128
      class_cond: false
      deterministic: true
      random_crop: false
      random_flip: false
      return_dict: true
      drop_last: false
      batch_size: 16
      return_dataloader: true
      offset: 0
      max_len: 1 # 8 # YT : They iterate over paths and sample <max_len> images
      # as for the face_example, only 1 image is available -> set max_len to 1 otherwise they sample 8 times the same gt and the mask
      paths:
        srs: ./log/face_example/inpainted
        lrs: ./log/face_example/gt_masked
        gts: ./log/face_example/gt
        gt_keep_masks: ./log/face_example/gt_keep_mask

This conf file needs to be kept the same for training and inference, even though we run inference with the RePaint model and train with the Guided Diffusion model.
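Because the same conf file drives both inference and training, it can help to check its key entries programmatically before launching a run. This is a minimal sketch using PyYAML (already installed with the RePaint requirements); the keys shown are taken from the file above.

import yaml

with open("confs/face_train.yml") as f:
    conf = yaml.safe_load(f)

print("image_size:", conf["image_size"])
print("model_path:", conf["model_path"])
print("gt_path:", conf["data"]["eval"]["paper_face_mask"]["gt_path"])
print("mask_path:", conf["data"]["eval"]["paper_face_mask"]["mask_path"])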

To train the model, you need to do it exclusively through the next model, i.e. Guided Diffusion.

Guided Diffusion (Repaint training pipeline)

To train a model for RePaint, you need to launch the script images_train.py in the Models\Guided-DIffusion-Fixed-for-Repaint folder:

python images_train.py --conf_path confs/face_train.yml

AOT-GAN

To train AOT-GAN you need to launch the script train.py in the Models\AOT-GAN folder:

python train.py --dir_image "./data/VRAI_128_WonBL/new/gt" --dir_mask "./data/VRAI_128_WonBL/new/mask" --data_train "" --data_test "" --mask_type ""  --image_size 128  --batch_size 16 --print_every 1000  --save_every 1000 --pre_train "./experiments/aotgan_128/" --resume  

To infer with AOT-GAN you need to launch the script test.py in the Models\AOT-GAN folder:

python test.py --pre_train "../experiments/G0000000.pt" --dir_mask "data/mask2" --dir_image "data/gt"

To evaluate AOT-GAN you need to launch the script eval.py in the Models\AOT-GAN folder:

python eval.py --real_dir "data/GT/1" --fake_dir "data/AOT_big/1" --metric mae psnr ssim fid

Evaluation Metrics

To evaluate some images, you need to launch the script main.py in the Metrics folder:

python main.py

You can configure the script by changing the config.yaml file in the Metrics\src\config folder.

# data parameters
dataset_name: car256                        # used for saving the csv; easy to remember which dataset you're evaluating, especially if you're using multiple datasets.
dataset_with_subfolders: False               # True
dataset_format: image                        # file_list (not implemented yet)
multiple_evaluation: False
generated_image_path: ./data/AOT_big/patch/1
ground_truth_image_path: ./data/GT/patch_big/1
return_dataset_name: False                   # Currently, no use. In future, it will be used for multi-testing.


# experiment
exp_type: ablation
model_name: difnet

# processing parameters
batch_size: 1                                # set according to your GPU/CPU
image_shape: [ 256, 256, 3 ]                 # set according to your need.
random_crop: False                           # currently, not implemented. In future, it will be used to evaluate patches.
threads: 4                                   # set according to your CPU.

# print option
print_interval_frequency: 1
show_config: True

# save options
save_results: True
save_results_path: ./logs
save_file_name: metrics
save_type: csv
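If you just want to score a single pair of images without going through the config file, piq can also be called directly. This is a minimal sketch; it additionally needs pillow, the image paths are placeholders, and both images are assumed to have the same size.

import numpy as np
import piq
import torch
from PIL import Image

def load_tensor(path):
    # Load an image as a 1xCxHxW float tensor in [0, 1], as expected by piq.
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float32) / 255.0
    return torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)

gt = load_tensor("data/GT/example.png")         # placeholder path
fake = load_tensor("data/AOT_big/example.png")  # placeholder path

print("PSNR:", piq.psnr(fake, gt, data_range=1.0).item())
print("SSIM:", piq.ssim(fake, gt, data_range=1.0).item())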

(back to top)

Roadmap

  • Provide a script to create an occlusion dataset
  • Generate multiple datasets of vehicles (UAV POV)
  • Select and study two of the newest ML models for inpainting
  • Set up two different pipelines to run/train those models
  • Set up a pipeline to evaluate the models
  • Evaluate the mid-semester results of the models
  • Train the models successfully, with expected results, on the created datasets
    • Repaint
    • AOT-GAN (does not converge)

(back to top)

Contact

Xavier Ogay - website - xavier.ogay@epfl.ch

Mahmoud Dokmak - mahmoud.dokmak@epfl.ch

Project Link: https://github.com/2Tricky4u/Bachelor-Thesis-De-occlusion-of-occluded-vehicle-images-from-drone-video

(back to top)

Acknowledgments

We would like to thank our supervisors, Yura Tak, Robert Fonod and Prof. Geroliminis, for their guidance and support throughout the project.

(back to top)

Report

The full report is available as a PDF: Download PDF.

Results

Mid-Semester Sample Results

Mid-Semester-Results

End-Semester Sample Results

End-Semester-Results: https://drive.google.com/drive/folders/1OmV5iyZMOzC-QUi7ynDh4HsonT9wst51?usp=sharing