# Semantic Augmentation

This notebook contains the demo and the implementation of the paper 'SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding'.

## Introduction

Data augmentation is an essential technique in improving the generalization of deep neural networks. The majority of existing image-domain augmentations either rely on geometric and structural transformations, or apply different kinds of photometric distortions. In this paper, we propose an effective technique for image augmentation by injecting contextually meaningful knowledge into the scenes. Our method of semantically meaningful image augmentation for object detection via language grounding, SemAug, starts by calculating semantically appropriate new objects that can be placed into relevant locations in the image (the _what_ and _where_ problems). Then it embeds these objects into their relevant target locations, thereby promoting diversity of object instance distribution. Our method allows for introducing new object instances and categories that may not even exist in the training set. Furthermore, it does not require the additional overhead of training a context network, so it can be easily added to existing architectures. Our comprehensive set of evaluations showed that the proposed method is very effective in improving the generalization, while the overhead is negligible. 

This code is adapted from the [Instaboost](https://github.com/GothicAi/Instaboost.git) repository to incorporate SemAug.

This code uses mmdetection, an open-source object detection toolbox based on PyTorch. It is
part of the open-mmlab project developed by [Multimedia Laboratory, CUHK](http://mmlab.ie.cuhk.edu.hk/).

<p align="center">
<img src="flowchart2.jpg" width="800" />
</p>


### Environment Configuration

In [1]:
!pip install torch==1.1.0 torchvision==0.3.0 torchaudio
!pip install cython numpy
!pip install matplotlib==2.1.1
!pip install instaboostfast-0.1.2.tar.gz
!pip install opencv_mat-0.1.4-cp36-cp36m-linux_x86_64.whl
!pip install pytest-runner
!pip install mmcv-0.2.16.tar.gz
!pip install --upgrade scikit-image
!chmod +x compile.sh
!./compile.sh
!pip install .

Collecting torch==1.1.0
  Downloading http://repo.myhuaweicloud.com/repository/pypi/packages/69/60/f685fb2cfb3088736bafbc9bdbb455327bdc8906b606da9c9a81bae1c81e/torch-1.1.0-cp36-cp36m-manylinux1_x86_64.whl (676.9MB)
    100% |████████████████████████████████| 676.9MB 59.6MB/s

### Download Code and Data

### Prepare datasets

It is recommended to symlink the dataset root to `$CODE/data`.

```
CODE
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   ├── VOCdevkit
│   │   ├── voc2007
│   │   │   ├── VOC2007
│   │   ├── voc2012
│   │   │   ├── VOC2012

```
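The layout above can be checked before training starts. The following is a minimal sketch, not part of the original code: the helper names `missing_dataset_dirs` and `link_dataset_root` are hypothetical, and the list of expected sub-directories is copied from the tree shown above.

```python
from pathlib import Path

# Sub-directories taken from the layout shown above, relative to the data root.
EXPECTED_DIRS = [
    "coco/annotations",
    "coco/train2017",
    "coco/val2017",
    "coco/test2017",
    "VOCdevkit/voc2007/VOC2007",
    "VOCdevkit/voc2012/VOC2012",
]


def missing_dataset_dirs(data_root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in expected if not (root / rel).is_dir()]


def link_dataset_root(dataset_root, code_root):
    """Create code_root/data as a symlink to dataset_root, unless it already exists."""
    link = Path(code_root) / "data"
    if not link.exists() and not link.is_symlink():
        link.symlink_to(Path(dataset_root).resolve(), target_is_directory=True)
    return link
```

Calling `missing_dataset_dirs("data")` from the code root returns an empty list when every directory in the tree is present. If only Pascal VOC is used, pass a shorter `expected` list.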

In [2]:
# import moxing as mox
# localFolder = 'data/VOCdevkit/'
# mox.file.copy_parallel('s3://vanbdai-share-cn1/Morgan/datasets/VOCdevkit/', localFolder)

INFO:root:Using MoXing-v1.17.3.4-4b65c6b1
INFO:root:Using OBS-Python-SDK-3.20.9.1
INFO:root:Listing OBS: 1000
INFO:root:Listing OBS: 2000
INFO:root:Listing OBS: 3000
INFO:root:Listing OBS: 4000
INFO:root:Listing OBS: 5000
INFO:root:Listing OBS: 6000
INFO:root:Listing OBS: 7000
INFO:root:Listing OBS: 8000
INFO:root:Listing OBS: 9000
INFO:root:Listing OBS: 10000
INFO:root:Listing OBS: 11000
INFO:root:Listing OBS: 12000
INFO:root:Listing OBS: 13000
INFO:root:Listing OBS: 14000
INFO:root:Listing OBS: 15000
INFO:root:Listing OBS: 16000
INFO:root:Listing OBS: 17000
INFO:root:Listing OBS: 18000
INFO:root:Listing OBS: 19000
INFO:root:Listing OBS: 20000
INFO:root:Listing OBS: 21000
INFO:root:Listing OBS: 22000
INFO:root:Listing OBS: 23000
INFO:root:Listing OBS: 24000
INFO:root:Listing OBS: 25000
INFO:root:Listing OBS: 26000
INFO:root:Listing OBS: 27000
INFO:root:Listing OBS: 28000
INFO:root:Listing OBS: 29000
INFO:root:Listing OBS: 30000
INFO:root:Listing OBS: 31000
INFO:root:Listing OBS: 32000

In [1]:
!wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
!wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
!wget http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtest_06-Nov-2007.tar

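!mkdir -p data
!tar xf VOCtrainval_11-May-2012.tar -C data
!tar xf VOCtrainval_06-Nov-2007.tar -C data
!tar xf VOCtest_06-Nov-2007.tar -C data

# Create the per-year parent directories before moving the extracted folders into them
!mkdir -p data/VOCdevkit/voc2012 data/VOCdevkit/voc2007
!mv data/VOCdevkit/VOC2012 data/VOCdevkit/voc2012/VOC2012
!mv data/VOCdevkit/VOC2007 data/VOCdevkit/voc2007/VOC2007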

--2022-07-19 04:04:28--  http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
Resolving host.robots.ox.ac.uk (host.robots.ox.ac.uk)... 129.67.94.152
Connecting to host.robots.ox.ac.uk (host.robots.ox.ac.uk)|129.67.94.152|:80... ^C
--2022-07-19 04:06:04--  http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
Resolving host.robots.ox.ac.uk (host.robots.ox.ac.uk)... 129.67.94.152
Connecting to host.robots.ox.ac.uk (host.robots.ox.ac.uk)|129.67.94.152|:80... failed: Connection timed out.
Retrying.

--2022-07-19 04:08:12--  (try: 2)  http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
Connecting to host.robots.ox.ac.uk (host.robots.ox.ac.uk)|129.67.94.152|:80... failed: Connection timed out.
Retrying.

--2022-07-19 04:10:21--  (try: 3)  http://host.robots.ox.ac.uk/pascal/VOC/voc2007/VOCtrainval_06-Nov-2007.tar
Connecting to host.robots.ox.ac.uk (host.robots.ox.ac.uk)|129.67.94.152|:80... failed: Connection timed out.
Retryin

### Prepare GloVe Embeddings

Download the [GloVe embeddings](https://nlp.stanford.edu/projects/glove/) and place (or symlink) them under `$CODE/mmdet/datasets`. The embeddings used in the paper come from the Wikipedia 2014 + Gigaword 5 release (6B tokens, 400K vocab, uncased; 50d, 100d, 200d, and 300d vectors; 822 MB download), distributed as `glove.6B.zip`.
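To make the file format concrete, here is a minimal sketch of reading a GloVe text file and comparing two words. It is an illustration only, not the loading code used by SemAug: the function names `load_glove` and `cosine_similarity` are hypothetical, and the sketch uses plain Python lists from the standard library instead of arrays. Each line of a GloVe text file holds a token followed by its space-separated vector components.

```python
import math


def load_glove(path, vocab=None):
    """Parse a GloVe text file into a {word: list of floats} dictionary.

    If vocab is given, only words in that set are kept, which keeps memory
    use small when only a few category names are needed.
    """
    embeddings = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.rstrip("\n")
            if not line:
                continue  # skip blank lines
            word, *values = line.split(" ")
            if vocab is not None and word not in vocab:
                continue
            embeddings[word] = [float(v) for v in values]
    return embeddings


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0
```

For example, `load_glove("mmdet/datasets/glove.6B.300d.txt", vocab={"dog", "cat"})` loads only the two requested vectors, which can then be passed to `cosine_similarity`.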

In [3]:
# import moxing as mox
# localFolder = 'mmdet/datasets/glove.6B.300d.txt'
# mox.file.copy_parallel('s3://vanbdai-share-cn1/Morgan/COCP-main4/mmdet/datasets/glove.6B.300d.txt', localFolder)

In [1]:
!wget http://nlp.stanford.edu/data/glove.6B.zip
!unzip glove*.zip -d mmdet/datasets



URLError: <urlopen error [Errno 110] Connection timed out>

### Create object bank

Use the Pascal VOC object bank included with the provided files, or run the following script to build the COCO object bank.

In [None]:
# python create_bank.py

## Train a model

mmdetection implements distributed training and non-distributed training,
which use `MMDistributedDataParallel` and `MMDataParallel` respectively.

### Distributed training (Single or Multiple Machines)

mmdetection potentially supports multiple launch methods, e.g., PyTorch’s built-in launch utility, Slurm, and MPI.

We provide a training script using the launch utility provided by PyTorch.

```shell
./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]
```

Supported arguments are:

- `--validate`: perform evaluation every k epochs (default: 1) during training.
- `--work_dir <WORK_DIR>`: if specified, overrides the working directory set in the config file.

Expected results in WORK_DIR:

- log file
- saved checkpoints (every k epochs, default=1)
- a symbolic link to the latest checkpoint
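A small helper can locate the most recent checkpoint in a work directory, for example before resuming or evaluating. This is a sketch under stated assumptions: the function name `find_latest_checkpoint` is hypothetical, and the file names `latest.pth` and `epoch_<N>.pth` are assumed naming conventions that should be checked against the actual contents of your work directory.

```python
import re
from pathlib import Path


def find_latest_checkpoint(work_dir):
    """Return the path of the newest checkpoint in work_dir, or None.

    Assumed naming: a symbolic link called latest.pth and per-epoch
    files called epoch_<N>.pth.
    """
    work_dir = Path(work_dir)
    latest = work_dir / "latest.pth"
    if latest.exists():
        # Follow the symbolic link to the real checkpoint file.
        return latest.resolve()
    # Fallback: pick the per-epoch file with the highest epoch number.
    best = None
    for path in work_dir.glob("epoch_*.pth"):
        match = re.fullmatch(r"epoch_(\d+)\.pth", path.name)
        if match and (best is None or int(match.group(1)) > best[0]):
            best = (int(match.group(1)), path)
    return best[1].resolve() if best else None
```

The epoch number is compared numerically, so `epoch_10.pth` is ranked after `epoch_2.pth`, which a plain alphabetical sort would get wrong.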

**Important**: The default learning rate is for 8 GPUs. If you use fewer or more than 8 GPUs, you need to set the learning rate proportional to the number of GPUs, e.g., modify lr to 0.01 for 4 GPUs or 0.04 for 16 GPUs.
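The proportional rule can be written as a one-line helper. This is a sketch: the function name `scale_learning_rate` is hypothetical, and the base value of 0.02 for 8 GPUs is inferred from the two examples in the note (0.01 for 4 GPUs, 0.04 for 16 GPUs), so confirm it against the `lr` field of the config you actually use.

```python
def scale_learning_rate(num_gpus, base_lr=0.02, base_gpus=8):
    """Scale the learning rate linearly with the number of GPUs.

    base_lr=0.02 at base_gpus=8 is inferred from the examples in the note
    above, not read from a config file.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return base_lr * num_gpus / base_gpus


for gpus in (1, 4, 8, 16):
    print(gpus, scale_learning_rate(gpus))
```

The result is the value to write into the optimizer's `lr` setting of the config file before launching training with the matching number of processes.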

An example is:


In [4]:
!chmod +x tools/dist_train.sh
!mkdir -p ~/.cache/torch/checkpoints
!mv tools/resnet50-19c8e357.pth ~/.cache/torch/checkpoints/resnet50-19c8e357.pth

In [7]:
!python -m torch.distributed.launch --nproc_per_node=4 tools/train_nocopy.py configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712_vanilla.py --validate  --launcher pytorch

2022-07-09 05:22:07,497 - INFO - Distributed training: True
2022-07-09 05:22:07,897 - INFO - load model from: modelzoo://resnet50

unexpected key in source state_dict: fc.weight, fc.bias

***************************************
Arguments Used in Train.py
Namespace(config='configs/pascal_voc/faster_rcnn_r50_fpn_1x_voc0712_vanilla.py', gpus=1, launcher='pytorch', local_rank=0, resume_from=None, seed=None, validate=True, work_dir='work_dirs/mmdet')
distributed: True
sh: nvidia-smi: command not found
***************************************
2022-07-09 05:22:16,763 - INFO - Start running, host: ma-user@jupyter--44-45-2d76ca5310-2dfd7e-2d11ec-2db398-2d0255ac1000de, work_dir: /home/ma-user/work/work_dirs/mmdet
2022-07-09 05:22:16,763 - INFO - workflow: [('train', 1)], max: 8 epochs
2022-07-09 05:22:38,646 - INFO - Epoch [1][50/6207]	lr: 0.01000, eta: 6:00:30, time: 0.436, data_time: 0.109, memory: 2410, loss_rpn_cls: 0.1484, loss_rpn_reg: 0.0150, loss_cls: 0.3561, acc: 94.9492, loss_reg: 0.089