---
# Josh Knize and Pradyumn Pathak Final Project Submission
---

# 1. Introduction to libraries
### We have used the following libraries in this project:
* **`Detectron2`: developed and maintained by Facebook Research**
    * We chose it for its modularity and the wide range of model configurations it provides
    * Each model option is deeply configurable, down to the individual architectural blocks
    * It also supports loading a custom backbone with minimal changes to the `detectron2` codebase itself
    * These benefits made it an ideal choice for us, as they made integrating our custom `HebbNet` backbone much easier
* **`Matplotlib`: the de facto standard library for displaying output images**
* **`torch`: the engine behind `detectron2` and most deep learning libraries**
* **`numpy`: used alongside `torch` for various linear algebra operations throughout the project**
---


### Project Setup:

We recommend creating a virtual environment for this notebook:
* Recommended `Python=3.8`
* Recommended OS: `Linux`
    * This version uses `detectron2`, which only supports `pip install` on `Linux` and needs to be built from source on Windows.
    * The official guide to building on Windows can be found [here](URL)

In [10]:
!git clone https://github.com/facebookresearch/detectron2.git

!pip install gdown -q || echo "Error installing gdown"
!pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html -q || echo "Error installing torch and related packages"
!pip install fvcore -q || echo "Error installing fvcore"
!pip install opencv-python -q || echo "Error installing opencv-python"
!pip install pycocotools -q || echo "Error installing pycocotools"
!pip install cloudpickle -q || echo "Error installing cloudpickle"
!pip install omegaconf -q || echo "Error installing omegaconf"
!pip install matplotlib -q || echo "Error installing matplotlib"
!pip install tensorboard -q || echo "Error installing tensorboard"

!echo "Downloading Dataset (This might take a while)"
!gdown https://drive.google.com/uc?id=1MM6kODxnDvkVzzOuOYYbwrP5uou1DOC- -O datasets.zip
!gdown https://drive.google.com/uc?id=13JG4Sh7Fpad8O6eB_fQe7C2wQmO7hRtj -O Hebbnet_backbone.zip

!echo "Extracting Dataset and Copying file (This might take a while)"
!python -c "import zipfile; zipfile.ZipFile('datasets.zip', 'r').extractall('.')"
!python -c "import zipfile; zipfile.ZipFile('Hebbnet_backbone.zip', 'r').extractall('Hebbnet_backbone')"
!python -c "import shutil; shutil.move('Hebbnet_backbone/__init__.py', './detectron2/detectron2/modeling/backbone/__init__.py'); shutil.move('Hebbnet_backbone/hebbnet_backbone.py', './detectron2/detectron2/modeling/backbone/hebbnet_backbone.py'); shutil.move('Hebbnet_backbone/hebbnet_backbone.yaml', './detectron2/configs/COCO-Detection/hebbnet_backbone.yaml')"


fatal: destination path 'detectron2' already exists and is not an empty directory.


"Downloading Dataset (This might take a while)"


Downloading...
From (original): https://drive.google.com/uc?id=1MM6kODxnDvkVzzOuOYYbwrP5uou1DOC-
From (redirected): https://drive.google.com/uc?id=1MM6kODxnDvkVzzOuOYYbwrP5uou1DOC-&confirm=t&uuid=3a873b66-4808-41ea-b21f-3bf79a4f348d
To: d:\College\Assignments\NNDL_Project\datasets.zip


"Extracting Dataset and Copying file (This might take a while)"


In [None]:
# Only run this on Linux:
!pip install git+https://github.com/facebookresearch/detectron2.git

## !! After the step above, move this notebook into the `detectron2` directory and set the following variable to that directory's path:

In [1]:
PATH_detectron2 = r"/home/ppathak/Praddy_CSC578/detectron2"

In [2]:
# Setting up runtime Work Directory and Importing Libraries

import os
os.chdir(PATH_detectron2)
print(os.getcwd())
from detectron2.data.datasets import register_coco_instances
from detectron2.data import DatasetCatalog, MetadataCatalog

from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer
import torch

from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.data import build_detection_test_loader

import logging

# Set logging level to WARNING to suppress detailed model architecture output
logging.getLogger("detectron2").setLevel(logging.WARNING)

/home/ppathak/Praddy_CSC578/detectron2




### Details about the dataset:

* This is a custom dataset created from the COCO dataset
* It is a detection dataset containing only one object class: `dog`
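
A single-class subset like this can be derived from the full COCO annotations by keeping only one category. The sketch below is an assumption about how such a filter might look, not the script actually used for this project; `filter_coco_to_category` is a hypothetical helper that works on an already loaded COCO-style dictionary, and the toy ids are made up.

```python
def filter_coco_to_category(coco, category_name):
    """Return a COCO-style dict restricted to one category (hypothetical helper)."""
    # Ids of the requested category.
    keep_cat_ids = {c["id"] for c in coco["categories"] if c["name"] == category_name}
    # Keep only annotations that belong to that category.
    annotations = [a for a in coco["annotations"] if a["category_id"] in keep_cat_ids]
    # Keep only images that still have at least one annotation.
    keep_img_ids = {a["image_id"] for a in annotations}
    images = [im for im in coco["images"] if im["id"] in keep_img_ids]
    categories = [c for c in coco["categories"] if c["id"] in keep_cat_ids]
    return {"images": images, "annotations": annotations, "categories": categories}


# Tiny in-memory example (made-up ids, not real COCO data).
toy = {
    "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
    "annotations": [
        {"id": 10, "image_id": 1, "category_id": 1, "bbox": [0, 0, 5, 5]},
        {"id": 11, "image_id": 2, "category_id": 2, "bbox": [1, 1, 4, 4]},
    ],
    "categories": [{"id": 1, "name": "dog"}, {"id": 2, "name": "cat"}],
}
dog_only = filter_coco_to_category(toy, "dog")
print(len(dog_only["images"]), len(dog_only["annotations"]))
```

The filtered dictionary could then be written to a JSON file and passed to `register_coco_instances`.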

In [3]:
# Setting up the custom dataset for training:

register_coco_instances("coco_train_dog", {}, "../datasets/coco/annotations/dog_instances_train2017.json", "../datasets/coco/train2017_dog")
register_coco_instances("coco_val_dog", {}, "../datasets/coco/annotations/dog_instances_val2017.json", "../datasets/coco/val2017_dog")

my_dataset_metadata = MetadataCatalog.get("coco_train_dog")
my_dataset_metadata.thing_classes = ["dog"]
dataset_dicts = DatasetCatalog.get("coco_train_dog")

---
# 2. Model Design and Implementation

## 2a. We have used `Resnet-FRCNN` as our control model:
* We employ the `GeneralizedRCNN` architecture so that the `Resnet` backbone can later be replaced with `HebbNet`
* The traditional `FRCNN` architecture is shown below:

   <a href="https://ibb.co/SJt1ZDP"><img src="https://i.ibb.co/4J8D0w7/detectron2.png" alt="detectron2" border="0"></a>

* ### The Architecture has the following blocks:

    * **Backbone Network** (`Resnet`) - responsible for creating feature maps from the images:
        * It learns feature maps that help the `RPN` (Region Proposal Network) and `ROIHeads` do their tasks.
        * It is mainly composed of convolution and pooling layers, like typical CNN-based feature extractors.
        * Feature extraction happens through convolution filters.
        * It has 4 stages: `res2`, `res3`, `res4`, and `res5`.

    * **Region Proposal Network (`RPN`)** - responsible for producing region proposal boxes:
        * It learns to propose which regions of the image could contain objects.
        * It uses 2D convolutional layers (e.g., a 3x3 filter) that slide over the feature map and predict objectness scores and box offsets.
        * The input to the `RPN` is the feature map produced by the backbone network (e.g., `ResNet`), typically with a size of `𝐻×𝑊×𝐶`.
        * The output consists of objectness scores and bounding box refinements for each anchor at each spatial location on the feature map.

    * **ROIHeads** - classifies the region proposals generated by the `RPN` and refines their bounding box coordinates:
        * The input to the `ROIHeads` is the region proposals generated by the `RPN`, along with the feature map from the backbone network.
        * Its output is the final class scores and refined bounding box coordinates for each region proposal.
        * The `ROIHeads` use `RoIAlign` to extract fixed-size features from the feature map, followed by fully connected layers to classify and refine the bounding boxes for each proposal.
---
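To make the shapes described above concrete, the sketch below computes the approximate feature-map size at each ResNet stage and the shapes of the `RPN` outputs. It assumes the usual ResNet strides (4, 8, 16, 32 for `res2` to `res5`) and 15 anchors per location; the `800 x 1216` image size is only an example, and the helper names are hypothetical.

```python
import math

# Assumed strides of the ResNet stages relative to the input image.
STAGE_STRIDES = {"res2": 4, "res3": 8, "res4": 16, "res5": 32}


def feature_map_size(height, width, stride):
    """Approximate spatial size (H, W) of a feature map at the given stride."""
    return math.ceil(height / stride), math.ceil(width / stride)


def rpn_output_shapes(height, width, stride, num_anchors):
    """Shapes (channels, H, W) of the RPN objectness and box-delta outputs."""
    h, w = feature_map_size(height, width, stride)
    objectness = (num_anchors, h, w)      # one score per anchor per location
    box_deltas = (4 * num_anchors, h, w)  # four offsets per anchor per location
    return objectness, box_deltas


for name, stride in STAGE_STRIDES.items():
    print(name, feature_map_size(800, 1216, stride))

print(rpn_output_shapes(800, 1216, STAGE_STRIDES["res4"], num_anchors=15))
```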


### Training and Evaluating The `FRCNN` with `Resnet` Backbone for control:

In [6]:
# 1a. Set the configuration to load the ResNet-based FRCNN architecture:
os.environ["CUDA_VISIBLE_DEVICES"]="3"
cfg = get_cfg()
cfg.merge_from_file("configs/COCO-Detection/faster_rcnn_R_50_C4_1x.yaml") #ImageNet pre-trained
cfg.OUTPUT_DIR = f"{PATH_detectron2}/output/dog_resnet_test"
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7

cfg.DATASETS.TRAIN = ("coco_train_dog",)
cfg.DATASETS.TEST = ("coco_val_dog",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.ROI_HEADS.NMS_THRESH_TEST = 0.5
cfg.SOLVER.IMS_PER_BATCH = 64
cfg.SOLVER.BASE_LR = 0.0025
cfg.SOLVER.MAX_ITER = 50000
cfg.SOLVER.CHECKPOINT_PERIOD = 1000
cfg.TEST.EVAL_PERIOD = 1000

# run on GPU
cfg.MODEL.DEVICE = 'cuda'

In [7]:
# 1b. Load the Trainer for the defined Resnet-FRCNN architecture and train:

trainer = DefaultTrainer(cfg)
trainer.model.to(cfg.MODEL.DEVICE)

trainer.train()

RuntimeError: CUDA error: device-side assert triggered

In [None]:
# 1c. Saving and Evaluation of the model

output_dir = f"{PATH_detectron2}/dog_resnet_test"
cfg.MODEL.WEIGHTS = output_dir + '/RESNET_BackBone_model_final.pth'

trainer.model.eval()

# Evaluate on the registered validation set ("coco_val_subset" was never registered)
evaluator = COCOEvaluator("coco_val_dog", ("bbox",), False, output_dir=output_dir)
val_loader = build_detection_test_loader(cfg, "coco_val_dog")
print(inference_on_dataset(trainer.model, val_loader, evaluator))

---
# 2. Model Architecture (`HebbNet-FRCNN`)
## 2b. The Custom HebbNet-Backbone FRCNN Architecture:

* ### The HebbNet architecture is small:
    * **The use of Linear Layers**: All the layers in the architecture are linear layers:
        * This was done because implementing Hebbian updates for convolution layers was too risky to finish within the time frame of this project (4 weeks).
        * The outputs of the linear layers had to be converted back into 2D feature maps, which was done by reshaping them to the expected dimensions using `reshape(1, num_feature_map_channel, int(width_hideen_layer), int(width_hideen_layer))`.
        
    * **Layer Dimensions**:
        * To avoid out-of-memory errors, the input layer dimensions of the model are `128 x 128 x 3`.
        * We have `8` linear hidden layers, each of dimension `1 x 16,384`.
        * The feature map is a stack of `8` channels, each of size `128 x 128`.
    
    * **Connecting with ROIHead**:
        * We have **skipped** the `RPN` in our implementation due to integration complexity and time constraints.
        * We still get precomputed region proposals, which is the default behavior of the `detectron2` implementation in the absence of an `RPN`.
        * The `HebbNet` feature maps are passed to the `ROIHeads`.
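
The layer dimensions above follow from the configuration values. The short calculation below reproduces them using the same formula as the backbone code; it assumes that the stacked feature map is built from the `8` hidden-layer outputs, each reshaped to a square.

```python
import math

# Values taken from the configuration used in this notebook.
IMAGE_SIZE = 128  # cfg.INPUT.MAX_SIZE_TRAIN
FC_DIM = 8        # cfg.MODEL.ROI_BOX_HEAD.FC_DIM
NUM_HIDDEN = 8    # cfg.MODEL.NUM_HIDDEN

# Flattened RGB input: 128 * 128 * 3 values.
flattened_size = IMAGE_SIZE * IMAGE_SIZE * 3

# Same formula as hidden_layer_size in the HebbNet backbone.
hidden_layer_size = (FC_DIM * flattened_size) // (3 * NUM_HIDDEN)

# Each hidden-layer output is reshaped to a square map (assumption: one map per layer).
width = math.isqrt(hidden_layer_size)
feature_map_shape = (1, FC_DIM, width, width)

print(flattened_size, hidden_layer_size, feature_map_shape)
```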

## Below is the HebbNet Backbone Architecture code:
```python
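import torch.nn as nn
from detectron2.modeling.backbone import Backbone

# Note: HebbRuleWithActivationThreshold is not shown here; it is assumed to be defined alongside this class.
class HebbNet(Backbone):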
    def __init__(self, cfg):
        super().__init__()
        self.max_pool = nn.MaxPool2d(kernel_size=(5, 5), stride=2, padding=0)       # Based on testing, doing MaxPool and then adaptive pooling to keep a shape of 80x80 consistent
        self.adaptive_pool = nn.AdaptiveAvgPool2d((cfg.INPUT.MAX_SIZE_TRAIN, cfg.INPUT.MAX_SIZE_TRAIN))
        self.cfg = cfg

        self.lr = cfg.MODEL.HEBB_LR

        self.layers = nn.ModuleList()
        self.input_layer_size = cfg.INPUT.MAX_SIZE_TRAIN * cfg.INPUT.MAX_SIZE_TRAIN * 3
        self.flattened_size = self.adaptive_pool.output_size[0] * self.adaptive_pool.output_size[1] * 3
        self.hidden_layer_size = (cfg.MODEL.ROI_BOX_HEAD.FC_DIM * self.flattened_size) // (3 * cfg.MODEL.NUM_HIDDEN)
        self.flatten = nn.Flatten()    
        self.layers.append(nn.Linear(self.input_layer_size, self.hidden_layer_size, False))    
        for i in range(self.cfg.MODEL.NUM_HIDDEN-1):
            self.layers.append(nn.Linear(self.hidden_layer_size, self.hidden_layer_size, False))
        # no output layer necessary
        # self.classification_weights = nn.Linear(self.hidden_layer_size, output_layer_size, True)

        self.activation_threshold_layers = []
        self.activation_threshold_layers.append(HebbRuleWithActivationThreshold(hidden_layer_size=self.hidden_layer_size,
                                                                input_layer_size=self.flattened_size).to(self.cfg.MODEL.DEVICE))
        for i in range(self.cfg.MODEL.NUM_HIDDEN-1):
            self.activation_threshold_layers.append(HebbRuleWithActivationThreshold(hidden_layer_size=self.hidden_layer_size,
                                                                input_layer_size=self.hidden_layer_size).to(self.cfg.MODEL.DEVICE))

        self.relu = nn.ReLU()
        self.softmax = nn.LogSoftmax(dim=1)

        self._out_feature_strides = {"res4": 4}
        self._out_feature_channels = {"res4": cfg.MODEL.ROI_BOX_HEAD.FC_DIM}

        # initialize dictionary of outputs along forward pass layers / steps
        out_features = ["res4"]
        self._out_features = out_features

        # initialize input_layer_size so it can be used by padding_constraints to specify our fixed image size to generalized rcnn
        # self.input_layer_size = input_layer_size

        # for img and feature visualization
        self.img_vis = cfg.MODEL.IMG_VIS
        self.feat_vis = cfg.MODEL.FEAT_VIS
        self.feat_vis_num = cfg.MODEL.FEAT_VIS_NUM
```
---

### Important Note:
* The FRCNN with the Resnet backbone uses weights **pretrained** on ImageNet, whereas our HebbNet architecture learns from scratch within this detection architecture
* The Resnet backbone also works with a larger input `img-size`, which gives it more feature information to work with from the start
---

---
## 2. Training and Evaluating the `FRCNN` with `Hebbnet` Backbone:

In [None]:
# 2a. Setting the primary configurations, ensuring we use the [Custom] HebbNet Backbone:

cfg = get_cfg()
cfg.merge_from_file("configs/COCO-Detection/hebbnet_backbone.yaml")     # Loading the Custom HebbNet Backbone Configuration template
cfg.OUTPUT_DIR = f"{PATH_detectron2}/output/dog_hebbnet_test"
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7

cfg.DATASETS.TRAIN = ("coco_train_dog",)
cfg.DATASETS.TEST = ("coco_val_dog",)
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1
cfg.MODEL.OUTPUT_LAYER_SIZE = 1
cfg.MODEL.ROI_HEADS.NMS_THRESH_TEST = 0.5
cfg.SOLVER.IMS_PER_BATCH = 1
cfg.SOLVER.BASE_LR = .00002
cfg.SOLVER.MAX_ITER = 5500*2

# run on GPU
cfg.MODEL.DEVICE = 'cuda'

#### While Loading the secondary custom configurations:
- Our input `image-size` is `128x128`. We cannot train the model with a larger `image-size` because of the following problem:
    - We run into `CUDA: Out of Memory` errors: without convolutions, the hidden layers end up with far too many neurons
- We are also leaving out `PROPOSAL_GENERATOR` from our model
- The batch size is set to 1, both because we are using Hebbian Learning and because of memory constraints
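
The memory problem can be illustrated with a rough estimate of the weight storage for the bias-free linear layers. This is a sketch under stated assumptions: `float32` weights, the layer sizes from the backbone code, and no accounting for activations, optimizer state, or the Hebbian layers. `hebbnet_weight_bytes` is a hypothetical helper, not part of the project.

```python
def hebbnet_weight_bytes(image_size, fc_dim=8, num_hidden=8, bytes_per_weight=4):
    """Estimate the memory of the linear-layer weights (float32 assumed)."""
    flattened = image_size * image_size * 3
    hidden = (fc_dim * flattened) // (3 * num_hidden)
    first_layer = flattened * hidden                   # input -> first hidden layer
    other_layers = (num_hidden - 1) * hidden * hidden  # remaining hidden layers
    return (first_layer + other_layers) * bytes_per_weight


for size in (64, 128, 256):
    gib = hebbnet_weight_bytes(size) / 2**30
    print(f"{size}x{size}: {gib:.3f} GiB")
```

Under these assumptions the weights alone take 10 GiB at `128x128` and grow with the fourth power of the image side, reaching 160 GiB at `256x256`.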

In [None]:
# Loading Secondary configs for custom backbone:

cfg.INPUT.MIN_SIZE_TRAIN = (128,)
cfg.INPUT.MIN_SIZE_TRAIN_SAMPLING = "choice"
cfg.INPUT.MAX_SIZE_TRAIN = 128
cfg.INPUT.MIN_SIZE_TEST = 128
cfg.INPUT.MAX_SIZE_TEST = 128
# cfg.PROPOSAL_GENERATOR: PrecomputedProposals # this may be an option to potentially avoid issues with proposal generation
cfg.MODEL.ROI_HEADS.IN_FEATURES = ['res4']   # use "=": a bare "name: value" annotation assigns nothing


- There are also major changes to the `ROIHead` and `BoxHead` in FRCNN:
    - We opted to use `StandardROIHeads` because it made the integration of the HebbNet backbone easier
    - We also limited the number of channels (`CONV_DIM`) in the feature maps that the `ROIHead` works with to `8` due to GPU memory constraints
    - The width of the fully connected layers (`FC_DIM`) is also `8` due to GPU memory constraints

In [None]:
cfg.MODEL.ROI_HEADS.NAME = "StandardROIHeads"
cfg.MODEL.ROI_BOX_HEAD.NAME = "FastRCNNConvFCHead"
cfg.MODEL.ROI_BOX_HEAD.FC_DIM = 8           # Fully Connected Channel Depth to 8
cfg.MODEL.ROI_BOX_HEAD.CONV_DIM = 8         # Channel Depth to 8
cfg.MODEL.ROI_BOX_HEAD.NUM_CONV = 2
cfg.MODEL.ROI_BOX_HEAD.NUM_FC = 2
cfg.SOLVER.CHECKPOINT_PERIOD = 9999999999999999999999999999999999

#### Finally, the following are the additional parameters exclusive to the Hebbian Backbone:

In [None]:
cfg.MODEL.NUM_HIDDEN = 8
cfg.MODEL.HEBB_LR = .00000001 # fewest number of zeros "allowed" before gradient explosion

cfg.MODEL.IMG_VIS = False
cfg.MODEL.FEAT_VIS = False
cfg.MODEL.FEAT_VIS_NUM = 0

# 3. Training Process:
* We train using the `detectron2` pipeline
* As mentioned above, the dataset is a custom COCO dataset with only 1 object class
* For preprocessing, the `detectron2` pipeline only resizes and normalizes the images
* We are not doing any image augmentation

* Hyperparameters:
    * We use a batch size of 1 because of memory constraints and Hebbian Learning
    * We use a very small learning rate of just `0.00002`, as a larger learning rate leads to gradient explosion due to the severe penalization from Hebbian learning
    * We use a very small Hebbian learning rate of just `0.00000001`, for the same reason
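
To illustrate why the Hebbian learning rate has to be so small, the sketch below applies the plain textbook Hebbian rule, `w_ij += lr * y_i * x_j` with `y = W x`, to a tiny weight matrix. This is not the project's `HebbRuleWithActivationThreshold`, whose implementation is not shown here; it only demonstrates that an unnormalized Hebbian update makes weights grow without bound when the rate is large. All values are made up.

```python
def hebbian_step(weights, x, lr):
    """One plain Hebbian update: w_ij += lr * y_i * x_j, with y = W x."""
    y = [sum(w_ij * x_j for w_ij, x_j in zip(row, x)) for row in weights]
    return [
        [w_ij + lr * y_i * x_j for w_ij, x_j in zip(row, x)]
        for row, y_i in zip(weights, y)
    ]


def frobenius_norm(weights):
    """Square root of the sum of squared weights."""
    return sum(w * w for row in weights for w in row) ** 0.5


x = [1.0, 2.0, 3.0]                      # toy input
w0 = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]  # toy initial weights

norms = {}
for lr in (1e-8, 1e-1):
    w = w0
    for _ in range(20):                  # repeat the same update 20 times
        w = hebbian_step(w, x, lr)
    norms[lr] = frobenius_norm(w)
    print(f"lr={lr:g}: weight norm after 20 steps = {norms[lr]:.6g}")
```

With the small rate the weight norm barely moves, while with the large rate it grows by several orders of magnitude in 20 steps.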

In [None]:
# 2b. Load the Trainer for the defined HebbNet-FRCNN architecture and train:

trainer = DefaultTrainer(cfg)
trainer.model.to(cfg.MODEL.DEVICE)

trainer.train()

In [None]:
# 2c. Saving and Evaluation of the model

output_dir = f"{PATH_detectron2}/dog_hebbnet_test"
cfg.MODEL.WEIGHTS = output_dir + '/HebbNet_BackBone_model_final.pth'

trainer.model.eval()

# Evaluate on the registered validation set ("coco_val_subset" was never registered)
evaluator = COCOEvaluator("coco_val_dog", ("bbox",), False, output_dir=output_dir)
val_loader = build_detection_test_loader(cfg, "coco_val_dog")
print(inference_on_dataset(trainer.model, val_loader, evaluator))

# 4. Evaluation:
* Our initial results showed no appreciable difference in performance from using Hebbian Learning.
* The training and evaluation logs for `Resnet-FRCNN` are in the text file in the provided zip.
* We used only Average Precision (AP) to measure performance.
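
Average Precision for detection is built on the Intersection over Union (IoU) between predicted and ground-truth boxes: a detection counts as correct only if its IoU exceeds a threshold, and the primary COCO AP averages over thresholds from 0.50 to 0.95. The sketch below shows the IoU computation only, as a minimal illustration; `box_iou` is a hypothetical helper, not the `COCOEvaluator` implementation.

```python
def box_iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so non-overlapping boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


prediction = (0.0, 0.0, 10.0, 10.0)
ground_truth = (5.0, 5.0, 15.0, 15.0)
print(box_iou(prediction, ground_truth))
```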