ana-baltaretu/instance-segmentation

Instance Segmentation

Instance segmentation in event-based videos (Research project). Paper here.

This project currently uses Python 3.8.12, Miniconda3 and PyTorch. This setup was chosen because it should be compatible with the HPC cluster, so the models can be trained there.

Required background knowledge:

  • What is an event-based camera? Link1, Link2
  • Basic machine learning (ML) knowledge, including what a neural network (NN) is: 3b1b playlist
  • Basic PyTorch knowledge: 60min tutorial
  • Image Processing and Computational Intelligence knowledge from courses like CSE2225 and CSE2530.

Setting up Miniconda (for Windows only)

Create a virtual environment with Miniconda3 by following this YouTube tutorial.

In the Miniconda command line:

conda create --name instance_segmentation python=3.8.12  
conda info --envs  
conda activate instance_segmentation  

Hopefully, running the following command is enough:

pip install -r requirements.txt

Otherwise check this section!

For Pytorch

conda install astunparse numpy ninja pyyaml mkl mkl-include setuptools cmake cffi typing_extensions future six requests dataclasses
conda install -c conda-forge libuv=1.39
pip3 install torch==1.8.1+cpu torchvision==0.9.1+cpu torchaudio===0.8.1 -f https://download.pytorch.org/whl/torch_stable.html

Data visualization:

pip install tonic
pip install matplotlib

OpenCV:

python3.8 -m pip install opencv-python
pip install scikit-image

If Mask R-CNN is acting up read this!

  • Working fork of Mask R-CNN TF2 - working as of May 2022
  • Official Mask R-CNN - was not working with the installed setup

For h5py:

pip uninstall h5py
conda install -c anaconda h5py

For imgaug:

pip3 install imgaug

For pycocotools:

pip install cython
pip install git+https://github.com/philferriere/cocoapi.git#egg=pycocotools^&subdirectory=PythonAPI

For scipy:

pip install -U scikit-image==0.16.2

For wandb:

pip install wandb
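
After installing, it can help to confirm that the environment actually sees every package. The following is a minimal sketch, not part of the repository: the module list is an assumption derived from the install commands above (opencv-python imports as cv2, scikit-image as skimage), and requirements.txt may contain more.

```python
import importlib.util
import sys

# Import names for the packages installed in the steps above.
# Assumption: this list mirrors the commands in this README, not requirements.txt.
REQUIRED_MODULES = [
    "torch", "torchvision", "tonic", "matplotlib", "cv2",
    "skimage", "h5py", "imgaug", "pycocotools", "wandb",
]


def find_missing(modules):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

Run it inside the activated instance_segmentation environment; any module it reports as missing points to the matching install section above.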

Running 1 digit

  1. Change the settings in src/main.py to generate the datasets, or use the already generated datasets from data/.
  2. Point the paths in src/dvs_training.py to the correct datasets, and make sure DETECTION_MAX_INSTANCES in src/mrcnn/config.py is set to 1.
  3. In src/dvs_training.py, set the init_with variable to coco if training from scratch, or to last to continue training a previous model.
  4. Run src/dvs_training.py and wait until it finishes. Then set up similar paths in src/dvs_testing.py and run it. Plots should be generated, and the results in terms of Accuracy, MIoU and mAP are displayed when it finishes running.
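
To make the MIoU metric mentioned in step 4 concrete, here is a minimal sketch of intersection over union for binary masks, averaged over several mask pairs. It illustrates the general definition only; the function names are hypothetical and the way src/dvs_testing.py computes the metric may differ.

```python
def mask_iou(predicted, ground_truth):
    """Intersection over union of two equally sized binary masks (nested lists)."""
    intersection = 0
    union = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        for p, t in zip(pred_row, true_row):
            intersection += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly; this also avoids dividing by zero.
    return intersection / union if union else 1.0


def mean_iou(mask_pairs):
    """Average IoU over a list of (predicted, ground_truth) mask pairs."""
    scores = [mask_iou(p, t) for p, t in mask_pairs]
    return sum(scores) / len(scores)


predicted = [[0, 1, 1],
             [0, 1, 1]]
ground_truth = [[0, 0, 1],
                [0, 1, 1]]

# 3 pixels overlap and 4 pixels are covered by either mask, so IoU = 0.75.
print(mask_iou(predicted, ground_truth))
```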

Running multiple digits

  1. Make sure DETECTION_MAX_INSTANCES in src/mrcnn/config.py is set to 4 (or, if you change how multiple digits are generated, to the number of digits).
  2. Follow the same steps as in "Running 1 digit". The dataset only needs to be generated the first time src/dvs_training_multiple.py runs; afterwards you can set REGENERATE to False in src/dvs_dataset_multiple.py.
  3. Run src/dvs_training_multiple.py and then run src/dvs_testing_multiple.py.
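
The "generate once, then reuse" idea behind the REGENERATE flag can be sketched as follows. This is an illustration under assumptions, not the code in src/dvs_dataset_multiple.py: load_or_generate, fake_generator and the JSON cache file are hypothetical, and unlike a plain flag this sketch also regenerates when no cached file exists yet.

```python
import json
import os
import tempfile


def load_or_generate(cache_path, generate, regenerate=False):
    """Return cached data unless `regenerate` is set or no cache exists yet."""
    if regenerate or not os.path.exists(cache_path):
        data = generate()
        with open(cache_path, "w") as f:
            json.dump(data, f)
        return data
    with open(cache_path) as f:
        return json.load(f)


calls = []


def fake_generator():
    """Stand-in for an expensive dataset generation step (hypothetical)."""
    calls.append(1)
    return {"digits_per_frame": 4}


cache_path = os.path.join(tempfile.mkdtemp(), "dataset.json")
first = load_or_generate(cache_path, fake_generator)   # generates and writes the cache
second = load_or_generate(cache_path, fake_generator)  # reads the cache back from disk
print(len(calls))  # the generator ran only once
```

Passing regenerate=True forces a rebuild, which corresponds to setting REGENERATE back to True after changing how the digits are generated.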

Visuals

Generated training masks

Predictions

Roadmap

Week 1 (W1) started on 19/04/2022 and the presentation took place on 22/06/2022; the roadmap is documented here.

Authors and acknowledgment

Author: Ana Băltărețu
Supervisors: Nergis Tömen, Ombretta Strafforello, Xin Liu

Related work

  1. N-MNIST dataset
  2. Mask R-CNN
  3. Splash of Color: Instance Segmentation with Mask R-CNN and TensorFlow
  4. Matterport3D: Learning from RGB-D Data in Indoor Environments
  5. EV-SegNet: Semantic Segmentation for Event-based Cameras
  6. EvDistill: Asynchronous Events to End-task Learning via Bidirectional Reconstruction-guided Cross-modal Knowledge Distillation
  7. Event-based Vision: A Survey
  8. A 128x128 120 dB 15 μs Latency Asynchronous Temporal Contrast Vision Sensor
  9. A 640×480 Dynamic Vision Sensor with a 9μm Pixel and 300Meps Address-Event Representation
  10. Contour Detection and Characterization for Asynchronous Event Sensors
  11. Gradient-Based Learning Applied to Document Recognition
  12. DDD17: End-To-End DAVIS Driving Dataset
  13. End-to-End Learning of Representations for Asynchronous Event-Based Data
  14. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
  15. A Survey on Performance Metrics for Object-Detection Algorithms
  16. Microsoft COCO: Common Objects in Context
  17. mAP (mean Average Precision) for Object Detection
  18. Deep Residual Learning for Image Recognition

License

MIT License. Copyright (c) 2022 Ana Băltărețu
