Jason-cs18/Awesome-Multi-Camera-Network

Multi-Camera Networks

  • Research notes on multi-camera networks. Target venues: systems (OSDI/SOSP/ATC/EuroSys), networking (NSDI/SIGCOMM/SoCC), mobile (MobiCom/MobiSys/SenSys/UbiComp), data analytics (VLDB/SIGMOD), and computer vision and machine learning (ICCV/CVPR/ECCV/ICML/ICLR/NeurIPS).
  • Unlike the books listed below, I collect papers from the system and AI perspectives separately. To avoid diving into the details of specific vision tasks (e.g., object detection), the AI Algorithm part covers only low-resource learning, domain adaptation and continual learning, and dynamic deep neural networks, because I think these three topics generalize across vision tasks and help when deploying deep-learning-based vision applications. Datasets and useful toolboxes are listed at the end.

Note: specific vision algorithms (tracking, object detection, segmentation and action recognition) are not collected in this note. If you want to learn or try them, refer to SenseTime-CUHK OpenMMLab, which provides a suite of toolboxes to help AI researchers and engineers implement vision algorithms. For example, you can try 50+ image-based object detection models through the same mmdetection API and 10+ video-based object detection methods through the same mmtracking API.

Outline

  • Book and Survey - a starting point to understand basic concepts behind multi-camera networks
  • Researchers, Workshops and Courses - follow them to get recent research trends in multi-camera networks
  • Topics - recent papers grouped into sub-topics (e.g., camera calibration)
    • System
    • AI Algorithm
      • Low-resource learning - efficient learning under limited data/annotations/computation/(time)
      • Domain adaptation and continual learning - robustness and sustainability
        • For continual learning, most AI work focuses on how to learn unseen classes while remembering previously seen classes (i.e., avoiding catastrophic forgetting). It is therefore also called incremental learning.
        • For domain adaptation, AI researchers aim to improve the generalization of existing pretrained models. Depending on the target data available (labeled or unlabeled), existing algorithms fall into two categories: (1) supervised retraining; (2) unsupervised domain adaptation (either source-free or joint source-target training).
        • Recent work on Model Exchange & Serving and Model Monitoring & Updates is summarized in the slides from Architecture of ML Systems (SS2021, Graz University of Technology).
      • Dynamic deep neural networks - computing flexibility
  • Dataset - test your ideas on popular datasets
  • Toolbox - verify your ideas quickly using existing toolboxes
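
The outline above describes continual learning as learning new classes while remembering seen ones. One common way to remember seen classes is rehearsal: keep a small memory of past samples and mix them into each new training batch. The snippet below is a minimal sketch of such a memory, assuming reservoir sampling as the retention policy; the class name `ReplayBuffer`, its methods, and the toy tuples are hypothetical and are not taken from any paper listed in this note.

```python
import random


class ReplayBuffer:
    """Fixed-size memory of past samples, filled by reservoir sampling (illustrative only)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.samples = []
        self.seen = 0  # total number of samples offered so far
        self.rng = random.Random(seed)

    def add(self, sample):
        """Offer one sample; every sample seen so far is kept with equal probability."""
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = sample

    def mixed_batch(self, new_batch, k):
        """Return the new batch followed by up to k replayed old samples."""
        replay = self.rng.sample(self.samples, min(k, len(self.samples)))
        return list(new_batch) + replay


buffer = ReplayBuffer(capacity=4)
for x in range(10):  # stand-in samples from an earlier task
    buffer.add(("old", x))

batch = buffer.mixed_batch([("new", 0), ("new", 1)], k=2)
print(len(batch))  # 4: two new samples plus two replayed ones
```

Training on `batch` instead of the new samples alone is what exposes the model to old classes again; the buffer size trades memory for how well old classes are retained.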

Book and Survey

  1. Multi-Camera Networks: Principles and Applications. 2005.
  2. Camera Networks: The Acquisition and Analysis of Videos over Wide Areas (Synthesis Lectures on Computer Vision). 2012.
  3. Valera et al. Intelligent distributed surveillance systems: a review. 2005.
  4. Wang et al. Intelligent multi-camera video surveillance: a review. 2012.
  5. Ye et al. Wireless Video Surveillance: A Survey. 2013.
  6. Zhang et al. Deep Learning in Mobile and Wireless Networking: A Survey. IEEE TRANS 2019.

Researchers, Workshops and Courses

Researchers (organization and research interests)

Workshops (video analytics)

  1. The 3rd Workshop on Hot Topics in Video Analytics and Intelligent Edges (ACM MobiCom'21) - focuses on deep learning based video analytics
  2. Multi-camera Multiple People Tracking Workshop (IEEE ICCV'21) - tracking multiple people in indoor scenes using multiple RGB cameras
  3. Multimedia Systems Conference (ACM MMSys'21) - covers multiple topics in video analysis

Courses

  1. CS294: Machine Learning Systems (Fall 2019, Berkeley) - covers the concepts and background behind machine learning systems (the best reference website!)
  2. 706.550: Architecture of ML Systems (Summer 2021, Graz University of Technology) - the architecture and essential concepts of modern ML systems for both local and large-scale machine learning (based on non-deep ML analytics)
  3. CS231A: Computer Vision, From 3D Reconstruction to Recognition (Winter 2021, Stanford) - focuses on basic concepts behind many computer vision tasks across multi-camera networks (camera models, calibration, single- and multiple-view geometry, stereo systems, SfM, stereo matching, depth estimation, optical flow and optimal estimation)
  4. COS 598a: Machine Learning-Driven Video Systems (Spring 2022, Princeton) - covers recent research on video analytics (strongly recommended)
  5. CS34702 Topics in Networks: Machine Learning for Networking and Systems (Fall 2020, UChicago) - covers recent research on networking systems (the video streaming and cloud scheduling parts are recommended)
  6. CSE 234: Data Systems for Machine Learning (Fall 2021, UCSD) - focus on the lifecycle of ML-based data analytics, including data sourcing and preparation for ML, programming models and systems for scalable ML model building, and systems for faster ML deployment
  7. CSE 291F: Advanced Data Analytics and ML Systems (Winter 2019, UCSD) - the emerging area of advanced data analytics and ML systems, at the intersection of data management, ML/AI, and systems.
  8. CS6465: Emerging Cloud Technologies and Systems Challenges (Fall 2019, Cornell) - emerging cloud computing technology, opportunities and challenges.

Topics

System

Edge video analytics

[1] Li et al. Reducto: On-Camera Filtering for Resource-Efficient Real-Time Video Analytics. In SIGCOMM'20.
[2] Xu et al. Video Analytics with Zero-streaming Cameras. In ATC'21.
[3] Jha et al. Visage: Enabling Timely Analytics for Drone Imagery. In MobiCom'21.
[4] Jiang et al. Flexible High-resolution Object Detection on Edge Devices with Tunable Latency. In MobiCom'21.
[5] Han et al. LegoDNN: Block-grained Scaling of Deep Neural Networks for Mobile Vision. In MobiCom'21.
[6] Zhang et al. Elf: Accelerate High-resolution Mobile Deep Vision with Content-aware Parallel Offloading. In MobiCom'21.
[7] Xiao et al. Towards Performance Clarity of Edge Video Analytics. In SEC'21.

Configuration search

[1] Romero et al. Llama: A Heterogeneous & Serverless Framework for Auto-Tuning Video Analytics Pipelines. In SoCC'21.

Database

[1] Saurez et al. A drop-in middleware for serializable DB clustering across geo-distributed sites. In VLDB'20.

Video streaming

[1] Y. Yan et al. Learning in situ: a randomized experiment in video streaming. In NSDI'20.
[2] Kim et al. Neural-Enhanced Live Streaming: Improving Live Video Ingest via Online Learning. In SIGCOMM'20.
[3] Du et al. Server-Driven Video Streaming for Deep Learning Inference. In SIGCOMM'20.
[4] Han et al. ViVo: Visibility-aware Mobile Volumetric Video Streaming. In MobiCom'20.
[5] Zhang et al. SENSEI: Aligning Video Streaming Quality with Dynamic User Sensitivity. In NSDI'21.

Resource management

[1] Zhang et al. The Design and Implementation of a Wireless Video Surveillance System. In MobiCom'15.
[2] Xu et al. Approximate Query Service on Autonomous IoT Cameras. In MobiSys'20.
[3] Bhardwaj et al. Ekya: Continuous Learning of Video Analytics Models on Edge Compute Servers. In NSDI'22. - addresses when to retrain models and how to reduce resource usage across many concurrent inference and retraining tasks.
[4] Zhou et al. Octo: INT8 Training with Loss-aware Compensation and Backward Quantization for Tiny On-device Learning. In ATC'21.

Prediction serving and model update

[1] Suprem et al. ODIN: Automated Drift Detection and Recovery in Video Analytics. In VLDB'21. - detects domain drift and updates the corresponding models automatically.
[2] Romero et al. INFaaS: Automated Model-less Inference Serving. In ATC'21. Best paper award! - the first model-less prediction serving system
[3] Feng et al. Palleon: A Runtime System for Efficient Video Processing toward Dynamic Class Skew. In ATC'21. - model selection based on the automatically detected class skews
[4] Wang et al. SmartHarvest: Harvesting Idle CPUs Safely and Efficiently in the Cloud. In EuroSys'21. - identify and harvest idle resources
[5] Hu et al. Scrooge: A Cost-Effective Deep Learning Inference System. In SoCC'21. - consider input complexity
[6] Ling et al. RT-mDL: Supporting Real-Time Mixed Deep Learning Tasks on Edge Platforms. In SenSys'21. - schedules multiple DL jobs on resource-constrained devices
[7] Schelter et al. Learning to Validate the Predictions of Black Box Classifiers on Unseen Data. In SIGMOD'20. - a tool to monitor models' performance without annotations
[8] Agarwal et al. Boggart: Accelerating Retrospective Video Analytics via Model-Agnostic Ingest Processing. arXiv preprint 2106.15315.
[9] Gunasekaran et al. Cocktail: Leveraging Ensemble Learning for Optimized Model Serving in Public Cloud. In NSDI'22. - aims to improve prediction-serving performance via ensemble learning
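
Several entries above (e.g., ODIN and Schelter et al.) deal with noticing that a deployed model has degraded when no labels are available. As a rough illustration of the general idea only, and not the algorithm of any listed paper, the sketch below flags drift when the mean prediction confidence on a recent window drops well below the mean on a reference window. The function name `confidence_drift`, the `max_drop` threshold and the sample values are all assumptions made for this example.

```python
from statistics import mean


def confidence_drift(reference, recent, max_drop=0.1):
    """Return True when mean confidence on `recent` is more than
    `max_drop` below the mean confidence on `reference`."""
    return mean(reference) - mean(recent) > max_drop


# Hypothetical top-1 confidences; a real system would log them per camera.
reference = [0.92, 0.88, 0.95, 0.90]  # collected right after deployment
stable = [0.91, 0.89, 0.93, 0.90]     # similar scene conditions
shifted = [0.62, 0.70, 0.55, 0.66]    # e.g., after a lighting or viewpoint change

print(confidence_drift(reference, stable))   # False
print(confidence_drift(reference, shifted))  # True
```

A positive result would typically trigger a heavier step such as model selection or retraining. Confidence is only a proxy, since an overconfident model can drift without its confidence dropping.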

Multi-Camera Collaboration

[1] Jain et al. Scaling Video Analytics Systems to Large Camera Deployments. In HotMobile'19.
[2] Liu et al. Who2com: Collaborative Perception via Learnable Handshake Communication. In ICRA'20.
[3] Liu et al. When2com: Multi-Agent Perception via Communication Graph Grouping. In CVPR'20.
[4] Zeng et al. Distream: Scaling Live Video Analytics with Workload-Adaptive Distributed Edge Intelligence. In SenSys'20.
[5] Jain et al. Spatula: Efficient cross-camera video analytics on large camera networks. In SEC'20. Best Paper Award!
[6] Tong et al. Large-Scale Vehicle Trajectory Reconstruction with Camera Sensing Network. In MobiCom'21.

Privacy

Useful external links (with keywords)

  • Tutorial on privacy-preserving data analysis (The Alan Turing Institute) - keywords: todo
  • The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence (PPAI-21) - keywords: todo
  • A Dive into Privacy Preserving Machine Learning (OpML'20) - keywords: todo
  • CrypTen (Facebook AI Research) - keywords: privacy-preserving machine learning framework, PyTorch, multi-party computation (MPC)

[1] (TAMU and Adobe Research) Wu et al. Towards Privacy-Preserving Visual Recognition via Adversarial Training: A Pilot Study. In ECCV'18.
[2] (CMU) Wang et al. Enabling Live Video Analytics with a Scalable and Privacy-Aware Framework. In 2018 ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM'18).
[3] (KAIST, USTC, Rice, NJU, SNU, PKU and MSRA) Lee et al. Occlumency: Privacy-preserving Remote Deep-learning Inference Using SGX. In MobiCom'19.
[4] (NUS) Shen et al. Human-imperceptible Privacy Protection Against Machines. In MM'19.
[5] (PSU and Facebook) Khazbak et al. TargetFinder: Privacy Preserving Target Search through IoT Cameras. In IoTDI'19 (Best Paper Award).
[6] (Tsinghua and USTC) Li et al. Invisible: Federated Learning over Non-Informative Intermediate Updates against Multimedia Privacy Leakages. In MM'20.
[7] (UCB and MSR) Poddar et al. Visor: Privacy-Preserving Video Analytics as a Cloud Service. In 29th Usenix Security Symposium (Security'20).
[8] (ICL, QMUL, Telefónica Research and Samsung AI) Mo et al. DarkneTZ: Towards Model Privacy at the Edge using Trusted Execution Environments. In MobiSys'20.
[9] (NJU, Cornell and MSRA) Wu et al. PECAM: privacy-enhanced video streaming and analytics via securely-reversible transformation. In MobiCom'21.
[10] (ASU) Hu et al. LensCap: Split-Process Framework for Fine-Grained Visual Privacy Control for Augmented Reality Apps. In MobiSys'21.
[11] (CUHK) Ouyang et al. ClusterFL: A Similarity-Aware Federated Learning System for Human Activity Recognition. In MobiSys'21.
[12] (ICL and Telefónica Research) Mo et al. PPFL: Privacy-preserving Federated Learning with Trusted Execution Environments. In MobiSys'21 (Best paper award).
[13] (CMU, UCSD and MSR) Dsouza et al. Amadeus: Scalable, Privacy-Preserving Live Video Analytics. arXiv preprint 2011.05163.
[14] (MIT, Princeton, UChicago and Rutgers) Cangialosi et al. Privid: Practical, Privacy-Preserving Video Analytics Queries. In NSDI'22.

AI Algorithm

Low-resource learning

[1] H. Aghdam et al. Active Learning for Deep Detection Neural Networks. In ICCV'19. Public Code Note

Domain adaptation and continual learning

xxx

Dynamic deep neural networks

xxx

Dataset

  1. Duke MTMC (8 cameras, non-overlapping)
  2. Nvidia CityFlow (>40 cameras, overlapping and non-overlapping)
  3. EPFL WildTrack (7 cameras, overlapping)
  4. EPFL-RLC (3 cameras, overlapping)
  5. CMU Panoptic Dataset (>50 cameras, overlapping)
  6. University of Illinois STREETS (100 cameras, non-overlapping)
  7. Awesome reID dataset

Toolbox

  1. CUHK-mmcv: a foundational Python library for computer vision research that supports many research projects (2D/3D detection, semantic segmentation, image and video editing, pose estimation, action understanding and image classification).
  2. JDCV-fastreid: a Python library implementing SOTA re-identification methods (including pedestrian and vehicle re-identification). It also provides good documentation.
  3. Cheetah: an end-to-end deep learning based prediction serving server, built on NVIDIA Triton server and Docker, that speeds up deployment of image classification, object detection, segmentation and tracking techniques.
  4. Chameleon: an efficient continuous adaptation framework based on NVIDIA TAO.
