Awesome research works for On-device AI

Mobile Edge AI Systems

Along with the rapid development of AI and deep learning, DNN models have been widely adopted in various applications. However, the high computational complexity of DNN models makes it difficult to deploy them on mobile and edge devices with limited computing resources. This repo collects research works that present systems for efficiently executing DNN models on mobile and edge devices.

  • Most relevant conferences: ACM MobiSys, ACM MobiCom, ACM SenSys
  • Relevant conferences: ACM EuroSys, ACM IPSN, USENIX NSDI, USENIX ATC, MLSys

Efficient Inference using Heterogeneous Processors (e.g., CPU, GPU, NPU, etc.)

  • [SenSys 2023] Miriam: Exploiting Elastic Kernels for Real-time Multi-DNN Inference on Edge GPU [paper (pre-print)]
  • [MobiSys 2023] NN-Stretch: Automatic Neural Network Branching for Parallel Inference on Heterogeneous Multi-Processors [paper]
  • [SenSys 2023] BlastNet: Exploiting Duo-Blocks for Cross-Processor Real-Time DNN Inference [paper]
  • [ATC 2023] Decentralized Application-Level Adaptive Scheduling for Multi-Instance DNNs on Open Mobile Devices [paper]
  • [MobiSys 2022] Band: Coordinated Multi-DNN Inference on Heterogeneous Mobile Processors [paper]
  • [MobiSys 2022] CoDL: Efficient CPU-GPU Co-execution for Deep Learning Inference on Mobile Devices [paper]

On-device LLM/NLP

  • [MobiCom 2024] Mobile Foundation Model as Firmware [paper] [code]

On-device Training, Model Adaptation

  • [MobiCom 2023] Cost-effective On-device Continual Learning over Memory Hierarchy with Miro [paper]
  • [MobiCom 2023] AdaptiveNet: Post-deployment Neural Architecture Adaptation for Diverse Edge Environments [paper]
  • [MobiSys 2023] ElasticTrainer: Speeding Up On-Device Training with Runtime Elastic Tensor Selection [paper]
  • [SenSys 2023] On-NAS: On-Device Neural Architecture Search on Memory-Constrained Intelligent Embedded Systems [paper]
  • [MobiCom 2022] Mandheling: Mixed-Precision On-device DNN Training with DSP Offloading [paper]
  • [MobiSys 2022] Memory-Efficient DNN Training on Mobile Devices [paper]

Profilers

  • [MobiSys 2021] nn-Meter: Towards Accurate Latency Prediction of Deep-Learning Model Inference on Diverse Edge Devices [paper]

Application-centric Approaches

  • [MobiSys 2023] OmniLive: Super-Resolution Enhanced 360° Video Live Streaming for Mobile Devices [paper]
  • [IPSN 2023] PointSplit: Towards On-device 3D Object Detection with Heterogeneous Low-power Accelerators [paper]

Server-Edge Collaborative Inference

  • [MobiCom 2023] AccuMO: Accuracy-Centric Multitask Offloading in Edge-Assisted Mobile Augmented Reality [paper]
  • [IPSN 2023] CoEdge: A Cooperative Edge System for Distributed Real-Time Deep Learning Tasks [paper]

Efficient AI Methods

Model pruning, quantization, compression, and efficient ViT designs are among the most popular methods for reducing the computational complexity of DNN models. This section collects research works presenting efficient AI methods; a minimal pruning/quantization sketch is shown after the conference list below.

  • Relevant conferences: CVPR, ICLR, NeurIPS, ICML, ICCV, ECCV, AAAI
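
The snippet below is only an illustrative sketch, not the method of any paper listed here: it shows what magnitude-based pruning and post-training dynamic quantization look like in practice, using PyTorch's built-in torch.nn.utils.prune and torch.ao.quantization utilities on a toy model. The 50% sparsity level, layer sizes, and layer choices are arbitrary assumptions for demonstration.

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Toy model; any nn.Module with Linear/Conv layers works similarly (illustrative only).
model = nn.Sequential(
    nn.Linear(128, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)

# Magnitude-based unstructured pruning: zero out 50% of the smallest-magnitude
# weights in each Linear layer (sparsity level is an arbitrary example value).
for module in model.modules():
    if isinstance(module, nn.Linear):
        prune.l1_unstructured(module, name="weight", amount=0.5)
        prune.remove(module, "weight")  # bake the pruning mask into the weights

# Post-training dynamic quantization: weights stored as int8, activations
# quantized on the fly; commonly used for CPU inference on mobile-class devices.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 128)
print(quantized(x).shape)  # torch.Size([1, 10])
```

Structured pruning (e.g., DepGraph below) removes whole channels or blocks so the speedup is realized on standard hardware, whereas the unstructured sparsity in this sketch mainly reduces model size unless paired with sparse kernels.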

Pruning and Compression

  • [CVPR 2023] DepGraph: Towards Any Structural Pruning [paper] [code]
  • [ICML 2023] Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming [paper] [code]
  • [NeurIPS 2022] Structural Pruning via Latency-Saliency Knapsack [paper] [code]

Efficient Vision Transformer (ViT)

  • [ICLR 2023 top 5%] Token Merging: Your ViT but Faster [paper] [code]
  • [ICCV 2023] Rethinking Vision Transformers for MobileNet Size and Speed [paper] [code]
  • [ICCV 2023] EfficientViT: Multi-Scale Linear Attention for High-Resolution Dense Prediction [paper] [code]
  • [CVPR 2023] SparseViT: Revisiting Activation Sparsity for Efficient High-Resolution Vision Transformer [paper] [code]
  • [CVPR 2022 Oral] PoolFormer: MetaFormer Is Actually What You Need for Vision [paper] [code]

Elastic Neural Networks

  • [CVPR 2023 Highlight] Stitchable Neural Networks [paper] [code]

Efficient Neural Radiance Fields (NeRF)

  • [CVPR 2023] Real-Time Neural Light Field on Mobile Devices [paper] [code]
  • [ECCV 2022] R2L: Distilling Neural Radiance Field to Neural Light Field for Efficient Novel View Synthesis [paper] [code]