Welcome to Awesome On-device AI

A curated list of awesome projects and papers for AI on Mobile/IoT/Edge devices. The list is continuously updated, and contributions are welcome!

Papers/Tutorial

1. Learning on Devices

1.1 Memory Efficient Learning

[ICML'22] POET: Training Neural Networks on Tiny Devices with Integrated Rematerialization and Paging. by Patil et al. [paper]
[NeurIPS'22] On-Device Training Under 256KB Memory. by Ji Lin, Song Han et al. [paper]
[MobiSys'22] Melon: breaking the memory wall for resource-efficient on-device machine learning. by Qipeng Wang et al. [paper]
[MobiSys'22] Sage: Memory-efficient DNN Training on Mobile Devices. by In Gim et al. [paper]

1.2 Learning Acceleration

[MobiCom'22] Mandheling: Mixed-Precision On-Device DNN Training with DSP Offloading. by Daliang Xu et al. [paper]

1.3 Learning on Mobile Cluster

[ICPP'22] Eco-FL: Adaptive Federated Learning with Efficient Edge Collaborative Pipeline Training. by Shengyuan Ye et al. [paper] [code]
[SEC'21] EDDL: A Distributed Deep Learning System for Resource-limited Edge Computing Environment. by Pengzhan Hao et al. [paper]

1.4 Measurement and Survey

[MobiSys'21 Workshop] Towards Ubiquitous Learning: A First Measurement of On-Device Training Performance. by Dongqi Chai, Mengwei Xu et al. [paper]

2. Inference on Devices

2.1 Collaborative Inference

[MobiSys'23] NN-Stretch: Automatic Neural Network Branching for Parallel Inference on Heterogeneous Multi-Processors. by USTC & Microsoft. [paper]
[MobiSys'22] CoDL: efficient CPU-GPU co-execution for deep learning inference on mobile devices. by Fucheng Jia et al. [paper]
[INFOCOM'22] Distributed Inference with Deep Learning Models across Heterogeneous Edge Devices. by Chenghao Hu et al. [paper]
[TON'20] CoEdge: Cooperative DNN inference with adaptive workload partitioning over heterogeneous edge devices. by Liekang Zeng et al. [paper]
[ICCD'20] A distributed in-situ CNN inference system for IoT applications. by Jiangsu Du et al. [paper]
[TPDS'20] Model Parallelism Optimization for Distributed Inference via Decoupled CNN Structure. by Jiangsu Du et al. [paper]
[EuroSys'19] μLayer: Low Latency On-Device Inference Using Cooperative Single-Layer Acceleration and Processor-Friendly Quantization. by Youngsok Kim et al. [paper]
[TCAD'18] DeepThings: Distributed Adaptive Deep Learning Inference on Resource-Constrained IoT Edge Clusters. by Zhuoran Zhao et al. [paper]
[DATE'17] MoDNN: Local distributed mobile computing system for deep neural network. by Jiachen Mao et al. [paper]

2.2 Latency Prediction for Inference

[MobiSys'21] nn-Meter: towards accurate latency prediction of deep-learning model inference on diverse edge devices. by Li Lyna Zhang et al. [paper]

2.3 Multi-DNN Serving

[MobiSys'22] Band: coordinated multi-DNN inference on heterogeneous mobile processors. by Seoul National University et al. [paper]

2.4 DNN Arch./Op.-level Optimization

[MobiSys'23] ConvReLU++: Reference-based Lossless Acceleration of Conv-ReLU Operations on Mobile CPU. by Shanghai Jiao Tong University [paper]

3. Models for Mobile

3.1 Lightweight Model

[ACL'20] MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices. by Zhiqing Sun et al. [paper]
[ICML'19] EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks. by Mingxing Tan et al. [paper]
[CVPR'18] ShuffleNet: An extremely efficient convolutional neural network for mobile devices. by Xiangyu Zhang et al. [paper]
[CVPR'18] MobileNetV2: Inverted Residuals and Linear Bottlenecks. by Mark Sandler et al. [paper]

4. On-device AI Application

4.1 On-device NLP

[UbiComp'18] DeepType: On-Device Deep Learning for Input Personalization Service with Minimal Privacy Concern. by Mengwei Xu et al. [paper]
[arXiv 2018] Federated learning for mobile keyboard prediction. by Google [paper]

5. Survey and Tutorial

5.1 Tutorial

[CVPR'23 Tutorial] Efficient Neural Networks: From Algorithm Design to Practical Mobile Deployments. by Snap Research [paper]

Open Source Projects

1. DL Framework on Mobile

TensorFlow Lite: Deploy machine learning models on mobile and edge devices. by Google. [code]
TensorFlow.js: A WebGL accelerated JavaScript library for training and deploying ML models. by Google. [code]
MNN: A Universal and Efficient Inference Engine. by Alibaba. [code]

2. Inference Deployment

TensorRT: A C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators. by NVIDIA. [code]
TVM: Open deep learning compiler stack for CPU, GPU and specialized accelerators. by Tianqi Chen et al. [code]
MACE: a deep learning inference framework optimized for mobile heterogeneous computing platforms. by Xiaomi. [code]
NCNN: a high-performance neural network inference framework optimized for the mobile platform. by Tencent. [code]

3. Open Source Auto-Parallelism Framework

3.1 Pipeline Parallelism

Pipeline Parallelism for PyTorch. by PyTorch. [code]
A GPipe implementation in PyTorch. by Kakaobrain. [code]

Contribute

All contributions to this repository are welcome. Open an issue or send a pull request.

