# Sustainability_Attacks_on_AI

MIT License

There are thousands of publications on making AI more energy- and time-efficient. For now, we subsume these proposals under the term *sustainability measures*.

From an AI security perspective: can a bad actor nullify these optimizations for fun (and profit)? We call this an energy-latency attack. This is a novel angle on AI security that will become very relevant very soon. *Sponge Examples: Energy-Latency Attacks on Neural Networks* gives a good introduction to energy-latency attacks.
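As a toy illustration of the idea (not taken from any paper): many accelerators save energy by skipping zero activations, so a white-box attacker can craft a "sponge" input that leaves nothing to skip. The one-layer ReLU model, sizes, and solve-based attack below are all invented for this sketch.

```python
import numpy as np

# Hypothetical one-layer ReLU "model"; all numbers here are made up.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))  # frozen random weight matrix

def relu_density(x):
    """Fraction of nonzero activations: a crude proxy for energy cost
    on hardware that skips zero operands."""
    a = np.maximum(W @ x, 0.0)
    return np.count_nonzero(a) / a.size

benign = rng.normal(size=64)               # typical input: only some units fire
sponge = np.linalg.solve(W, np.ones(64))   # white-box: force every pre-activation to 1

assert relu_density(sponge) == 1.0         # every unit fires: no zeros to skip
assert relu_density(benign) < relu_density(sponge)
```

Real sponge attacks optimize the input against the full network (often in a black-box setting) rather than solving a linear system, but the objective is the same: maximize activation density and thus energy and latency.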

The objective of this repo is to match sustainability measures to existing attacks. This way, we can identify trends and possible blind spots.

For now, this is a collection of papers and articles with personal annotations. It can already help you find sources on (the robustness of) sustainable AI.

Feel free to star and fork. Contact me if you are curious (see my GitHub bio).

## Contents

- Motivation
- Sustainability
- Attacks

## Motivation

| Publication | Notes |
| --- | --- |
| A New Golden Age in Computer Architecture: Empowering the Machine-Learning Revolution | Journal article about the rise of AI hardware |
| Energy and Policy Considerations for Deep Learning in NLP | "LLM training consumes 5 car lifetimes of CO2e emissions" |
| Carbon Emissions and Large Neural Network Training | "GPT-3 training consumed as much CO2e as 5 round trips between San Francisco and New York" |
| Characterizing Sources of Ineffectual Computations in Deep Learning Networks | Gives clues as to which acceleration options have the biggest influence on reducing work; see Fig. 1 in the paper for a ranking |
| Deep Learning's Diminishing Returns: The Cost of Improvement is Becoming Unsustainable | "We will achieve 95% ImageNet accuracy by 2025, but it will cost as much CO2 emission as New York City produces in a month" |
| The computational limits of deep learning | Same authors as Deep Learning's Diminishing Returns |
| The Carbon Footprint of Machine Learning Training Will Plateau, Then Shrink | Disputes Deep Learning's Diminishing Returns, arguing that sustainability measures will scale up too |
| Sustainable AI: Environmental Implications, Challenges and Opportunities | Facebook AI reports trillions of inferences across datacenters |
| TensorDash presentation video | Has some nice charts on the rising energy consumption of DNNs at the beginning |
| The Carbon Footprint of Transformers | |
| Hugging Face model cards list carbon emitted | See distilgpt2, which uses https://mlco2.github.io/impact/#compute |

## Sustainability

### Sparsity

#### General Sparsity (TODO: sort later)

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| Minimizing Energy Consumption of Deep Learning Models by Energy-Aware Training | | |
| Automatic Generation of Multi-Precision Multi-Arithmetic CNN Accelerators for FPGAs | | |
| Accelerating Sparse Deep Neural Networks | | NVIDIA 2:4 structured sparsity in the Ampere architecture |
| Sparseloop: An Analytical Approach To Sparse Tensor Accelerator Modeling | | |
| Harnessing Manycore Processors with Distributed Memory for Accelerated Training of Sparse and Recurrent Models | | |
| Sparse-DySta: Sparsity-Aware Dynamic and Static Scheduling for Sparse Multi-DNN Workloads | | |
| Pruning and Quantization for Deep Neural Network Acceleration: A Survey | | |
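For intuition on the 2:4 structured sparsity pattern mentioned above (the Ampere architecture accelerates weight matrices with at most 2 nonzeros per group of 4), here is a minimal magnitude-based pruning sketch. The weights are invented, and real pipelines retrain after pruning; this only shows the pattern itself.

```python
import numpy as np

def prune_2_4(w):
    """Zero out the two smallest-magnitude weights in each group of
    four, producing the 2:4 structured-sparsity pattern (toy version,
    no retraining)."""
    g = w.reshape(-1, 4)
    # indices of the two smallest |w| per group of four
    drop = np.argsort(np.abs(g), axis=1)[:, :2]
    out = g.copy()
    np.put_along_axis(out, drop, 0.0, axis=1)
    return out.reshape(w.shape)

w = np.array([0.1, -0.8, 0.05, 0.9, -0.2, 0.3, -0.7, 0.01])
print(prune_2_4(w))  # [ 0.  -0.8  0.   0.9  0.   0.3 -0.7  0. ]
```

Because exactly half the weights in every group are zero at known positions, the hardware can store the matrix compactly and skip the zero multiplications deterministically, unlike unstructured sparsity.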

#### Activation Sparsity Accelerators

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| EIE: Efficient Inference Engine on Compressed Deep Neural Network | ACM SIGARCH Computer Architecture News 2016 | Fundamental paper for hardware-based acceleration |
| Retrospective: EIE: Efficient Inference Engine on Sparse and Compressed Neural Network | | |
| Inducing and Exploiting Activation Sparsity for Fast Neural Network Inference | PMLR 2020 | |
| Accelerating convolutional neural networks via activation map compression | IEEE/CVF 2019 | |
| Compressing DMA Engine: Leveraging Activation Sparsity for Training Deep Neural Networks | IEEE Symposium on High-Performance Computer Architecture 2018 | |
| SNICIT: Accelerating Sparse Neural Network Inference via Compression at Inference Time on GPU | ICPP 2023 | |
| TensorDash: Exploiting Sparsity to Accelerate Deep Neural Network Training | | |
| Accelerating Deep Neural Networks via Semi-Structured Activation Sparsity | ICCV Workshop 2023 | |
| Two sparsities are better than one: unlocking the performance benefits of sparse–sparse networks | | |
| Exploiting Activation Sparsity for Fast CNN Inference on Mobile GPUs | ACM Transactions on Embedded Computing Systems 2021 | |
| DASNet: Dynamic Activation Sparsity for Neural Network Efficiency Improvement | ICTAI 2019 | |
| A novel zero weight/activation-aware hardware architecture of convolutional neural network | | |
| SCNN: An Accelerator for Compressed-Sparse Convolutional Neural Networks | | |
| Going Deeper in Spiking Neural Networks: VGG and Residual Architectures | | |
| Sparsity-Aware and Re-configurable NPU Architecture for Samsung Flagship Mobile SoC | | |

#### Activation Pruning

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| ELSA: Hardware-Software Co-design for Efficient, Lightweight Self-Attention Mechanism in Neural Networks | ISCA 2021 | |
| Sanger: A Co-Design Framework for Enabling Sparse Attention using Reconfigurable Architecture | MICRO 2021 | |
| DOTA: detect and omit weak attentions for scalable transformer acceleration | ASPLOS 2022 | |

### Mixture of Experts (MoE)

No sense in listing them all here; there are good dedicated repos for that.

### Parameter-Efficient Fine-Tuning (PEFT)

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning | | Prerequisite for understanding PEFT |
| LoRA: Low-Rank Adaptation of Large Language Models | | |
| Few-Shot Parameter-Efficient Fine-Tuning is Better and Cheaper than In-Context Learning | | |
| ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization | | Author: Colin Raffel |
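The core idea behind LoRA can be sketched in a few lines: freeze the pretrained weight `W` and train only a low-rank update `B @ A`, with `B` initialized to zero so the adapted model starts out identical to the base model. The dimensions, init scales, and `alpha` below are arbitrary illustrative choices, not the paper's hyperparameters.

```python
import numpy as np

rng = np.random.default_rng(1)
d_out, d_in, r = 8, 8, 2                # r << d: far fewer trainable parameters
W = rng.normal(size=(d_out, d_in))      # frozen pretrained weight
A = rng.normal(size=(r, d_in)) * 0.01   # trainable low-rank factor, small init
B = np.zeros((d_out, r))                # trainable, zero init: update starts at 0

def forward(x, alpha=2.0):
    """Frozen base path plus scaled low-rank adapter path."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
# Before any training, B = 0, so the adapted model equals the base model.
assert np.allclose(forward(x), W @ x)
```

Only `A` and `B` (here 2 * 8 * 2 = 32 values instead of 64) receive gradients, which is what makes fine-tuning cheap in memory and in the size of the shipped update.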

### Dynamic Model Architectures

Too many to list them all for now; refer to the attacks on them below.

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| Dynamic Neural Networks: A Survey | | Good overview of all techniques in Figure 1 |

#### Channel Skipping Architectures

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| Dynamic Channel Pruning: Feature Boosting and Suppression | ICLR 2019 | |
| Channel gating neural networks | NeurIPS 2019 | |

#### Little Big Transformer

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| Speculative Decoding with Big Little Decoder | NeurIPS 2023 (poster) | |
| Big Little Transformer Decoder | | |

#### Latency Predictors

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| On-the-fly, on-chip latency predictors for Edge TPUs | | Google |

## Attacks

### Sparsity

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| Sponge Examples: Energy-Latency Attacks on Neural Networks | EuroS&P 2021 | First paper to define the goal of energy-latency attacks very clearly |
| Energy-Latency Attacks via Sponge Poisoning | | |

### Language Systems

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| Sponge Examples: Energy-Latency Attacks on Neural Networks | EuroS&P 2021 | First paper to define the goal of energy-latency attacks very clearly |
| Bad Characters: Imperceptible NLP Attacks | S&P 2022 | Energy-latency attacks are hard to carry out imperceptibly |
| NMTSloth: understanding and testing efficiency degradation of neural machine translation systems | | |

### Dynamic Model Architectures

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| A Panda? No, It's a Sloth: Slowdown Attacks on Adaptive Multi-Exit Neural Network Inference | | |
| ILFO: Adversarial Attack on Adaptive Neural Networks | | |
| Dynamic Neural Network is All You Need: Understanding the Robustness of Dynamic Mechanisms in Neural Networks | | |
| AntiNODE: Evaluating Efficiency Robustness of Neural ODEs | | |
| The Dark Side of Dynamic Routing Neural Networks: Towards Efficiency Backdoor Injection | | |
| GradMDM: Adversarial Attack on Dynamic Networks | | |
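Several of the attacks above target early-exit (multi-exit) networks: classifiers are attached to intermediate layers, and inference stops as soon as one exit is confident. An input whose intermediate predictions stay ambiguous therefore forces the full depth, and the full cost. A toy sketch with hand-picked logits (no real model or attack optimization):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def exits_evaluated(logits_per_exit, threshold=0.9):
    """Number of exit classifiers run before one clears the
    confidence threshold (all of them if none does)."""
    for i, z in enumerate(logits_per_exit, start=1):
        if softmax(z).max() >= threshold:
            return i
    return len(logits_per_exit)

# Easy input: the first exit is already confident, so later layers are skipped.
confident = [np.array([5.0, 0.0]), np.array([6.0, 0.0]), np.array([7.0, 0.0])]
# Slowdown-style input: every exit stays near 50/50, so the full network runs.
ambiguous = [np.array([0.1, 0.0]), np.array([0.2, 0.0]), np.array([0.1, 0.0])]

print(exits_evaluated(confident))  # 1
print(exits_evaluated(ambiguous))  # 3
```

Slowdown attacks like those listed above search for perturbations that push every intermediate prediction toward this ambiguous regime, nullifying the average-case savings the architecture was designed for.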

### Uncategorized

| Publication | Venue/Journal | Notes |
| --- | --- | --- |
| Phantom Sponges: Exploiting Non-Maximum Suppression to Attack Deep Object Detectors | | |
| SlowLiDAR: Increasing the Latency of LiDAR-Based Detection Using Adversarial Examples | | |
| Manipulating SGD with Data Ordering Attacks | NeurIPS 2021 | |
| NICGSlowDown: Evaluating the Efficiency Robustness of Neural Image Caption Generation Models | | |
