# Introduction to Adversarial Attacks against RL Algorithms

If we want to apply what we have learned about RL algorithms to practical problems, an inevitable question arises: how vulnerable are RL algorithms to malicious manipulation? The importance of this question becomes evident when we consider fields in which RL is already used. One such example is autonomous vehicles. A self-driving car that crashes upon sensing a certain malicious pattern of road markings would be disastrous. Similarly, a vehicle should be as resistant as possible to faulty or manipulated sensor data.


## Precursors in Machine Learning

Of course, RL is not the first field in which such attacks have to be taken into consideration. Image recognition and classification, cousins of RL, are areas of machine learning in which such attacks have long been considered and researched.

Intrusion detection and spam filtering were early subjects of adversarial attack research. An early taxonomy of adversarial attacks on machine learning is outlined in the 2006 paper "Can Machine Learning Be Secure?" by Marco Barreno et al. This paper discusses a variety of adversarial attacks, how they degrade performance, and what defenses existed at the time.

The advent of neural networks and deep learning was soon followed by the discovery of adversarial attacks against them. While RL algorithms may not have been the subject of this research at the time, more recent work shows how these techniques can also be used in adversarial attacks against RL algorithms.


## Types of RL Adversarial Attacks

Different applications of RL algorithms can be attacked in different ways, so we need to differentiate between multiple types of adversarial attacks. The goal of an adversarial attack may vary, yet it is usually to diminish the reward gained by a target agent in an episode.

![Adversariel_eng-2.png](attachment:Adversariel_eng-2.png)

### Crafting an Adversarial Observation

Given the policy $\pi$ of an agent, the next action $a$ is decided based upon the current state $s$: $\pi(s) = a$. If an attacker can replace the state $s$ with a malicious observation $s'$, the agent computes $\pi(s')$ instead of $\pi(s)$, so the attacker may manipulate which action is chosen next. This can be considered an adversarial attack and is in essence very similar to how adversarial attacks against image recognition algorithms are executed.

Usually $s'$ will be created by adding a small perturbation to the true observation $s$. Different metrics can be chosen to specify what qualifies as a "small" perturbation. If the state $s$ is composed of the pixel values of an image, a small perturbation could be either a very small change to each pixel's value or a greater change to very few pixels.
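To make the two notions of "small" more concrete, here is a minimal sketch under strong simplifying assumptions: the policy is a toy linear scorer (not a neural network), the attacker knows its weights, and the function names `policy_action`, `linf_perturbation` and `l0_perturbation` are made up for this illustration. The first attack changes every component by at most `epsilon`, the second changes only `k` components by a larger amount.

```python
import numpy as np

def policy_action(weights, obs):
    """Greedy action of a toy linear policy: argmax over the action scores."""
    return int(np.argmax(weights @ obs))

def _flip_direction(weights, obs):
    """Gradient of (runner-up score - best score) with respect to obs.

    For a linear policy this is just the difference of two weight rows.
    """
    order = np.argsort(weights @ obs)
    best, runner_up = order[-1], order[-2]
    return weights[runner_up] - weights[best]

def linf_perturbation(weights, obs, epsilon):
    """Move every component by at most epsilon (sign-of-gradient step)."""
    return obs + epsilon * np.sign(_flip_direction(weights, obs))

def l0_perturbation(weights, obs, k, magnitude):
    """Move only the k most influential components, each by `magnitude`."""
    grad = _flip_direction(weights, obs)
    idx = np.argsort(np.abs(grad))[-k:]
    perturbed = obs.copy()
    perturbed[idx] += magnitude * np.sign(grad[idx])
    return perturbed

# Hypothetical 2-action policy over a 4-"pixel" observation.
W = np.array([[1.0, 0.5, -0.5, 0.2],
              [0.7, 0.6, -0.4, 0.3]])
s = np.array([0.5, 0.4, 0.1, 0.3])

s_dense = linf_perturbation(W, s, epsilon=0.15)        # small change everywhere
s_sparse = l0_perturbation(W, s, k=1, magnitude=0.3)   # larger change, one pixel

print(policy_action(W, s), policy_action(W, s_dense), policy_action(W, s_sparse))
```

In this toy setup both perturbations are enough to change the chosen action from 0 to 1. Against a real deep RL agent the gradient would have to be obtained by backpropagation or estimated, which this sketch does not cover.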

This attack can be further specialized by only attacking the most impactful states in a trajectory, minimizing the needed interference and thus hopefully the chance of detection.
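How "impactful" a state is can be estimated in several ways. The sketch below uses one possible heuristic, which is an assumption of this example rather than something prescribed above: a time step is worth attacking when the policy strongly prefers one action over the others, measured as the gap between the largest and smallest action probability. The names `select_attack_steps` and `threshold` are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D array of action logits."""
    shifted = logits - np.max(logits)
    exps = np.exp(shifted)
    return exps / exps.sum()

def select_attack_steps(logits_per_step, threshold):
    """Return the time steps at which the policy has a strong action preference.

    Assumed heuristic: a large gap between the most and least likely action
    suggests that choosing the wrong action at this step is costly.
    """
    steps = []
    for t, logits in enumerate(logits_per_step):
        probs = softmax(np.asarray(logits, dtype=float))
        if probs.max() - probs.min() > threshold:
            steps.append(t)
    return steps

# Hypothetical action logits of an agent over a four-step trajectory.
trajectory_logits = [
    [0.1, 0.0, 0.05],   # nearly indifferent
    [3.0, 0.0, -1.0],   # strong preference for action 0
    [0.2, 0.1, 0.0],    # nearly indifferent
    [0.0, 4.0, 0.0],    # strong preference for action 1
]

attack_steps = select_attack_steps(trajectory_logits, threshold=0.5)
print(attack_steps)  # [1, 3]
```

Only the selected steps would then receive a perturbed observation, while all other observations are passed through unchanged.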

An attack based on adversarial observations can be seen in the paper "Adversarial Attacks on Neural Network Policies" by Sandy Huang et al., where it is demonstrated against an agent playing Pong. This approach will be further explored in future posts.

![Bild1.PNG](attachment:Bild1.PNG)

### Adversarial Policies

If we have an environment consisting of multiple agents interacting with each other, another adversarial attack becomes possible. An attacker might fully control one or more agents inside that environment. The goal of these malicious agents might then simply be to diminish the performance of the target agent.
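The idea can be illustrated with a deliberately tiny example. The sketch below assumes a one-shot, two-player game of rock-paper-scissors with a fixed, known victim policy, and lets the adversary pick the action that minimizes the victim's expected reward. The payoff matrix, the victim's action probabilities and the function name `adversary_best_response` are assumptions made for this illustration. It is not the training procedure of any particular paper, where the adversary would typically have to learn its behavior through interaction.

```python
import numpy as np

def adversary_best_response(victim_payoff, victim_policy):
    """Pick the adversary action that minimizes the victim's expected reward.

    victim_payoff[i, j]: victim reward if the victim plays i and the adversary j.
    victim_policy[i]:    probability that the fixed victim plays action i.
    """
    expected = victim_policy @ victim_payoff  # one value per adversary action
    return int(np.argmin(expected)), expected

# Rows: victim plays rock, paper, scissors. Columns: adversary does the same.
payoff = np.array([[ 0, -1,  1],
                   [ 1,  0, -1],
                   [-1,  1,  0]], dtype=float)

# Hypothetical victim that plays rock too often.
victim = np.array([0.6, 0.2, 0.2])

action, expected = adversary_best_response(payoff, victim)
print(action)    # 1, i.e. the adversary plays paper
print(expected)  # victim's expected reward for each adversary action
```

The adversary never touches the victim's observations directly. It only acts within the rules of the environment, yet the victim's expected reward drops from 0 (against a uniformly random opponent) to about -0.4.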

This type of attack is less explored so far, as multi-agent environments are in general less explored and many common RL frameworks fail to support them properly. Even though this type of attack is still new, we will present the work done by Adam Gleave et al. in the paper "Adversarial Policies: Attacking Deep Reinforcement Learning" in a future post.

## Conclusion

By showcasing some applications and types of adversarial attacks on RL algorithms, we hope to have given you a first understanding of the security of RL algorithms and their possible vulnerabilities. The following posts will look at these attacks, and at the considerations needed for secure algorithms, in more detail.

## References

- [Can Machine Learning Be Secure? Marco Barreno et al., 2006](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.72.4400&rep=rep1&type=pdf)
- [Adversarial Attacks on Neural Network Policies, Sandy Huang et al., 2017](https://arxiv.org/abs/1702.02284)

- [Adversarial Policies: Attacking Deep Reinforcement Learning, Adam Gleave et al., 2019](https://arxiv.org/abs/1905.10615)