# Tactical Approaches to Adversarial Attacks on RL

## Adversarial attacks

<!-- 
- fundamentals of adversarial attacks
- keywords
- goals of attacks
- how agent and environment are affected 
- focus of this post 
-->

First of all, let us introduce the fundamental concepts of adversarial attacks. The overall goal of an adversarial attack is to minimize the agent's reward by manipulating the actions it chooses.

To achieve this goal, an adversarial attack impairs the performance of a trained model, in our case a model trained with RL, by feeding it false information. This so-called ***adversarial sample*** usually consists of a perturbed version of the original observation returned by the environment. The adversarial sample is meant to push the agent toward an undesired action, ideally the least desired one, while staying similar enough to a valid observation that it is not easily detected.

The ***adversarial perturbation*** is the noise added to the observation when the sample is crafted, and the instance or agent crafting the samples is called the ***adversary***. Furthermore, we distinguish ***white-box attacks*** from ***black-box attacks***. In a white-box attack the adversary has full access to the target model, whereas in a black-box attack it has no information about the model it attacks. Attacks in which the adversary has limited information about the target model, but never its parameters, are sometimes further classified as ***semi-black-box attacks***.
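To make the relationship between observation, perturbation and adversarial sample concrete, here is a minimal sketch. It assumes observations are flat lists of values in $[0, 1]$ and that the perturbation is bounded by a hypothetical budget `epsilon` per entry. The function names and the random noise are illustrative assumptions only; a real adversary would choose the perturbation by optimizing against the target model rather than sampling it at random.

```python
import random
from typing import List, Sequence


def craft_sample(observation: Sequence[float],
                 perturbation: Sequence[float],
                 epsilon: float,
                 low: float = 0.0,
                 high: float = 1.0) -> List[float]:
    """Add a perturbation to an observation, limited to +/- epsilon per entry."""
    sample = []
    for value, delta in zip(observation, perturbation):
        # Keep the perturbation inside the allowed budget.
        bounded = max(-epsilon, min(epsilon, delta))
        # Keep the result inside the valid observation range.
        sample.append(max(low, min(high, value + bounded)))
    return sample


def linf_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Largest absolute difference between two observations."""
    return max(abs(x - y) for x, y in zip(a, b))


rng = random.Random(0)
observation = [rng.random() for _ in range(8)]            # stand-in observation
raw_perturbation = [rng.uniform(-1.0, 1.0) for _ in range(8)]  # placeholder noise
adversarial_sample = craft_sample(observation, raw_perturbation, epsilon=0.05)

print(linf_distance(observation, adversarial_sample))
```

The smaller `epsilon` is, the closer the adversarial sample stays to the original observation, which is what makes it hard to detect.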

This post limits itself to ***tactical approaches*** to adversarial attacks as presented in Lin, *et al.* (2017).

## Different types of adversarial attacks

<!-- 
( strategically timed to critical point)(maybe enchanting,antagonist)

- explain basic idea(strategically timed and enchanting)
- explain attack strategy and present functions(informal or formal)
- effects on agent and environment

- introduce critical point strategy( and antagonist attack) ( what are the differences?)
- basic idea and principle
- attack strategy
- effect on agent and environment compared to strategically timed attack 
-->

The simplest and most approachable way to attack an agent with adversarial methods is the ***uniform attack***, in which an adversarial sample is crafted at every single timestep. Because the agent is attacked so often, the accumulated adversarial perturbation is large, which somewhat defeats the idea that adversarial attacks should be hard to detect.

### Strategically timed attack

Lin, *et al.* introduce the so-called ***strategically timed attack***. Even in simple examples it is intuitive that attacks are not equally effective at every timestep. For instance, attacking an agent acting in OpenAI Gym's *CarRacing* environment (introduced in more detail later on) would be less effective on long straight sections of the track than in curves.
To determine when the adversary should craft an adversarial sample, we first compute a function $c$ that compares the probabilities the policy $\pi$ assigns to the agent's most and least preferred actions in state $s_t$:
$$c(s_t) = \max_{a_t}\pi(s_t, a_t) - \min_{a_t}\pi(s_t, a_t)$$

Note that this way of computing $c$ applies only to policy-gradient-based methods such as A3C or PPO, whose policy outputs action probabilities directly.
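As a rough illustration of the formula above, the following sketch computes $c(s_t)$ from a list of action probabilities. The function name `action_preference` and the two example distributions are hypothetical and assume a policy with three discrete actions; they are not taken from Lin, *et al.* or from a specific environment.

```python
import math
from typing import Sequence


def action_preference(action_probs: Sequence[float]) -> float:
    """Return c(s_t) = max_a pi(s_t, a) - min_a pi(s_t, a)."""
    if not action_probs:
        raise ValueError("action_probs must not be empty")
    if not math.isclose(sum(action_probs), 1.0, abs_tol=1e-6):
        raise ValueError("action_probs must sum to 1")
    return max(action_probs) - min(action_probs)


# Hypothetical policy outputs for two states.
straight_section = [0.36, 0.33, 0.31]  # agent is nearly indifferent
sharp_curve = [0.90, 0.07, 0.03]       # agent strongly prefers one action

c_straight = action_preference(straight_section)
c_curve = action_preference(sharp_curve)

print(f"c on straight section: {c_straight:.2f}")
print(f"c in sharp curve:      {c_curve:.2f}")
```

A value of $c$ close to 0 means the agent barely cares which action it takes, so an attack would change little. A value close to 1 marks a state in which the choice of action matters to the agent.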

Next, an adversarial sample is crafted only if $c$ is at least as large as a chosen threshold $\beta$. The number of attacks during an episode therefore depends directly on $\beta$, since it decides at each timestep whether or not a sample is crafted. Put simply, a large threshold results in few attacks, while a small threshold results in many attacks. This affects not only the overall adversarial perturbation but also the effectiveness of the attacks. Choosing $\beta$ wisely therefore determines both the success of an adversarial attack and its perceptibility.
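The effect of the threshold can be sketched with a small simulation. The episode below consists of randomly generated action distributions, which is purely an assumption for illustration and does not come from a trained agent; the helper names are hypothetical as well. Since $c \geq 0$ always holds, a threshold of $\beta = 0$ attacks at every timestep and thus reproduces the uniform attack.

```python
import random
from typing import List, Sequence


def should_attack(action_probs: Sequence[float], beta: float) -> bool:
    """Attack only if c(s_t) = max pi - min pi reaches the threshold beta."""
    c = max(action_probs) - min(action_probs)
    return c >= beta


def random_action_probs(rng: random.Random, n_actions: int) -> List[float]:
    """Placeholder for a policy output: a random distribution over actions."""
    weights = [rng.random() + 1e-9 for _ in range(n_actions)]
    total = sum(weights)
    return [w / total for w in weights]


def count_attacks(episode: Sequence[Sequence[float]], beta: float) -> int:
    """Number of timesteps in the episode at which a sample would be crafted."""
    return sum(should_attack(probs, beta) for probs in episode)


rng = random.Random(0)
episode = [random_action_probs(rng, n_actions=3) for _ in range(200)]

betas = (0.0, 0.3, 0.6, 0.9)
attacks_per_beta = {beta: count_attacks(episode, beta) for beta in betas}

for beta, attacks in attacks_per_beta.items():
    print(f"beta={beta:.1f}: {attacks} of {len(episode)} timesteps attacked")
```

The count can only stay the same or shrink as $\beta$ grows, which is the trade-off between effectiveness and perceptibility described above.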

### Enchanting attack

## Implementation and results
- show process and results through implementation on example
- compare results of attacks 

## Conclusion
- evaluate results of attacks
- explain why critical point attack is more "evolved"
- outlook on application of these attacks on Real world scenarios
- (maybe comparison with other attacks)

# References

- [Carlini and Wagner (2016)](https://ieeexplore.ieee.org/abstract/document/8294186)
- [Lin, _et al._ (2017)](https://arxiv.org/abs/1703.06748)
- [CarRacing](https://gym.openai.com/envs/CarRacing-v0/)