kyungphilDev/Robust-Deep-RL_Soft-Actor-Critic-Approach

Robust Deep RL with Soft Actor Critic approach

Robust Deep RL with a Soft Actor-Critic approach with adversarial perturbation on state observations

I designed a new robust deep RL method, built on Soft Actor-Critic (SAC), that handles adversarial perturbations on state observations. The work is based on SA-MDP, proposed by Zhang et al. (2020). For a more detailed explanation, please see the attached PDF file. 2022 Spring Semester, Personal Project Research, Kyungphil Park

SA-MDP (State-Adversarial MDP)

SA-MDP considers a fixed adversary that perturbs the state observations and takes the worst case, in which the adversary minimizes the Q value as in the following equations. Zhang et al. (2020) define this setting as a state-adversarial MDP (SA-MDP).

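(Equation image: SA-MDP worst-case value definition, from Zhang et al. (2020); see the attached PDF.)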
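For orientation, the worst-case value function in Zhang et al. (2020) can be written roughly as below. This is a reconstruction of that paper's result as I understand it, and not necessarily the exact form used in the attached PDF. The symbols follow that paper's notation: B(s) is the set of perturbed observations the adversary may choose for state s, ν* is the optimal adversary, p is the transition probability, R is the reward, and γ is the discount factor.

```latex
\tilde{V}_{\pi \circ \nu^*}(s)
  = \min_{s_\nu \in B(s)} \sum_{a} \pi(a \mid s_\nu)
    \sum_{s'} p(s' \mid s, a)
    \left[ R(s, a, s') + \gamma \, \tilde{V}_{\pi \circ \nu^*}(s') \right]
```

The agent acts on the perturbed observation s_ν, while the environment still transitions from the true state s.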

SA-SAC Regularizer

(Equation image: SA-SAC regularizer; see the attached PDF.)

SA-SAC

In our work, we need to solve a minimax problem: minimizing the policy loss under the worst-case perturbation of the state observation.

  • objective function

(Equation image: SA-SAC objective function; see the attached PDF.)
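One common way to write a minimax objective of this kind is sketched below. It assumes a KL-divergence regularizer in the style of the policy regularizers in Zhang et al. (2020). The weight κ, the perturbation set B(s), and the exact form of the regularizer are assumptions made for illustration; the attached PDF has the definition actually used.

```latex
\min_{\theta} \; L_{\mathrm{SAC}}(\theta)
  + \kappa \sum_{s} \max_{\hat{s} \in B(s)}
    D_{\mathrm{KL}}\!\left( \pi_\theta(\cdot \mid s) \,\|\, \pi_\theta(\cdot \mid \hat{s}) \right)
```

The inner maximization searches for the perturbed observation that changes the policy output the most, and the outer minimization trains the policy to stay stable under it.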

Codes

I implemented robust deep RL with a Soft Actor-Critic approach for discrete action spaces and tested SA-SAC in several Atari Gym environments. The SAC code is based on bernomone's GitHub code.

Train SA-SAC agent

First, create three new directories: saved_models, videos, and Logs.
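The directories can also be created from Python. This is a minimal sketch: the names are copied from the sentence above, with the second one spelled `videos` on the assumption that this is what the scripts expect, and `make_dirs` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# Directory names as listed above; adjust if the scripts expect different ones.
REQUIRED_DIRS = ("saved_models", "videos", "Logs")


def make_dirs(root="."):
    """Create the output directories under `root` if they do not exist yet."""
    created = []
    for name in REQUIRED_DIRS:
        path = Path(root) / name
        path.mkdir(parents=True, exist_ok=True)  # no error if already present
        created.append(path)
    return created
```

Calling `make_dirs()` from the repository root is safe to repeat, since existing directories are left untouched.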

  • Before you start training, set n_steps, memory_size, train_start, reg_train_start … in the config01.json file.
  • n_steps: total number of steps you want to train.
  • memory_size: replay buffer memory size.
  • train_start: step count at which training begins.
  • reg_train_start: step count at which training with the SA-Regularizer begins.
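The way train_start and reg_train_start split a run into phases can be sketched as follows. This is only an illustration under stated assumptions: the numeric values are hypothetical (not the repository defaults), the phase names are my own labels, and I assume that steps before train_start are used only to fill the replay buffer.

```python
import json

REQUIRED_KEYS = ("n_steps", "memory_size", "train_start", "reg_train_start")


def load_config(text):
    """Parse JSON text and check that the four documented keys are present."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError(f"missing config keys: {missing}")
    return config


def training_phase(step, config):
    """Return which phase a given environment step falls into."""
    if step < config["train_start"]:
        return "collect"  # assumed: only filling the replay buffer
    if step < config["reg_train_start"]:
        return "sac"      # plain SAC updates
    return "sa-sac"       # SAC updates plus the SA regularizer


# Hypothetical values for illustration only.
example = '{"n_steps": 1000, "memory_size": 500, "train_start": 100, "reg_train_start": 400}'
config = load_config(example)
```

With these example values, step 50 falls in the collection phase, step 200 in plain SAC training, and step 400 onward in SA-SAC training.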

train.py (train vanilla SAC)

train.py
	--config=config01.json(default)
	--new=1(default) # set 0 when you load pretrained models
	--game=BeamRider(default) # set any Atari game environment
  • example: python train.py, python train.py --game=Assault
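For readers who want to see how these flags and defaults fit together, here is a minimal argparse sketch. It mirrors only the three options documented above; the actual parsing code in train.py may differ, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser():
    """Parser mirroring the three documented train.py flags and their defaults."""
    parser = argparse.ArgumentParser(description="Train a SAC agent (sketch).")
    parser.add_argument("--config", default="config01.json",
                        help="path to the JSON config file")
    parser.add_argument("--new", type=int, default=1,
                        help="1 = start fresh, 0 = load pretrained models")
    parser.add_argument("--game", default="BeamRider",
                        help="Atari game name")
    return parser


# Equivalent to: python train.py --game=Assault
args = build_parser().parse_args(["--game=Assault"])
```

Any flag that is not passed keeps its default, so `args.new` is 1 and `args.config` is `config01.json` here.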

robust_train.py (train SA-SAC)

robust_train.py
	--config=config01.json(default)
	--new=1(default) # set 0 when you load pretrained models
	--game=BeamRider(default) # set any Atari game environment
  • example: python robust_train.py, python robust_train.py --game=Assault

generate_match_video.py

  • render an Atari game video with your trained models.
generate_match_video.py
	--config=config01.json(default)
	--seed=0(default)
	--game=BeamRider(default) # set any Atari game environment
	--random=False(default) # set 1 when you want to test random actions
  • example: python generate_match_video.py, python generate_match_video.py --game=Assault --random=1

PGD_generate_video.py

(+ PGD attack: adversarial perturbation on state observations)

  • render an Atari game video with your trained models.
PGD_generate_video.py
	--config=config01.json(default)
	--seed=0(default)
	--game=BeamRider(default) # set any Atari game environment
	--steps=10(default) # set the number of PGD attack steps
  • example: python PGD_generate_video.py, python PGD_generate_video.py --game=Assault
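To make the role of the PGD step count concrete, here is a small, dependency-free sketch of a PGD attack on a state observation. It is not the repository's attack code. It assumes a toy linear-softmax policy, an L-infinity perturbation ball of radius `eps`, and a KL-divergence attack objective; all function names and values are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy(weights, state):
    """Toy linear-softmax policy: pi(.|s) = softmax(W s)."""
    return softmax([sum(w * x for w, x in zip(row, state)) for row in weights])


def kl(p, q):
    """KL(p || q) for two discrete distributions with full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def pgd_attack(weights, state, eps, steps=10, seed=0):
    """Search the L-infinity ball of radius eps around `state` for an
    observation that maximizes KL(pi(.|s) || pi(.|s_hat))."""
    rng = random.Random(seed)
    step_size = eps / 4
    clean = policy(weights, state)
    # Random start: the gradient is exactly zero at s_hat == state.
    s_hat = [x + rng.uniform(-eps, eps) for x in state]
    for _ in range(steps):
        perturbed = policy(weights, s_hat)
        # d KL / d logits = perturbed - clean; chain rule through W gives
        # the gradient with respect to the perturbed state.
        diff = [q - p for p, q in zip(clean, perturbed)]
        grad = [sum(diff[a] * weights[a][i] for a in range(len(weights)))
                for i in range(len(state))]
        # Signed ascent step, then projection back onto the eps-ball.
        s_hat = [min(max(x + step_size * math.copysign(1.0, g), s - eps), s + eps)
                 for x, g, s in zip(s_hat, grad, state)]
    return s_hat


# Hypothetical 3-action policy on a 2-dimensional state.
W = [[1.0, -1.0], [-0.5, 2.0], [0.3, 0.3]]
s = [0.2, -0.1]
s_adv = pgd_attack(W, s, eps=0.1, steps=10)
```

More steps give the attack more chances to move toward a corner of the ball where the policy output differs most from the clean one, which is the quantity `--steps` would control in a setup like this.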

evalulation.py

  • test trained models for several episodes.
evalulation.py
	--config=config01.json(default)
	--seed=0(default)
	--game=BeamRider(default) # set any Atari game environment
	--iter=10(default) # set the number of iterations (total number of episodes)
  • example: python evalulation.py, python evalulation.py --game=Assault --iter=30
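Evaluating over several episodes usually means averaging the episode returns. The sketch below shows that idea only; `evaluate` and `dummy_episode` are hypothetical, the dummy produces random numbers instead of real Atari scores, and seeding each episode with `seed + i` is my assumption rather than the repository's behavior.

```python
import random
import statistics


def evaluate(run_episode, iterations=10, seed=0):
    """Run `iterations` episodes and return (mean, population std) of the returns."""
    returns = [run_episode(seed + i) for i in range(iterations)]
    return statistics.mean(returns), statistics.pstdev(returns)


def dummy_episode(seed):
    """Hypothetical stand-in for one Atari episode; returns a fake episode return."""
    return random.Random(seed).uniform(0.0, 100.0)


# Mirrors --iter=30 from the example above.
mean_return, std_return = evaluate(dummy_episode, iterations=30)
```

In real use, `run_episode` would play one full game with the trained policy and return the total reward.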

pgd_evalulation.py

(+ PGD attack: adversarial perturbation on state observations)

  • test trained models for several episodes.
pgd_evalulation.py
	--config=config01.json(default)
	--seed=0(default)
	--game=BeamRider(default) # set any Atari game environment
	--iter=10(default) # set the number of iterations (total number of episodes)
  • example: python pgd_evalulation.py, python pgd_evalulation.py --game=Assault --iter=30

Results

(Result figures: two untitled images.)

References

[1] Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations, Zhang et al. (2020)

[2] Discrete Soft Actor Critic, bernomone's GitHub code
