jhunter533/SANSAC

Spiking Actor Network Soft Actor Critic (SANSAC)

This project implements Soft Actor Critic (SAC), which can optionally be run as a Spiking Actor Network Soft Actor Critic (SANSAC) via SpikingJelly. It uses the BipedalWalker environment from Gymnasium, but it can be used with other Gymnasium environments by modifying the input dimensions.
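
Adapting the code to another environment mostly comes down to reading the observation and action sizes from the environment's spaces instead of hard-coding them. The following is a minimal sketch, assuming flat continuous (Box) observation and action spaces. `infer_dims` is a hypothetical helper for illustration and is not part of this repository.

```python
from types import SimpleNamespace


def infer_dims(env):
    """Return (obs_dim, act_dim) for an env with flat Box spaces.

    Hypothetical helper for illustration; the repository may set
    these dimensions differently.
    """
    obs_shape = env.observation_space.shape
    act_shape = env.action_space.shape
    if len(obs_shape) != 1 or len(act_shape) != 1:
        raise ValueError("expected flat (1-D) observation and action spaces")
    return obs_shape[0], act_shape[0]


# Stand-in object with BipedalWalker-v3's shapes (24 observations, 4 actions),
# so the sketch runs without gymnasium installed.
fake_env = SimpleNamespace(
    observation_space=SimpleNamespace(shape=(24,)),
    action_space=SimpleNamespace(shape=(4,)),
)
obs_dim, act_dim = infer_dims(fake_env)
print(obs_dim, act_dim)
```

With a real environment, the same function can be called on the object returned by `gym.make(...)`. Environments with discrete actions or image observations would be rejected by the shape check and need separate handling.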

The code also uses wandb to log data; the logging code is not integral to the main algorithm.

How to Use

Install dependencies:

pip install torch gymnasium matplotlib numpy pynvml wandb spikingjelly

GPU acceleration additionally requires CuPy and CUDA. These can be disabled if needed.
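
One way to make the GPU path optional is to check whether CuPy is installed before asking for it. SpikingJelly's neuron modules accept a `backend` of either `"torch"` or `"cupy"`; how this repository wires that choice is not shown here, so the sketch below is an assumption. `pick_backend` is a hypothetical helper.

```python
import importlib.util


def pick_backend(prefer_gpu=True):
    """Return "cupy" only if CuPy is installed and requested, else "torch".

    Hypothetical helper; it checks for the package without importing it.
    """
    has_cupy = importlib.util.find_spec("cupy") is not None
    return "cupy" if (prefer_gpu and has_cupy) else "torch"


backend = pick_backend()
print(backend)
```

Passing `prefer_gpu=False` always falls back to the pure PyTorch backend, which is a simple way to disable CuPy on a machine without CUDA.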

Run training:

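import gymnasium as gym
# Agent is the class defined in this repository; import it from its module.

env = gym.make("BipedalWalker-v3")
agent = Agent(env, hidden_dim=256, hidden_dim2=256, seed=42, SAN=True)
agent.train(9000)
agent.save_model()
avg_reward = agent.eval_agent()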

Alternatively, the code can be run through runner.py, with wandb enabled or disabled as needed.
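
Because logging is separate from the algorithm, it can be hidden behind a small switch. The sketch below is an assumption about how such a toggle could look, not the contents of runner.py. `make_logger` and the project name `"sansac"` are hypothetical, and wandb is imported only when logging is enabled.

```python
def make_logger(use_wandb=False, project="sansac"):
    """Return a log(metrics, step=None) callable.

    Hypothetical helper: a no-op when use_wandb is False, otherwise
    a thin wrapper around wandb.log.
    """
    if not use_wandb:
        # Logging disabled: accept the same arguments and do nothing.
        return lambda metrics, step=None: None

    import wandb  # imported lazily so wandb is optional

    wandb.init(project=project)
    return lambda metrics, step=None: wandb.log(metrics, step=step)


log = make_logger(use_wandb=False)
log({"episode_reward": 0.0}, step=0)
```

Training code can then call `log(...)` unconditionally, and only the one flag decides whether anything is sent to wandb.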

References

  1. Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel & Sergey Levine. Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. CoRR, abs/1801.01290 (2018).
    ⟨http://arxiv.org/abs/1801.01290⟩

  2. Wei Fang, Yanqi Chen, Jianhao Ding, Zhaofei Yu, Timothée Masquelier, Ding Chen, Liwei Huang, Huihui Zhou, Guoqi Li & Yonghong Tian. SpikingJelly: An open-source machine learning infrastructure platform for spike-based intelligence. Science Advances 9(40): eadi1480 (2023).
    doi: 10.1126/sciadv.adi1480 ⟨https://www.science.org/doi/10.1126/sciadv.adi1480⟩

  3. Lukas Biewald. Experiment Tracking with Weights and Biases. (2020).
    Software: ⟨https://www.wandb.com/⟩

  4. Mark Towers, Ariel Kwiatkowski, Jordan Terry, John U. Balis, Gianluca De Cola, Tristan Deleu, Manuel Goulão, Andreas Kallinteris, Markus Krimmel, Arjun KG et al. Gymnasium: A Standard Interface for Reinforcement Learning Environments. arXiv:2407.17032 (2024).
    ⟨https://arxiv.org/abs/2407.17032⟩
