CSCE 642 Project

A team project for a Reinforcement Learning course. We trained autonomous driving agents in a simulated highway environment using the Double Deep Q-Learning (DDQN) and Advantage Actor-Critic (A2C) algorithms.

Setup

You can set up the environment using either conda + pip or a virtual environment + pip; Python 3.10.12 is required.
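For example, with conda (the environment name below is illustrative):

conda create -n autodriverl python=3.10.12
conda activate autodriverl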

To install the required packages, run:

pip install -r requirements.txt

Running the Agents

To run the agents, run:

python3 main.py <args>

A full list of arguments can be found at the bottom of this README. To reproduce our results, we ran the following commands.

DQN

For Training: we used these hyperparameters, with alpha set to 0.0, 0.5, and 1.0:

python3 main.py -s 5000 -n 1024 -l 0.0001 -g 0.99 -d 90 -E 0.9 -m 1000 -N 100 -B 32 -a <alpha> -L 5 -S ddqn -M train

And for Testing:

python3 main.py -S ddqn -M test -e 100 -t 1000
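As background, Double DQN decouples action selection from action evaluation: the online network picks the greedy next action and the target network scores it. Below is a minimal sketch of that target, using the same discount factor (0.99) as the command above; it is written in PyTorch purely for illustration, and the function and network names are not taken from our code.

import torch

def ddqn_targets(rewards, next_states, dones, online_net, target_net, gamma=0.99):
    # Online network selects the greedy next action...
    next_actions = online_net(next_states).argmax(dim=1, keepdim=True)
    # ...and the target network evaluates it (the "double" part of Double DQN).
    next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
    # Terminal transitions contribute only the immediate reward.
    return rewards + gamma * (1.0 - dones) * next_q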

A2C

For Training:

python3 main.py -s 20000 -n 2048 -l 0.0001 -g 0.99 -d 90 -S a2c -M train

And for Testing:

python3 main.py -S a2c -M test -e 100 -t 1000
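For reference, Advantage Actor-Critic trains a policy (actor) and a value function (critic) from the same experience. A minimal single-step sketch follows, again in PyTorch purely for illustration; the exact loss weighting and network layout in our code may differ.

import torch

def a2c_losses(log_prob, value, next_value, reward, done, gamma=0.99):
    # Bootstrapped one-step target and the resulting advantage estimate.
    target = reward + gamma * (1.0 - done) * next_value.detach()
    advantage = target - value
    # Actor: increase the log-probability of actions with positive advantage.
    actor_loss = -(log_prob * advantage.detach()).mean()
    # Critic: regress the value estimate toward the bootstrapped target.
    critic_loss = advantage.pow(2).mean()
    return actor_loss, critic_loss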

Note that -M test expects trained models to already exist for all three alpha values (0.0, 0.5, and 1.0). Our trained models are already saved.

Plotting Data

Once the data has been collected:

python3 main.py -S <any_solver> -M plot_both

Note that plotting will only occur if the corresponding pickle files are present and in the correct directory.
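If you want to inspect the collected data outside of plot_both, the pickle files can also be loaded directly. A rough sketch is below; the file name and the structure of the stored data are assumptions for illustration, not guarantees about our code.

import pickle
import matplotlib.pyplot as plt

# Hypothetical file name; substitute the pickle produced by your training run.
with open("ddqn_rewards.pkl", "rb") as f:
    rewards = pickle.load(f)  # assumed here to be a list of per-episode returns

plt.plot(rewards)
plt.xlabel("Episode")
plt.ylabel("Return")
plt.show()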

Command-line Arguments

You must specify the -S flag (which solver to run) and the -M flag (train, test, or plot_both). All other arguments are optional and have the default values described below. To list all of the arguments, use the -h flag.

Argument | Description | Default Value
-e | Number of episodes for evaluation (only needed with -M test) | 100
-s | Number of environment steps to train for | 5000
-t | Max steps per episode in testing (only needed with -M test) | 1000
-n | Number of dense neurons per layer | 1024
-l | Learning rate | 0.0001
-g | Discount factor | 0.99
-d | Max duration per episode (seconds) | 90
-E | Initial epsilon for epsilon-greedy in DQN (we use linear decay to 0 over the first 5000 steps) | 0.9
-m | Max replay memory size for DQN | 1000
-N | Interval (in steps) for updating the target network in DQN | 100
-B | Batch size for sampling from the DQN replay memory | 32
-a | Alpha for Prioritized Experience Replay | 0.0
-L | Number of dense layers in the DQN Q-network | 5
-S | Which solver to run, one of ddqn or a2c | None
-M | Mode, one of train, test, or plot_both | None
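Two of the DQN flags above have a simple closed form that may help when choosing values: -E decays linearly to 0 over the first 5000 steps, and -a is the standard Prioritized Experience Replay exponent, where sampling probabilities are proportional to priority^alpha, so alpha = 0.0 recovers uniform replay and alpha = 1.0 samples fully by priority. A small illustrative sketch (not our exact implementation):

import numpy as np

def linear_epsilon(step, eps_start=0.9, decay_steps=5000):
    # -E: linear decay from the initial epsilon to 0 over the first 5000 steps.
    return max(0.0, eps_start * (1.0 - step / decay_steps))

def per_sampling_probs(priorities, alpha=0.5):
    # -a: P(i) = p_i^alpha / sum_k p_k^alpha; alpha = 0 gives uniform sampling.
    scaled = np.asarray(priorities, dtype=float) ** alpha
    return scaled / scaled.sum()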

Citation

The source code for the environment requests that we include this citation.

@misc{highway-env,
  author = {Leurent, Edouard},
  title = {An Environment for Autonomous Driving Decision-Making},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/eleurent/highway-env}},
}
