RadDQN

RadDQN: a Deep Q Learning-based Architecture for Finding Time-efficient Minimum Radiation Exposure Pathway

Description: We introduce RadDQN architecture that works on efficient reward-based function to provide time-efficient minimum radiation exposure based optimal path. Inclusion of information about position and strength of radiation source, radial distances of the agent to the radioactive sources and from destination makes the reward-based function effective to deal with multiple scenarios that significantly differs on distribution of radiation field. Further, we propose unique exploration strategies that enables the conversion of random action into a model-directed action based on the importance of future reward and progress of the training. In our article (link), we have demonstrated the performance of agent in multiple scenarios with varying number of source and its strength. Moreover, we benchmarked the predicted path with grid-based deterministic Dijkstra algorithm. Our model is found to achieve superior convergence rate and high training stability as compared to vanilla DQN.

Link of the article (preprint) at ArXiv https://arxiv.org/abs/2402.00468

Flowsheet of RadDQN architechture:

Radiation Aware Reward Structure:

Radiation-aware reward function for two sources of unit radiation strength at (2,0) and (7,7). Reward \textbf{r} results from the subtraction of $\frac{n}{R_{e}}$ from $\sum_i{\frac{\Gamma S_i}{R_{s,i}^{2}}}$. In the figure, the start and exit cell are symbolized as S and E, respectively. The value of $\Gamma$ and n are taken as 1.

Run command:

User needs to provide the path for saving log file. The seed may be assigned using --seed option.

python main.py --config_file configs/dqn_ef_1_s_0.yaml --logdir log/seed_3007_ef_1_s_0 --seed 3007

Scenario:

Three radioactive sources on the simulated floor. The task of the agent is to reach destination in quickest time but with exposure of minimum radiation intensity (cumulatively).

The above plot shows the optimum path (black dashed line) predicted by RadDQN in case of three sources (S1, S2 and S3) in simulated floor. The important diversion points within the predicted path in response to the change in radiation intensity of sources are shown as A/B/C/D. (a) Top panel: S1, S2 and S3 has equal radiation strength (5 unit) (b) bottom left panel: radiation strength of S1 source is increased by 20-fold. The trajectory has three major diversion points to minimize the radiation exposure. (c) bottom right panel: radiation strength of S1 and S2 sources are increased by 20-fold. The trajectory has four major diversion points to minimize the radiation exposure.

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
configs		configs
models		models
GridBoard.py		GridBoard.py
Gridworld_v6static_3sources.py		Gridworld_v6static_3sources.py
LICENSE		LICENSE
README.md		README.md
gridworld_config.py		gridworld_config.py
info.log		info.log
main.py		main.py
run_script.sh		run_script.sh
trainer.py		trainer.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

configs

configs

models

models

GridBoard.py

GridBoard.py

Gridworld_v6static_3sources.py

Gridworld_v6static_3sources.py

LICENSE

LICENSE

README.md

README.md

gridworld_config.py

gridworld_config.py

info.log

info.log

main.py

main.py

run_script.sh

run_script.sh

trainer.py

trainer.py

Repository files navigation

RadDQN

About

Releases

Packages

Languages

License

BiswajitSadhu/RadDQN

Folders and files

Latest commit

History

Repository files navigation

RadDQN

About

Resources

License

Stars

Watchers

Forks

Languages