Deep Reinforcement Learning for Secrecy Energy-Efficient UAV Communication with Reconfigurable Intelligent Surfaces
IEEE Wireless Communications and Networking Conference 2023 (WCNC 2023)
Simulation for Conference Proceedings https://doi.org/10.1109/WCNC55385.2023.10118891
Refer to this link for the preprint.
This paper investigates the physical layer security (PLS) issue in reconfigurable intelligent surface (RIS) aided millimeter-wave rotary-wing unmanned aerial vehicle (UAV) communications in the presence of multiple eavesdroppers and imperfect channel state information (CSI). The goal is to maximize the worst-case secrecy energy efficiency (SEE) of the UAV via a joint optimization of flight trajectory, UAV active beamforming and RIS passive beamforming. By interacting with the dynamically changing UAV environment, real-time decision making per time slot is possible via deep reinforcement learning (DRL). To decouple the continuous optimization variables, we introduce a twin twin-delayed deep deterministic policy gradient (TTD3) algorithm to maximize the expected cumulative reward, which is linked to SEE enhancement. Simulation results confirm that the proposed method achieves greater secrecy energy savings than the traditional twin deep deterministic policy gradient DRL (TDDRL)-based method.
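For orientation, the SEE objective is essentially a rate-per-energy ratio; a hedged sketch of the definition (the notation below is ours, not copied from the paper):

$$\mathrm{SEE} = \frac{\sum_{n=1}^{N} R_{\mathrm{sec}}[n]}{\sum_{n=1}^{N} E[n]}$$

where $R_{\mathrm{sec}}[n]$ is the worst-case secrecy rate in time slot $n$ and $E[n]$ is the UAV's energy consumption in that slot, consistent with the bits/s/Hz/kJ unit used in the benchmark table below.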
RIS-aided mmWave UAV system in the presence of eavesdroppers and imperfect channel state information (CSI)
A Twin-TD3 (TTD3) algorithm to decouple the joint optimization of:
1. UAV active beamforming and RIS passive beamforming
2. UAV flight trajectory
We adopt a double-DRL framework, where the first and second agents provide the policies for tasks (1) and (2), respectively, as sketched below.
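As a rough illustration of the double-DRL loop (the `DummyEnv`/`DummyAgent` classes and all dimensions below are our own stand-ins, not the repo's actual API), each agent observes the shared environment, acts on its own sub-task, and learns from the common reward:

```python
import numpy as np

# Hypothetical stand-ins for the repo's environment and agents (illustration only).
class DummyEnv:
    def reset(self):
        return np.zeros(4)                      # toy state vector

    def step(self, a_beam, a_traj):
        next_state = np.random.randn(4)
        reward = float(np.random.rand())        # stands in for the SSR/SEE reward
        return next_state, reward, False        # (next_state, reward, done)

class DummyAgent:
    def __init__(self, act_dim):
        self.act_dim = act_dim

    def act(self, state):
        return np.random.uniform(-1, 1, self.act_dim)

    def store(self, *transition):
        pass                                    # replay buffer would go here

    def learn(self):
        pass                                    # DDPG/TD3 updates would go here

env = DummyEnv()
agent_beam = DummyAgent(act_dim=8)              # task (1): UAV active + RIS passive beamforming
agent_traj = DummyAgent(act_dim=2)              # task (2): UAV displacement per time slot

for episode in range(2):
    state = env.reset()
    for t in range(10):
        a_beam, a_traj = agent_beam.act(state), agent_traj.act(state)
        next_state, reward, done = env.step(a_beam, a_traj)
        # Both agents share the same reward signal but learn separate policies.
        agent_beam.store(state, a_beam, reward, next_state, done)
        agent_traj.store(state, a_traj, reward, next_state, done)
        agent_beam.learn()
        agent_traj.learn()
        state = next_state
```

Splitting the joint action this way keeps each policy's continuous action space small, which is the point of decoupling the optimization variables.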
Set up the repo

```bash
conda create --name <env> python=3.10.4
conda activate <env>
git clone https://github.com/yjwong1999/Twin-TD3.git
cd Twin-TD3
pip install -r requirements.txt
```
You can train two types of algorithms:
- Twin DDPG: the TDDRL algorithm
- Twin TD3: our proposed TTD3 algorithm
Run the following in bash or PowerShell. `main_train.py` is the main Python script for training the DRL algorithms.
```bash
# To use Twin DDPG with SSR as the optimization goal
python3 main_train.py --drl ddpg --reward ssr

# To use Twin TD3 with SSR as the optimization goal
python3 main_train.py --drl td3 --reward ssr

# To use Twin DDPG with SEE as the optimization goal
python3 main_train.py --drl ddpg --reward see

# To use Twin TD3 with SEE as the optimization goal
python3 main_train.py --drl td3 --reward see

# To use the pretrained DRL for the UAV trajectory (recommended for stable convergence)
python3 main_train.py --drl td3 --reward see --trained-uav

# To set the number of episodes (default is 300)
python3 main_train.py --drl td3 --reward see --ep-num 300

# To set seeds for DRL weight initialization (not recommended)
python3 main_train.py --drl td3 --reward see --seeds 0    # weights of both DRLs are initialized with seed 0
python3 main_train.py --drl td3 --reward see --seeds 0 1  # weights of DRL 1 and DRL 2 are initialized with seeds 0 and 1, respectively
```
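For reference, the flags above correspond to a standard argparse interface; a minimal sketch of what the parser might look like (a hypothetical reconstruction; check `main_train.py` for the actual definitions):

```python
import argparse

# Hypothetical reconstruction of the CLI documented above; the real parser in
# main_train.py may differ in defaults and help strings.
parser = argparse.ArgumentParser()
parser.add_argument('--drl', choices=['ddpg', 'td3'], default='td3',
                    help='DRL algorithm used by both agents')
parser.add_argument('--reward', choices=['ssr', 'see'], default='see',
                    help='optimization goal used as the reward signal')
parser.add_argument('--trained-uav', action='store_true',
                    help='load the pretrained DRL for the UAV trajectory')
parser.add_argument('--ep-num', type=int, default=300,
                    help='number of training episodes')
parser.add_argument('--seeds', type=int, nargs='+', default=None,
                    help='seed(s) for weight initialization of the two DRLs')
args = parser.parse_args()
```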
`run_simulation.py` is the Python script that runs the simulation using your trained models.

```bash
# run the simulation using the trained models
python3 run_simulation.py --path data/storage/scratch/<DIR>      # if you trained the algorithm without the pretrained UAV
python3 run_simulation.py --path data/storage/trained_uav/<DIR>  # if you trained the algorithm with the pretrained UAV
```
`load_and_plot.py` is the Python script that plots the (i) rewards, (ii) sum secrecy rate (SSR), (iii) secrecy energy efficiency (SEE), (iv) UAV trajectory, and (v) RIS configurations for each episode in one experiment. The plotted figures are saved at `data/storage/scratch/<DIR>/plot` or `data/storage/trained_uav/<DIR>/plot`.

```bash
# plot everything for each episode
python3 load_and_plot.py --path data/storage/scratch/<DIR> --ep-num 300      # if you trained the algorithm without the pretrained UAV
python3 load_and_plot.py --path data/storage/trained_uav/<DIR> --ep-num 300  # if you trained the algorithm with the pretrained UAV
```
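If you want to inspect the saved metrics yourself, here is a minimal sketch in the spirit of `load_and_plot.py` (the file name `rewards.npy` and the array layout are assumptions, not the repo's documented output format):

```python
import os
import numpy as np
import matplotlib.pyplot as plt

# Assumed layout: per-episode rewards stored as a NumPy array in the experiment
# directory. The file name 'rewards.npy' is hypothetical; replace <DIR> with
# your actual experiment folder.
path = 'data/storage/scratch/<DIR>'
rewards = np.load(f'{path}/rewards.npy')

os.makedirs(f'{path}/plot', exist_ok=True)
plt.plot(rewards)
plt.xlabel('Episode')
plt.ylabel('Reward')
plt.savefig(f'{path}/plot/rewards.png')
```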
Note that you can use the bash scripts `batch_train.sh` and `batch_eval.sh` to train the algorithms and evaluate them in batch using the previous two Python scripts.

```bash
# To train in batch
bash batch_train.sh

# To evaluate in batch
bash batch_eval.sh
```
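Conceptually, `batch_train.sh` just loops over the flag combinations shown earlier; a rough Python equivalent using `subprocess` (the shipped script's exact combinations and ordering may differ):

```python
import subprocess

# Hypothetical batch runner mirroring the spirit of batch_train.sh; the shipped
# shell script may sweep a different set of flag combinations.
for drl in ('ddpg', 'td3'):
    for reward in ('ssr', 'see'):
        subprocess.run(
            ['python3', 'main_train.py', '--drl', drl, '--reward', reward],
            check=True,
        )
```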
We ran `main_train.py` five times for each of the settings below and averaged the performance. For SSR and SEE, higher is better; for total energy consumption, lower is better.
| Algorithms | SSR (bits/s/Hz) | Energy (kJ) | SEE (bits/s/Hz/kJ) |
|---|---|---|---|
| TDDRL | 5.03 | 12.4 | 40.8 |
| TTD3 | 6.05 | 12.7 | 48.2 |
| TDDRL (with energy constraint) | 4.68 | 11.2 | 39.4 |
| TTD3 (with energy constraint) | 5.39 | 11.2 | 48.4 |
Summary
- In terms of SSR, TTD3 outperforms TDDRL with or without the energy constraint
- In terms of SEE and energy, TTD3 (with energy constraint) outperforms all other algorithms
- Generally, the TTD3 algorithm is better than TDDRL
- Even with the energy constraint (a trade-off between energy consumption and SSR), TTD3 outperforms TDDRL in all aspects
* Remarks:
Note that the performance of DRL (especially twin DRL) exhibits large variance; sometimes you may get extremely good (or bad) performance.
The benchmark results above are the averaged performance of several experiments, to give a more holistic understanding of the algorithms.
It is advised to use the benchmark UAV models we trained, for better convergence.
This approach is consistent with the code provided by the TDDRL authors.
This work was supported by the British Council under UK-ASEAN Institutional Links Early Career Researchers Scheme with project number 913030644.
Both the RIS simulation and the system model for this research project are based on the research work provided by Brook1711.
We intended to fork the original repo for the system model (as stated below) as the base of this project.
However, GitHub does not allow a forked repo to be private.
Hence, we could not base our code on a forked version of the original repo while keeping it private until the project was completed.
We would like to express our utmost gratitude to Brook1711 and his co-authors for their research work.
RIS Simulation is based on the following research work:
SimRIS Channel Simulator for Reconfigurable Intelligent Surface-Empowered Communication Systems
The original simulation code is written in MATLAB; this GitHub repo provides a Python version of the simulation.
The simulation of the System Model is provided by the following research work:
Learning-Based Robust and Secure Transmission for Reconfigurable Intelligent Surface Aided Millimeter Wave UAV Communications
The code is provided in this GitHub repo.
We can derive the Rotary-Wing UAV’s propulsion energy consumption based on the following research work:
Energy Minimization in Internet-of-Things System Based on Rotary-Wing UAV
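For context, the rotary-wing propulsion model used in this line of work expresses power as the sum of blade-profile, induced, and parasite terms; per-slot energy is then power times slot duration. A minimal sketch, assuming the standard closed form with typical example parameter values (not necessarily the paper's exact settings):

```python
import math

def propulsion_power(V, P0=79.86, P1=88.63, U_tip=120.0, v0=4.03,
                     d0=0.6, rho=1.225, s=0.05, A=0.503):
    """Rotary-wing UAV propulsion power [W] at forward speed V [m/s].

    Standard model: blade-profile + induced + parasite power. Default
    parameters are typical example values, not this paper's settings.
    """
    blade_profile = P0 * (1.0 + 3.0 * V**2 / U_tip**2)
    induced = P1 * math.sqrt(
        math.sqrt(1.0 + V**4 / (4.0 * v0**4)) - V**2 / (2.0 * v0**2))
    parasite = 0.5 * d0 * rho * s * A * V**3
    return blade_profile + induced + parasite

# Hover power (V = 0) reduces to P0 + P1; energy per slot is P(V) * slot duration.
print(propulsion_power(0.0))
print(propulsion_power(20.0))
```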
Main reference for TD3 implementation:
PyTorch/TensorFlow 2.0 for TD3
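For readers comparing TD3 with DDPG, TD3's three key ingredients are clipped double-Q learning, target policy smoothing, and delayed actor updates. A compact PyTorch-style sketch of the update step (our own illustration, not the referenced implementation):

```python
import torch

# Sketch of TD3's core update step (our own illustration, not the referenced repo).
# Assumes actor/critic networks, their target copies, and optimizers already exist;
# critics are modules taking (state, action) and returning a Q-value tensor.
def td3_update(step, batch, actor, actor_t, critic1, critic2, critic1_t, critic2_t,
               critic_opt, actor_opt, gamma=0.99, tau=0.005, policy_noise=0.2,
               noise_clip=0.5, policy_delay=2, max_action=1.0):
    s, a, r, s2, done = batch
    with torch.no_grad():
        # 1) Target policy smoothing: perturb the target action with clipped noise.
        noise = (torch.randn_like(a) * policy_noise).clamp(-noise_clip, noise_clip)
        a2 = (actor_t(s2) + noise).clamp(-max_action, max_action)
        # 2) Clipped double-Q: bootstrap from the smaller of the two target critics.
        q_next = torch.min(critic1_t(s2, a2), critic2_t(s2, a2))
        q_target = r + gamma * (1.0 - done) * q_next
    critic_loss = ((critic1(s, a) - q_target) ** 2).mean() \
                + ((critic2(s, a) - q_target) ** 2).mean()
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()
    # 3) Delayed updates: refresh the actor and targets only every policy_delay steps.
    if step % policy_delay == 0:
        actor_loss = -critic1(s, actor(s)).mean()
        actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()
        for net, net_t in ((actor, actor_t), (critic1, critic1_t), (critic2, critic2_t)):
            for p, p_t in zip(net.parameters(), net_t.parameters()):
                p_t.data.mul_(1.0 - tau).add_(tau * p.data)
```

DDPG is essentially the same loop without these three mechanisms, which is why TD3 tends to suffer less from Q-value overestimation.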
- Add argparse arguments to set the DRL algorithm and reward type
- Add argparse arguments to set the episode number
- Add argparse arguments to set seeds for the two DRLs
- Add argparse arguments to load the pretrained DRL for the UAV trajectory
- Add benchmark/pretrained model
- Project naming (use instead of using datetime format)
- Remove saving the "best model"; there is no best model, only the latest model
- The following scripts can be used, but you have to manually change the file path in the scripts:
`plot_ssr.py` is the Python script that plots the final episode's SSR for the 4 benchmarks in the paper.

```bash
# plot SSR
python3 plot_ssr.py
```
`plot_see.py` is the Python script that plots the final episode's SEE for the 4 benchmarks in the paper.

```bash
# plot SEE
python3 plot_see.py
```
`plot_traj.py` is the Python script that plots the final episode's UAV trajectory for the 4 benchmarks in the paper.

```bash
# plot UAV trajectory
python3 plot_traj.py
```
@INPROCEEDINGS{10118891,
author={Tham, Mau-Luen and Wong, Yi Jie and Iqbal, Amjad and Ramli, Nordin Bin and Zhu, Yongxu and Dagiuklas, Tasos},
booktitle={2023 IEEE Wireless Communications and Networking Conference (WCNC)},
title={Deep Reinforcement Learning for Secrecy Energy-Efficient UAV Communication with Reconfigurable Intelligent Surface},
year={2023},
doi={10.1109/WCNC55385.2023.10118891}}