PQN-for-Joint-Optimization-in-Massive-MIMO-Networks

  • We achieve joint optimization of spectral efficiency (SE) and energy efficiency (EE) in massive MIMO networks by approximating the Pareto front of the two objectives in a 2D objective space, using a replay buffer and a DQN model.

  • We believe the proposed Pareto Q-Network (PQN) is valuable for understanding reinforcement-learning-based approaches to multi-objective joint optimization and for gaining insight into this area.

Introduction

  • Reinforcement learning (RL) is a powerful tool for optimization and sequential decision-making, learning to maximize reward through interaction with an environment. However, RL struggles in real-world problems that involve multiple objectives, or dynamic environments whose priorities shift over time; in such settings, adaptive decision-making performance can degrade. To address this, we designed a multi-objective RL algorithm that jointly optimizes SE and EE, two critical and often conflicting objectives in 5G networks.

Massive MIMO Network Environments
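
The environment's two reward signals can be illustrated with a small sketch. The snippet below computes per-user SE under maximum-ratio transmission (MRT) in a single-cell massive MIMO downlink, and derives EE from a simple circuit-power model. All parameter values and the power model are illustrative assumptions, not the values used in this repository.

```python
import numpy as np

# Hypothetical single-cell massive MIMO downlink: M BS antennas, K users.
# SE per user under MRT precoding with an equal power split across users;
# EE = (sum SE * bandwidth) / total consumed power. Parameter values and
# the circuit-power model are illustrative assumptions.

rng = np.random.default_rng(0)
M, K = 64, 8                      # BS antennas, single-antenna users
P_tx = 1.0                        # total transmit power [W]
noise = 1e-3                      # noise power [W]
bandwidth = 20e6                  # system bandwidth [Hz]
P_fixed, P_per_ant = 10.0, 0.5    # assumed circuit power model [W]

# i.i.d. Rayleigh fading channel matrix (M x K)
H = (rng.standard_normal((M, K)) + 1j * rng.standard_normal((M, K))) / np.sqrt(2)

# MRT precoding: w_k proportional to h_k, normalized per user
W = H / np.linalg.norm(H, axis=0, keepdims=True)
G = np.abs(H.conj().T @ W) ** 2 * (P_tx / K)   # effective gains |h_k^H w_j|^2

sig = np.diag(G)                  # desired signal power per user
interf = G.sum(axis=1) - sig      # inter-user interference per user
sinr = sig / (interf + noise)

se = np.log2(1.0 + sinr)          # per-user SE [bit/s/Hz]
sum_se = se.sum()
ee = sum_se * bandwidth / (P_tx + P_fixed + M * P_per_ant)   # [bit/Joule]
print(f"sum SE = {sum_se:.2f} bit/s/Hz, EE = {ee:.2e} bit/J")
```

Note the inherent conflict: adding antennas or transmit power raises SE but also raises the power term in the EE denominator, which is what makes (SE, EE) a genuine multi-objective problem.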

Pareto Front Approximation Deep Q-Network (PQN)

  • Pareto Front Approximation: PQN approximates a Pareto front for multiple conflicting objectives, using reinforcement learning to balance SE and EE.

  • Replay Buffer for Multi-Objective Optimization: PQN leverages a replay buffer to store and sample experiences, maintaining diverse solutions across the SE and EE objectives. These experiences are replayed to enhance the network's ability to optimize both objectives.

  • Deep Q-Learning Approach: PQN employs a Deep Q-Network (DQN) to evaluate and optimize actions for each state, aiming to achieve a balance between the two objectives over time.

  • Adaptive Decision-Making in Dynamic Environments: PQN is designed for adaptability in environments where priorities may shift between objectives, such as massive MIMO networks, where balancing SE and EE is critical.
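
The first two components above can be sketched in a few lines. The snippet below shows a Pareto-dominance test, maintenance of a non-dominated set of observed (SE, EE) outcomes, and a replay buffer holding transitions with vector-valued rewards. The interfaces and names are illustrative assumptions, not taken from the repository's code.

```python
import random
from collections import deque

def dominates(a, b):
    """True if reward vector a Pareto-dominates b (>= everywhere, > somewhere)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def update_front(front, point):
    """Insert `point` into the non-dominated set, dropping any members it dominates."""
    if any(dominates(p, point) for p in front):
        return front                     # point is dominated; front unchanged
    return [p for p in front if not dominates(point, p)] + [point]

class ReplayBuffer:
    """FIFO buffer of (state, action, reward_vec, next_state, done) tuples."""
    def __init__(self, capacity=10_000):
        self.buf = deque(maxlen=capacity)

    def push(self, transition):
        self.buf.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buf, batch_size)

# Toy usage: accumulate hypothetical (SE, EE) outcomes and keep the Pareto set.
front = []
for r in [(2.0, 1.0), (1.0, 3.0), (1.5, 1.5), (0.5, 0.5), (2.0, 3.0)]:
    front = update_front(front, r)
print(front)   # [(2.0, 3.0)] -- the last point dominates all the others
```

Storing the full reward vector (rather than a pre-combined scalar) is what lets the buffer preserve diverse trade-off solutions for the front to approximate.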
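
One common way a DQN balances two objectives is to scalarize the reward vector with a preference weight before the TD update; whether this repository uses weighted-sum scalarization or another combination rule is an assumption here. The sketch below uses a tabular Q-function in place of the neural network to stay self-contained.

```python
import numpy as np

# Minimal scalarized TD(0) update, assuming the (SE, EE) reward vector is
# combined as w . r with a preference weight w. All names and values are
# illustrative; a neural Q-network would replace the table in practice.

rng = np.random.default_rng(1)
n_states, n_actions = 4, 3
gamma, alpha = 0.9, 0.1
w = np.array([0.5, 0.5])             # assumed preference over (SE, EE)

Q = np.zeros((n_states, n_actions))

def td_update(Q, s, a, r_vec, s_next, done):
    """One TD(0) step on the scalarized reward w . r_vec."""
    r = float(w @ np.asarray(r_vec))
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Toy transition: in state 0, action 1 yields SE=2.0, EE=1.0, leading to state 2.
Q = td_update(Q, s=0, a=1, r_vec=(2.0, 1.0), s_next=2, done=False)
print(Q[0, 1])   # 0.1 * (0.5*2.0 + 0.5*1.0) = 0.15
```

Sweeping the weight w across [0, 1] and collecting the resulting (SE, EE) outcomes is one standard way such a method traces out an approximate Pareto front.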

Experimental Results

  • Spectral Efficiency: This plot shows the SE reward across episodes. The upward trend indicates that PQN effectively improves SE over time as it learns from interactions with the environment.

  • Energy Efficiency: This plot shows the EE reward across episodes. As with SE, there is a clear upward trend, indicating that PQN consistently improves EE through training.

  • Trade-Off Performance: This scatter plot visualizes the trade-off between SE and EE achieved by PQN. The diagonal distribution suggests that the massive MIMO network balances the two objectives, with improvements in one objective often accompanying improvements in the other.
