Networks-on-chip (NoCs) serve as the standard interconnect fabric for connecting cores, caches, and memory modules on a chip. However, they consume about 10% to 36% of a chip's total power. The paper first proposes combining power gating (PG), which disconnects a router from the power supply when the router is idle, with dynamic voltage and frequency scaling (DVFS). PG reduces static power and DVFS reduces dynamic power. Next, the paper proposes an RL-based control policy for the combined design. Finally, the paper proposes using an artificial neural network (ANN) to reduce the hardware cost of reinforcement learning.
The proposed architecture involves a private L1 cache and a shared L2 cache attached to the router. An RL agent monitors PG performance, selects the best voltage/frequency (V/F) level, and updates a state-action table. When consecutive idle cycles are detected, power to the router is cut off to save static power. More specifically, the RL agent takes in a vector of system attributes that describes cache activity at the local core, the number of flits received at each port, and the PG efficiency of the router, and it attempts to maximize power savings. The reward function is defined in terms of the static power and the dynamic power, and a penalty is applied when the average read cache miss latency surpasses a threshold. The paper selects the RL model parameters that provide the optimal power savings. Because the RL implementation requires significant overhead, the paper also proposes an ANN approach. The ANN has three layers, uses sigmoid and ReLU activation functions for the hidden and output layers, respectively, and calculates the entire state-action table. The inputs are normalized, and the paper specifies the batch size and learning rate used for training.
The experiments use the gem5 simulator enhanced with GARNET, DSENT, and Synopsys design tools, with PARSEC as the benchmark suite. The paper reports that the proposed design achieves an average of 30% dynamic power savings over the baselines and improves on the PID design by 13%.
The proposed design reduces average static power consumption by 16%. Overall, it achieves an average power reduction of 26% and improves on the PID design's overall power consumption by 17%. The ANN reduces area overhead by 67% compared to the RL implementation but still requires a 3.7% area increase over the baseline. When testing 5, 10, 15, and 20 neurons, the paper reports 74%, 82%, 83%, and 97% accuracy, respectively.
One shortcoming is the small set of actions available to the RL agent for optimizing power savings. In the paper, the only actions the RL model can take are three V/F levels: 2 GHz/1 V, 1.5 GHz/0.8 V, and 1 GHz/0.6 V. Finally, the approach seems to work for simpler chips, but I am not sure how an RL- or ANN-based approach would fare in a more complex system, especially with more variables to consider and the additional overhead cost.
One possible unexplored challenge is generalizing the RL and ANN approaches to a more complex system. Another is using energy-saving techniques other than PG and DVFS. I briefly looked through one of the related works, namely the paper on the Power Punch approach.
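As a reading aid for the RL control policy summarized above, here is a minimal Python sketch of a table-based controller that picks among the three V/F levels. It is not the paper's implementation. It uses one-step Q-learning as a stand-in for the paper's update rule, and the reward shape (negative total power with a fixed latency penalty), the hyperparameters `ALPHA`, `GAMMA`, and `EPSILON`, the state labels, and all function names are assumptions made for illustration.

```python
import random

# The three V/F actions listed in the review: (frequency in GHz, voltage in V).
ACTIONS = [(2.0, 1.0), (1.5, 0.8), (1.0, 0.6)]

# Hypothetical hyperparameters; these are NOT the values used in the paper.
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1


def reward(static_power, dynamic_power, miss_latency, latency_threshold, penalty=1.0):
    """Assumed reward shape: lower total power is better, with a fixed penalty
    when the average read-miss latency exceeds the threshold."""
    r = -(static_power + dynamic_power)
    if miss_latency > latency_threshold:
        r -= penalty
    return r


def choose_action(q_table, state, rng):
    """Epsilon-greedy selection over the three V/F levels."""
    if rng.random() < EPSILON:
        return rng.randrange(len(ACTIONS))
    values = q_table.setdefault(state, [0.0] * len(ACTIONS))
    return max(range(len(ACTIONS)), key=values.__getitem__)


def update(q_table, state, action, r, next_state):
    """One-step Q-learning update of the state-action table."""
    values = q_table.setdefault(state, [0.0] * len(ACTIONS))
    next_values = q_table.setdefault(next_state, [0.0] * len(ACTIONS))
    target = r + GAMMA * max(next_values)
    values[action] += ALPHA * (target - values[action])
```

For example, starting from an empty table, `update(q, "s0", 2, -1.0, "s1")` moves the entry for the 1 GHz/0.6 V action in the hypothetical state `"s0"` a fraction `ALPHA` of the way toward the observed target. With only three actions per state the table stays small, which relates to the limited action set noted in the critique above.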