This repo is the code implementation of the paper "Multi-agent Inductive Policy Optimization" (MAIPO). The paper proposes a novel multi-agent reinforcement learning algorithm, and this repo contains all details of MAIPO.
- Python 3 (including NumPy, TensorFlow 1.0, etc.)
- [PettingZoo](https://github.com/Farama-Foundation/PettingZoo): Gym for multi-agent reinforcement learning
- [FAST.Farm](https://github.com/OpenFAST/openfast): main repository for the NREL-supported OpenFAST whole-turbine and FAST.Farm wind farm simulation codes
bash ./run.sh > run.log 2>&1 &
Please use the following command to see other input parameters of the train.py file.
python3 train.py --help
This section first introduces how we control a wind farm and the wind farm simulator we use. A traditional wind farm control method, MPPT, is then introduced as a baseline. Afterward, we evaluate the proposed MAIPO algorithm on this simulator and compare it with HAPPO, MAPPO, MPPT, and other methods. Finally, we show how the control policies trained by MARL overcome the wake effect in wind farms and boost their power generation. These results will be added to the appendix of the corresponding paper.
A wind turbine control system consists of sensors, actuators, and a system that ties these elements together. A hardware or software system processes input signals from the sensors and generates output signals for the actuators. The main goal of the controller is to modify the operating states of the turbine to maintain safe turbine operation, maximize power, mitigate damaging fatigue loads, and detect fault conditions. In this paper, our goal is to collectively control different substructures of the wind turbines, through blade pitch control, nacelle yaw control, and generator torque control, to maximize the power generated by the entire wind farm. The blade pitch and nacelle yaw are shown in the following figure:
Roughly, the output power of a wind turbine can be determined by its yaw angle, pitch angle and generator torque:
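The paper's exact expression is not reproduced in this text. As a rough illustration only, the sketch below uses a common textbook approximation, which is an assumption and not the paper's formula: P = 0.5 · ρ · A · C_p · v³ · cos^p(γ). Here the power coefficient C_p depends on the blade pitch and on the rotor speed (and hence on the generator torque), and γ is the yaw misalignment. The function name `turbine_power`, the default air density, and the placeholder yaw exponent are illustrative choices.

```python
import math


def turbine_power(wind_speed, rotor_diameter, power_coeff, yaw_deg=0.0,
                  air_density=1.225, yaw_exponent=2.0):
    """Approximate aerodynamic power of one turbine, in watts.

    power_coeff stands in for C_p, which in practice is a function of
    blade pitch and tip-speed ratio (set indirectly by generator torque).
    yaw_exponent is an empirical parameter; 2.0 is only a placeholder.
    """
    swept_area = math.pi * (rotor_diameter / 2.0) ** 2
    yaw_loss = math.cos(math.radians(yaw_deg)) ** yaw_exponent
    return 0.5 * air_density * swept_area * power_coeff * wind_speed ** 3 * yaw_loss


# Example values: 126 m rotor, 8 m/s wind, assumed C_p of 0.48.
aligned = turbine_power(8.0, 126.0, 0.48)
yawed = turbine_power(8.0, 126.0, 0.48, yaw_deg=20.0)
print(f"aligned: {aligned / 1e6:.2f} MW, yawed by 20 deg: {yawed / 1e6:.2f} MW")
```

In this simplified view, yawing a turbine always costs that turbine some power; any benefit appears only at the farm level, through the wake it deflects away from downstream turbines.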
The wind farm simulator used in this paper is FAST.Farm, which serves as the real-time digital counterpart of a physical wind farm. The simulator includes models of both the aerodynamics of the wind farm and the elastic-servo dynamics of the wind turbines. Unlike traditional control methods, which use a wind farm model to design the control policy, MARL teaches each agent (turbine) to learn its control policy by interacting with the simulator. Specifically, at time
The optimal control policy for an isolated wind turbine is maximum power point tracking (MPPT). When the wind speed is below rated, the objective is to control the generator torque to maximize the power output. When the wind speed is sufficient to drive full-power operation of the wind turbine, the goal becomes to maintain the output at the rated level and alleviate the structural load via the joint control of blade pitch, yaw angle, and generator torque. In wind farms, turbines are normally installed in arrays, so the actions of upstream turbines affect the environmental state of their downstream counterparts through the wake effect. Although MPPT achieves optimal solutions for upstream turbines, the power outputs of turbines within the wake planes of upstream turbines are greatly reduced, which lowers the power generation of the entire wind farm. Designing a wind farm control policy that overcomes the wake effect therefore remains an open problem. This paper proposes MAIPO to address it.
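To make the wake argument concrete, here is a minimal sketch of two aligned turbines. It assumes the textbook actuator-disk power coefficient and the Jensen (Park) wake model rather than FAST.Farm, and it is not part of MAIPO. The function names, the wake decay constant of 0.075, and the spacing of 10 rotor radii are illustrative assumptions. Per-turbine greedy operation is represented by the Betz-optimal axial induction factor of 1/3.

```python
def power_coefficient(a):
    """Actuator-disk power coefficient for axial induction factor a."""
    return 4.0 * a * (1.0 - a) ** 2


def waked_speed_ratio(a_upstream, spacing_in_radii, wake_decay=0.075):
    """Jensen model: downstream wind speed divided by free-stream speed."""
    deficit = 2.0 * a_upstream / (1.0 + wake_decay * spacing_in_radii) ** 2
    return 1.0 - deficit


def farm_power(a_upstream, a_downstream=1.0 / 3.0, spacing_in_radii=10.0):
    """Normalized (upstream, downstream, total) power of two aligned turbines."""
    ratio = waked_speed_ratio(a_upstream, spacing_in_radii)
    upstream = power_coefficient(a_upstream)
    # Power scales with the cube of the wind speed reaching the rotor.
    downstream = power_coefficient(a_downstream) * ratio ** 3
    return upstream, downstream, upstream + downstream


greedy = farm_power(1.0 / 3.0)   # each turbine maximizes its own power
derated = farm_power(0.25)       # upstream turbine deliberately extracts less
print("greedy  (up, down, total):", [round(p, 4) for p in greedy])
print("derated (up, down, total):", [round(p, 4) for p in derated])
```

Under these assumptions the derated upstream turbine produces less power on its own, but the total over both turbines is higher. This is the kind of cooperative trade-off that a farm-level policy has to discover.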
| | Experiment 1 | Experiment 2 | Experiment 3 | Experiment 4 |
|---|---|---|---|---|
| Turbine Type | NREL 5MW | SOWFA | Palm | CLwind |
| Number of turbines | 9 | 9 | 6 | 2 |
| Power Scale | 0.99 | 0.99 | 0.99 | 0.99 |
| Rotor length | 126 | 96 | 81 | 60 |
Experiment 1
Experiment 2
Experiment 3
Experiment 4
We reproduced HAPPO and compared it with MAIPO. The following results show that MAIPO performs better than HAPPO.












