🐟 DyCFish-Gym
An Intelligent Control Platform Bridging Reduced-Order Dynamics and Computational Fluid Dynamics for Propulsion
- 🏠 About
- 📁 Project Structure
- 📚 Getting Started
- 🚀 Usage
- 📦 Benchmark & Method
- 📝 TODO List
- 🔗 Citation
- 📄 License
- 👏 Acknowledgements
DyCFish-Gym is a unified intelligent control platform for bio-inspired thunniform propulsion. It bridges reduced-order dynamics (ROM) and high-fidelity Computational Fluid Dynamics (CFD) via a two-stage deep reinforcement learning (DRL) framework, achieving efficient policy learning while maintaining physical consistency.
Training a DRL agent entirely within full-order CFD solvers requires millions of environmental interactions, making it computationally prohibitive. DyCFish-Gym addresses this fundamental efficiency–fidelity trade-off through a hierarchical architecture:
- Stage 1 — ROM Pre-training: A reduced-order articulated dynamics model enables rapid policy pretraining. The ROM captures essential degrees of freedom (body + tail joint) with CPG-parameterized actuation, enabling >10³ simulation steps per second.
- Stage 2 — CFD Fine-tuning: The pretrained policy is transferred to a high-fidelity CFD environment (ANSYS Fluent) via the PyFluent interface for refinement under fully resolved Navier–Stokes equations. This stage accounts for only 10–15% of total training time.
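The CPG-parameterized actuation in Stage 1 reduces the action space to just amplitude and frequency. A minimal sketch of such an oscillator (the helper `cpg_tail_angle` is illustrative, not the platform's actual CPG implementation):

```python
import math

def cpg_tail_angle(t, amplitude, frequency, phase=0.0):
    """Sinusoidal CPG output: tail-joint angle (rad) at time t (s).

    A stand-in for the CPG parameterization described above, where the
    policy only outputs amplitude A and frequency f.
    """
    return amplitude * math.sin(2.0 * math.pi * frequency * t + phase)

# Example: A = 0.5 rad, f = 2 Hz, sampled over one beat period
A, f = 0.5, 2.0
angles = [cpg_tail_angle(k * 0.05, A, f) for k in range(11)]
print(f"peak tail angle ≈ {max(angles):.3f} rad")
```

Because the policy outputs only (A, f), the same two-dimensional action interface can be reused unchanged when the policy is transferred from the ROM to the CFD environment.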
The platform is systematically validated across three representative closed-loop tasks:
- 🎯 Trajectory tracking — straight-line, semicircular, and W-shaped paths
- 🏃 Rapid escape — predator evasion via C-start mechanism reconstruction
- ⚓ Station-keeping — maintaining position against uniform inflow disturbances
Key features include:
- 🧬 Biologically Interpretable Behaviors: The DRL agent autonomously learns bio-inspired mechanisms, such as exploiting reverse Kármán vortices to minimize cost of transport (COT).
- 📊 Outperforms Classical Controllers: 40% lower trajectory error, 30% smaller steady-state deviation, and 20% lower energy cost relative to PID and MPC baselines.
- 🤖 Sim-to-Real Transfer: Learned policies are successfully deployed on a physical dual-joint robotic fish for trajectory tracking experiments.
- Operating System: Windows or Linux (Ubuntu 20.04+ recommended)
- NVIDIA GPU (Optional, but recommended for PyTorch training)
- ANSYS Fluent (required for Stage 2; must be installed and configured for ansys-fluent-core)
- Conda
- Python 3.9
```shell
# Create and activate conda environment
conda create -n dycfish python=3.9.13
conda activate dycfish

# Upgrade pip
pip install --upgrade pip

# Install core libraries (deep learning and RL)
pip install numpy==2.0.2
pip install torch==2.1.0
pip install "stable-baselines3[extra]"

# Install visualization and utility packages
pip install pygame opencv-python matplotlib

# Install PyFluent package (required for CFD stage)
pip install ansys-fluent-core

# Adjust pandas version (to prevent conflicts)
pip uninstall pandas -y
pip install pandas==2.2.2
```

Clone the repository:

```shell
git clone https://github.com/YOUR_USERNAME/DyCFish-Gym.git
cd DyCFish-Gym
```

The ansys-fluent-core package enables seamless control of ANSYS Fluent from Python.
- For detailed documentation and guides, refer to the official repository: PyFluent.
Controlling Fluent via Python:
| Operation | Command | Description |
|---|---|---|
| Import | `import ansys.fluent.core as pyfluent` | Import the core library |
| Launch (no GUI) | `session = pyfluent.launch_fluent()` | Start Fluent without GUI |
| Launch (with GUI) | `session = pyfluent.launch_fluent(show_gui=True)` | Start Fluent with GUI (meshing mode only) |
| Exit | `session.exit()` | Close the Fluent session |
PyFluent Journaling (Code Generation):
1. Launch Fluent: `session = pyfluent.launch_fluent()`
2. Start recording: `(api-start-python-journal "python_journal.py")`
3. Perform TUI commands → Python code is generated in `python_journal.py`
4. Stop recording: `(api-stop-python-journal)`
The first stage uses the reduced-order dynamics environment for rapid policy learning.
Train the agent:

```shell
cd dynamic_stage
python train.py
```

This will:

- Launch the `FishEnv` environment with real-time Pygame rendering
- Train a PPO agent for 1,000,000 timesteps
- Save model checkpoints every 10,000 steps to `./models/`
- Log episode rewards to `./logs/episode_rewards.txt`
- Support TensorBoard monitoring via `./tensorboard/`

Monitor training:

```shell
tensorboard --logdir ./tensorboard/
```

Evaluate a trained model:

```shell
cd dynamic_stage
python test.py
```

This loads a trained model, runs 5 evaluation episodes, and saves the results as a video to `./logs/test_escape.mp4`.
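The evaluation loop that `test.py` performs can be sketched as follows, with a stub environment and a dummy policy standing in for the real `FishEnv` and trained PPO model (all names here are illustrative):

```python
def evaluate(env, policy, n_episodes=5, max_steps=200):
    """Run n_episodes rollouts and return the total reward of each."""
    returns = []
    for _ in range(n_episodes):
        obs, done, total, steps = env.reset(), False, 0.0, 0
        while not done and steps < max_steps:
            action = policy(obs)              # e.g. model.predict(obs) in SB3
            obs, reward, done = env.step(action)
            total += reward
            steps += 1
        returns.append(total)
    return returns

# Stub environment: reward 1.0 per step, episode ends after 10 steps.
class StubEnv:
    def reset(self):
        self.t = 0
        return [0.0] * 7                      # 7D observation, as in FishEnv
    def step(self, action):
        self.t += 1
        return [0.0] * 7, 1.0, self.t >= 10

returns = evaluate(StubEnv(), policy=lambda obs: [0.0, 0.0])
print(returns)  # → [10.0, 10.0, 10.0, 10.0, 10.0]
```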
The second stage fine-tunes the pretrained policy in a high-fidelity ANSYS Fluent CFD environment.
⚠️ Note: ANSYS Fluent must be installed and properly configured before running this stage.
Run CFD training:

```shell
cd CFD_stage
python training.py
```

This will:

- Launch ANSYS Fluent (double-precision 2D solver, 6 processors)
- Train with multi-worker support and automatic checkpoint resume
- Save models with local/global best tracking to `./saved_models/`
- Log detailed performance metrics per episode
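The automatic checkpoint resume mentioned above follows a common save/load pattern, sketched here with illustrative paths and fields (not the repository's actual checkpoint format):

```python
import json, os, tempfile

def save_checkpoint(path, step, best_reward):
    """Persist training progress so an interrupted run can resume."""
    with open(path, "w") as f:
        json.dump({"step": step, "best_reward": best_reward}, f)

def load_checkpoint(path):
    """Resume from a checkpoint if one exists, else start fresh."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "best_reward": float("-inf")}

ckpt_path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
state = load_checkpoint(ckpt_path)            # no file yet: fresh start
save_checkpoint(ckpt_path, step=5000, best_reward=12.3)
state = load_checkpoint(ckpt_path)            # resumed
print(state["step"])  # → 5000
```

This matters in the CFD stage in particular, since each Fluent step is roughly twenty times slower than a ROM step, so losing progress to a crash is expensive.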
DyCFish-Gym is organized into four tightly coupled functional modules:
| Module | Description |
|---|---|
| CPG Parameterization | Encodes biological kinematics into a low-dimensional actuation space (amplitude A, frequency f) |
| Model Construction | ROM for rapid pretraining + CFD for high-fidelity refinement |
| DRL Control | PPO-based policy learning with stable cross-model transfer |
| Strategy Biomimicry | Behavioral evaluation: decodes optimized swimming strategies for mechanistic insights |
| Parameter | Dynamic Stage (FishEnv) | CFD Stage (FluentEnv) |
|---|---|---|
| Observation | 7D: [x, y, ψ, vx, vy, ωz, t] | 7D: [x, y, θ, vx, vy, ωz, t] |
| Action | 2D: [amplitude, frequency] | 2D: [frequency, amplitude] |
| Physics | Reduced-order articulated dynamics | Unsteady Navier–Stokes (Fluent) |
| Speed | >10³ steps/s | ~50 steps/s |
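Both stages share the same 7-dimensional observation layout, which is what makes direct policy transfer possible; note, though, that the table lists opposite action orderings ([amplitude, frequency] vs [frequency, amplitude]), so a transfer wrapper would presumably permute the action vector. A sketch of the shared observation (an illustrative dataclass, not the repository's actual code):

```python
from dataclasses import dataclass

@dataclass
class Observation:
    """7D state shared by both stages: planar pose, velocities, time."""
    x: float
    y: float
    heading: float        # ψ in FishEnv, θ in FluentEnv
    vx: float
    vy: float
    omega_z: float
    t: float

    def as_vector(self):
        return [self.x, self.y, self.heading,
                self.vx, self.vy, self.omega_z, self.t]

obs = Observation(0.1, -0.2, 0.05, 0.3, 0.0, 0.01, 1.5)
print(len(obs.as_vector()))  # → 7
```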
| Method | Trajectory Tracking RMSE | Station-keeping Deviation | Escape Success | Energy Cost |
|---|---|---|---|---|
| PID | 1.00 | 1.00 | 0.82 | 1.00 |
| MPC | 0.78 | 0.83 | 0.87 | 0.92 |
| DyCFish-Gym | 0.60 | 0.68 | 0.96 | 0.78 |
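The PID row reads 1.00 in the RMSE, deviation, and energy columns, which suggests those metrics are normalized to the PID baseline (escape success appears to be a raw rate). Under that assumption, the relative improvements quoted earlier can be recovered from the table:

```python
def improvement_vs_baseline(value, baseline=1.00):
    """Relative reduction (%) of a normalized metric vs. the baseline."""
    return 100.0 * (baseline - value) / baseline

# DyCFish-Gym vs PID, per the table above
print(round(improvement_vs_baseline(0.60), 1))  # → 40.0  (trajectory RMSE)
print(round(improvement_vs_baseline(0.78), 1))  # → 22.0  (energy cost)
```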
- [x] Release CFD_stage training code.
- [x] Release dynamic_stage code.
- [ ] Release the demo videos.
- [ ] Release the paper.
This work was supported by the National Key R&D Program of China (Grant No. 2024YFC3013200).
- Stable Baselines3 — PPO implementation
- Gymnasium — RL environment interface
- ANSYS PyFluent — Python interface for ANSYS Fluent
- PyTorch — Deep learning framework
- Pygame — Real-time visualization
