Merged
75 changes: 37 additions & 38 deletions .cursor/rules/mfc-agent-rules.mdc
@@ -1,7 +1,7 @@
----
-description: Full MFC project rules – consolidated for Agent Mode
-alwaysApply: true
----
---
description: Full MFC project rules – consolidated for Agent Mode
alwaysApply: true
---

# 0 Purpose & Scope
Consolidated guidance for the MFC exascale, many-physics solver.
@@ -19,7 +19,7 @@ Written primarily for Fortran/Fypp; the GPU and style sections matter only when
- Assume free-form Fortran 2008+, `implicit none`, explicit `intent`, and modern intrinsics.
- Prefer `module … contains … subroutine foo()`; avoid `COMMON` blocks and file-level `include` files.
- **Read the full codebase and docs *before* changing code.**
- Docs: <https://mflowcode.github.io/documentation/md_readme.html> and the respository root `README.md`.
- Docs: <https://mflowcode.github.io/documentation/md_readme.html> and the repository root `README.md`.

### Incremental-change workflow

@@ -62,27 +62,7 @@ Written primarily for Fortran/Fypp; the GPU and style sections matter only when

---

# 3 FYPP Macros for GPU acceleration Pogramming Guidelines (for GPU kernels)

Do not directly use OpenACC or OpenMP directives directly.
Instead, use the FYPP macros contained in src/common/include/parallel_macros.fpp

Wrap tight loops with

```fortran
$:GPU_PARALLEL_FOR(private='[...]', copy='[...]')
```
* Add `collapse=n` to merge nested loops when safe.
* Declare loop-local variables with `private='[...]'`.
* Allocate large arrays with `managed` or move them into a persistent
`$:GPU_ENTER_DATA(...)` region at start-up.
* **Do not** place `stop` / `error stop` inside device code.
* Must compile with Cray `ftn` and NVIDIA `nvfortran` for GPU offloading; also build CPU-only with
GNU `gfortran` and Intel `ifx`/`ifort`.

---

# 4 File & Module Structure
# 3 File & Module Structure

- **File Naming**:
- `.fpp` files: Fypp preprocessed files that get translated to `.f90`
@@ -99,25 +79,44 @@ $:GPU_PARALLEL_FOR(private='[...]', copy='[...]')
- `contains` section
- Implementation of subroutines and functions
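
As a minimal sketch of the module layout described above, combined with the `implicit none` and explicit `intent` rules: the names `m_example`, `s_scale_array`, and `p_demo` are hypothetical placeholders, not identifiers from the MFC codebase, and a real MFC source file would be an `.fpp` file that may also contain Fypp directives.

```fortran
! Hypothetical module illustrating the layout: declarations first,
! then a `contains` section holding the procedure implementations.
module m_example

    implicit none

    private
    public :: s_scale_array

contains

    ! Scale every element of `a` in place by `factor`.
    subroutine s_scale_array(a, factor)

        real, dimension(:), intent(inout) :: a
        real, intent(in) :: factor

        a = factor*a

    end subroutine s_scale_array

end module m_example

! Small driver so the sketch compiles and runs on its own.
program p_demo

    use m_example, only: s_scale_array

    implicit none

    real :: v(3)

    v = [1.0, 2.0, 3.0]
    call s_scale_array(v, 2.0)
    print *, v

end program p_demo
```

Keeping the module `private` by default and exporting names explicitly with `public` is a common Fortran practice; whether a given MFC module follows it should be checked against the existing code.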

# 5 Fypp Macros and GPU Acceleration
---

# 4 Fypp Macros

## Use of Fypp
- **Fypp Directives**:
- Start with `#:` (e.g., `#:include`, `#:def`, `#:enddef`)
- Macros defined in `include/*.fpp` files
- Used for code generation, conditional compilation, and GPU offloading

## Some examples
---

Documentation on how to use the Fypp macros for GPU offloading is available at https://mflowcode.github.io/documentation/md_gpuParallelization.html
# 5 FYPP Macros for GPU Acceleration Programming Guidelines (for GPU kernels)

Some examples include:
- `$:GPU_ROUTINE(parallelism='[seq]')` - Marks GPU-callable routines
- `$:GPU_PARALLEL_LOOP(collapse=N)` - Parallelizes loops
- `$:GPU_LOOP(parallelism='[seq]')` - Marks sequential loops
- `$:GPU_UPDATE(device='[var1,var2]')` - Updates device data
- `$:GPU_ENTER_DATA(copyin='[var]')` - Copies data to device
- `$:GPU_EXIT_DATA(delete='[var]')` - Removes data from device
- Do not use OpenACC or OpenMP directives directly.
- Instead, use the FYPP macros contained in `src/common/include/parallel_macros.fpp`
- Documentation on how to use the Fypp macros for GPU offloading is available at https://mflowcode.github.io/documentation/md_gpuParallelization.html

Wrap tight loops with
```fortran
$:GPU_PARALLEL_FOR(private='[...]', copy='[...]')
```
* Add `collapse=n` to merge nested loops when safe.
* Declare loop-local variables with `private='[...]'`.
* Allocate large arrays with `managed` or move them into a persistent
`$:GPU_ENTER_DATA(...)` region at start-up.
* **Do not** place `stop` / `error stop` inside device code.
* Must compile with Cray `ftn` or NVIDIA `nvfortran` for GPU offloading; also build CPU-only with
GNU `gfortran` and Intel `ifx`/`ifort`.

- Example GPU macros include the below, among others:
- `$:GPU_ROUTINE(parallelism='[seq]')` - Marks GPU-callable routines
- `$:GPU_PARALLEL_LOOP(collapse=N)` - Parallelizes loops
- `$:GPU_LOOP(parallelism='[seq]')` - Marks sequential loops
- `$:GPU_UPDATE(device='[var1,var2]')` - Updates device data
- `$:GPU_ENTER_DATA(copyin='[var]')` - Copies data to device
- `$:GPU_EXIT_DATA(delete='[var]')` - Removes data from device
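
A hedged sketch of how a collapsed triple loop might be wrapped with one of the macros listed above. It is a fragment, not a complete program: it only expands when preprocessed by Fypp with MFC's `parallel_macros.fpp` included. The arrays `q` and `rhs`, the bounds `m`, `n`, `p`, and the scalar `tmp` are hypothetical, and it assumes `GPU_PARALLEL_LOOP` accepts `private` in the same bracketed-string form shown for the other arguments; check the linked GPU parallelization documentation for the exact argument list.

```fortran
! Sketch only: requires the Fypp macros from
! src/common/include/parallel_macros.fpp.
! q, rhs, m, n, p, and tmp are hypothetical names.
$:GPU_PARALLEL_LOOP(collapse=3, private='[tmp]')
do l = 0, p
    do k = 0, n
        do j = 0, m
            ! tmp is loop-local, so it is listed in private='[...]'
            tmp = q(j, k, l)*q(j, k, l)
            rhs(j, k, l) = rhs(j, k, l) + tmp
        end do
    end do
end do
```

The three loops are perfectly nested with no statements between the `do` lines, which is the situation where `collapse` is generally safe to apply.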

---

# 6 Documentation Style

@@ -136,7 +135,7 @@ which conforms to the Doxygen Fortran format.
- Example: `@:ASSERT(predicate, message)`

- **Error Reporting**:
- Use `s_mpi_abort(<msg>)` for error termination, not `stop`
- Use `s_mpi_abort(error_message)` for error termination, not `stop`
- No `stop` / `error stop` inside device code
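
A short, hedged sketch of the error-reporting rule in host (non-device) code. `@:ASSERT` and `s_mpi_abort` are supplied by MFC and are not defined here; the variables `n_cells` and `n_cells_max` and both messages are hypothetical.

```fortran
! Sketch only: host-side checks, never inside a GPU kernel.
! n_cells and n_cells_max are hypothetical variables.
@:ASSERT(n_cells > 0, 'n_cells must be positive')

if (n_cells > n_cells_max) then
    ! Terminate through s_mpi_abort rather than `stop`
    call s_mpi_abort('n_cells exceeds n_cells_max')
end if
```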

# 8 Memory Management