Skip to content

fix: preserve Go scope stacks across OS threads#100

Merged
rapids-bot[bot] merged 1 commit into
NVIDIA:mainfrom
willkill07:fix/go-windows-scope-stack
May 13, 2026
Merged

fix: preserve Go scope stacks across OS threads#100
rapids-bot[bot] merged 1 commit into
NVIDIA:mainfrom
willkill07:fix/go-windows-scope-stack

Conversation

@willkill07
Copy link
Copy Markdown
Member

@willkill07 willkill07 commented May 13, 2026

Overview

Fix intermittent Windows Go test failures caused by Go goroutines moving between OS threads while NeMo Flow scope state is stored in Rust thread-local fallback storage.

  • I confirm this contribution is my own work, or I have the right to submit it under this project's license.
  • I searched existing issues and open pull requests, and this does not duplicate existing work.

Details

  • Add Rust runtime helpers to capture and restore the current thread-local scope stack binding.
  • Expose the capture/restore helpers through the FFI API and generated header.
  • Update Go ScopeStack.Run to restore the previous Rust thread-local binding before releasing the locked OS thread.
  • Run the flaky ATOF Go lifecycle test inside an explicit ScopeStack.Run so PushScope, EmitEvent, and PopScope stay on one OS thread.
  • Extend Go coverage to assert ScopeStackActive() is restored after Run.

Where should the reviewer start?

Start with go/nemo_flow/nemo_flow.go for the Go runtime behavior, then review crates/core/src/api/runtime/scope_stack.rs and crates/ffi/src/api/scope_stack.rs for the capture/restore bridge. The regression coverage is in go/nemo_flow/atof_test.go and go/nemo_flow/context_test.go.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • Relates to: none

Summary by CodeRabbit

  • New Features

    • Added scope-stack binding capture and restoration functionality for managing thread-local state preservation.
    • Extended FFI support for thread scope stack operations.
  • Tests

    • Updated scope-stack lifecycle tests to use the new binding management approach.

Review Change Stack

Signed-off-by: Will Killian <wkillian@nvidia.com>
@willkill07 willkill07 requested a review from a team as a code owner May 13, 2026 18:37
@github-actions github-actions Bot added size:M PR is medium Bug issue describes bug; PR fixes bug lang:go PR changes/introduces Go code lang:rust PR changes/introduces Rust code labels May 13, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 13, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 8f2f40a8-6a32-4630-9460-b56de8de1bda

📥 Commits

Reviewing files that changed from the base of the PR and between 4107db0 and c4c8ae8.

📒 Files selected for processing (9)
  • crates/core/src/api/runtime.rs
  • crates/core/src/api/runtime/scope_stack.rs
  • crates/ffi/nemo_flow.h
  • crates/ffi/src/api/mod.rs
  • crates/ffi/src/api/scope_stack.rs
  • crates/ffi/src/types/mod.rs
  • go/nemo_flow/atof_test.go
  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
📜 Recent review details
🧰 Additional context used
📓 Path-based instructions (28)
crates/ffi/**

📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)

Rebuild the FFI crate in release mode so the shared library and header stay in sync

Files:

  • crates/ffi/src/types/mod.rs
  • crates/ffi/nemo_flow.h
  • crates/ffi/src/api/mod.rs
  • crates/ffi/src/api/scope_stack.rs
**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)

**/*.rs: Run cargo fmt --all for FFI work as it is Rust work
Run just test-rust for FFI validation
Run cargo clippy --workspace --all-targets -- -D warnings to enforce warnings-as-errors linting

**/*.rs: Run cargo fmt --all for Rust code formatting
Run cargo clippy --workspace --all-targets -- -D warnings to enforce Rust linting with no warnings
Run just test-rust as the shared-runtime build/test wrapper for Rust changes

Use Rust snake_case naming convention for Rust code

**/*.rs: Any Rust change must run just test-rust
Any Rust change must run cargo fmt --all
Any Rust change must run cargo clippy --workspace --all-targets -- -D warnings

**/*.rs: Run cargo fmt --all for Rust code formatting when Node changes touch Rust files
Run cargo clippy --workspace --all-targets -- -D warnings to enforce strict linting when Rust files changed as part of Node work

**/*.rs: Always run just test-rust when any Rust code changes
Always run cargo fmt --all when any Rust code changes
Always run cargo clippy --workspace --all-targets -- -D warnings when any Rust code changes

Files:

  • crates/ffi/src/types/mod.rs
  • crates/core/src/api/runtime/scope_stack.rs
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/runtime.rs
  • crates/ffi/src/api/scope_stack.rs
{crates/ffi/nemo_flow.h,crates/ffi/src/**/*.rs}

📄 CodeRabbit inference engine (.agents/skills/test-ffi-surface/SKILL.md)

Check the generated header diff when any exported symbol or type changed in the FFI surface

Files:

  • crates/ffi/src/types/mod.rs
  • crates/ffi/nemo_flow.h
  • crates/ffi/src/api/mod.rs
  • crates/ffi/src/api/scope_stack.rs
**/*.{rs,go,js,ts,tsx}

📄 CodeRabbit inference engine (CONTRIBUTING.md)

Include SPDX license headers in all Rust, Go, JavaScript, and TypeScript source files using C-style comment syntax

Files:

  • crates/ffi/src/types/mod.rs
  • crates/core/src/api/runtime/scope_stack.rs
  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/runtime.rs
  • go/nemo_flow/atof_test.go
  • crates/ffi/src/api/scope_stack.rs
**/*.{rs,py,go,js,ts,tsx}

📄 CodeRabbit inference engine (CONTRIBUTING.md)

Use SONAR_IGNORE_START / SONAR_IGNORE_END markers only for documented false positives that cannot be resolved in code; keep ignored blocks small, add explanatory comments, and require reviewer sign-off

Files:

  • crates/ffi/src/types/mod.rs
  • crates/core/src/api/runtime/scope_stack.rs
  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/runtime.rs
  • go/nemo_flow/atof_test.go
  • crates/ffi/src/api/scope_stack.rs
**/*.{js,ts,tsx,jsx,py,rs,go,java,c,cpp,h,cc,cxx,cs,rb,php,swift,kt}

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changed files must be formatted with the language-native formatter

Files:

  • crates/ffi/src/types/mod.rs
  • crates/ffi/nemo_flow.h
  • crates/core/src/api/runtime/scope_stack.rs
  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/runtime.rs
  • go/nemo_flow/atof_test.go
  • crates/ffi/src/api/scope_stack.rs
**/*.{py,js,ts,tsx,go,rs,md}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

Format changed files with the language-native formatter before the final lint/test pass

Files:

  • crates/ffi/src/types/mod.rs
  • crates/core/src/api/runtime/scope_stack.rs
  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/runtime.rs
  • go/nemo_flow/atof_test.go
  • crates/ffi/src/api/scope_stack.rs
**/*.{rs,py,js,ts,tsx,go}

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

During iteration, prefer uv run pre-commit run --files <changed files...> for targeted validation

Files:

  • crates/ffi/src/types/mod.rs
  • crates/core/src/api/runtime/scope_stack.rs
  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • crates/ffi/src/api/mod.rs
  • crates/core/src/api/runtime.rs
  • go/nemo_flow/atof_test.go
  • crates/ffi/src/api/scope_stack.rs
crates/{python,ffi,node,wasm}/**/*

⚙️ CodeRabbit configuration file

crates/{python,ffi,node,wasm}/**/*: Treat binding changes as public API changes. Check for parity with the other language bindings, FFI ownership/lifetime safety,
callback error propagation, stable type conversion, and consistent async/stream semantics.
Flag changes that update one binding without corresponding tests or documentation for the same surface elsewhere.

Files:

  • crates/ffi/src/types/mod.rs
  • crates/ffi/nemo_flow.h
  • crates/ffi/src/api/mod.rs
  • crates/ffi/src/api/scope_stack.rs
**/*.h

📄 CodeRabbit inference engine (CONTRIBUTING.md)

Prefix C FFI exports with nemo_flow_

Files:

  • crates/ffi/nemo_flow.h
crates/ffi/nemo_flow.h

📄 CodeRabbit inference engine (CONTRIBUTING.md)

Synchronize FFI header crates/ffi/nemo_flow.h through Cargo/build.rs; pre-commit hooks enforce this

Use nemo_flow_ prefix naming convention for C FFI functions

Files:

  • crates/ffi/nemo_flow.h
crates/core/src/api/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

crates/core/src/api/**/*.rs: Implement public API changes first in crates/core/src/api/ and related core modules such as crates/core/src/api/runtime/, crates/core/src/codec/, or crates/core/src/json.rs
Use snake_case naming convention for Rust core API implementations

Files:

  • crates/core/src/api/runtime/scope_stack.rs
  • crates/core/src/api/runtime.rs
crates/{core,adaptive}/**

📄 CodeRabbit inference engine (.agents/skills/prepare-pr/SKILL.md)

Changes to crates/core or crates/adaptive must run the full language matrix

Files:

  • crates/core/src/api/runtime/scope_stack.rs
  • crates/core/src/api/runtime.rs
{crates/core,crates/adaptive}/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/test-wasm-binding/SKILL.md)

If the change touched shared runtime semantics in crates/core or crates/adaptive, also use validate-change

Files:

  • crates/core/src/api/runtime/scope_stack.rs
  • crates/core/src/api/runtime.rs
crates/core/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

When crates/core changes, run the full validation matrix across Rust, Python, Go, Node.js, and WebAssembly

crates/core/**/*.rs: Use Json = serde_json::Value in Rust-facing runtime APIs where the existing code expects JSON payloads.
Use Result<T> with FlowError in core runtime paths. Keep errors explicit and binding-appropriate at the wrapper layer.

Files:

  • crates/core/src/api/runtime/scope_stack.rs
  • crates/core/src/api/runtime.rs
crates/{core,adaptive}/**/*.rs

⚙️ CodeRabbit configuration file

crates/{core,adaptive}/**/*.rs: Review the Rust runtime for async correctness, scope isolation, middleware ordering, and event lifecycle regressions.
Pay close attention to task-local/thread-local scope propagation, callback lifetimes, stream finalization, and root_uuid isolation.
Public API changes should preserve existing behavior unless tests and docs show the intended migration path.

Files:

  • crates/core/src/api/runtime/scope_stack.rs
  • crates/core/src/api/runtime.rs
go/nemo_flow/**/*.go

📄 CodeRabbit inference engine (.agents/skills/test-go-binding/SKILL.md)

Format changed Go packages with cd go/nemo_flow && go fmt ./...

Use PascalCase naming convention for Go API implementations

For Go binding changes, run go fmt ./... within the Go module to format all Go files

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
**/*.go

📄 CodeRabbit inference engine (CONTRIBUTING.md)

**/*.go: Use gofmt for Go code formatting
Use go vet ./... for Go static analysis
Use PascalCase naming convention for Go code

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
**/*.{md,markdown,py,sh,bash,js,ts,java,cpp,go,rust}

📄 CodeRabbit inference engine (.agents/skills/contribute-docs/SKILL.md)

Keep package names, repo references, and build commands current in documentation

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
{crates/adaptive/**,python/nemo_flow/{adaptive,plugin}.py,go/nemo_flow/{adaptive,**}/*.go,**/*.{ts,js,wasm}}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Keep adaptive config schema, plugin lifecycle, and bindings in sync across crates/adaptive, core, bindings, Python (python/nemo_flow/adaptive.py and python/nemo_flow/plugin.py), Go (go/nemo_flow/adaptive and go/nemo_flow), and Node/WebAssembly helpers

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
{crates/adaptive/**/*.rs,python/nemo_flow/adaptive.py,python/nemo_flow/plugin.py,go/nemo_flow/**/*.go,**/{helper,constructor,builder}.{ts,tsx,js}}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Ensure typed helper constructors map cleanly to the same config document without divergence

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
{crates/adaptive/**/*.rs,python/nemo_flow/plugin.py,go/nemo_flow/**/*.go,**/{plugin,lifecycle}.{ts,tsx,js}}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Maintain consistent plugin lifecycle across all language bindings (Rust, Python, Go, Node/WebAssembly)

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
{crates/adaptive/**/*.rs,python/nemo_flow/plugin.py,go/nemo_flow/**/*.go,**/{context,plugin}.{ts,tsx,js,rs}}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Keep plugin context surfaces aligned across all language implementations

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
{docs/**,examples/**,crates/adaptive/**,python/nemo_flow/**,go/nemo_flow/**,**/{example,component}.{ts,tsx,js,rs,py,go}}

📄 CodeRabbit inference engine (.agents/skills/maintain-optimizer/SKILL.md)

Any new adaptive component kind must have documentation, examples, and binding coverage across all supported languages

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
go/**/*.go

📄 CodeRabbit inference engine (.agents/skills/validate-change/SKILL.md)

If Go language surface changed, always run Go test target even when Rust core did not change

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
go/nemo_flow/**/*

⚙️ CodeRabbit configuration file

go/nemo_flow/**/*: Review Go binding changes for cgo memory ownership, race safety, callback cleanup, idiomatic exported APIs, and parity with Rust/FFI behavior.
Any API change should include focused Go tests and consider race-test behavior.

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/nemo_flow.go
  • go/nemo_flow/atof_test.go
{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}

⚙️ CodeRabbit configuration file

{crates/**/tests/**,python/tests/**,go/nemo_flow/**/*_test.go}: Tests should cover the behavior promised by the changed API surface, including error paths and cross-request isolation where relevant.
Prefer assertions on lifecycle events, scope stacks, middleware ordering, and binding parity over shallow smoke tests.

Files:

  • go/nemo_flow/context_test.go
  • go/nemo_flow/atof_test.go
crates/ffi/src/api/**/*.rs

📄 CodeRabbit inference engine (.agents/skills/add-binding-feature/SKILL.md)

Add or update FFI wrappers in the relevant crates/ffi/src/api/*.rs module with nemo_flow_ prefix naming convention, re-export through crates/ffi/src/api/mod.rs, and ensure generated crates/ffi/nemo_flow.h stays correct

Files:

  • crates/ffi/src/api/mod.rs
  • crates/ffi/src/api/scope_stack.rs
🔇 Additional comments (11)
crates/core/src/api/runtime/scope_stack.rs (1)

239-247: LGTM!

Also applies to: 310-325

crates/core/src/api/runtime.rs (1)

20-23: LGTM!

crates/ffi/src/types/mod.rs (1)

14-14: LGTM!

Also applies to: 51-52

crates/ffi/src/api/mod.rs (1)

39-40: LGTM!

Also applies to: 49-50

go/nemo_flow/atof_test.go (1)

47-63: LGTM!

go/nemo_flow/context_test.go (1)

9-9: LGTM!

Also applies to: 127-131, 139-141

go/nemo_flow/nemo_flow.go (1)

34-34: LGTM!

Also applies to: 215-216, 1506-1521

crates/ffi/nemo_flow.h (1)

187-190: LGTM!

Also applies to: 1778-1799

crates/ffi/src/api/scope_stack.rs (3)

5-6: LGTM!


79-101: LGTM!


103-120: LGTM!


Walkthrough

This PR adds thread-local scope stack binding capture and restore functionality across Rust core, FFI boundary, and Go bindings. A new ThreadScopeStackBinding type preserves both the scope stack handle and explicit ownership flag, enabling restoration of exact thread-local state around foreign runtime operations. The Go ScopeStack.Run() method now captures pre-existing bindings, reliably restores them via deferred operations, and validates all operations via status checks.

Changes

Thread-local Scope Stack Binding

Layer / File(s) Summary
Rust core binding type and utilities
crates/core/src/api/runtime/scope_stack.rs, crates/core/src/api/runtime.rs
ThreadScopeStackBinding struct captures thread-local ScopeStackHandle and explicit ownership flag; capture_thread_scope_stack() and restore_thread_scope_stack() read and reapply these values to thread-local storage; public re-export exposes type and functions at crate::api::runtime.
FFI opaque wrapper and type imports
crates/ffi/src/types/mod.rs, crates/ffi/src/api/mod.rs
FfiThreadScopeStackBinding wraps ThreadScopeStackBinding as an opaque FFI handle; runtime imports updated to include both ScopeStackHandle and ThreadScopeStackBinding; type is imported and re-exported in FFI types module.
C FFI functions and Rust implementations
crates/ffi/nemo_flow.h, crates/ffi/src/api/scope_stack.rs, crates/ffi/src/api/mod.rs
C header defines opaque FfiThreadScopeStackBinding type and declares nemo_flow_scope_stack_capture_thread() and nemo_flow_scope_stack_restore_thread() functions; Rust FFI module implements both with null-pointer validation, heap allocation, and error handling; necessary runtime functions imported.
Go CGo bindings and ScopeStack.Run refactor
go/nemo_flow/nemo_flow.go
Go declares CGo FfiThreadScopeStackBinding struct and extern capture/restore functions; ScopeStack.Run() reworked to capture pre-existing thread binding, bind the provided stack, and defer restoration of the original binding with status checks and panic on failure.
Go test updates for binding preservation
go/nemo_flow/context_test.go, go/nemo_flow/atof_test.go
TestScopeStackActiveInsideRun locks OS thread and verifies ScopeStackActive() state transitions around stack.Run; TestAtofExporterLifecycleWritesRawJSONL refactored to execute scope lifecycle inside stack.Run with error capture and deferred cleanup.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title follows Conventional Commits format with the 'fix' type and provides a concise, imperative summary of the change.
Description check ✅ Passed The description is comprehensive, covering all required template sections with detailed explanations of changes, reviewer guidance, and confirmed author confirmations.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=error msg="[linters_context] typechecking error: pattern ./...: directory prefix . does not contain main module or its selected dependencies"


Comment @coderabbitai help to get the list of available commands and usage tips.

@willkill07 willkill07 self-assigned this May 13, 2026
@willkill07 willkill07 added this to the 0.2 milestone May 13, 2026
@willkill07
Copy link
Copy Markdown
Member Author

/merge

@rapids-bot rapids-bot Bot merged commit 3c80bcc into NVIDIA:main May 13, 2026
59 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bug issue describes bug; PR fixes bug lang:go PR changes/introduces Go code lang:rust PR changes/introduces Rust code size:M PR is medium

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants