feat: support CGO_ENABLED=1 alongside CGO_ENABLED=0 (closes #13) #37

Merged
kolkov merged 3 commits into go-webgpu:main from jiyeyuran:feature/cgo-enabled-1-support on May 6, 2026

@jiyeyuran
Contributor

@jiyeyuran jiyeyuran commented May 6, 2026

Closes #13. Also covers the duplicate-symbol root cause referenced in #22.

Summary

Make CGO_ENABLED=1 a fully supported build mode for goffi while keeping the existing CGO_ENABLED=0 / fakecgo path byte-identical for the gogpu ecosystem. Lets goffi co-exist in binaries that already link other CGO bindings (gocv, libavcodec wrappers, etc.).

This builds on the dl-layer dynamic-loading sketch in 44fb57f and completes the missing pieces called out in the review:

  • The hard requirement "gate internal/fakecgo/ with //go:build !cgo" is finished off (asm files + netbsd.go were the gaps).
  • runtime/cgo is now reliably linked under cgo mode through both internal/dl and internal/syscall, so runtime.cgocall works regardless of which package the import chain enters from.
  • ffi/cgo_unsupported.go (the panic-on-CGO=1 blocker) is removed, since CGO=1 is now supported.
  • The CI matrix is extended with a cgo: ['0', '1'] dimension on Linux/Windows/macOS.

What changed

internal/dl — make it work in both cgo modes

  • dl_unix.go and the four dl_{stubs,wrappers}_{unix,arm64}.s files: drop the !cgo build tag. The implementation only depends on runtime.cgocall, which works under both modes once runtime/cgo is wired up.
  • dl_{darwin,linux,freebsd}.go: fold the cgo_import_dynamic directives in from the now-deleted dl_*_nocgo.go siblings. The directives are valid under both cgo modes, so the _nocgo split is no longer needed.
  • New internal/dl/cgo.go (cgo only): import _ "runtime/cgo" so iscgo=true and runtime.cgocall is initialised. Without this, CGO=1 builds hit fatal error: cgocall unavailable.

internal/syscall — same wiring at the lower layer

  • New internal/syscall/cgo.go (cgo only): import _ "runtime/cgo". Required because the internal/arch/* packages depend on internal/syscall only — the internal/dl import alone wouldn't be enough for binaries that exercise the syscall layer directly (e.g. arm64 unit tests).

internal/fakecgo — finish the !cgo gating (root cause shared with #22)

  • asm_amd64.s, asm_arm64.s: add the !cgo && (darwin || freebsd || linux || netbsd) build tag the rest of the package already has. Without it, crosscall2 could collide with runtime/cgo's own crosscall2 under cgo mode.
  • netbsd.go: build tag was just netbsd, missing !cgo. Aligned with every other file in the package.
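
The effect of the completed gating can be checked mechanically. The following is a minimal sketch, not code from this PR: it feeds the build constraint quoted above to the standard library's go/build/constraint package and evaluates it for a few GOOS / cgo combinations. The helper name fakecgoIncluded is hypothetical, and only the cgo and GOOS tags are modelled.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// fakecgoTag is the constraint quoted in the PR description for internal/fakecgo.
const fakecgoTag = "//go:build !cgo && (darwin || freebsd || linux || netbsd)"

// fakecgoIncluded (hypothetical helper) reports whether a file carrying
// fakecgoTag would be compiled for the given GOOS and cgo mode.
// Only the "cgo" tag and the GOOS tag are modelled here.
func fakecgoIncluded(goos string, cgoEnabled bool) bool {
	expr, err := constraint.Parse(fakecgoTag)
	if err != nil {
		panic(err) // the constant above is a well-formed constraint line
	}
	return expr.Eval(func(tag string) bool {
		if tag == "cgo" {
			return cgoEnabled
		}
		return tag == goos
	})
}

func main() {
	for _, goos := range []string{"linux", "darwin", "windows"} {
		for _, cgo := range []bool{false, true} {
			fmt.Printf("GOOS=%-7s cgo=%-5v included=%v\n",
				goos, cgo, fakecgoIncluded(goos, cgo))
		}
	}
}
```

Under this model the file is compiled only when cgo is off and the OS is one of the four listed, which is why crosscall2 from fakecgo can no longer meet runtime/cgo's crosscall2 in the same link.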

internal/arch/arm64 — test isolation

  • abi_capture_test.go: drop the unconditional _ "internal/fakecgo" import. Under CGO=1 fakecgo has zero Go files and the import fails with build constraints exclude all Go files.
  • New fakecgo_test.go (arm64 && !cgo): hosts the fakecgo blank import for the no-cgo test build only.

ffi

  • cgo_unsupported.go removed. Its only role was to make CGO=1 builds panic, which is no longer the desired behaviour.
  • dl_unix.go, dl_darwin.go: drop the !cgo build tag so LoadLibrary/GetSymbol/FreeLibrary are available under CGO=1 too.

README.md

  • Requirements section updated: goffi now works under both modes; CGO_ENABLED=0 remains the recommended default.
  • The #22 limitation note ("link: duplicated definition of symbol _cgo_init") now lists CGO_ENABLED=1 as an alternative workaround (the real runtime/cgo makes both projects' fakecgo packages drop out via //go:build !cgo).

CI (.github/workflows/ci.yml)

  • test job: add cgo: ['0', '1'] matrix dimension on every OS (Linux/Windows/macOS).
  • Coverage artefacts disambiguated as coverage-<os>-cgo<0|1>.
  • Codecov upload still pinned to ubuntu-latest + cgo=0 to keep history compatible.
  • quality-gate parsing updated for the new artefact name.
  • Removed the stale "race detector not supported" comment block.
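
To make the shape of the extended matrix concrete, here is a small sketch that expands an OS list against the cgo dimension and derives the job and artefact names described above. It is an illustration only, not the workflow itself: the job type and expand function are hypothetical, and the runner labels macos-latest and windows-latest are assumptions (only ubuntu-latest is named in this PR).

```go
package main

import "fmt"

// job is a hypothetical model of one entry in the CI test matrix.
type job struct {
	OS  string // runner label, e.g. "ubuntu-latest"
	Cgo string // "0" or "1", exported to the build as CGO_ENABLED
}

// name mirrors the "Test - <os> (cgo=<0|1>)" job naming used in the checklist.
func (j job) name() string { return fmt.Sprintf("Test - %s (cgo=%s)", j.OS, j.Cgo) }

// artefact mirrors the coverage-<os>-cgo<0|1> naming scheme.
func (j job) artefact() string { return fmt.Sprintf("coverage-%s-cgo%s", j.OS, j.Cgo) }

// uploadsToCodecov models the pin to ubuntu-latest + cgo=0.
func (j job) uploadsToCodecov() bool { return j.OS == "ubuntu-latest" && j.Cgo == "0" }

// expand builds the cross product of operating systems and cgo modes.
func expand(oses, cgos []string) []job {
	jobs := make([]job, 0, len(oses)*len(cgos))
	for _, os := range oses {
		for _, cgo := range cgos {
			jobs = append(jobs, job{OS: os, Cgo: cgo})
		}
	}
	return jobs
}

func main() {
	// Runner labels other than ubuntu-latest are assumed for illustration.
	oses := []string{"ubuntu-latest", "macos-latest", "windows-latest"}
	for _, j := range expand(oses, []string{"0", "1"}) {
		fmt.Printf("%-36s %-32s codecov=%v\n", j.name(), j.artefact(), j.uploadsToCodecov())
	}
}
```

Three operating systems times two cgo modes gives the six test jobs the reviewer checklist refers to, with exactly one of them feeding Codecov.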

Compliance with the review checklist

| Requirement | Status |
|---|---|
| internal/fakecgo/ gated with //go:build !cgo | ✅ Completed (asm files + netbsd.go were the gaps) |
| Existing CGO_ENABLED=0 behaviour must not change | ✅ Verified locally; all !cgo-gated code paths untouched in CGO=0 mode |
| All existing tests pass under CGO_ENABLED=0 | |
| All existing tests pass under CGO_ENABLED=1 on Linux/macOS/Windows | ✅ Verified locally on darwin/arm64; CI matrix now covers all three |
| CI matrix extended with CGO_ENABLED=1 jobs | |
| No new dependencies | ✅ Only stdlib runtime/cgo, only under cgo build tag |
| FreeBSD and ARM64 must not regress | dl_freebsd.go updated symmetrically; ARM64 dual-mode tests pass; cross-compile verified for all 7 targets |
| Core FFI path (assembly/classification/CIF prep) untouched | ✅ No changes under internal/arch/{amd64,arm64} core, types/, or ffi/{call,cif,classification}.go |

Test plan

Local (darwin/arm64, Go 1.26.1):

CGO_ENABLED=0 go build ./...                              # PASS
CGO_ENABLED=1 go build ./...                              # PASS
CGO_ENABLED=0 go test  ./ffi ./types ./internal/...       # PASS
CGO_ENABLED=1 go test  ./ffi ./types ./internal/...       # PASS (10/10 stable)
gofmt -l .                                                # clean
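
When running the commands above, it can help to confirm which mode a given binary was really built in. A minimal sketch, assuming Go 1.18 or newer (where runtime/debug records a CGO_ENABLED build setting); the helper cgoModeFrom is hypothetical and not part of goffi:

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// cgoModeFrom (hypothetical helper) returns the recorded CGO_ENABLED value,
// or "unknown" when the setting is absent from the build information.
func cgoModeFrom(settings map[string]string) string {
	if v, ok := settings["CGO_ENABLED"]; ok {
		return v
	}
	return "unknown"
}

func main() {
	// Collect the key/value build settings embedded in this binary, if any.
	settings := map[string]string{}
	if info, ok := debug.ReadBuildInfo(); ok {
		for _, s := range info.Settings {
			settings[s.Key] = s.Value
		}
	}
	fmt.Println("CGO_ENABLED =", cgoModeFrom(settings))
}
```

The same information is available for an already-built binary through go version -m, which prints the embedded build settings.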

Cross-compile (CGO_ENABLED=0):

  • linux/amd64 ✅
  • linux/arm64 ✅
  • darwin/amd64 ✅
  • darwin/arm64 ✅
  • windows/amd64 ✅
  • windows/arm64 ✅
  • freebsd/amd64 ✅ (with the existing -gcflags=...fakecgo=-std)
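
For anyone reproducing the cross-compile pass locally, the sketch below enumerates the seven targets listed above and prints one build command per target. The target type and buildCommand helper are hypothetical. The extra -gcflags value used for freebsd/amd64 is elided in this description, so it is deliberately not reproduced here and has to be added by hand.

```go
package main

import "fmt"

// target is a GOOS/GOARCH pair from the cross-compile list.
type target struct{ GOOS, GOARCH string }

// targets returns the seven targets named in the test plan.
func targets() []target {
	return []target{
		{"linux", "amd64"}, {"linux", "arm64"},
		{"darwin", "amd64"}, {"darwin", "arm64"},
		{"windows", "amd64"}, {"windows", "arm64"},
		{"freebsd", "amd64"}, // additionally needs the existing -gcflags setting
	}
}

// buildCommand (hypothetical helper) renders the shell command for one target.
func buildCommand(t target) string {
	return fmt.Sprintf("GOOS=%s GOARCH=%s CGO_ENABLED=0 go build ./...", t.GOOS, t.GOARCH)
}

func main() {
	for _, t := range targets() {
		fmt.Println(buildCommand(t))
	}
}
```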

Reviewer checklist for CI:

  • Lint passes
  • All 6 Test - <os> (cgo=<0|1>) jobs are green
  • Cross-compile job stays green (covers all 7 targets)
  • Quality-gate coverage thresholds met under cgo=0 (cgo=1 informational)

Notes / non-goals

  • This PR doesn't enable the race detector in CI; the stale "race detector not supported" comment block has been removed because CGO=1 now makes that possible, but layering it on is out of scope.
  • The previously-suggested nofakecgo build tag (#22, "link: duplicated definition of symbol _cgo_init") is unchanged; users who pull in purego in the same binary can still use it. This PR adds a second escape hatch: building with CGO_ENABLED=1 sidesteps the duplicate symbols entirely.
  • No public API changes. Existing CGO=0 users do not need to do anything.
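
Why CGO_ENABLED=1 sidesteps the duplicate symbols can be shown with a toy model. The sketch below is not the Go linker and not code from this PR: it treats each package as a list of exported symbols plus a build gate and rejects any symbol defined twice. The symbol lists are simplified assumptions chosen to mirror the #22 error message.

```go
package main

import "fmt"

// pkg is a toy stand-in for a package contributing symbols to the link.
type pkg struct {
	name      string
	symbols   []string
	noCgoOnly bool // compiled only when cgo is disabled (//go:build !cgo)
	cgoOnly   bool // linked only when cgo is enabled
}

// candidatePackages lists the packages competing for the cgo bootstrap
// symbols in a binary that imports both goffi and purego (simplified).
func candidatePackages() []pkg {
	return []pkg{
		{name: "goffi/internal/fakecgo", symbols: []string{"_cgo_init", "crosscall2"}, noCgoOnly: true},
		{name: "purego/internal/fakecgo", symbols: []string{"_cgo_init"}, noCgoOnly: true},
		{name: "runtime/cgo", symbols: []string{"_cgo_init", "crosscall2"}, cgoOnly: true},
	}
}

// link applies the build gates, then fails on the first symbol defined twice.
func link(pkgs []pkg, cgoEnabled bool) error {
	definedBy := map[string]string{}
	for _, p := range pkgs {
		if (p.noCgoOnly && cgoEnabled) || (p.cgoOnly && !cgoEnabled) {
			continue // excluded by its build constraint
		}
		for _, s := range p.symbols {
			if prev, dup := definedBy[s]; dup {
				return fmt.Errorf("duplicated definition of symbol %s (%s and %s)", s, prev, p.name)
			}
			definedBy[s] = p.name
		}
	}
	return nil
}

func main() {
	for _, cgo := range []bool{false, true} {
		fmt.Printf("cgo=%v: %v\n", cgo, link(candidatePackages(), cgo))
	}
}
```

In this model the cgo-disabled link fails because both fakecgo packages define _cgo_init, while the cgo-enabled link succeeds because runtime/cgo is the only remaining definer.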

@kolkov
Contributor

kolkov commented May 6, 2026

Thank you for a well-structured PR. Code review and deep research analysis are complete.

Review Summary

The PR is structurally sound, follows purego's established import _ "runtime/cgo" pattern, and the dual-mode approach is correct. Core FFI path untouched. All 8 review requirements met.

CI Failure — Not Your Code

The lint failure is a pre-existing issue in ffi/callback.go (reflect.Ptr → reflect.Pointer; reflect.Ptr has been deprecated since Go 1.18). We've fixed it in PR #38. Once that merges, please rebase your branch on main and CI should pass cleanly.

One Verification Question

Our analysis shows that under CGO_ENABLED=1, crosscall2 resolves from runtime/cgo's assembly (byte-identical to fakecgo's implementation). This should work transparently for goffi's callback system.

Question: Did you test NewCallback() under CGO_ENABLED=1? Specifically — does a callback invoked from a C-created thread (not a Go goroutine) work correctly? We want explicit confirmation that the callback path through runtime/cgo's crosscall2 was exercised in your local testing.

Maintainership

If we merge this, would you be willing to be the ongoing maintainer for the CGO_ENABLED=1 code path? We'd add you to CODEOWNERS for the relevant files (internal/dl/cgo.go, internal/syscall/cgo.go) so you get pinged on related issues and PRs.

Minor Suggestions (non-blocking)

  1. Consider adding a comment in internal/dl/cgo.go noting that cgo_import_dynamic works under CGO_ENABLED=1 specifically because import _ "runtime/cgo" triggers the external linker.
  2. On glibc ≥ 2.34, libdl.so.2 is a stub mapping to libc.so.6. Not a problem — just worth a comment in dl_linux.go for future readers.

Overall: excellent work. Clean, isolated, well-documented. Looking forward to CI green + your confirmation on callbacks.

@codecov

codecov Bot commented May 6, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


@kolkov
Contributor

kolkov commented May 6, 2026

@jiyeyuran One more note on commit f43c29a ("chore: fix golangci-lint v2 errors on non-Linux hosts").

This commit touches files outside the CGO_ENABLED=1 scope — .golangci.yml, ffi/darwin_objc_test.go, and internal/arch/arm64/classification.go. We noticed the Co-authored-by: Cursor tag, so we wanted to ask: were these changes intentional on your part, or did the AI agent apply them automatically while working on the CGO feature?

Specifically, three concerns:

1. .golangci.yml — removing build-tags: linux

We understand the rationale (avoiding osInit redeclared when linting on macOS). However, this tag was set deliberately and affects the entire project's lint pipeline. On CI (ubuntu-latest), GOOS=linux is already the default, so the tag was indeed a no-op there — but removing it is an infrastructure decision that should be reviewed separately.

2. ffi/darwin_objc_test.go — removing test helpers

Four helper functions were removed as "unused": objcArgInt64(), objcArgRect(), objcArgSize(), objcCallBool(). These are scaffolding for our ObjC/Darwin test suite — objcArgRect and objcArgSize are prepared for NSRect/NSSize struct passing tests we plan to add (related to #33). The cString() refactor is fine stylistically but also unrelated to CGO support.

3. internal/arch/arm64/classification.go — redundant type removal

This one is harmless (staticcheck QF1011), but still outside the PR scope.

What we'd suggest

In enterprise open-source projects (Go stdlib, Kubernetes, etc.), feature PRs are kept scoped to the feature — unrelated cleanups go in separate PRs. This ensures clean git bisect, independent reverts, and focused review.

Could you either:

  • Option A: Remove commit f43c29a from this PR (e.g., interactive rebase to drop it), or
  • Option B: Split it into a separate PR for lint fixes

Your core CGO work (commits 1-2) is excellent and CI-green. We'd like to merge that cleanly without mixing in unrelated changes.

No rush — just want to keep the git history clean.

jiyeyuran pushed a commit to jiyeyuran/goffi that referenced this pull request May 6, 2026
Address PR go-webgpu#37 reviewer feedback by adding an explicit end-to-end
exercise of the C-thread -> trampoline -> crosscall2 -> Go closure
path. The new test uses goffi's own FFI to call pthread_create with
a Go callback as the start_routine, then pthread_join to observe
the callback's return value. The assertion is identical in both cgo
modes; only the source of crosscall2 differs (internal/fakecgo vs
runtime/cgo).

Verified locally on darwin/arm64, Go 1.26.1:
  CGO_ENABLED=0 go test -run TestCallback_FromCThread -count=20  PASS
  CGO_ENABLED=1 go test -run TestCallback_FromCThread -count=20  PASS

Also acts on the two non-blocking comment suggestions:
  - internal/dl/cgo.go: spell out that the runtime/cgo blank import
    is what makes our cgo_import_dynamic directives effective under
    CGO_ENABLED=1, by activating the external linker.
  - internal/dl/dl_linux.go: note that on glibc >= 2.34 libdl.so.2 is
    a stub mapping to libc.so.6, and that we still target libdl.so.2
    on purpose for backwards compatibility with older glibc and musl.

No production-code behaviour changes.

Co-authored-by: Cursor <cursoragent@cursor.com>
@jiyeyuran
Contributor Author

Thanks for the careful review! All three items addressed in commit c260888 (rebased on top of the merge from main):

1. Verification — callback from a C-created thread under CGO_ENABLED=1

Yes, the path through runtime/cgo's crosscall2 is now explicitly exercised. I added ffi/callback_cthread_test.go which uses goffi's own FFI to call pthread_create with a Go callback as the start_routine, then pthread_join to observe the callback's return value. The interesting path is:

C-created thread (pthread_create)
  -> trampoline  (callback_{amd64,arm64}.s)
  -> crosscall2  (runtime/cgo when CGO=1, internal/fakecgo when CGO=0)
  -> runtime.cgocallback
  -> Go closure
  -> return value -> pthread_join

The test is gated on (linux || darwin || freebsd) && (amd64 || arm64) with no cgo constraint, so the CI matrix runs it under both cgo=0 and cgo=1 on Linux and macOS. That is the direct confirmation that both crosscall2 implementations work through the trampoline.

Local results (darwin/arm64, Go 1.26.1):

CGO_ENABLED=0 go test -run TestCallback_FromCThread -count=20  PASS
CGO_ENABLED=1 go test -run TestCallback_FromCThread -count=20  PASS
CGO_ENABLED=1 go test -run TestCallback -v               21 / 21 PASS

The closure asserts on three witnesses: the C arg arrives intact, the closure's return value comes back via pthread_join's *retval, and the closure can take the address of a stack variable (i.e. the Go runtime really did set up a goroutine on the C thread before our Go code ran).

2. Maintainership

Happy to. Add me (@jiyeyuran) to CODEOWNERS for:

  • internal/dl/cgo.go
  • internal/syscall/cgo.go
  • ffi/callback_cthread_test.go

I'll keep an eye on issues/PRs that touch the CGO_ENABLED=1 path and the callback-from-C-thread test.

3. Minor suggestions — applied

  • internal/dl/cgo.go: expanded the comment to spell out that the import _ "runtime/cgo" blank import is exactly what activates the external linker, and that the external linker is what makes our //go:cgo_import_dynamic directives effective under CGO_ENABLED=1. (Without it, the directives are silently dropped at link time.)
  • internal/dl/dl_linux.go: added a note that on glibc ≥ 2.34, libdl.so.2 is a stub mapping to libc.so.6, and that we deliberately keep targeting libdl.so.2 so older glibc (< 2.34) and musl — which still ship the real library — keep working.

Both are non-functional comment-only changes; no behaviour difference.

CI should be green now once it picks up the rebased branch.

@kolkov
Contributor

kolkov commented May 6, 2026

@jiyeyuran Thank you for the thorough callback verification and the maintainership commitment — both are exactly what we needed. The pthread_create test is excellent.

It looks like our messages crossed — your reply addressed the first review comment (callbacks, maintainership, minor suggestions), but the second comment about commit f43c29a (out-of-scope lint changes in .golangci.yml, darwin_objc_test.go, classification.go) was posted around the same time. Could you take a look at that one when you get a chance?

wuxinfei and others added 3 commits May 6, 2026 16:38
feat: support CGO_ENABLED=1 alongside CGO_ENABLED=0 (closes #13)

Lets goffi co-exist in binaries that already link other CGO bindings (gocv,
ffmpeg-go wrappers, libavcodec, etc.) by making CGO_ENABLED=1 a fully
supported build mode while keeping the existing CGO_ENABLED=0 / fakecgo
path 100% unchanged (the primary path used by the gogpu ecosystem).

What changes
------------
internal/dl
  - dl_unix.go and dl_{stubs,wrappers}_{unix,arm64}.s: drop the `!cgo`
    constraint so the goffi-owned dlopen/dlsym implementation compiles in
    both cgo modes. The runtime.cgocall it uses works under CGO_ENABLED=1
    as long as runtime/cgo is linked (see cgo.go below).
  - dl_{darwin,linux,freebsd}.go: fold the cgo_import_dynamic directives
    in from the deleted dl_*_nocgo.go siblings; the directives are valid
    under both cgo modes, so the `!cgo` split is no longer needed.
  - cgo.go (new, `cgo` only): `import _ "runtime/cgo"` so iscgo=true and
    runtime.cgocall is wired up. Without this we hit "fatal error: cgocall
    unavailable" in CGO_ENABLED=1 mode.

internal/syscall
  - cgo.go (new, `cgo` only): same `import _ "runtime/cgo"` trick. The
    arch packages depend on internal/syscall (not internal/dl), so the dl
    import alone doesn't drag runtime/cgo into binaries that only use the
    syscall layer (e.g. internal/arch/arm64 unit tests).

internal/fakecgo (defensive gating, also addresses go-webgpu#22 root cause)
  - asm_amd64.s, asm_arm64.s: add the same
    `!cgo && (darwin || freebsd || linux || netbsd)` build tag the rest
    of the package already has. Without it the `crosscall2` symbol could
    collide with runtime/cgo's own crosscall2 in cgo mode.
  - netbsd.go: tag was previously just `netbsd`, missing `!cgo`. Aligned
    with every other file in the package.

internal/arch/arm64
  - abi_capture_test.go: drop the unconditional
    `_ "internal/fakecgo"` import; under CGO_ENABLED=1 fakecgo has zero
    Go files and the import fails with
    "build constraints exclude all Go files".
  - fakecgo_test.go (new, `arm64 && !cgo`): hosts the fakecgo blank
    import so it stays in the test binary in CGO_ENABLED=0 mode only.

ffi
  - cgo_unsupported.go: removed. Its sole purpose was to make CGO=1
    builds panic, which is no longer the desired behaviour now that
    CGO=1 is supported.
  - dl_unix.go, dl_darwin.go: drop the `!cgo` constraint so LoadLibrary
    / GetSymbol / FreeLibrary are available under CGO=1 too.

CI (.github/workflows/ci.yml)
  - test job: add a `cgo: ['0', '1']` matrix dimension so every OS runs
    both cgo modes. Coverage artefacts disambiguated as
    `coverage-<os>-cgo<0|1>`. Codecov upload still pinned to
    ubuntu-latest + cgo=0 to keep history compatible.
  - Quality-gate parsing updated for the new artefact name.
  - Removed the stale "race detector not supported" comment block; the
    cgo=1 matrix entry now makes the race detector feasible if someone
    wants to layer it on later.

Verification
------------
Local (darwin/arm64, Go 1.26.1):
  CGO_ENABLED=0 go build ./...                  PASS
  CGO_ENABLED=1 go build ./...                  PASS
  CGO_ENABLED=0 go test  ./ffi ./types ./internal/...   PASS
  CGO_ENABLED=1 go test  ./ffi ./types ./internal/...   PASS (10/10 stable)
  Cross-compile CGO=0 on linux/amd64, linux/arm64, darwin/amd64,
    darwin/arm64, windows/amd64, windows/arm64, freebsd/amd64   PASS

Notes
-----
- CGO_ENABLED=0 is still the primary path. fakecgo is still gated by
  `!cgo` everywhere (the changes here only *complete* the gating that
  was already present on most files), and the existing nofakecgo build
  tag keeps working.
- No new dependencies. Only stdlib `runtime/cgo` is pulled in, and only
  under the `cgo` build tag.

Co-authored-by: Cursor <cursoragent@cursor.com>

docs(README): document CGO_ENABLED=1 as a supported build mode

Update the Requirements section to reflect that goffi now works under
both CGO_ENABLED=0 (the default, fakecgo path) and CGO_ENABLED=1
(real runtime/cgo path), and explain how each mode supplies the cgo
machinery. CGO_ENABLED=0 remains the primary, recommended mode for
the gogpu ecosystem.

Also note that building with CGO_ENABLED=1 is an alternative
workaround for the goffi+purego duplicate-symbol issue (go-webgpu#22): in cgo
mode both libraries' internal/fakecgo packages are gated out by
//go:build !cgo, so there is nothing to collide.

Co-authored-by: Cursor <cursoragent@cursor.com>
@jiyeyuran jiyeyuran force-pushed the feature/cgo-enabled-1-support branch from c260888 to 1dbf74e May 6, 2026 08:38
@jiyeyuran
Contributor Author

jiyeyuran commented May 6, 2026

Force-pushed a linearized branch that drops f43c29a per your request. Fully agree the lint/staticcheck cleanup belongs in its own PR — apologies, that one slipped in unintentionally while the agent was iterating on lint failures.

Branch is now exactly three focused commits on top of main:

1dbf74e test: verify NewCallback works from C-created threads (CGO=0/1)
491acbb docs(README): document CGO_ENABLED=1 as a supported build mode
68d66e4 feat: support CGO_ENABLED=1 alongside CGO_ENABLED=0 (closes #13)

Reverted/dropped from this PR (no longer present anywhere in the diff):

  • .golangci.yml change (removing build-tags: linux)
  • ffi/darwin_objc_test.go removals of objcArgInt64 / objcArgRect / objcArgSize / objcCallBool and the cString refactor
  • internal/arch/arm64/classification.go elementKind type-removal

Contributor

@kolkov kolkov left a comment


LGTM. All review criteria met: CI green (CGO=0/1 × 3 OS), callbacks verified from C-threads, core FFI untouched, clean 3-commit scope. Welcome aboard as CGO path maintainer.

@kolkov kolkov merged commit e664be6 into go-webgpu:main May 6, 2026
13 checks passed
Development

Successfully merging this pull request may close these issues.

Discussion: Should goffi support CGO_ENABLED=1?

2 participants