Skip to content

fix(windows): guard Linux-only POSIX RT calls so the installer build compiles under MSYS2#133

Merged
thiagoralves merged 7 commits into
developmentfrom
fix/windows-pthread-rt-guards
Jun 5, 2026
Merged

fix(windows): guard Linux-only POSIX RT calls so the installer build compiles under MSYS2#133
thiagoralves merged 7 commits into
developmentfrom
fix/windows-pthread-rt-guards

Conversation

@thiagoralves
Copy link
Copy Markdown
Contributor

Summary

v4.1.1's Windows installer build failed at `core/src/plc_app/image_tables.cpp:96`:

```
error: 'PTHREAD_PRIO_INHERIT' was not declared in this scope; did you mean 'PTHREAD_ONCE_INIT'?
```

`PTHREAD_PRIO_INHERIT` is the POSIX priority-inheritance mutex protocol — optional in the standard. MSYS2/MinGW pthread on Windows doesn't ship it (`_POSIX_THREAD_PRIO_INHERIT` is undefined, `pthread_mutexattr_setprotocol` isn't declared). And it's a no-op even when it works without an RT scheduler — Windows has no SCHED_FIFO, so there's nothing to priority-boost into.

Same pattern hits `plc_state_manager.cpp`'s CPU-affinity block: `cpu_set_t` / `CPU_ZERO` / `CPU_SETSIZE` / `pthread_setaffinity_np` are Linux `<sched.h>` extensions, not standard POSIX. MSYS2 doesn't define them. Build never reaches this file because it stops at `image_tables.cpp` first, but the next compile would have hit it.

What changed

Two surgical `#if` guards:

  • `image_tables.cpp::init_recursive_pi_mutex` — wraps the `pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT)` call with `#if !defined(CYGWIN) && !defined(MSYS) && defined(_POSIX_THREAD_PRIO_INHERIT) && _POSIX_THREAD_PRIO_INHERIT > 0`. Linux gets PI; Windows gets a plain recursive mutex.
  • `plc_state_manager.cpp::plc_task_thread` — wraps the CPU-affinity setup with `#if !defined(CYGWIN) && !defined(MSYS) && defined(linux)`. Non-Linux hosts log "CPU affinity requested but unsupported on this platform" and skip.

The existing `pthread_setschedparam(SCHED_FIFO)` call in the same function is already runtime-graceful (logs a warning on EPERM, falls through) so it needs no compile-time guard — MSYS2 declares the symbol; the kernel rejects the request at runtime.

The `#if !defined(CYGWIN) && !defined(MSYS)` pattern matches the existing `HAS_REALTIME_FEATURES` macro in `core/src/plc_app/utils/utils.c` — established convention in-tree.

Why didn't earlier RC builds fail this way

`v4.1.0-rc.1` (April 3) was the last Windows-installer build that ran successfully. rc.2 / rc.3 / rc.4 all had the job skipped. The Phase 5 STruC++ shim work added `image_tables.cpp` (and its sibling `plc_state_manager.cpp`) after rc.1, so this file path is brand new on Windows.

Test plan

  • Windows-installer CI on this PR turns green
  • Linux/Docker builds remain unaffected (the guarded code is identical on Linux — `_POSIX_THREAD_PRIO_INHERIT` and `linux` are both set on a Linux build host)
  • After merge: cut v4.1.2 with a Windows installer artifact included

thiagoralves and others added 7 commits April 3, 2026 08:31
Pre-release tags (e.g. v4.1.0-rc.1) should not overwrite the latest
tag in ghcr. Only stable releases (no hyphen suffix) get latest.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Pre-release tags (e.g. v4.1.0-rc.1) should only build Docker images,
not Windows installers. Manual workflow_dispatch still works.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Release v4.1.1 — STruC++ compiler + VPP support + compile CPU cap
v4.1.1's Windows installer build failed at
`core/src/plc_app/image_tables.cpp:96` with:

  error: 'PTHREAD_PRIO_INHERIT' was not declared in this scope

`PTHREAD_PRIO_INHERIT` is the POSIX priority-inheritance mutex
protocol — optional in the standard.  MSYS2/MinGW pthread on Windows
doesn't ship it (`_POSIX_THREAD_PRIO_INHERIT` is undefined and
`pthread_mutexattr_setprotocol` isn't declared).  Priority
inheritance is only meaningful on a real-time-scheduled system
(SCHED_FIFO threads); Windows has no RT scheduler, so the protocol
would be a no-op even if it linked.

Same broader pattern applies to the CPU-affinity block in
`plc_state_manager.cpp::plc_task_thread`: `cpu_set_t` / `CPU_ZERO` /
`CPU_SETSIZE` / `CPU_SET` / `pthread_setaffinity_np` are Linux
`<sched.h>` extensions, not part of standard POSIX, and MSYS2's
pthread headers don't define them.

Fix: guard both blocks with `#if !defined(__CYGWIN__) && !defined(__MSYS__)`
(plus `_POSIX_THREAD_PRIO_INHERIT` for the PI call and `__linux__` for
the CPU-affinity block).  Windows degrades to a plain recursive
mutex + default scheduling — consistent with the platform's lack of
real-time support.  The existing `pthread_setschedparam(SCHED_FIFO)`
call in the same function is already runtime-graceful (logs a warning
on EPERM and falls through), so it doesn't need a compile-time guard;
MSYS2's pthread.h declares the symbol but the kernel rejects the
request at runtime, which is the behaviour we already handle.

Reference: the same `#if !defined(__CYGWIN__) && !defined(__MSYS__)`
pattern already exists in `core/src/plc_app/utils/utils.c` (gating
`<sched.h>` / `<sys/mman.h>` / `<sys/prctl.h>` includes via the local
`HAS_REALTIME_FEATURES` macro), so this matches an established
in-tree convention rather than introducing a new one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After fixing the POSIX PI-mutex / CPU-affinity guards in
image_tables.cpp + plc_state_manager.cpp, the Windows installer
build progressed past the runtime and stopped on the EtherCAT
plugin instead:

  /usr/include/w32api/winsock2.h:1031:34: error: conflicting types
    for 'select'; have 'int(int, fd_set *, fd_set *, fd_set *,
                            const TIMEVAL *)'
  /usr/include/w32api/winsock2.h:1040:34: error: conflicting types
    for 'gethostname'; have 'int(char *, int)'

Root cause: the vendored SOEM (Simple Open EtherCAT Master) ships
its own Windows port under
`core/src/drivers/plugins/native/ethercat/libs/soem/oshw/win32/wpcap/`
that pulls in WinPcap headers, which collide with MSYS2's w32api
`winsock2.h` (duplicate `select` / `gethostname` declarations).

Even if those headers compiled cleanly the EtherCAT plugin itself
relies on SCHED_FIFO + PREEMPT_RT scheduling that Windows doesn't
provide — it's fundamentally Linux-only by design.  No point
trying to ship a non-functional Windows EtherCAT plugin.

This patch teaches `build_native_plugins()` in install.sh to skip
plugins listed in a `linux_only_plugins=("ethercat")` array when
running under MSYS2.  Linux/macOS builds are unchanged; Windows
gets a clean build that excludes the plugin entirely and logs
"Skipping Linux-only plugin on MSYS2/Windows: ethercat".

The array makes future Linux-only plugins (eg. xenomai, custom
PREEMPT_RT-dependent drivers) trivial to add.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The vendored SOEM library and its wpcap include path were already
wired up for MSYS2 in the EtherCAT CMakeLists.txt (auto-generates a
`MSYS.cmake` for SOEM that defines `WIN32` and adds the wpcap
include directory).  SOEM's own .c files compile fine — but
`ethercat_plugin.c` and its siblings fail with:

  error: conflicting types for 'select'; have
    'int(int, fd_set *, fd_set *, fd_set *, const TIMEVAL *)'
  note: previous declaration of 'select' with type
    'int(int, fd_set *, fd_set *, fd_set *, struct timeval *)'

Root cause: `ethercat_plugin.c` includes both `<signal.h>` (POSIX
chain → `sys/select.h` → POSIX `select` with `struct timeval *`)
AND `soem/soem.h` (→ `nicdrv.h` win32 → `pcap.h` → `pcap-stdinc.h`
→ `winsock2.h` → Win32 `select` with `const TIMEVAL *`).  MSYS2's
two libc's disagree on the signature so we get a redeclaration
error.  Same story for `gethostname` (`size_t` vs `int`).

MSYS2's `sys/select.h` is explicit about this:

  /* We don't define fd_set and friends if we are compiling POSIX
     ... (defined by __USE_W32_SOCKETS) the W32api winsock[2].h
     header which it must not call the Cygwin select function. */
  # if !(defined (_WINSOCK_H) || defined (_WINSOCKAPI_) || \
         defined (__USE_W32_SOCKETS))

Same guard exists in `sys/unistd.h` around `gethostname`.

Fix: add `__USE_W32_SOCKETS` to SOEM's `target_compile_definitions
(PUBLIC ...)` block.  The PUBLIC scope propagates the define to
every target linking against soem — including `ethercat_plugin` —
so the POSIX chain's `select` / `gethostname` decls get suppressed
in those TUs and the Win32 wpcap-supplied decls win without
conflict.  No effect on Linux builds (the whole block is gated by
`if(CMAKE_SYSTEM_NAME STREQUAL \"MSYS\" OR ... \"CYGWIN\")`).

Verified by reproducing the conflict in a Windows VM with the same
MSYS2 toolchain (gcc 15.2.0, w32api 14.0.0) and watching the
EtherCAT plugin build go from \"conflicting types for 'select'\" to
clean: \`libethercat_plugin.so\` artefact produced.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@thiagoralves thiagoralves merged commit 03f3796 into development Jun 5, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant