Skip to content

hal: Deprecate pyhal and move functionality into hal module#4061

Merged
BsAtHome merged 1 commit into
LinuxCNC:masterfrom
BsAtHome:fix_hal-port-pyhal-deprecate
May 29, 2026
Merged

hal: Deprecate pyhal and move functionality into hal module#4061
BsAtHome merged 1 commit into
LinuxCNC:masterfrom
BsAtHome:fix_hal-port-pyhal-deprecate

Conversation

@BsAtHome
Copy link
Copy Markdown
Contributor

This is phase 0 of the HAL update. We need to get rid of direct access to HAL memory and library from python (anything that bypasses the usual accesses).

The pyhal module, only used by raster, has been deprecated and the HAL_PORT functionality has been implemented in halmodule (_hal and hal modules). The port functionality was introduced with laser/raster and has been quite isolated so far. Therefore, the fallout should be minimal.

The hal module has been updated so that it accepts native values when setting pins, params and signals. Previously, only string arguments were accepted. The type of the pin/param/signal is now considered when accepting values (like True/False for bit pins and a float value like 3.1415 for real pins, etc.)

The HAL_PORT code has been updated and streamlined. The maximum allowed port queue size is now enforced at 64kByte (because it lives in HAL shared memory). All port API functions do proper argument testing now to prevent unexpected behaviour or crashes. The raster class and test has been updated and the test is no longer skipped.

There are also doc updates (full hal module API description), but they have been held back until the asciidoctor changes are in for generating the docs. The update causes a2x to generate an error when generating a pdf and it is unclear why and where the error happens (most likely in the TeX processing; asciidoctor says the doc is fine).

@grandixximo
Copy link
Copy Markdown
Contributor

@BsAtHome heads-up on a defense-in-depth opportunity while you have halmodule.cc open.

While debugging the qtdragon ui-smoke segfault on Ubuntu 24.04 I traced a shutdown SIGSEGV to a lifecycle gap between _hal and qtvcp:

  • pyhalitem stores its halitem by value (pypin->pin = *pin; in pyhal_pin_new) and has no back-ref to the owning halobject.
  • After comp.exit() the HAL shared memory is detached by rtapi_shmem_delete in hal_lib.c, so every snapshotted u becomes a pointer into an unmapped page.
  • Any caller holding pyhalitem objects (qtvcp's QPin.REGISTRY is the one I hit, via its update timer) will SIGSEGV on the next read on glibc 2.39; older glibc happens to keep the pages around so the bug was latent.

This PR keeps the by-value snapshot, so the lifecycle bug class survives the rewrite. The qtvcp-side fix (stopping the QPin timer before HAL.exit) closes the qtdragon path and is up as #4062. _hal can also be hardened on its own so future callers of pyhal_read_common after hal_exit get a Python exception instead of a SIGSEGV: e.g. give pyhalitem an owning halobject* (Py_INCREF'd) plus a check that the owner's hal_id > 0 before deref, or invalidate items on exit.

Reproducer evidence (instrumented halmodule logging mincore() on every read):

[t=...891937] PYHAL_EXIT_CALLED ... Python stack: qtvcp.py:527 HAL.exit()
[t=...892578] hal_exit name=qtdragon hal_id=84 map_size=59
[t=...900233] READ u=0x7f4537a97fb0 known=1 alive=0 u_mapped=0
                                                    ^^^^^^^^^^^ shmem gone

How would you like to handle the _hal side:

  1. Fold a hardening into hal: Deprecate pyhal and move functionality into hal module #4061 yourself.
  2. Leave hal: Deprecate pyhal and move functionality into hal module #4061 as-is and accept a follow-up PR from me after this lands.
  3. Open an issue only and decide later.

Happy to share the instrumented branch and docker repro if useful.

@BsAtHome
Copy link
Copy Markdown
Contributor Author

I think you need to open an issue and I'll fold it into a new PR to make it clear that they are separate issues.

Accessing a component's pins and other component associated things after you cal hal_exit() on the component is a bug in the program. It is a use-after-free bug. Its like, you closed the door and then want to walk through the door. That will hurt...

The whole halmodule code needs (a lot) more adjustment when the hal types are changed. This PR is primarily about porting the hal port code.

@BsAtHome BsAtHome merged commit f01903d into LinuxCNC:master May 29, 2026
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants