Skip to content

Conversation

@sigmaSd
Copy link
Contributor

@sigmaSd sigmaSd commented Oct 23, 2025

This PR is a +10 hour debugging with + 200 requests to claude, The motivation is I had a "stress" program that discovered many issues, and so this pr:

This commit resolves memory management issues with Python callbacks, including segfaults, premature callback destruction, and resource leaks.

Callbacks had multiple lifetime management issues:

  1. Callbacks were freed prematurely when JS PyObject wrappers were GC'd, even though Python still held references (causing segfaults)
  2. The same Python object could have multiple PyObject wrappers, each registering with refregistry and causing duplicate Py_DecRef calls
  3. Resource leaks occurred when callbacks were never properly cleaned up
  4. Auto-created callbacks (from JS functions) had no cleanup mechanism
  5. Handle python callbacks that are never passed to python

Implement PyCapsule-based callback cleanup:

  1. PyCapsule destructor: When creating a callback, we:

    • Create a PyCapsule containing the callback's handle
    • Use the capsule as the m_self parameter of PyCFunction_NewEx
    • Register a destructor that Python calls when freeing the PyCFunction
    • In the destructor, remove the callback from callbackRegistry and destroy it
  2. Separate lifetime tracking: Callbacks use callbackRegistry instead of refregistry. They are NOT registered with refregistry because:

    • The initial reference from PyCFunction_NewEx is sufficient
    • Cleanup happens via the capsule destructor, not FinalizationRegistry
    • This prevents artificial refcount inflation
  3. Handle-based deduplication: Track registered handles in a Set (not PyObject instances) to prevent duplicate registration when multiple PyObject wrappers exist for the same Python object

  4. Auto-created callback cleanup: callbackCleanupRegistry handles callbacks created from raw JS functions that are never passed to Python

  • src/python.ts: Implement PyCapsule-based callback cleanup
  • src/symbols.ts: Add PyCapsule_New and PyCapsule_GetPointer symbols
  • test/test_with_gc.ts: Add proper cleanup to tests

All existing tests pass with resource sanitization on

This commit resolves memory management issues with Python callbacks,
including segfaults, premature callback destruction, and resource leaks.

Callbacks had multiple lifetime management issues:
1. Callbacks were freed prematurely when JS PyObject wrappers were GC'd,
   even though Python still held references (causing segfaults)
2. The same Python object could have multiple PyObject wrappers, each
   registering with refregistry and causing duplicate Py_DecRef calls
3. Resource leaks occurred when callbacks were never properly cleaned up
4. Auto-created callbacks (from JS functions) had no cleanup mechanism

Implement PyCapsule-based callback cleanup:

1. **PyCapsule destructor**: When creating a callback, we:
   - Create a PyCapsule containing the callback's handle
   - Use the capsule as the m_self parameter of PyCFunction_NewEx
   - Register a destructor that Python calls when freeing the PyCFunction
   - In the destructor, remove the callback from callbackRegistry and destroy it

2. **Separate lifetime tracking**: Callbacks use callbackRegistry instead of
   refregistry. They are NOT registered with refregistry because:
   - The initial reference from PyCFunction_NewEx is sufficient
   - Cleanup happens via the capsule destructor, not FinalizationRegistry
   - This prevents artificial refcount inflation

3. **Handle-based deduplication**: Track registered handles in a Set (not
   PyObject instances) to prevent duplicate registration when multiple
   PyObject wrappers exist for the same Python object

4. **Auto-created callback cleanup**: callbackCleanupRegistry handles
   callbacks created from raw JS functions that are never passed to Python

- src/python.ts: Implement PyCapsule-based callback cleanup
- src/symbols.ts: Add PyCapsule_New and PyCapsule_GetPointer symbols
- test/test_with_gc.ts: Add proper cleanup to tests

All existing tests pass with resource sanitization on
…tration

Callbacks are now only added to callbackRegistry when .owned is called
(i.e., when actually passed to Python), not during creation. This allows
callbackCleanupRegistry to properly clean up callbacks that are created
but never passed to Python.

Changes:
- Store Callback reference in PyObject.#callback during creation
- Register with callbackRegistry in .owned getter when callback is passed to Python
- Add test case for auto-created callbacks that are never passed to Python
@sigmaSd
Copy link
Contributor Author

sigmaSd commented Oct 23, 2025

This is a mini version of the stress test: ( an important part is spamming GC)

import {
  kw,
  NamedArgument,
  type PyObject,
  python,
} from "jsr:@denosaurs/python@0.4.6";

import {
  type Adw1_ as Adw_,
  DenoGLibEventLoop,
  type Gtk4_ as Gtk_,
} from "jsr:@sigma/gtk-py@0.7.0";

const gi = python.import("gi");
gi.require_version("Gtk", "4.0");
gi.require_version("Adw", "1");
const Gtk: Gtk_.Gtk = python.import("gi.repository.Gtk");
const Adw: Adw_.Adw = python.import("gi.repository.Adw");
const GLib = python.import("gi.repository.GLib");
const gcp = python.import("gc");
const el = new DenoGLibEventLoop(GLib); // this is important so setInterval works (by unblockig deno async event loop)

const gcInterval = setInterval(() => {
  gcp.collect();
  gc();
}, 100);

class MainWindow extends Gtk.ApplicationWindow {
  #state = false;
  #f?: PyObject;
  constructor(kwArg: NamedArgument) {
    // deno-lint-ignore no-explicit-any
    super(kwArg as any);
    this.set_default_size(300, 150);
    this.set_title("Awaker");
    this.connect("close-request", () => {
      el.stop();
      clearInterval(gcInterval);
      return false;
    });

    const button = Gtk.ToggleButton(
      new NamedArgument("label", "OFF"),
    );
    const f = python.callback(this.onClick);
    button.connect("clicked", f);
    const vbox = Gtk.Box(
      new NamedArgument("orientation", Gtk.Orientation.VERTICAL),
    );
    vbox.append(button);
    this.set_child(vbox);
  }

  // deno-lint-ignore no-explicit-any
  onClick = (_: any, button: Gtk_.ToggleButton) => {
    this.#state = !this.#state;
    (this.#state) ? button.set_label("ON") : button.set_label("OFF");
  };
}

class App extends Adw.Application {
  #win: MainWindow | undefined;
  constructor(kwArg: NamedArgument) {
    super(kwArg);
    this.connect("activate", this.onActivate);
  }
  onActivate = python.callback((_kwarg, app: Gtk_.Application) => {
    new MainWindow(new NamedArgument("application", app)).present();
  });
}

const app = new App(kw`application_id=${"com.example.com"}`);
app.register();
app.activate();
el.start();

@sigmaSd
Copy link
Contributor Author

sigmaSd commented Oct 23, 2025

The above example is what motivated this:
If you run the above code with deno run -A --v8-flags=--expose-gc it will segfault, because this pyobject const f = python.callback(this.onClick); is Gced by JS because it have no way of know that python still needs it,

I couldn't make this test more minimal or make it a standalone test with no gtk involved, but the point being in a relativey complex project, JS might not see that some callbacks are still needed by python

So to fix this I see 2 solutions:

  • either never free callbacks (by saving them to a global object)
  • or what this PR does, use python capsule to detect when python doesn't need the object anymore and synchronize the 2 Gcs

The pr seems to work well so far

Copy link
Member

@eliassjogreen eliassjogreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow thanks! I looked through the code and can't see anything that's obviously wrong, but it's as you say not the simplest issue nor piece of code... Well documented and easy to follow too! <3

@sigmaSd
Copy link
Contributor Author

sigmaSd commented Oct 25, 2025

Thanks I added a small cleanup, turns out bun tests runs with each commit that's pretty cool, I think its ready to merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants