Use compressed strings for capsule names. #7255

scoder · 2025-10-24T19:21:16Z

For exported C names, it's wasteful to have each name and signature created as separate Python strings in the module state when it's unlikely to be used there. Storing joined, compacted string constants instead allows using the available byte string compression and reusing signature C strings for identical PyCapsule signatures.

Closes #7107

For exported C names, it's wasteful to have the name created as Python string in the module state when it's unlikely to be used there. Storing plain C string constants allows the C compiler to store them away in the constant data segment instead. See cython#7107


scoder · 2025-10-25T09:47:48Z

I think I've squeezed every possible byte out of the storage. It now uses a loop over a function pointer array and a single compressed byte string for the names and the signatures, and compacts the signatures by removing duplicates (which also nicely uses the same signature C string for PyCapsules with the same signature).
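As a rough illustration of the storage layout described above (not the code Cython actually generates), the sketch below packs deduplicated signatures first and names second into one compressed byte string. The function names `pack_exports` and `unpack_exports`, the NUL-terminated layout, and the use of `zlib` are all assumptions made for the example.

```python
import zlib


def pack_exports(entries):
    """Pack (name, signature) pairs into one compressed byte string.

    Hypothetical layout: unique signatures first, then all names,
    each terminated by a NUL byte.
    """
    # Keep each distinct signature once, in first-seen order.
    unique_sigs = list(dict.fromkeys(sig for _, sig in entries))
    sig_index = {sig: i for i, sig in enumerate(unique_sigs)}
    parts = unique_sigs + [name for name, _ in entries]
    blob = b"".join(part.encode("utf-8") + b"\0" for part in parts)
    # One small integer per export says which signature it uses.
    indices = [sig_index[sig] for _, sig in entries]
    return zlib.compress(blob, 9), len(unique_sigs), indices


def unpack_exports(compressed, sig_count, indices):
    """Rebuild the (name, signature) pairs from the packed form."""
    parts = zlib.decompress(compressed).split(b"\0")[:-1]
    sigs, names = parts[:sig_count], parts[sig_count:]
    return [(name.decode("utf-8"), sigs[i].decode("utf-8"))
            for name, i in zip(names, indices)]
```

In this sketch, two exports with the same signature share a single entry in the signature section, which mirrors the idea of reusing one signature C string for several PyCapsules.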

da-woods · 2025-10-25T12:22:24Z

To me this makes more sense for the signatures than the names.

The signatures are (probably) duplicated while the names aren't.

For the names, it saves a little bit of memory making the string-tab array shorter (one pointer per name), but increases the runtime memory since you now have the big unified string plus a separate Python string for each name in the dictionary. For the signatures they can just be a pointer into the big unified string so that's fine.

My (possibly wrong) guess would be that this doesn't make much difference to the compression.

Only slightly related to this PR but I do think it would be nice to cut down the constant tables to exclude things that are only used when initializing the module - i.e. have a separate shorter-lived table outside the module state for constants that we don't need to keep around forever.

scoder · 2025-10-25T16:27:15Z

> To me this makes more sense for the signatures than the names.
> The signatures are (probably) duplicated while the names aren't.

Yes, that's why it's done that way. I also moved the signatures first so that that part of the aligned byte string can stay in memory or caches, while the names in the back part can reside elsewhere, in unused memory places.

> For the names, it saves a little bit of memory making the string-tab array shorter

And also the module state. It's now a single string entry for all signatures (which need to stay alive) and all names. I doubt that the names as part of the byte string hurt that much at runtime.

> My (possibly wrong) guess would be that this doesn't make much difference to the compression.

I tried it on lxml.etree, which exports 51 functions, and it saves 4 KB in the stripped binary module. Not much, but also not nothing. Can't say much about the runtime impact, but even just the module state reduction (816 bytes) and the signature deduplication (17 fewer signatures, probably another ~0.5 KB) should be relevant enough to make sure we don't waste more RAM than before.

> cut down the constant tables to exclude things that are only used when initializing the module - i.e. have a separate shorter-lived table outside the module state for constants that we don't need to keep around forever.

I think that's a good idea. We could also clear out init-time-only constants at the end of PyInit as a first step, but a separate table could really discard them completely.

scoder · 2025-10-25T16:56:32Z

I think we can probably do the same for the import code. I'll take a look as well.

da-woods · 2025-10-25T18:19:15Z

> > My (possibly wrong) guess would be that this doesn't make much difference to the compression.

> I tried it on lxml.etree, which exports 51 functions, and it saves 4 KB in the stripped binary module. Not much, but also not nothing. Can't say much about the runtime impact, but even just the module state reduction (816 bytes) and the signature deduplication (17 fewer signatures, probably another ~0.5 KB) should be relevant enough to make sure we don't waste more RAM than before.

My original comment was a little ambiguous. I completely believe in the compression+deduplication of the signatures. What I really meant was "compared to a version where the names stay in the module state as they are now".

FWIW I had a quick go at that in https://github.com/da-woods/cython/tree/export-without-names. Not really sure how I'd actually compare runtime memory usage - I tried tracemalloc with Cython but don't think it told me too much.

scoder · 2025-10-25T20:37:51Z

I also integrated the import code. Looking at the shared module test, the modules that import the shared module become visibly smaller:
Current master:

 61840 TEST_TMP/memoryview/memoryview_shared_utility/pkg1/pkg11/add_one.cpython-312-x86_64-linux-gnu.so*
 66064 TEST_TMP/memoryview/memoryview_shared_utility/pkg1/pkg11/cast.cpython-312-x86_64-linux-gnu.so*
104048 TEST_TMP/memoryview/memoryview_shared_utility/pkg1/pkg11/dependency.cpython-312-x86_64-linux-gnu.so*
178600 TEST_TMP/memoryview/memoryview_shared_utility/pkg2/CythonShared.cpython-312-x86_64-linux-gnu.so*

Before integrating the import code:

 61840 TEST_TMP/memoryview/memoryview_shared_utility/pkg1/pkg11/add_one.cpython-312-x86_64-linux-gnu.so*
 66064 TEST_TMP/memoryview/memoryview_shared_utility/pkg1/pkg11/cast.cpython-312-x86_64-linux-gnu.so*
104048 TEST_TMP/memoryview/memoryview_shared_utility/pkg1/pkg11/dependency.cpython-312-x86_64-linux-gnu.so*
178920 TEST_TMP/memoryview/memoryview_shared_utility/pkg2/CythonShared.cpython-312-x86_64-linux-gnu.so*

After:

 57744 TEST_TMP/memoryview/memoryview_shared_utility/pkg1/pkg11/add_one.cpython-312-x86_64-linux-gnu.so*
 66064 TEST_TMP/memoryview/memoryview_shared_utility/pkg1/pkg11/cast.cpython-312-x86_64-linux-gnu.so*
 99952 TEST_TMP/memoryview/memoryview_shared_utility/pkg1/pkg11/dependency.cpython-312-x86_64-linux-gnu.so*
178920 TEST_TMP/memoryview/memoryview_shared_utility/pkg2/CythonShared.cpython-312-x86_64-linux-gnu.so*

It's interesting that the shared Cython module gets a little larger with this change. Not sure what triggers that. I guess it fails to benefit from the signature deduplication. I also tried using a plain C string for sig+name instead of compressed Python bytes and that makes it much larger (183016 bytes), so that's not a good idea.

scoder · 2025-10-25T20:45:06Z

It's also interesting that the gain is always exactly 4096 bytes. Might be a segmentation issue. That could also explain why the shared module grows in the test, it might simply move data into a different binary segment that then grows to the next unit boundary. Something like that, maybe?

EDIT: This SO question suggests that segments are padded to the page size of the CPU architecture:
https://stackoverflow.com/questions/67288459/elf-executable-file-many-zero-bytes
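To make the padding hypothesis concrete, here is a small sketch of how rounding a segment up to a page boundary turns a modest payload reduction into a saving of exactly one page. The 4096-byte page size and the payload sizes 8300 and 8100 are assumptions chosen for illustration; only the file sizes in `observed` come from the listings in this thread.

```python
PAGE_SIZE = 4096  # assumed page size, typical for x86-64 Linux


def padded_size(payload_bytes, page_size=PAGE_SIZE):
    """Round a segment's payload up to the next multiple of page_size."""
    return -(-payload_bytes // page_size) * page_size


# Hypothetical segment: shrinking the payload by 200 bytes crosses a page
# boundary, so the padded size drops by a whole page.
before = padded_size(8300)
after = padded_size(8100)
saved = before - after

# Differences between the "Current master" and "After" listings above.
observed = {
    "add_one": 61840 - 57744,
    "dependency": 104048 - 99952,
}
```

Both observed differences come out as one assumed page, which fits the padding explanation but does not prove it. By the same logic, a payload that grows just past a boundary would add padding, which is one possible reading of the shared module getting larger.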

scoder · 2025-10-26T05:28:34Z

> the names stay in the module state as they are now

I considered that along the way but I doubt that the names would also be used inside of the module, so that they'd benefit from being Python strings. I'd expect them to be used exactly once in the module, when exporting the capsules.

OTOH, we also intern the Python identifiers, and other modules that import the capsules need to provide the same strings, so (globally) interning them on export (and import) could reduce the total amount of different string objects in the Python runtime. That's assuming that the majority of exported names are also used somewhere, which isn't always true for cross package cimports from larger packages (say, SciPy-sized). Here, I'd expect the gain from shared interned strings to be fairly small in comparison to the larger module states. Avoiding interned Python names on both sides (export and import) might actually be a better choice.

That said, we're talking about a few kilobytes back and forth here. Whichever we choose probably won't make a big difference in a real system. If we eventually manage to discard import-time data after use, this will be easy to split up again.

With the import integration, I consider the PR in its current state fine for 3.2rc. What do you think?

scoder · 2025-10-26T11:57:28Z

> We could also clear out init-time-only constants at the end of PyInit as a first step

I implemented this for the simple case here, but then noticed that this isn't safe because we deduplicate constants (and strings specifically) in the string array. Thus, if a user happens to have a byte string around that looks like the names or signatures string, then clearing the array index in the module state will kill a user-visible string. However unlikely that is, it's easy to get there for user code. That speaks for keeping the strings entirely separate. Definitely not in 3.2 any more.
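The hazard can be shown with a toy model of a deduplicating constant table. This is a minimal sketch, not Cython's string array: the class `ConstantTable` and its methods are hypothetical names invented for the example.

```python
class ConstantTable:
    """Toy constant table that stores each distinct value only once."""

    def __init__(self):
        self._slots = []
        self._index = {}

    def intern(self, value):
        """Return the slot for value, reusing an existing slot if present."""
        if value not in self._index:
            self._index[value] = len(self._slots)
            self._slots.append(value)
        return self._index[value]

    def get(self, slot):
        return self._slots[slot]

    def clear(self, slot):
        """Drop the stored value, as an init-time cleanup might do."""
        self._slots[slot] = None


table = ConstantTable()
# The generated export code registers its packed names/signatures string ...
export_slot = table.intern(b"int (int)\0add_one\0")
# ... and user code happens to contain an identical bytes literal.
user_slot = table.intern(b"int (int)\0add_one\0")

# Deduplication gives both the same slot, so clearing the export data
# after initialization also removes the user's constant.
table.clear(export_slot)
user_value = table.get(user_slot)  # None: the user-visible string is gone
```

A separate table for init-time-only data would avoid the shared slot, which is the direction suggested in the comment above.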

da-woods · 2025-10-26T12:03:47Z

> > We could also clear out init-time-only constants at the end of PyInit as a first step

> I implemented this for the simple case here, but then noticed that this isn't safe because we deduplicate constants (and strings specifically) in the string array. Thus, if a user happens to have a byte string around that looks like the names or signatures string, then clearing the array index in the module state will kill a user-visible string. However unlikely that is, it's easy to get there for user code. That speaks for keeping the strings entirely separate. Definitely not in 3.2 any more.

Yes - for me I think it'd make most sense to do it as a general mechanism rather than adding specific special-cases. (I also don't know exactly what you did here, but the signature strings do need to live as long as the module for the export code so it'd have to be names-only).

Will have a thorough look at it later today.

da-woods

I can't see any issues.

This feels like the sort of thing that's sufficiently repetitive that it probably doesn't have hidden corner cases, so it is probably OK for 3.2. But it's always a bit hard to judge.

scoder · 2025-10-27T07:59:44Z

> cut down the constant tables to exclude things that are only used when initializing the module

This is now #7266

scoder added this to the 3.3 milestone Oct 24, 2025

scoder added enhancement Code Generation labels Oct 24, 2025

scoder added 4 commits October 25, 2025 07:59

Rewrite function/pointer export code using a loop.

983a605

Store signatures and names as compressed Python bytes.

81595e1

Refactor the export code generation to use it also from the shared utility module.

bf2b6d2

Deduplicate PyCapsule signature string by storing identical signatures as empty C strings.

2f02e6a

scoder changed the title ~~Use plain C strings for capsule names.~~ Use compressed strings for capsule names. Oct 25, 2025

Store signatures and names as one combined Python byte string instead of two.

d109195

scoder mentioned this pull request Oct 25, 2025

[ENH] Compress string constants #7107

Closed

Simplify the implementation by removing indirections and passing a list of tuples for the export instead of three separate lists.

b14e564

scoder added 2 commits October 25, 2025 22:20

Use the same implementation for the import code.

008fadf

I added a switch to use either PyBytes or a C string for sig+names and with PyBytes, modules that import the shared module become visibly smaller. So I'll leave the switch at "True".

Move the actual code generation functions from the beginning to the end of ModuleNode.py to keep the ModuleNode class near the top.

2d42c9f

Minor name change to move the generated name prefix a bit out of the way.

8d16f72

scoder modified the milestones: 3.3, 3.2 Oct 26, 2025

da-woods reviewed Oct 26, 2025

scoder merged commit 2a00d4c into cython:master Oct 27, 2025
92 checks passed

scoder deleted the exported_names branch October 27, 2025 07:37
