-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
Generate PC/python3.def by scraping headers #68091
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
The attached patch adds a script that can scrape header files to pick out PyAPI_DATA and PyAPI_FUNCs to generate PC/python3.def. It's not complete and is not integrated into the build at all yet. I'm planning to work on this further at the PyCon sprints. Running this and comparing output to the current checked-in python3.def shows that we're not keeping it up-to-date well at all. |
Same patch, with diff to python3.def to make it available. |
I think the approach taken in this script is incorrect. It leads to false modifications of the stable ABI, making it in fact unstable. Four examples: PyAST_FromNode does not belong to the stable ABI, because "struct _node" doesn't belong to the stable ABI (see PEP-384 for a list of structures that belong to the stable ABI). In fact, ast.h isn't even a header file that extension modules should regularly include. All deletions are (e.g. PyCFunction_New) are certainly incorrect: functions must never be removed from the stable ABI. In fact, Objects/methodobject.c preserves PyCFunction_New just for the sake of the stable ABI (despite there being no longer a declaration in the header file). PyErr_SetFromWindowsErrWithFilename could belong to the stable ABI, or not, depending on the intention of stability in its addition. Since it was there since 3.0, and had not been in the stable ABI back then, I'd claim that it doesn't belong to the stable ABI (apparently, nobody has missed it). OTOH PyExc_BrokenPipeError might be a useful addition to the stable ABI. However, it was not there in 3.0, so the header files ought to version-guard its declaration, but currently don't. So I prefer if the def file keeps being maintained manually; I don't mind if there is a tool that checks for inconsistencies, leaving their resolution to a maintainer. |
That would be the preferred way to do it, but we got no traction from anyone at PyCon or on python-dev when we tried to get people to look at what's changed. I suspect we haven't marketed the stable API well enough to have a lot of users, and the build tools still mark everything as 3.x-specific rather than 3-specific. Keeping it as a bare minimum interface has its value, but it's not unreasonable to expect that the functions available from the headers under Py_LIMITED_API are actually exported by python3.dll. |
What about including this in Tools/scripts/patchcheck.py? I think that might strike a good enough balance between making changes noticeable and not automatically changing the def file. As far as all the differences this has uncovered, I'll try to make a patch to #ifdef them in/out. |
Could the latter break people already using the stable ABI on other platforms? Does it even work for others? I don't know what the equivalent of python3.dll would be. If it doesn't exist outside of Windows, then assuming that everything not in python3.def isn't stable is fine. Otherwise I think we need to go the other way. |
It might be nice to have a test that tries to import all of the stable ABI at build time so that the buildbots break if someone adds a new API and doesn't think about what they've done. As an aside, it'd be nice for the script to take the version numbers as args so we can integrate it into the MSBuild steps (which already know the version) and avoid having to update this each time we update the Python version. |
Unassigning since I haven't made it anywhere with this in a year; if anybody else would like to pick it up, feel free. Otherwise I'll get to it some day :) |
This tool would be very useful, since new API often accidentally marked as stable API in header files. The script can play just deliberative role, warn if new unstable API is not wrapped with "#ifndef Py_LIMITED_API" or if stable API is not documented. May be we need separate list of public API in more regular format for comparison with header, documentation, python3.def, refcounts.dat, etc. There is a code for parsing preprocessor instructions in Argument Clinic. May be it can be reused here. |
As I recall, the tool is ready to go. The problem is the argument over what to do with all the APIs that currently are incorrect. Zach had a list at PyCon last year, and the consensus seemed to be that "someone" should figure out whether each API should be stable or not. Since Zach understandably doesn't want to be that someone, there's really no way to reliably tool this process when there are a number of existing violations. (FWIW, I believe we should just add them to the stable list, as that's the only way to guarantee source compatibility. They've already shipped in 3.5 and people can use them on non-Windows OSs.) |
See also closely related bpo-26900. |
It already is, actually; its availability was a major factor in my thinking this would be possible and starting to work on it. |
I just ran into this because PyModule_AddFunctions() was added to the stable ABI in 3.5, but was not added to python3.dll. This is blatantly a compilation error. We _need_ python3.def and the stable ABI to be in sync. I've rerun the script against Python 3.6 and attached it. I'm also going to post to python-dev to attract some attention. I think at this stage we ought to update python3.dll to include everything that could be used from the headers. Unfortunately, that means extensions could potentially break if built on 3.6.1 and run against 3.6.0, but it's inevitable that there will be some breakage and this way it's only temporary. |
Guess I should at least pose the question - Ned, would you consider the change in my patch for 3.6.0? Only the one file is affected, and it only contributes to one DLL that is (a) recommended, (b) not widely used and (c) very incompatible with the included headers. |
I don't know enough about how python3.dll is used in the Windows world to ask the right questions. One obvious one: would this change likely affect any third-party binaries already built for 3.6.0? On the other hand, if we have to have one, it would be better to have a compatibility break at 3.6.0 rather than at 3.6.1. I really don't want to take any more changes at this point for 3.6.0 but if you are convinced it is the right thing to do and won't break anything, get it in for 3.6.1 now and I'll reluctantly cherrypick it for 3.6.0. |
Assuming we only make additive changes, then nothing that currently exists will break. However, extensions may then be built using 3.6.1 that will fail on earlier versions. This goes against the entire purpose of the limited API - making the change for 3.6.0 at least means they will work for the entire 3.6 series. This is one of those "all options are bad" scenarios. Making the change today will prevent us from fixing the functions that should not have leaked into the limited API, but then again they've already leaked and people can compile against them. Perhaps the best approach is to revise all functions in the limited API, restrict back to those that were in 3.3 and carefully add new ones from there. Then we just apologise for 3.5.[0-2] and 3.6.0 potentially producing invalid binaries and recommend getting a newer version to build with. I've convinced myself that 3.6.0 is too soon to make the right fix here, so it'll probably always be a bad version in this respect. Hopefully by 3.6.1 we can fix up the limited API and make it reliable again :( |
PyCmpWrapper_Type is not defined. Some underscored names are defined only in special builds (with Py_REF_DEBUG, Py_TRACE_REFS). PyAST_* and PyNode_* functions are not documented, ast.h and node.h are private headers. Names like _PyArg_Parse_SizeT shouldn't be removed. |
You're right, these should be excluded (possibly from the release too?) I've been playing with the script in a separate context and I think I've hit a few issues (though I have made some modifications, so YMMV). It's probably just as well we waited. I'll thoroughly validate it before coming back with a new patch. |
New changeset 9cb87e53e324 by Serhiy Storchaka in branch '3.5': New changeset 0927b5c80c50 by Serhiy Storchaka in branch '3.6': New changeset e64a82371d72 by Serhiy Storchaka in branch 'default': |
Reordered to make diffs easier for reviewing. |
Good call. I think I might actually replace the script with a build step that uses the C preprocessor to get all the names, then a script to generate the file. If someone who knows the POSIX build system can help out, we could generate a list on each build for various platforms and fail when it doesn't match what's committed. Since the preprocessor doesn't have to generate valid code, we should be able to cross-compile for other platforms, though that's less critical. |
See also bpo-29058. |
Here is a shell script that uses the C preprocessor and other Unix tools to generate PC/python3.def. It doesn't add PyAST_* and PyNode_* functions and doesn't remove PyArg_VaParse* functions. But it still removes some names that are no longer used in public interface, but should be kept in the stable ABI for compatibility. E.g. PyCFunction_New (now it is a macro expanded to PyCFunction_NewEx) or _PyTrash_deposit_object (invoked by the macros before 3.2.4). It seems to me that this is unavoidable, and the resulting file must be manually edited. |
Following patch against 3.6 just adds new names to PC/python3.def. The only removed names are PyExc_MemoryErrorInst and PyUnicode_SetDefaultEncoding. They don't exist in 3.x. |
Those macros like PyCFunction_New actually need to become functions again, since we need real exports to forward them to the right place. Converting stable API functions to macros breaks the ABI, since modules compiled against the old one can no longer link to newer versions, and modules compiled against newer ones cannot link to older versions. I'll try and figure out a way we can compile those functions directly into python3.dll, since that's going to leave us with the best ability to use macros for non-stable API builds. We will, however, need to update the header files to define actual functions rather than macros when Py_LIMITED_API is set. |
They do. Functions PyCFunction_New, _PyTrash_deposit_object etc are compiled and exported even if they don't represented in headers. Actually we can remove the PyCFunction_New function since it was converted to to a macro before Python 3.0 (faabd9867fb8). No one Python 3 extension uses it. But this is separate issue. _PyTrash_deposit_object was used in 3.2.3 and should be kept for binary compatibility. |
Attached a patch that integrates generation into the build of python3.dll. There are also a couple of fixes for headers to retain functions/data that were previously exported. The patch includes an updated python3.def file, though I expect that will go away and we'll check in the python3.txt file that is generated on build (which includes the return type). That way we can fail builds that modify the file (regardless of platform) and require an explicit step when adding to the stable API. There are still a few functions towards the end of python3.def that I need to figure out how to deal with. (Also, my sort script doesn't quite match the result from however Serhiy sorted it last, but it's as close as I could get.) |
It is not good that the script removes existing names and adds non-existing names. It is safe to remove names that don't exist in all releases starting from 3.2.0.
As for PyArg_VaParse and PyArg_VaParseTupleAndKeywords see bpo-11626. Seems they were excluded from limited API by mistake. I suggest first fix errors in python3.def, remove non-existing names, add new names, and only after this add the ability of generating it if it doesn't break python3.def. |
New changeset 8423f86486b3 by Serhiy Storchaka in branch '3.5': New changeset b5470d08969c by Serhiy Storchaka in branch '3.6': New changeset 45507a5751d8 by Serhiy Storchaka in branch 'default': |
New changeset d95fee442e27 by Serhiy Storchaka in branch '3.5': New changeset cb864fc4b3be by Serhiy Storchaka in branch '3.6': New changeset 513852ad0c5c by Serhiy Storchaka in branch 'default': |
Unfortunately, there's no way too remove defined names from the DLL during Python 3 at all, except where the prototype was never provided. PyCFunction_New has always been a macro, and PySys_SetDefaultEncoding looks to have been removed before the API was committed. Maintaining an export and a macro is going to be awkward, so I think we should just change them back to functions. Any half-decent optimizing compiler is going to inline it within core, and it's better than having two implementations in different places of the same function. But yes I agree, fix the def first and then automate. But I don't want to fix it but adding things that were never usable. |
New changeset 2e5ad97c9c19 by Serhiy Storchaka in branch '3.5': New changeset 86a412584c02 by Serhiy Storchaka in branch '3.6': New changeset 3a2595f82447 by Serhiy Storchaka in branch 'default': |
Reawakening this issue so we can finish it off - the stable ABI has been declared "clean" on python-dev, so there shouldn't be any more concerns about APIs leaking in now. We can assume that everything accessible from the header files should be included in python3.def. If it has been cleaned up though, we might need to go through and restore anything that used to be in python3.def but wasn't meant to be. Those are stable. |
I plan to do this slightly differently, see PEP-652. |
The symbols exported by python3.dll are now generated from Misc/stable_abi.txt as per PEP-652. This is slightly more work than generating the list fully automatically, but the extra work is negligible compared to all the things you should think about when adding API that'll be be supported until Python 4.0. Headers are now checked on Unix (GCC) only. If anyone wants to work on a cross-platform C parser/checker, please let me know. |
PEP-652 (bpo-43795) is now in; I'm closing this issue. A cross-platform C parser/checker would still be an improvement, but the constraints (no external dependencies in CPython) and benefits (not many compared to the goal of strict separation of internal/public/stable headers) aren't really aligned in its favor. |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: