Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-108444: Add PyLong_AsInt() public function #108445

Merged
merged 4 commits into from
Aug 24, 2023
Merged

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Aug 24, 2023

  • Rename _PyLong_AsInt() to PyLong_AsInt().
  • Add documentation.
  • Add test.
  • For now, keep _PyLong_AsInt() as an alias to PyLong_AsInt().

📚 Documentation preview 📚: https://cpython-previews--108445.org.readthedocs.build/

@vstinner
Copy link
Member Author

@serhiy-storchaka: You added the function 10 years, what do you think of making it public now?

* Rename _PyLong_AsInt() to PyLong_AsInt().
* Add documentation.
* Add test.
* For now, keep _PyLong_AsInt() as an alias to PyLong_AsInt().
@serhiy-storchaka
Copy link
Member

@serhiy-storchaka: You added the function 10 years, what do you think of making it public now?

Just last week I thought about this. (int)PyLong_AsLong() is a common error, so we should make the correct way more easier.

Should we add also PyLong_AsIntAndOverflow() and PyLong_AsUnsignedInt(). I guess that PyLong_AsUnsignedIntMask() is not needed, as (unsigned)PyLong_AsUnsignedLongMask() does correct thing.

Should we add PyLong_AsShort() and others?

@vstinner
Copy link
Member Author

Having to check result == -1 && PyErr_Occurred() is a pain, and we are trying to avoid that. That's why the new PyDict_GetItemRef() return -1 on error, rather than returning directly the item.

Maybe the API should: int PyLong_AsInt(PyObject *obj, int **result): the Python int value is written into *result, return 0 on success, or return -1 on error (wrong type, overflow). If the fact that the name is too close to current function, maybe we use a different name: PyLong_ToInt(), PyLong_GetInt(), ...

Should we add PyLong_AsShort() and others?

I would prefer to not add too many functions. The int is common, I'm not sure about the short type. For example, there is no existing private function for short.

Copy link
Member

@serhiy-storchaka serhiy-storchaka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

with self.assertRaises(OverflowError):
PyLong_AsInt(INT_MAX + 1)

# invalid type
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add also tests for objects with __index__

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I added a test

@@ -0,0 +1,4 @@
Add :c:func:`PyLong_AsInt` function: similar to :c:func:`PyLong_AsLong`, but
store the result in a C :c:expr:`int` instead of a C :c:expr:`long`.
Previously, it was known as the the private function :c:func:`!_PyLong_AsInt`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Previously, it was known as the the private function :c:func:`!_PyLong_AsInt`
Previously, it was known as the private function :c:func:`!_PyLong_AsInt`

Doc/whatsnew/3.13.rst Outdated Show resolved Hide resolved
vstinner and others added 2 commits August 24, 2023 22:39
…OSS.rst

Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
Co-authored-by: Kirill Podoprigora <kirill.bast9@mail.ru>
@erlend-aasland
Copy link
Contributor

This API extends the list of problematic APIs listed in capi-workgroup/problems#1. How about a non-ambiguous return value?

@erlend-aasland
Copy link
Contributor

@serhiy-storchaka
Copy link
Member

_PyLong_AsInt() is used more than 420 times in the CPython code. For comparison, PyLong_AsLong() is used only about 160 times, and PyLong_AsLongLong() -- less than 25 times.

I suppose that the similar proportion should be in the third-party code. But they use PyLong_AsLong() and often ignore integer overflow in conversion to int. Simple replacing PyLong_AsLong() with PyLong_AsInt() in right places should fix a lot of bugs. Replacing it with a function with different interface is not so easy.

@erlend-aasland
Copy link
Contributor

_PyLong_AsInt() is used more than 420 times in the CPython code.

That number is true, but the majority of it comes from Argument Clinic. If you subtract generated code, you will see that the actual number is less then a tenth of 420:

$ git grep _PyLong_AsInt  | grep -v clinic | wc -l
36

I'm not sure a handful of cases warrants the introduction of yet another ambiguous API.

@serhiy-storchaka
Copy link
Member

That number is true, but the majority of it comes from Argument Clinic.

And what? Argument Clinic uses _PyLong_AsInt() much more that PyLong_AsLong() because conversion to int is much more wanted.

@erlend-aasland
Copy link
Contributor

And what? Argument Clinic uses _PyLong_AsInt() much more that PyLong_AsLong() because conversion to int is much more wanted.

The point is that there is not 420 occurrences of _PyLong_AsInt to replace, there are only 38; two in clinic.py, and the remaining 36 in various .c files.

You wrote:

_PyLong_AsInt() is used more than 420 times in the CPython code.

Replacing it with a function with different interface is not so easy.

I assumed you meant that it was not easy to use a different API because of the huge number of occurrences (420). My interpretation was that since the actual number of occurrences that need replacement is significantly lower, your argument of it being hard to replace is moot.

@serhiy-storchaka
Copy link
Member

No, I mean that you can expect a huge number of use cases for PyLong_AsInt() in hand written third-party code which currently uses PyLong_AsLong() with a chance of integer overflow. Replacing (int)PyLong_AsLong() with PyLong_AsInt() is much simpler that with a function with other interface.

@erlend-aasland
Copy link
Contributor

No, I mean that you can expect a huge number of use cases for PyLong_AsInt() in hand written third-party code which currently uses PyLong_AsLong() with a chance of integer overflow. Replacing (int)PyLong_AsLong() with PyLong_AsInt() is much simpler that with a function with other interface.

Aha, sorry, I misunderstood you. Yes, that is a valid point. I withdraw my complain.

@vstinner
Copy link
Member Author

This API extends the list of problematic APIs listed in capi-workgroup/problems#1. How about a non-ambiguous return value?

I'm well aware of PyLong_AsLong() API issue. But if we fix the API, I would prefer to fix the whole PyLong API family, not a single function. It would be surprising to have a single function with a similar name, but a different API.

By the way, the private _PyLong_AsInt() function is used by these projects:

Affected projects (9):

  • catboost (1.2)
  • fastobo (0.12.2)
  • gnureadline (8.1.2)
  • multiprocess (0.70.14)
  • numpy (1.25.0)
  • orjson (3.9.1)
  • pythonnet (3.0.1)
  • pyzstd (0.15.9)
  • typed_ast (1.5.5)

@vstinner vstinner merged commit be436e0 into python:main Aug 24, 2023
22 checks passed
@bedevere-bot
Copy link

There's a new commit after the PR has been approved.

@serhiy-storchaka: please review the changes made to this pull request.

@erlend-aasland
Copy link
Contributor

But if we fix the API, I would prefer to fix the whole PyLong API family, not a single function.

Yes, that makes sense.

@vstinner
Copy link
Member Author

The function is more if it's worth it to propose a new set of PyLong_AsLong-like function which avoid having to check PyErr_Occurred(): API like int PyLong_GetLong(PyObject *obj, long **result) that I proposed above.

@erlend-aasland
Copy link
Contributor

... or even better: int PyInt_GetLong(...), so it is named after the Python type.

@encukou
Copy link
Member

encukou commented Aug 31, 2023

I don't think this should be added to the stable ABI now.
If we do fix the whole PyLong API family, this is one more function that will need to be deprecated shortly after it was added.

@vstinner
Copy link
Member Author

I don't think this should be added to the stable ABI now. If we do fix the whole PyLong API family, this is one more function that will need to be deprecated shortly after it was added.

The private _PyLong_AsInt() function seems to be pretty commonly used in the wild, so adding it as public API helps people to migrate to this new public C API.

I would like to add it the limited C API to be able to use it in C code generated by Argument Clinic which seems to like a lot PyLong_AsInt(). For example, in the main branch, there are 70 .c.h files generated by AC which calls PyLong_AsInt().

For now, there is no concrete plan to deprecate PyLong_AsLong() nor replacement.

@serhiy-storchaka
Copy link
Member

Is there a way to gather some reliable statistics about non-public functions used in third-party code? If they are used in a number of non-toy projects, we should consider to provide a public API to satisfy the need.

@encukou
Copy link
Member

encukou commented Aug 31, 2023

If they are used in a number of non-toy projects, we should consider to provide a public API to satisfy the need.

I'm on board with that. But we should also use these opportunities to provide better API. Private functions can get away with being hacky.

@vstinner
Copy link
Member Author

I'm on board with that. But we should also use these opportunities to provide better API.

Known issues of PyLong_AsLong() were discussed in previous comments and it was decided to add PyLong_AsInt() as it is anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants