Make implicit returns explicit in longobject.c (in CHECK_SMALL_INT) #81993
In longobject.c we have the following usage a few times:

```c
PyObject *
PyLong_FromLong(long ival)
{
    PyLongObject *v;
    // ... more locals ...

    CHECK_SMALL_INT(ival);

    if (ival < 0) {
        /* negate: can't write this as abs_ival = -ival since that
           invokes undefined behaviour when ival is LONG_MIN */
        abs_ival = 0U-(unsigned long)ival;
        sign = -1;
    }
    else {
        // ... etc. etc.
```

The CHECK_SMALL_INT macro contains a return statement:

```c
#define CHECK_SMALL_INT(ival) \
    do if (-NSMALLNEGINTS <= ival && ival < NSMALLPOSINTS) { \
        return get_small_int((sdigit)ival); \
    } while(0)
```

That's not even an error condition -- in fact it's the fast, hopefully reasonably-common, path. An implicit return like this is pretty surprising for the reader. And it only takes one more line (plus a close-brace) to make it explicit:

```c
if (IS_SMALL_INT(ival)) {
    return get_small_int((sdigit)ival);
}
```

so that seems like a much better trade. Patch written, will post shortly.
|
I've just sent #59408, which is the first of two patches for this. It's quite small. The second patch, which completes the change, is also small. It depends on the first one, so I think the easiest is for me to send it as a PR after the first is merged? Happy to do another workflow if that'd make review easier. (I don't see guidance in the devguide on handling a sequence of patches, so I'm making this up.) |
Thanks for the contribution Greg. I understand your reasoning but am going to decline PR 15203. Though the feature is not experimental, it is something we want to be able to modify, shut off, or extend at build time. Sometimes for testing we turn it off in order to identify identity-test bugs. Also, EVE Online was controlling this to save memory. |
P.S. Making the returns explicit is a reasonable change to make, but the core #ifdef logic should remain. |
Interesting! Well, if the option is useful for testing, that's certainly a …

Also interesting. I feel like there must be something about their setup … Or do you mean they were increasing the numbers? One thought I had looking …

I'll rework the explicit-return patch to send without this change. It'll be …
|
An explicit return would be a nice improvement. On the other hand, if it is tricky and requires something more than minor surgery, that would be a hint that it isn't worth it. There is some value in code that is stable and known to be working just fine. See: https://en.wikipedia.org/wiki/There_are_known_knowns |
Definitely true! I think this change turned out quite clean, though. Just sent, as #59421 -- please take a look. |
How about using a hybrid implementation for int? For example, on a 64-bit platform, if an integer is >= -9223372036854775808 and <= 9223372036854775807, use a native 64-bit integer to represent it. People mostly use + - * operations; maybe using a native int is faster, even including the cost of overflow checking. If an operation would overflow, or for other operations like **, we can convert the native int to the current form and run the current code path.
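To make the idea concrete, here is a minimal sketch of such a hybrid ("tagged") representation, with entirely hypothetical names and layout -- this is not CPython's actual PyLongObject, just an illustration of a native fast path with an overflow fallback:

```c
/* Hypothetical hybrid integer: small values use a native 64-bit field,
 * and only values that overflow fall back to a digit-array bignum. */
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    bool is_native;            /* true: value lives in v.native */
    union {
        int64_t native;        /* fast path */
        uint32_t *digits;      /* slow path: arbitrary precision */
    } v;
} HybridInt;

/* Try the fast-path addition; on overflow (or a bignum operand) the
 * caller converts to the digit form and runs the existing code path.
 * __builtin_add_overflow is a GCC/Clang extension. */
static bool
hybrid_add_fast(const HybridInt *a, const HybridInt *b, int64_t *out)
{
    if (!a->is_native || !b->is_native) {
        return false;
    }
    return !__builtin_add_overflow(a->v.native, b->v.native, out);
}
```
|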
Ma Lin: Everything in Python is an object, nothing is native. There is room to reconsider the implementation of integer objects, known internally as PyLong objects. If you want to tease out an alternative implementation, please do so on the python-ideas mailing list. Small integer operations (including indexing) used to be smaller and faster in Python 2.7, so there is likely some room for improvement on what we have now. |
I sent it twice, but it doesn't appear on the Python-Ideas list. |
python-dev likely isn't the right place. That is a forum for more mature ideas. |
Thanks, Raymond, for the review on #59421!

Shortly after posting this issue, I noticed a very similar story in CHECK_BINOP. I've just posted #59653 to similarly make returns explicit there. It basically consists of a number of repetitions of

```diff
- CHECK_BINOP(self, other);
+ if (!PyLong_Check(self) || !PyLong_Check(other)) {
+     Py_RETURN_NOTIMPLEMENTED;
+ }
```

with the names …
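For context, the CHECK_BINOP macro being replaced looks roughly like this (quoted from memory of longobject.c, so treat the exact spelling as approximate) -- note that, like CHECK_SMALL_INT, it can return from the enclosing function:

```c
/* Approximate form: returns NotImplemented from the *enclosing*
 * function when either operand is not an int. */
#define CHECK_BINOP(v, w)                           \
    do {                                            \
        if (!PyLong_Check(v) || !PyLong_Check(w))   \
            Py_RETURN_NOTIMPLEMENTED;               \
    } while (0)
```
|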
Agreed, that idea should be posted to python-list if they're having issues posting to python-ideas. Python-dev certainly wouldn't be appropriate for that. |
How about we don't do any more of these :-) I understand the desire to have an explicit return but don't think these minor aesthetics are worth it.

The existing code was correct. The patches increase the total code volume. There are subtle differences during the transformation (macros don't require type declarations). They are a bit tedious to review and are eating up our time in the back and forth. All changes like this risk introducing bugs into otherwise correct code. Minor code churn is rarely worth it.

May I suggest directing your efforts towards fixing known bugs or implementing requested features. It isn't our goal to create more work for one another ;-) |
There frequently is value in improving code readability, as it can improve maintainability in the long term, but I definitely understand your point. |
Well, I would certainly be grateful for a review on my fix to bpo-18236. ;-) There's also a small docs bug at #59506.

I do think there's significant value in making code easier to read and less tricky. If the project continues to be successful for a long time to come, then that means the code will be read many, many more times than it's written.

But one particular spot where it seems our experiences interestingly differ is:

> They are a bit tedious to review and are eating up our time in the back and forth.

As a reviewer I generally find it much less work to review a change when it's intended to have no effect on the code's behavior. First, because it's easier to confirm no effect than to pin down what the effects are; then because the whole set of questions about whether the effects are desirable doesn't arise. As a result I often ask contributors (to Zulip, say) to split a change into a series of small pure refactors, followed by a very focused diff for the behavior change. So that's certainly background to my sending as many PRs that don't change any behavior as PRs that do.

I actually have quite a number of draft changes built up over the last few weeks. I've held back on sending them all at once, partly because I've felt I have enough open PRs and I wanted to get a better sense of how reviews go. Perhaps I'll go pick out a couple more of them that are bugfixes, features, and docs to send next. (You didn't mention docs just now, but given the care I see you take in adding to them and in revising What's New, I think we agree that work there is valuable.) |
Sadly, #59915 was merged without mentioning bpo-37812 (this issue) in the final commit message :-(

commit 6b51998

This second change introduced a regression: bpo-38205 "Python no longer compiles without small integer singletons". I don't understand the whole issue. A first change converted a macro to a static inline function. The second change converted the static inline function to a macro... but the overall change introduced a regression. Why not revert the first change instead of pushing a new change? Moreover, if using a static inline function is causing issues, it would be nice to add a comment to explain why, so the issue will be avoided in the future. |
Victor, I agree that both changes should be reverted. We should avoid churning long-stable, bug-free code for minimal aesthetic value. |
Thanks Victor for linking that issue back here.

Not quite. The first change converted a macro into ordinary code at each call site:

```c
if (is_small_int(ival)) {
    return get_small_int((sdigit)ival);
}
```

The second change converted that static inline function into a macro. The second change also changed an …

The change that caused the build failure you're seeing (GH-15710) was intended for bpo-38015. Details there; briefly, common compilers on x86_32 would emit some unsightly extra instructions with …

It's not clear to me if anyone benchmarked to see if the conversion to a macro had any measurable performance benefit. There's some measurement here: …

So I think the short of it is that the static inline function was not causing any issue, and an easy fix for the build failure is to revert #59915. Alternatively, another easy fix would be to replace just the … Either way, I don't think there's any reason here to revert #59421. |
I tested on that day, also using this command:

```
python.exe -m pyperf timeit -s "from collections import deque; consume = deque(maxlen=0).extend; r = range(256)" "consume(r)" --duplicate=1000
```

I remember the results are: …

Since the difference is too obvious, I tested it only once for each version. |
Sorry Greg, I know you're enthusiastic about this but I think both changes were a mistake. As the person who merged them, I've decided to revert them in favor of stable code (FWIW, I do care about Ma Lin's concerns in the other BPO and I don't want the generated code to be any different than it was). We've wasted a lot of dev time on something that never promised much value in the first place. So, I'll revert and close this tracker issue so we can move on to something more substantive. |
Separately from whether there was or wasn't such an issue here, I think it's interesting to note that the build failure bpo-38205 is an example of exactly the opposite! It was caused by a combination of …

And both of them led to that build failure in classic ways. Specifically, the build failure happened because (a) this …, and (b) the call was behind an …

When conditional preprocessing is kept to a minimum -- here, if the … |
OK, so be it. I'll certainly agree that this series of threads consumed a lot more time than I anticipated or hoped, though I think we have different analyses of why that was. (As you can probably guess, my previous message crossed with yours.) |
Raymond: …

I'm one of the first to advocate replacing ugly macros with clean static inline functions. Macros are evil and can be too easily misused. On PR 15216, Benjamin and Raymond advised to replace the macro with a function.

I'm also a believer that one day, we could replace duplicated C functions accepting variants of C "integer" types with a single larger type: …

Sadly, it seems like such a change has a small performance loss, and so cannot be done in "performance critical" code. I consider that longobject.c is such performance-critical code.

--

I converted some macros of the Python C API to static inline functions, like Py_INCREF() or _PyObject_GC_TRACK(). Compared to macros, static inline functions have multiple advantages: …

--

Again, I advise adding a small comment in the longobject.c macros to explain that converting them to functions would lose performance.
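As a generic illustration of one of those advantages (made-up names, not code from CPython): a macro substitutes its argument textually, possibly evaluating it more than once, while a static inline function evaluates it exactly once and gives it a real type:

```c
/* Classic double-evaluation hazard. */
#define SQUARE_MACRO(x) ((x) * (x))

static inline int
square_inline(int x)
{
    return x * x;  /* x has a checked type and is evaluated once */
}

/* With int i = 3:
 *   SQUARE_MACRO(i++)  expands to ((i++) * (i++)): undefined behavior;
 *   square_inline(i++) increments i once and returns 9. */
```
|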
When small integers are disabled at compile time (NSMALLPOSINTS=0 and NSMALLNEGINTS=0), I suggest removing the code related to small integers. IMHO the CHECK_SMALL_INT() macro is a good practical solution for that, so I suggest restoring this macro.
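A sketch of what this suggests (assuming the macro shape quoted earlier in this thread; not the actual longobject.c text): guard the macro itself with the preprocessor so the check vanishes entirely when the small-int cache is compiled out:

```c
#if NSMALLNEGINTS + NSMALLPOSINTS > 0
#define CHECK_SMALL_INT(ival) \
    do { \
        if (-NSMALLNEGINTS <= (ival) && (ival) < NSMALLPOSINTS) { \
            return get_small_int((sdigit)(ival)); \
        } \
    } while (0)
#else
#define CHECK_SMALL_INT(ival)  /* no small-int cache: nothing to check */
#endif
```
|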
As someone who has only more recently started learning the C-API (and C in general), I'm certainly in favor of replacing macros with functions when possible. I might be a bit biased, but it definitely makes the code a lot easier to understand (especially the macros with implicit returns). As long as there's not a significant performance loss and it doesn't introduce new complications, I don't see an issue with it. From my understanding, readability isn't as high of a priority in the C code, but certainly there's some benefit to improving areas that are more difficult to understand. The easier the code is to understand, the lower the overall maintenance cost becomes. Of course, this should certainly be done in moderation to reduce the introduction of unnecessary new bugs and the cost in reviewer time for more pressing concerns. |
I hesitate to come back to this thread, because as Raymond says it's consumed a lot of time already. But I think this point is of broader interest than the specific few lines we've been discussing:

> IMHO the CHECK_SMALL_INT() macro is a good practical solution for that, so I suggest restoring this macro.
|
Victor, can you be more specific about the problem this is solving? I think the existing code in master really doesn't leave any problems in place that CHECK_SMALL_INT solves.

In master (as of 2702638), here's the code that CHECK_SMALL_INT would replace:

```c
if (IS_SMALL_INT(ival)) {
    return get_small_int((sdigit)ival);
}
```

When NSMALLPOSINTS=0 and NSMALLNEGINTS=0, this expands in the preprocessor to the equivalent of:

```c
if (0) {
    return (Py_UNREACHABLE(), NULL);
}
```

(Specifically, it expands to whatever that expands to, since Py_UNREACHABLE and possibly NULL are also macros.)

A compiler that's attempting to do any significant optimization at all has to be able to discard code that's inside `if (0)`. (And at a quick empirical sanity-check: gcc, clang, and msvc all do so at any optimization setting other than "disabled". In fact gcc and clang do so even with optimizations disabled.)

You made the case very nicely above that macros are evil. The IS_SMALL_INT macro is fairly mild, and it has a performance benefit on some platforms to justify it. But CHECK_SMALL_INT causes a return from the enclosing function, which seems quite high on the "evil macro" scale.
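This is easy to sanity-check outside of CPython. A hypothetical standalone file: compile with `gcc -O1 -S deadcode.c` (or clang) and the call to `expensive()` does not appear in the generated assembly:

```c
/* deadcode.c -- demonstrates that an `if (0)` branch is discarded. */
int expensive(void);   /* never linked; the call must be eliminated */

int
f(int ival)
{
    if (0) {                   /* same shape as the disabled fast path */
        return expensive();
    }
    return ival + 1;
}
```
|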
I'm unassigning this. It has been blown profoundly out of proportion. The recommendation of the senior-most developer is being ignored, and now two developers are speaking in terms of strong belief systems and labeling long-stable code as "evil". This doesn't bode well and it makes it difficult to conduct reasoned discourse. (Meanwhile, we have another long-term active developer who routinely adds new macros to existing code because he thinks that is an improvement. That portends a pointless tug-of-war predicated almost solely on not liking how other people code). |
Recent commits for longobject.c: …

The concern for this issue is: implicit return from macro. Then this commit is not necessary.

This commit introduced a compiler warning due to this line [1].

IMO this commit reduces readability a bit.

We can sort out these problems.

[1] the line: …
|
Sorry, but there was nothing wrong with the CHECK_SMALL_INT macro, to my eyes, to begin with - except that it was burdened with an over-elaborate "do ... while(0)" wrapper. Not all macros are _intended_ to be "cheap functions". Like this one, some are intended to be merely textual replacement, to minimize tedious and error-prone redundant typing, in a specific file. Even "bulletproofing" them with tricks like the "do ... while(0)" cruft is counterproductive in such cases. They're not _intended_ to capture abstractions, nor to be for general use. This one was intended to check its argument and possibly return, exactly what its expansion did (after mentally tossing out the distracting "do while" noise). Nobody competent to address logic errors in longobject.c was confused about that in the slightest. Knowing the details of how small integers may be cached is part of what "competent" means in this context, and what the macro does is so simple it's no burden to learn or to use. Is there an "aesthetic code cleanup" patch that's ever turned out well? ;-)
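For readers who haven't met the idiom Tim mentions: the usual purpose of the `do ... while(0)` wrapper is to make a multi-statement macro parse as a single statement, as in this generic example (hypothetical names, not from longobject.c):

```c
/* Without the wrapper, only the first statement is governed by the if,
 * and a declaration as an unbraced if-body doesn't even compile. */
#define SWAP_BAD(a, b)  int tmp = (a); (a) = (b); (b) = tmp
#define SWAP_OK(a, b)   do { int tmp = (a); (a) = (b); (b) = tmp; } while (0)

void
demo(int *x, int *y, int cond)
{
    if (cond)
        SWAP_OK(*x, *y);   /* one statement; also safe before an else */
}
```
|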
Let me apologize about this bit -- I was thinking of the term in quotes (referring to the earlier comment), but I didn't write it clearly that way. I don't think any of this code is evil, past or present, and I regret that I carelessly gave that impression. I think there isn't currently any problem to be solved here, and I hope we can all put our energy elsewhere. (For my part, I just spent the balance of the evening implementing an old feature request on unicodedata.) Victor, I'd still be quite interested in better understanding your thinking (as this example relates to macros in general) if you'd like to reply, though this thread may not be the right place to continue. You can find my email in Git, and I'm on Zulip and Discourse; and I'd be happy to start or follow a thread in a forum you think appropriate. Or if you'd rather drop it entirely, that's fine too. |
I think opening a thread in https://discuss.python.org/c/users to talk about deciding between the usage of functions and macros (discussing when each may be appropriate) would be beneficial to the community at large.
Apologies if I added to that; I certainly respect your judgement on this issue. Pretty much everyone involved in this discussion has more experience in working with the CPython C-API than I do, you most of all (as far as I'm aware). My perspective was coming from someone attempting to understand it better, and explaining how those learning the C-API might find it confusing. I don't find the code itself to be at all "evil", just potentially confusing. That long-stable code has provided a lot of benefit over the years, and I can definitely appreciate the effort that was put into writing it. |
From what I can tell, any remotely extensive aesthetic code patch I've seen has been highly controversial. I think they can have value in some cases, but the value relative to the potential cost may have been a bit overstated in this particular case. |