8297963: Partially fix string expansion issues in UTIL_DEFUN_NAMED and related macros #11458

Closed
wants to merge 9 commits into from

TheShermanTanker
Contributor

@TheShermanTanker TheShermanTanker commented Dec 1, 2022

The UTIL macros have several flaws that are not apparent at first glance. The worst are that commas are parsed as actual argument separators for the underlying m4 macros, and that shell constructs get completely, if very subtly, wrecked. I've done my best to fix some of these issues in this commit so that others don't end up suffering like I did while trying to implement 8296478. Most notably, DESC and CHECKING_MSG should now work correctly when their description strings contain a comma. (Other named arguments do not have this fix yet; it was simply far too tedious, but it should be straightforward to fix those as well after this commit.) I've also fixed several bugs that seem to have flown under the radar, such as UTIL_DEFUN_NAMED always adding an extra space between the parameter name and its value, even when not required. The majority of the fix involved UTIL_DEFUN_NAMED. However, while it now properly passes the arguments it receives along to the macros that use it to implement their logic, those macros still need to be mindful of macro expansions inside their own bodies.
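
To illustrate the comma problem, here is a minimal sketch (not code from this PR), assuming GNU m4 run as `m4 -P` with autoconf-style [ ] quotes. NARGS is a hypothetical helper that only reports how many arguments it received, and DESC is a plain macro standing in for a description string:

```m4
m4_changequote([,])m4_dnl
m4_dnl Hypothetical helper: expands to the number of arguments it was given.
m4_define([NARGS], [$#])m4_dnl
m4_dnl Stand-in for a description string that happens to contain a comma.
m4_define([DESC], [enable foo, bar and baz])m4_dnl
unquoted: NARGS(DESC)
quoted:   NARGS([DESC])
```

The unquoted call should report 2, because DESC is expanded while the arguments are being collected and its comma is then read as a separator. The quoted call should report 1. Quoting only defers the expansion, so the receiving macro still has to quote the value wherever it uses it.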

A couple of limitations from the original implementation remain. You still cannot use a parameter specifier inside its own block: for instance, IF_GIVEN: [IF_GIVEN: []] would not work properly (IF_GIVEN: [IF_GIVEN []], without the colon in the inner string, will). Additionally, if no space is given between the parameter and its value (e.g. DEFAULT:[] instead of the usual DEFAULT: []), the processing that UTIL_DEFUN_NAMED performs to correct this will still interfere with shell code that relies on the : token. Unfortunately, despite my best efforts, I was unable to properly resolve these last two limitations.


Progress

  • Change must be properly reviewed (1 review required, with at least 1 Reviewer)
  • Change must not contain extraneous whitespace
  • Commit message must refer to an issue

Issue

  • JDK-8297963: Partially fix string expansion issues in UTIL_DEFUN_NAMED and related macros

Reviewers

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk pull/11458/head:pull/11458
$ git checkout pull/11458

Update a local copy of the PR:
$ git checkout pull/11458
$ git pull https://git.openjdk.org/jdk pull/11458/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 11458

View PR using the GUI difftool:
$ git pr show -t 11458

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/11458.diff

@bridgekeeper

bridgekeeper bot commented Dec 1, 2022

👋 Welcome back jwaters! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

@openjdk openjdk bot changed the title 8297963 8297963: Partially fix string expansion issues in the autoconf UTIL macros Dec 1, 2022
@openjdk openjdk bot added the rfr Pull request is ready for review label Dec 1, 2022
@openjdk

openjdk bot commented Dec 1, 2022

@TheShermanTanker The following label will be automatically applied to this pull request:

  • build

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

@openjdk openjdk bot added the build build-dev@openjdk.org label Dec 1, 2022
@mlbridge

mlbridge bot commented Dec 1, 2022

# WARNING: Underneath this comment, you will find actual demons in the foreach block below waiting to rip your very soul out and destroy everything you hold dear
# Proceed at the risk of your own sanity, and don't say I didn't warn you when you inevitably suffer the same torment that I went through should you decide
# that you need to dive into it to fix anything
# ~Julian
Member

🤣 While I do appreciate humor and a bit of personal touch in comments (and I fully sympathize with your description!), this feels a bit over the top. Can you tone it down a bit and make it slightly more professional (it is okay to keep a bit of the sentimentality I think, to fully make the point on how difficult it is to modify).

Contributor Author

Haha, sorry for letting the pent up bitterness overflow into the comments, I'll do just that. I've lost track of how much coffee and milk I've consumed while trying to get this to work...
Probably enough to shorten my lifespan by several decades at this point, if I'm being honest

Member

I made like two or three false starts at even getting this macro to work in the first place, after thinking "oh, that would be a good idea", trying different solutions, thinking "nah, it is impossible" to coming back to it later again with a new approach. And even with the correct approach it was basically a lot of trial and horror of permutating different m4 keywords until you got the wanted result... So I fully understand what you have been going through. At least you can call yourself "hard-core m4 programmer" now as a result. 🎖️

m4_if(ARG_IF_GIVEN, , [m4_define([ARG_IF_GIVEN], [:])])
m4_if(ARG_IF_NOT_GIVEN, , [m4_define([ARG_IF_NOT_GIVEN], [:])])
m4_if(ARG_IF_ENABLED, , [m4_define([ARG_IF_ENABLED], [:])])
m4_if(ARG_IF_DISABLED, , [m4_define([ARG_IF_DISABLED], [:])])
Member

This definitely looks cleaner. Was it needed to get things to work, or is it just a clean-up? (I'm just curious.)

Contributor Author

The former. The code blocks inside ARG_IF_GIVEN and similar were getting expanded multiple times prior to the change, which was wreaking havoc on any shell code they contained. I ultimately solved that by deliberately avoiding their redefinition if they were not empty. (m4_define needs to be quoted here because it would otherwise expand and redefine the macro regardless of whether the check actually passed!)
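
The point about quoting m4_define can be reproduced outside the build. This is a rough sketch, assuming GNU m4 run as `m4 -P`; m4_if is re-created locally as an alias for the m4_ifelse builtin (a stand-in for the m4sugar macro of the same name), and the "user block" value is made up for the demonstration:

```m4
m4_changequote([,])m4_dnl
m4_dnl Local stand-in: make m4_if an alias for the m4_ifelse builtin.
m4_define([m4_if], m4_defn([m4_ifelse]))m4_dnl
m4_define([ARG_IF_GIVEN], [echo "user block"])m4_dnl
m4_dnl Quoted: the redefinition is only expanded if the emptiness test passes.
m4_if(ARG_IF_GIVEN, , [m4_define([ARG_IF_GIVEN], [:])])m4_dnl
quoted:   ARG_IF_GIVEN
m4_dnl Unquoted: m4_define runs while m4_if is still collecting its arguments.
m4_if(ARG_IF_GIVEN, , m4_define([ARG_IF_GIVEN], [:]))m4_dnl
unquoted: ARG_IF_GIVEN
```

The first output line should still show the user block, while the second should show a bare : because the unquoted m4_define has already overwritten the macro, even though the emptiness test failed.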

Member

Ah, I see. Of course. The value was reassigned, losing the quoting in the process.

# Proceed at the risk of your own sanity, and don't say I didn't warn you when you inevitably suffer the same torment that I went through should you decide
# that you need to dive into it to fix anything
# ~Julian
m4_foreach([arg], m4_dquote(m4_dquote_elt($3)), [
Member

I actually had to check up what m4_dquote does. It is "meant for hard-core M4 programmers.". No sh*t...

I think your changes are okay, but it's hard to say. I can't read these kinds of m4 incantations fluently. I assume you have tested this thoroughly?

My only worry here is that you seem to be adding many layers of quoting (two cases of nested m4_dquote, each of which adds "double" quoting) -- will m4 really be able to properly "dequote" this, so there won't be some superfluous [ ... ] suddenly, where there shouldn't be?

Contributor Author

I tested this quite extensively before the final draft of this change and can verify that no extra quotes are added. I can also explain what each level of quoting does, so you don't have to go through the absolute nightmare of deciphering them like I did:

$3 is initially a comma-separated list of quoted arguments. The m4_dquote_elt call wraps each one in another layer of quoting, which prevents the m4_dquote immediately after it from expanding any of them. When called on a comma-separated list, m4_dquote acts the same as [$3], but each element in the list now has 2 extra layers of quoting, which is what we want.

The foreach call now immediately strips one level of quoting away from each element. This can be confirmed with an m4_dumpdef in the first line of the foreach code block. So far so good.

Now comes the check for whether the parameter name was correctly separated from its value. If it was, our arg still has 1 extra layer of quoting, which is good. If not, arg is nested inside another m4_dquote before the patsubst: it immediately expands and loses that quoted layer, which the call to m4_dquote immediately restores. It then loses that layer again in patsubst, so another m4_dquote is invoked on it to get it back. All in all, whichever path was taken, we still have 1 extra layer of quoting, which helps us breeze through the macro expansions afterwards until m4_pushdef is encountered.

Again, arg loses the extra layer of quoting and immediately has it restored by m4_dquote, bringing it back to 1 extra layer. A second m4_dquote after this brings it to 2 extra layers. Perfect! The 2 m4_bpatsubst calls after this use up exactly 2 layers of quoting, and just like that, we've passed the exact value we received into the ARG_ macros with the exact same level of quoting, as if it had never been touched at all.

A consequence of this (I don't know if you'd consider it good or bad) is that nested calls to macros implemented with UTIL_DEFUN_NAMED no longer need to quote their ARG_ parameters to the function being called, because macro expansions are now exact. (This is reflected in the last push I made to this branch, in flags.m4, FLAGS_COMPILER_CHECK_ARGUMENTS.)

Contrary to the name, m4_dquote, if used safely, only adds one additional layer of quoting per call, which was the really infuriating part (m4_quote ignores quoted arguments entirely, forcing you to use m4_dquote instead!).

One of the changes I committed to this branch just now removes a call to m4_dumpdef([ARG_][]arg_name) just after the m4_pushdef, which I had forgotten about and left in. That call is what I had been using to verify that the values passed to the ARG_ macros are ultimately correct.
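
The "one layer per m4_dquote call" behaviour can be checked in isolation. This is only a sketch under stated assumptions, not the code in UTIL_DEFUN_NAMED: it assumes GNU m4 run as `m4 -P`, re-creates m4_dquote locally using the same definition m4sugar gives it, and uses two hypothetical helpers, NARGS and EXPAND_ONCE, to make the remaining quoting visible:

```m4
m4_changequote([,])m4_dnl
m4_dnl Local stand-in for m4sugar's m4_dquote (same definition as upstream).
m4_define([m4_dquote], [[$@]])m4_dnl
m4_dnl Hypothetical helpers: NARGS counts its arguments, and EXPAND_ONCE
m4_dnl passes its argument through one more round of expansion first.
m4_define([NARGS], [$#])m4_dnl
m4_define([EXPAND_ONCE], [NARGS($1)])m4_dnl
one call:  EXPAND_ONCE(m4_dquote(x, y))
two calls: EXPAND_ONCE(m4_dquote(m4_dquote(x, y)))
```

With a single m4_dquote the list should fall apart into 2 arguments after the extra expansion, while with two nested calls it should still arrive as 1. Each additional call buys exactly one more round of expansion, which is why the number of m4_dquote calls has to match the number of expansions that follow.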

Member

Thank you! That was a very comprehensive explanation. I think that (rather than the more poetically apt soul-eating demon story) would actually be helpful to have in the comments. But since it is quite long, maybe you can just add a pointer to the canonicalized URL to this post?

# For details on how this works, see https://git.openjdk.org/jdk/pull/11458#discussion_r1038173051

or something like that.

@magicus
Member

magicus commented Dec 2, 2022

And maybe rename the issue Partially fix string expansion issues in UTIL_DEFUN_NAMED?

@TheShermanTanker
Contributor Author

> And maybe rename the issue Partially fix string expansion issues in UTIL_DEFUN_NAMED?

I did change a few lines outside of UTIL_DEFUN_NAMED though, that new issue name might not be entirely accurate

@magicus
Member

magicus commented Dec 2, 2022

> I did change a few lines outside of UTIL_DEFUN_NAMED though, that new issue name might not be entirely accurate

Yes, but that is just removing quirks that are not needed now that UTIL_DEFUN_NAMED is fixed. The essence of the PR is to fix that macro.

@TheShermanTanker TheShermanTanker changed the title 8297963: Partially fix string expansion issues in the autoconf UTIL macros 8297963: Partially fix string expansion issues in UTIL_DEFUN_NAMED and related macros Dec 3, 2022
@TheShermanTanker
Contributor Author

Ah, I see what you mean, have renamed the title of both accordingly

@TheShermanTanker
Contributor Author

> > I did change a few lines outside of UTIL_DEFUN_NAMED though, that new issue name might not be entirely accurate
>
> Yes, but that is just removing quirks that are not needed now that UTIL_DEFUN_NAMED is fixed. The essence of the PR is to fix that macro.

Out of curiosity, was the quoting of ARG_PREFIX and other ARG_ macros when passing them as arguments to other macros implemented with UTIL_DEFUN_NAMED an unwanted workaround to stop them from expanding before this change?

@magicus
Member

magicus commented Dec 5, 2022

> Out of curiosity, was the quoting of ARG_PREFIX and other ARG_ macros when passing them as arguments to other macros implemented with UTIL_DEFUN_NAMED an unwanted workaround to stop them from expanding before this change?

Most likely. I imagine I tested around a bit until I got it to work. :)

As you have noted, keeping track of the proper level of m4 quoting is quite mind-killing, so we have basically not done that; we just add or remove quoting until things start working again when something breaks. In most places, m4 is quite tolerant of both too much and too little quoting.

Member

@magicus magicus left a comment

Looks fine now. Thank you for doing this!

@openjdk

openjdk bot commented Dec 5, 2022

@TheShermanTanker This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8297963: Partially fix string expansion issues in UTIL_DEFUN_NAMED and related macros

Reviewed-by: ihse

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 4 new commits pushed to the master branch:

  • 2300ed4: 8291769: Translation of switch with record patterns could be improved
  • eab0ada: 8296545: C2 Blackholes should allow load optimizations
  • dea2161: 8297959: Provide better descriptions for some Operating System JFR events
  • d523d9d: 8297864: Dead code elimination

Please see this link for an up-to-date comparison between the source branch of this pull request and the master branch.
As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

@openjdk openjdk bot added the ready Pull request is ready to be integrated label Dec 5, 2022
@TheShermanTanker
Contributor Author

> Looks fine now. Thank you for doing this!

No worries! And now I need to obligatorily scream into the void before integrating this

@TheShermanTanker
Contributor Author

WHOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO

@TheShermanTanker
Contributor Author

/integrate

@openjdk

openjdk bot commented Dec 6, 2022

Going to push as commit f8f4630.
Since your change was applied there have been 29 commits pushed to the master branch:

  • 2a243a3: 8267617: Certificate's IP x509 NameConstraints raises ArrayIndexOutOfBoundsException
  • 923c746: 8298057: (fs) Remove PollingWatchService.POLLING_INIT_DELAY
  • 0bd04a6: 8297951: C2: Create skeleton predicates for all If nodes in loop predication
  • f5ad515: 8297247: Add GarbageCollectorMXBean for Remark and Cleanup pause time in G1
  • e975418: 8298102: Remove DirtyCardToOopClosure::_last_explicit_min_done
  • 04012c4: 8298111: Cleanups after UseMallocOnly removal
  • ee9ba74: 8295184: Printing messages with a RecordComponentElement does not include position
  • ba2d28e: 8298027: Remove SCCS id's from awt jtreg tests
  • 8d8a28f: 8296489: tools/jpackage/windows/WinL10nTest.java fails with timeout
  • 884b9ad: 8293453: tools/jpackage/share/AddLShortcutTest.java "Failed: Check the number of mismatched pixels [1024] of [1024] is < [0.100000] threshold"
  • ... and 19 more: https://git.openjdk.org/jdk/compare/777fb52ef5b0d95b756ce4fa71a7ddf2d7d2a8f1...master

Your commit was automatically rebased without conflicts.

@openjdk openjdk bot added the integrated Pull request has been integrated label Dec 6, 2022
@openjdk openjdk bot closed this Dec 6, 2022
@openjdk openjdk bot removed ready Pull request is ready to be integrated rfr Pull request is ready for review labels Dec 6, 2022
@openjdk

openjdk bot commented Dec 6, 2022

@TheShermanTanker Pushed as commit f8f4630.

💡 You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

@TheShermanTanker TheShermanTanker deleted the patch-2 branch December 6, 2022 08:45
Labels
build build-dev@openjdk.org integrated Pull request has been integrated
2 participants