Proposal: support multiple formats for syntax directives #2937

jedevc · 2022-06-29T14:19:51Z

See #2812 for context, and #2904 for the related proposal regarding compatibility with shebang lines in Dockerfiles.

This proposal is presented as code to give a clearer view of what an implementation may look like, however, this is unlikely to be the final implementation. The code introduces basic support for two additional ways of syntax, so that the full list of possibilities is:

#syntax=foo, for script-style comments supported in most scripting languages, and Dockerfiles
//syntax=foo, for C-style comments supported in C/C++ and javascript/typescript
{"syntax": "foo", ...}, for JSON documents

(Note that XML support isn't included, and that JSON document support is included as a demonstration of how we could expand this support in the future).

The code is slightly more complex than it appears it needs to be - this is to handle a couple of constraints:

The default directive style should remain as #key=value
Mix-and-matching different directive styles should not be supported
Additionally allow easy extension in the future for other directive styles

Importantly, this does not change the implementation of how Dockerfiles parse their directives, these are defined in a separate regex that remains unmodified.

Please share your thoughts 🎉

frontend/dockerfile/dockerfile2llb/directives.go

tonistiigi

The way this should work should be that there aren't new ways to define directives for Dockerfile, # remains the only supported syntax for that. When reading #syntax specifically from the directives, if there are no matches, then there should be a fallback that checks for other versions as well.

Eg.

// syntax=docker/dockerfile
FROM alpine

Is still an invalid Dockerfile and should error.

This is what was wrong in the previous shebang PR as well. If you do it in a fallback, feel free to add support for skipping the shebang line as well.

frontend/dockerfile/dockerfile2llb/directives.go

jedevc · 2022-07-19T09:35:59Z

I think what you're describing is how it works currently. This directive parsing code modified is only used for extracting the syntax directive, the directive parsing used to actually parse the dockerfile is located here. The utilities in directives.go are only used to determine to forward to a gateway.

The fallback behavior is also as you describe - we run through the directive parsers one-by-one, attempting to extract a syntax key from each - until we find one. If none are found, we can just give up and assume that it's the default # and parse directives from there.

frontend/dockerfile/dockerfile2llb/directives.go

tonistiigi · 2022-07-19T16:43:09Z

I think what you're describing is how it works currently.

With the fallback ParseDirectives() should stay as it is currently(you can optimize regexp). Only in DetectSyntax() when ParseDirectives() call doesn't return a syntax directive it should try if the first line is //, skip shebang if needed, try json etc.

jedevc · 2022-07-19T17:11:56Z

Alright sounds good - I think I was confused since DetectSyntax seems to be the only thing that ever calls the ParseDirectives function, but I suppose there could be external callers we'd want to maintain compatibility for.

jedevc · 2022-07-25T10:42:57Z

Have done another pass through this, ParseDirectives now only parses # directives, while DetectSyntax searches through all the supported possibilities. I've also added a commit to skip shebang lines for #-style comments, I'm not sure if we want them for the other styles, since they won't be valid comments in the language that uses them, so I'm inclined to not allow it (or to wait until we get a valid use case).

I wonder if we might consider extending ParseDirectives in a follow-up? I do see a couple of codebases using it, so it might be nice to allow users to specify a syntax parameter for something like ParseDirectivesForSyntax to parse for multiple languages.

frontend/dockerfile/dockerfile2llb/directives.go

jedevc · 2022-11-15T13:30:46Z

Have completely rewritten the code to be significantly less clever, so the DetectSyntax function should be much easier to read from top-to-bottom.

In addition to adding the shebang, c-style comments and json parsing, I've also moved the appropriate functionality into the parser package and exposed the directives parser properly, so that we can get identical behavior to how the directives parsed inside the parser (which was actually different before, e.g. doing different casing, etc.)

tonistiigi

PTAL the CI errors

tonistiigi · 2022-11-16T02:36:13Z

frontend/dockerfile/parser/parser.go

-	d := &directives{
-		seen: map[string]struct{}{},
-	}
+	d := &directives{}


Are you intentionally leaving the parser nil?

Yup, I've reworked the parser to work from a nil state easily, so that when using it outside of this package, no extra New methods, etc, would be required.

frontend/dockerfile/parser/directives.go

tonistiigi · 2022-11-16T02:42:59Z

frontend/dockerfile/parser/directives.go

+
+	// use directive with different comment prefix, and search for //syntax=
+	directiveParser = DirectiveParser{Line: line}
+	directiveParser.setComment("//")


Technically shebang should be between the two directive parsers and // should not accept multiline but just do a regexp match. But whatever, we can leave it like this.

tonistiigi · 2022-11-16T02:46:39Z

frontend/dockerfile/parser/directives.go

+	var directive struct {
+		Syntax string `json:"syntax"`
+	}
+	if err := json.Unmarshal(dt, &directive); err == nil {


No change required: I wonder if we could check that this is a JSON without full unmarshal. Eg. when user makes a typo in the JSON it would be better to give them JSON parse error(or maybe we can still parse syntax without hitting the typo) if the file is still mostly JSON than to just skip over the syntax forwarding.

I like the idea of attempting to parse only up until we have the value of syntax and then immediately stopping - not quite sure how to do that though, I'm fairly sure that the builtin json package doesn't allow that level of control.

This patch introduces support for allowing multiple formats for syntax directives, to determine the syntax of the file to forward to the appropriate gateway. Additionally, support for shebangs for syntax detection is also permitted. This does not change the dockerfile syntax and allow different directives or shebangs during dockerfile parsing, these changes only affect the syntax detection. Additionally, the relevant functions are moved into the parser package from the dockerfile2llb package, so that we can share code between the both the dockerfile directive parsing, and the syntax detection code, and also removes the dockerfile2llb.ParseDirectives function which appeared to scan directives in the same was as the parser, but was not aware of various quirks (e.g. case-insensitivity). Signed-off-by: Justin Chadwell <me@jedevc.com>

tianon · 2022-12-14T01:04:07Z

frontend/dockerfile/parser/directives.go

+	return directives, nil
+}
+
+// DetectSyntaxFromDockerfile returns the syntax of provided input.


This probably should've just been DetectSyntax, right? 😇

(The function itself doesn't have the FromDockerfile suffix on the name 👀)

#3404 fixes this 🎉

thaJeztah · 2022-12-14T13:49:50Z

frontend/dockerfile/parser/directives_test.go

+	// for some reason Moby implementation in case insensitive for escape
+	dt = `# EScape=\
+# KEY = FOO bar


Yeah, this was a clear case of "fatigue" kicking in after a very long debate about the feature (it originally was implemented as a new ESCAPE Dockerfile instruction). Once we got to the concept of "parser directives", there was debate about "upper" or "lowercase"; moby/moby#22268 (comment), and in the end it was decided to "recommend" lowercase, but "allow" uppercase moby/moby#22268 (comment) moby/moby#22268 (comment)

I'd be open to (perhaps for Dockerfile instructions as well) discuss making it slightly more strict, and accept lower and UPPER, but not mIxEDcAsE (which we currently support in most cases), although we should look if there's actual uses of mixed-case and if it would be too much of a breaking change (we could "warn" ahead perhaps).

jedevc marked this pull request as draft June 29, 2022 14:30

thaJeztah reviewed Jun 29, 2022

View reviewed changes

frontend/dockerfile/dockerfile2llb/directives.go Outdated Show resolved Hide resolved

tonistiigi reviewed Jul 18, 2022

View reviewed changes

frontend/dockerfile/dockerfile2llb/directives.go Outdated Show resolved Hide resolved

thaJeztah reviewed Jul 19, 2022

View reviewed changes

frontend/dockerfile/dockerfile2llb/directives.go Outdated Show resolved Hide resolved

jedevc force-pushed the directive-proposal branch from 6f3235b to 1597f80 Compare July 19, 2022 10:13

jedevc marked this pull request as ready for review July 19, 2022 10:13

jedevc force-pushed the directive-proposal branch from 1597f80 to 23f86ea Compare July 25, 2022 09:55

jedevc requested a review from tonistiigi July 25, 2022 09:55

crazy-max added the status/design-review label Sep 14, 2022

tonistiigi reviewed Sep 21, 2022

View reviewed changes

frontend/dockerfile/dockerfile2llb/directives.go Outdated Show resolved Hide resolved

frontend/dockerfile/dockerfile2llb/directives.go Outdated Show resolved Hide resolved

jedevc added this to the v0.11.0 milestone Oct 18, 2022

jedevc force-pushed the directive-proposal branch from d78a4f5 to 6ad8e14 Compare October 24, 2022 08:29

tonistiigi reviewed Nov 10, 2022

View reviewed changes

tonistiigi added status/code-review and removed status/design-review labels Nov 10, 2022

jedevc force-pushed the directive-proposal branch 2 times, most recently from bc84707 to f6f26c4 Compare November 15, 2022 13:28

tonistiigi reviewed Nov 16, 2022

View reviewed changes

jedevc force-pushed the directive-proposal branch from f6f26c4 to 9b73e88 Compare November 16, 2022 14:12

tonistiigi approved these changes Nov 17, 2022

View reviewed changes

tonistiigi merged commit f771330 into moby:master Nov 17, 2022

tianon reviewed Dec 14, 2022

View reviewed changes

thaJeztah reviewed Dec 14, 2022

View reviewed changes

jedevc deleted the directive-proposal branch December 14, 2022 14:39

ciaranmcnulty mentioned this pull request Feb 6, 2023

Prefixed syntax directives in JSON files #3590

Open

ojab mentioned this pull request Apr 21, 2023

dependabot should bump dockerfile syntax dependabot/dependabot-core#4078

Open

jedevc mentioned this pull request May 25, 2023

Alternative #syntax= comment syntax #2812

Closed

jedevc removed the status/code-review label May 25, 2023

jedevc mentioned this pull request Jul 5, 2023

Proposal: Directives should be parsed even if they appear after a shebang line in the dockerfile #2904

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Proposal: support multiple formats for syntax directives #2937

Proposal: support multiple formats for syntax directives #2937

jedevc commented Jun 29, 2022

tonistiigi left a comment

jedevc commented Jul 19, 2022

tonistiigi commented Jul 19, 2022

jedevc commented Jul 19, 2022

jedevc commented Jul 25, 2022

jedevc commented Nov 15, 2022 •

edited

tonistiigi left a comment

tonistiigi Nov 16, 2022

jedevc Nov 16, 2022

tonistiigi Nov 16, 2022

tonistiigi Nov 16, 2022

jedevc Nov 16, 2022

tianon Dec 14, 2022

jedevc Dec 14, 2022 •

edited

thaJeztah Dec 14, 2022

Proposal: support multiple formats for syntax directives #2937

Proposal: support multiple formats for syntax directives #2937

Conversation

jedevc commented Jun 29, 2022

tonistiigi left a comment

Choose a reason for hiding this comment

jedevc commented Jul 19, 2022

tonistiigi commented Jul 19, 2022

jedevc commented Jul 19, 2022

jedevc commented Jul 25, 2022

jedevc commented Nov 15, 2022 • edited

tonistiigi left a comment

Choose a reason for hiding this comment

tonistiigi Nov 16, 2022

Choose a reason for hiding this comment

jedevc Nov 16, 2022

Choose a reason for hiding this comment

tonistiigi Nov 16, 2022

Choose a reason for hiding this comment

tonistiigi Nov 16, 2022

Choose a reason for hiding this comment

jedevc Nov 16, 2022

Choose a reason for hiding this comment

tianon Dec 14, 2022

Choose a reason for hiding this comment

jedevc Dec 14, 2022 • edited

Choose a reason for hiding this comment

thaJeztah Dec 14, 2022

Choose a reason for hiding this comment

jedevc commented Nov 15, 2022 •

edited

jedevc Dec 14, 2022 •

edited