New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper parser for Dockerfile #2266

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
9 participants
@emtq

emtq commented Oct 17, 2013

Closes #2225. This pull request implements a proper parser for Docker build files.

@jpetazzo

This comment has been minimized.

Show comment
Hide comment
@jpetazzo

jpetazzo Oct 17, 2013

Contributor

Whoa, that's a lot of lines of code :-)
(Yeah, I know, I'm easily impressed...)
It looks like some files are generated by some others, right?

Contributor

jpetazzo commented Oct 17, 2013

Whoa, that's a lot of lines of code :-)
(Yeah, I know, I'm easily impressed...)
It looks like some files are generated by some others, right?

@emtq

This comment has been minimized.

Show comment
Hide comment
@emtq

emtq Oct 17, 2013

buildfile_lexer.go and buildfile_parser.go are autogenerated by nex and 'go tool yacc' respectively.

emtq commented Oct 17, 2013

buildfile_lexer.go and buildfile_parser.go are autogenerated by nex and 'go tool yacc' respectively.

@SvenDowideit

This comment has been minimized.

Show comment
Hide comment
@SvenDowideit

SvenDowideit Oct 21, 2013

Contributor

excellent - this should fix the things I was just exploring -

  1. losing tabs inside quotes in RUN, ENV
  2. space hash being treated as a comment, though the spec suggests it should not :)

excellent 👍

Contributor

SvenDowideit commented Oct 21, 2013

excellent - this should fix the things I was just exploring -

  1. losing tabs inside quotes in RUN, ENV
  2. space hash being treated as a comment, though the spec suggests it should not :)

excellent 👍

@emtq

This comment has been minimized.

Show comment
Hide comment
@emtq

emtq Oct 22, 2013

This commit is superseded by #2337.

emtq commented Oct 22, 2013

This commit is superseded by #2337.

@@ -2,6 +2,16 @@
DEST=$1
if ! go run vendor/src/github.com/wagerlabs/nex/nex.go -e -o buildfile_lexer.go buildfile.nex; then

This comment has been minimized.

@shykes

shykes Nov 12, 2013

Collaborator

This is not the original dependency. We can't vendor a personal fork.

@shykes

shykes Nov 12, 2013

Collaborator

This is not the original dependency. We can't vendor a personal fork.

This comment has been minimized.

@emtq

emtq Nov 13, 2013

I will fix this.

@emtq

emtq Nov 13, 2013

I will fix this.

This comment has been minimized.

@emtq

emtq Nov 19, 2013

Fixed now.

@emtq

emtq Nov 19, 2013

Fixed now.

@shykes

This comment has been minimized.

Show comment
Hide comment
@shykes

shykes Nov 12, 2013

Collaborator

I am not convinced I want to merge this.

  • It is a complete rewrite of a critical part of docker..
  • .. which has a lot of legacy quirks..
  • .. and for which reverse-compatibility is extremely important (breaking change = incorrect or failed builds)
  • .. by a first-time contributor with no track record of maintaining his changes..
  • .. and who is openly using this pull request as advertisement material to sell contract work to my employer (will they get an invoice when you need to maintain this?) No offense Joel but it's true :)
  • All this without fixing an immediate problem.

Lastly, I have no plan to extend the Dockerfile syntax into a powerful DSL. Its purpose is to be a simple "assembly-style" language. The future of Dockerfiles is to make it easier to call build scripts of your own, in your favorite syntax: shell scripts, chef scripts, puppet scripts... These are the tools which need a sophisticated lexer - I'm not sure we do.

Thoughts?

Collaborator

shykes commented Nov 12, 2013

I am not convinced I want to merge this.

  • It is a complete rewrite of a critical part of docker..
  • .. which has a lot of legacy quirks..
  • .. and for which reverse-compatibility is extremely important (breaking change = incorrect or failed builds)
  • .. by a first-time contributor with no track record of maintaining his changes..
  • .. and who is openly using this pull request as advertisement material to sell contract work to my employer (will they get an invoice when you need to maintain this?) No offense Joel but it's true :)
  • All this without fixing an immediate problem.

Lastly, I have no plan to extend the Dockerfile syntax into a powerful DSL. Its purpose is to be a simple "assembly-style" language. The future of Dockerfiles is to make it easier to call build scripts of your own, in your favorite syntax: shell scripts, chef scripts, puppet scripts... These are the tools which need a sophisticated lexer - I'm not sure we do.

Thoughts?

@jpetazzo

This comment has been minimized.

Show comment
Hide comment
@jpetazzo

jpetazzo Nov 13, 2013

Contributor

I can't tell about the track record of the author or the plug-in ad (didn't see that?) but if the Dockerfile parser has "legacy quirks", shouldn't we fix it now (i.e. the earlier the better)? Looks like minor parsing issues pop up on regular intervals.

Contributor

jpetazzo commented Nov 13, 2013

I can't tell about the track record of the author or the plug-in ad (didn't see that?) but if the Dockerfile parser has "legacy quirks", shouldn't we fix it now (i.e. the earlier the better)? Looks like minor parsing issues pop up on regular intervals.

@SvenDowideit

This comment has been minimized.

Show comment
Hide comment
@SvenDowideit

SvenDowideit Nov 13, 2013

Contributor

+1 in the wiki world, we always regretted not fixing the legacy quirks in the language - it made maintenance so much harder 10 years later - and if it would mean i won't need to fix the TAB issue I found :)

mmm, maybe i need to go test this

Contributor

SvenDowideit commented Nov 13, 2013

+1 in the wiki world, we always regretted not fixing the legacy quirks in the language - it made maintenance so much harder 10 years later - and if it would mean i won't need to fix the TAB issue I found :)

mmm, maybe i need to go test this

@shykes

This comment has been minimized.

Show comment
Hide comment
@shykes

shykes Nov 13, 2013

Collaborator

Guys, the time to worry about quirks was before shipping. Once it's shipped, we need to support it. Especially when it affects a file format which many people have already checked into source control and use actively.

@solomonstre
@docker

On Tue, Nov 12, 2013 at 10:16 PM, Sven Dowideit notifications@github.com
wrote:

+1 in the wiki world, we always regretted not fixing the legacy quirks in the language - it made maintenance so much harder 10 years later - and if it would mean i won't need to fix the TAB issue I found :)

mmm, maybe i need to go test this

Reply to this email directly or view it on GitHub:
#2266 (comment)

Collaborator

shykes commented Nov 13, 2013

Guys, the time to worry about quirks was before shipping. Once it's shipped, we need to support it. Especially when it affects a file format which many people have already checked into source control and use actively.

@solomonstre
@docker

On Tue, Nov 12, 2013 at 10:16 PM, Sven Dowideit notifications@github.com
wrote:

+1 in the wiki world, we always regretted not fixing the legacy quirks in the language - it made maintenance so much harder 10 years later - and if it would mean i won't need to fix the TAB issue I found :)

mmm, maybe i need to go test this

Reply to this email directly or view it on GitHub:
#2266 (comment)

@solomonstre

This comment has been minimized.

Show comment
Hide comment
@solomonstre

solomonstre Nov 13, 2013

I don't mean that we necessarily need to preserve all quirks in formol forever. But once it's shipped, how fast we fix it is no longer the most important. The most important is to carefully plan out the least disruptive sunset possible (let's call it LDSP).

Rushing the merge of a completely new lexer definitely doesn't qualify as the LDSP, I think we can agree on that much :)

@solomonstre
@docker

On Tue, Nov 12, 2013 at 10:19 PM, Solomon Hykes notifications@github.com
wrote:

Guys, the time to worry about quirks was before shipping. Once it's shipped, we need to support it. Especially when it affects a file format which many people have already checked into source control and use actively.

@solomonstre
@docker
On Tue, Nov 12, 2013 at 10:16 PM, Sven Dowideit notifications@github.com
wrote:

+1 in the wiki world, we always regretted not fixing the legacy quirks in the language - it made maintenance so much harder 10 years later - and if it would mean i won't need to fix the TAB issue I found :)

mmm, maybe i need to go test this

Reply to this email directly or view it on GitHub:

#2266 (comment)

Reply to this email directly or view it on GitHub:
#2266 (comment)

solomonstre commented Nov 13, 2013

I don't mean that we necessarily need to preserve all quirks in formol forever. But once it's shipped, how fast we fix it is no longer the most important. The most important is to carefully plan out the least disruptive sunset possible (let's call it LDSP).

Rushing the merge of a completely new lexer definitely doesn't qualify as the LDSP, I think we can agree on that much :)

@solomonstre
@docker

On Tue, Nov 12, 2013 at 10:19 PM, Solomon Hykes notifications@github.com
wrote:

Guys, the time to worry about quirks was before shipping. Once it's shipped, we need to support it. Especially when it affects a file format which many people have already checked into source control and use actively.

@solomonstre
@docker
On Tue, Nov 12, 2013 at 10:16 PM, Sven Dowideit notifications@github.com
wrote:

+1 in the wiki world, we always regretted not fixing the legacy quirks in the language - it made maintenance so much harder 10 years later - and if it would mean i won't need to fix the TAB issue I found :)

mmm, maybe i need to go test this

Reply to this email directly or view it on GitHub:

#2266 (comment)

Reply to this email directly or view it on GitHub:
#2266 (comment)

@emtq

This comment has been minimized.

Show comment
Hide comment
@emtq

emtq Nov 13, 2013

I'm not charging for this PR, it's free.

The fact that I'm using this PR to showcase my coding skills is orthogonal to its technical merits.

This PR does fix a number of issues in the current ad-hoc parser and is easily maintainable by me or @jpetazzo ;-).

emtq commented Nov 13, 2013

I'm not charging for this PR, it's free.

The fact that I'm using this PR to showcase my coding skills is orthogonal to its technical merits.

This PR does fix a number of issues in the current ad-hoc parser and is easily maintainable by me or @jpetazzo ;-).

@emtq

This comment has been minimized.

Show comment
Hide comment
@emtq

emtq Nov 13, 2013

To be more precise, this PR fixes issues #2315, #2072 and makes it easy to resolve #2210, #2167, #1149, #1996, #1834, #1352 and #1321 among others.

I did a bit of maintenance and improvement on #2337 (rejected), following suggestions from @tianon y @jpetazzo.

I also contributed to other open source projects where, dare I say, accepting my patches took less than a few weeks and turned out to be FAR less emotional.

emtq commented Nov 13, 2013

To be more precise, this PR fixes issues #2315, #2072 and makes it easy to resolve #2210, #2167, #1149, #1996, #1834, #1352 and #1321 among others.

I did a bit of maintenance and improvement on #2337 (rejected), following suggestions from @tianon y @jpetazzo.

I also contributed to other open source projects where, dare I say, accepting my patches took less than a few weeks and turned out to be FAR less emotional.

@solomonstre

This comment has been minimized.

Show comment
Hide comment
@solomonstre

solomonstre Nov 13, 2013

Hi Joel, your contribution is always accepted whether we merge it or not, there is nothing emotional about this.

#2337 was rejected because I don't think we need an INCLUDE verb. @jpetazzo and @tianon are not core project maintainers so their suggestions, though informative, don't guarantee anything.

You didn't address my last question which is: how confident are you that 100% of existing Dockerfiles will work unmodified with this lexer? Can we increase that confidence? If it's not 100%, what's the plan for a graceful sunset? That's what I'm interested in - not how fast we merge it. 

 

Sorry about the "weeks of delay". We have huge numbers of contributions and are doing our best to keep up and give useful feedback. You will increase your chances by discussing your design earlier in the mailing list or irc, checking for interest from the actual maintainers, keeping your changes small, and most of all, not making me feel like I somehow need to rush something in. I really don't give a shit how fast your patches were merged in other projects - mentioning it comes across as condescending and self-centered.

@solomonstre
@docker

On Wed, Nov 13, 2013 at 3:59 AM, Joel Reymont notifications@github.com
wrote:

To be more precise, this PR fixes issues #2315, #2072 and makes it easy to resolve #2210, #2167, #1149, #1996, #1834, #1352 and #1321 among others.
I did a bit of maintenance and improvement on #2337 (rejected), following suggestions from @tianon y @jpetazzo.

I also contributed to other open source projects where, dare I say, accepting my patches took less than a few weeks and turned out to be FAR less emotional.

Reply to this email directly or view it on GitHub:
#2266 (comment)

solomonstre commented Nov 13, 2013

Hi Joel, your contribution is always accepted whether we merge it or not, there is nothing emotional about this.

#2337 was rejected because I don't think we need an INCLUDE verb. @jpetazzo and @tianon are not core project maintainers so their suggestions, though informative, don't guarantee anything.

You didn't address my last question which is: how confident are you that 100% of existing Dockerfiles will work unmodified with this lexer? Can we increase that confidence? If it's not 100%, what's the plan for a graceful sunset? That's what I'm interested in - not how fast we merge it. 

 

Sorry about the "weeks of delay". We have huge numbers of contributions and are doing our best to keep up and give useful feedback. You will increase your chances by discussing your design earlier in the mailing list or irc, checking for interest from the actual maintainers, keeping your changes small, and most of all, not making me feel like I somehow need to rush something in. I really don't give a shit how fast your patches were merged in other projects - mentioning it comes across as condescending and self-centered.

@solomonstre
@docker

On Wed, Nov 13, 2013 at 3:59 AM, Joel Reymont notifications@github.com
wrote:

To be more precise, this PR fixes issues #2315, #2072 and makes it easy to resolve #2210, #2167, #1149, #1996, #1834, #1352 and #1321 among others.
I did a bit of maintenance and improvement on #2337 (rejected), following suggestions from @tianon y @jpetazzo.

I also contributed to other open source projects where, dare I say, accepting my patches took less than a few weeks and turned out to be FAR less emotional.

Reply to this email directly or view it on GitHub:
#2266 (comment)

@jpetazzo

This comment has been minimized.

Show comment
Hide comment
@jpetazzo

jpetazzo Nov 14, 2013

Contributor

Hmmm (deep thinking)

I still believe that having a proper parser is important; but I understand the fear of breaking stuff.

@wagerlabs, do you think that it would be doable to have the two parsers in parallel, and compare their output? (I don't even know if they output a kind of AST or comparable structure, so this might be a completely silly suggestion.)

@shykes, would that be an acceptable way? I.e. having both parsers in parallel, and something showing huge warnings if a difference is found during a build?

Alternatively — have the ability to switch between parsers with a command-line (/API) flag?

IMHO, if a Dockerfile breaks, it's either a problem with the parser (and it should be trivial to fix), or because the Dockerfile is broken. Random example: a Dockerfile using a weird UTF8 character to skip the next command, or some weird crap like that :)

Contributor

jpetazzo commented Nov 14, 2013

Hmmm (deep thinking)

I still believe that having a proper parser is important; but I understand the fear of breaking stuff.

@wagerlabs, do you think that it would be doable to have the two parsers in parallel, and compare their output? (I don't even know if they output a kind of AST or comparable structure, so this might be a completely silly suggestion.)

@shykes, would that be an acceptable way? I.e. having both parsers in parallel, and something showing huge warnings if a difference is found during a build?

Alternatively — have the ability to switch between parsers with a command-line (/API) flag?

IMHO, if a Dockerfile breaks, it's either a problem with the parser (and it should be trivial to fix), or because the Dockerfile is broken. Random example: a Dockerfile using a weird UTF8 character to skip the next command, or some weird crap like that :)

@emtq

This comment has been minimized.

Show comment
Hide comment
@emtq

emtq Nov 14, 2013

@jpetazzo I don't see a lot of advantage in keeping two parsers afloat. I agree with you that any problems with the new parser should be trivial to fix.

@solomonstre I'm confident that 99% of existing Dockerfiles will work unmodified with this parser. There are tests for the current parser and the new parser passes them. I added a few extra tests and the confidence can be increased by further expanding the test suite.

emtq commented Nov 14, 2013

@jpetazzo I don't see a lot of advantage in keeping two parsers afloat. I agree with you that any problems with the new parser should be trivial to fix.

@solomonstre I'm confident that 99% of existing Dockerfiles will work unmodified with this parser. There are tests for the current parser and the new parser passes them. I added a few extra tests and the confidence can be increased by further expanding the test suite.

@emtq

This comment has been minimized.

Show comment
Hide comment
@emtq

emtq Nov 14, 2013

@solomonstre @shykes

No hard feelings about #2337. I will also refrain from needless jabs in the future ;-).

Please see buildfile_test.go and buildfile_parser_test.go for the tests that apply to this PR and ensure that existing Dockerfiles will work unmodified with this parser.

I take pride in my work and will quickly fix any bugs in the new parser, regardless of my contractual arrangements with Docker.

emtq commented Nov 14, 2013

@solomonstre @shykes

No hard feelings about #2337. I will also refrain from needless jabs in the future ;-).

Please see buildfile_test.go and buildfile_parser_test.go for the tests that apply to this PR and ensure that existing Dockerfiles will work unmodified with this parser.

I take pride in my work and will quickly fix any bugs in the new parser, regardless of my contractual arrangements with Docker.

@emtq

This comment has been minimized.

Show comment
Hide comment
@emtq

emtq Nov 19, 2013

What's the next step here?

emtq commented Nov 19, 2013

What's the next step here?

@emtq

This comment has been minimized.

Show comment
Hide comment
@emtq

emtq Nov 19, 2013

I fixed the vendoring so we are pulling in the original nex repo now.

What's the next to merge this pull request?

emtq commented Nov 19, 2013

I fixed the vendoring so we are pulling in the original nex repo now.

What's the next to merge this pull request?

@shykes

This comment has been minimized.

Show comment
Hide comment
@shykes

shykes Nov 30, 2013

Collaborator

Hi @wagerlabs, now that the craziness of the 0.7 release is behind us, we can take a second look at this.

Collaborator

shykes commented Nov 30, 2013

Hi @wagerlabs, now that the craziness of the 0.7 release is behind us, we can take a second look at this.

@gabrtv

This comment has been minimized.

Show comment
Hide comment
@gabrtv

gabrtv Dec 18, 2013

+1 for a proper lexer. It would have made #3521 easier (in part because I spent 2 hours debugging what turned out to be #3026!)

gabrtv commented Dec 18, 2013

+1 for a proper lexer. It would have made #3521 easier (in part because I spent 2 hours debugging what turned out to be #3026!)

@drewcrawford

This comment has been minimized.

Show comment
Hide comment
@drewcrawford

drewcrawford Jan 3, 2014

@solomonstre I really want an include verb.

drewcrawford commented Jan 3, 2014

@solomonstre I really want an include verb.

@crosbymichael

This comment has been minimized.

Show comment
Hide comment
@crosbymichael

crosbymichael Feb 17, 2014

Contributor

I'm going to close this now. When we are ready to look at the dockerfile parser again in the future and evaluate if the added complexity and code will be worth the feature gains we can reopen.

Thanks.

Contributor

crosbymichael commented Feb 17, 2014

I'm going to close this now. When we are ready to look at the dockerfile parser again in the future and evaluate if the added complexity and code will be worth the feature gains we can reopen.

Thanks.

@0xcaff

This comment has been minimized.

Show comment
Hide comment
@0xcaff

0xcaff Jun 2, 2014

It is probably best to get this done before 1.0.

0xcaff commented Jun 2, 2014

It is probably best to get this done before 1.0.

@moby moby locked and limited conversation to collaborators Jul 30, 2014

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.