Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Leading dashes in flag names #4686

Closed
snoyberg opened this issue Aug 14, 2017 · 37 comments
Closed

Leading dashes in flag names #4686

snoyberg opened this issue Aug 14, 2017 · 37 comments

Comments

@snoyberg
Copy link
Collaborator

After a discussion on the Stack issue tracker, it was revealed that the Cabal flag name requirements are rather lax, and currently seem to allow both sequential separators (- and _), as well as leading and trailing separators. For example, this command line works just fine:

cabal configure -f -totally--positive__

NOTE: this is using:

cabal-install version 1.24.0.2
compiled using version 1.24.2.0 of the Cabal library 

However, it's unclear how to set this from the command line to either true or false. If I add this line on its own to my cabal file:

flag -totally--positive__

I see the following in my shell session:

$ cabal configure -f -totally--positive__
Resolving dependencies...
Configuring foo-0.1.0.0...
cabal: '--exact-configuration' was given, but the following flags were not
specified: FlagName "-totally--positive__"
$ cabal configure -f +-totally--positive__
Resolving dependencies...
Configuring foo-0.1.0.0...
cabal: '--exact-configuration' was given, but the following flags were not
specified: FlagName "-totally--positive__"
$ cabal configure -f --totally--positive__
Resolving dependencies...
Configuring foo-0.1.0.0...

As you can see, ultimately the double-leading-dash was able to disable the flag, but no combination of + and - I tried enabled the flag.

My recommendation: strengthen to parser to behave like the package name parser, disallowing leading, trailing, or sequential separators. Next best thing: remove leading separators.

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

There is no reason to disallow sequential separators, but we will disallow (because it's confusing) and discourage non ascii characters (cabal check), i.e. effectively making flag parser [a-z0-9_][a-z0-9_-]*.

Note: flag names are case-insensitive.

@snoyberg
Copy link
Collaborator Author

"There is no reason" is not a true statement. It may be your opinion, but there are certainly reasons. One reason: the disparity between flag and package name syntaxes is confusing. You may think such confusion is warranted, but please don't imply that I have no reasons for making my request. Thank you!

@phadej phadej mentioned this issue Aug 14, 2017
3 tasks
@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

I understand that it would be more elegant to have identifier parser and derive package, flag, executable, test-suite and other name parser from it, but I'm quite sure we cannot retrofit that anymore.

I'd turn this issue around, the package name is special, others not so:

  • we have to disallow trailing all-digit components, because foo-0 would look like package identifier (i.e. name + version)
  • double dashes may cause problems with downstream package managers, like Debian's apt

But there's no reason to disallow writing, it might be silly, but someone might want to do so.

executable funny--0
  main-is: Main.hs
  ...

We can *discourage**double dash in flag names, via cabal check, but I personally don't see that as bad as non-ascii and leading dash. (Which I check in #4687)

@snoyberg
Copy link
Collaborator Author

@phadej I'm having a difficult time discussing this with you, as you are repeatedly telling me that there's no reason for this, despite the fact that I've told you otherwise. If you're telling me that, in fact, this decision is not open for discussion and you are making a call without allowing my to have any input, then please clarify. Stating that my reasons do not exist is not appropriate.

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

Is there any technical reason to disallow flag--names. Yes, it's aesthetically unpleasant, and might confuse tool implementors. First one is purely subjective reason, and the second one should be addressed with clear spec and proper library support, whatever the syntax is.

Background: The issue with stack happened, because there weren't any FlagName format specification. The current implementation is confusing (I don't know where to look), e.g.

parseFlagAssignment = Parse.sepBy1 parseFlagValue Parse.skipSpaces1
where
parseFlagValue =
(do Parse.optional (Parse.char '+')
f <- parseFlagName
return (f, True))
+++ (do _ <- Parse.char '-'
f <- parseFlagName
return (f, False))
parseFlagName = liftM (mkFlagName . lowercase) ident
ident :: Parse.ReadP r String
ident = Parse.munch1 identChar >>= \s -> check s >> return s
where
identChar c = isAlphaNum c || c == '_' || c == '-'
check ('-':_) = Parse.pfail
check _ = return ()
implies that it's [a-z0-9_-]+, disallowing leading - (check). I'm fixing this as part of #4654, by writing current behavior in the spec.

Library support: I'm working on parsec based Cabal parser, and in Cabal-2.2 there will be a parsec parser for FlagName, which stack (and other tools) could reuse.

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

To be clear: -totally--positive__ will be invalid soon, but totally--positive__ would be still ok.. Trailing or sequential dashes doesn't trigger any cli usage issues, only leading ones.

@hvr
Copy link
Member

hvr commented Aug 14, 2017

Fwiw, I don't even see much of a big deal either regarding leading hyphens (and that's though it was me who originally brought up the suggestion to discourage leading hyphens in flag names).

My rationale for suggesting to discourage the use of leading hyphens in flag names is not so much a technical one (there's always a way to non-ambiguously express +/- based flag constraints in the presence of leading hyphens; it's no big deal to make the grammar definitive here), but rather one of usability. But one can at least argue there's a modest bit of inconvenience where the non explicitly +/- tagged forms of the flag toggling syntax are allowed, as well as possibly some legacy compat concerns.

But as for sequences of dashes or underscores or a flag name made up of a sequence of as, such as aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa, we'd end up having to debate where to draw the line regarding a flag naming style-guide. As much fun as that may sound to some of us, I don't want to go down that road, and I'd only consider actual technical reasons (which don't involve YAML) for things beyond discouraging leading hyphens.

@snoyberg
Copy link
Collaborator Author

Is there any technical reason to disallow flag--names

Given that my reasons have been rejected until now, I'm not sure what I can say that will pass your bar of "technical reason." So let me turn it around: you're currently tightening the restrictions on flag names; what technical reasons do you have for that?

@23Skidoo
Copy link
Member

My recommendation: strengthen to parser to behave like the package name parser, disallowing leading, trailing, or sequential separators.

FTR, I'm okay with this, but we should check that there are no .cabal files on Hackage that we no longer can parse due to this change. If it is the case, then we should reject old flag name syntax only for cabal-version: >= 2.2 files, and add a warning for older ones.

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

@snoyberg you propose tightening the restrictions on flag names, not we.

Cabal check warning about unicode is indeed subjective (And I could actually add it for package names as well).

Leading dash is exactly what you proposed.

@snoyberg
Copy link
Collaborator Author

Quoting you above:

There is no reason to disallow sequential separators, but we will disallow (because it's confusing) and discourage non ascii characters (cabal check), i.e. effectively making flag parser [a-z0-9_][a-z0-9_-]*.

Perhaps I'm misunderstanding you, because that comment stated you wanted to disallow non ascii characters. If you mean that you just want to add a warning for that, that's fine. If your argument is that banning previously valid flag names breaks backwards compatibility (not explicitly stated, but somewhat implied above), that's a discussion to have. As I see it:

  • Herbert proposed removing support for leading - in AesonException when downloading nightly since 2017-08-13 commercialhaskell/stack#3345 (comment)
  • I proposed here removing support as well for a few other cases, as initially mentioned. I did not propose any specific rollout plan, since it was premature to discuss such a thing
  • You made the comment I just quoted, which implied changing the parser. It sounds now like you are only referring to changing cabal check
  • If there is an absolute inability to change the parser (which I'm not convinced of), but there is openness to modifying cabal check (which your comments imply strongly there is), then at the very least I would recommend cabal check have a warning.

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

For the whole time I was talking about cabal check warning. I do mention cabal check, I don't imply changing parser.

  • cabal check flag names #4687 shows that removing leading dash is a bit problematic, as it breaks things a bit.
    • EDIT: it seems that the only thing you could do, is to declare a flag starting with a dash, but you cannot use it in the same .cabal anyway.
  • I'll amend cabal check flag names #4687 to work with leading-dash flags (so the cli and constraints will work), there is no technical restriction to be lax. There is no proper reason to start allowing leading-dash flags.
  • However, cabal check will cause warning if flag starts with a dash, so Hackage won't accept packages with such flags. (AFAIK there isn't any such packages).
    • I'll check cabal parser to issue proper (i.e. not cryptic) parse error if someone tries to declare flag starting with a dash.
  • So, Hackage will be strict, but users can have -lead-dash (and unicode) flags in their private projects, if they like so.

@23Skidoo
Copy link
Member

#4687 shows that removing leading dash is a bit problematic, as it breaks things a bit.

Can you please briefly summarise what the problems are here?

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

EDIT: I was confused, you can declare flag -foobar, but you cannot really use it:

% cat lead-dash.cabal                           
name:                lead-dash
version:             0.1.0.0
synopsis:            Lorem
description:         Lorem Ipsum
license:             BSD3
license-file:        LICENSE
author:              Oleg Grenrus
maintainer:          oleg.grenrus@iki.fi
category:            Testing
build-type:          Simple
extra-source-files:  ChangeLog.md
cabal-version:       >=1.10

flag -foobar
  manual: True
  default: False

library
  exposed-modules:     LeadDash
  build-depends:       base >=4.9 && <4.11
  hs-source-dirs:      src
  default-language:    Haskell2010

  if flag(-foobar):
    build-depends: base >=4.10
[FL973] ~/mess/lead-dash % /opt/cabal/1.24/bin/cabal configure -f+-foobar
Warning: The package list for 'hackage.haskell.org' does not exist. Run 'cabal
update' to download it.
cabal: lead-dash.cabal:24: Parse of field 'if' failed.

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

Edited my previous~2 comment.

@tfausak
Copy link
Collaborator

tfausak commented Aug 14, 2017

Hackage will be strict, but users can have -lead-dash (and unicode) flags in their private projects, if they like

Just chiming in to say that these differences between Hackage and Cabal are annoying to deal with. It means we end up with the actual standard (although there's no standard in this particular case) as well as the de facto standard on Hackage. This has happened before with the PVP: haskell/pvp#4 (comment)

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

@tfausak note, that this issue is partly invalid. You could declare flag -foo but you couldn't use it in the .cabal file.

The only PVP cabal check is about omitted upper bound on base, so please keep that argument out of this discussion.

@tfausak
Copy link
Collaborator

tfausak commented Aug 14, 2017

I brought up the PVP issue because it's an example of the actual spec (PVP) not matching the spec in practice (Hackage). I dislike situations like that and would like to avoid it here.

Currently Cabal's flag parser matches [-_\p{Letter}|\p{Number}]+. The proposed solution to this issue, #4687, tightens that parser but only for cabal check. That means the Cabal flag "spec", such as it is, allows flags that Hackage rejects.

I think this is a problem because it would be reasonable for, say, Stack to implement a Cabal flag parser that matches the de facto Hackage spec. But then a user could try to make a flag like --⓪__א__⓪-- in a private/local package, see that it fails in Stack, and complain.

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017 via email

@23Skidoo
Copy link
Member

23Skidoo commented Aug 14, 2017

So, to summarise:

  • We'll be disallowing leading dashes (cabal check flag names #4687), which don't work anyway. This will fix the main problem mentioned in this bug report.
  • Multiple separators (e.g. -- and __) will still be allowed because there are packages on Hackage using this syntax (example). I don't think that removing support for them (which will have to be conditional on cabal-version, as mentioned above) buys us much, but I'm open to be persuaded.
  • I'm not sure whether there are packages on Hackage that use leading underscores or trailing separators in flag names. If someone can validate that there are indeed no such packages, then a patch removing support for such flag names would be accepted.
  • We should document flag name syntax in the user manual. Thanks to everyone who brought up the issue of the missing spec.

@tfausak
Copy link
Collaborator

tfausak commented Aug 14, 2017

From the linked issue: The PVP allows package-1 and package-1.0 to co-exist. Hackage disallows those packages because the two versions are confusing to use in bounds. Specifically, package >= 1 and package >= 1.0 don't mean the same thing.

For this issue: Cabal allows flag -foo. Hackage would disallow that via cabal check.

@23Skidoo
Copy link
Member

Yeah, I've long wanted to fix the Eq instance for Version, but that didn't happen because reasons.

@tfausak
Copy link
Collaborator

tfausak commented Aug 14, 2017

Only two packages have flags that run afoul of @snoyberg's proposed new rules:

  • cassava-0.5.1.0 defines bytestring--lt-0_10_4, which has two hyphens in a row.
  • deepseq-bounded starting with 0.6.0.0 defines both abbrev_wn_and_tn_concrete_syntax_to_number_alone__safe_only_to_depth_19 and abbrev_wn_and_tn_concrete_syntax_to_single_digit__can_only_express_down_to_depth_9 , which have two underscores in a row. (And at 71 and 82 characters respectively, these are the longest flags.)

No flag has any non-ASCII characters. In other words, every flag matches ^[-_a-z0-9]+$. (The Cabal parser converts flags to lowercase, so I can't say anything about the cases of flags in general. Regardless, everything would match that regex in a case-insensitive manner.)

I got this information from this script: https://gist.github.com/tfausak/009829dc73386b0779a822b4137cba18/3eb1319662f3c214c6c75057a603757347c81543

@phadej
Copy link
Collaborator

phadej commented Aug 14, 2017

And the reasons are in #3515 (comment).

FWIW, if @tfausak wants to fix PVP, you should open a PR for https://github.com/haskell/pvp about what precisely B MUST be greater means, when Bold is absent. I did open the original issue (haskell/pvp#4) but I lost steam to proceed.

@tfausak
Copy link
Collaborator

tfausak commented Aug 14, 2017

I fully understand the PVP problem already. I only brought it up as an example of the actual spec (PVP) not matching the de facto spec (Hackage). I don't want to derail this thread; let's stay focused on Cabal flags.

@23Skidoo
Copy link
Member

@tfausak Thanks! So we can remove support for trailing separators as well.

@hvr Can you please elaborate a little bit here about your vague future idea for using leading underscores for something, so that we could decide on whether to leave them in or remove them.

@hvr
Copy link
Member

hvr commented Aug 14, 2017

@23Skidoo To be honest, I don't see any benefit of arbitrarily limiting the lexical syntax to begin with. Usually the burden of proof lies with the party that proposes the change, and I haven't see any compelling reason to impose this arbitrary limitation so far. In fact, any arbitrary exception we add makes the regexp only more complex. Initially the regexp was assumed to be [[:alnum:]_-]+ which was elegant and simple. The only concession I have suggested and which I considered sensible is the exception to disallow leading - for the sake of CLI usability and compat concerns (as it turned out, leading - have compat issues anyway). So that leaves us with [[:alnum:]_][[:alnum:]_-]*. Every other suggestion I've heard so far has no practical benefit whatsoever; it surely doesn't facilitate parser implementations, nor does would it provide any performance benefits, nor does it help to make grammars less ambiguous. If we arbitrarily disallow trailing [_-], why not also disallow trailing numbers, or maybe while we're at it, also disallow certain characters. So please excuse me if this all sounds rather absurd to me, and most importantly since we're considering arbitrarily disallowing some trailing characters which doesn't solve any actual technical problem to begin with, but in the contrary even complicates the implementation and its specification, which ironically makes this even more confusing for users to understand what are considered valid flag names. I already regret having even mentioned the suggestion to disallow leading hyphens, given how this escalated. TLDR: Let's just agree on [[:alnum:]_][[:alnum:]_-]* and move on to another topic.

@tfausak
Copy link
Collaborator

tfausak commented Aug 14, 2017

It sounds like this ship has already sailed, but I'll say it again: I think that having the actual spec (the Cabal file format) differ from the spec in practice (Hackage) is a bad idea. Obviously there's some benefit in restricting flag names, otherwise #4687 wouldn't exist. Why not codify that in the spec rather than allowing it locally but rejecting it on upload?

I also disagree with this appeal to regex simplicity. Why not match tags with \S+? What could be simpler?

@23Skidoo
Copy link
Member

To clarify, I don't feel very strongly about this stuff, and am fine with keeping [[:alnum:]_][[:alnum:]_-]*, but if someone wrote a patch tightening the allowed syntax, I wouldn't be opposed to merging it.

I also find @tfausak's point about keeping Hackage-accepted and Cabal-accepted formats as close as possible persuasive (though this is not always possible because of historical reasons, like with the Version type where we need to be able to distinguish foo-1 from foo-1.0 because there are already such packages on Hackage).

@tfausak
Copy link
Collaborator

tfausak commented Aug 15, 2017

I see that @phadej merged #4687, which checks for invalid flags with cabal check but allows Cabal to parse them. Is there any way that I can either (a) appeal this decision and revert the changes to cabal check, or (b) encourage changing Cabal to outright reject flag names that cabal check would warn about?

Regardless, it seems like this issue should be closed as wontfix, based on the comments so far.

@phadej
Copy link
Collaborator

phadej commented Aug 15, 2017

Meta: this issue is Leading dashes in flag names, so technically this issue is fixed.

Every other change would benefit from being a separate issue.

@tfausak
Copy link
Collaborator

tfausak commented Aug 15, 2017

But... it's not fixed. #4687 (comment):

-foobar is still allowed in GenericPackageDescription parser, but you cannot really use it.

This is the Cabal issue tracker, not the Hackage issue tracker.

@23Skidoo
Copy link
Member

encourage changing Cabal to outright reject flag names that cabal check would warn about?

I'd accept a patch that did that.

@phadej
Copy link
Collaborator

phadej commented Aug 15, 2017

@tfausak
Copy link
Collaborator

tfausak commented Aug 15, 2017

@phadej: Hopefully you can forgive the error since that code came from an entirely separate PR (#4654, via 90b848a) that I didn't know about.

@23Skidoo: I'll start working on a PR.

@phadej
Copy link
Collaborator

phadej commented Aug 15, 2017

@tfausak, yes, sometimes speed of Cabal development surprises us!

% cat lead-dash.cabal 
name:                lead-dash
version:             0.1.0.0
synopsis:            Lorem
description:         Lorem Ipsum
license:             BSD3
license-file:        LICENSE
author:              Oleg Grenrus
maintainer:          oleg.grenrus@iki.fi
category:            Testing
build-type:          Simple
extra-source-files:  ChangeLog.md
cabal-version:       >=1.10

flag -foobar
  manual: True
  default: False

library
  exposed-modules:     LeadDash
  build-depends:       base >= 4.9 && < 4.11
  hs-source-dirs:      src
  default-language:    Haskell2010

% /home/ogre/Documents/other-haskell/cabal/dist-newstyle/build/x86_64-linux/ghc-8.0.2/cabal-install-2.1.0.0/build/cabal/cabal check
Warning: lead-dash.cabal:14:1:
unexpected "-"
expecting white space: "-foobar"
cabal: Failing parsing "./lead-dash.cabal".

btw, Could you add <?> "Flag name", that should improve error messages

@23Skidoo
Copy link
Member

OK, since the main issue here (leading dashes in flag names) has been resolved, I'm closing this as fixed; if anyone wants to also disallow leading underscores and trailing separators, please create a PR and we'll discuss it there.

tfausak added a commit to tfausak/cassava that referenced this issue Sep 8, 2017
This flag name was changed without comment in 6e1ff78. It has caused a few problems:

- commercialhaskell/stack#3345
- commercialhaskell/stackage#2755
- commercialhaskell/stackage#2759
- commercialhaskell/stackage#2842
- haskell/cabal#4686
- haskell-hvr#150

Those problems either caused or accelerated some (attempted) changes in Cabal:

- haskell/cabal#4654
- haskell/cabal#4687
- haskell/cabal#4696

Those problems also caused a change in Stack:

- commercialhaskell/stack#3349

In short: Cabal never had any trouble with this. Stack did. The current master version of Stack can handle flags like this, but no released versions of it can. It's not clear when the next version of Stack will be released. In the meantime, no Stack users can use this package. This commit changes the offending flag to something that Stack can handle. By using a flag that Stack can handle, Stack users can once again use this package, and it can return to Stackage. There are no apparent downsides to using a more compatible flag name.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants