Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration property analysis does not handle properties written in snake_case correctly #327

Closed
wilkinsona opened this issue Jul 16, 2019 · 16 comments

Comments

@wilkinsona
Copy link
Member

commented Jul 16, 2019

This can be reproduced using the same steps as described in #326.

Upon import, you should see two unknown properties warnings for application.yml in start-site:

Unknown property 'spring.devtools.restart.additional_paths'	application.yml	/start-site/src/main/resources	line 1268	org.eclipse.lsp4e.diagnostic
Unknown property 'spring.resources.static_locations'	application.yml	/start-site/src/main/resources	line 1264	org.eclipse.lsp4e.diagnostic

Both warnings are incorrect as the input is valid due to Boot's relaxed binding. As you can see the properties are written using snake_case. If they are changed to use kebab-case or camelCase the warnings disappear.

There are some more details about the formats on the Spring Boot wiki.

@kdvolder

This comment has been minimized.

Copy link
Member

commented Jul 17, 2019

The boot wiki isn't exactly defining the rules unfortunately. It just has some examples and then some words like this:

There are actually many more variants that are supported

This has been a bit of a struggle for us to support and I don't think we really have any hope in actually implementing this 'correctly' since there is no actual clear description of the 'correct' behavior. To make matters worse, I beleave the exact rules have subtly changed between versions of boot and when I looked at them a long time ago, much of the peculiars felt more like an accident of the particular implementation than an actual choice made by design.

If there was a clear and complete description of what exactly the rules are on whether some identifier matches another then we might be able to support all these variants (even though it would add considerable complexity).

As it is though, since we really have no hope of implementing all the possible variants without really knowing the precise rules, it ultimately boils down to a decision on which variants we want to support on a case by case basis and decide if that case is worth the extra complexity. A decision was made a long time ago to just support camelCase and hyphenated form because that seemed to be what most people seem to use in practice.

I somewhat regret we even added camelCase to the list. So, I'm inclined not to add the underscore variant to that list either as it would add too much complexity and no-body has complained about this so far.

Maybe as a compromise we could add a special case to the validations... that if an identifier contains an '_' we simply don't validate it. I.e. at least we wouldn't give these false errors, but you also won't get an error if you did make a typo in a underscored name. What do you think about that?

@wilkinsona

This comment has been minimized.

Copy link
Member Author

commented Jul 17, 2019

This has been a bit of a struggle for us to support and I don't think we really have any hope in actually implementing this 'correctly' since there is no actual clear description of the 'correct' behaviour.

Sorry. We've felt your pain there too. We made some mistakes in 1.x that made relaxed binding a struggle for us to support as well.

To make matters worse, I beleave the exact rules have subtly changed between versions of boot and when I looked at them a long time ago, much of the peculiars felt more like an accident of the particular implementation than an actual choice made by design.

Boot 2.0 gave us the opportunity to revisit the binding and tighten up the rules considerably. That does mean there are some differences between 1.5 and 2.0, but 1.5 is EoL on 1 August so I think it can be safely ignored now. The binding rules are, as far as I can remember, consistent across 2.x.

As it is though, since we really have no hope of implementing all the possible variants without really knowing the precise rules,

It should be possible for us to provide precise rules now if they will be of use to you.

Maybe as a compromise we could add a special case to the validations... that if an identifier contains an '_' we simply don't validate it. I.e. at least we wouldn't give these false errors, but you also won't get an error if you did make a typo in a underscored name. What do you think about that?

I think it would be a shame if the validation worked with only a subset of valid syntax, that's why I opened this issue. If users see some properties being validated it sets an expectation that they'll all be validated. I suspect that ignoring properties that contain _ will result in users being disappointed that the tools missed a problem that then leads to a failure at runtime.

@wilkinsona

This comment has been minimized.

Copy link
Member Author

commented Jul 17, 2019

@mbhave has reminded me of a wiki page that she wrote that describes the binding in more detail, including information about both properties files and YAML files.

You should be able to match keys in .properties or .yml files by:

  1. Removing any special characters (anything other than alphanumerics and .)
  2. Converting to lower case
  3. Comparing with entries in the metadata with - characters removed
@kdvolder

This comment has been minimized.

Copy link
Member

commented Jul 17, 2019

Hmm... those simple rules for getting a 'canonical form' of the key... that sounds easy enough! Also thanks for the wiki page link. Very very useful!

Just a bit of strange/silly question though. If things like dashes are to be completely ignored then why do they even get included in the metadata at all?

I guess I can answer it msyelf. If you would have changed this (and removed them) then STS current implementation would be broken now, even for the hyphenated case :-)

I am pleasantly surprised how much more logical these rules are now in Boot 2.0 than they used to be. We can definitely work with those to make it all work much better!

@philwebb

This comment has been minimized.

Copy link
Member

commented Jul 18, 2019

If things like dashes are to be completely ignored then why do they even get included in the metadata at all?

The meta-data is also used by tools to offer code completion and we generally prefer kebab case (all our own documentation uses it).

So in your above example, if the user type spring.devtools.restart. and hits ctrl-space we want to show them additional-paths even if they ultimately decide to useadditional_paths.

We could probably do with refreshing our documentation about this.

@snicoll

This comment has been minimized.

Copy link
Member

commented Jul 18, 2019

My 2c on this issue from the metadata level and how I think we should handle user configuration. There is a table in the reference documentation that describes exactly what we aim to support, that is:

  • kebab-case as the canonical format and the format we should show to the user regardless of their preference
  • camel case
  • snake case
  • An example of OS env variables that doesn't concern this issue as far as I know

We've written this table exactly for the purpose of narrowing things down and clarify what we intend to support officially (even though the binder indeed allows other special cases to be bound because that was the sensible thing to do from an architecture standpoint.

As far as I am concerned, I am perfectly happy if you only support those 3 variants and just ignore anything else. This is also consistent in our doc.

@wilkinsona

This comment has been minimized.

Copy link
Member Author

commented Jul 18, 2019

I've opened a Boot issue to provide a helper method in our spring-boot-configuration-metadata module that performs the mapping. We realise you may not want to depend on the module directly in STS, but hopefully such a helper method would be useful anyway as it could be added to this repo alongside the other code.

@kdvolder

This comment has been minimized.

Copy link
Member

commented Jul 18, 2019

The helper method might indeed be helpful, even if only to read it to understand the exact intended rules. And we may even decide to copy the code verbatim for convenience.

From my point of view where validation is concerned the rules as described in @mbhave wiki page are probably even simpler to implement and support than just limitiing it to 3 specific cases.

From completion sugestions point of view it is different however because in that case it needs to be capable of generating whatever form(s) of the key to insert in the document. But I think we will just not bother supporting the alternatives there and only offer the recommended Kebab form.

Supporting the other forms is just too complex in a number of ways and we'd basically prefer if user just stick to kebab. If the user really goes out of their way to manually type something else, or gets a file that already has other variants in use, we will just put the bar at tolerating those alternate forms up to the point that they do not cause false warnings in validation. But various bits of the support we have are just too much reliant on recognizing the canonical hyphenated key that it would just be a bit too complex to support all the other forms completely (generating them in completions, implementing preferences for them, writing test for all the cases, supporting the aliases in navigation bean / schema structure etc. etc.).

So if a user decides to go against the recommended use of kebab case, then realistically, the tools will just work a little less well for them and we just don't want to encourage that by actually offering them a preference to choose that option.

@snicoll

This comment has been minimized.

Copy link
Member

commented Jul 18, 2019

From completion sugestions point of view it is different however because in that case it needs to be capable of generating whatever form(s) of the key to insert in the document. But I think we will just not bother supporting the alternatives there and only offer the recommended Kebab form.

I think that's the right call. I'd very much like that tools only offers the canonical (i.e. kebab case) format in auto-completion.

Supporting the other forms is just too complex in a number of ways

Perhaps we could move that discussion in the issue we've created and craft the API together in an attempt to make that easier?

@kdvolder

This comment has been minimized.

Copy link
Member

commented Jul 18, 2019

Perhaps we could move that discussion in the issue we've created and craft the API together in an attempt to make that easier?

Sure, I've added some suggestions to the ticket. I think it is definitely a good idea to put thought into this. However, I don't want to be deceptive. Realistically, we probably will not actually use those helper methods to fix our issues right now. We are simply not going to re-implement what we have, or invest heavily into refactoring it to support all the variants 'perfectly'. The cost/benefit analysis just doesn't make sense to me. I'd rather invest the time and effort in other things, for example making sure things work properly when the user does follow the recommendations and uses kebab case.

Fixing the validation aspect is relatively feasible (especially given the simpler rules where we can actually have a canonical key to match). I think it also tends to be something that really bothers people when they see false warnings. So I will go ahead and fix that.

@kdvolder

This comment has been minimized.

Copy link
Member

commented Jul 18, 2019

Found a easier way to reproduce the issue.

  1. create a new boot project with "Spring Starter" wizard and add 'Dev Tools' and 'Web' starter dependencies.
  2. Create application.yml file and populate with this contents:
spring:
  resources:
    static_locations: []
  devtools:
    restart:
      additional_paths: []

The static_locations and additional_paths are flagged as unknown properties (but shouldn't be).

@kdvolder

This comment has been minimized.

Copy link
Member

commented Jul 18, 2019

Sadly, I thought it would be a quick ix. But... this is going to be lot harder to fix then I thought. Unfortunately there's just too many places in the code where we are doing stuff like looking up property infos in indexed metadata. The indexing logic has to change somewhat drastically (using a canonical key slightly different from the data it is indexing). This subtly breaks assumptions many callers make of the returned data. Things like assuming that if you look for data based on a 'property prefix' the you will get back entries that spell the property names exactly matching this prefix. There are just too many places/assumptions like that for this to be an easy fix. It will be lengthy refactoring and require much more and better test coverage of the various special cases introduced by the fact that hyphens are removed from the canonical index keys.

I could do a 'quick and dirty' fix just for the specific snakeyaml case reported here but it would be just a bit of a hack and leave many other situations where snake case could be used still broken.

So I'll have to put more thought into how best to approach this and not break a bunch of things in the process. I'll need to put this on the backburner for a bit.

@kdvolder

This comment has been minimized.

Copy link
Member

commented Jul 24, 2019

Rather than keep this bug open for a long time, we decided to just do a 'quick and dirty' fix for the snake case support right now. Then at a later time we might do the bigger refactoring that actually 'does the right thing' w.r.t. using a 'canonical' representation of property names in indexing and lookups.

@wilkinsona

This comment has been minimized.

Copy link
Member Author

commented Aug 13, 2019

Thanks for the update. Is there an issue or Tracker story for the bigger refactoring?

@martinlippert

This comment has been minimized.

Copy link
Member

commented Aug 13, 2019

@wilkinsona Isn't that what #331 is about? Or do you mean something different?

@wilkinsona

This comment has been minimized.

Copy link
Member Author

commented Aug 13, 2019

Yep, thanks. Sorry, I missed it begin opened among the events above.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.