Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ConfigDocument API #280

Merged
merged 19 commits into from Mar 24, 2015
Merged

Conversation

MSLilah
Copy link
Contributor

@MSLilah MSLilah commented Mar 10, 2015

This work came out of the discussion in #272.

These commits add a new grouping of classes, ConfigNode, to the library, which are used to form an AST. This AST contains all the tokens from which it was parsed, so the original text can be reproduced from them.

I'm opening this up for review now as the implementation is complete, so I think that portion is ready for review. The testing code is rough right now, largely because of the difficulty involved with crafting instances of ConfigNodeComplexValue by hand, so I'm still working on improving that, although any feedback on that front is welcome.

This PR adds the ConfigDocument API in its entirety, including its backend representation with ConfigNodes, the ConfigDocument Parser, and the ConfigDocument Interface.

Add various ConfigNode classes to represent the various
ConfigNode types. All of these are fully functional, with the
exception of ConfigNodeComplexValue, which lacks the ability
to delete duplicates of a key or add a new key.
Remove repeats of a key when setting a value in a
ConfigNodeComplexValue. Add a new node type, ConfigNodeKeyValue,
to represent a key-value pair and its surrounding whitespace.
Add the desired path passed into the setValueOnPath method in
ConfigNodeComplexValue if that path does not exist in the node.
import com.typesafe.config.ConfigNode;

abstract class AbstractConfigNode implements ConfigNode {
final private Token token;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like the "responsibility" of this class is to be a node with only one token, so ConfigNodeSingleToken or something?

Some nodes will have multiple tokens right? Or not? I would expect the base class for all nodes to abstract that a little, maybe with something like protected Iterable tokens() and then for ConfigNodeSingleToken that would return Collections.singletonList(token) or the like.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then render would of course render the whole list of tokens. but maybe we have no multi-token nodes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@havocp As it stands right now, each token lives in its own node. However, some nodes have child nodes instead of Tokens.

@havocp
Copy link
Collaborator

havocp commented Mar 10, 2015

To avoid crafting complex nodes by hand, one approach would be to go ahead and implement the tokens-to-ConfigNode parser. That would also allow implementing a round-trip test that no tokens or formatting are lost in the parsing process and that the ConfigNode type hierarchy is complete. (Round trip test means: configNodeParser("any valid config stuff here").render().equals("any valid config stuff here") ... this can be tested for the files that are already in src/test/resources perhaps)

I think you could cut-and-paste Parser.java; the changes I see the need to make right away are:

  • you can drop all that TokenWithComments / attractsLeading/TrailingComments stuff and instead just generate comments nodes
  • generate nodes for, rather than discarding, tokens such as ignored whitespace and field separators
  • don't actually process includes, just make nodes for them
  • might want to split the path expression parsing stuff out of Parser.java into its own file so you can use it too

I think if your new parser doesn't have the TokenWithComments stuff and shares the path expression parsing stuff, it won't be that many lines of code, and the lines it has will be based on the Parser.java lines. Most of the verbosity in Parser.java is about nice error messages (which would indeed be nice to keep). Don't be afraid of cut-and-paste for the bulk of the parser, since we can clean that up later by making the existing Parser.java go from ConfigNode => ConfigValue instead of using tokens.

@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 10, 2015

@havocp Yeah, I was thinking that implementing the parser would make testing easier, but I felt like there was still some value in directly crafting config nodes for tests.

I have some more tests of config nodes that I haven't pushed up yet. Do you want me to push those up? And would you prefer the Parser be added to this PR or a new PR? I was planning to add in the parser in a new PR after this was merged, but I can add it into this PR if you'd like.

Add more robust tests for ConfigNode classes. Add tests to
test arrays, duplicate removal, and addition of non-existent
paths.
@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 10, 2015

@havocp I've added some more tests and will get to work on addressing your feedback, thanks!

@havocp
Copy link
Collaborator

havocp commented Mar 10, 2015

I agree that some direct tests for the nodes are good too. I'm thinking it might be worth adding the parser here because I'm having trouble feeling we have the node API right without also having the parser and roundtrip tests. What do you think?

What I would like to keep separate is the modification of Parser.java - mostly because that's when we risk breaking existing functionality. It would be nice if this PR is only adding new stuff, and then make it a separate step to unify with the old stuff.

(exception: maybe that path expression parser code can move into its own PathParser file, since that's hopefully just a simple move.)

@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 10, 2015

@havocp Hmm, that makes sense to me. This feature isn't really finished without the Parser, so if they get merged separately, we'll have an incomplete, in-progress feature in the repository. I'll address your feedback, then get to work on the new Parser. I'll add it as a commit to this PR once it's ready to be reviewed.

Address various PR feedback, including:
  * Remove unused map from ConfigNodeComplexValue
  * Stop caching KeyValue indexes in ConfigNodeComplexValue
  * Return an Iterable<Token> for the children() methods in
    ConfigNodeComplexValue and ConfigNodeKeyValue
  * Change all ConfigNode classes to implement AbstractConfigNode
  * Remove the constructor and the token instance variable
    from AbstractConfigNode. Make the render() method final and
    have it use a new tokens() method which returns the list
    of tokens contained by the node.
  * Stop caching values in ConfigNodeKeyValue
@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 11, 2015

@havocp I've just pushed up a new commit addressing most of your feedback. In regards to the path changes, I'm getting to work on that now, and will push that up in a separate commit.

Save the list of Tokens from which a Path was created when a
Path is parsed from a string in Parser.parsePath. Change
ConfigNodeKey to store a Path instead of a token.
@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 11, 2015

@havocp I've pushed up a new commit that saves the Tokens in Paths produced by Parser.parsePath. It should be ready for review, so any feedback is much appreciated!

Add copyright information to all ConfigNode classes. Document
the ConfigNode interface.
Extract the logic to parse a Path out of the Parser and into
a new PathParser class.
return children;
}

protected Collection<Token> tokens() {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this one should have @Override

@havocp
Copy link
Collaborator

havocp commented Mar 23, 2015

apologies for being slow here, have been traveling and so on.

I'm thinking there might be some work to make the ConfigNode subtypes a little more "semantic," like have nodes for includes and comments and so on. I'd think the public API has stuff like ConfigNodeComment rather than ConfigNodeSingleToken. The old parser then would do very little (drop purely syntactic nodes, process includes, ...)

Is that part of what you're doing now or does it not make sense ?

@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 23, 2015

@havocp Yeah, that's part of what I'm doing now for the subsequent PR. If you'd like I can merge what I've done (minus the Parser changes) into this PR.

@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 23, 2015

Also, no problem on the response time! We know you've been busy with Scala Days and such. :)

@havocp
Copy link
Collaborator

havocp commented Mar 23, 2015

It's fine to keep it separate; I'll get this one merged. I don't think I have any more substantive comments on this PR I just wanted to at least read all the code.

@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 23, 2015

@havocp Great! Let me know if you have any feedback, I'm happy to make any more changes you think are necessary (particularly in regards to test coverage).

Make the javadoc string for the ConfigDocument interface more
explicit about what it does with the value string when setting
a value. Add support for parsing a reader into a ConfigDocument.
@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 23, 2015

@havocp I noticed you had given me some feedback around the ConfigDocument that I hadn't addressed yet, I pushed up a new commit that addresses it.

* @param path the path at which to set the desired value
* @param newValue the value to set at the desired path, represented as a string. This
* string will be parsed into a ConfigNode, and the text will be inserted
* as-is into the document, with leading and trailing whitespace removed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trimming whitespace surprises me a little - if we're going to some lengths here to let people specify the exact text they want, why not let them have extra whitespace? they can trim it if they want ...

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you parse a JSON file then set a value that uses HOCON syntax, what happens?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In terms of whitespace trimming, this was primarily due to the fact that the desired value could be a simple value, which does not support multiple tokens/nodes. We could modify that for sure, though.

In terms of your question on value-setting: the ConfigDocument object saves the options which were used to parse it, and then uses those to parse any values passed to the setValue() method. If you parse a JSON file then set a value that uses HOCON syntax, it should conceivably throw a parse error. I don't have a test for that, but could easily add one.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems worth testing to me, since breaking that behavior could certainly break an app (and it seems easy to break)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've written a test and verified the behavior.

One thing to note: when parsing a single JSON value for node replacement, if a concatenation is given, the parser will simply parse the first value in the concatenation into a node and ignore all other items in the concatenation. Is that acceptable? Or would you prefer it error out when a concatenation is given and JSON is being parsed?

Also, do you want me to modify the setValue method to NOT trim whitespace? I think it could get a little weird, since this is written in such a way that the parent object keeps track of leading and trailing whitespace, so we'd have to either stick the value with leading and trailing whitespace into a concatenation node, make a new wrapper for simple config nodes that contains the whitespace, make ConfigNodeSimpleValue multi-token but not multi-node, or have a way to insert multiple nodes at once into a config node instead of just one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm thinking that a concatenation is a non-valid value in JSON, so I'll have it throw an exception if there's a concatenation in JSON.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it should probably parse the entire concatenation, which would be a "value" in this sense, but if that's hard it would be better to throw an error than to silently ignore stuff.

I guess if we don't trim whitespace a weird thing results where setting a value isn't idempotent, unless we made setting a value remove all whitespace around the existing value. Not desirable. What about comments? they pose similar issues.

A simple punt would be "anything except the value itself is an error" - that would allow us to be more forgiving in the future without breaking people.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I missed that this was in JSON. agreed invalid json should throw.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'm not entirely sure how parsing the whole concatenation would work if it's JSON. Since the rendered output of the node created by the single value parsing needs to be valid JSON, we'd need to modify the rendered result so it's valid JSON. If we're combining simple values that's not a huge deal (can just put quotes around the rendered output) but when we get to objects and arrays we run into a problem wherein we need to merge not just the values but the text contained within them to form a single object, which seems pretty complicated. Sorry, missed your last comment

Sounds like throwing an error if there's surrounding whitespace might be a good way to go. I could also add a new Node type or make ConfigNodeSimpleValue multi-token, though, either of those two approaches shouldn't be difficult.

@havocp
Copy link
Collaborator

havocp commented Mar 23, 2015

The test coverage looks reasonable to me, I'm also thinking that once you port the existing parser to use this one, we'll get a ton of coverage "for free" from all the existing tests.

@havocp
Copy link
Collaborator

havocp commented Mar 23, 2015

the only outright bug I saw was the !=/equals thing, so unless I'm confused there we should probably fix it. otherwise, I've been through this code and it looks good to merge.

@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 23, 2015

@havocp Great! I'll go through and address your last bits of feedback, hopefully it won't take long

@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 23, 2015

Thanks for the review!

Implement an `equals()` method for AbstractConfigNodes. Add a
test to ensure an exception is thrown when calling setValue()
on a ConfigDocument when passing in a value with HOCON syntax
when the document was parsed as JSON. Add a setValue() method to
the ConfigDocument interface that takes a ConfigValue instead of
a String. When parsing a single value, throw an exception if a
concatenation is seen when parsing JSON.
@havocp
Copy link
Collaborator

havocp commented Mar 24, 2015

throwing on any kind of stuff (comments, whitespace, concatenation not supported by JSON) around the value in setValue makes sense to me, because otherwise it gets into all kinds of thorny issues where we can have multiple trees possible for the same rendered file.

@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 24, 2015

@havocp Alright, I'll get to work now on making a new simple value class that is multi-node to contain whitespace/comments, which will be used solely in the case of setting a value.

@havocp
Copy link
Collaborator

havocp commented Mar 24, 2015

I was trying to say don't do that - I don't think we want the whitespace/comments to be part of the value node sometimes, and part of the parent node sometimes. Also, I don't think we want setting a value to be non-idempotent. Instead, setValue could give up and throw if there's any "extras" around the value. That way we can punt on all the semantics of whitespace/comments.

@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 24, 2015

Ah, sorry, misunderstood what you were saying! That should be a fairly quick fix.

Update the single value parsing method to throw an exception
when there is leading or trailing whitespace. Modify concatenation
parsing in the ConfigDocumentParser to put back any trailing
whitespace. Add a hashCode method for AbstractConfigNodes.
@MSLilah
Copy link
Contributor Author

MSLilah commented Mar 24, 2015

@havocp Alright, I've addressed your last few comments!

assertTrue(e.getMessage.contains("ConfigDocument had an array at the root level"))
}
assertTrue(exceptionThrown)
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for future reference, you can write this as val e = intercept[ConfigException] { document.setValue("a", "1") } which asserts that the exception of that type is in fact thrown, then you can check the message.

havocp added a commit that referenced this pull request Mar 24, 2015
@havocp havocp merged commit 25a9f91 into lightbend:master Mar 24, 2015
@havocp
Copy link
Collaborator

havocp commented Mar 24, 2015

Merged, thanks very much!

@havocp havocp mentioned this pull request Dec 3, 2018
hkupty added a commit to hkupty/config that referenced this pull request May 18, 2021
Usage of this array seems to have been dropped by lightbend#280, but the array
remained.

Cleaning up so it doesn't allocate the array needlessly anymore.

Fixes lightbend#730.
hkupty added a commit to hkupty/config that referenced this pull request May 21, 2021
Usage of this array seems to have been dropped by lightbend#280, but the array
remained.

Cleaning up so it doesn't allocate the array needlessly anymore.

Fixes lightbend#730.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants