Add ConfigDocument API #280

MSLilah · 2015-03-10T16:29:49Z

This work came out of the discussion in #272.

These commits add a new grouping of classes, ConfigNode, to the library, which are used to form an AST. This AST contains all the tokens from which it was parsed, so the original text can be reproduced from them.

I'm opening this up for review now as the implementation is complete, so I think that portion is ready for review. The testing code is rough right now, largely because of the difficulty involved with crafting instances of ConfigNodeComplexValue by hand, so I'm still working on improving that, although any feedback on that front is welcome.

This PR adds the ConfigDocument API in its entirety, including its backend representation with ConfigNodes, the ConfigDocument Parser, and the ConfigDocument Interface.

Add various ConfigNode classes to represent the various ConfigNode types. All of these are fully functional, with the exception of ConfigNodeComplexValue, which lacks the ability to delete duplicates of a key or add a new key.

Remove repeats of a key when setting a value in a ConfigNodeComplexValue. Add a new node type, ConfigNodeKeyValue, to represent a key-value pair and its surrounding whitespace.

Add the desired path passed into the setValueOnPath method in ConfigNodeComplexValue if that path does not exist in the node.

havocp · 2015-03-10T19:47:44Z

config/src/main/java/com/typesafe/config/impl/AbstractConfigNode.java

+import com.typesafe.config.ConfigNode;
+
+abstract class AbstractConfigNode implements ConfigNode {
+    final private Token token;


it seems like the "responsibility" of this class is to be a node with only one token, so ConfigNodeSingleToken or something?

Some nodes will have multiple tokens right? Or not? I would expect the base class for all nodes to abstract that a little, maybe with something like protected Iterable tokens() and then for ConfigNodeSingleToken that would return Collections.singletonList(token) or the like.

then render would of course render the whole list of tokens. but maybe we have no multi-token nodes?

@havocp As it stands right now, each token lives in its own node. However, some nodes have child nodes instead of Tokens.

havocp · 2015-03-10T20:19:26Z

To avoid crafting complex nodes by hand, one approach would be to go ahead and implement the tokens-to-ConfigNode parser. That would also allow implementing a round-trip test that no tokens or formatting are lost in the parsing process and that the ConfigNode type hierarchy is complete. (Round trip test means: configNodeParser("any valid config stuff here").render().equals("any valid config stuff here") ... this can be tested for the files that are already in src/test/resources perhaps)

I think you could cut-and-paste Parser.java; the changes I see the need to make right away are:

you can drop all that TokenWithComments / attractsLeading/TrailingComments stuff and instead just generate comments nodes
generate nodes for, rather than discarding, tokens such as ignored whitespace and field separators
don't actually process includes, just make nodes for them
might want to split the path expression parsing stuff out of Parser.java into its own file so you can use it too

I think if your new parser doesn't have the TokenWithComments stuff and shares the path expression parsing stuff, it won't be that many lines of code, and the lines it has will be based on the Parser.java lines. Most of the verbosity in Parser.java is about nice error messages (which would indeed be nice to keep). Don't be afraid of cut-and-paste for the bulk of the parser, since we can clean that up later by making the existing Parser.java go from ConfigNode => ConfigValue instead of using tokens.

MSLilah · 2015-03-10T20:24:41Z

@havocp Yeah, I was thinking that implementing the parser would make testing easier, but I felt like there was still some value in directly crafting config nodes for tests.

I have some more tests of config nodes that I haven't pushed up yet. Do you want me to push those up? And would you prefer the Parser be added to this PR or a new PR? I was planning to add in the parser in a new PR after this was merged, but I can add it into this PR if you'd like.

Add more robust tests for ConfigNode classes. Add tests to test arrays, duplicate removal, and addition of non-existent paths.

MSLilah · 2015-03-10T20:56:35Z

@havocp I've added some more tests and will get to work on addressing your feedback, thanks!

havocp · 2015-03-10T20:58:00Z

I agree that some direct tests for the nodes are good too. I'm thinking it might be worth adding the parser here because I'm having trouble feeling we have the node API right without also having the parser and roundtrip tests. What do you think?

What I would like to keep separate is the modification of Parser.java - mostly because that's when we risk breaking existing functionality. It would be nice if this PR is only adding new stuff, and then make it a separate step to unify with the old stuff.

(exception: maybe that path expression parser code can move into its own PathParser file, since that's hopefully just a simple move.)

MSLilah · 2015-03-10T21:01:39Z

@havocp Hmm, that makes sense to me. This feature isn't really finished without the Parser, so if they get merged separately, we'll have an incomplete, in-progress feature in the repository. I'll address your feedback, then get to work on the new Parser. I'll add it as a commit to this PR once it's ready to be reviewed.

Address various PR feedback, including: * Remove unused map from ConfigNodeComplexValue * Stop caching KeyValue indexes in ConfigNodeComplexValue * Return an Iterable<Token> for the children() methods in ConfigNodeComplexValue and ConfigNodeKeyValue * Change all ConfigNode classes to implement AbstractConfigNode * Remove the constructor and the token instance variable from AbstractConfigNode. Make the render() method final and have it use a new tokens() method which returns the list of tokens contained by the node. * Stop caching values in ConfigNodeKeyValue

MSLilah · 2015-03-11T17:04:15Z

@havocp I've just pushed up a new commit addressing most of your feedback. In regards to the path changes, I'm getting to work on that now, and will push that up in a separate commit.

Save the list of Tokens from which a Path was created when a Path is parsed from a string in Parser.parsePath. Change ConfigNodeKey to store a Path instead of a token.

MSLilah · 2015-03-11T18:30:30Z

@havocp I've pushed up a new commit that saves the Tokens in Paths produced by Parser.parsePath. It should be ready for review, so any feedback is much appreciated!

Add copyright information to all ConfigNode classes. Document the ConfigNode interface.

Extract the logic to parse a Path out of the Parser and into a new PathParser class.

havocp · 2015-03-12T00:38:59Z

config/src/main/java/com/typesafe/config/impl/ConfigNodeComplexValue.java

+        return children;
+    }
+
+    protected Collection<Token> tokens() {


this one should have @Override

havocp · 2015-03-23T20:25:18Z

apologies for being slow here, have been traveling and so on.

I'm thinking there might be some work to make the ConfigNode subtypes a little more "semantic," like have nodes for includes and comments and so on. I'd think the public API has stuff like ConfigNodeComment rather than ConfigNodeSingleToken. The old parser then would do very little (drop purely syntactic nodes, process includes, ...)

Is that part of what you're doing now or does it not make sense ?

MSLilah · 2015-03-23T20:36:41Z

@havocp Yeah, that's part of what I'm doing now for the subsequent PR. If you'd like I can merge what I've done (minus the Parser changes) into this PR.

MSLilah · 2015-03-23T20:42:26Z

Also, no problem on the response time! We know you've been busy with Scala Days and such. :)

havocp · 2015-03-23T20:47:05Z

It's fine to keep it separate; I'll get this one merged. I don't think I have any more substantive comments on this PR I just wanted to at least read all the code.

MSLilah · 2015-03-23T20:51:55Z

@havocp Great! Let me know if you have any feedback, I'm happy to make any more changes you think are necessary (particularly in regards to test coverage).

Make the javadoc string for the ConfigDocument interface more explicit about what it does with the value string when setting a value. Add support for parsing a reader into a ConfigDocument.

MSLilah · 2015-03-23T21:04:30Z

@havocp I noticed you had given me some feedback around the ConfigDocument that I hadn't addressed yet, I pushed up a new commit that addresses it.

havocp · 2015-03-23T21:08:41Z

config/src/main/java/com/typesafe/config/ConfigDocument.java

+     * @param path the path at which to set the desired value
+     * @param newValue the value to set at the desired path, represented as a string. This
+     *                 string will be parsed into a ConfigNode, and the text will be inserted
+     *                 as-is into the document, with leading and trailing whitespace removed.


trimming whitespace surprises me a little - if we're going to some lengths here to let people specify the exact text they want, why not let them have extra whitespace? they can trim it if they want ...

if you parse a JSON file then set a value that uses HOCON syntax, what happens?

In terms of whitespace trimming, this was primarily due to the fact that the desired value could be a simple value, which does not support multiple tokens/nodes. We could modify that for sure, though.

In terms of your question on value-setting: the ConfigDocument object saves the options which were used to parse it, and then uses those to parse any values passed to the setValue() method. If you parse a JSON file then set a value that uses HOCON syntax, it should conceivably throw a parse error. I don't have a test for that, but could easily add one.

seems worth testing to me, since breaking that behavior could certainly break an app (and it seems easy to break)

I've written a test and verified the behavior.

One thing to note: when parsing a single JSON value for node replacement, if a concatenation is given, the parser will simply parse the first value in the concatenation into a node and ignore all other items in the concatenation. Is that acceptable? Or would you prefer it error out when a concatenation is given and JSON is being parsed?

Also, do you want me to modify the setValue method to NOT trim whitespace? I think it could get a little weird, since this is written in such a way that the parent object keeps track of leading and trailing whitespace, so we'd have to either stick the value with leading and trailing whitespace into a concatenation node, make a new wrapper for simple config nodes that contains the whitespace, make ConfigNodeSimpleValue multi-token but not multi-node, or have a way to insert multiple nodes at once into a config node instead of just one.

Actually, I'm thinking that a concatenation is a non-valid value in JSON, so I'll have it throw an exception if there's a concatenation in JSON.

I think it should probably parse the entire concatenation, which would be a "value" in this sense, but if that's hard it would be better to throw an error than to silently ignore stuff.

I guess if we don't trim whitespace a weird thing results where setting a value isn't idempotent, unless we made setting a value remove all whitespace around the existing value. Not desirable. What about comments? they pose similar issues.

A simple punt would be "anything except the value itself is an error" - that would allow us to be more forgiving in the future without breaking people.

oh I missed that this was in JSON. agreed invalid json should throw.

Hmm, I'm not entirely sure how parsing the whole concatenation would work if it's JSON. Since the rendered output of the node created by the single value parsing needs to be valid JSON, we'd need to modify the rendered result so it's valid JSON. If we're combining simple values that's not a huge deal (can just put quotes around the rendered output) but when we get to objects and arrays we run into a problem wherein we need to merge not just the values but the text contained within them to form a single object, which seems pretty complicated. Sorry, missed your last comment

Sounds like throwing an error if there's surrounding whitespace might be a good way to go. I could also add a new Node type or make ConfigNodeSimpleValue multi-token, though, either of those two approaches shouldn't be difficult.

havocp · 2015-03-23T21:32:50Z

The test coverage looks reasonable to me, I'm also thinking that once you port the existing parser to use this one, we'll get a ton of coverage "for free" from all the existing tests.

havocp · 2015-03-23T21:34:12Z

the only outright bug I saw was the !=/equals thing, so unless I'm confused there we should probably fix it. otherwise, I've been through this code and it looks good to merge.

MSLilah · 2015-03-23T21:45:46Z

@havocp Great! I'll go through and address your last bits of feedback, hopefully it won't take long

MSLilah · 2015-03-23T21:45:55Z

Thanks for the review!

Implement an `equals()` method for AbstractConfigNodes. Add a test to ensure an exception is thrown when calling setValue() on a ConfigDocument when passing in a value with HOCON syntax when the document was parsed as JSON. Add a setValue() method to the ConfigDocument interface that takes a ConfigValue instead of a String. When parsing a single value, throw an exception if a concatenation is seen when parsing JSON.

havocp · 2015-03-24T00:14:27Z

throwing on any kind of stuff (comments, whitespace, concatenation not supported by JSON) around the value in setValue makes sense to me, because otherwise it gets into all kinds of thorny issues where we can have multiple trees possible for the same rendered file.

MSLilah · 2015-03-24T15:56:37Z

@havocp Alright, I'll get to work now on making a new simple value class that is multi-node to contain whitespace/comments, which will be used solely in the case of setting a value.

havocp · 2015-03-24T15:59:01Z

I was trying to say don't do that - I don't think we want the whitespace/comments to be part of the value node sometimes, and part of the parent node sometimes. Also, I don't think we want setting a value to be non-idempotent. Instead, setValue could give up and throw if there's any "extras" around the value. That way we can punt on all the semantics of whitespace/comments.

MSLilah · 2015-03-24T16:00:16Z

Ah, sorry, misunderstood what you were saying! That should be a fairly quick fix.

Update the single value parsing method to throw an exception when there is leading or trailing whitespace. Modify concatenation parsing in the ConfigDocumentParser to put back any trailing whitespace. Add a hashCode method for AbstractConfigNodes.

MSLilah · 2015-03-24T17:16:29Z

@havocp Alright, I've addressed your last few comments!

havocp · 2015-03-24T17:36:49Z

config/src/test/scala/com/typesafe/config/impl/ConfigDocumentTest.scala

+                assertTrue(e.getMessage.contains("ConfigDocument had an array at the root level"))
+        }
+        assertTrue(exceptionThrown)
+    }


for future reference, you can write this as val e = intercept[ConfigException] { document.setValue("a", "1") } which asserts that the exception of that type is in fact thrown, then you can check the message.

Add ConfigDocument API

havocp · 2015-03-24T17:37:15Z

Merged, thanks very much!

Usage of this array seems to have been dropped by lightbend#280, but the array remained. Cleaning up so it doesn't allocate the array needlessly anymore. Fixes lightbend#730.

MSLilah added 3 commits March 10, 2015 09:23

Add ConfigNode classes

13f2cb3

Add various ConfigNode classes to represent the various ConfigNode types. All of these are fully functional, with the exception of ConfigNodeComplexValue, which lacks the ability to delete duplicates of a key or add a new key.

Remove repeats when setting value in ConfigNode

faf8d42

Remove repeats of a key when setting a value in a ConfigNodeComplexValue. Add a new node type, ConfigNodeKeyValue, to represent a key-value pair and its surrounding whitespace.

Add path to ConfigNodeComplexValue if nonexistent

fa0aaea

Add the desired path passed into the setValueOnPath method in ConfigNodeComplexValue if that path does not exist in the node.

havocp reviewed Mar 10, 2015
View reviewed changes

Add more robust ConfigNode tests

3166db4

Add more robust tests for ConfigNode classes. Add tests to test arrays, duplicate removal, and addition of non-existent paths.

Keep tokens in Path

cce8c20

Save the list of Tokens from which a Path was created when a Path is parsed from a string in Parser.parsePath. Change ConfigNodeKey to store a Path instead of a token.

Add docs and copyright information

0a804de

Add copyright information to all ConfigNode classes. Document the ConfigNode interface.

MSLilah force-pushed the task/create-ConfigNode branch from 560bd9b to 0a804de Compare March 11, 2015 20:35

Extract Path parsing into new class

b19e38f

Extract the logic to parse a Path out of the Parser and into a new PathParser class.

havocp reviewed Mar 12, 2015
View reviewed changes

Address some more PR feedback

c3a2e07

Make the javadoc string for the ConfigDocument interface more explicit about what it does with the value string when setting a value. Add support for parsing a reader into a ConfigDocument.

havocp reviewed Mar 23, 2015
View reviewed changes

Update single value parsing

b70d4c1

Update the single value parsing method to throw an exception when there is leading or trailing whitespace. Modify concatenation parsing in the ConfigDocumentParser to put back any trailing whitespace. Add a hashCode method for AbstractConfigNodes.

havocp reviewed Mar 24, 2015
View reviewed changes

havocp added a commit that referenced this pull request Mar 24, 2015

Merge pull request #280 from fpringvaldsen/task/create-ConfigNode

25a9f91

Add ConfigDocument API

havocp merged commit 25a9f91 into lightbend:master Mar 24, 2015

havocp mentioned this pull request Dec 3, 2018

Empty comments #603

Open

hkupty added a commit to hkupty/config that referenced this pull request May 18, 2021

Remove unnecessary array

5a4bb04

Usage of this array seems to have been dropped by lightbend#280, but the array remained. Cleaning up so it doesn't allocate the array needlessly anymore. Fixes lightbend#730.

This was referenced May 18, 2021

Remove unnecessary array hkupty/config#1

Closed

Remove unnecessary array #731

Merged

hkupty added a commit to hkupty/config that referenced this pull request May 21, 2021

Remove unnecessary array

fca5de9

Usage of this array seems to have been dropped by lightbend#280, but the array remained. Cleaning up so it doesn't allocate the array needlessly anymore. Fixes lightbend#730.

Add ConfigDocument API #280

Add ConfigDocument API #280

Conversation

MSLilah commented Mar 10, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

havocp commented Mar 10, 2015

MSLilah commented Mar 10, 2015

MSLilah commented Mar 10, 2015

havocp commented Mar 10, 2015

MSLilah commented Mar 10, 2015

MSLilah commented Mar 11, 2015

MSLilah commented Mar 11, 2015

Choose a reason for hiding this comment

havocp commented Mar 23, 2015

MSLilah commented Mar 23, 2015

MSLilah commented Mar 23, 2015

havocp commented Mar 23, 2015

MSLilah commented Mar 23, 2015

MSLilah commented Mar 23, 2015

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

havocp commented Mar 23, 2015

havocp commented Mar 23, 2015

MSLilah commented Mar 23, 2015

MSLilah commented Mar 23, 2015

havocp commented Mar 24, 2015

MSLilah commented Mar 24, 2015

havocp commented Mar 24, 2015

MSLilah commented Mar 24, 2015

MSLilah commented Mar 24, 2015

Choose a reason for hiding this comment

havocp commented Mar 24, 2015