Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add json parser for input files #5993

Merged
merged 4 commits into from
Mar 12, 2018

Conversation

MFraters
Copy link
Contributor

@MFraters MFraters commented Mar 4, 2018

This adds the possibility of parsing json files to the parameter handler. I tried to follow the xml implementation as closely as possible, but I think the xml implementation contains a bug, which doesn't to use spaces in the names of variables. That is solved here by making the mangler public and using it to escape/mangle the text.

If it is agreed that this is a correct implementation, I can add some tests. We also might want to make demangle public.

@drwells
Copy link
Member

drwells commented Mar 4, 2018

Thanks for taking the time to implement and submit this 👍

I have a few comments:

  1. We should just make mangle and demangle internal functions that are defined only in parameter_handler.cc. If I understand this correctly we need it in the recursive parsing functions.
  2. Where are the json and XML parsers different? I was not able to spot the difference.
  3. Would it be possible to add a test that fails on master where we have an XML file with an invalid parameter name?

@MFraters
Copy link
Contributor Author

MFraters commented Mar 4, 2018

  1. Sure, I will move them. Yes that is correct, they are at least needed for the sjon parser, see point 3.

  2. The only difference is the lines with read_xml and read_json and the mangle/key lines. But looking more closely, I see that the xml parser has some code and styling restrictions especially for xml, which are not needed for json. So I will simplify the code a bit.

  3. The problem with the xml parser is a bit more fundamental. Calling it a bug might not have been entirely correct, since the description says it is only mend for data writen out by the parameter handler itself and the GUI, which is already mangled. The problem is that the xml standart doesn't support multi word keys, because a word after the space is considered to be an attribute. So that is the reason why the xml reader doesn't 'need' a mangler. We could still support multiword characters, because as far as I know we do not need attributes, but that is I think a different discussion.

@MFraters MFraters force-pushed the add_json_parser_for_input_files branch 2 times, most recently from 22993e8 to 6a2197b Compare March 4, 2018 14:21
@MFraters
Copy link
Contributor Author

MFraters commented Mar 4, 2018

@drwells Thanks for the comments btw :)

Just a little clarification on 3: in short yes, that would just be an xml file with non mangled parameter names.

I have updated the code and I think that I have addressed your comments.

@@ -963,6 +963,15 @@ class ParameterHandler : public Subscriptor
*/
virtual void parse_input_from_xml (std::istream &input);

/**
* Parse input from an XML stream to populate known parameter fields. This
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy paste error here and below: not XML but JSON

* Mangle a string so that it doesn't contain any special characters or
* spaces.
*/
static std::string mangle (const std::string &s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, thanks for moving these. They were part of the class from back in a time when anonymous namespaces weren't available or didn't work well. They're obviously better placed in the .cc file.

// declared in the ParameterHandler object); if so, copy the value of these
// nodes into the destination object
void
read_json_recursively (const boost::property_tree::ptree &source,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this function differ from the corresponding XML one? If I see this correctly, then the input is just the ptree, and it shouldn't matter whether we got that ptree by reading XML or reading JSON?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't differ yet. I am currently looking into supporting json arrays, and then it will differ quite a lot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wrong, it differs a tiny bit. The json actually uses mangle on the key, where the xml one doesn't. I personally think the xml could also use the mangle, but I am sure the json one needs it. So should I keep them separate, or add the mangle to the xml file? And should I change the name then to xml_json or leave it at xml?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually pretty sure the XML also uses the mangle on the keys -- because that's how we internally store the data in ParameterHandler: the key is always mangled in the ptree data structure.

You can see that if you compare the tests/parameter_handler_read_xml.cc and tests/prm/parameter_handler_read_xml.prm files. Look, for example, for the "double 1" and "string list" parameters that have spaces in them.

So if you're mangling on the key, I wonder if you're really mangling twice.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The declarations in tests/parameter_handler/parameter_handler_read_xml.cc have spaces in them, but the xml file tests/parameter_handler/prm/parameter_handler_read_xml.prm which is read in doesn't (see for example double_201):


<?xml version="1.0" encoding="utf-8"?>
<ParameterHandler><int1><value>2</value><default_value>1</default_value><documentation>doc 1</documentation><pattern>0</pattern><pattern_description>[Integer range -2147483648...2147483647 (inclusive)]</pattern_description></int1><int2><value>3</value><default_value>2</default_value><documentation>doc 2</documentation><pattern>1</pattern><pattern_description>[Integer range -2147483648...2147483647 (inclusive)]</pattern_description></int2><ss1><double_201><value>2.234</value><default_value>1.234</default_value><documentation>doc 3</documentation><pattern>2</pattern><pattern_description>[Double -1.79769e+308...1.79769e+308 (inclusive)]</pattern_description></double_201><ss2><double_202><value>5.321</value><default_value>4.321</default_value><documentation>doc 4</documentation><pattern>3</pattern><pattern_description>[Double -1.79769e+308...1.79769e+308 (inclusive)]</pattern_description></double_202></ss2></ss1><Testing_25testing><string_26list><value>__&lt; &amp; &gt; ; /</value><default_value>&lt; &amp; &gt; ; /</default_value><documentation>docs 1</documentation><pattern>4</pattern><pattern_description>[Anything]</pattern_description></string_26list><int_2aint><value>2</value><default_value>2</default_value><documentation/>
<pattern>5</pattern><pattern_description>[Integer range -2147483648...2147483647 (inclusive)]</pattern_description></int_2aint><double_2bdouble><value>7.1415926</value><default_value>6.1415926</default_value><documentation>docs 3</documentation><pattern>6</pattern><pattern_description>[Double -1.79769e+308...1.79769e+308 (inclusive)]</pattern_description></double_2bdouble></Testing_25testing></ParameterHandler>

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, exactly: the original text that had the spaces has been mangled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the original text which is read in. It is mangled because the xml writer outputs mangled text, because xml doesn't support multi word keys, and the xml ready seems to be written to only support xml which is writen out by the parameter handler.

Copy link
Contributor Author

@MFraters MFraters Mar 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what I am trying to say is that the xml read function is only tested with already mangled keys. (I think because xml itself doesn't support unmangled keys)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct -- the XML reader is written with XML files in mind that were written by the XML writer, and not by hand.

You can change that for the JSON though I would suggest that you keep the JSON reader and writer in sync so that you can read files back in that the JSON writer had previously output.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, I see what you mean. My preferred solution would be to always mangle and demangle like we do with the normal parameter output, but for now I will get it out and we can discuss this in a possible follow up patch.

@masterleinad
Copy link
Member

/run-tests

@bangerth
Copy link
Member

bangerth commented Mar 6, 2018

@MFraters -- we're big believers in an incremental approach. For the moment, can you remove the function I commented above and instead call the existing recursive function? Then add a test similar to the XML test? (Maybe also add a changelog entry.)

I think this is already some progress in functionality that's worth committing as is, and it should be relatively uncontroversial -- let's get that one in already.

In a second step, you can come back to the issue of how you want to represent JSON files that do not 1:1 map onto ParameterHandler objects, and in particular how you want to deal with arrays. That is a more controversial issue, and I think it would be useful to first open a github issue in which you can discuss how you want to approach this. We can then discuss this before you spend too much time on implementing the feature.

@MFraters MFraters force-pushed the add_json_parser_for_input_files branch from 6a2197b to 3e0a6a8 Compare March 6, 2018 18:15
@MFraters MFraters force-pushed the add_json_parser_for_input_files branch from 3e0a6a8 to af10170 Compare March 8, 2018 16:20
Copy link
Member

@bangerth bangerth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good with the exception of the two doc places. Can you fix those and let me know?

/**
* Parse input from an JSON stream to populate known parameter fields. This
* could be from a file originally written by the print_parameters() function
* using the XML output style and then modified by hand as necessary, or from
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

-> JSON

* could be from a file originally written by the print_parameters() function
* using the XML output style and then modified by hand as necessary, or from
* a file written using this method and then modified by the graphical
* parameter GUI (see the general documentation of this class).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take out this sentence. The GUI doesn't use JSON (though it could, in principle).

@bangerth
Copy link
Member

I just edited things myself. Let's let the tester run over this again, and if the tester is happy, anyone can commit!

@masterleinad masterleinad merged commit 06b5abd into dealii:master Mar 12, 2018
@bangerth
Copy link
Member

I forgot to ask for this: @MFraters, would you be willing to add an entry to doc/news/changes/minor as a follow-up pull request?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants