Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Configuration of I/O of ParameterHandler #9888

Closed
peterrum opened this issue Apr 14, 2020 · 57 comments · Fixed by #9919
Closed

Configuration of I/O of ParameterHandler #9888

peterrum opened this issue Apr 14, 2020 · 57 comments · Fixed by #9919
Milestone

Comments

@peterrum
Copy link
Member

Recently there have been a couple of additions to the ParameterHandler class, in particular to its I/O functionalities.

In the case of output, the additions are:

  • ShortXML and ShortJSON
  • the optional sort_alphabetical flag

To handle all the new cases and the old cases and to be backwards compatible, we could modify the OutputStyle the following way:

  enum OutputStyle {
    with_documentation = 0,                      //  0
    without_documentation = 1,                   //  1

    sort_alphabetically = 2                      //  2

    Text = 4,                                    //  4
    ShortText = Text | without_documentation,    //  5

    LaTex = 8,                                   //  8

    Description = 16,                            // 16

    JSON = 32,                                   // 32
    ShortXML = XML | without_documentation,      // 33

    JSON = 64,                                   // 64
    ShortJSON = JSON | without_documentation,    // 65
  }

Any options on such a way to configure the output style (motivated by update flags of FEValues)?

In the case of input, the additions are:

  • skip_undefined for XML and JSON files
  • the flag assert_mandatory_entries_are_found

Wouldn't it be a good time to introduce a struct for these so that the list of defaulted parameters in function call doesn't become too long?

@elauksap @nfehn This might be also relevant for you.

@peterrum peterrum added this to the Release 9.2 milestone Apr 14, 2020
@nfehn
Copy link
Contributor

nfehn commented Apr 14, 2020

Is the idea to use sort_alphabetically similarly to without_documentation and introduce further types of OutputStyle?

My way to address the question regarding a good design would be to look at the code and choose a design for which the implementation is most simple and easy to understand. In my experience this also leads to a user friendly interface. I'm not sure if the user interface is simpler when setting up a data structure instead of calling the function with an additional parameter. The fact that both standard and short versions are entities of OutputStyle leads to cumbersome if-statements in the code currently, which would probably become more complicated if the number of parameters is further increased. Although this goes in a different direction, I would actually like to make efforts to remove the Short versions from OutputStyle if incompatible changes will be made in this context. An enum type with combinations of parameters might have a higher complexity than a set of two or three independent parameters?

Can you give a concrete example illustrating that the suggested design leads to simplifications?

@luca-heltai
Copy link
Member

luca-heltai commented Apr 14, 2020

This would allow a code like

prm.print_parameters("file.prm", Text | Short | Alphabetical | MarkMandatory);

which is much better, in my opinion, w.r.t

prm.print_parameters("file.prm", ShortText, true, false, false, ...);

where MarkMandatory is just an example of an additional argument that will find its way in the implementation at some point.

The code path is actually simpler:

if(style & ShortText) 
  do short
if(style & Alphabetical)
  do alphabetical

etc. Why do you think it would be more complicated?

@pcafrica
Copy link
Contributor

pcafrica commented Apr 14, 2020

Agree with @luca-heltai!
Such a switch can just be implemented in the same fashion as the already-mentioned UpdateFlags in FEValues. I find it more readable and, most of all, more easily mantainable in case new flags need to be added in the future!

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

What about such an Assert in ParameterHandler::recursively_print_parameters()

  // this function should not be necessary for XML or JSON output...
  Assert((style != XML) && (style != JSON), ExcInternalError());

@luca-heltai Maybe I did not understand you correctly, but you say that ShortXML will also run into this assert and throw an exception? so you will not have to change this part of the code, or do you (XML=32, ShortXML=33)?

@pcafrica
Copy link
Contributor

It should be enough to check (style & XML) == 0, this will consider both XML and ShortXML automatically.

What about such an Assert in ParameterHandler::recursively_print_parameters()

  // this function should not be necessary for XML or JSON output...
  Assert((style != XML) && (style != JSON), ExcInternalError());

@luca-heltai Maybe I did not understand you correctly, but you say that ShortXML will also run into this assert and throw an exception? so you will not have to change this part of the code, or do you (XML=32, ShortXML=33)?

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

So we can consider this a bug in the current implementation? My point is that you have to change the code, not that you can fix it.

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

@luca-heltai Regarding complecity, how does the number of entities of OutputStyle grow with the number of independent parameters for your example Text | Short | Alphabetical | MarkMandatory?

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

Another example taken from ParameterHandler::print_parameters()

  if (style == ShortXML || style == ShortJSON)
    {
      recursively_remove_documentation_from_tree(current_entries);
    }

  if (style == XML || style == ShortXML)
    {
      boost::property_tree::ptree single_node_tree;
      single_node_tree.add_child("ParameterHandler", current_entries);

      write_xml(out, single_node_tree);
      return out;
    }

  if (style == JSON || style == ShortJSON)
    {
      write_json(out, current_entries);
      return out;
    }

A better design in my opinion is using a bool short along with enum OutputStyle { Text, JSON, XML}

  if (short)
    {
      recursively_remove_documentation_from_tree(current_entries);
    }

  if (style == XML)
    {
      boost::property_tree::ptree single_node_tree;
      single_node_tree.add_child("ParameterHandler", current_entries);

      write_xml(out, single_node_tree, short);
      return out;
    }

  if (style == JSON)
    {
      write_json(out, current_entries, short);
      return out;
    }

It should be enough to check (style & XML) == 0, this will consider both XML and ShortXML automatically.

Maybe one can also write the above code example more elegantly with the new OutputStyle design, but my suggestion does not make implicit assumptions about how OutputStyle has been constructed, the parameters are completely independent. For me, this is clearly easier to understand and less complex.

@pcafrica
Copy link
Contributor

So we can consider this a bug in the current implementation? My point is that you have to change the code, not that you can fix it.

I suspect it is wrong, also ShortXML should be asserted as well!

@luca-heltai Regarding complecity, how does the number of entities of OutputStyle grow with the number of independent parameters for your example Text | Short | Alphabetical | MarkMandatory?

OutputStyle is going to contain only the list of independent parameters, not all the possible combination! One bit for short/long, one bit for alphabetical/unsorted, one bit for each style (Text, XML, ecc) and so on.
Then all the combinations are exposed to the user via the bitwise operators.

I think we should take care of not giving the user the possibility to make mistakes, such as choosing Text | XML | JSON. Throwing an exception should be enough to prevent this situation to happen.

@pcafrica
Copy link
Contributor

Another example taken from ParameterHandler::print_parameters()

  if (style == ShortXML || style == ShortJSON)
    {
      recursively_remove_documentation_from_tree(current_entries);
    }

  if (style == XML || style == ShortXML)
    {
      boost::property_tree::ptree single_node_tree;
      single_node_tree.add_child("ParameterHandler", current_entries);

      write_xml(out, single_node_tree);
      return out;
    }

  if (style == JSON || style == ShortJSON)
    {
      write_json(out, current_entries);
      return out;
    }

A better design in my opinion is using a bool short along with enum OutputStyle { Text, JSON, XML}

  if (short)
    {
      recursively_remove_documentation_from_tree(current_entries);
    }

  if (style == XML)
    {
      boost::property_tree::ptree single_node_tree;
      single_node_tree.add_child("ParameterHandler", current_entries);

      write_xml(out, single_node_tree, short);
      return out;
    }

  if (style == JSON)
    {
      write_json(out, current_entries, short);
      return out;
    }

It should be enough to check (style & XML) == 0, this will consider both XML and ShortXML automatically.

Maybe one can also write the above code example more elegantly with the new OutputStyle design, but my suggestion does not make implicit assumptions about how OutputStyle has been constructed, the parameters are completely independent. For me, this is clearly easier to understand and less complex.

That's a good and elegant solution for sure, but - as already stated by @luca-heltai - it is poorly scalable as the number of inputs increases!

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

I think we should take care of not giving the user the possibility to make mistakes, such as choosing Text | XML | JSON. Throwing an exception should be enough to prevent this situation to happen.

Why not choosing a design where errors are excluded by design?

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

it is poorly scalable as the number of inputs increases!

What do you mean exactly by scalable? Do you refer to concrete examples from the code or is this a worst case scenario? I think we should take into account that not every parameter will enter the same function.

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

to be more precise, not every parameter will enter every function within the implementation

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

OutputStyle is going to contain only the list of independent parameters, not all the possible combination! One bit for short/long, one bit for alphabetical/unsorted, one bit for each style (Text, XML, ecc) and so on.

So OutputStyle would look differently than introduced in the very beginning? How would it look like?

@pcafrica
Copy link
Contributor

it is poorly scalable as the number of inputs increases!

What do you mean exactly by scalable? Do you refer to concrete examples from the code or is this a worst case scenario? I think we should take into account that not every parameter will enter the same function.

I only mean that the number of inputs would necessarily grow in case new options need to be added: one more input for every new option to set.

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

Ok, I agree. Maybe I just don't know what you plan for the future. Regarding the current code version, I actually do not have the impression that interfaces are too fat.

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

@luca-heltai: Regarding your example

This would allow a code like

prm.print_parameters("file.prm", Text | Short | Alphabetical | MarkMandatory);

which is much better, in my opinion, w.r.t

prm.print_parameters("file.prm", ShortText, true, false, false, ...);

Why don't you need a class name or namespace in front of Text, Short, etc.?

@peterrum
Copy link
Member Author

Thanks for the vivid discussion! Let's wait until the weekend to make a final decision. Personally I would favor the new enum-approach.

@nfehn Regarding your comments:

  • the line shown in Configuration of I/O of ParameterHandler #9888 (comment) is obviously a bug (introduced by me - since I have not added ShortXML and ShortJSON there) but should be fixed with the new enum approach by a simple bit operation.

  • I see your point that not all options are available for formats. However, this would be the first step towards making the functions more similar (and maybe the first step towards Deprecate ParameterHandler::parse_input_...() functions to make them private in the long run? #9878). Valid bit combinations could be asserted.

  • The reason why, I opened this issue is that some of the parameters are new since the last release and we could still modify them without backwards-compatibility headaches if we see the need. I am pretty sure there will be additional parameters in the future.

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

I hope you don't feel accused because of the bug but I am convinced that we have to discuss these things concretely and not with foo-bar toy examples. Nevertheless, the bug shows that the design is error prone and errors pass peer-review because of its complexity. We simply have to accept that we make mistakes and should therefore choose a design that minimizes these kinds of errors. I am sharing so many concerns here because now we have the chance to change it for the better. I think it is a false conclusion to believe that the complexity of a code is purely defined by the number of parameters of functions. What do we gain from a maybe nicer user interface on the application level if this costs us a series of bug fixes, not mentioning the difficulties people will encounter when trying to make changes to the code.

It has also been referred to FEValues more than once in the above discussion, for me this is also a sign of the complexity of this approach, i.e. we need an analogy to a well known concept of the dealii community to explain it, it is not self-explanatory. Let me therefore raise another concern: Is it really a good design that one needs the concept of UpdateFlags in mind to make changes to ParameterHandler? I think we should make efforts that ease to modify and extend functionality of the library by "deal.II newcomers" (not everyone is a dealii-developer with decades of experience).

@pcafrica
Copy link
Contributor

pcafrica commented Apr 15, 2020

So OutputStyle would look differently than introduced in the very beginning? How would it look like?

It can be something like:

enum OutputStyle {
  Short = 0x0001,
  Alphabetical = 0x0002,
  MarkMandatory = 0x0004,

  Text = 0x0010,
  LaTex = 0x0020,
  Description = 0x0040,
  XML = 0x0080,
  JSON = 0x0100,
}

Then you can check (omitting the namespace for brevity):

if ((style & Text) && (style & Short)) // ShortText
  ...
if ((style & XML) && !(style & Alphabetical)) // Unsorted XML with documentation.
  ...

and so on.

Ok, I agree. Maybe I just don't know what you plan for the future. Regarding the current code version, I actually do not have the impression that interfaces are too fat.

I don't know either what is coming in the future! But we better get ready, since in my opinion the effort of writing a more general and user-friendly interface using bitwise flags is the same than other approaches, so it is worth it! 😄

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

@elauksap

enum OutputStyle {
Short = 0x0001,
Alphabetical = 0x0002,
MarkMandatory = 0x0004,
Text = 0x0010,
LaTex = 0x0020,
Description = 0x0040,
XML = 0x0080,
JSON = 0x0100,
}

For me, this is a struct with one enum and three bools. Why can't we just be explicit about what is going on.

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

@luca-heltai regarding the example in the beginning: I agree regarding the idea, but there is more to it (I will skip mark_mandatory because this does currently not exist in the code): I think you would have to write

prm.print_parameters("file.prm", 
  ParameterHandler::Text | 
  ParameterHandler.:Short | 
  ParameterHandler::Alphabetical);

Is this really much simpler than

const bool short = true, alphabetical = true;
prm.print_parameters("file.prm", ParameterHandler::Text, short, alphabetical);

or using the new function introduces in #9862

prm.create_default_input_file("file.prm", short, alphabetical);

? I agree that a sequence of true, false, false is not perfect because one does not immediately see the meaning of each bool, but it might be a typical use case that the bools that I introduced above might already be member variables of a class and are used at different places within a code.

Now imagine a user wants to explicitly have the option short as a parameter in his application code and arrange the data in his own data structures (e.g. because the code existed before this refactoring). So my question is: how does the call to print_parameters() look like for your preferred design if you have to work with

bool short = true, alphabetical = true;

(some logic in the program might change the value of these bools)? would you write the call to print_parameters() multiple times in combination with two nested if-else-branches? Would one again introduce parameters outside the function call? In either case, complexity will not be reduced.

I have the concern that we will give up a lot of flexibility by the enum OutputStyle design, and in the end not make things simpler.

@luca-heltai
Copy link
Member

This is not performance critical code. It will be run once (maybe twice) per program. I don't think we should worry about the "code complexity" at all. I don't really care how ugly the code looks like internally.

What I care about is the following observation: we have been struggling in the past in several places where we had booleans , and then every function call in the user code became

some_function(argument, true, true, false, true, false);

and everybody had to look up the documentation all the time to figure out what was wrong.

Now think about the case in which you name your booleans, in order to document what is happening:

bool short = true, alphabetical = true;

well, now you call the function with

some_function(short, alphabetical);

but you get this wrong, and for some reason, you write:

some_function(alphabetical, short);

I find myself doing this type of mistakes a lot. This means that the function is badly designed. I don't care that internally things may look more complicated with the enum design, but if you need to pass more than one boolean to a function, it is something that will cause bugs in user codes that are always difficult to find. No matter if it is "easier" internally, we should at all cost avoid this. Think of FEValues: would you want to have

FEValues(mapping, fe, quad, true, true, false, false, false, true, false, true, .....);

or

FEValues(mapping, fe, quad, update_values | update_quadrature | update_normals);

The first option is not acceptable (IMHO).

@peterrum
Copy link
Member Author

I agree with @nfehn that the extended enum is somewhat a struct with one enum (OutputStyle) and two bools (short and alphabetical). Merging these two one enum as proposed in the description is possible since the two parameters are of type bool. However, maybe in the future this might change?

Just like @luca-heltai, I don't like such long parameter lists (what is one reason for this issue). Let's quickly think through the pros and cons of a struct?

  • + the parameter would be named
  • + it is extendable
  • - we cannot write it in one line

Are there any other issues, I did not think of?

I have thought a bit how it might be possible to integrate such a struct in a way without breaking backwards compatibility:

  • duplicate the functions (one set which takes OutputStyle as it is done in release 9.1 and one which takes a struct)
  • internally convert the OutputStyle into the struct (in particular replace ShortText by Text and a bool=true in the struct) and call the struct version of the function

Have I forgotten any aspect? The question would be also store this struct within the ParameterHandler class or pass it always to the function. In many other functions we call once set_flags().

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

This is not performance critical code. It will be run once (maybe twice) per program. I don't think we should worry about the "code complexity" at all.

Sorry, there has been a misunderstanding, I am talking about complexity from an OO design perspective.

I don't really care how ugly the code looks like internally.

Unfortunately I do. Are we really already at the point that the discussion needs to be stalled like this?

I agree with the points in which your preferred design makes certain things simpler at first sight, but I have the feeling you currently do not address the problems it will raise. Your FEValues example is just another example in addition to the first example you mentioned, but you did not address the concerns mentioned in the above comments?

If you think that are too many boolean parameters, you can introduce a struct with reasonable default values where the user only sets those parameters he is currently interested in, just as in your FEValues example. If some of the parameters are variable and are changed by the application program, your examples will look more difficult as well.

The discussions of this issue can be split into two categories:

  1. Do we want to make certain interfaces more lean by combining parameters into a common data structure?

  2. How exactly should this data structure be designed?

I currently have the feeling we mix up certain things. @luca-heltai You are voting for a common data structure, but you did not convince me that the enum design is the right way to go in terms of all the concerns raised above.

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

we cannot write it in one line

I claim you can also not do this in general for the enum design (hence, is it really a con of the struct design?), see

Now imagine a user wants to explicitly have the option short as a parameter in his application code and arrange the data in his own data structures (e.g. because the code existed before this refactoring). So my question is: how does the call to print_parameters() look like for your preferred design if you have to work with bool short = true, alphabetical = true; (some logic in the program might change the value of these bools)? would you write the call to print_parameters() multiple times in combination with two nested if-else-branches? Would one again introduce parameters outside the function call? In either case, complexity will not be reduced.
I have the concern that we will give up a lot of flexibility by the enum OutputStyle design, and in the end not make things simpler.

@luca-heltai
Copy link
Member

I think Alphabetical and Sorted belong to the category OutputStyle. These are not booleans but output styles. What is a good reason to keep them separated?
The internal complexity is really not an issue. Again, this is code that will be executed once per program.

we cannot write it in one line

I don't think that writing in one line is an issue: how many lines do you need to define your booleans (if you want to use named booleans and document the code to yourself)?

The current user interface was introduced in 9.2pre: in other words, now it is the right time to make a choice and make sure that whatever we have introduced after 9.1 is reverted (if we opt for one OuputStyle enum). After 9.2, this will be much more complicated.

Again, since there has been a lot of work in this area lately, and I know from @elauksap that other changes will probably be needed, going through the route of >1 boolean as arguments is really not a realistic option.

Unfortunately I do. Are we really already at the point that the discussion needs to be stalled like this?

You should stop taking comments on code design personally. There is no stall in the discussion.

Allow me to disagree on the need of a struct: we really only need one enum. If you think it is not appropriate to mix Alphabetical, Short, Xml, Text, etc., then this could be a good compromise, where the logic is clear and self documented:

  • OutputFormat: Text, Xml, Json
  • OuputStyle: SkipComments, Alphabetical

The users who have bools in their code are the users who have developed their code base only on the development branch (9.1 does not have short, nor alphabetical as an option to the output style, as the only thing we have in 9.1 is OutputStyle (See here)

I have the concern that we will give up a lot of flexibility by the enum OutputStyle design, and in the end not make things simpler.

what is the flexibility you are referring to? The flexibility I have in mind is the following: anything that changes the outcome of print_parameters, be it to string or to stream, can be grouped in the OutputStyle enum. Manipulations of these flags is very simple, self documented, and does not make life more complicated for any of the user codes developed on top of 9.1.

I have thought a bit how it might be possible to integrate such a struct in a way without breaking backwards compatibility:

I don't think there we are breaking any backward compatibility. 9.1 does not have any boolean.

duplicate the functions (one set which takes OutputStyle as it is done in release 9.1 and one which takes a struct)
internally convert the OutputStyle into the struct (in particular replace ShortText by Text and a bool=true in the struct) and call the struct version of the function

Why? The current version of the OutputStyle already does this (see here, for example OutpuStyle::Description which has the same logical role of "alphabetical", "short", etc.

Our current enum already serves all our purposes, can be made backward compatible by splitting ShortText into Short | Text.

Maybe I'm not seeing what you refer to as "flexibility". Can you elaborate on this?

@nfehn
Copy link
Contributor

nfehn commented Apr 15, 2020

I thought that this is not your honest conviction

I don't really care how ugly the code looks like internally.

For me, it seems you are following another intention by such statements, because I raised concerns that at some points the internal implementation might become more complicated. I gave concrete examples taken from the code and also identified a bug in the implementation. Instead you try to circumvent these discussions about the questions of interest and accuse me of taking things personally. Show me that I am wrong and that the internal implementation will become simpler. Don't worry, I won't take that personally. I'll accept and say you are right.

I already mentioned that the aspect of code performance has been a misunderstanding, why do you continue in this direction

The internal complexity is really not an issue. Again, this is code that will be executed once per program.

In a similar direction, I would be happy if you could explain what you wanted to tell me with this statement

I don't think that writing in one line is an issue: how many lines do you need to define your booleans (if you want to use named booleans and document the code to yourself)?

The number of lines depends on how many booleans one has and how long the names of the booleans are, and of course on the formatting of the files. I guess this is not what you wanted to hear. Again, I have the feeling that you follow another intention with this statement, namely that the argument "writing in one line" introduced by @peterrum can be skipped because you don't like it.

Maybe I'm not seeing what you refer to as "flexibility". Can you elaborate on this?

I elaborated on this and even repeated it again with clear hints. I think it does not make sense to make the discussion longer and longer. I raised questions directly to @luca-heltai above. Once they are answered, I will elaborate on that.

@luca-heltai
Copy link
Member

I don't know why, but this has become a bit too personal.

It looks to me you are getting frustrated over this argument. That's not my intention, was not my intention, and I apologize if you get frustrated about this.

Let's go back to code.

I cannot find real questions in your comments, just a general sense of frustration which I did not want to provoke. Let me be clear about what I meant:

Any code path internally can be translated to the version you advocated by two lines at the beginning:

const auto bool alphabetical = style & OutputStyle::Alphabetical;
const auto bool short = style & OutputStyle::Short;

...
if(alphabetical) {
}

....

if(short) {
}

I'm perfectly fine with having two such lines at the beginning of the implementation, so that nothing of what anybody wrote for the implementation needs to change. If this is the flexibility we are discussing about (again, it is internal, in the implementation of the library, and does not affect users), then there is no point in the whole discussion.

The question I raised is only on the user interface, and so far I saw no (reasonable) arguments in favour of a collection of booleans beeing better than bitwise "or"-operations. Again, as long as there are at least two boolean as arguments, this is bad design, bug prone, and difficult to document in the code.

This is the only sensible question to discuss about, and the only one that is difficult to change later on after the release.

Please, leave alone the internal implementation. If you think a boolean is better for the internal implementation, one line of code at the beginning of the implementation is all we need. Maybe this was not clear with my "I don't care how ugly it is".

Also, if you have codes in which a boolean is created by you, or in a parameter file, stating "output short style", then you can do the reverse, which also documents your code better:

bool short = true;

....
OutputStyle style = Prm | (short? Short : Default);
print_parameters(out, style);

This is flexible, and makes sure that print_parameters does not get the wrong arguments in the wrong order, can be checked for consistency internally in a much more sofisticated way, etc. etc. etc.

By the way, it is VERY easy to make a Patterns::OutputStylePattern, that converts a string to one enum.
as in

set style = Text | Alphabetical | Short

which, again, is much better to me than

set Alphabetical = true
set Short = true
set Ouptut style = Text

so also from this point of view, I see no advantages in having booleans + OuputStyle.

We have been using this type of programming style since 1997, and every time this was violated (by myself several times as well!), we regretted this.

We have had many discussions like this in the last 10 years over a lot of enumeration classes of deal.II. This should also be taken into account when deciding for an interface.

I'm sorry that the tone of the conversation became so harsh, and I apologize if it was my fault.

@nfehn
Copy link
Contributor

nfehn commented Apr 16, 2020

I think you made your point clear that a bitwise enum = a collection of booleans, and I like that the discussion is beginning to become more open. Now by looking at OutputStyle posted above, I am not sure that this is just a collection of booleans. You can combine short with alphabetical, but you can not combine XML with JSON, while you can again combine XML with short. You mention this very briefly in the above comment but for me the aim of such an issue is to discuss the pros and cons in the same detail (this does not imply that I take comments personally or that my comments should be interpreted as frustrated, just a balanced discussion of pros and cons, as a basis for a decision).

Hence, there is more structure in the enum then one can identify by only looking at the declaration of the enum OutputStyle. You will need additional constraints (in the form of asserts and other checks that have to be implemented explicitly) to enforce that structure and prevent wrong things or avoid errors in the code. In my experience, we will now manage to implement this (and it now also looks quite simply like "we can easily enforce this with just a few asserts"). But when looking at the code later and when people add functionality over the years, adjust the asserts that enforce the logic etc., we will catch ourselves spending a lot of time with questions like "ok, what was the idea why we did it like this, is parameter c compatible with parameter f, etc.?). C++ gives us the possibilities to choose a design that describes the data structure more naturally. "Flexibility" and "complexity" are to be understood in this context. Hence, I am of the opinion that we still have to open ourselves for a more reflected discussion. @luca-heltai By the way, these concerns have been raised long before, which is why I don't see why to forget about the upper part of the discussion. Maybe this is the pluralism of opinions you have to live with. As said before, I can live with your favorite option, I just want that we have thought about the cons.

@nfehn
Copy link
Contributor

nfehn commented Apr 18, 2020

I think for a decision one needs to compare to the alternative

struct OutputStyle
{
  bool with_documentation;
  bool sort_alphabetical;

  enum OutputFormat
  {
    Undefined,
    LaTeX,
    PRM,
    XML,
    JSON
  } format;
};

Let me explain the advantages such a design could have:

  • no implicit logic and incompatible parameter combinations because the enum OutputFormat can only take one value. The enum OutputStyle design allows to write short | Text | JSON (which would be invalid)?
  • the data structure can be understood immediately. This is not the case for the current OutputStyle (see Extend OutputStyle #9919) where we will need additional documentation and Asserts to enforce the structure which is already inherently contained in struct OutputStyle. The problem is that in the code these asserts will occur at other places than the declaration. This is clearly not transparent.
  • it is also impossible to mix-up the order of boolean variables in function calls just as for the enum OutputStyle design. Potential errors as described above by @luca-heltai regarding the order of function arguments are excluded by design.

Disadvantages:

  • can not use short cut short | Text in function call, data structure has to be filled step-by-step

Suggestion to a solution for this disadvantage: write a free function that does the translation between enum OutputStyle and struct OutputStyle with all the asserts to enforce correctness of enum OutputStyle, i.e., provide an additional constructor for struct OutputStyle with enum OutputStyle as parameter. This would be a single-responsibility design with a clear separation of conerns. The function call would then be print_parameters(..., OutputStyle(short | Text), but this solution also provides the flexibility to write print_parameters(..., output_style) with the variable struct OutputStyle output_style filled at other places in the code, i.e., there are user codes that do not want to call in a const-style short | Text). I still claim struct OutputStyle should be choosen for the internal implementation of ParameterHandler.

@luca-heltai Could you please briefly complete the list of cons to make a decision?

@bangerth
Copy link
Member

It isn't quite correct that the enum in the description above can not have invalid values. It's perfectly legal to write

  OutputStyle style;
  style = 245;

or

  style = PRM | XML;

It's of course a semantic bug to do this, but the language doesn't prevent you from doing it -- in the same way as you're not prevented from writing

  style = short | Text | JSON

in @luca-heltai 's style. In reality, one would have to write assertions in either case; in practice, we often haven't and then what happens is that you either get the first if branch in a chain of the form

  if (style & Text != 0)
    ...
  else if (style & JSON != 0)
    ...

or you end up in the default: branch of a switch-case statement:

  switch (style) {
    case Text: { ...; break; }
    case JSON: { ...; break; }
    default: Assert (false, ExcNotImplemented());
  }

My personal preference would be to go with the single enum. It's idiomatic C/C++ that is widely used both in the C++ standard library (e.g., http://www.cplusplus.com/reference/ios/ios_base/flags/) and in deal.II (e.g., in the UpdateFlags, ComponentMask, RefinementCase, GridTools::CacheUpdateFlags). If we're concerned about catching cases such as the mistaken examples above, we can always overload operator| as we do in several of the existing cases; operator| would then check that only one of the bits that correspond to formats is set; alternatively, that could also be done in operator=.

@nfehn
Copy link
Contributor

nfehn commented Apr 18, 2020

Yes, I agree. Maybe I should have written enum class which I use in my code. I think scoped enums (C++ 11) do not allow implicit conversions and the code should not compile when using the | operator?

@nfehn
Copy link
Contributor

nfehn commented Apr 19, 2020

@bangerth Also for plain enums, implicit conversions are actually only possible from enum to int, not in the other direction?

It isn't quite correct that the enum in the description above can not have invalid values. It's perfectly legal to write

  OutputFormat format;
  format = 245;

I don't think so: I think you will have to write

  OutputFormat format;
  format = static_cast<OutputFormat>(245);

similarly, you will have to write

  format = static_cast<OutputFormat>(PRM | XML);

Did you assume that the operator | has already been overloaded? I would not do that in my approach (C++ gives us freedom, and responsibility coming along with it). Let me further reassure you, you can also not write UpdateFlags flags = 245;.

Apart from these semantic discussions, I originally hoped we could have discussed the approach from a design perspective. As I realized now, alea iacta est.

@pcafrica
Copy link
Contributor

pcafrica commented Apr 19, 2020

I don't think so: I think you will have to write

  OutputFormat format;
  format = static_cast<OutputFormat>(245);

Yes, this is true. The compiler is not able to implicitly cast 245 to the OutputFormat type.

Disadvantages:

* can not use short cut `short | Text` in function call, data structure has to be filled step-by-step

This is exactly what I personally dislike in your approach.

Also, I think that keeping the two booleans separated from OutputFormat exposes a less clear interface to the user, who may think that, e.g., with_documentation has less/nothing to do with OutputFormat::Text, whereas they refer exactly to the same concept. But this is a strongly subjective opinion.

Moving rather to the objective design perspective: what happens if a user would like to print the same ParameterHandler object multiple times with different styles?
In the approach suggested by @nfehn one would have to write, e.g. using aggregate initialization,

params.print_parameters(out1, {false, true, OutputStyle::OutputFormat::Text});
params.print_parameters(out2, {true, false, OutputStyle::OutputFormat::XML});

which can be error prone and very obscure to read without having the documentation at hand, or, more verbosely,

OutputStyle style;
style.with_documentation = false;
style.alphabetical = true;
style.format = OutputStyle::OutputFormat::Text;
params.print_parameters(out1, style);

style.with_documentation = true;
style.alphabetical = false;
style.format = OutputStyle::OutputFormat::XML;
params.print_parameters(out2, style);

which is, on the other hand, very unconvenient to write and error prone (message edited).

With the single enum approach suggested by @luca-heltai and which @bangerth and I support, the call would simply be:

params.print_parameters(out1, ShortText | Alphabetical);
params.print_parameters(out2, XML);

In both designs, the complexity of checking that the user has set an admissible combination of flags is hidden inside the class implementation and in my opinion it is not something we should be concerned about. The level of complexity here relies only on the level of defensiveness desired for the library programming style. The simplest approach, where one either falls into the first if branch or into the default: case of a switch, has already been pointed out by @bangerth and is already used in other places within deal.II.

@nfehn
Copy link
Contributor

nfehn commented Apr 19, 2020

which is, on the other hand, very unconvenient to write and error prone.

I am not sure about this. Let's make another example/comparison

OutputStyle style;
style.with_documentation = param_a;
style.alphabetical = param_b;
style.format = OutputFormat::Text;
params.print_parameters(out1, style);

@elauksap Can you write the same code in your favorite version? Please also use ParameterHandler:: as discussed somewhere above.

@pcafrica
Copy link
Contributor

pcafrica commented Apr 19, 2020

@nfehn you are right, perhaps it is not more error prone than writing {true, false, ...}, but you would agree with me that it is very long and unconvenient to write with respect to the other design.

But sorry I don't understand your question. with_documentation and alphabetical are booleans used only to specify the output style, why would anyone need to declare them elsewhere in the code?

By the way, if this was the case it would still be legal to write, e.g.:

OutputFormat param_a = Alphabetical;
params.print_parameters(out1, Text | param_a);
params.print_parameters(out2, XML | param_a);

@nfehn
Copy link
Contributor

nfehn commented Apr 19, 2020

"unconvenient" and "error prone" might be debatable. Could we first make the comparison please? What do you have to lose? Otherwise I get the impression you are picking those examples that underpin your hypothesis. Again, please use ParameterHandler::Short, ParameterHandler::XML, etc.

@pcafrica
Copy link
Contributor

pcafrica commented Apr 19, 2020

We can debate the meaning of "unconvenient", but not that 5 lines of code are more than 1 😄

Sorry again, I don't really get your question so I'm not able to answer. What is "the comparison" you are referring to? You asked

@elauksap Can you write the same code in your favorite version?

and I replied in my last comment. What other comparison would you like to see?

@nfehn
Copy link
Contributor

nfehn commented Apr 19, 2020

This code is written in the struct OutputStyle design

OutputStyle style;
style.with_documentation = param_a;
style.alphabetical = param_b;
style.format = OutputFormat::Text;
params.print_parameters(out1, style);

Can you write the same code in your favorite enum OutputStyle version? Sure, 5 lines is more than 1 line of code. In these words, show me that you can do a 1-liner with your favorite version for the above code, that is also convenient and not error prone. Again, is there something to hide?

As a motivation, assume tests are implemented with the boolean interfaces calling other functions with these interfaces. Assume complex application codes that use other interfaces etc.

@pcafrica
Copy link
Contributor

pcafrica commented Apr 19, 2020

@nfehn sorry, but why should we choose the design depending on whether some test or application prefers to use bools? Usually applications are written depending on the interface exposed by the library they make use of, not vice-versa 😄

By the way, I already answered to your question in a previous reply, so I don't know why you keep asking it.

@pcafrica
Copy link
Contributor

pcafrica commented Apr 19, 2020

For the sake of completeness, a one-liner would still be possible in the context you mentioned:

params.print_parameters(out1,  Text | (param_a ? 0: Short) | (param_b ? Alphabetical : 0));

Does this make sense to you?

@nfehn
Copy link
Contributor

nfehn commented Apr 19, 2020

Now, please add ParameterHandler::, you have to add it three times. Please also format your code in a way that one does not have to shift the horizontal bar from left to right to read your code, just for fairness.

@pcafrica
Copy link
Contributor

pcafrica commented Apr 19, 2020

ParameterHandler:: needs to be added in either case, so what is the point?
Since when this has become a matter of indentation style?

I don't want this discussion to become (again) a personal attack, but we just got to the point of discussing about how long one line of code would become when writing the full scope names, in a case that will most likely never happen (there would be no reason to keep independent boolean variables if we provided the single enum interface).

I think I had the chance to express all of my opinions, so I'm not going to indulge further provocations, unless you show a clear case where your solution gives and undisputable advantages with respect to the other one.

@nfehn
Copy link
Contributor

nfehn commented Apr 19, 2020

We are currently discussing one argument. Let's do one step after the other. Let's please complete this argument and then we go on with the next.

@nfehn
Copy link
Contributor

nfehn commented Apr 19, 2020

Sorry, you can't accuse me of trying to bring facts on the table.

Here is a suggestion, please correct me

params.print_parameters(out1,  ParameterHandler::Text | 
(param_a ? static_cast<ParameterHandler::OutputStyle>(0) : 
  ParameterHandler::Short) | 
(param_b ? ParameterHandler::Alphabetical : 
  static_cast<ParameterHandler::OutputStyle>(0)));

@pcafrica
Copy link
Contributor

pcafrica commented Apr 19, 2020

Templatize operator| and there will be no more need to statically cast the 0.

template <class T>
inline ParameterHandler::OutputStyle operator|(
    const ParameterHandler::OutputStyle a,
    const T b)
{
    return static_cast<ParameterHandler::OutputStyle>(
        static_cast<int>(a) |
        static_cast<int>(b));
}

At least, assuming that it is really relevant to provide compatibility with a pointless interface (personal opinion)...

@nfehn
Copy link
Contributor

nfehn commented Apr 19, 2020

Type T could be identical to ParameterHandler::OutputStyle (one could implement the other operator| function as specialization of this one). I think you should assert in this function that one can not combine, e.g., JSON with XML? Now assume someone is adding another output style, do you have to touch this function again?

@pcafrica
Copy link
Contributor

Yes, at least in a strongly defensive programming style. Or we can stick with the solution pointed out by @bangerth. Or we may decide that Text | XML is an admissible option and design print_parameters in such a way that in that case it prints both in Text and XML format (e.g. using only if statements rather than else if or switch).
But this is a different point of discussion.

@nfehn
Copy link
Contributor

nfehn commented Apr 19, 2020

Not really, you asked me for the advantages of the struct OutputStyle design: I believe this is clearly one (see above for a more complete discussion).

@pcafrica
Copy link
Contributor

pcafrica commented Apr 19, 2020

First, this has nothing to do with the bool param_a, param_b example so it's a different point of discussion (the operator| overload would be there regardless of the int specialization).

Second: yes, I agree that this can be an advantage for the internal implementation of the class, but it also results in multiple (debatable) disadvantages from the users' perspective (see above for a more complete discussion).

It's a personal point of view, but I've always preferred an approach where developers take care of the implementation difficulties at their best, but giving users a more friendly interface.

@luca-heltai
Copy link
Member

I think everyone will simply keep their own idea on this topic. There is no way to have everyone happy.

Please comment with a 🚀 for the solution proposed by me, and implemented by @peterrum , or with
👀 for the solution proposed by @nfehn

@nfehn
Copy link
Contributor

nfehn commented Jan 25, 2022

Let be briefly point to #13284 and #13290 (comment) given that @luca-heltai (receiving many "thumbs up") was strongly arguing with UpdateFlags as the gold standard in the present issue. One might want to rethink about the concerns raised in comment #9888 (comment) above ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants