Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add cli flag to attempt to cast to integer before boolean #11

Closed
wants to merge 2 commits into from

Conversation

LiamWalsh98
Copy link
Contributor

@LiamWalsh98 LiamWalsh98 commented Oct 12, 2023

Hello,

I ran into a scenario where an array-like collection of xml elements indexed by an attribute were being parsed as a boolean when the size of the was not greater than 2. Evidently, it is ambiguous whether something whose only instances are 0 and 1 should be cast as a boolean or not.

e.g.

<parent>
  <child ArrayIndex="0">
    chardata
  </child>
  <child ArrayIndex="1">
    chardata1
  </child>
</parent>

becomes ArrayIndex bool:

type Parent struct {
    Child []struct {
        ArrayIndex bool   `xml:"ArrayIndex,attr"`
        CharData   string `xml:",chardata"`
    } `xml:"child"`
}

instead of ArrayIndex int:

type Parent struct {
    Child []struct {
        ArrayIndex int      `xml:"ArrayIndex,attr"`
        CharData   string `xml:",chardata"`
    } `xml:"child"`
}

So I made a an "ObserveIntEager" flag in order to disambiguate this scenario. It essentially disallows "1" or "0" from ever being treated as a boolean at all by observing "int" first when it is enabled.

goxmlstruct -observe-int-eager <example>.xml


Though, in the process of testing regression, i think i may have discovered some unintended behaviour. It seems that if there is a single instance of 1 or 0 in a collection of integers, the spec becomes "string". The reason being that 1 and 0 are always observed as "bool" (because it is observed first), and the combination of instances of "int" + "bool" is then treated as "string".

Here are some test cases demonstrating the bug

Note that tests that include 1 or 0 fail, and the tests that do not include 1 or 0 pass.

If we have a collection of integer values that include 1 or 0 (e.g. test_int_parse), the spec becomes string.

If none of the integer values are 1 or 0 (e.g. test_int_parse_without_1_0), the spec correctly becomes int.

Now I am wondering whether 1 and 0 should just be excluded from the ParseBoolean check altogether. If that's the case then this PR should be closed.

P.S. Huge fan of chezmoi!

@twpayne
Copy link
Owner

twpayne commented Oct 12, 2023

Thank you for the extremely thorough PR and analysis and the kind words about chezmoi! Really happy that you like it :)

Though, in the process of testing regression, i think i may have discovered some unintended behaviour. It seems that if there is a single instance of 1 or 0 in a collection of integers, the spec becomes "string". The reason being that 1 and 0 are always observed as "bool" (because it is observed first), and the combination of instances of "int" + "bool" is then treated as "string".

This is a really good catch, and clearly a bug in go-xmlstruct.

Would it be sufficient to swap the order of these two if statements? If so, would you like to create a PR?

@LiamWalsh98
Copy link
Contributor Author

LiamWalsh98 commented Oct 12, 2023

That seems fine to me! My only concern is it would affect the way things are generated between go-xmlstruct's versions, thus possible breaking changes to users.

As long as that is fine, it should be alright (also we would be forgoing ever allowing 1 and 0 as bools). There definitely is a more complex look-behind way to solve the bug while leaving the generation otherwise intact (i.e. strict 1/0 still is evaluated as bool). edit: was going to say- I don't know if such a complex solution would even be worth the tiny edge case it solves.

My opinion is that it makes more sense to cast it to a type with a wider scope, that being int in this case.

I can close this PR and create a new one if we are in agreement!

@twpayne
Copy link
Owner

twpayne commented Oct 12, 2023

We're in agreement!

My only concern is it would affect the way things are generated between go-xmlstruct's versions, thus possible breaking changes to users.

We can bump go-xmlstruct's version to communicate the change. This is a change in behavior, but I think this is a genuine bug in go-xmlstruct and therefore is a minor version bump, rather than a major version bump.

@LiamWalsh98
Copy link
Contributor Author

Cool, sounds good to me! I'll close this and open a new PR

@LiamWalsh98 LiamWalsh98 mentioned this pull request Oct 12, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants