Skip to content

Add additional tests for unexpected fields in units#183

Merged
kroenlein merged 4 commits intomainfrom
maintain/update-unit-handling
Apr 19, 2023
Merged

Add additional tests for unexpected fields in units#183
kroenlein merged 4 commits intomainfrom
maintain/update-unit-handling

Conversation

@kroenlein
Copy link
Copy Markdown
Collaborator

Field testing showed some bad behavior:

  • Punctuation like " and : were being swallowed, leading to things that weren't valid units being accepted
  • Strings that were just numbers being being interpreted as valid units

@kroenlein kroenlein requested a review from bfolie April 18, 2023 20:35
Comment thread gemd/units/tests/test_parser.py Outdated
"chain", # Survey units eliminated
"SECONDS" # Not just case insensitivity
"SECONDS", # Not just case insensitivity
"lb : in^3", # Not just case insensitivity
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment seems unrelated

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good eyes, copypasta

Comment thread gemd/units/impl.py Outdated
Comment on lines +29 to +30
if next(token for token in tokens).type == NUMBER:
return input_string # The unit can't have a leading number; scaling factors are internal
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does returning the input string lead to the desired error being thrown?

Also, why is a leading number not allowed? If g / 100 L is then why not 100 L / g? Or is this to cover for the cases in which the input string is just a number?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We exit the preprocessor without mangling the leading digit, so it's handled as it would have been handled by default. We could probably get equivalent functionality by just swallowing the first term, but I didn't want to have to deal with possible weirdness following from that if that first term would have set some of the internal booleans.

No leading digit because normal Pint behavior says that should be a magnitude on a value, and that kinda matches. I personally also can't think of a customary unit that has a scaling factor that has it up front like that. It conveniently also covers the case where it's just a number.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so it's handled as it would have been handled by default

I don't know what that is

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not allowing leading numbers is reasonable. And if this the effect of producing the desired error message then that's fine. I just find the logic to be unclear

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The preprocessor is run internally to Pint's logic, so it already has opinions (depending on how one invokes the parse) how to handle numbers.

  • This preprocessor makes it so if there is a number not in the leading position, it is turned into a conversion factor token instead of lumping it into the magnitude. Without it, the units on g / 2.5 cm are reported as g / cm or raises an error depending on if you use parse_units or convert.
  • The gemd.units module then expresses the opinion that gemd-friendly units strings didn't have a scaling factor.

Updating the code so it just ignores the first term, which would I think be more clear.

Comment thread gemd/units/impl.py Outdated
exponent = token.type == OP and token.string in {"^", "**"}
division = token.type == OP and token.string in {"/", "//"}
if token.type == OP:
if token.string not in {"+", "-", "*", "/", "//", "^", "**", "(", ")"}:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would be more readable if the set of all allowed tokens was broken out as a constant

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair. I'd wanted to import it from Pint, but it's a private variable.

bfolie
bfolie previously approved these changes Apr 19, 2023
Copy link
Copy Markdown

@bfolie bfolie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't fully understand it, but the tests pass, at least in pint 0.20 (looks like we're also running tests against pint 0.18)

@kroenlein kroenlein merged commit 54df335 into main Apr 19, 2023
@kroenlein kroenlein deleted the maintain/update-unit-handling branch April 19, 2023 16:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants