Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Length.TryParse" parses invalid values #343

Closed
nimishmistry opened this issue Dec 14, 2017 · 8 comments
Closed

"Length.TryParse" parses invalid values #343

nimishmistry opened this issue Dec 14, 2017 · 8 comments

Comments

@nimishmistry
Copy link

Length.TryParse("2m,,,1", out length) is parsed and the length is set to "2m". Is this intended?

I'm trying to use TryParse for validation.

@angularsen
Copy link
Owner

Hi, thanks for reporting. This is not intentional, we try to be lenient about whitespace and combining multiple value/abbreviation combinations with different delimiters, but this is way beyond those usecases :-)
I'm sure it's a matter of tweaking the regex a little bit. Would you care to try to fix it? A pull request would be most welcome.

https://github.com/angularsen/UnitsNet/blob/master/UnitsNet/CustomCode/QuantityParser.cs
https://github.com/angularsen/UnitsNet/blob/master/UnitsNet.Tests/CustomCode/ParseTests.cs

@nimishmistry
Copy link
Author

Thanks of the offer, will definitely attempt to fix this.

@angularsen
Copy link
Owner

Awesome!

@bplubell
Copy link
Contributor

I took a look at this and found the problem is where the QuantityParser is trying to capture trailing, invalid data. It is only checking for alpha characters and, since comma is not an alpha character, it does not capture it.

@"(?<invalid>[a-z]*)?"); // capture invalid input

Trying to make this capture more characters (like comma) could be problematic with the current design where the string can have multiple sets of valid units (such as "1m and 200mm"). I think it would simplify things greatly and be more in keeping with framework parse methods if we did not allow multiple sets units. For example, double.TryParse("1+2") returns false throws an exception.

Proposal:
Fail when parsing with multiple sets of unit. The expectation would be that consumer code would pre-parse, since more complex maths than addition could be required and handled by the consumer.

  • An additional library (Nuget package) could separate the parsing of complex string from the core handling of units (either part of this repo or one outside of UnitsNet)
  • UnitNet should handle special cases like parsing FeetInches
  • This would be a breaking change

Thoughts?

@angularsen
Copy link
Owner

angularsen commented Jan 10, 2018

I agree it may be unintuitive that we support parsing strings like 1 kg 500 g. If I recall, the reasoning was originally that since we needed to support parsing 1' 5" for feet/inches, then maybe it was better that we had a consistent parsing logic instead of special cases. In hindsight, I don't know of any other cases besides feet/inches that actually make much sense in combining value/unit pairs like that? If no one can come up with other usecases for this, then yes I think we should consider removing it in favor of only special-case parsing feet-inches to spare everyones' sanity. It would be a breaking change and have to be added to our #180 wishlist for a major version bump, but that is always something we can do. The list is significant already.

Also I don't understand why we allow , separators between value/unit pairs?

@"(and)?,?", // allow "and" & "," separators between quantities

Whitespace or adding the word "and" in between, sure, but when is it natural to parse a string like 1',1" or 1kg,500g ? It seems contrived to me, but maybe I've just forgotten the reason it was added in the first place. And it should not have been valid to have empty value/unit entries such as in your example with multiple commas in a row. There is simply a lot of odd things going on here and I think we are trying to be way too lenient on the input, which is ultimately just going to cause problems when people expect it to handle all sorts of trash input.

Related PRs:
#64
#81
#254

I have not yet read through these, so my recollection is still poor on design choices.

@maherkassim Was involved in much of this, do you have any comments or recollection of why we did these things?

@maherkassim
Copy link
Contributor

maherkassim commented Jan 14, 2018

@angularsen I believe that the main use case was feet & inches (eg. 1' 5"), but that we decided to keep the regex broad to avoid needing to add other special cases in the future (all of the relevant discussion can be seen on #81).

To clarify, I'm ok with having the regex be more restrictive and handling feet & inches as a special case. The only other special case I'm aware of is stone & pounds, but I'm not sure how often that's used (especially within the context of UnitsNet).

Also, maybe @gojanpaolo can shed more light on any related changes in #254?

@angularsen
Copy link
Owner

Thanks for the heads up of stone and pounds @maherkassim.

@nimishmistry Anything I can help to move forward with a PR on this?

As noted above, this will be a breaking change and can only be merged in as part of a v4 and it will likely take some time with prerelease to get all the other changes in #180 included as well, but this is a good start at any rate. I just pushed a new v4 branch that we will create pull requests towards for breaking changes like this and those changes in #180.

This was referenced Sep 22, 2018
@angularsen
Copy link
Owner

This issue will be resolved in v4 release, see #487 .

angularsen added a commit that referenced this issue Dec 16, 2018
# 4.0.0 Release
This PR will serve as the list of items to complete and it will be updated to show the progress before finally merged into `master` when completed. We all have busy schedules, so **if you want to help move this work forward then that is much appreciated!** 

## The main theme is to reduce binary size
In two years it has grown from 280 kB to 1.4 MB and a lot of it is due to unnecessary syntactic sugar with many method overloads for various number types and nullable types - for every of our 800+ units! It simply adds up to a big total.

These items are chosen from #180, trying to include as many of the low hanging fruits as possible, while still keeping the list short and realistic to complete in a reasonably short time - we are all busy in our daily lives. We can always have more major version bumps later than trying to perfect it all now.

## Feature complete before November, merged before mid-December
I would like to aim for a "beta" pre-release nuget sometime in October with all the items completed, so we can have some time to test it before releasing the final, stable 4.0.0 version before Christmas holidays.
We should have the work items list more or less final before Monday, October 8th.

## Changes
#### Added
- [x] UnitSystem with a different meaning, now defines the base units for a unit system (#524)

#### Removed
- [x] Replace netstandard1.0 target with netstandard2.0, to avoid the extra dependencies (#477)
- [x] Remove code marked as `[Obsolete]`, such as `VolumeUnit.Teaspoon`  (#490)
- [x] Remove nullable `From` factory methods (#483)
- [x] Remove extension methods on _nullable_ number types (#483)
- [x] Remove all number extension methods (#497)
- [x] Remove `Length2d`, replaced by `Area` (#501)
- [x] Remove static methods on `UnitSystem`, should use `UnitSystem.Default` instead (#496)
- [x] Remove unit parameter from `ToString()` methods (#546)

#### Changed
- [x] Throw exception on NaN values in static constructor methods, like `FromMeters()` (#502, see #176 (comment))
- [x] Throw if unit was not specified when constructing quantities, (#499 - #389 (comment))
- [x] Stricter parsing (#343 and #180 (comment))
- [x] Remove unit parameter from `ToString()` methods (#546)
- [x] Change Temperature arithmetic back to use base unit Kelvin (#550, see #518)

#### Renamed
- [x] Correct SingularName for some Flow unit definitions (#494, see #360)
- [x] Split/rename UnitSystem into UnitParser, GlobalConfiguration, UnitAbbreviationsCache (#511)

#### Fixed
- [x] Search for any TODO comments in the code to address, remove comment and either fix or create issue (5d24432)
- [x] Update README with v4 changes #498
- [x] Do not try/catch in UnitConverter.Try-methods (#506)
- [x] Do not try/catch in `Length.TryParse()` and for other quantities (#507)
- [x] Add/update changelog in wiki for v4 with some summary copied from this issue, some v3 to v4 migration instructions and briefly explain why some stuff were removed

## Milestones
- [x] Finalize work items list (Monday, October 8)
- [x] Release 4.0.0-beta1, feature complete (Wednesday, October 31)
- [ ] ~Show release notes and upgrade guide when upgrading to 4.x nuget, if possible~
- [x] Release 4.0.0 (Monday, December 17)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants