Coerce decimal strings for whole number types by truncating the decimal part #25835

Merged
merged 1 commit into from Jul 26, 2017

5 participants
@scottsom
Contributor

commented Jul 21, 2017

This change makes it possible to index a value like "1.0" or "1.1" into whole
number field types such as byte and integer. Without this change, those
values would have resulted in an error, even with coerce set to true.
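
As a rough illustration of the behaviour described above, here is a minimal sketch modeled on the diff excerpt quoted later in this thread. The class name `CoerceSketch` and the helper `coerceToLong` are hypothetical, not names from the Elasticsearch code base, and the real implementation also performs range checks that are omitted here.

```java
public class CoerceSketch {
    // Hypothetical helper: try an exact whole-number parse first, then fall
    // back to parsing as a double and truncating toward zero.
    static long coerceToLong(String stringValue) {
        try {
            return Long.parseLong(stringValue);
        } catch (NumberFormatException e) {
            // "1.1" is not a valid long literal, so parse it as a double
            // and drop the decimal part with a narrowing cast.
            return (long) Double.parseDouble(stringValue);
        }
    }

    public static void main(String[] args) {
        System.out.println(coerceToLong("1.0"));  // prints 1
        System.out.println(coerceToLong("1.1"));  // prints 1
        System.out.println(coerceToLong("-1.9")); // prints -1
    }
}
```

Note that the cast truncates toward zero rather than rounding, so "-1.9" becomes -1.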

Closes #25819

@elasticmachine

Collaborator

commented Jul 21, 2017

Since this is a community submitted pull request, a Jenkins build has not been kicked off automatically. Can an Elastic organization member please verify the contents of this patch and then kick off a build manually?

try {
return Long.parseLong(stringValue);
} catch (NumberFormatException e) {
return (long) Double.parseDouble(stringValue);

@scottsom

scottsom Jul 21, 2017

Author Contributor

This does not preserve the precision of large numbers that have a decimal part and need more significant bits than a double's 53-bit significand provides, yet still fall within Long.MIN_VALUE and Long.MAX_VALUE. For example, while both 4115420654264075766 and "4115420654264075766" will be indexed as 4115420654264075766, something like "4115420654264075766.1" will not be indexed as 4115420654264075766, because the String -> Double -> Long conversion first rounds the value to the nearest representable double.

Do we want to try to do something a bit more clever to handle this edge case?
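
The edge case in this comment can be reproduced with a short sketch. The class and method names below are hypothetical. The `BigDecimal` path is only one possible "more clever" approach, shown as an assumption for comparison; it is not what this PR does.

```java
import java.math.BigDecimal;

public class PrecisionSketch {
    // The conversion path from the diff: String -> double -> long.
    static long viaDouble(String s) {
        return (long) Double.parseDouble(s);
    }

    // Hypothetical alternative: BigDecimal keeps every digit, toBigInteger()
    // discards the fractional part, and longValueExact() throws an
    // ArithmeticException if the result does not fit in a long.
    static long viaBigDecimal(String s) {
        return new BigDecimal(s).toBigInteger().longValueExact();
    }

    public static void main(String[] args) {
        String input = "4115420654264075766.1";
        long expected = 4115420654264075766L;
        System.out.println(viaDouble(input) == expected);     // prints false
        System.out.println(viaBigDecimal(input) == expected); // prints true
    }
}
```

At this magnitude adjacent doubles are 512 apart, so the double path can only land on a multiple of 512, which 4115420654264075766 is not.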

@jpountz

jpountz Jul 24, 2017

Contributor

I think I'd just put a comment that we might not fail in all cases, but I don't think I would try to address it?

@scottsom

scottsom Jul 24, 2017

Author Contributor

Do you mean a comment in the code or in the documentation? (i.e. a warning here https://www.elastic.co/guide/en/elasticsearch/reference/current/coerce.html)

@jpountz

jpountz Jul 25, 2017

Contributor

I mean in the code but just noticed there was one already

@@ -46,19 +46,6 @@ public void testParseValidFromStrings() throws Exception {
assertNotNull(GeoGridAggregationBuilder.parse("geohash_grid", stParser));
}

public void testParseErrorOnNonIntPrecision() throws Exception {

@scottsom

scottsom Jul 21, 2017

Author Contributor

I removed this test because this change breaks it, and I don't think the test makes sense in its current state. For example, before this change the test would also break if you changed it from "2.0" to a decimal literal like 2.5. Now the behaviour is consistent: decimals are accepted and truncated, whether you quote them or not.

@jpountz

jpountz Jul 24, 2017

Contributor

makes sense

@jpountz jpountz self-requested a review Jul 24, 2017

@elastic elastic deleted a comment from scottsom Jul 24, 2017

}
return Byte.parseByte(value.toString());

return (byte) doubleValue;

@jpountz

jpountz Jul 24, 2017

Contributor

Sorry I deleted your comment by mistake, so I am adding it back:

This only rejects coerce=false values for numbers with decimals but it won't reject strings, which seems to run contrary to how coercion is supposed to work. I did not change this behaviour. I just wanted to call it out as inconsistent with how the parse methods in AbstractXContentParser work, should this be changed?
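
To make the inconsistency concrete, here is a heavily simplified sketch of one reading of this comment. Everything in it is an assumption for illustration: `CoerceFlagSketch` and `parseByte` are hypothetical names, range checks are omitted, and this is not the actual mapper code.

```java
public class CoerceFlagSketch {
    // Hypothetical simplification: the coerce flag only guards against a
    // decimal part, not against the value arriving as a string.
    static byte parseByte(Object value, boolean coerce) {
        double doubleValue = (value instanceof Number)
            ? ((Number) value).doubleValue()
            : Double.parseDouble(value.toString());
        if (!coerce && doubleValue % 1 != 0) {
            throw new IllegalArgumentException("Value [" + value + "] has a decimal part");
        }
        // Narrowing cast truncates the decimal part (no range check here).
        return (byte) doubleValue;
    }

    public static void main(String[] args) {
        // A string is accepted even with coerce=false, which is the
        // inconsistency called out in the comment.
        System.out.println(parseByte("1", false)); // prints 1
        try {
            parseByte(1.5, false);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```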

@jpountz

jpountz Jul 24, 2017

Contributor

Agreed it should be changed to be consistent, but let's do it in a separate PR? Thinking more about it, I'm wondering whether we should remove the coerce option instead. I opened #25861.

@jpountz

Contributor

commented Jul 24, 2017

@elasticmachine Please test it.

@jpountz
Contributor

left a comment

It looks good to me, thanks for working on it.

@scottsom scottsom force-pushed the scottsom:decimal_string_coerce branch Jul 24, 2017

@scottsom

Contributor Author

commented Jul 24, 2017

Added StandardCharsets.UTF_8 to the unit test to fix the build. Can you kick off that build again?

@jpountz

Contributor

commented Jul 24, 2017

@elasticmachine Please test it.

Coerce decimal strings for whole number types by truncating the decimal part

This change makes it possible to index a value like "1.0" or "1.1" into whole
number field types such as byte and integer. Without this change, those
values would have resulted in an error, even with coerce set to true.

Closes #25819

@scottsom scottsom force-pushed the scottsom:decimal_string_coerce branch to 27371b0 Jul 24, 2017

@scottsom

Contributor Author

commented Jul 24, 2017

The build failure seems transient and/or unrelated to this change.

I've rebased onto the latest master since I see some newer changes that appear to be improving test stability.

@jpountz

Contributor

commented Jul 25, 2017

@elasticmachine Test it please.

@jpountz jpountz merged commit 2f8def1 into elastic:master Jul 26, 2017

2 checks passed

CLA: Commit author has signed the CLA
elasticsearch-ci: Build finished.
@jpountz

Contributor

commented Jul 26, 2017

Thank you @scottsom !

jpountz added a commit that referenced this pull request Jul 26, 2017

Coerce decimal strings for whole number types by truncating the decimal part (#25835)

This change makes it possible to index a value like "1.0" or "1.1" into whole
number field types such as byte and integer. Without this change, those
values would have resulted in an error, even with coerce set to true.

Closes #25819

@scottsom scottsom deleted the scottsom:decimal_string_coerce branch Jul 26, 2017

@colings86 colings86 added v6.0.0-beta1 and removed v6.0.0 labels Jul 31, 2017

@clintongormley clintongormley added the >bug label Aug 17, 2017
