Epoch millis and second formats accept float implicitly #26119

Merged
merged 4 commits into from Aug 13, 2017

4 participants
@albertzaharovits
Contributor

commented Aug 9, 2017

The epoch_millis and epoch_second date formatters now truncate every float they parse, whether it arrives as a string or as a number (coerce-like behavior). This builds on the existing behavior of parsing all dates as strings.
As a result, there is no 'coerce parameter' for the DateFieldMapper; the 'coerce parameter' remains valid only for numeric data types.
The coerce behavior is implicitly enabled for one specific formatter family only, i.e. epoch_*.
A coerce parameter cannot be defined at the DateFieldMapper level independently of the date format, because some formats would conflict, e.g. basic_time and epoch_second given as a float.
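A minimal sketch of the truncation described above, built around the `new BigDecimal(text).longValue() * factor` expression from this PR's diff. The class and method names are hypothetical and chosen only for illustration; the real formatter performs additional checks that are omitted here.

```java
import java.math.BigDecimal;

// Illustrative only: EpochFloatSketch and parseEpoch are hypothetical names,
// not classes or methods from the Elasticsearch code base.
final class EpochFloatSketch {

    // Returns milliseconds since the epoch. Any fractional part of the input
    // is discarded before the seconds-to-millis factor is applied.
    static long parseEpoch(String text, boolean hasMilliSecondPrecision) {
        int factor = hasMilliSecondPrecision ? 1 : 1000;
        return new BigDecimal(text).longValue() * factor;
    }

    public static void main(String[] args) {
        // epoch_millis style input: the ".75" is dropped.
        System.out.println(parseEpoch("1502236800000.75", true));
        // epoch_second style input: the ".75" is dropped, then scaled to millis.
        System.out.println(parseEpoch("1502236800.75", false));
    }
}
```

Note that, in this sketch, fractional seconds are discarded rather than converted to milliseconds, because truncation happens before the multiplication.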

Closes: #14641

Epoch millis and second formats accept float implicitly
The coerce parameter is implicitly true for the epoch
millis DateFormatter. It is not defined for other date formatters.
This extends the current "coerce" from numbers to strings for all dates.

See: #14641
@albertzaharovits

Contributor Author

commented Aug 9, 2017

@cbuescher I think from/to as double from org.elasticsearch.search.aggregations.bucket.range.RangeAggregator can be removed, since all dates can now be parsed as strings, plus this avoids the conversion from double to long. What do you think?

@colings86
Member

left a comment

@albertzaharovits I left a small comment

@@ -262,6 +262,26 @@ public void testThatEpochsCanBeParsed() {
}
}

public void testThatFloatEpochsCanBeParsed() {

long millisFromEpoch = randomNonNegativeLong();

@colings86

colings86 Aug 10, 2017

Member

Since dates before epoch 0 can still be expressed as a long, and may well be encountered if the user is indexing historical data, should we test negative epoch values here too?
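To illustrate the point about pre-epoch dates, here is a small sketch using the JDK's `java.time.Instant` (not the Joda-Time classes this PR touches). The class and method names are hypothetical.

```java
import java.time.Instant;

// Illustrative only: NegativeEpochSketch is a hypothetical name.
final class NegativeEpochSketch {

    // Renders an epoch-millis value as an ISO-8601 instant in UTC.
    static String describe(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // A negative value is a perfectly valid long and denotes a time before 1970.
        System.out.println(describe(-1000L)); // 1969-12-31T23:59:59Z
        System.out.println(describe(0L));     // 1970-01-01T00:00:00Z
    }
}
```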

@cbuescher
Member

left a comment

@albertzaharovits this looks great, I left a couple of smaller comments but nothing big.

Some more things:

  • Since the original issue revolves around documents not being indexed when they have float values, should we add an integration test for this? DateFieldMapperTests already more or less checks that, but I think it would be good to have one round-trip test here as well.

  • Not sure if we also want to parse floats with "," as decimal separator, but maybe that's overkill. Any opinions on this, @colings86?

  • As I understand it, this change now makes "coerce" : false not reject any Strings any more for epoch_millis and epoch_seconds. I just want to double-check that this is okay; maybe we can document this somewhere?

@@ -331,7 +332,8 @@ public int estimateParsedLength() {
@Override
public int parseInto(DateTimeParserBucket bucket, String text, int position) {
boolean isPositive = text.startsWith("-") == false;
boolean isTooLong = text.length() > estimateParsedLength();
int firstDotIndex = text.indexOf((int)'.');

@cbuescher

cbuescher Aug 10, 2017

Member

nit: I think we don't need the int cast here; it is done implicitly. At least my IDE removes it on "save".
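A short sketch of why the cast is redundant: `String.indexOf(int ch)` takes an `int`, and a `char` argument is widened to `int` automatically. The class and method names are hypothetical.

```java
// Illustrative only: IndexOfCastSketch is a hypothetical name.
final class IndexOfCastSketch {

    // The char literal '.' is widened to int implicitly, so no cast is needed.
    static int firstDotIndex(String text) {
        return text.indexOf('.');
    }

    public static void main(String[] args) {
        String text = "1502236800.75";
        // Both calls resolve to String.indexOf(int) and return the same index.
        System.out.println(firstDotIndex(text) == text.indexOf((int) '.'));
        // A value without a fractional part yields -1.
        System.out.println(firstDotIndex("1502236800"));
    }
}
```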

@@ -342,7 +344,7 @@ public int parseInto(DateTimeParserBucket bucket, String text, int position) {

int factor = hasMilliSecondPrecision ? 1 : 1000;
try {
- long millis = Long.valueOf(text) * factor;
+ long millis = new BigDecimal(text).longValue() * factor;

@cbuescher

cbuescher Aug 10, 2017

Member

Nice, so this can handle all kinds of formats it seems.
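As a rough illustration of the comment above, the sketch below probes which textual forms the `BigDecimal(String)` constructor accepts. It exercises the JDK class in isolation: the other checks in `parseInto` (such as the length check) are not modeled, so the real formatter may reject inputs that pass here. The class and method names are hypothetical.

```java
import java.math.BigDecimal;

// Illustrative only: AcceptedNumberFormsSketch is a hypothetical name.
final class AcceptedNumberFormsSketch {

    // True if BigDecimal can parse the text; false if it throws NumberFormatException.
    static boolean isAccepted(String text) {
        try {
            new BigDecimal(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Integer part of the parsed value; the fraction is discarded (toward zero).
    static long truncated(String text) {
        return new BigDecimal(text).longValue();
    }

    public static void main(String[] args) {
        for (String candidate : new String[] {"1500", "1500.9", "1.5e3", "-12.9", "1,5", "abc"}) {
            System.out.println(candidate + " accepted: " + isAccepted(candidate));
        }
        System.out.println(truncated("1.5e3")); // exponent notation is understood
    }
}
```

Plain decimals, signed values, and exponent notation parse; a comma as decimal separator or non-numeric text does not.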


// test floats get truncated
String epochFloatValue = String.format(Locale.US, "%d.%d", dateTime.getMillis() / (parseMilliSeconds ? 1L : 1000L), randomNonNegativeLong());
assertThat(formatter.parser().parseDateTime(epochFloatValue).getMillis(), is(dateTime.getMillis()));

@cbuescher

cbuescher Aug 10, 2017

Member

Should we also support European decimal separators like ","? Maybe not, but I just wanted to throw it in; I'm not sure whether this complicates things too much.

@albertzaharovits

albertzaharovits Aug 10, 2017

Author Contributor

I think we should accept numbers as defined by JavaScript/JSON schema, not formatted string representations of numbers. The issue is that we didn't accept numbers from JavaScript, which does not have an integer datatype. Accepting string representations of valid JavaScript numbers is a bonus, since all dates are parsed as strings anyway.

@cbuescher
@@ -301,16 +285,26 @@ public void testThatNegativeEpochsCanBeParsed() {
assertThat(dateTime.getSecondOfMinute(), is(20));
}

// test floats get truncated
String epochFloatValue = String.format(Locale.US, "%d.%d", dateTime.getMillis() / (parseMilliSeconds ? 1L : 1000L), randomNonNegativeLong());

@cbuescher

cbuescher Aug 10, 2017

Member

Should this be a negative long in this test?

@albertzaharovits

albertzaharovits Aug 10, 2017

Author Contributor

It is negative, since dateTime.getMillis() is negative; the non-negative long is only for the fractional part.

@cbuescher

cbuescher Aug 10, 2017

Member

I get it now, thanks.
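The sign handling discussed in this thread can be sketched as follows: the `"%d.%d"` pattern from the test takes its sign from the whole part, while the second argument only supplies fraction digits. The class and method names and the sample values are hypothetical; the truncation reuses the `BigDecimal.longValue()` call from the diff.

```java
import java.math.BigDecimal;
import java.util.Locale;

// Illustrative only: NegativeFloatEpochSketch is a hypothetical name.
final class NegativeFloatEpochSketch {

    // Builds a float-like string; the sign comes from wholePart alone.
    static String toFloatString(long wholePart, long fractionDigits) {
        return String.format(Locale.US, "%d.%d", wholePart, fractionDigits);
    }

    // Discards the fraction, rounding toward zero.
    static long truncate(String text) {
        return new BigDecimal(text).longValue();
    }

    public static void main(String[] args) {
        String epochFloatValue = toFloatString(-1234L, 567L);
        System.out.println(epochFloatValue);           // -1234.567
        System.out.println(truncate(epochFloatValue)); // -1234
    }
}
```

Because truncation is toward zero, the negative whole part is recovered unchanged, which is what the test's assertion relies on.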

assertEquals(mapping, mapper.mappingSource().toString());

long millisFromEpoch = randomNonNegativeLong();
String epochFloatValue = String.format(Locale.US, "%d.%d", millisFromEpoch, randomNonNegativeLong());

@cbuescher

cbuescher Aug 10, 2017

Member

Maybe also randomly append a negative prefix to also test parsing negative values here?

@@ -245,6 +245,11 @@ public void doTestCoerce(String type) throws IOException {
IndexableField pointField = fields[1];
assertEquals(2, pointField.fieldType().pointDimensionCount());

// date_range ignores the coerce parameter and epoch_millis date format truncates floats (see issue: #14641)
if (type.equals("date_range")) {
return;

@cbuescher

cbuescher Aug 10, 2017

Member

nit: maybe just personal preference, but early returns in tests look strange to me. Can you change this to execute the rest of the test only if type.equals("date_range") == false?

assertThat(searchResponse.getHits().getTotalHits(), equalTo(3L));
buckets = checkBuckets(searchResponse.getAggregations().get("date_range"), "date_range", 2);
assertBucket(buckets.get(0), 2L, "1000-3000", 1000000L, 3000000L);
assertBucket(buckets.get(1), 1L, "3000-4000", 3000000L, 4000000L);

@cbuescher

cbuescher Aug 10, 2017

Member

Great, this works.

@albertzaharovits albertzaharovits changed the title from "Epoch millis and second formats accept float implicitly" to "Epoch millis and second formats accept float implicitly (Closes #14641)" Aug 10, 2017

@cbuescher

Member

commented Aug 10, 2017

@albertzaharovits thanks, those recent changes look good to me. That leaves the question about whether we should document the changed behaviour around "coerce" : false. Maybe @colings86 also wants to take another look at this?

@albertzaharovits

Contributor Author

commented Aug 10, 2017

@cbuescher I am not sure what you mean by:

"coerce" : false not reject any Strings any more for epoch_millis and epoch_seconds

coerce parameter is invalid for date field type and is ignored in aggregations.
Any String that is a valid number is acceptable, other strings are considered malformed.

@cbuescher

Member

commented Aug 10, 2017

coerce parameter is invalid for date field type and is ignored in aggregations

Thanks, that's what I was missing. So the existing behaviour is that coerce doesn't work with the date datatype? I wasn't really sure from reading the coerce docs.

@albertzaharovits

Contributor Author

commented Aug 10, 2017

That's correct, coerce does not work with date and this is not changing.

@cbuescher
Member

left a comment

Thanks, LGTM. I think CI might be still failing because of unrelated problems, had the same yesterday. Maybe rebasing or merging in master helps to get a clean build. Also I don't know if you want to wait for @colings86 to have another look, I'm good.
Thanks a lot for this change.

@colings86
Member

left a comment

LGTM

@albertzaharovits albertzaharovits merged commit 3e3132f into elastic:master Aug 13, 2017

2 checks passed

CLA: Commit author is a member of Elasticsearch
elasticsearch-ci: Build finished.

albertzaharovits added a commit that referenced this pull request Aug 13, 2017

Epoch millis and second formats parse float implicitly (Closes #14641) (#26119)

`epoch_millis` and `epoch_second` date formats truncate float values, as numbers or as strings.
The `coerce` parameter is not defined for `date` field type and this is not changing.
See PR #26119

Closes #14641

albertzaharovits added a commit that referenced this pull request Aug 13, 2017

Epoch millis and second formats parse float implicitly (Closes #14641) (#26119)

`epoch_millis` and `epoch_second` date formats truncate float values, as numbers or as strings.
The `coerce` parameter is not defined for `date` field type and this is not changing.
See PR #26119

Closes #14641

jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Aug 14, 2017

Merge branch 'master' into es-path-conf
* master: (30 commits)
  Rewrite range queries with open bounds to exists query (elastic#26160)
  Fix eclipse compilation problem (elastic#26170)
  Epoch millis and second formats parse float implicitly (Closes elastic#14641) (elastic#26119)
  fix SplitProcessor targetField test (elastic#26178)
  Fixed typo in README.textile (elastic#26168)
  Fix incorrect class name in deleteByQuery docs (elastic#26151)
  Move more token filters to analysis-common module
  reindex: automatically choose the number of slices (elastic#26030)
  Fix serialization of the `_all` field. (elastic#26143)
  percolator: Hint what clauses are important in a conjunction query based on fields
  Remove unused Netty-related settings (elastic#26161)
  Remove SimpleQueryStringIT#testPhraseQueryOnFieldWithNoPositions.
  Tests: reenable ShardReduceIT#testIpRange.
  Allow `ClusterState.Custom` to be created on initial cluster states (elastic#26144)
  Teach the build about betas and rcs (elastic#26066)
  Fix wrong header level
  inner hits: Unfiltered nested source should keep its full path
  Document how to import Lucene Snapshot libs when elasticsearch clients (elastic#26113)
  Use `global_ordinals_hash` execution mode when sorting by sub aggregations. (elastic#26014)
  Make the README use a single type in examples. (elastic#26098)
  ...

@clintongormley clintongormley changed the title from "Epoch millis and second formats accept float implicitly (Closes #14641)" to "Epoch millis and second formats accept float implicitly" Aug 17, 2017

@lcawl lcawl removed the v6.1.0 label Dec 12, 2017

@colings86 colings86 added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
