
date_range with gte lower than lte (<24h) but errors with min value greater than max value #108241

Closed
boesing opened this issue May 3, 2024 · 9 comments
Labels
>bug · :Search Foundations/Mapping (Index mappings, including merging and defining field types) · Team:Search Foundations (Meta label for the Search Foundations team in Elasticsearch)

Comments


boesing commented May 3, 2024

Elasticsearch Version

8.13.2

Installed Plugins

No response

Java Version

bundled

OS Version

Debian 11.9, Linux 5.10

Problem Description

Somehow, a date_range value of one of our indexed fields is not parsed properly.
When I try to index a range whose bounds are close together (less than 24 hours apart), I receive the following error:

{
    "errors": true,
    "took": 5,
    "items": [
        {
            "create": {
                "_index": "daterange_test",
                "_id": "hGCvPY8BGbiuHIZU7bxk",
                "status": 400,
                "error": {
                    "type": "document_parsing_exception",
                    "reason": "[1:93] failed to parse field [indexed.timeFrame] of type [date_range] in document with id 'hGCvPY8BGbiuHIZU7bxk'. Preview of field's value: 'null'",
                    "caused_by": {
                        "type": "illegal_argument_exception",
                        "reason": "min value (1704128073000) is greater than max value (1704124499000)"
                    }
                }
            }
        }
    ]
}

I have tried a bunch of Elasticsearch versions, starting from 8.12.3, up to 8.13.2, and down to 8.11.0.
Somehow, the issue does not appear on 8.11.0 on Elastic Cloud (which might be because the Lucene version there is 9.7.0, as that is the only difference I could spot between my local 8.11.0 and the one from elastic.co).

I have also tried multiple servers where we have Elasticsearch installed, fresh setups via Docker and Debian, etc., and I always ran into this specific problem. The date range we are trying to persist spans less than 24 hours, but it works perfectly if it spans 24 hours and 1 second (though I only tested that by changing the time).

Steps to Reproduce

  • Use docker setup guide of elastic.co to install 8.13.2: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html

  • Create a fresh index using the following mapping

    {
        "mappings": {
            "date_detection": false,
            "dynamic": "strict",
            "properties": {
                "indexed": {
                    "type": "object",
                    "properties": {
                        "timeFrame": {
                            "type": "date_range",
                            "format": "YYYY-MM-dd'T'HH:mm:ssz"
                        }
                    }
                }
            }
        }
    }
  • Use _bulk API to create a new document with the following request payload:

    {"create":{}}
    {"indexed":{"timeFrame":{"gte":"2024-04-29T18:54:33+02:00","lte":"2024-04-30T17:54:59+02:00"}}}
    

Logs (if relevant)

No response

@boesing added the >bug and needs:triage labels May 3, 2024
@pxsalehi added the :Search Foundations/Mapping label May 3, 2024
@elasticsearchmachine added the Team:Search label May 3, 2024
@elasticsearchmachine (Collaborator)

Pinging @elastic/es-search (Team:Search)

@elasticsearchmachine removed the needs:triage label May 3, 2024

Kmaralla commented May 5, 2024

Did you check the local time of the ES nodes? My guess is that some of them have different timezones, contributing to this issue with < 24 h durations.


boesing commented May 5, 2024

I did not, as I can already reproduce it on a single node. Even so, the local timezone should be ignored, since the timezone is required to be submitted in the document's value. That's why I explicitly disabled date_detection in the mapping and defined the format of the date_range to contain a timezone.
Therefore, I would not expect the ES server timezone to have any side effects at all. But maybe my assumption is incorrect 🤷🏼‍♂️

@benwtrent (Member)

@boesing The timestamps in the exception are as expected. The min value is indeed greater than the max value, and this is not allowed when indexing into a date_range field:

min: 1704128073000
max: 1704124499000

Note that the seventh digit in min is an 8, but it's a 4 in max.

Maybe I don't understand the issue?


boesing commented May 6, 2024

@benwtrent Could you check the data I am sending to Elasticsearch? These dates are not integers, yet Elasticsearch parses them to integers, and I suspect there is a bug.

The difference between those dates is roughly 23 hours and a couple of minutes. Something within Elasticsearch does not convert those strings into the correct integers, and thus, of course, the values in the error reflect a problem, but it is not caused on the client side.

If this were a client issue, I do not understand why the same request works on Elastic Cloud, which is on 8.11.0 with Lucene 9.7.0.

@benwtrent (Member)

@boesing you are using YYYY instead of yyyy for your date parsing. These are two different things.

YYYY represents "week of year", which is a tricky definition in the ISO year-week calendar. You almost never want this.

yyyy represents the calendar year, which is what people normally mean by "year".
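To make the difference concrete, here is a small java.time demo (an editor-added sketch, not part of the thread) that formats a date falling into ISO week 1 of the following year with both patterns:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class WeekYearDemo {
    public static void main(String[] args) {
        // 2024-12-30 is a Monday that belongs to ISO week 1 of 2025.
        LocalDate d = LocalDate.of(2024, 12, 30);
        // 'Y' = week-based year, 'y' = calendar year
        System.out.println(DateTimeFormatter.ofPattern("YYYY-MM-dd", Locale.ROOT).format(d)); // 2025-12-30
        System.out.println(DateTimeFormatter.ofPattern("yyyy-MM-dd", Locale.ROOT).format(d)); // 2024-12-30
    }
}
```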

FYI, I did your test with yyyy and things were indexed just fine.

these dates are not integers, yet Elasticsearch parses them to integers, and I suspect there is a bug.

We parse all values and index them into long values representing the milliseconds since the epoch. So, even though you are not supplying a number, all dates are technically numbers and we store them as such.
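For illustration (editor-added, not part of the thread), parsing the two submitted bounds with plain java.time shows the epoch millisecond values the mapper should have produced, compared with the ones in the original error:

```java
import java.time.OffsetDateTime;

public class EpochDemo {
    public static void main(String[] args) {
        // What the submitted bounds become as milliseconds since the epoch:
        long gte = OffsetDateTime.parse("2024-04-29T18:54:33+02:00").toInstant().toEpochMilli();
        long lte = OffsetDateTime.parse("2024-04-30T17:54:59+02:00").toInstant().toEpochMilli();
        System.out.println(gte); // 1714409673000 (2024-04-29T16:54:33Z)
        System.out.println(lte); // 1714492499000 (2024-04-30T15:54:59Z)
        // The error instead reports 1704128073000 and 1704124499000, which are
        // 2024-01-01T16:54:33Z and 2024-01-01T15:54:59Z: the same wall-clock
        // times, but both collapsed onto January 1st, hence min > max.
    }
}
```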


boesing commented May 6, 2024

Ah, I wasn't aware of that. So it's the format which introduces the problem.
I will try that, thanks for the clarification! That might fix our problem.
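For reference, a sketch of the corrected mapping (editor-added), changing only the format pattern from YYYY to yyyy as suggested above:

```json
{
    "mappings": {
        "date_detection": false,
        "dynamic": "strict",
        "properties": {
            "indexed": {
                "type": "object",
                "properties": {
                    "timeFrame": {
                        "type": "date_range",
                        "format": "yyyy-MM-dd'T'HH:mm:ssz"
                    }
                }
            }
        }
    }
}
```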

@boesing boesing closed this as completed May 6, 2024

boesing commented May 7, 2024

@benwtrent What I still do not understand is: if Elasticsearch uses SimpleDate as described here and here, how does that lead to issues for dates in April?

I've adapted our mapping but I still don't get the underlying issue with my concrete example.

  • Why is it valid on Elastic Cloud but not when running Docker?

  • How are the week-based years not equal (both dates are in calendar week 18, and even if they weren't in the same week, the earlier date would be in a lower week)?

So I would expect issues at the end of December, where the week-based year could become 2025 on December 30th, 2024, but not at the end of April 🤔

@benwtrent (Member)

Why is it valid on Elastic Cloud but not when running Docker?

I don't think it is? I got the exact same error as you when testing in Cloud; in any case, it's obvious that the resulting min and max are incorrect for a range mapping.

How are the week-based years not equal (both dates are in the same calendar week, but even if they weren't, the earlier date would be in a lower week)?

I honestly don't know.

But we are not using SimpleDate. We do utilize the underlying Java date parsing, but it is much more complicated.

Here is the parsing once we have the temporal accessor:

public static ZonedDateTime from(TemporalAccessor accessor, Locale locale, ZoneId defaultZone) {

Here is the code once we have parsed out the TemporalAccessor for WeekOfYear:

    private static LocalDate localDateFromWeekBasedDate(TemporalAccessor accessor, Locale locale) {
        WeekFields weekFields = WeekFields.of(locale);
        if (accessor.isSupported(weekFields.weekOfWeekBasedYear())) {
            return LocalDate.ofEpochDay(0)
                .with(weekFields.weekBasedYear(), accessor.get(weekFields.weekBasedYear()))
                .with(weekFields.weekOfWeekBasedYear(), accessor.get(weekFields.weekOfWeekBasedYear()))
                .with(TemporalAdjusters.previousOrSame(weekFields.getFirstDayOfWeek()));
        } else {
            return LocalDate.ofEpochDay(0)
                .with(weekFields.weekBasedYear(), accessor.get(weekFields.weekBasedYear()))
                .with(TemporalAdjusters.previousOrSame(weekFields.getFirstDayOfWeek()));
        }
    }
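A minimal sketch (editor-added, assuming Locale.ROOT week fields) of what that else branch does when only the week-based year was parsed from the YYYY pattern: the month and day are discarded, and every 2024 date collapses onto the start of week 1.

```java
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.time.temporal.WeekFields;
import java.util.Locale;

public class WeekBasedCollapse {
    public static void main(String[] args) {
        WeekFields wf = WeekFields.of(Locale.ROOT);
        // Mirrors the else branch: start from the epoch date, move it to
        // week-based year 2024, then roll back to the first day of that week.
        LocalDate d = LocalDate.ofEpochDay(0)
            .with(wf.weekBasedYear(), 2024)
            .with(TemporalAdjusters.previousOrSame(wf.getFirstDayOfWeek()));
        System.out.println(d); // 2024-01-01
        // Both "2024-04-29..." and "2024-04-30..." resolve to this same date;
        // only the time of day survives, which is why min (16:54:33Z) can end
        // up greater than max (15:54:59Z).
    }
}
```

This would match the values in the original error: 1704128073000 and 1704124499000 both fall on 2024-01-01 UTC.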

@javanna added the Team:Search Foundations label and removed the Team:Search label Jul 16, 2024