Date type does not have enough precision for the logging use case. #10005

Open
jordansissel opened this Issue Mar 5, 2015 · 68 comments

jordansissel (Contributor) commented Mar 5, 2015

At present, the 'date' type is millisecond precision. For many log use cases, higher precision time is valuable - microsecond, nanosecond, etc.

The biggest impact of this is on sorting of search results. If you sort chronologically, newest-first, by a date field, documents with the same date will probably be sorted incorrectly (because they tie). Users often report seeing events "out of order" when those events share a timestamp. A specific example: sorting by date shows events in newest-first order, except when there is a tie, in which case oldest-first (or first-written?) order appears. This causes a fair amount of confusion for the ELK use case.

Related: logstash-plugins/logstash-filter-date#8

I don't have any firm proposals, but I have two different implementation ideas:

  • Proposal 1, use a separate field: Store our own custom-precision time in a separate field as a long. This allows us to do correct sorting (because we have higher precision), but it makes any date-related functionality in Elasticsearch unusable (searching now-1h or doing date_histogram, etc). A minimal sketch of this approach follows below.
  • Proposal 2, date type has tunable precision: Have the date type have configurable precision, with the default (backwards compatible) precision being milliseconds. This would let us choose, for example, nanosecond precision for the logging use case, and year precision for an archaeological use case (billions of years ago, or something). Benefit here is date histogram and other date-related features could still work. Further, having the precision configurable would allow us to keep the underlying data structure a 64bit long and users could choose their most appropriate precision.
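
To make Proposal 1 concrete, here is a minimal sketch, assuming recent (7.x-style) Elasticsearch mapping syntax; the index name, field names, and values are hypothetical. The ordinary millisecond date field keeps searching, now-1h math, and date_histogram working, while the full-precision timestamp is carried as nanoseconds since the epoch in a plain long that is only used to break ties when sorting.

# Hypothetical index: a millisecond-precision date field plus a long tie-breaker field.
PUT logs-example
{
  "mappings": {
    "properties": {
      "@timestamp":       { "type": "date" },
      "@timestamp_nanos": { "type": "long" }
    }
  }
}

# Sort newest-first on the date field, then on the long field to break ties
# between documents that share the same millisecond.
GET logs-example/_search
{
  "sort": [
    { "@timestamp": "desc" },
    { "@timestamp_nanos": "desc" }
  ]
}
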
jordansissel (Contributor) commented Mar 5, 2015

I know Joda's got a precision limit (the Instant class is millisecond precision) and a year limit ("year must be in the range [-292275054,292278993]"). I'm open to helping explore solutions in this area.

synhershko (Contributor) commented Mar 6, 2015

What about the consequences for field data size? Even with doc values in place, cardinality will be ridiculously high. Even for those scenarios which need this, it could be overkill, no?

nikonyrh commented Mar 16, 2015

Couldn't you just store the decimal part of the second in a secondary field (as a float or long) and sort by these two fields when needed? You could still aggregate based on the standard date field but not at a microsecond resolution.

markwalkom (Member) commented Apr 16, 2015

I've been speaking to a few networking firms lately and it's dawned on me that microsecond level is going to be critical for IDS/network analytics.

anoinoz commented Jun 24, 2015

That last request is mine, I believe. I would add that to monitor networking (and other) activities in our field, nanosecond support is paramount.

abrisse commented Sep 4, 2015

👍 for this feature

jack-pappas commented Sep 13, 2015

What about switching from Joda Time to date4j? It supports higher-precision timestamps compared to Joda and supposedly the performance is better as well.

dadoonet (Member) commented Sep 13, 2015

Before looking at the technical side, is the BSD license compatible with the Apache 2 license?

dadoonet (Member) commented Sep 13, 2015

So BSD is compatible with Apache2.

clintongormley (Member) commented Sep 19, 2015

I'd like to hear @jpountz's thoughts on this comment #10005 (comment) about high cardinality with regards to index size and performance.

I could imagine adding a precision parameter to date fields which defaults to ms, but also accepts s, us, ns.

We would need to move away from Joda, but I wouldn't be in favour of replacing Joda with a different dependency. Instead, we have this issue discussing replacing Joda with Java.time #12829
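
Purely to visualise that idea, a hypothetical mapping might look like the sketch below. Note that this precision parameter does not exist in any released Elasticsearch version; the name and the accepted values are taken from the proposal above and nothing more.

# "precision" here is the hypothetical parameter proposed above; it is NOT a real mapping option.
PUT logs-example
{
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date",
        "precision": "us"
      }
    }
  }
}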

jpountz (Contributor) commented Sep 24, 2015

@clintongormley It's hard to predict because it depends so much on the data, so I ran an experiment for an application that ingests 1M messages at a rate of 2,000 messages per second per shard.

Precision      Terms dict (kB)   Doc values (kB)
milliseconds   3348              2448
microseconds   10424             3912

Millisecond precision is much more space-efficient, in particular because with 2k docs per second, several messages are in the same millisecond, but even if we go with 1M messages at a rate of 200 messages per second so that sharing the same millisecond is much more unlikely, there are still significant differences between millisecond and microsecond precision.

Precision      Terms dict (kB)   Doc values (kB)
milliseconds   7604              2936
microseconds   10680             4888

That said, these numbers are for a single field, the overall difference would be much lower if you include _source storage, indexes and doc values for other fields, etc.

Regarding performance, it should be pretty similar.

clintongormley (Member) commented Sep 25, 2015

From what users have told me, by far the most important reason for storing microseconds is the sorting of results. It makes no sense to aggregate on buckets smaller than a millisecond.

This can be achieved very efficiently with the two-field approach: one for the date (in milliseconds) and one for the microseconds. The microseconds field would not need to be indexed (unless you really need to run a range query with finer precision than one millisecond), so all that would be required is doc_values. Microseconds can have a maximum of 1,000 values, so doc_values for this field would require just 12 bits per document.

For the above example, that would be only an extra 11kB.

A logstash filter could make adding the separate microsecond field easy.
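
A sketch of that two-field layout, assuming recent Elasticsearch mapping syntax (the index and field names are hypothetical): the sub-millisecond remainder (0-999 microseconds) goes into a small numeric field that is not indexed and only carries doc values, and it is used as the secondary sort key exactly as in the sketch near the top of this issue.

# Hypothetical mapping: millisecond date field plus a doc_values-only microsecond remainder.
PUT logs-example
{
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" },
      "us_fraction": {
        "type": "short",
        "index": false,
        "doc_values": true
      }
    }
  }
}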

clintongormley (Member) commented Sep 25, 2015

Meh - there aren't 1000 bits in a byte. /me hangs his head in shame.

It would require about 1,500kB (1M documents × 12 bits ≈ 1.5MB), not 11kB.

rashidkpc referenced this issue in elastic/kibana Sep 28, 2015

Open

Nanosecond times #2498

pfennema commented Sep 30, 2015

If we want to use the ELK framework properly for analyzing network latency, we really need nanosecond resolution. Are there any firm plans/roadmap to change the timestamps?


portante commented Oct 1, 2015

Let's say I index the following JSON document with nanosecond precision timestamps:

{ "@timestamp": "2015-09-30T12:30:42.123456789-07:00", "message": "time is running out" }

So the internal date representation will be 2015-09-30T19:30:42.123 UTC, right?

But if I issue a query matching that document, and ask for either the _source document or the @timestamp field explicitly, won't I get back the original string? If so, then in cases where the original time string lexicographically sorts the same as the converted time value, would that be sufficient for a client to further sort to get what they need?

Or is there a requirement that internal date manipulations in ES need such nanosecond precision? I am imagining that if one has records with nanosecond precision, only being able to query for a date range with millisecond precision could potentially result in more document matches than wanted. Is that the major concern?

pfennema commented Oct 2, 2015

I think the latter: internal date manipulations probably need nanosecond precision. The reason is that when monitoring latency on 10Gb networks we get pcap records (or packets directly from the switch via UDP) which include multiple fields with nanosecond timestamps in the record. We would like to find the differences between those timestamps in order to optimize our network/software and find correlations, and to do this we want to zoom in on every single record rather than aggregate records.

abierbaum commented Oct 25, 2015

👍 for solving this. It is causing major issues for us now in our logging infrastructure.

clintongormley (Member) commented Oct 26, 2015

@pfennema @abierbaum What problems are you having that can't be solved with the two-field solution?

pfennema commented Oct 26, 2015

What we would like is a timescale in the display (Kibana) where we can zoom in on the individual measurements, which have timestamps with nanosecond resolution. A record in our case has multiple fields (NICTimestamp, TransactionTimestamp, etc.) which we would like to correlate with each other on an individual basis, hence not aggregated. We need to see where spikes occur to optimize our environment. If we can have the time on the x-axis in micro/nanosecond resolution, we should be able to zoom in on individual measurements.

abierbaum commented Oct 26, 2015

@clintongormley Our use case is using ELK to analyze logs from the backend processes in our application. The place we noticed it was postgresql logs. With the current ELK code base, even though the logs coming from the database server have the commands in order, once they end up in Elasticsearch and are visualized in Kibana, the order of items happening in the same millisecond is lost. We can add a secondary sequence number field, but that doesn't work well in Kibana queries (since you can't sort on multiple fields) and causes quite a bit of confusion on the team, because they just expect the data in Kibana to be sorted in the same order as it came in from postgresql and logstash.


gigi81 commented Nov 20, 2015

We have the same problem as @abierbaum described. When events happen on the same millisecond the order of the messages is lost.
Any workaround or suggestion on how to fix this would be really appreciated.

tbragin referenced this issue in elastic/kibana Jan 6, 2016

Closed

Extract log event context #275


dtr2 commented Jan 17, 2016

You don't need to increase the timestamp accuracy: instead, the time sorting should be based on both timestamp and ID: message IDs are monotonically increasing, and specifically, they are monotonically increasing for a set of messages with the same timestamp...


jcollie commented Jan 17, 2016

@dtr2 That may be true for IDs that are automatically assigned, but only if the messages are indexed in the correct order in the first place, and only if the application isn't supplying its own IDs. There's definitely no way that I could depend on that behavior. Also, are monotonically increasing IDs guaranteed, or are they an implementation artifact, especially when considering clusters of more than one node?


dtr2 commented Jan 18, 2016

I believe the original intent was to see a bulk of log messages originating from the same source. If a cluster is involved, then probably the timestamp is the only clustering item (unless, of course, there is a special "context" or "session" field).
For that purpose, we can rely on the id (assuming, of course, it's monotonically increasing at least per source).


jcollie commented Jan 18, 2016

@dtr2 that's a lot of "ifs" to be relying on ElasticSearch's autogenerated IDs, nearly all of which are violated in my cluster:

  1. Some messages supply their own ID if the source has a unique ID already associated with it (systemd journal messages in my case).
  2. All of my data runs through a RabbitMQ server (sometimes passing through multiple queues depending on the amount of processing that needs to be done) with multiple consumers per queue so there's no way that I can expect documents to be indexed in any specific order, much less by the same ElasticSearch node.

In any case, ElasticSearch does not guarantee the behavior of autogenerated IDs. The docs only guarantee that the IDs are unique:

https://www.elastic.co/guide/en/elasticsearch/guide/current/index-doc.html

So I hope you can see that trying to impose an order based upon the ID cannot be relied upon in the general case. Yes, there may be certain specific instances where that would work, but you'd be relying on undocumented implementation details.


dtr2 commented Jan 18, 2016

@jcollie, in that case, trying to find a "context" is impossible - unless your data source provides it. The idea was to find a context and filter "related" lines together.

bobrik (Contributor) commented Jan 18, 2016

@jcollie "IDs" (in fact they are counters per log source) have to be generated before ingestion, outside of elasticsearch.

Instead of sorting on time alone, you sort on (time, counter).


jcollie commented Jan 19, 2016

I don't get it - what is the resistance to extending timestamps to nanosecond accuracy? I realize that would take a lot of work and would likely be a "3.x" feature, but anything else is just a workaround until nanosecond timestamps are available.

Having a counter per log source only really helps correlating messages from the same source, but is really not very useful in correlating messages across sources/systems.

As an example, let's say that I implement this counter in each of my log sources (for example as a logstash plugin). Then let's say that I have one source that generates 1000 messages per millisecond and another source that generates 100000 messages per millisecond. There's no way that I could reliably tell what order those messages should be in relative to each source. That may be an extreme example but I think that it illustrates the point.

bobrik (Contributor) commented Jan 19, 2016

Having a counter per log source only really helps correlating messages from the same source, but is really not very useful in correlating messages across sources/systems.

@jcollie can you tell me how you keep clocks on your machines in perfect sync so nanosecond accuracy starts making sense? Even Google struggles to do so:

Then let's say that I have one source that generates 1000 messages per millisecond and another source that generates 100000 messages per millisecond. There's no way that I could reliably tell what order those messages should be in relative to each source.

You are right here. There is no way. You can easily see a jitter of a few ms in ntp sync between machines in the same rack:

19 Jan 09:27:40 ntpdate[11731]: adjust time server 10.36.14.18 offset -0.002199 sec
19 Jan 09:27:50 ntpdate[11828]: adjust time server 10.36.14.18 offset 0.004238 sec

On the other hand, you can reliably say in which order messages were processed by a single source:

  • Single thread of your program.
  • Single queue in logstash.
  • Some other strictly ordered sequence (ex: kafka partition).

I'd be happy to be proven wrong, though.

pfennema commented Jan 19, 2016

Clock synchronisation between machines is done by PTP if you need an accurate synchronisation. Which is needed when measuring low latency/high frequency trading networks. The PTP source is usually the switch in the network (Arista, Cisco).


saffroy commented Mar 29, 2017

@portante Actual timestamps convey much more information than just sequence numbers, can be much easier to generate (eg. across multiple processes) and situations exist where a certain precision (microsecond, nanosecond) gives correct ordering sufficiently often to be useful.

I have a use case where we collect performance events from modules inside a running process and across distributed processes (like Google Dapper): millisecond precision is insufficient to measure differences between closely related events, but we still need some absolute time to relate that to other events. The occasional clock glitch breaks perfect ordering, but it isn't an issue in practice, because it's performance data and we have many samples. So we worked around the loss of precision in ES by storing timestamps twice, in a date field (millisecond precision, good enough for search, and for humans to understand) and in a double field (microseconds since epoch, good for computations). Not exactly optimal though.
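
For reference, a sketch of that dual-storage workaround, assuming recent Elasticsearch mapping syntax (the index and field names are hypothetical). One aside on the choice of double: IEEE 754 doubles represent integers exactly up to 2^53 (about 9.0e15), and microseconds since the epoch are currently around 1.5e15, so present-day microsecond timestamps are still stored exactly.

# Hypothetical mapping for the dual-storage workaround described above.
PUT perf-events
{
  "mappings": {
    "properties": {
      "@timestamp": { "type": "date" },
      "ts_usec":    { "type": "double" }
    }
  }
}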

portante commented Mar 30, 2017

@saffroy, you are correct. I was not trying to convey otherwise, just stating that you cannot rely on timestamps for ordering logs flowing from a source. That requires a monotonically increasing sequence number. This is an old but good paper on the topic.


ghost commented Mar 30, 2017

@portante Why not have both tools in the arsenal? Let's say that for my case nanosecond time precision is sufficient to reconstruct the logs. Somebody else will have logs so dense that even nanoseconds won't be enough; no problem, they will do it with sequence numbers, and will use both nanoseconds AND sequence numbers when required.

My impression is that it's better to have one field rather than two fields for reconstructing the sequence of events. Even now in ES you have "offset", which can be used in some situations, but again, that looks like a workaround.

Not to mention that in Kibana, for proper visualisations, it should be one field, as field grouping is not supported.

So nanoseconds are needed anyway and could be used in most situations; those who need strict order can add sequence numbers to their logs.

jordansissel (Contributor) commented Mar 30, 2017

The occasional clock glitch breaks perfect ordering, but it isn't an issue in practice, because it's performance data and we have many samples

Broad declarations like "occasional clock glitch ... not an issue in practice" are not very helpful - while this may be your experience, it has not been mine.

Time is an amazing and interesting topic, but let's stay focused on the issue of date type precision in Elasticsearch, and not about measuring time or properties of time itself.

clintongormley (Member) commented Mar 30, 2017

Stalled on #12829

clintongormley added the stalled label and removed the discuss label Mar 30, 2017


saffroy commented Mar 31, 2017

@jordansissel The entire paragraph was only meant to give another example of a situation where we would benefit from increased precision when storing timestamps in Elasticsearch, and the remark about clock glitch ("not an issue in practice, because it's performance data") was to be understood within this specific example. Sorry if I wasn't clear about that.

mbullock1986 commented Aug 20, 2017

Hi All,

I know this is difficult, but I wondered whether this issue has moved on with the advent of version 6?

Thanks!

jpountz (Contributor) commented Aug 21, 2017

Sorry, it hasn't moved.

jchannon commented Aug 30, 2017

Any ideas when it will be? I've been recommended ELK and I hit this

jpountz (Contributor) commented Aug 30, 2017

No idea. The only thing I can tell is that it won't be fixed in the short term.


jchannon commented Aug 30, 2017

Am I right that the issue, for dumb people like me, is that when I send in "ts": "2017-08-30T14:26:30.9157480Z", ES converts that to 1504103190915 - chopping off the last four digits - and parses that as a date, but the sub-millisecond digits are lost, so sorting/search is not as accurate as expected?
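
That is exactly the truncation this issue is about. A quick way to see it, as a sketch (assuming a recent Elasticsearch version that accepts the extra fraction digits, and a throwaway index named ts-test):

# Explicit date mapping, then index the document from the comment above.
PUT ts-test
{
  "mappings": {
    "properties": {
      "ts": { "type": "date" }
    }
  }
}

PUT ts-test/_doc/1
{ "ts": "2017-08-30T14:26:30.9157480Z" }

# Requesting the doc value exposes the stored value: the millisecond epoch
# 1504103190915, with the trailing sub-millisecond digits gone, even though
# _source still returns the original string untouched.
GET ts-test/_search
{
  "docvalue_fields": [ { "field": "ts", "format": "epoch_millis" } ]
}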

synhershko (Contributor) commented Aug 30, 2017

@jchannon what is your use case that requires that precision?

@jpountz maybe 6.x sorted indexes can be the answer to this, or are they using the same precision as the indexed values?

jchannon commented Aug 30, 2017

My use case? I have always logged to 6 decimal places and want to keep it that way. I'm astounded that this highly recommended piece of software is so poor at storing/converting dates.

jchannon referenced this issue in uken/fluent-plugin-elasticsearch Sep 7, 2017

Closed

What happens if time_key is not present #284


StephanX commented Nov 22, 2017

Our use case is that we ingest logs kubernetes => fluentd (0.14) => elasticsearch, and logs that are emitted rapidly (anything under a millisecond apart, which is easily done) obviously have no way of being kept in that order when displayed in kibana.


varas commented Dec 11, 2017

Same issue, we are tracking events that happen within nanosec precision.

Is there any plan to increase it?

clintongormley (Member) commented Dec 11, 2017

Yes, but we need to move from Joda to Java.time in order to do so. See #27330

gavenkoa commented Jan 21, 2018

I opened a bug against Logback, as its core interface also keeps data at millisecond resolution, so precision is lost even earlier, before ES: https://jira.qos.ch/browse/LOGBACK-1374

It seems the historical java.util.Date type is the cause of these problems in the Java world.

shekharupland commented Jan 27, 2018

Same use case: using the kubernetes/filebeat/elasticsearch stack for log collection, but the lack of nanosecond precision is leading to incorrect ordering of logs.

portante commented Jan 27, 2018

Seems like we need to consider the collectors providing a monotonically increasing counter which records the order in which the logs were collected. Nanosecond precision does not necessarily solve the problem because time resolution might not be nanosecond.

lgogolin commented Feb 16, 2018

Seriously guys? This bug is almost 3 years old...


matthid commented Feb 16, 2018

The problem is also that if you try to find a workaround you run into a series of other bugs so there is not even a viable acceptable workaround:

  • If you use a string, sorting will be slow
  • If you use an integer and try to make it readable you will not have big enough numbers (#17006)
  • If you add an additional ordering field you cannot easily configure Kibana to have a "thenBy" ordering on that field.

So the only viable workaround seems to be an epoch timestamp plus 2 additional digits, which are incremented in logstash when timestamps collide.

Has anyone found a better approach?


jraby commented Feb 16, 2018

Been storing microseconds since epoch in an number field for 2 years now.
Suits our needs but YMMV.


tlhampton13 commented May 16, 2018

Not all time data is collected using commodity hardware. There is plenty of specialty equipment that collects nanosecond-resolution data. Thinking about other applications besides log analysis: sorting by time is critical, but aggregations over small timeframes are also important. For example, maybe I just want to aggregate some scientific data over a one-second window or even over a millisecond window.

I have nanosecond resolution data and would love to be able to use ES aggregations to analyze it.
