Make messages with identical timestamps sortable by ULID #6711

mpfz0r · 2019-11-03T17:28:59Z

The ULID in the gl2_message_id field is composed of a 48-bit timestamp followed by 80 bits
of randomness.
This change uses the first 16 bits of the random field to embed a sequence number
for each message.
If a batch of messages is received with identical timestamps
(the same millisecond), the original receive order is preserved by the
encoded sequence number, which directly follows the timestamp.

This allows us to sort messages by gl2_message_id, which should reflect the original
receive order in most cases.
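
The bit layout described above can be sketched as follows. This is a minimal illustration, not the code from this PR: the class name `SequencedUlid`, its methods and the use of `BigInteger` are assumptions made for the example. Only the split into a 48-bit timestamp, a 16-bit sequence number and 64 remaining random bits comes from the description; the Crockford Base32 alphabet is the one used by the ULID format.

```java
import java.math.BigInteger;
import java.security.SecureRandom;

/** Sketch only: hypothetical helper, not Graylog's MessageULIDGenerator. */
final class SequencedUlid {
    // Crockford Base32 alphabet used by ULIDs; its ASCII order matches its numeric order.
    private static final char[] CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ".toCharArray();
    private static final BigInteger MASK_64 = BigInteger.ONE.shiftLeft(64).subtract(BigInteger.ONE);
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Creates a ULID whose first 16 "random" bits carry the sequence number. */
    static String create(long timestampMillis, int sequenceNr) {
        return encode(timestampMillis, sequenceNr, RANDOM.nextLong());
    }

    /** Deterministic variant: the caller supplies the remaining 64 random bits. */
    static String encode(long timestampMillis, int sequenceNr, long random64) {
        if (timestampMillis < 0 || timestampMillis >= (1L << 48)) {
            throw new IllegalArgumentException("timestamp must fit into 48 bits");
        }
        if (sequenceNr < 0 || sequenceNr > 0xFFFF) {
            throw new IllegalArgumentException("sequence number must fit into 16 bits");
        }
        // 128-bit value: [48-bit timestamp][16-bit sequence number][64 random bits]
        BigInteger value = BigInteger.valueOf(timestampMillis).shiftLeft(80)
                .or(BigInteger.valueOf(sequenceNr).shiftLeft(64))
                .or(BigInteger.valueOf(random64).and(MASK_64));

        // 26 characters of 5 bits each, most significant group first.
        char[] out = new char[26];
        for (int i = 0; i < 26; i++) {
            int shift = 125 - 5 * i;
            out[i] = CROCKFORD[value.shiftRight(shift).intValue() & 0x1F];
        }
        return new String(out);
    }
}
```

Because the sequence number occupies the bits directly after the timestamp, plain string comparison orders two IDs from the same millisecond by sequence number, whatever the remaining random bits are.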

CAVEATS

This is a best-effort approach to a complicated problem, not a silver bullet.
Here are the reasons why the gl2_message_id sort order might not always be correct:

* The sequence number is generated per node and input. Sorting will therefore not work if an input is load balanced over multiple nodes.
* There is only space for 60535 messages with the same timestamp and input (65535 minus the OFFSET_GAP of 5000).
* There is a small chance that the sort order is wrong if too many batches of messages with the same timestamp and input get processed in parallel.
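
The first two caveats can be illustrated with a small sketch. Everything here is hypothetical: the class `PerInputSequence`, the map keyed by node and input, and the constant names are assumptions for the example and do not mirror the PR's code. Wrap-around handling is left out on purpose. The value 5000 for the gap is taken from the commit messages of this PR.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch only: independent counters per (node, input) pair. */
final class PerInputSequence {
    static final int MAX_16_BIT = 0xFFFF;   // 65535, largest value that fits into 16 bits
    static final int OFFSET_GAP = 5000;     // reserved gap, value taken from the commit messages
    static final int ORDERABLE_PER_MILLISECOND = MAX_16_BIT - OFFSET_GAP; // 60535

    private final Map<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    /** Returns the next sequence number for the given node and input (no wrap handling). */
    int next(String nodeId, String inputId) {
        return counters
                .computeIfAbsent(nodeId + "/" + inputId, key -> new AtomicInteger())
                .getAndIncrement();
    }
}
```

Two nodes running the same input each start counting at 0, so messages that arrive on different nodes within the same millisecond can carry equal or unrelated sequence numbers. That is why the ordering guarantee does not hold for load-balanced inputs.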

Performance Impact

The benchmark ingests 8 million messages.
Four parallel curl loops send batches of 5000 messages with identical timestamps.

Without this change: 3m 19s
With this change: 3m 25s

That is approx. 40k msg/sec in both runs, a difference of about 3%, so there is no significant performance impact.
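
The throughput figure follows directly from the two run times. A small sketch of the arithmetic (the class and method names are made up for this example):

```java
/** Sketch only: reproduces the msg/sec arithmetic from the benchmark numbers above. */
final class BenchmarkThroughput {
    /** Whole messages per second for a run of the given duration. */
    static long messagesPerSecond(long messages, int minutes, int seconds) {
        return messages / (minutes * 60L + seconds);
    }

    public static void main(String[] args) {
        System.out.println(messagesPerSecond(8_000_000L, 3, 19)); // without the change: 40201
        System.out.println(messagesPerSecond(8_000_000L, 3, 25)); // with the change: 39024
    }
}
```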

Fixes #2741

mpfz0r · 2019-11-21T13:18:30Z

/rebase

bernd

A few comments:

graylog2-server/src/main/java/org/graylog2/shared/buffers/processors/MessageULIDGenerator.java

graylog2-server/src/main/resources/org/graylog2/plugin/journal/raw_message.proto

graylog2-server/src/main/java/org/graylog2/shared/buffers/processors/MessageULIDGenerator.java

graylog2-server/src/main/java/org/graylog2/plugin/inputs/MessageInput.java

mpfz0r

Thanks for the review. I've updated the PR

bernd · 2020-01-27T11:18:59Z

Checking for an existing "gl2_message_id" and using the message timestamp is done in #7290.

bernd · 2022-10-12T16:35:55Z

@mpfz0r Does this implementation still work when a non-local message journal implementation (e.g., Kafka) is used? This just crossed my mind.

mpfz0r · 2022-10-12T16:57:57Z

> @mpfz0r Does this implementation still work when a non-local message journal implementation (e.g., Kafka) is used? This just crossed my mind.

It does. I had the same thought at first, but only my first iteration depended on the journal 😅

graylog2-server/src/main/java/org/graylog2/shared/buffers/processors/MessageULIDGenerator.java

patrickmann

LGTM
Tested with syslog input - sorting by gl2_message_id works as expected

thll

Very nice improvement! 👍

graylog2-server/src/main/java/org/graylog2/shared/buffers/processors/MessageULIDGenerator.java

...ensearch2/src/main/java/org/graylog/storage/opensearch2/views/searchtypes/OSMessageList.java

No time to review it again.

* Make messages with identical timestamps sortable by ULID
  The ULID format is composed of a 48 bit timestamp followed by 80 bits of randomness. Use the first 16 bits of the random field to embed a sequence number for each message. If a batch of messages was received with identical timestamps (the same millisecond), the original receive order is kept by the encoded sequence number which directly follows the timestamp.
* Add license
* Use a MessageInput sequenceNr instead of journalOffset
  Introduce a sequence number on each input which gets incremented and embedded in every received message. This allows sorting to work without depending on a KafkaJournal. Regenerate the protobuf class JournalMessages so it can pass the sequence number from a RawMessage to a Message. Don't overwrite already existing GL2_MESSAGE_IDs. They might already be set if we are receiving messages from a Graylog forwarder.
* final some variables
* Fix review comments
* Use protoc 2.5.0 instead of 3.0.0
* Handle sequenceNr wrap and exceptions
* update license
* better log messages; bump offset gap
* increase cache size
* Cleanup and fix test
* Fix some comments and add changelog
* Bump OFFSET_GAP to 5000 and rename a few constants
  With an OFFSET_GAP of 5000, I couldn't reproduce negative sequence numbers anymore. It's a tradeoff, but ordering only 60535 messages for the same timestamp is reasonable.
* improve java doc
* Always add gl2_message_id as a second sort order, if sorting by timestamp is requested.
* Don't assume that sort list is mutable
* Fix BackendStartupIT test
  Every message should have a gl2_message_id now
* Clarify when gl2_message_id might not be empty
* Add a index mapping for gl2_message_id
  This enables us to sort by gl2_message_id, even if the field does not exist. This might be the case with older indices, restored from archives.
* Fix IndexMappingTest
* fix FieldTypePollerIT
* Fix FieldAliasForEvents IT
  This test should not be using the default index template
* Handle Messages with sequenceNr that are received from Forwarders
* Include the nodeId into the sequenceNr lookup cache
  Multiple forwarders can run the same input, that's why we need to differentiate them in the cache.
* Provide unmapped_type for gl2_message_id sort
  This allows us to sort on older indices that might not have that field yet. https://www.elastic.co/guide/en/elasticsearch/reference/7.17/sort-search-results.html#_ignoring_unmapped_fields
* Only add gl2_message_id sort if not already present
  Also add some tests
* Simply reset the sequenceNrCache in case we exceed the ULID limit
  This is simpler and probably leads to less log messages.
* Improve log message
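
The sort-related commits in this list (add gl2_message_id as a second sort order, don't assume the sort list is mutable, only add it if not already present, provide an unmapped_type) can be sketched together as below. This is a hedged illustration, not the code in OSMessageList: the `Sort` class and the helper method are hypothetical, and both the `"keyword"` unmapped type and reusing the timestamp's sort direction for the tie-breaker are assumptions.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Sketch only: Sort and this helper are hypothetical, not Graylog or OpenSearch classes. */
final class MessageSortSketch {
    static final class Sort {
        final String field;
        final String order;        // "asc" or "desc"
        final String unmappedType; // null when not set

        Sort(String field, String order, String unmappedType) {
            this.field = field;
            this.order = order;
            this.unmappedType = unmappedType;
        }
    }

    /** Appends a gl2_message_id tie-breaker when sorting by timestamp and none is present yet. */
    static List<Sort> withMessageIdTieBreaker(List<Sort> sorts) {
        Sort timestampSort = null;
        for (Sort sort : sorts) {
            if ("gl2_message_id".equals(sort.field)) {
                return sorts; // already present, nothing to add
            }
            if (timestampSort == null && "timestamp".equals(sort.field)) {
                timestampSort = sort;
            }
        }
        if (timestampSort == null) {
            return sorts; // not sorting by timestamp, leave the request untouched
        }
        // Copy instead of mutating: the incoming list may be immutable.
        List<Sort> result = new ArrayList<>(sorts);
        // Assumption: same direction as the timestamp sort, "keyword" as unmapped type
        // so that indices without the field can still be sorted.
        result.add(new Sort("gl2_message_id", timestampSort.order, "keyword"));
        return Collections.unmodifiableList(result);
    }
}
```

With a single timestamp sort as input, the result has the timestamp first and gl2_message_id second. Calling the helper on its own output returns the list unchanged.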

mpfz0r added the "improvement" and "in progress" labels Nov 3, 2019

mpfz0r requested a review from bernd November 5, 2019 10:56

bernd self-assigned this Nov 5, 2019

github-actions bot force-pushed the ulid-sorted-messages branch from 27d71fd to 6d52c07 November 21, 2019 13:19

mpfz0r force-pushed the ulid-sorted-messages branch from 6d52c07 to ae39533 December 10, 2019 10:25

bernd previously requested changes Dec 13, 2019

mpfz0r commented Dec 30, 2019

mpfz0r requested a review from bernd January 2, 2020 11:17

bernd added this to the 3.3.0 milestone Jan 27, 2020

bernd mentioned this pull request Jan 27, 2020

Conditionally set "gl2_message_id" and use message timestamp #7290

Merged

bernd modified the milestones: 3.3.0, 4.0.0 May 12, 2020

HenryTheSir mentioned this pull request Aug 7, 2020

Overcoming log ordering issues with millisecond precision #2741

Closed

bernd removed this from the 4.0.0 milestone Jan 27, 2022

mpfz0r force-pushed the ulid-sorted-messages branch from 7b3e148 to 751dec0 October 12, 2022 16:14

boosty assigned mpfz0r and patrickmann Oct 14, 2022

patrickmann reviewed Nov 7, 2022

graylog2-server/src/main/java/org/graylog2/shared/buffers/processors/MessageULIDGenerator.java

patrickmann approved these changes Nov 7, 2022

mpfz0r added 6 commits November 8, 2022 17:27

Add license

2bce197

final some variables

bf47d17

Fix review comments

ca73079

Use protoc 2.5.0 instead of 3.0.0

8c4f0fe

mpfz0r added 8 commits November 17, 2022 10:24

Don't assume that sort list is mutable

269e65f

Fix BackendStartupIT test

0650e75

Every message should have a gl2_message_id now

Clarify when gl2_message_id might not be empty

6a79f0e

Merge remote-tracking branch 'origin/master' into ulid-sorted-messages

80db1cd

Add a index mapping for gl2_message_id

6b98027

This enables us to sort by gl2_message_id, even if the field does not exist. This might be the case with older indices, restored from archives.

Fix IndexMappingTest

7b1d1de

fix FieldTypePollerIT

8005a4e

Fix FieldAliasForEvents IT

5797e4e

This test should not be using the default index template

mpfz0r mentioned this pull request Nov 22, 2022

Widget error: No mapping found for gl2_message_id in order to sort on #13875

Closed

mpfz0r requested a review from thll November 22, 2022 11:23

mpfz0r added 4 commits November 22, 2022 17:54

Handle Messages with sequenceNr that are received from Forwarders

30ff22d

Include the nodeId into the sequenceNr lookup cache

8722491

Multiple forwarders can run the same input, that's why we need to differentiate them in the cache.

Provide unmapped_type for gl2_message_id sort

9641d2e

This allows us to sort on older indices that might not have that field yet. https://www.elastic.co/guide/en/elasticsearch/reference/7.17/sort-search-results.html#_ignoring_unmapped_fields

Merge remote-tracking branch 'origin/master' into ulid-sorted-messages

767d2d5

mpfz0r added the "inputs" label and removed the "in progress" label Nov 25, 2022

thll requested changes Nov 29, 2022

graylog2-server/src/main/java/org/graylog2/shared/buffers/processors/MessageULIDGenerator.java

...ensearch2/src/main/java/org/graylog/storage/opensearch2/views/searchtypes/OSMessageList.java

mpfz0r added 4 commits November 29, 2022 18:18

Only add gl2_message_id sort if not already present

485ca5b

Also add some tests

Simply reset the sequenceNrCache in case we exceed the ULID limit

2246363

This is simpler and probably leads to less log messages.

Improve log message

ba6c3d9

Merge remote-tracking branch 'origin/master' into ulid-sorted-messages

ffbaf49

mpfz0r requested a review from thll November 29, 2022 17:54

thll approved these changes Nov 30, 2022

mpfz0r merged commit 18678f1 into master Nov 30, 2022

mpfz0r deleted the ulid-sorted-messages branch November 30, 2022 13:47

boosty mentioned this pull request Jan 2, 2023

sort messages by more than one fields #5693

Open

martin0258 mentioned this pull request Apr 17, 2023

ULID gl2_message_id field did not include the sequence number? #15237

Closed

mpfz0r mentioned this pull request Feb 29, 2024

Default message sorting in search results is not transparent in 5.1 and beyond #18348

Closed
