Improve efficiency of DatastoreIndexingRouter#source_event_versions_in_index: leverage routing/timestamp #1077

@myronmarston

Description

This method searches all shards and indices:

# Note: we intentionally search the entire index expression, not just an individual index based on a rollover timestamp.
# And we intentionally do NOT provide a routing value--we want to find the version, no matter what shard the document
# lives on.
#
# Since this `source_event_versions_in_index` is for handling malformed events, it's possible that the
# rollover timestamp or routing value on the operation is wrong and that the correct document lives in
# a different shard and index than what the operation is targeted at. We want to search across all of them
# so that we will find it, regardless of where it lives.

This is quite inefficient, particularly when you have many malformed events--it can cause significant load problems on a cluster.

The rationale given in that comment is speculative and I suspect we've never actually had malformed documents with the wrong routing/index rollover timestamp. We should make it more efficient.
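
As a rough illustration of what "leverage routing/timestamp" could look like, here is a minimal sketch that builds the header for one entry of an `_msearch` request. Everything in it is an assumption made for illustration: `Operation`, `version_search_header`, and the `targeted:` flag are hypothetical names, not the actual ElasticGraph implementation. It only relies on the fact that an `_msearch` header line accepts `index` and `routing` keys.

```ruby
# Hypothetical sketch: these names are illustrative and do not come from the codebase.
# `index_expression` covers every index for the type; `rollover_index` is the single
# concrete index derived from the rollover timestamp (nil when there is none);
# `routing` is the routing value on the operation (nil when there is none).
Operation = Struct.new(:doc_id, :index_expression, :rollover_index, :routing, keyword_init: true)

# Builds the `_msearch` header for looking up one document's version.
def version_search_header(op, targeted: true)
  # Broad search: every index in the expression, every shard (the current behavior).
  header = {index: op.index_expression}
  return header unless targeted

  # Narrow to a single concrete index when the rollover timestamp resolved to one.
  header[:index] = op.rollover_index if op.rollover_index
  # Narrow to a single shard when a routing value is available.
  header[:routing] = op.routing if op.routing
  header
end

op = Operation.new(
  doc_id: "w1",
  index_expression: "widgets_rollover__*",
  rollover_index: "widgets_rollover__2024",
  routing: "abc"
)

targeted_header = version_search_header(op)
broad_header = version_search_header(op, targeted: false)
```

If the concern in the quoted comment ever turns out to be real, one option would be to issue the targeted search first and fall back to the broad one (`targeted: false`) only when the document is not found.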
