Expand aliases in ElasticsearchIO.Read #19441

@kennknowles

Description

I am building a pipeline that needs to process all records in a set of indexes, each suffixed with a timestamp. I have an alias that matches all of these indexes at once. However, I cannot use the alias name in ElasticsearchIO, because it tries to read stats for that specific index name. Because the name is an alias and not an actual index, the response contains no entry for the alias name itself, so Beam (Dataflow?) estimates the size as 0. The pipeline then ends without ever executing the query on the alias, even though that query would have returned many documents.

This should be easy to fix: the result of /<aliasname>/_stats contains only the indexes referenced by that alias. So instead of looking for a key <aliasname> under the indices key in the returned JSON, it should consider all returned indexes and add their estimated sizes together.
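
The proposed change can be illustrated with a minimal sketch. It is written in Python for brevity and is not the actual ElasticsearchIO code. The function names, the alias name `logs`, the index names, and the sample sizes are all hypothetical, and the choice of `primaries.store.size_in_bytes` as the size field is an assumption about which stat the estimate is based on.

```python
import json


def size_by_name(stats: dict, name: str) -> int:
    """Lookup as described in the report: expects a key equal to the requested name."""
    index_stats = stats.get("indices", {}).get(name, {})
    return index_stats.get("primaries", {}).get("store", {}).get("size_in_bytes", 0)


def size_of_all_indices(stats: dict) -> int:
    """Proposed lookup: sum the sizes of every index returned by the stats call."""
    return sum(
        index_stats.get("primaries", {}).get("store", {}).get("size_in_bytes", 0)
        for index_stats in stats.get("indices", {}).values()
    )


# Hypothetical response body for GET /logs/_stats, where "logs" is an alias
# covering two timestamp-suffixed indexes. Only the concrete indexes appear
# under "indices"; the alias name itself is not a key.
sample_response = json.dumps({
    "indices": {
        "logs-2019.03.01": {"primaries": {"store": {"size_in_bytes": 1500}}},
        "logs-2019.03.02": {"primaries": {"store": {"size_in_bytes": 2000}}},
    }
})

stats = json.loads(sample_response)
by_name = size_by_name(stats, "logs")      # no "logs" key, so the estimate is 0
summed = size_of_all_indices(stats)        # adds up both backing indexes
print(by_name, summed)
```

Summing over every returned entry also gives the same result as before when the name is a real index, since the response then contains exactly that one index.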

Imported from Jira BEAM-6920. Original Jira may contain additional context.
Reported by: MadEgg.
