
feat(providers/amazon): add BedrockRerankOperator for document reranking #67787

Open
gauravSsinha wants to merge 1 commit into apache:main from substrai:feat/bedrock-rerank-operator

Conversation

@gauravSsinha

Description

Adds BedrockRerankOperator that uses the Bedrock Agent Runtime Rerank API to score and reorder documents by relevance to a query.

Motivation

Reranking is a critical step in production RAG pipelines — it improves answer quality by reordering retrieved documents before passing them to a generative model. Amazon Bedrock now supports reranking via the Cohere Rerank model, but there's no Airflow operator for it.

Changes

Added BedrockRerankOperator to providers/amazon/src/airflow/providers/amazon/aws/operators/bedrock.py:

  • Uses BedrockAgentRuntimeHook (bedrock-agent-runtime client)
  • Configurable model_id (defaults to cohere.rerank-v3-5:0)
  • number_of_results parameter to limit returned documents
  • All key parameters are template-able
  • Returns ranked results with relevance scores
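
For orientation, here is a minimal sketch of the request shape that the boto3 `bedrock-agent-runtime` client's `rerank` call takes, as far as I understand that API. This is not the code in this PR: the helper name `build_rerank_request` is hypothetical, the model ARN is only an illustrative value, and how the operator turns `model_id` into an ARN is not shown in this thread.

```python
from typing import Any, Dict, List, Optional


def build_rerank_request(
    query: str,
    documents: List[Dict[str, Any]],
    model_arn: str,
    number_of_results: Optional[int] = None,
) -> Dict[str, Any]:
    """Build keyword arguments for a bedrock-agent-runtime ``rerank`` call.

    Hypothetical helper for illustration; ``documents`` uses the same
    ``{'textDocument': {'text': ...}}`` shape as the PR's usage example.
    """
    bedrock_config: Dict[str, Any] = {
        "modelConfiguration": {"modelArn": model_arn},
    }
    # Only send numberOfResults when the caller asked for a limit.
    if number_of_results is not None:
        bedrock_config["numberOfResults"] = number_of_results

    return {
        "queries": [{"type": "TEXT", "textQuery": {"text": query}}],
        "sources": [
            # Each document is wrapped as an inline text source.
            {"type": "INLINE", "inlineDocumentSource": {"type": "TEXT", **doc}}
            for doc in documents
        ],
        "rerankingConfiguration": {
            "type": "BEDROCK_RERANKING_MODEL",
            "bedrockRerankingConfiguration": bedrock_config,
        },
    }


request = build_rerank_request(
    query="What is serverless computing?",
    documents=[{"textDocument": {"text": "AWS Lambda runs code without servers."}}],
    # Illustrative ARN only; region and model availability are assumptions.
    model_arn="arn:aws:bedrock:us-west-2::foundation-model/cohere.rerank-v3-5:0",
    number_of_results=1,
)
```

With AWS credentials configured, the resulting dictionary would be passed as `boto3.client("bedrock-agent-runtime").rerank(**request)`. Keeping the request construction separate from the network call makes it easy to unit test without touching AWS.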

Example Usage

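from airflow.providers.amazon.aws.operators.bedrock import BedrockRerankOperator

# Placeholder input: texts returned by an upstream retrieval step.
retrieved_docs = ['Serverless computing runs code without managing servers.']

rerank = BedrockRerankOperator(
    task_id='rerank_results',
    query='What is serverless computing?',
    documents=[{'textDocument': {'text': doc}} for doc in retrieved_docs],
    model_id='cohere.rerank-v3-5:0',
    number_of_results=5,
)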
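
A downstream task will usually need to map the ranked results back to the original texts. The sketch below assumes the operator returns the `results` list in the same shape as the Rerank API response (items carrying `index` and `relevanceScore`); the PR description does not state the exact return format, so treat that shape, the helper name `attach_scores`, and the sample values as assumptions.

```python
from typing import Any, Dict, List, Tuple


def attach_scores(
    retrieved_docs: List[str],
    results: List[Dict[str, Any]],
) -> List[Tuple[float, str]]:
    """Pair each ranked result with its original text, highest score first."""
    ranked = sorted(results, key=lambda r: r["relevanceScore"], reverse=True)
    # ``index`` is assumed to point back into the list of submitted documents.
    return [(r["relevanceScore"], retrieved_docs[r["index"]]) for r in ranked]


docs = ["Lambda runs code without servers.", "EC2 offers virtual machines."]
# Hand-written sample in the assumed shape, not real model output.
sample_results = [
    {"index": 1, "relevanceScore": 0.12},
    {"index": 0, "relevanceScore": 0.91},
]
top = attach_scores(docs, sample_results)
```

In a Dag, `sample_results` would instead come from the rerank task's XCom, and only the highest-scoring texts would be forwarded to the generative model.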

Related


Was generative AI tooling used to co-author this PR?
  • Yes
    Generated-by: Kiro (AI IDE)

gauravSsinha requested a review from o-nikolas as a code owner May 30, 2026 21:43
boring-cyborg Bot added the area:providers and provider:amazon (AWS/Amazon - related issues) labels May 30, 2026
@boring-cyborg

boring-cyborg Bot commented May 30, 2026

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contributors' Guide.
Here are some useful points:

  • Pay attention to the quality of your code (ruff, mypy and type annotations). Our prek-hooks will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide. Consider adding an example Dag that shows how users should use it.
  • Consider using the Breeze environment for testing locally. It is a heavy Docker setup, but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Please follow ASF Code of Conduct for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
  • Be sure to read the Airflow Coding style.
  • Always keep your Pull Requests rebased, otherwise your build might fail due to changes not related to your commits.
Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

Add BedrockRerankOperator that uses the Bedrock Agent Runtime Rerank
API to score and reorder documents by relevance to a query. This is
useful for improving RAG pipeline quality by filtering and prioritizing
retrieved results before passing them to a generative model.

Features:
- Configurable model_id (defaults to cohere.rerank-v3-5:0)
- number_of_results parameter to limit returned documents
- Template-able query, documents, and model_id
- Returns ranked results with relevance scores

Example usage in a DAG:
    rerank = BedrockRerankOperator(
        task_id='rerank_results',
        query='What is serverless computing?',
        documents=[{'textDocument': {'text': doc}} for doc in retrieved_docs],
        model_id='cohere.rerank-v3-5:0',
        number_of_results=5,
    )

Signed-off-by: Gaurav Kumar Sinha <gaurav@substrai.dev>
gauravSsinha force-pushed the feat/bedrock-rerank-operator branch from afb2dcb to 05fb657 May 30, 2026 21:47
return job_arn


class BedrockRerankOperator(AwsBaseOperator[BedrockAgentRuntimeHook]):
Contributor

Please create unit tests for this operator.

@Srabasti
Contributor

Srabasti commented Jun 2, 2026

Congratulations on your first PR! Static checks are failing, @gauravSsinha.
Please run prek locally to make sure there are no errors, then rebase and push your branch to the repo.

@potiuk
Member

potiuk commented Jun 4, 2026

@gauravSsinha A few things need addressing before review — see our Pull Request quality criteria.

  • Pre-commit / static checks. See docs.
  • Other failing CI checks. See docs.

No rush.


Note: This comment was drafted by an AI-assisted triage tool and may contain mistakes. Once you have addressed the points above, an Apache Airflow maintainer — a real person — will take the next look at your PR. We use this two-stage triage process so that our maintainers' limited time is spent where it matters most: the conversation with you.


Labels

area:providers provider:amazon AWS/Amazon - related issues

Projects

None yet


4 participants