feat(providers/amazon): add BedrockRerankOperator for document reranking#67787
feat(providers/amazon): add BedrockRerankOperator for document reranking#67787gauravSsinha wants to merge 1 commit into
Conversation
|
Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about any anything please check our Contributors' Guide
|
Add BedrockRerankOperator that uses the Bedrock Agent Runtime Rerank
API to score and reorder documents by relevance to a query. This is
useful for improving RAG pipeline quality by filtering and prioritizing
retrieved results before passing them to a generative model.
Features:
- Configurable model_id (defaults to cohere.rerank-v3-5:0)
- number_of_results parameter to limit returned documents
- Template-able query, documents, and model_id
- Returns ranked results with relevance scores
Example usage in a DAG:
rerank = BedrockRerankOperator(
task_id='rerank_results',
query='What is serverless computing?',
documents=[{'textDocument': {'text': doc}} for doc in retrieved_docs],
model_id='cohere.rerank-v3-5:0',
number_of_results=5,
)
Signed-off-by: Gaurav Kumar Sinha <gaurav@substrai.dev>
afb2dcb to
05fb657
Compare
| return job_arn | ||
|
|
||
|
|
||
| class BedrockRerankOperator(AwsBaseOperator[BedrockAgentRuntimeHook]): |
There was a problem hiding this comment.
Please create unit tests associated to this operator
|
Congratulations on your first PR! Static checks are failing @gauravSsinha |
|
@gauravSsinha A few things need addressing before review — see our Pull Request quality criteria. No rush. Note: This comment was drafted by an AI-assisted triage tool and may contain mistakes. Once you have addressed the points above, an Apache Airflow maintainer — a real person — will take the next look at your PR. We use this two-stage triage process so that our maintainers' limited time is spent where it matters most: the conversation with you. |
Description
Adds
BedrockRerankOperatorthat uses the Bedrock Agent RuntimeRerankAPI to score and reorder documents by relevance to a query.Motivation
Reranking is a critical step in production RAG pipelines — it improves answer quality by reordering retrieved documents before passing them to a generative model. Amazon Bedrock now supports reranking via the Cohere Rerank model, but there's no Airflow operator for it.
Changes
Added
BedrockRerankOperatortoproviders/amazon/src/airflow/providers/amazon/aws/operators/bedrock.py:BedrockAgentRuntimeHook(bedrock-agent-runtime client)model_id(defaults tocohere.rerank-v3-5:0)number_of_resultsparameter to limit returned documentsExample Usage
Related
BedrockRaGOperator,BedrockRetrieveOperatorWas generative AI tooling used to co-author this PR?
Generated-by: Kiro (AI IDE)