feat: add bedrock-titan-embedding skill for AWS Bedrock Titan V2 by MarouaneBenabdelkader · Pull Request #69 · AmadeusITGroup/docs2vecs

MarouaneBenabdelkader · 2026-04-27T10:45:58Z

Adds a new embedding skill that generates vectors via AWS Bedrock's Titan Embed Text v2 model. AWS credentials resolve through boto3's default provider chain (env vars, AWS_PROFILE, IAM role, etc.) so the skill works from laptops, CI, ECS, EC2 and Lambda without code changes and without putting secrets in YAML.

Params (all optional, sensible defaults): region, model_id, dimensions (256/512/1024), normalize, max_retries, retry_backoff. The skill follows the existing embedding-skill pattern: lazy-init client in __init__, self._config.get(...) for params, per-chunk iteration that mutates chunk.embedding in place, empty-content short-circuit, and self.logger for info/debug/warning output.
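As a rough illustration of the pattern described above, the sketch below resolves the optional params with fallbacks and walks the chunks, skipping empty content and writing the embedding back in place. The `Chunk` class, the `resolve_params` and `embed_chunks` helpers, and every default value except the allowed dimensions are assumptions made for this example; the skill's actual defaults and class layout are not shown in this PR page.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

ALLOWED_DIMENSIONS = (256, 512, 1024)  # sizes listed in the PR description


@dataclass
class Chunk:
    """Hypothetical stand-in for the project's chunk type."""
    content: str
    embedding: Optional[List[float]] = None


def resolve_params(config: dict) -> dict:
    """Read optional params with fallbacks. The default values are assumptions."""
    params = {
        "region": config.get("region"),  # None leaves region selection to boto3
        "model_id": config.get("model_id", "amazon.titan-embed-text-v2:0"),
        "dimensions": config.get("dimensions", 1024),
        "normalize": config.get("normalize", True),
        "max_retries": config.get("max_retries", 3),
        "retry_backoff": config.get("retry_backoff", 1.0),
    }
    if params["dimensions"] not in ALLOWED_DIMENSIONS:
        raise ValueError(
            f"dimensions must be one of {ALLOWED_DIMENSIONS}, got {params['dimensions']}"
        )
    return params


def embed_chunks(chunks: List[Chunk], embed_fn: Callable[[str], List[float]]) -> List[Chunk]:
    """Mutate each chunk's embedding in place; empty content is skipped."""
    for chunk in chunks:
        if not chunk.content or not chunk.content.strip():
            continue  # empty-content short-circuit: no API call, embedding stays None
        chunk.embedding = embed_fn(chunk.content)
    return chunks
```

Passing the embedding function in as `embed_fn` keeps the loop testable without a live Bedrock client.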

Verified end-to-end against live Bedrock: 100 Confluence chunks embedded to 1024-dim unit-normalized vectors, and a Develocity docx split into 7 chunks embedded cleanly across dimensions 256/512/1024.
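A check along the following lines could confirm the "unit-normalized" claim for a returned vector: the vector should have the requested length and an L2 norm close to 1. The helper names and the tolerance are assumptions for illustration, not part of the PR.

```python
import math
from typing import Sequence


def l2_norm(vector: Sequence[float]) -> float:
    """Euclidean length of the vector."""
    return math.sqrt(sum(x * x for x in vector))


def is_unit_normalized(vector: Sequence[float], expected_dim: int, tol: float = 1e-3) -> bool:
    """True when the vector has the expected size and its L2 norm is within tol of 1."""
    return len(vector) == expected_dim and math.isclose(l2_norm(vector), 1.0, abs_tol=tol)
```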


Copilot

Pull request overview

Adds an AWS Bedrock embedding implementation (Titan Embed Text v2) to the indexer skill system so pipelines can generate embeddings via Bedrock using standard boto3 credential resolution.

Changes:

- Introduces BedrockTitanEmbeddingSkill to call bedrock-runtime.invoke_model and write embeddings back onto chunks.
- Registers the new skill in the skill factory / exports it from the skills package.
- Updates config schema, documentation, and adds boto3 as a dependency.
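For readers unfamiliar with the call involved, here is a minimal sketch of a single Titan Embed Text v2 request through `invoke_model`. It is not the PR's implementation: the `embed_text` helper is hypothetical, and the client is passed in as an argument so the function can be exercised with a stub. With boto3 installed, the client would typically be created with `boto3.client("bedrock-runtime")`, which resolves credentials through the default provider chain.

```python
import json
from typing import Any, List


def embed_text(
    client: Any,
    text: str,
    model_id: str = "amazon.titan-embed-text-v2:0",
    dimensions: int = 1024,
    normalize: bool = True,
) -> List[float]:
    """Send one text to Titan Embed Text v2 and return its embedding vector."""
    body = json.dumps(
        {"inputText": text, "dimensions": dimensions, "normalize": normalize}
    )
    response = client.invoke_model(
        modelId=model_id,
        body=body,
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream; read it fully before parsing the JSON payload.
    payload = json.loads(response["body"].read())
    return payload["embedding"]
```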

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| `src/docs2vecs/subcommands/indexer/skills/factory.py` | Registers the new `bedrock-titan-embedding` skill name and maps it to the implementation class. |
| `src/docs2vecs/subcommands/indexer/skills/bedrock_titan_embedding_skill.py` | Implements the Bedrock Titan v2 embedding skill with retries and chunk mutation. |
| `src/docs2vecs/subcommands/indexer/skills/__init__.py` | Exposes `BedrockTitanEmbeddingSkill` via package exports. |
| `src/docs2vecs/subcommands/indexer/config/config_schema.yaml` | Adds YAML schema entries for Bedrock skill parameters. |
| `pyproject.toml` | Adds `boto3` dependency required by the new skill. |
| `docs/readme/indexer-skills.md` | Documents how to configure and use the new Bedrock Titan embedding skill. |


Copilot · 2026-04-27T10:51:03Z

+            except Exception as exc:
+                if attempt == self._max_retries - 1:
+                    raise
+                wait = self._retry_backoff * (attempt + 1)
+                self.logger.warning(
+                    f"Bedrock call failed (attempt {attempt + 1}/{self._max_retries}): {exc} - retrying in {wait}s"
+                )
+                time.sleep(wait)


The retry loop retries on any Exception (including JSON parsing/KeyError, config errors like missing region/credentials, and programmer errors), which can waste time and hide the real failure mode. Consider catching botocore exceptions (e.g., ClientError/BotoCoreError) and only retrying transient failures (throttling, timeouts, 5xx), while surfacing non-retryable errors immediately with a clearer message.
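One way the suggestion could look is sketched below, keeping the linear backoff from the snippet under review. To stay free of a botocore import, the example duck-types on the `response` dict that botocore's `ClientError` carries; real code would more likely catch `ClientError` explicitly and also handle `BotoCoreError` subclasses such as connection and read timeouts, which this sketch does not cover. The helper names and the set of retryable error codes are assumptions, not the change that was merged.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumed set of transient error codes; adjust to the service's documented errors.
RETRYABLE_CODES = {
    "ThrottlingException",
    "ServiceUnavailableException",
    "InternalServerException",
    "ModelTimeoutException",
}


def is_retryable(exc: Exception) -> bool:
    """Treat throttling codes and 5xx responses as transient; everything else is fatal."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False  # KeyError, JSON errors, config errors: surface immediately
    code = response.get("Error", {}).get("Code", "")
    status = response.get("ResponseMetadata", {}).get("HTTPStatusCode", 0)
    return code in RETRYABLE_CODES or status >= 500


def call_with_retry(
    fn: Callable[[], T],
    max_retries: int = 3,
    retry_backoff: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying only transient failures with linear backoff."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as exc:
            if not is_retryable(exc) or attempt == max_retries - 1:
                raise
            sleep(retry_backoff * (attempt + 1))
    raise RuntimeError("unreachable")  # the loop always returns or raises
```

Injecting `sleep` lets tests run the retry path without actually waiting.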


Copilot AI review requested due to automatic review settings April 27, 2026 10:45

Copilot started reviewing on behalf of MarouaneBenabdelkader April 27, 2026 10:46

Copilot AI reviewed Apr 27, 2026


Apply suggestion from @Copilot

0a300f8

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

dpomian approved these changes Apr 28, 2026


dpomian merged commit d988286 into AmadeusITGroup:main Apr 28, 2026
9 checks passed

dpomian merged 2 commits into AmadeusITGroup:main from MarouaneBenabdelkader:feat/bedrock-titan-embedding


3 participants
