feat: add bedrock-titan-embedding skill for AWS Bedrock Titan V2 #69
Adds a new embedding skill that generates vectors via AWS Bedrock's Titan Embed Text v2 model. AWS credentials resolve through boto3's default provider chain (env vars, AWS_PROFILE, IAM role, etc.), so the skill works from laptops, CI, ECS, EC2 and Lambda without code changes and without putting secrets in YAML.
Params (all optional, sensible defaults): region, model_id, dimensions (256/512/1024), normalize, max_retries, retry_backoff. The skill follows the existing embedding-skill pattern: lazy-init client in __init__, self._config.get(...) for params, per-chunk iteration that mutates chunk.embedding in place, empty-content short-circuit, and self.logger for info/debug/warning output.
Verified end-to-end against live Bedrock: 100 Confluence chunks embedded to 1024-dim unit-normalized vectors, and a Develocity docx split into 7 chunks embedded cleanly across dimensions 256/512/1024.
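For readers unfamiliar with the Titan v2 request shape, here is a minimal sketch of the per-chunk pattern described above. It is not the code in this PR: the `Chunk` dataclass and the `embed_text` / `embed_chunks` helpers are hypothetical names for illustration, and the client is passed in so the sketch runs without AWS access. The `inputText`, `dimensions` and `normalize` request fields and the `embedding` response field follow the Titan Embed Text v2 documentation.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Optional

# Bedrock model ID for Titan Embed Text v2.
DEFAULT_MODEL_ID = "amazon.titan-embed-text-v2:0"
ALLOWED_DIMENSIONS = (256, 512, 1024)


@dataclass
class Chunk:
    """Hypothetical stand-in for the indexer's chunk type."""
    content: str
    embedding: Optional[List[float]] = None


def embed_text(client: Any, text: str, dimensions: int = 1024,
               normalize: bool = True,
               model_id: str = DEFAULT_MODEL_ID) -> List[float]:
    """Call invoke_model once and return the embedding vector."""
    if dimensions not in ALLOWED_DIMENSIONS:
        raise ValueError(f"dimensions must be one of {ALLOWED_DIMENSIONS}")
    body = json.dumps(
        {"inputText": text, "dimensions": dimensions, "normalize": normalize}
    )
    response = client.invoke_model(
        modelId=model_id,
        body=body,
        contentType="application/json",
        accept="application/json",
    )
    # The response body is a stream; read it and parse the JSON payload.
    payload = json.loads(response["body"].read())
    return payload["embedding"]


def embed_chunks(client: Any, chunks: List[Chunk], **params: Any) -> List[Chunk]:
    """Mutate each chunk's embedding in place, skipping empty content."""
    for chunk in chunks:
        if not chunk.content or not chunk.content.strip():
            continue  # empty-content short-circuit: no Bedrock call
        chunk.embedding = embed_text(client, chunk.content, **params)
    return chunks
```

With real credentials available, the client would typically come from `boto3.client("bedrock-runtime", region_name=...)`, which picks up credentials from the default provider chain.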
Pull request overview
Adds an AWS Bedrock embedding implementation (Titan Embed Text v2) to the indexer skill system so pipelines can generate embeddings via Bedrock using standard boto3 credential resolution.
Changes:
- Introduces `BedrockTitanEmbeddingSkill` to call `bedrock-runtime.invoke_model` and write embeddings back onto chunks.
- Registers the new skill in the skill factory / exports it from the skills package.
- Updates config schema, documentation, and adds `boto3` as a dependency.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/docs2vecs/subcommands/indexer/skills/factory.py` | Registers the new bedrock-titan-embedding skill name and maps it to the implementation class. |
| `src/docs2vecs/subcommands/indexer/skills/bedrock_titan_embedding_skill.py` | Implements the Bedrock Titan v2 embedding skill with retries and chunk mutation. |
| `src/docs2vecs/subcommands/indexer/skills/__init__.py` | Exposes BedrockTitanEmbeddingSkill via package exports. |
| `src/docs2vecs/subcommands/indexer/config/config_schema.yaml` | Adds YAML schema entries for Bedrock skill parameters. |
| `pyproject.toml` | Adds boto3 dependency required by the new skill. |
| `docs/readme/indexer-skills.md` | Documents how to configure and use the new Bedrock Titan embedding skill. |
    except Exception as exc:
        if attempt == self._max_retries - 1:
            raise
        wait = self._retry_backoff * (attempt + 1)
        self.logger.warning(
            f"Bedrock call failed (attempt {attempt + 1}/{self._max_retries}): {exc} - retrying in {wait}s"
        )
        time.sleep(wait)
The retry loop retries on any Exception (including JSON parsing/KeyError, config errors like missing region/credentials, and programmer errors), which can waste time and hide the real failure mode. Consider catching botocore exceptions (e.g., ClientError/BotoCoreError) and only retrying transient failures (throttling, timeouts, 5xx), while surfacing non-retryable errors immediately with a clearer message.
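One possible shape for that suggestion is sketched below, keeping the PR's linear backoff. It is an illustration, not a proposed patch: `is_transient` and `call_with_retry` are hypothetical names, the set of retryable error codes is an assumption that would need checking against the Bedrock error reference, and the check is duck-typed on the `response` dict that botocore's `ClientError` carries so the sketch runs without botocore installed. Connection-level failures (subclasses of `BotoCoreError` such as timeouts) have no `response` dict and would need an additional check in the real skill.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Assumed set of Bedrock error codes worth retrying (illustrative only).
RETRYABLE_CODES = {
    "ThrottlingException",
    "ServiceUnavailableException",
    "ModelTimeoutException",
    "InternalServerException",
}


def is_transient(exc: Exception) -> bool:
    """Return True for errors shaped like a retryable botocore ClientError."""
    response = getattr(exc, "response", None)
    if not isinstance(response, dict):
        return False  # KeyError, JSON errors, config errors: not retryable
    code = response.get("Error", {}).get("Code", "")
    status = response.get("ResponseMetadata", {}).get("HTTPStatusCode", 0)
    return code in RETRYABLE_CODES or status >= 500


def call_with_retry(fn: Callable[[], T], max_retries: int = 3,
                    retry_backoff: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> T:
    """Run fn, retrying transient failures with linear backoff."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception as exc:
            last_attempt = attempt == max_retries - 1
            if last_attempt or not is_transient(exc):
                raise  # surface non-retryable errors immediately
            sleep(retry_backoff * (attempt + 1))
    raise ValueError("max_retries must be at least 1")
```

The `sleep` parameter exists only so the backoff can be observed in tests without waiting; in the skill it would stay `time.sleep`.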
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>