Remove hidden foundation model initialization for byokg-graph (#183) #184

Merged
acarbonetto merged 9 commits into awslabs:main from mykola-pereyma:fix/remove-hidden-fm-initialization-byokg
Apr 9, 2026

Conversation

@mykola-pereyma
Contributor

Fixes #183

Description

Resolves the issue where ByoKGQueryEngine silently creates a BedrockGenerator with a hardcoded model and region when llm_generator is not provided, forcing a Bedrock dependency on users who only need graph traversal.

Changes

Core fix (byokg_query_engine.py):

  • Removed hidden BedrockGenerator default — llm_generator stays None if not provided
  • AgenticRetriever and KGLinker are now only auto-created when llm_generator is explicitly provided
  • query() returns gracefully when no linker is configured
  • generate_response() raises a clear ValueError when called without llm_generator
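The gating described in this list can be sketched roughly as follows. This is a stand-alone illustration, not the code in byokg_query_engine.py: `SketchQueryEngine`, `StubLinker`, `StubGenerator`, and all of their signatures are hypothetical stand-ins. Only the control flow (no hidden default, conditional linker creation, early return from query(), ValueError from generate_response()) is taken from the description.

```python
from typing import Any, List, Optional


class StubGenerator:
    """Hypothetical stand-in for an LLM generator such as BedrockGenerator."""

    def generate(self, prompt: str) -> str:
        return f"answer for: {prompt}"


class StubLinker:
    """Hypothetical stand-in for KGLinker; it needs a generator to work."""

    def __init__(self, llm_generator: StubGenerator) -> None:
        self.llm_generator = llm_generator

    def link(self, query: str) -> List[str]:
        return [f"entity linked from: {query}"]


class SketchQueryEngine:
    """Illustrative engine that mirrors the described control flow only."""

    def __init__(
        self,
        graph_store: Any,
        llm_generator: Optional[StubGenerator] = None,
        kg_linker: Optional[StubLinker] = None,
    ) -> None:
        self.graph_store = graph_store
        # No hidden default: llm_generator stays None unless the caller passes one.
        self.llm_generator = llm_generator
        # Auto-create the linker only when a generator was explicitly provided.
        if kg_linker is None and llm_generator is not None:
            kg_linker = StubLinker(llm_generator)
        self.kg_linker = kg_linker

    def query(self, query: str) -> List[str]:
        # Early return: without a linker there is no LLM-driven retrieval to run.
        if self.kg_linker is None:
            return []
        return self.kg_linker.link(query)

    def generate_response(self, query: str, context: List[str]) -> str:
        # Fail loudly instead of silently creating a generator.
        if self.llm_generator is None:
            raise ValueError(
                "generate_response() requires an llm_generator; pass one explicitly."
            )
        return self.llm_generator.generate(f"{query} | context: {context}")
```

With this shape, constructing the engine with only a graph store never touches an LLM backend, and the first call that actually needs one reports the missing configuration directly.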

Model ID updates (bedrock_llms.py, notebooks, docs):

  • Replaced all legacy Claude 3.x model IDs with active Claude 4.x equivalents
  • Default model: anthropic.claude-3-7-sonnet-20250219-v1:0 -> anthropic.claude-sonnet-4-20250514-v1:0

Notebooks (5 files):

  • Added explicit llm_generator=llm_generator to all ByoKGQueryEngine constructor calls
  • Added BedrockGenerator setup to the embeddings notebook which previously had none

Documentation (5 files):

  • Updated parameter tables, examples, and descriptions to reflect that llm_generator must be explicitly provided for LLM-powered features
  • Updated supported model lists from legacy to active models

Testing

  • All 9 unit tests pass
  • End-to-end validation against live Bedrock confirmed:
    • Explicit llm_generator path: query + generate works, ground-truth answer retrieved
    • No-LLM path: initialization succeeds, query() returns empty list, generate_response() raises ValueError

Backward compatibility

  • No change for users who already pass llm_generator — behavior is identical
  • Users relying on the hidden default need to add explicit BedrockGenerator initialization

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

ByoKGQueryEngine.__init__() silently created a BedrockGenerator with
hardcoded model/region when llm_generator was not provided, forcing a
Bedrock dependency on users who bring their own KG without needing an LLM.

- Remove hidden BedrockGenerator default creation
- Gate AgenticRetriever and KGLinker auto-creation on llm_generator being provided
- Add early return in query() when kg_linker is None
- Add ValueError in generate_response() when llm_generator is None
- Update test_initialization_with_defaults to reflect new behavior

Add llm_generator=llm_generator to all ByoKGQueryEngine constructor
calls across 5 example notebooks. For the embeddings notebook, also add
BedrockGenerator import and instantiation since it had no LLM setup.

- configuration.md: llm_generator default changed from Auto-created to None,
  triplet_retriever/kg_linker defaults clarified as conditional on llm_generator
- overview.md: clarify LLM must be explicitly configured via BedrockGenerator
- query-engine.md: all examples now include llm_generator, basic init shows
  full BedrockGenerator setup
- faq.md: all ByoKGQueryEngine examples include llm_generator
- multi-strategy-retrieval.md: all examples include llm_generator

Replace legacy anthropic.claude-3-7-sonnet-20250219-v1:0 with active
anthropic.claude-sonnet-4-20250514-v1:0 as the default model_name.

…net 4

Replace us.anthropic.claude-3-7-sonnet-20250219-v1:0 with
us.anthropic.claude-sonnet-4-20250514-v1:0 across all 5 byokg-rag
example notebooks.
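For readers updating their own notebooks, the replacement described in these commits could be scripted along the following lines. This is only a sketch: `LEGACY_TO_ACTIVE` and `migrate_model_id` are hypothetical names, not part of the repository, and the table contains only the one ID pair named in the commits above. Later commits in this PR move the default again, so the table is illustrative rather than current.

```python
from typing import Dict

# Legacy -> active base model IDs, copied from the commit messages above.
# Hypothetical table: extend or replace it with whatever IDs apply to you.
LEGACY_TO_ACTIVE: Dict[str, str] = {
    "anthropic.claude-3-7-sonnet-20250219-v1:0": "anthropic.claude-sonnet-4-20250514-v1:0",
}


def migrate_model_id(model_id: str) -> str:
    """Return the replacement for a legacy model ID, keeping a prefix such as 'us.'."""
    prefix, sep, rest = model_id.partition(".anthropic.")
    if sep:
        # Prefixed form, e.g. "us.anthropic....": split off the prefix first.
        region_prefix = prefix + "."
        base = "anthropic." + rest
    else:
        # Unprefixed form, or an ID this sketch does not recognize.
        region_prefix = ""
        base = model_id
    # Unknown IDs are returned unchanged.
    return region_prefix + LEGACY_TO_ACTIVE.get(base, base)
```

IDs that are not in the table pass through untouched, so the helper can be applied to every model_name in a notebook without special-casing the ones that are already up to date.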

@mykola-pereyma force-pushed the fix/remove-hidden-fm-initialization-byokg branch from 1d8d739 to 58991dd (April 3, 2026 00:33)

Replace all legacy model IDs in configuration.md, faq.md, and
query-engine.md with active equivalents:
- Claude 3.5/3.7 Sonnet -> Claude Sonnet 4
- Claude 3 Opus -> Claude Opus 4.1
- Claude 3 Haiku -> Claude Haiku 4.5

@mykola-pereyma force-pushed the fix/remove-hidden-fm-initialization-byokg branch from 58991dd to 61e036d (April 3, 2026 00:40)

Collaborator

@cmavro left a comment

The logic looks good to me (if the tests and notebooks run successfully, we can merge).

Could we replace Sonnet-4 with the latest Sonnet-4.6:

  • anthropic.claude-sonnet-4-6
  • anthropic.claude-sonnet-4-5-20250929-v1:0 (or Sonnet-4.5)

- Default model: anthropic.claude-sonnet-4-6 (was claude-sonnet-4-20250514)
- Opus: anthropic.claude-opus-4-6-v1 (was claude-opus-4-1-20250805)
- Haiku 4.5 unchanged (still latest)
Add Active and Legacy model sections to configuration.md and faq.md
with EOL dates. Legacy models are explicitly noted as available only to
users who have actively used them in the last 15 days. Link to the
Amazon Bedrock model lifecycle documentation for latest status.

Change BedrockGenerator default region_name from us-west-2 to us-east-1
to be consistent with the majority of the repo (lexical-graph docs and
examples use us-east-1). Updated source, tests, docs, and notebooks.

@acarbonetto merged commit 03cd315 into awslabs:main on Apr 9, 2026
5 checks passed
Development

Successfully merging this pull request may close these issues.

Remove hidden foundation model initialization in ByoKGQueryEngine

4 participants