Skip to content

ReverseMerge: V0.163.4 hotfix#1979

Closed
pk-zipstack wants to merge 3 commits into
mainfrom
v0.163.4-hotfix
Closed

ReverseMerge: V0.163.4 hotfix#1979
pk-zipstack wants to merge 3 commits into
mainfrom
v0.163.4-hotfix

Conversation

@pk-zipstack
Copy link
Copy Markdown
Contributor

What

Back-merge of v0.163.4-hotfix (released as v0.163.5) into main.

Three hotfix commits sit on v0.163.4-hotfix ahead of main:

Net new content this PR carries to main is just the litellm CVE fix.

Why

Per the Hotfix Deployment Guide Step 9, hotfix branches must be back-merged to main after production deployment so the fixes carry forward into future releases.

How

  • Use rebase merge (preferred) or squash merge — not the default merge commit. Per the doc, click the merge dropdown and select the right option.

Expected merge conflicts

Local git merge-tree flagged conflicts in these files:

  • unstract/sdk1/src/unstract/sdk1/adapters/base1.py
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json
  • unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/bedrock.json
  • unstract/sdk1/tests/test_bedrock_adapter.py
  • uv.lock

These conflicts are not from the litellm change — they come from #1944 (IAM Role / Instance Profile) on the hotfix branch overlapping with main's newer #1952 (UN-3152 [FEAT] Add AWS_BEARER_TOKEN_BEDROCK auth to Bedrock LLM and Embedding adapters). Main's version integrates both auth modes (access_keys + iam_role + bearer_token) and supersedes the hotfix-branch version.

Resolution guidance: for the four bedrock files, keep main's version (it already has the IAM Role content from the prior back-merge in #1946 plus the bearer-token addition from #1952). For uv.lock, regenerate after resolving.

Can this PR break any existing features. If yes, please list possible items. If no, please explain why.

No. The only new content reaching main is the litellm 1.82.3 → 1.83.10 bump, which was independently reviewed and merged in #1976, then released to production as v0.163.5.

Database Migrations

  • None

Env Config

  • None

Relevant Docs

Related Issues or PRs

Dependencies Versions

  • litellm: 1.82.3 (Zipstack GitHub fork) → 1.83.10 (PyPI)
  • tiktoken: ~=0.9.0~=0.12.0 (required by litellm 1.83.10's pin)

Notes on Testing

Tested in #1976 and validated in production via the v0.163.5 release. No additional testing required for the back-merge itself — but please verify the bedrock-file conflict resolution preserves main's three-mode auth (access_keys + iam_role + bearer_token).

Screenshots

N/A

Checklist

I have read and understood the Contribution Guidelines.

…1918)

* [FIX] Use importlib.util.find_spec for pluggable worker discovery

_verify_pluggable_worker_exists() previously checked for the literal file
`pluggable_worker/<name>/worker.py` on disk, which breaks when the plugin
has been compiled to a .so (Nuitka, Cython, or any C extension) — the
module is perfectly importable but the pre-check rejects it because only
the .py extension is considered.

Replace the filesystem check with importlib.util.find_spec(), which is
Python's standard way to ask "is this module resolvable by the import
system?". It honors every registered finder — source .py, compiled .so,
bytecode .pyc, namespace packages, zipimports — so the function now
matches what its docstring claims: verifying the module can be loaded,
not that a specific file extension is present.

Behavior is preserved for existing deployments:
- Images with no `pluggable_worker/<name>/` subpackage → find_spec
  raises ModuleNotFoundError (ImportError subclass) → returns False.
- Images with source .py → find_spec resolves the .py → returns True.
- Images with compiled .so → find_spec resolves the .so → returns True.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Handle ValueError from find_spec in pluggable worker verification

Greptile-flagged edge case: importlib.util.find_spec() can raise
ValueError (not just ImportError) when sys.modules has a partially
initialised module entry with __spec__ = None from a prior failed import.
Broaden the except to catch both.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FIX] Resolve api-deployment worker directory from enum import path

worker.py:452 did worker_type.value.replace("-", "_") to derive the
on-disk dir name. All WorkerType enum values already use underscores,
so the replace was a no-op; for API_DEPLOYMENT whose dir is
"api-deployment" (hyphen), it resolved to "api_deployment" and the
os.path.exists() check failed. Boot then logged a spurious
"❌ Worker directory not found: /app/api_deployment" at ERROR level.

The task registration path (builder + celery autodiscover via
to_import_path) is unaffected, so this was purely log noise — but
noise at ERROR level that masks real failures in log scans.

Fix: derive the directory from the authoritative to_import_path()
which already handles the hyphen case (api_deployment -> api-deployment).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…pter (#1944)

* [FEAT] Allow Bedrock to fall through to boto3's default credential chain

Match the S3/MinIO connector pattern: when AWS access keys are left blank
on the Bedrock LLM and embedding adapter forms, drop them from the kwargs
dict so boto3's default credential chain handles authentication. This
unlocks IAM role / instance profile / IRSA / AWS Profile scenarios on
hosts that already have ambient AWS credentials (e.g. EKS workers with
IRSA, EC2 with an instance profile).

- llm1/static/bedrock.json: clarify access-key descriptions to mention
  IRSA and instance profile (already non-required at v0.163.2 base).
- embedding1/static/bedrock.json: drop aws_access_key_id and
  aws_secret_access_key from top-level required; same description fix;
  expose aws_profile_name for parity with the LLM form.
- base1.py: AWSBedrockLLMParameters and AWSBedrockEmbeddingParameters
  now strip empty access-key values from the validated kwargs before
  returning, so empty strings don't override boto3's default chain.
  AWSBedrockEmbeddingParameters fields gain explicit None defaults
  and an aws_profile_name field.

Backward-compatible: existing adapters with access keys filled in
continue to work unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [FEAT] Add Authentication Type selector to Bedrock adapter form

Add an explicit `auth_type` selector with two options, making the auth
choice clear to users:

- "Access Keys" (default): existing flow, keys required
- "IAM Role / Instance Profile (on-prem AWS only)": no fields; relies on
  boto3's default credential chain (IRSA on EKS, task role on ECS,
  instance profile on EC2). Description on the selector explicitly notes
  this option is only for AWS-hosted Unstract deployments.

The form-only auth_type field is stripped before LiteLLM validation in
both AWSBedrockLLMParameters.validate() and AWSBedrockEmbeddingParameters.
validate(). Empty access keys continue to be stripped so boto3 falls
through to the default chain even when the access_keys arm is selected
without values (matches the S3/MinIO connector pattern).

Backward-compatible: legacy adapters without auth_type behave as
"Access Keys" mode (the default), and existing keys are forwarded
unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [REVIEW] Address Bedrock auth_type review feedback

Fixes the P0/P1 issues raised by greptile-apps and jaseemjaskp on
PR #1944.

Behaviour fixes:
- Stale-key leak in IAM Role mode: switching an existing adapter from
  Access Keys to IAM Role would carry truthy stored access keys through
  the strip-empty-only loop, so boto3 silently authenticated with the
  old long-lived credentials instead of falling through to the host's
  IRSA / instance-profile identity. Both LLM and embedding paths were
  affected.
- Silent acceptance of unknown auth_type: a typo (e.g. "access_key") or
  a malformed payload from a non-UI client passed through the dict
  comprehension untouched, with no enum guard.
- Cross-field validation gap: explicit Access Keys mode with blank or
  whitespace-only values silently fell through to the default
  credential chain instead of surfacing the misconfiguration.

Implementation:
- Add a module-level _resolve_bedrock_aws_credentials helper used by
  both AWSBedrockLLMParameters.validate() and AWSBedrock
  EmbeddingParameters.validate(), so the auth-type contract is
  expressed once.
  - Validates auth_type against an allowlist (None | "access_keys" |
    "iam_role"); raises ValueError on anything else.
  - iam_role: unconditionally drops aws_access_key_id and
    aws_secret_access_key.
  - access_keys (explicit): requires non-blank values; raises ValueError
    if either is empty or whitespace-only.
  - Legacy (auth_type absent): retains the lenient strip behaviour so
    pre-PR adapter configurations continue to deserialise unchanged.
- Restore aws_region_name as required (no `= None` default) on
  AWSBedrockEmbeddingParameters; only credentials may legitimately be
  absent.
- Drop the orphan aws_profile_name field from
  embedding1/static/bedrock.json: it was added for parity with the LLM
  form but lives outside the auth_type oneOf and contradicts the
  selector's "no further input" semantics. The LLM form already had
  aws_profile_name pre-PR and is left alone for backwards compatibility.

Tests:
- New tests/test_bedrock_adapter.py covers 15 cases across LLM and
  embedding adapters: legacy-no-auth-type, explicit access_keys with
  valid/blank/whitespace keys, iam_role with stale/no keys, unknown
  auth_type rejection, cross-field validation, and preservation of
  unrelated params (model_id, aws_profile_name, region, thinking).

Skipped (P2 nice-to-have):
- Comment-scope clarification, MinIO reference rewording,
  validate-mutates-caller'\''s-dict, and the LLM form description nit
  about aws_profile_name visibility. These don'\''t change behaviour
  and can be addressed in a follow-up.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…1976)

Hotfix for cloud v0.159.3 (OSS v0.163.4). Customer scanner flagged
litellm 1.82.3 for CVE-2026-42208 (SQL injection in litellm proxy auth
path, affects 1.81.16-1.83.6). We do not use litellm.proxy, but
vulnerability scanners flag the installed package regardless of which
code path is reachable.

Bump to 1.83.10 — the exact version recommended by the upstream advisory
(v1.83.10-stable) and the smallest jump that clears the CVE range while
keeping python-dotenv==1.0.1 compatible (1.83.14 would force bumping
python-dotenv across 7+ pyproject.toml files). Only tiktoken needed to
move 0.9 -> 0.12 to satisfy litellm's pin.

Switch source back to PyPI now that the PyPI quarantine is over,
reversing the temporary fork in #1873.

Cohere embed timeout patch: verified that
litellm/llms/cohere/embed/handler.py is byte-identical between v1.82.3,
v1.83.10-stable, and v1.83.14-stable (the timeout-not-forwarded bug
fixed in #1848 is still present upstream — BerriAI/litellm#14635 remains
OPEN). Version guard bumped 1.82.3 -> 1.83.10; 6/6 patch tests pass on
the new version, confirming the monkey-patch still binds correctly.

Other cleanup from #1873:
- Drop git apt-install from worker-unified and tool Dockerfiles (no
  git-sourced deps remain in any uv.lock)
- Bump tool versions: structure 0.0.100 -> 0.0.101,
  classifier 0.0.79 -> 0.0.80, text_extractor 0.0.75 -> 0.0.76

Note on root uv.lock churn: the v0.163.4 root uv.lock had a pre-existing
corruption (banks v2.4.1 entry pointing at banks-2.2.0 wheel) that
blocked incremental resolution. Regenerated from scratch.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 21, 2026

Summary by CodeRabbit

  • New Features

    • Added flexible AWS Bedrock authentication with support for access_keys and iam_role modes.
  • Chores

    • Updated tool versions: Structure Tool, Classifier, and Text Extractor.
    • Upgraded tiktoken and litellm dependencies.
    • Optimized Docker images by removing unnecessary packages.
    • Enhanced pluggable worker verification using Python's import system.

Walkthrough

This PR includes three independent change sets: tool version increments with Docker system dependency optimization, a comprehensive refactoring of AWS Bedrock authentication to support conditional credential modes (access keys vs. IAM role), and conversion of pluggable worker discovery from filesystem to import-system-based verification. The Bedrock changes include schema updates, validation logic, litellm version alignment, and comprehensive test coverage.

Changes

Tool Version Updates and Docker Optimization

Layer / File(s) Summary
Structure tool version bump to 0.0.101
backend/sample.env, tools/structure/src/config/properties.json
Structure tool Docker image and configuration version updated from 0.0.100 to 0.0.101.
Classifier and text extractor version increments
tools/classifier/src/config/properties.json, tools/text_extractor/src/config/properties.json
Classifier version bumped from 0.0.79 to 0.0.80; text extractor from 0.0.75 to 0.0.76.
Docker system dependency optimization
docker/dockerfiles/worker-unified.Dockerfile, tools/classifier/Dockerfile, tools/structure/Dockerfile, tools/text_extractor/Dockerfile
Git package removed from apt-get installations across all tool and worker Dockerfiles; worker-unified image gains libmagic-dev, libssl-dev, and pkg-config.
Public tools registry version synchronization
unstract/tool-registry/tool_registry_config/public_tools.json
Tool registry metadata synchronized with structure (0.0.101), classifier (0.0.80), and text extractor (0.0.76) versions and matching Docker image tags.

AWS Bedrock Conditional Authentication System

Layer / File(s) Summary
LiteLLM and tiktoken dependency updates
unstract/sdk1/pyproject.toml
Tiktoken upgraded to ~=0.12.0; litellm pinned to ==1.83.10 instead of custom git source.
Bedrock credential resolution and validation logic
unstract/sdk1/src/unstract/sdk1/adapters/base1.py
New _resolve_bedrock_aws_credentials() function enforces auth_type semantics (iam_role removes keys, access_keys requires them, legacy strips blanks); integrated into both LLM and embedding parameter validators to post-process credentials after pydantic validation.
Bedrock LLM and embedding adapter schemas
unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json, unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/bedrock.json
Schemas refactored to replace unconditional AWS credential fields with auth_type-driven conditional validation; access keys required only for access_keys mode, omitted for iam_role mode.
Bedrock adapter parameter validation tests
unstract/sdk1/tests/test_bedrock_adapter.py
Comprehensive test suite covering auth_type modes (legacy/access_keys/iam_role), credential stripping/preservation, error cases (blank keys, unknown auth_type), and non-credential field preservation for both LLM and embedding adapters.
LiteLLM cohere timeout patch version update
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
Version guard and patch documentation updated from 1.82.3 to 1.83.10 to match the pinned litellm dependency.

Worker Discovery System Refactoring

Layer / File(s) Summary
Worker verification via import system
workers/shared/infrastructure/config/builder.py
Worker existence check converted from pathlib.Path filesystem inspection to importlib.util.find_spec and importlib.import_module to detect unresolvable and broken modules.
Worker directory path resolution from import path
workers/worker.py
Pluggable worker directory resolution now derives from worker_type.to_import_path() instead of manual enum value processing, aligning with the authoritative import-path mapping.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the PR as a back-merge of the v0.163.4 hotfix branch to main, clearly summarizing the primary change.
Description check ✅ Passed The description comprehensively covers all required template sections with specific details about the hotfix content, merge strategy, conflict resolution, and testing.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch v0.163.4-hotfix
⚔️ Resolve merge conflicts
  • Resolve merge conflict in branch v0.163.4-hotfix

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@sonarqubecloud
Copy link
Copy Markdown

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
workers/shared/infrastructure/config/builder.py (1)

348-350: ⚡ Quick win

Clarify the rationale for catching OSError and AttributeError.

The ValueError catch is well-documented (line 344-345), but the separate (OSError, AttributeError) block lacks explanation. What specific failure modes do these exceptions cover during pluggable worker verification?

📝 Add explanatory comment
         except (OSError, AttributeError):
+            # OSError: I/O errors during module file access
+            # AttributeError: Malformed module spec or broken import machinery
             logger.exception(f"Error verifying pluggable worker {worker_type.value}")
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@workers/shared/infrastructure/config/builder.py` around lines 348 - 350, Add
a short explanatory comment above the "except (OSError, AttributeError):" block
in the pluggable worker verification code explaining which failure modes are
being guarded: e.g., OSError covers filesystem/IO issues or problems loading
modules/resources when inspecting a plugin, and AttributeError covers missing
attributes or methods on the dynamically loaded worker (e.g., expected
class/method names absent). Keep the existing logger.exception(...) and return
False behavior unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@workers/shared/infrastructure/config/builder.py`:
- Around line 348-350: Add a short explanatory comment above the "except
(OSError, AttributeError):" block in the pluggable worker verification code
explaining which failure modes are being guarded: e.g., OSError covers
filesystem/IO issues or problems loading modules/resources when inspecting a
plugin, and AttributeError covers missing attributes or methods on the
dynamically loaded worker (e.g., expected class/method names absent). Keep the
existing logger.exception(...) and return False behavior unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 169bd087-ff6c-428f-bda9-8ec786518223

📥 Commits

Reviewing files that changed from the base of the PR and between 0559057 and 2dab564.

⛔ Files ignored due to path filters (8)
  • backend/uv.lock is excluded by !**/*.lock
  • platform-service/uv.lock is excluded by !**/*.lock
  • prompt-service/uv.lock is excluded by !**/*.lock
  • unstract/filesystem/uv.lock is excluded by !**/*.lock
  • unstract/sdk1/uv.lock is excluded by !**/*.lock
  • unstract/tool-registry/uv.lock is excluded by !**/*.lock
  • uv.lock is excluded by !**/*.lock
  • workers/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (17)
  • backend/sample.env
  • docker/dockerfiles/worker-unified.Dockerfile
  • tools/classifier/Dockerfile
  • tools/classifier/src/config/properties.json
  • tools/structure/Dockerfile
  • tools/structure/src/config/properties.json
  • tools/text_extractor/Dockerfile
  • tools/text_extractor/src/config/properties.json
  • unstract/sdk1/pyproject.toml
  • unstract/sdk1/src/unstract/sdk1/adapters/base1.py
  • unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/bedrock.json
  • unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json
  • unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py
  • unstract/sdk1/tests/test_bedrock_adapter.py
  • unstract/tool-registry/tool_registry_config/public_tools.json
  • workers/shared/infrastructure/config/builder.py
  • workers/worker.py
💤 Files with no reviewable changes (1)
  • docker/dockerfiles/worker-unified.Dockerfile

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 21, 2026

Greptile Summary

This back-merge carries the litellm CVE-2026-42208 fix (1.82.3 fork → 1.83.10 PyPI) to main, along with the IAM Role / Instance Profile Bedrock auth mode and the importlib.util.find_spec pluggable-worker discovery improvement. The net-new content is small, but the conflict resolution for the four Bedrock adapter files went the wrong way.

  • litellm bump (core goal): pyproject.toml switches from the Zipstack GitHub fork to litellm==1.83.10 on PyPI, drops the git install-time dependency from Dockerfiles, and updates tiktoken to ~=0.12.0; the cohere-timeout monkey-patch is refreshed to match the new release.
  • Bedrock conflict regression: base1.py, both bedrock.json schemas, and the test file were resolved to the hotfix-branch version instead of main's version, dropping the bearer_token auth mode added in UN-3152 [FEAT] Add AWS_BEARER_TOKEN_BEDROCK auth to Bedrock LLM and Embedding adapters #1952 and breaking stored adapters that use it.
  • Worker improvements: importlib.util.find_spec replaces a raw filesystem check for compiled plugins, and worker.py derives worker_directory from WorkerType.to_import_path() instead of a blind hyphen-replace.

Confidence Score: 3/5

Not safe to merge as-is: the conflict resolution for the Bedrock adapter files chose the hotfix-branch version instead of main's, silently removing the bearer_token auth mode shipped in #1952.

The litellm bump and worker improvements are clean. The conflict resolution for base1.py and both bedrock.json schemas kept the hotfix-branch version, dropping _BEDROCK_VALID_AUTH_TYPES from four values to three and removing _translate_bedrock_bearer_token(). Any saved Bedrock adapter with auth_type: 'bearer_token' will throw ValueError: Unknown auth_type on its next validation call, and the UI will no longer offer the bearer-token option to new users.

unstract/sdk1/src/unstract/sdk1/adapters/base1.py, llm1/static/bedrock.json, and embedding1/static/bedrock.json all need to be re-resolved to main's three-mode auth version before merging.

Important Files Changed

Filename Overview
unstract/sdk1/src/unstract/sdk1/adapters/base1.py Conflict resolution chose the hotfix-branch version, dropping the bearer_token auth mode and its _translate_bedrock_bearer_token() helper; stored adapters using bearer_token will raise ValueError on next validation.
unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json JSON schema truncated to two auth_type values; bearer_token branch and aws_bearer_token field from main's #1952 are missing.
unstract/sdk1/src/unstract/sdk1/adapters/embedding1/static/bedrock.json Same truncation as the LLM schema — bearer_token option absent from auth_type enum.
unstract/sdk1/tests/test_bedrock_adapter.py New test suite covers IAM Role and Access Keys semantics well; no bearer_token cases (consistent with the missing mode).
unstract/sdk1/pyproject.toml Switches litellm from the Zipstack GitHub fork to PyPI 1.83.10 and bumps tiktoken to ~=0.12.0; the CVE fix is the sole net-new change on main.
unstract/sdk1/src/unstract/sdk1/patches/litellm_cohere_timeout.py Version pin updated to 1.83.10 and the copied async/sync handler bodies refreshed to match that release; approach is correct.
workers/shared/infrastructure/config/builder.py Replaces filesystem .exists() check with importlib.util.find_spec() for pluggable worker discovery; also catches ValueError for the __spec__ == None edge case.
workers/worker.py Derives worker_directory from to_import_path().rsplit(".", 1)[0]; correctly maps api_deploymentapi-deployment via the enum's directory mapping.
docker/dockerfiles/worker-unified.Dockerfile Removes git from the build image; safe since litellm is now sourced from PyPI rather than a Git tag.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Bedrock adapter validate] --> B{auth_type?}
    B -- access_keys --> C[Require aws_access_key_id + aws_secret_access_key non-blank]
    B -- iam_role --> D[Drop access key fields - boto3 default chain]
    B -- bearer_token MISSING AFTER MERGE --> E[_translate_bedrock_bearer_token - aws_bearer_token to api_key]
    B -- None legacy --> F[Strip blank keys silently]
    B -- unknown --> G[raise ValueError]
    C --> H[Return validated LiteLLM kwargs]
    D --> H
    E --> H
    F --> H
    style E fill:#ff6b6b,color:#fff
    style G fill:#ffa500,color:#fff
Loading

Fix All in Claude Code

Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
unstract/sdk1/src/unstract/sdk1/adapters/base1.py:487-489
**`bearer_token` auth mode silently dropped by conflict resolution**

The base commit (`main`) had `_BEDROCK_VALID_AUTH_TYPES = frozenset({None, "access_keys", "iam_role", "bearer_token"})` and a `_translate_bedrock_bearer_token()` function that renames `aws_bearer_token``api_key` for LiteLLM. This PR's version has `_BEDROCK_VALID_AUTH_TYPES = frozenset({None, "access_keys", "iam_role"})` with no `bearer_token` case and no translation helper. Any saved adapter with `auth_type: "bearer_token"` will immediately raise `ValueError: Unknown auth_type 'bearer_token'` on its next validation call, breaking all Bedrock users who chose the API key / bearer-token auth path added in #1952. The PR description explicitly flagged this risk, but the conflict was resolved in favour of the hotfix-branch version instead of main's.

### Issue 2 of 3
unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json:67-80
**`bearer_token` option missing from `auth_type` enum**

On main before this PR, the LLM bedrock.json schema offered three `auth_type` choices: `"access_keys"`, `"iam_role"`, and `"bearer_token"` (with an `aws_bearer_token` input field). This file now only exposes two choices. Existing adapters stored with `auth_type: "bearer_token"` will fail validation and users will lose the option in the UI. The embedding `bedrock.json` has the same truncation.

### Issue 3 of 3
unstract/sdk1/tests/test_bedrock_adapter.py:1-10
**No test coverage for `bearer_token` auth path**

The added test suite exhaustively covers `iam_role` and `access_keys` semantics but has no cases for `auth_type == "bearer_token"`. On main, this path requires non-blank `aws_bearer_token`, renames it to `api_key`, and strips access keys. If the bearer-token mode is restored, tests covering these three behaviours should accompany the fix.

Reviews (1): Last reviewed commit: "[HOTFIX] Bump litellm to 1.83.10 from Py..." | Re-trigger Greptile

Comment on lines +487 to +489
{None, "access_keys", "iam_role"}
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 bearer_token auth mode silently dropped by conflict resolution

The base commit (main) had _BEDROCK_VALID_AUTH_TYPES = frozenset({None, "access_keys", "iam_role", "bearer_token"}) and a _translate_bedrock_bearer_token() function that renames aws_bearer_tokenapi_key for LiteLLM. This PR's version has _BEDROCK_VALID_AUTH_TYPES = frozenset({None, "access_keys", "iam_role"}) with no bearer_token case and no translation helper. Any saved adapter with auth_type: "bearer_token" will immediately raise ValueError: Unknown auth_type 'bearer_token' on its next validation call, breaking all Bedrock users who chose the API key / bearer-token auth path added in #1952. The PR description explicitly flagged this risk, but the conflict was resolved in favour of the hotfix-branch version instead of main's.

Prompt To Fix With AI
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/adapters/base1.py
Line: 487-489

Comment:
**`bearer_token` auth mode silently dropped by conflict resolution**

The base commit (`main`) had `_BEDROCK_VALID_AUTH_TYPES = frozenset({None, "access_keys", "iam_role", "bearer_token"})` and a `_translate_bedrock_bearer_token()` function that renames `aws_bearer_token``api_key` for LiteLLM. This PR's version has `_BEDROCK_VALID_AUTH_TYPES = frozenset({None, "access_keys", "iam_role"})` with no `bearer_token` case and no translation helper. Any saved adapter with `auth_type: "bearer_token"` will immediately raise `ValueError: Unknown auth_type 'bearer_token'` on its next validation call, breaking all Bedrock users who chose the API key / bearer-token auth path added in #1952. The PR description explicitly flagged this risk, but the conflict was resolved in favour of the hotfix-branch version instead of main's.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

Comment on lines +67 to +80
"auth_type": {
"type": "string",
"title": "Authentication Type",
"enum": [
"access_keys",
"iam_role"
],
"enumNames": [
"Access Keys",
"IAM Role / Instance Profile (on-prem AWS only)"
],
"default": "access_keys",
"description": "Choose **Access Keys** for any deployment (provide your AWS Access Key ID and Secret). Choose **IAM Role / Instance Profile** only when Unstract is hosted on AWS infrastructure that already has ambient credentials — for example EKS pods with IRSA, ECS tasks with a task role, or EC2 instances with an instance profile. The on-prem option requires no further input; boto3 picks up the host's identity automatically."
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 bearer_token option missing from auth_type enum

On main before this PR, the LLM bedrock.json schema offered three auth_type choices: "access_keys", "iam_role", and "bearer_token" (with an aws_bearer_token input field). This file now only exposes two choices. Existing adapters stored with auth_type: "bearer_token" will fail validation and users will lose the option in the UI. The embedding bedrock.json has the same truncation.

Prompt To Fix With AI
This is a comment left during a code review.
Path: unstract/sdk1/src/unstract/sdk1/adapters/llm1/static/bedrock.json
Line: 67-80

Comment:
**`bearer_token` option missing from `auth_type` enum**

On main before this PR, the LLM bedrock.json schema offered three `auth_type` choices: `"access_keys"`, `"iam_role"`, and `"bearer_token"` (with an `aws_bearer_token` input field). This file now only exposes two choices. Existing adapters stored with `auth_type: "bearer_token"` will fail validation and users will lose the option in the UI. The embedding `bedrock.json` has the same truncation.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

Comment on lines +1 to +10
"""Unit tests for the AWS Bedrock LLM and embedding adapters.

Covers the auth_type selector behaviour added alongside the IAM Role /
Instance Profile mode, plus backwards compatibility for legacy adapter
configurations stored before auth_type existed.
"""

import pytest
from unstract.sdk1.adapters.base1 import (
AWSBedrockEmbeddingParameters,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 No test coverage for bearer_token auth path

The added test suite exhaustively covers iam_role and access_keys semantics but has no cases for auth_type == "bearer_token". On main, this path requires non-blank aws_bearer_token, renames it to api_key, and strips access keys. If the bearer-token mode is restored, tests covering these three behaviours should accompany the fix.

Prompt To Fix With AI
This is a comment left during a code review.
Path: unstract/sdk1/tests/test_bedrock_adapter.py
Line: 1-10

Comment:
**No test coverage for `bearer_token` auth path**

The added test suite exhaustively covers `iam_role` and `access_keys` semantics but has no cases for `auth_type == "bearer_token"`. On main, this path requires non-blank `aws_bearer_token`, renames it to `api_key`, and strips access keys. If the bearer-token mode is restored, tests covering these three behaviours should accompany the fix.

How can I resolve this? If you propose a fix, please make it concise.

Fix in Claude Code

@jaseemjaskp jaseemjaskp self-requested a review May 21, 2026 04:28
@pk-zipstack
Copy link
Copy Markdown
Contributor Author

Closing — reopening from back-merge/v0.163.4-hotfix-to-main which carries the merge commit that resolves the bedrock-file and uv.lock conflicts. The hotfix branch itself is push-protected so the resolution had to live on a separate head.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants