OLS-2588: Add google vertex provider for Gemini and Claude#2877

Open
raptorsun wants to merge 4 commits intoopenshift:mainfrom
raptorsun:add-google-vertex-provider

@raptorsun
Contributor

@raptorsun raptorsun commented Apr 3, 2026

Description

This PR adds 2 providers:

  • Generic Google Vertex for models hosted by Google
  • Anthropic Google Vertex for models hosted by Anthropic but accessible by Vertex AI.

This PR replaces #2824

Here is an example configuration for accessing Gemini and Claude models.

The credential file is the service account key file from Google Cloud.

llm_providers:
  - name: my_google_vertex_anthropic
    type: google_vertex_anthropic
    url: "https://us-east5-aiplatform.googleapis.com"
    credentials_path: application_default_credentials.json
    project_id: itpc-gcp-hcm-pe-eng-claude
    google_vertex_anthropic_config:
      project: itpc-gcp-hcm-pe-eng-claude
      location: us-east5
    models:
      - name: claude-opus-4-6
  - name: my_google_vertex
    type: google_vertex
    url: "https://us-east5-aiplatform.googleapis.com"
    credentials_path: ols-gemini-ai-testing-07a7fa3e0e0e.json
    project_id: ols-gemini-ai-testing
    google_vertex_config:
      project: ols-gemini-ai-testing
      location: us-east5
    models:
      - name: gemini-2.5-flash

ols_config:
  conversation_cache:
    type: memory
    memory:
      max_entries: 1000
  logging_config:
    app_log_level: info
    lib_log_level: warning
    uvicorn_log_level: info
    suppress_metrics_in_log: false
    suppress_auth_checks_warning_in_log: false
  default_provider: my_google_vertex
  default_model: gemini-2.5-flash
  expire_llm_is_ready_persistent_state: -1

  authentication_config:
    module: "noop"
  user_data_collection:
    feedback_disabled: true
    transcripts_disabled: true
dev_config:
  enable_dev_ui: true
  disable_auth: true
  disable_tls: true

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

@openshift-ci openshift-ci bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 3, 2026
@openshift-ci openshift-ci bot requested a review from joshuawilson April 3, 2026 10:25
@openshift-ci

openshift-ci bot commented Apr 3, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign joshuawilson for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot requested a review from onmete April 3, 2026 10:25
@raptorsun raptorsun force-pushed the add-google-vertex-provider branch from fa4f807 to a759596 Compare April 3, 2026 12:49
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 3, 2026
@raptorsun raptorsun force-pushed the add-google-vertex-provider branch 6 times, most recently from 148dff7 to 8e903b2 Compare April 6, 2026 16:30
@raptorsun raptorsun changed the title [WIP] Add google vertex provider Add google vertex provider for Gemini and Claude Apr 6, 2026
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 6, 2026
@raptorsun raptorsun changed the title Add google vertex provider for Gemini and Claude OLS-2588: Add google vertex provider for Gemini and Claude Apr 6, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 6, 2026
@openshift-ci-robot

openshift-ci-robot commented Apr 6, 2026

@raptorsun: This pull request references OLS-2588 which is a valid jira issue.

and self.google_vertex_config == other.google_vertex_config
and self.tls_security_profile == other.tls_security_profile
)
return False
Contributor

Why do we need it? Pydantic v2's BaseModel already compares all fields by default. This is very error-prone.

Contributor Author

Do you mean we should remove this __eq__ function?

Contributor

yes
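
The risk with a hand-written `__eq__` can be illustrated with a small sketch. It uses stdlib `dataclasses` as a stand-in for Pydantic (both generate field-wise equality from the declared fields); the class and field names below are hypothetical and are not the PR's `ProviderConfig`.

```python
from dataclasses import dataclass


@dataclass
class GeneratedEq:
    """Equality is generated from the field list, so new fields are covered."""

    name: str
    project_id: str
    location: str = "us-east5"  # field added later; still compared


class HandWrittenEq:
    """Hypothetical class whose __eq__ was not updated for `location`."""

    def __init__(self, name: str, project_id: str, location: str = "us-east5"):
        self.name = name
        self.project_id = project_id
        self.location = location

    def __eq__(self, other: object) -> bool:
        if isinstance(other, HandWrittenEq):
            # `location` was forgotten here, so it is silently ignored.
            return self.name == other.name and self.project_id == other.project_id
        return False


# Two configs that differ only in the later-added field.
generated_differs = GeneratedEq("a", "p", "us-east5") != GeneratedEq("a", "p", "europe-west1")
hand_written_differs = HandWrittenEq("a", "p", "us-east5") != HandWrittenEq("a", "p", "europe-west1")
```

With generated equality the two objects compare as different; with the hand-written method they compare as equal, which is the kind of silent drift the review comment warns about.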

@blublinsky
Contributor

credentials_path is configured but never passed to the LLM constructors
The config example shows credentials_path: application_default_credentials.json, and ProviderConfig.__init__ dutifully reads it via checks.read_secret() into self.credentials. But neither GoogleVertex nor GoogleVertexAnthropic ever reads self.provider_config.credentials, so the loaded secret is a dead value.

Compare with every other provider in the codebase:

openai.py, watsonx.py, rhoai_vllm.py, rhelai_vllm.py, azure_openai.py — all do this:

self.credentials = self.provider_config.credentials

... and then pass it to the LangChain constructor

The Google Vertex providers skip this entirely. Authentication works only because the operator separately sets the GOOGLE_APPLICATION_CREDENTIALS environment variable, and Google's client libraries pick it up via Application Default Credentials (ADC). The credentials_path in the config creates a false sense of control — an operator could point it at the wrong file and never notice because auth succeeds through the ambient env var.

Both LangChain constructors accept an explicit credentials parameter:

from google.oauth2 import service_account
creds = service_account.Credentials.from_service_account_file(credentials_path)
ChatGoogleGenerativeAI(credentials=creds, ...)
ChatAnthropicVertex(credentials=creds, ...)
The providers should load the credential file and pass it explicitly (like every other provider does with its credential), or credentials_path should be removed from the config to avoid the misleading configuration.

This needs to be fixed
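
A minimal sketch of the explicit-loading approach suggested above, using only the standard library for the file handling. The helper name `load_service_account_info` and the set of required keys are assumptions made for illustration, not code from this PR; the returned dict is the kind of input that `service_account.Credentials.from_service_account_info` takes.

```python
import json
from pathlib import Path
from typing import Any

# Keys assumed to be required for this sketch; a real key file carries more.
REQUIRED_KEYS = ("client_email", "private_key", "token_uri")


def load_service_account_info(credentials_path: str) -> dict[str, Any]:
    """Read the key file named in the config and fail loudly if it is unusable."""
    path = Path(credentials_path)
    if not path.is_file():
        raise ValueError(f"credentials_path does not exist: {credentials_path}")
    info = json.loads(path.read_text(encoding="utf-8"))
    if not isinstance(info, dict):
        raise ValueError("credential file must contain a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in info]
    if missing:
        raise ValueError(f"credential file is missing keys: {missing}")
    return info
```

Because the file is read and checked explicitly, a wrong `credentials_path` surfaces as a configuration error instead of being masked by whatever `GOOGLE_APPLICATION_CREDENTIALS` happens to point at.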

@raptorsun
Contributor Author

@blublinsky Thank you very much for the review!
The credential problem has been fixed in the new commit.
Please have a look again.

@raptorsun raptorsun requested a review from blublinsky April 9, 2026 23:23
def default_params(self) -> dict[str, Any]:
    """Construct and return structure with default LLM params."""
    self.project = self.provider_config.project_id
    account_info = credentials_str_to_dict(self.provider_config.credentials)
Contributor

credentials_str_to_dict(self.provider_config.credentials) is called unconditionally, but self.provider_config.credentials is None when credentials_path is not configured (read_secret returns None for missing paths). This crashes with a raw TypeError from json.loads(None) instead of a clear configuration error.

Other providers guard against this -- e.g. watsonx.py:56:

if self.credentials is None:
    raise ValueError("Credentials must be specified")
Add a similar check before the credentials_str_to_dict call.
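
The guard could be sketched as follows. The function body is hypothetical: the PR's `credentials_str_to_dict` helper is not shown in this thread, so this only illustrates where a `None` check would sit relative to the JSON parsing.

```python
import json
from typing import Any, Optional


def credentials_str_to_dict(credentials: Optional[str]) -> dict[str, Any]:
    """Parse a JSON credential string, rejecting a missing or malformed value.

    Hypothetical body: the PR's helper of the same name is not shown here.
    """
    if credentials is None:
        # Without this guard, json.loads(None) raises a bare TypeError.
        raise ValueError("Credentials must be specified")
    try:
        parsed = json.loads(credentials)
    except json.JSONDecodeError as exc:
        raise ValueError("Credentials are not valid JSON") from exc
    if not isinstance(parsed, dict):
        raise ValueError("Credentials must be a JSON object")
    return parsed
```

Placing the check inside the helper covers both providers at once; placing it in each `default_params` instead would match the watsonx.py pattern more closely.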

wheel_packages="$wheel_packages,$EXTRA_WHEELS"
sed -i 's/"packages": "[^"]*"/"packages": "'"$wheel_packages"'"/' .tekton/lightspeed-service-pull-request.yaml
sed -i 's/"packages": "[^"]*"/"packages": "'"$wheel_packages"'"/' .tekton/lightspeed-service-push.yaml

Contributor

The maturin==1.10.2 pin was removed, but maturin is only in EXTRA_WHEELS, not in $wheel_packages (derived from WHEEL_FILE). The new dedup loop on lines 87-90 only strips BUILD_FILE entries matching names in $wheel_packages, so it doesn't cover maturin. If pybuild-deps emits a different maturin version than what the Red Hat registry provides, you'll get hash conflicts. Either restore the pin or extend the dedup loop to also cover EXTRA_WHEELS.

Contributor Author

Hermeto downloads the build requirements first, and then the rest of the requirement files.
When a package like maturin is listed in both the build requirements and another requirement file, it is downloaded only once, and the hash from the build dependencies often differs from that of the package in the wheel requirement file.
So I remove a package from the build requirement file if the same package is also listed in the wheel requirement file.
Normally maturin is not a runtime dependency; it manages the Rust environment, so it is a build-time dependency.
I filter the wheel packages out of the build dependency file first, and only then add the extra wheels list to the Hermeto config. That way we do not remove a package in the EXTRA_WHEELS list from the build requirements, which could leave it never downloaded if it is not in the wheel requirement file.
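
The filtering order described above can be sketched in Python. This is an illustration only: the PR implements it in a shell script, and the function names, the `name==version` entry format with indented `--hash` continuation lines, and the sample data are assumptions.

```python
import re

# Start of a requirement entry such as "maturin==1.9.4 \"; captures the name.
_ENTRY = re.compile(r"^([A-Za-z0-9][A-Za-z0-9._-]*)\s*==")


def normalize(name: str) -> str:
    """Normalize a package name the way PEP 503 does."""
    return re.sub(r"[-_.]+", "-", name).lower()


def drop_wheel_packages(build_lines: list[str], wheel_packages: list[str]) -> list[str]:
    """Remove build-requirement entries that also appear in the wheel list."""
    wheels = {normalize(name) for name in wheel_packages}
    kept: list[str] = []
    skipping = False
    for line in build_lines:
        match = _ENTRY.match(line)
        if match:
            # A new entry starts; decide whether to drop it and its hash lines.
            skipping = normalize(match.group(1)) in wheels
        elif not line.startswith((" ", "\t")):
            # Comments and blank lines end any entry being skipped.
            skipping = False
        if not skipping:
            kept.append(line)
    return kept


def hermeto_packages(wheel_packages: list[str], extra_wheels: list[str]) -> str:
    """Append the extra wheels only after the build file has been filtered."""
    return ",".join([*wheel_packages, *extra_wheels])
```

Because `drop_wheel_packages` only sees the names from the wheel requirement file, an entry such as maturin that appears solely in the extra wheels list stays in the build requirements and is still downloaded.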

@raptorsun raptorsun force-pushed the add-google-vertex-provider branch from fb33e52 to 6980913 Compare April 10, 2026 16:03
@raptorsun raptorsun force-pushed the add-google-vertex-provider branch 2 times, most recently from 1418021 to 12a1d6e Compare April 10, 2026 20:31
harche and others added 2 commits April 10, 2026 22:31
Adds a new `google_vertex` LLM provider type that enables using Anthropic
Claude models through Google Cloud's Vertex AI. Uses `ChatAnthropicVertex`
from `langchain-google-vertexai` which authenticates via Google ADC.

Also bumps httpx to >=0.28.0 (required by langchain-google-vertexai) and
updates provider.py to use the renamed `proxy` parameter.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Haoyu Sun <hasun@redhat.com>
@raptorsun raptorsun force-pushed the add-google-vertex-provider branch 2 times, most recently from 7d5d6bf to 7dec5ff Compare April 10, 2026 20:59
Signed-off-by: Haoyu Sun <hasun@redhat.com>
…_objects_api" on class object

Signed-off-by: Haoyu Sun <hasun@redhat.com>
@raptorsun raptorsun force-pushed the add-google-vertex-provider branch from 7dec5ff to 1ce47fa Compare April 10, 2026 21:15
@openshift-ci

openshift-ci bot commented Apr 10, 2026

@raptorsun: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name                Commit   Details  Required  Rerun command
ci/prow/ols-evaluation   1ce47fa  link     true      /test ols-evaluation
ci/prow/e2e-ols-cluster  1ce47fa  link     true      /test e2e-ols-cluster
