ResourceExhausted: 429 received metadata size exceeds soft limit #3965

Open
pweglik opened this issue Jun 18, 2024 · 0 comments
Labels
api: vertex-ai Issues related to the googleapis/python-aiplatform API.


pweglik commented Jun 18, 2024

Environment details

  • OS: Dockerfile base image: python:3.11
  • Python version: 3.11
  • pip version: 24.0
  • google-auth version: 2.29.0

Description

We created a simple Flask server and deployed it to GCP as a Cloud Run service. We also use a few other dependencies:

google-api-core==2.19.0
google-cloud-aiplatform==1.49.0 # we needed to pin this version because of a deprecation warning somewhere in ai-vertex; might be worth checking in the future
google-cloud-logging==3.10.0
langchain==0.2.1
langchain_community==0.2.1
langchain-google-vertexai==1.0.4

Snippet of the code:

import google.cloud.logging
from langchain_google_vertexai import VertexAI

# setup logging
client = google.cloud.logging.Client()
client.setup_logging()

# in the endpoint
llm_model = VertexAI(
    model_name="text-bison",
    max_output_tokens=256,
    temperature=1,
    top_p=0.8,
    top_k=40,
    verbose=True,
)

We don't do anything more sophisticated than that. After deployment, it ran fine for a few hours, and then we started to receive warnings:

Retrying langchain_google_vertexai.llms._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ResourceExhausted: 429 received metadata size exceeds soft limit (16711 vs. 16384);  :path:90B :authority:79B :method:43B :scheme:44B content-type:60B te:42B grpc-accept-encoding:75B user-agent:100B grpc-trace-bin:103B pc-low-fwd-bin:77B x-goog-request-params:148B x-goog-api-client:12052B x-goog-api-client:62B authorization:1076B x-google-gfe-frontline-info:836B x-google-gfe-timestamp-trace:76B x-google-gfe-verified-user-ip:76B x-gfe-signed-request-headers:472B x-google-gfe-location-info:74B x-gfe-ssl:44B x-google-gfe-tls-base64urlclienthelloprotobuf:299B x-user-ip:56B x-google-service:105B x-google-gfe-service-trace:115B x-google-gfe-backend-timeout-ms:71B accept-encoding:56B x-google-peer-delegation-chain-bin:92B x-google-request-uid:138B x-google-dappertraceinfo:111B.

You can see that there are two fields named x-goog-api-client, and one of them is growing out of proportion (12052B here).
Later on it grew even bigger (22452B in the log below), and we started to receive the warning on almost every request. The server also started to time out, as it was unable to serve those requests.

Retrying langchain_google_vertexai.llms._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ResourceExhausted: 429 received metadata size exceeds soft limit (27114 vs. 16384);  :path:90B :authority:79B :method:43B :scheme:44B content-type:60B te:42B grpc-accept-encoding:75B user-agent:100B grpc-trace-bin:103B pc-low-fwd-bin:77B x-goog-request-params:148B x-goog-api-client:22452B x-goog-api-client:62B authorization:1076B x-google-gfe-frontline-info:837B x-google-gfe-timestamp-trace:76B x-google-gfe-verified-user-ip:76B x-gfe-signed-request-headers:472B x-google-gfe-location-info:74B x-gfe-ssl:44B x-google-gfe-tls-base64urlclienthelloprotobuf:299B x-user-ip:56B x-google-service:105B x-google-gfe-service-trace:115B x-google-gfe-backend-timeout-ms:71B accept-encoding:56B x-google-peer-delegation-chain-bin:92B x-google-request-uid:140B x-google-dappertraceinfo:111B.
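To compare the two warnings more easily, the per-header sizes can be pulled out of the message text. The following is only a minimal sketch: the helper names (`parse_metadata_sizes`, `largest_entries`) are made up for illustration, and the sample string is an abbreviated copy of the first warning above, not the full log line.

```python
import re

# Matches entries such as "x-goog-api-client:12052B" or ":path:90B".
ENTRY_RE = re.compile(r"(\S+):(\d+)B\b")


def parse_metadata_sizes(message: str) -> list[tuple[str, int]]:
    """Return (header name, size in bytes) pairs in the order they appear."""
    return [(name, int(size)) for name, size in ENTRY_RE.findall(message)]


def largest_entries(message: str, top: int = 3) -> list[tuple[str, int]]:
    """Return the `top` largest metadata entries, biggest first."""
    entries = parse_metadata_sizes(message)
    return sorted(entries, key=lambda entry: entry[1], reverse=True)[:top]


# Abbreviated copy of the first warning in this issue (assumption: the
# remaining headers are omitted here only to keep the example short).
sample = (
    "429 received metadata size exceeds soft limit (16711 vs. 16384);  "
    ":path:90B :authority:79B x-goog-api-client:12052B "
    "x-goog-api-client:62B authorization:1076B "
    "x-google-gfe-frontline-info:836B."
)

for name, size in largest_entries(sample):
    print(f"{name}: {size} bytes")
```

Between the two warnings, the reported total goes from 16711 to 27114 bytes (a difference of 10403), and the large x-goog-api-client entry goes from 12052B to 22452B (a difference of 10400), so that single header accounts for practically all of the growth.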

It looks like something is appended to this field and it overflows after some time. I found a place in the code of the library that could cause it: https://github.com/googleapis/google-auth-library-python/blob/main/google/auth/metrics.py#L138-L154
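The kind of failure suspected here can be illustrated with a small, purely hypothetical sketch. It does not reproduce the linked metrics.py code or any other Google library code; the function names and the token value are invented. It only shows how a header value on a shared, long-lived object can pass the 16384-byte soft limit if a token is appended on every call without checking whether it is already present.

```python
# Hypothetical illustration only: none of these names come from google-auth,
# google-api-core, or langchain-google-vertexai.
SOFT_LIMIT = 16384  # soft limit reported in the error message
HEADER = "x-goog-api-client"


def leaky_add_token(headers: dict, token: str) -> None:
    """Append the token on every call, so the value grows without bound."""
    existing = headers.get(HEADER, "")
    headers[HEADER] = f"{existing} {token}".strip()


def idempotent_add_token(headers: dict, token: str) -> None:
    """Append the token only if it is not already in the header value."""
    parts = headers.get(HEADER, "").split()
    if token not in parts:
        parts.append(token)
    headers[HEADER] = " ".join(parts)


def calls_until_overflow(add_token, token: str, max_calls: int = 10_000):
    """Reuse one headers dict across calls, as a long-lived client would.

    Returns the number of calls after which the header exceeds SOFT_LIMIT,
    or None if it never does within max_calls.
    """
    headers: dict = {}
    for call_number in range(1, max_calls + 1):
        add_token(headers, token)
        if len(headers[HEADER]) > SOFT_LIMIT:
            return call_number
    return None


token = "example-token/1.0"  # made-up value, not a real metrics token
print("leaky:", calls_until_overflow(leaky_add_token, token))
print("idempotent:", calls_until_overflow(idempotent_add_token, token))
```

In this sketch the leaky variant overflows after several hundred calls, while the idempotent variant never does. That would match a service that works for a few hours and then fails on almost every request, but it is only an assumption about the mechanism, not a confirmed diagnosis.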

I'm looking for some guidance on what could cause this warning and the overflow in requests.

Steps to reproduce

I'm not really sure; the error only occurred after a few hours (after serving a few thousand requests).

I have also created this issue in the google-auth library repo, but maybe someone here will be able to help.

Let me know if I can help you somehow or provide any additional info!

@product-auto-label product-auto-label bot added the api: vertex-ai Issues related to the googleapis/python-aiplatform API. label Jun 18, 2024