We have created a simple Flask server and deployed it to GCP as a Cloud Run service. We are also using a few other dependencies:
google-api-core==2.19.0
google-cloud-aiplatform==1.49.0 # we needed to pin this version because of a deprecation warning somewhere in ai-vertex; might be worth checking in future
google-cloud-logging==3.10.0
langchain==0.2.1
langchain_community==0.2.1
langchain-google-vertexai==1.0.4
Snippet of the code:

import google.cloud.logging
from langchain_google_vertexai import VertexAI

# setup logging
client = google.cloud.logging.Client()
client.setup_logging()

# in the endpoint
llm_model = VertexAI(
    model_name="text-bison",
    max_output_tokens=256,
    temperature=1,
    top_p=0.8,
    top_k=40,
    verbose=True,
)
We don't do anything more sophisticated than that. After deployment, it ran fine for a few hours, and then we started to receive warnings:
Retrying langchain_google_vertexai.llms._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ResourceExhausted: 429 received metadata size exceeds soft limit (16711 vs. 16384); :path:90B :authority:79B :method:43B :scheme:44B content-type:60B te:42B grpc-accept-encoding:75B user-agent:100B grpc-trace-bin:103B pc-low-fwd-bin:77B x-goog-request-params:148B x-goog-api-client:12052B x-goog-api-client:62B authorization:1076B x-google-gfe-frontline-info:836B x-google-gfe-timestamp-trace:76B x-google-gfe-verified-user-ip:76B x-gfe-signed-request-headers:472B x-google-gfe-location-info:74B x-gfe-ssl:44B x-google-gfe-tls-base64urlclienthelloprotobuf:299B x-user-ip:56B x-google-service:105B x-google-gfe-service-trace:115B x-google-gfe-backend-timeout-ms:71B accept-encoding:56B x-google-peer-delegation-chain-bin:92B x-google-request-uid:138B x-google-dappertraceinfo:111B.
You can see that there are two fields named x-goog-api-client and one is growing out of proportion.
Later on it grew even bigger, and we started to receive the warning on almost every request. The server also started to time out, as it was unable to serve those requests.
Retrying langchain_google_vertexai.llms._completion_with_retry.<locals>._completion_with_retry_inner in 4.0 seconds as it raised ResourceExhausted: 429 received metadata size exceeds soft limit (27114 vs. 16384); :path:90B :authority:79B :method:43B :scheme:44B content-type:60B te:42B grpc-accept-encoding:75B user-agent:100B grpc-trace-bin:103B pc-low-fwd-bin:77B x-goog-request-params:148B x-goog-api-client:22452B x-goog-api-client:62B authorization:1076B x-google-gfe-frontline-info:837B x-google-gfe-timestamp-trace:76B x-google-gfe-verified-user-ip:76B x-gfe-signed-request-headers:472B x-google-gfe-location-info:74B x-gfe-ssl:44B x-google-gfe-tls-base64urlclienthelloprotobuf:299B x-user-ip:56B x-google-service:105B x-google-gfe-service-trace:115B x-google-gfe-backend-timeout-ms:71B accept-encoding:56B x-google-peer-delegation-chain-bin:92B x-google-request-uid:140B x-google-dappertraceinfo:111B.
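To compare the two warnings, the per-header breakdown can be parsed out of the message text. This is only a minimal sketch: `header_sizes` and `SAMPLE` are names made up for illustration, and `SAMPLE` is a shortened copy of the first warning above, not a full log line.

```python
import re

# Shortened copy of the first warning, keeping only a few headers.
SAMPLE = (
    "429 received metadata size exceeds soft limit (16711 vs. 16384); "
    ":path:90B x-goog-api-client:12052B x-goog-api-client:62B authorization:1076B."
)


def header_sizes(message: str) -> dict[str, list[int]]:
    """Map each header name in the warning to the list of sizes (bytes) reported for it."""
    # Everything after the "(N vs. M);" summary is the per-header breakdown.
    _, _, breakdown = message.partition(");")
    sizes: dict[str, list[int]] = {}
    for name, size in re.findall(r"(\S+):(\d+)B", breakdown):
        # A header can appear more than once, so collect every occurrence.
        sizes.setdefault(name, []).append(int(size))
    return sizes


if __name__ == "__main__":
    # Print headers from largest to smallest total size.
    for name, values in sorted(header_sizes(SAMPLE).items(), key=lambda kv: -sum(kv[1])):
        print(name, values)
```

Running this on both warnings shows that only the first x-goog-api-client entry changes in any significant way (12052B vs. 22452B); the other headers stay the same or differ by a byte or two.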
Environment details
python: 3.11
google-auth version: 2.29.0
It looks like something is appended to this field on every request, and it overflows after some time. I found a place in the code of the library that could cause it: https://github.com/googleapis/google-auth-library-python/blob/main/google/auth/metrics.py#L138-L154
I'm looking for some guidance on what could cause such a warning and overflow in requests.
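To illustrate the kind of behaviour I suspect, here is a minimal sketch. It is not the actual google-auth code: `add_metric_header`, `requests_until_overflow` and the token value are all hypothetical, and the sketch assumes a headers dict that is reused across requests with a helper that appends to the existing value instead of replacing it.

```python
SOFT_LIMIT = 16384  # bytes, taken from the warning message
TOKEN = "example-metric/1"  # hypothetical token; the real value may differ


def add_metric_header(headers: dict[str, str], token: str = TOKEN) -> None:
    """Hypothetical helper: appends the token to any existing header value."""
    existing = headers.get("x-goog-api-client")
    headers["x-goog-api-client"] = f"{existing} {token}" if existing else token


def requests_until_overflow(limit: int = SOFT_LIMIT) -> int:
    """Count the calls needed before the shared header value exceeds the limit."""
    shared_headers: dict[str, str] = {}  # assumed to live as long as the client
    calls = 0
    while len(shared_headers.get("x-goog-api-client", "")) <= limit:
        add_metric_header(shared_headers)
        calls += 1
    return calls


if __name__ == "__main__":
    print(requests_until_overflow())
```

With a short token like this one, the header would pass the soft limit after roughly a thousand calls, which is at least in the same ballpark as the few hours of traffic we saw before the warnings started.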
Steps to reproduce
I'm not really sure; the error only occurred after a few hours (after serving a few thousand requests).
I have also created this issue in google auth library repo, but maybe someone here will be able to help.
Let me know if I can help you somehow or provide any additional info!