Manually deployed vespa with docker compose Failed to establish a new connection: [Errno 111] #663

Open
ricoms opened this issue Jan 18, 2024 · 15 comments

@ricoms

ricoms commented Jan 18, 2024

hi.

I raised Vespa locally with the following docker compose:

version: '3.9'

services:
  locust-master:
    image: locust-vespa
    build:
      context: ../../
      dockerfile: Dockerfile
      args:
        - TEST_NAME=locust
        - VECTORDB_NAME=vespa
    volumes:
      - ${PWD}/reports/vespa/locust_report:/opt/locust_report/
      - ${PWD}/config/:/opt/config/
    command: locust --config ${LOCUST_MASTER_CONFIG} ${LOCUST_USER_CLASSES}
    env_file: vespa.env
    depends_on:
      vespa:
        condition: service_healthy

  locust-worker:
    image: locust-vespa
    volumes:
      - ${PWD}/config/:/opt/config/
    command: locust --config /opt/config/worker.conf ${LOCUST_USER_CLASSES}
    env_file: vespa.env
    depends_on:
      vespa:
        condition: service_healthy

  vespa:
    image: vespaengine/vespa:8.282.24
    command: configserver,services
    volumes:
      - vespa_persist:/opt/vespa/var
      - vespa_logs:/opt/vespa/logs
    environment:
      VESPA_HOSTNAME: vespa
      VESPA_CONFIGSERVERS: vespa
      VESPA_CONFIGSERVER_JVMARGS: "-Xms32M -Xmx128M"
      VESPA_CONFIGPROXY_JVMARGS: "-Xms32M -Xmx32M"
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://vespa:19071/state/v1/health || exit 1"]
      interval: 5s
      start_period: 40s
      timeout: 10s
      retries: 3

volumes:
  vespa_persist:
    driver: local
  vespa_logs:
    driver: local

The locust-worker and locust-master services use pyvespa to create an ApplicationPackage with a defined schema, and then I create a Vespa client with Vespa (from the vespa.application module). When I call feed_iterable, I get the following errors in my locust-worker application.

locust-worker-1  | [2024-01-18 07:54:39,034] 952dbfda1623/WARNING/urllib3.connectionpool: Retrying (Retry(total=1, connect=None, read=None, redirect=None, status=None)) after connection broken by 'NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f4bcba6b050>: Failed to establish a new connection: [Errno 111] Connection refused')': /document/v1/benchmarktest/vector_store_benchmark/docid/3586743025

And here is the full vespa log:

2024-01-18 04:54:08 runserver(configserver) running with pid: 16
2024-01-18 04:54:08 Starting config proxy using tcp/vespa:19070 as config source(s)
2024-01-18 04:54:08 Waiting for config proxy to start
2024-01-18 04:54:08 runserver(configproxy) running with pid: 69
2024-01-18 04:54:09 config proxy started after 0s (runserver pid 69)
2024-01-18 04:54:09 runserver(config-sentinel) running with pid: 146
2024-01-18 04:54:09 [2024-01-18 07:54:08.900] INFO    configserver     just-start-configserver  JVM env:  LD_LIBRARY_PATH=/opt/vespa/lib64:/opt/vespa-deps/lib64 MALLOC_ARENA_MAX=1 VESPA_LOG_CONTROL_DIR=/opt/vespa/var/db/vespa/logcontrol VESPA_LOG_CONTROL_FILE=/opt/vespa/var/db/vespa/logcontrol/configserver.logcontrol VESPA_LOG_TARGET=file:/opt/vespa/logs/vespa/vespa.log VESPA_SERVICE_NAME=configserver standalone_jdisc_container__app_location=/opt/vespa/conf/configserver-app
2024-01-18 04:54:09 [2024-01-18 07:54:08.900] INFO    configserver     just-start-configserver  JVM exec: [java -XX:ActiveProcessorCount=12 -XX:+PreserveFramePointer -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/opt/vespa/var/crash -XX:ErrorFile=/opt/vespa/var/crash/hs_err_pid%p.log -XX:+ExitOnOutOfMemoryError -XX:MaxJavaStackTraceDepth=1000000 -XX:-OmitStackTraceInFastThrow --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/jdk.internal.loader=ALL-UNNAMED --add-opens=java.base/sun.security.ssl=ALL-UNNAMED -Djava.io.tmpdir=/opt/vespa/var/tmp -Djava.library.path=/opt/vespa/lib64:/opt/vespa-deps/lib64 -Djava.security.properties=/opt/vespa/conf/vespa/java.security.override -Djava.awt.headless=true -Dsun.rmi.dgc.client.gcInterval=3600000 -Dsun.net.client.defaultConnectTimeout=5000 -Dsun.net.client.defaultReadTimeout=60000 -Djavax.net.ssl.keyStoreType=JKS -Djdk.tls.rejectClientInitiatedRenegotiation=true -Dfile.encoding=UTF-8 -Dorg.apache.commons.logging.Log=org.apache.commons.logging.impl.Jdk14Logger -Djdisc.bundle.path=/opt/vespa/lib/jars -Djdisc.logger.enabled=false -Djdisc.logger.level=WARNING -Dvespa.log.control.dir=/opt/vespa/var/db/vespa/logcontrol -Djdisc.export.packages= -Djdisc.config.file=/opt/vespa/var/jdisc_container/configserver.properties -Djdisc.cache.path=/opt/vespa/var/vespa/bundlecache/configserver -Djdisc.logger.tag=configserver -Dzookeeper_log_file_prefix=/opt/vespa/logs/vespa/zookeeper.configserver -Xms32M -Xmx128M -XX:+UseTransparentHugePages -cp /opt/vespa/lib/jars/jdisc_core-jar-with-dependencies.jar com.yahoo.jdisc.core.StandaloneMain standalone-container-jar-with-dependencies.jar]
2024-01-18 04:54:09 [2024-01-18 07:54:08.976] INFO    configproxy      just-run-configproxy     JVM env:  LD_LIBRARY_PATH=/opt/vespa/lib64:/opt/vespa-deps/lib64 MALLOC_ARENA_MAX=1 VESPA_LOG_CONTROL_DIR=/opt/vespa/var/db/vespa/logcontrol VESPA_LOG_CONTROL_FILE=/opt/vespa/var/db/vespa/logcontrol/configproxy.logcontrol VESPA_LOG_TARGET=file:/opt/vespa/logs/vespa/vespa.log VESPA_SERVICE_NAME=configproxy
2024-01-18 04:54:09 [2024-01-18 07:54:08.976] INFO    configproxy      just-run-configproxy     JVM exec: [java -XX:+ExitOnOutOfMemoryError -XX:+PreserveFramePointer -XX:CompressedClassSpaceSize=32m -XX:MaxDirectMemorySize=32m -XX:ThreadStackSize=448 -XX:MaxJavaStackTraceDepth=1000 -XX:-OmitStackTraceInFastThrow -XX:ActiveProcessorCount=2 -Dproxyconfigsources=tcp/vespa:19070 -Djava.io.tmpdir=${VESPA_HOME}/var/tmp -Xms32M -Xmx32M -XX:+UseTransparentHugePages -cp /opt/vespa/lib/jars/config-proxy-jar-with-dependencies.jar com.yahoo.vespa.config.proxy.ProxyServer 19090]
2024-01-18 04:54:09 [2024-01-18 07:54:09.296] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.RpcConfigSourceClient   Could not connect to config source at tcp/vespa:19070
2024-01-18 04:54:09 [2024-01-18 07:54:09.299] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.RpcConfigSourceClient   Could not connect to any config source in set [tcp/vespa:19070], please make sure config server(s) are running.
2024-01-18 04:54:09 [2024-01-18 07:54:09.354] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.filedistribution.FileReferencesAndDownloadsMaintainer   Not running maintainer, since this is on a config server host
2024-01-18 04:54:12 [2024-01-18 07:54:12.446] WARNING configserver     Container.DeployLogger   Host named 'vespa' may not receive any config since it differs from its canonical hostname '390389d3f0d5' (check DNS and /etc/hosts).
2024-01-18 04:54:12 [2024-01-18 07:54:12.973] INFO    configserver     Container.com.yahoo.container.core.config.HandlersConfigurerDi   Installing bundles for application generation 0
2024-01-18 04:54:13 [2024-01-18 07:54:13.945] INFO    configserver     Container.com.yahoo.container.core.config.ApplicationBundleLoader        Installed bundles: {[0]org.apache.felix.framework:7.0.5, [1]standalone-container:8.282.24, [2]configdefinitions:8.282.24, [3]config-provisioning:8.282.24, [4]config-bundle:8.282.24, [5]config-model-api:8.282.24, [6]config-model:8.282.24, [7]container-disc:8.282.24, [8]hosted-zone-api:8.282.24, [9]container-apache-http-client-bundle:8.282.24, [10]security-utils:8.282.24, [11]bcprov:1.76.0, [12]bcpkix:1.76.0, [13]bcutil:1.76.0, [14]com.fasterxml.jackson.core.jackson-annotations:2.16.1, [15]com.fasterxml.jackson.core.jackson-core:2.16.1, [16]com.fasterxml.jackson.core.jackson-databind:2.16.1, [17]com.fasterxml.jackson.datatype.jackson-datatype-jdk8:2.16.1, [18]com.fasterxml.jackson.datatype.jackson-datatype-jsr310:2.16.1, [19]javax.ws.rs-api:2.1.99.b01, [20]container-spifly:1.3.7, [21]javax.servlet-api:3.1.0, [22]container-search-and-docproc:8.282.24, [23]linguistics-components:8.282.24, [24]model-evaluation:8.282.24, [25]model-integration:8.282.24, [26]container-onnxruntime:8.282.24, [27]jdisc-security-filters:8.282.24, [28]vespa-athenz:8.282.24, [29]zkfacade:8.282.24, [30]zookeeper-server:8.282.24, [31]configserver:8.282.24, [32]config-model-fat:8.282.24, [33]flags:8.282.24, [34]http-client:8.282.24, [35]node-repository:8.282.24, [36]application-model:8.282.24, [37]orchestrator:8.282.24, [38]service-monitor:8.282.24, [39]configserver-flags:8.282.24}
2024-01-18 04:54:18 [2024-01-18 07:54:18.922] INFO    configproxy      configproxy.com.yahoo.vespa.config.JRTConnection Connecting to tcp/vespa:19070
2024-01-18 04:54:21 [2024-01-18 07:54:21.158] INFO    configserver     Container.com.yahoo.container.handler.threadpool.ContainerThreadpoolImpl Threadpool 'default-pool': min=24, max=1200, queue=0
2024-01-18 04:54:21 [2024-01-18 07:54:21.347] INFO    configserver     Container.com.yahoo.container.jdisc.state.StateMonitor   Changing health status code from 'initializing' to 'up'
2024-01-18 04:54:21 [2024-01-18 07:54:21.462] INFO    configserver     Container.com.yahoo.jdisc.http.server.jetty.JettyHttpServer      Threadpool size: min=28, max=28
2024-01-18 04:54:21 [2024-01-18 07:54:21.854] INFO    configserver     Container.com.yahoo.container.handler.threadpool.ContainerThreadpoolImpl Threadpool 'default-handler-common': min=24, max=24, queue=960
2024-01-18 04:54:22 [2024-01-18 07:54:22.097] INFO    configserver     Container.com.yahoo.container.jdisc.ConfiguredApplication        Switching to the latest deployed set of configurations and components. Application config generation: 0
2024-01-18 04:54:25 [2024-01-18 07:54:25.313] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.DelayedResponseHandler  Timed out (timeout 15000) getting config name=cloud.config.sentinel,configId=hosts/vespa, will retry
2024-01-18 04:54:29 [2024-01-18 07:54:29.740] INFO    config-sentinel  sentinel.config.frt.frtconfigagent       No response / error from config server. This is normal before an application package is deployed. (key: name=cloud.config.sentinel,configId=hosts/vespa) (errcode=103, validresponse:0), trying again in 6.000000 seconds
2024-01-18 04:54:35 [2024-01-18 07:54:35.440] INFO    configproxy      configproxy.com.yahoo.vespa.config.JRTConnection Connecting to tcp/vespa:19070
2024-01-18 04:54:35 [2024-01-18 07:54:35.475] INFO    configproxy      configproxy.com.yahoo.config.subscription.impl.JRTConfigRequester        Request failed: Failed request (No application exists) from Connection { Socket[addr=/172.21.0.2,port=57312,localport=19070] }\nConnection spec: tcp/vespa:19070
2024-01-18 04:54:53 [2024-01-18 07:54:53.640] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.RpcConfigSourceClient   Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa,aa9115ebf63cfc6721f75ada21d7cdfa' failed, closing subscriber: Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa' timed out (timeout was 44000 ms): name=cloud.config.sentinel,configId=hosts/vespa, Current generation: 0, Generation changed: false, Config changed: false
2024-01-18 04:54:53 [2024-01-18 07:54:53.648] INFO    configproxy      configproxy.com.yahoo.config.subscription.impl.JRTConfigRequester        Request failed: Failed request (No application exists) from Connection { Socket[addr=/172.21.0.2,port=57312,localport=19070] }\nConnection spec: tcp/vespa:19070
2024-01-18 04:55:01 [2024-01-18 07:55:01.313] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.DelayedResponseHandler  Timed out (timeout 25000) getting config name=cloud.config.sentinel,configId=hosts/vespa, will retry
2024-01-18 04:55:06 [2024-01-18 07:55:06.163] INFO    config-sentinel  sentinel.config.frt.frtconfigagent       No response / error from config server. This is normal before an application package is deployed. (key: name=cloud.config.sentinel,configId=hosts/vespa) (errcode=103, validresponse:0), trying again in 7.000000 seconds
2024-01-18 04:55:37 [2024-01-18 07:55:37.653] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.RpcConfigSourceClient   Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa,aa9115ebf63cfc6721f75ada21d7cdfa' failed, closing subscriber: Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa' timed out (timeout was 44000 ms): name=cloud.config.sentinel,configId=hosts/vespa, Current generation: 0, Generation changed: false, Config changed: false
2024-01-18 04:55:37 [2024-01-18 07:55:37.657] INFO    configproxy      configproxy.com.yahoo.config.subscription.impl.JRTConfigRequester        Request failed: Failed request (No application exists) from Connection { Socket[addr=/172.21.0.2,port=57312,localport=19070] }\nConnection spec: tcp/vespa:19070
2024-01-18 04:55:38 [2024-01-18 07:55:38.314] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.DelayedResponseHandler  Timed out (timeout 25000) getting config name=cloud.config.sentinel,configId=hosts/vespa, will retry
2024-01-18 04:55:43 [2024-01-18 07:55:43.498] INFO    config-sentinel  sentinel.config.frt.frtconfigagent       No response / error from config server. This is normal before an application package is deployed. (key: name=cloud.config.sentinel,configId=hosts/vespa) (errcode=103, validresponse:0), trying again in 8.000000 seconds
2024-01-18 04:56:17 [2024-01-18 07:56:17.315] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.DelayedResponseHandler  Timed out (timeout 25000) getting config name=cloud.config.sentinel,configId=hosts/vespa, will retry
2024-01-18 04:56:21 [2024-01-18 07:56:21.662] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.RpcConfigSourceClient   Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa,aa9115ebf63cfc6721f75ada21d7cdfa' failed, closing subscriber: Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa' timed out (timeout was 44000 ms): name=cloud.config.sentinel,configId=hosts/vespa, Current generation: 0, Generation changed: false, Config changed: false
2024-01-18 04:56:21 [2024-01-18 07:56:21.668] INFO    configproxy      configproxy.com.yahoo.config.subscription.impl.JRTConfigRequester        Request failed: Failed request (No application exists) from Connection { Socket[addr=/172.21.0.2,port=57312,localport=19070] }\nConnection spec: tcp/vespa:19070
2024-01-18 04:56:21 [2024-01-18 07:56:21.926] INFO    config-sentinel  sentinel.config.frt.frtconfigagent       No response / error from config server. This is normal before an application package is deployed. (key: name=cloud.config.sentinel,configId=hosts/vespa) (errcode=103, validresponse:0), trying again in 9.000000 seconds
2024-01-18 04:56:56 [2024-01-18 07:56:56.328] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.DelayedResponseHandler  Timed out (timeout 25000) getting config name=cloud.config.sentinel,configId=hosts/vespa, will retry
2024-01-18 04:57:01 [2024-01-18 07:57:01.283] INFO    config-sentinel  sentinel.config.frt.frtconfigagent       No response / error from config server. This is normal before an application package is deployed. (key: name=cloud.config.sentinel,configId=hosts/vespa) (errcode=103, validresponse:0), trying again in 10.000000 seconds
2024-01-18 04:57:05 [2024-01-18 07:57:05.665] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.RpcConfigSourceClient   Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa,aa9115ebf63cfc6721f75ada21d7cdfa' failed, closing subscriber: Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa' timed out (timeout was 44000 ms): name=cloud.config.sentinel,configId=hosts/vespa, Current generation: 0, Generation changed: false, Config changed: false
2024-01-18 04:57:05 [2024-01-18 07:57:05.670] INFO    configproxy      configproxy.com.yahoo.config.subscription.impl.JRTConfigRequester        Request failed: Failed request (No application exists) from Connection { Socket[addr=/172.21.0.2,port=57312,localport=19070] }\nConnection spec: tcp/vespa:19070
2024-01-18 04:57:09 [2024-01-18 07:57:09.495] WARNING config-sentinel  sentinel.config-sentinel Timeout getting config, please check your setup. Will exit and restart: Timed out while subscribing to 'cloud.config.sentinel', configid 'hosts/vespa'
2024-01-18 04:57:09 [2024-01-18 07:57:09.609] INFO    config-sentinel  runserver        will restart in 0 seconds
2024-01-18 04:57:25 [2024-01-18 07:57:25.332] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.DelayedResponseHandler  Timed out (timeout 15000) getting config name=cloud.config.sentinel,configId=hosts/vespa, will retry
2024-01-18 04:57:29 [2024-01-18 07:57:29.931] INFO    config-sentinel  sentinel.config.frt.frtconfigagent       No response / error from config server. This is normal before an application package is deployed. (key: name=cloud.config.sentinel,configId=hosts/vespa) (errcode=103, validresponse:0), trying again in 6.000000 seconds
2024-01-18 04:57:49 [2024-01-18 07:57:49.675] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.RpcConfigSourceClient   Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa,aa9115ebf63cfc6721f75ada21d7cdfa' failed, closing subscriber: Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa' timed out (timeout was 44000 ms): name=cloud.config.sentinel,configId=hosts/vespa, Current generation: 0, Generation changed: false, Config changed: false
2024-01-18 04:57:49 [2024-01-18 07:57:49.679] INFO    configproxy      configproxy.com.yahoo.config.subscription.impl.JRTConfigRequester        Request failed: Failed request (No application exists) from Connection { Socket[addr=/172.21.0.2,port=57312,localport=19070] }\nConnection spec: tcp/vespa:19070
2024-01-18 04:58:01 [2024-01-18 07:58:01.333] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.DelayedResponseHandler  Timed out (timeout 25000) getting config name=cloud.config.sentinel,configId=hosts/vespa, will retry
2024-01-18 04:58:06 [2024-01-18 07:58:06.351] INFO    config-sentinel  sentinel.config.frt.frtconfigagent       No response / error from config server. This is normal before an application package is deployed. (key: name=cloud.config.sentinel,configId=hosts/vespa) (errcode=103, validresponse:0), trying again in 7.000000 seconds
2024-01-18 04:58:33 [2024-01-18 07:58:33.681] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.RpcConfigSourceClient   Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa,aa9115ebf63cfc6721f75ada21d7cdfa' failed, closing subscriber: Subscribe for 'name=cloud.config.sentinel,configId=hosts/vespa' timed out (timeout was 44000 ms): name=cloud.config.sentinel,configId=hosts/vespa, Current generation: 0, Generation changed: false, Config changed: false
2024-01-18 04:58:33 [2024-01-18 07:58:33.686] INFO    configproxy      configproxy.com.yahoo.config.subscription.impl.JRTConfigRequester        Request failed: Failed request (No application exists) from Connection { Socket[addr=/172.21.0.2,port=57312,localport=19070] }\nConnection spec: tcp/vespa:19070
2024-01-18 04:58:39 [2024-01-18 07:58:39.334] INFO    configproxy      configproxy.com.yahoo.vespa.config.proxy.DelayedResponseHandler  Timed out (timeout 25000) getting config name=cloud.config.sentinel,configId=hosts/vespa, will retry
2024-01-18 04:58:43 [2024-01-18 07:58:43.674] INFO    config-sentinel  sentinel.config.frt.frtconfigagent       No response / error from config server. This is normal before an application package is deployed. (key: name=cloud.config.sentinel,configId=hosts/vespa) (errcode=103, validresponse:0), trying again in 8.000000 seconds

I noticed 3 things in the Vespa logs:

  1. I see the line 2024-01-18 04:54:22 [2024-01-18 07:54:22.097] INFO configserver Container.com.yahoo.container.jdisc.ConfiguredApplication Switching to the latest deployed set of configurations and components. Application config generation: 0. I saw a video that mentions this is expected and should be OK.
  2. configproxy keeps retrying to connect to something and never connects.
  3. It seems that the application package has never been deployed.

From this I have a comment about the pyvespa docs: I did not find any documentation that does not use the VespaDocker deployment approach. Is there a way to apply an ApplicationPackage to an existing, locally deployed Vespa with pyvespa?

Also, I tried wait_for_application_up(120) from the Vespa class, but it always throws a timeout error. I also looked through its code for a "deploy" action for the schema, but I was not able to find anything like that.
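
For readers with the same question: a self-hosted config server accepts a zipped application package over HTTP on port 19071 (the port used by the health check in the compose file above), at the prepareandactivate endpoint of the deploy API. The following is a minimal sketch under stated assumptions, not pyvespa's own method: build_deploy_request and send_deploy are hypothetical helper names, the host name vespa comes from the compose file, and the zip file is assumed to have been produced beforehand (recent pyvespa versions appear to offer ApplicationPackage.to_zip() for this, but verify that against the installed version).

```python
import urllib.request


def build_deploy_request(zip_bytes: bytes,
                         config_server: str = "http://vespa:19071") -> urllib.request.Request:
    """Build (but do not send) a deploy request for a zipped application package."""
    url = config_server.rstrip("/") + "/application/v2/tenant/default/prepareandactivate"
    return urllib.request.Request(
        url,
        data=zip_bytes,
        headers={"Content-Type": "application/zip"},
        method="POST",
    )


def send_deploy(zip_path: str, config_server: str = "http://vespa:19071") -> str:
    """Read a zipped application package from disk and post it to the config server."""
    with open(zip_path, "rb") as f:
        request = build_deploy_request(f.read(), config_server)
    # A generous timeout: preparing and activating a package can take a while.
    with urllib.request.urlopen(request, timeout=120) as response:
        return response.read().decode("utf-8")
```

Calling send_deploy("application.zip") from a container on the same Compose network would post the package; only after a successful deploy does the container on port 8080 start accepting /document/v1 requests.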

@ricoms ricoms changed the title Manually deployed vespa with docker compose Manually deployed vespa with docker compose Failed to establish a new connection: [Errno 111] Jan 18, 2024
@kkraune kkraune self-assigned this Jan 24, 2024
@kkraune kkraune added this to the soon milestone Jan 24, 2024
@kkraune
Member

kkraune commented Jan 25, 2024

Hi, sorry for the slow response! The (first) problem to solve is the configproxy not reaching the config server. Please try https://github.com/vespa-engine/sample-apps/blob/master/examples/operations/multinode-HA/docker-compose.yaml and validate that it works - then you can modify the compose file, adding your stuff.

The best place to look is probably network/hostnames. This looks like a connectivity problem, so maybe add a network and use a fully qualified hostname instead of just vespa.
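
One way to narrow down a connectivity problem like this is to probe the relevant ports from inside the client container. The sketch below assumes the ports that appear in this thread (19070 for config server RPC, 19071 for the config server health check, 8080 for the document API); port_is_open and probe are hypothetical helper names.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused ([Errno 111]), timeouts and DNS failures.
        return False


def probe(host: str, ports=(19070, 19071, 8080)) -> dict:
    """Map each port to whether it currently accepts connections."""
    return {port: port_is_open(host, port) for port in ports}
```

If probe("vespa") reports 19071 open but 8080 closed, the config server is reachable and the refused connection is more likely caused by no application being deployed than by the network.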

@Gladiator566

@kkraune Hi, I am trying to use the bge-m3 model for embedding-based hybrid search, and I followed the official tutorial to deploy Vespa in a local Docker container. Since I have millions of documents to feed, I use feed_iterable to feed the data in bulk, and I encountered the same problems as above, such as WARNING/urllib3.connectionpool: Retrying NewConnectionError: Failed to establish a new connection, or Max retries exceeded with URL. I tried setting the max_connections parameter to a huge number and creating a session for feeding, but neither works. How can I solve this connection error so I can bulk insert data into Vespa? Thank you!

My environment: Linux, pyvespa 0.39, latest Docker image.

@kkraune
Member

kkraune commented Feb 5, 2024

Hi @Gladiator566, I think you must look in vespa.log to validate what the problem might be, and then follow the advice to try /multinode-HA/docker-compose.yaml to verify that this works before trying your own configuration.

You can also try https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa-cloud.html to make it easier, using the free trial, to eliminate other failures.

@Gladiator566

@kkraune I tried Vespa Cloud as in the tutorial, but I got an error like RuntimeError: Status code 400 doing POST at https://api.vespa-external.aws.oath.cloud:4443/application/v4/tenant/bge-m3/application/bgeM3/instance/default/deploy/dev-aws-us-east-1c: Value of X-Content-Hash header does not match computed content hash. How can I solve this problem?

@kkraune
Member

kkraune commented Feb 6, 2024

Thanks for reporting. Can you add the steps you took, so we can reproduce? Or did you follow the steps in https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa-cloud.html and it failed here?

 app = vespa_cloud.deploy()

A good hint is also to make sure there are no applications already deployed.

@hmusum I assume this is an error from our API, we should document how to fix this

@Gladiator566

Gladiator566 commented Feb 6, 2024

@kkraune Yes, I followed the exact steps in https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa-cloud.html , concretely as below:

  1. download vespa-cli from github latest release
  2. vespa config set target cloud
  3. vespa config set application bge-m3.bgeM3
  4. vespa auth cert -N
  5. vespa auth api-key
  6. add public api-key to cloud browser key site
  7. vespa_cloud = VespaCloud( tenant=os.environ["TENANT_NAME"], application='bgeM3', key_content=None, key_location=api_key_path, application_package=application_package)

and it failed at
app = vespa_cloud.deploy(). There are no applications already deployed in the cloud.

Thanks.

@kkraune
Member

kkraune commented Feb 6, 2024

Hi again, I tried https://pyvespa.readthedocs.io/en/latest/getting-started-pyvespa-cloud.html and it worked for me. I ran the notebook locally on my laptop. Some ideas:

The .vespa directory in your home dir stores credentials. You can temporarily move it to another name to reset all credentials and try the guide again, with no other changes. You can also delete the api-key in the console and try with a fresh one.

@hmusum
Member

hmusum commented Feb 6, 2024

The "Value of X-Content-Hash header does not match computed content hash" error is due to some misconfiguration or bug on the client side, but it's hard to say what the user should do without knowing the root cause of the error.

@kkraune
Member

kkraune commented Feb 7, 2024

The problem seems to be a mismatch with the hash computed in

https://github.com/vespa-engine/pyvespa/blob/master/vespa/deployment.py#L644

and the validation of this in Vespa Cloud. This is with pyvespa 0.39, which is the latest. We are looking into it.
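
To illustrate what such a mismatch means in general: the client hashes the request body and sends the digest in a header, the server recomputes the digest over the bytes it received, and any difference between the two byte sequences is rejected. The thread does not state how the hash is computed; the sketch below assumes a base64-encoded SHA-256 digest purely for illustration, and content_hash and hash_matches are hypothetical names.

```python
import base64
import hashlib


def content_hash(body: bytes) -> str:
    """Base64-encoded SHA-256 digest of a request body (assumed scheme)."""
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def hash_matches(header_value: str, received_body: bytes) -> bool:
    """Server-side check: recompute the digest and compare it with the header."""
    return header_value == content_hash(received_body)
```

Under this assumption, a mismatch arises whenever the bytes that were hashed differ from the bytes that were sent, for example if the body is re-encoded or rebuilt after the header is computed.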

@kkraune
Member

kkraune commented Feb 8, 2024

@Gladiator566 : I have created vespa-engine/vespa#30219 for the X-Content-Hash issue you reported - this is most likely a different issue than reported by @ricoms here. Thanks for reporting!

@vudangthinh

> @kkraune Hi, I am trying to use the bge-m3 model for embedding-based hybrid search, and I followed the official tutorial to deploy Vespa in a local Docker container. Since I have millions of documents to feed, I use feed_iterable to feed the data in bulk, and I encountered the same problems as above, such as WARNING/urllib3.connectionpool: Retrying NewConnectionError: Failed to establish a new connection, or Max retries exceeded with URL. I tried setting the max_connections parameter to a huge number and creating a session for feeding, but neither works. How can I solve this connection error so I can bulk insert data into Vespa? Thank you!

Hi, I am also encountering the same problem. Do you know how to fix it?

@kkraune
Member

kkraune commented Feb 19, 2024

Hi @vudangthinh! I don't think increasing the number of connections will help; the error message is probably a symptom of a maxed-out instance.

The Vespa CLI (https://docs.vespa.ai/en/vespa-cli.html) has better feed flow control. Can you please try that, see how the feeding goes, and let me know?

@vudangthinh

vudangthinh commented Feb 21, 2024

I tried vespa feed, but the error persists.
At first the indexing went fine, but after feeding many documents the errors started to appear:

feed: got error "Post "http://127.0.0.1:8080/document/v1/benchmark/hybridsearch/docid/2936": write tcp 127.0.0.1:52604->127.0.0.1:8080: write: broken pipe" (no body) for put id:benchmark:hybridsearch::2936: retrying
feed: got error "Post "http://127.0.0.1:8080/document/v1/benchmark/hybridsearch/docid/3283": write tcp 127.0.0.1:52610->127.0.0.1:8080: write: broken pipe" (no body) for put id:benchmark:hybridsearch::3283: retrying
feed: got error "Post "http://127.0.0.1:8080/document/v1/benchmark/hybridsearch/docid/3090": write tcp 127.0.0.1:52622->127.0.0.1:8080: write: broken pipe" (no body) for put id:benchmark:hybridsearch::3090: retrying
feed: got error "Post "http://127.0.0.1:8080/document/v1/benchmark/hybridsearch/docid/3273": write tcp 127.0.0.1:52634->127.0.0.1:8080: write: broken pipe" (no body) for put id:benchmark:hybridsearch::3273: retrying

@kkraune
Member

kkraune commented Feb 21, 2024

OK - can you please check vespa.log inside the Docker container? It could be a resource problem; the log might say.
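
When scanning a long log, it can help to filter out everything except warnings and errors. The following is a small sketch that assumes lines in the format shown in the log excerpt earlier in this thread (bracketed timestamp, level, service, component, message); filter_log is a hypothetical helper, and the ERROR level name is an assumption, since only INFO and WARNING appear in the excerpt.

```python
import re

# Matches e.g. "[2024-01-18 07:54:12.446] WARNING configserver  Container.DeployLogger  Host named ..."
LOG_LINE = re.compile(
    r"\[(?P<time>[^\]]+)\]\s+(?P<level>[A-Z]+)\s+(?P<service>\S+)\s+"
    r"(?P<component>\S+)\s+(?P<message>.*)"
)


def filter_log(lines, levels=("WARNING", "ERROR")):
    """Return (time, service, message) tuples for lines at the given levels."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("level") in levels:
            hits.append((match.group("time"), match.group("service"), match.group("message")))
    return hits
```

The lines can come from a saved copy of the container output, for example filter_log(open("vespa_output.txt")) where the file name is a placeholder.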

@bratseth
Member

You are probably sending more requests than the system can handle in a timely manner, so some of them end up crossing a connection recycling event. These will be retried until timeout, so this is not really an error in itself, but you probably want to increase your resources (maybe run with a GPU) or feed slower. Setting a lower timeout (--timeout) should get rid of these messages and lead to less queuing, which is probably advantageous if you want to determine faster what max throughput you can actually get.
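
"Feed slower" can be approximated on the client side by capping the number of requests in flight and backing off before retrying a failed one. This is a generic sketch, not how pyvespa or the Vespa CLI implement flow control: feed_throttled is a hypothetical helper, and send stands for any caller-supplied function that posts one document and raises an OSError subclass (such as ConnectionRefusedError or BrokenPipeError) on connection failure.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def feed_throttled(docs, send, max_in_flight=4, max_attempts=3, backoff_seconds=0.5):
    """Send docs with at most max_in_flight concurrent requests; return the docs that failed."""
    docs = list(docs)  # materialized so results can be paired with inputs afterwards

    def send_with_retry(doc):
        for attempt in range(1, max_attempts + 1):
            try:
                send(doc)
                return True
            except OSError:
                if attempt == max_attempts:
                    return False
                # Exponential backoff: 1x, 2x, 4x ... the base delay.
                time.sleep(backoff_seconds * 2 ** (attempt - 1))
        return False

    # The worker count is the cap on concurrent requests.
    with ThreadPoolExecutor(max_workers=max_in_flight) as pool:
        results = list(pool.map(send_with_retry, docs))
    return [doc for doc, ok in zip(docs, results) if not ok]
```

Lowering max_in_flight until the failed list stays empty gives a rough idea of the throughput the instance can sustain. For very large data sets the documents should be passed in batches, since this sketch holds the whole input in memory.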
