
[BUG] failed to Create documents: failed to insert documents: write tcp 172.18.0.4:48130->172.18.0.3:5432: write: connection reset by peer #218

Closed
byronz3d opened this issue Oct 6, 2023 · 12 comments


byronz3d commented Oct 6, 2023

I am creating a large index using ZepVectorStore with this code:

    from llama_index import VectorStoreIndex, ServiceContext
    from llama_index.llms import OpenAI
    from llama_index.storage.storage_context import StorageContext
    from llama_index.vector_stores import ZepVectorStore

    llm = OpenAI(model="gpt-3.5-turbo-16k", temperature=0.1)
    service_context = ServiceContext.from_defaults(chunk_size=512, llm=llm)

    zep_api_url = "http://localhost:9000"
    collection_name = "collection_name"

    vector_store = ZepVectorStore(
        api_url=zep_api_url, collection_name=collection_name, embedding_dimensions=1536
    )

    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    index = VectorStoreIndex.from_documents(
        documents,
        show_progress=True,
        service_context=service_context,
        storage_context=storage_context,
    )

From the Zep Docker logs:

zep-postgres | 2023-10-06 02:41:38.844 UTC [62] LOG:  invalid message length
zep    | time="2023-10-06T02:41:38Z" level=error msg="failed to Create documents: failed to insert documents: write tcp 172.18.0.4:48130->172.18.0.3:5432: write: connection reset by peer"
zep    | time="2023-10-06T02:41:38Z" level=info msg="http://localhost:9000/api/v1/collection/smconfidential/document" bytes=133 category=router duration=79512706290 duration_display=1m19.512710596s method=POST proto=HTTP/1.1 remote_ip=172.18.0.1 status_code=500

At first it saved the index to Postgres without a problem, but after I added a few more documents it now fails every time.

Can you point me in the right direction to solve this issue?

Thanks in advance.

byronz3d changed the title to "[BUG] failed to Create documents: failed to insert documents: write tcp 172.18.0.4:48130->172.18.0.3:5432: write: connection reset by peer" Oct 6, 2023

danielchalef (Member) commented:

Your Postgres database is running out of memory and crashing. Please see: https://docs.getzep.com/deployment/production/#database-requirements

danielchalef closed this as not planned (won't fix / can't repro / duplicate / stale) Oct 6, 2023

byronz3d commented Oct 7, 2023

I still have the same issue.

On a 32GB cloud VPS, I added the Postgres command settings below to docker-compose.yaml, giving 25GB to maintenance_work_mem, but it still fails with the same error:

version: "3.7"
services:
  db:
    image: ghcr.io/getzep/postgres:latest
    container_name: zep-postgres
    restart: on-failure
    environment:
      - POSTGRES_USER=postgres
      - POSTGRES_PASSWORD=postgres
    command: |
      postgres
        -c maintenance_work_mem=25GB
        -c max_parallel_maintenance_workers=1000
        -c work_mem=10GB
        -c shared_buffers=5GB

Running "show all;" in psql confirms my settings are in effect:

postgres=# show all;

maintenance_work_mem | 25GB
max_parallel_maintenance_workers | 1000
work_mem | 10GB

How much memory do I need to create the index, and are there other settings I should consider changing?

Thanks @danielchalef in advance!

danielchalef (Member) commented:

Would you please share your Postgres logs?

A max_parallel_maintenance_workers of 1000 is far too high. Each worker requests memory and CPU: at worst you'll OOM, and at best you'll see heavy contention over the CPU, resulting in very poor performance.

You may find this tool useful: https://pgtune.leopard.in.ua/
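[Editor's note: as a purely illustrative starting point (these are common rules of thumb for a 32 GB host, not pgtune's actual output; run pgtune against your own hardware), a far more conservative docker-compose command block than the one above might look like:

```yaml
command: |
  postgres
    -c shared_buffers=8GB                  # ~25% of RAM is a common starting point
    -c work_mem=64MB                       # allocated per sort/hash op, per connection; keep modest
    -c maintenance_work_mem=2GB
    -c max_parallel_maintenance_workers=4  # bound this by CPU cores, not 1000
```

Because work_mem can be allocated several times per query and per connection, large values like 10GB multiply quickly under load.]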


byronz3d commented Oct 7, 2023

I used the pgtune tool. I now have a dedicated server with 64GB of RAM and applied the settings pgtune suggested, but I still get the error. This is what Postgres logged:

zep-postgres | 2023-10-07 20:51:22.960 UTC [61] LOG:  statement: SELECT "dc"."uuid", "dc"."created_at", "dc"."updated_at", "dc"."name", "dc"."description", "dc"."metadata", "dc"."table_name", "dc"."embedding_model_name", "dc"."embedding_dimensions", "dc"."is_auto_embedded", "dc"."distance_function", "dc"."is_normalized", "dc"."is_indexed", "dc"."index_type", "dc"."list_count", "dc"."probe_count", "dc"."document_count", "dc"."document_embedded_count" FROM "document_collection" AS "dc" WHERE (name = 'smconfidential')
zep-postgres | 2023-10-07 20:51:22.960 UTC [61] LOG:  statement: SELECT count(*) as document_count, COUNT(*) FILTER (WHERE is_embedded) as document_embedded_count FROM "docstore_smconfidential_1536"
zep-postgres | 2023-10-07 20:51:22.961 UTC [61] LOG:  statement: SELECT "dc"."uuid", "dc"."created_at", "dc"."updated_at", "dc"."name", "dc"."description", "dc"."metadata", "dc"."table_name", "dc"."embedding_model_name", "dc"."embedding_dimensions", "dc"."is_auto_embedded", "dc"."distance_function", "dc"."is_normalized", "dc"."is_indexed", "dc"."index_type", "dc"."list_count", "dc"."probe_count", "dc"."document_count", "dc"."document_embedded_count" FROM "document_collection" AS "dc" WHERE (name = 'smconfidential')
zep-postgres | 2023-10-07 20:51:22.962 UTC [61] LOG:  statement: SELECT count(*) as document_count, COUNT(*) FILTER (WHERE is_embedded) as document_embedded_count FROM "docstore_smconfidential_1536"
zep-postgres | 2023-10-07 20:51:41.707 UTC [61] LOG:  invalid message length
zep    | time="2023-10-07T20:51:41Z" level=error msg="failed to Create documents: failed to insert documents: write tcp 172.25.0.4:58780->172.25.0.2:5432: write: connection reset by peer"
zep    | time="2023-10-07T20:51:41Z" level=info msg="http://localhost:9000/api/v1/collection/smconfidential/document" bytes=133 category=router duration=64183473814 duration_display=1m4.183474414s method=POST proto=HTTP/1.1 remote_ip=172.25.0.1 status_code=500

danielchalef (Member) commented:

How large are your document batches and how large are the documents you're uploading?


byronz3d commented Oct 8, 2023

It's a combination of about 200 PDF files and 3,500+ JSON text files. Some of the PDFs are 5-8MB.

When I save the index to disk instead, the vector_store JSON file is about 2.7GB.

danielchalef (Member) commented:

How are you chunking the documents? And are you batching your uploads?


byronz3d commented Oct 8, 2023

I am using llama_index, if that helps; I'm not sure what batching my uploads means. I use modified versions of RemoteReader and JsonReader from llama_index.

This is a simplified version of my code, taken from the llama_index examples:

    from llama_index import VectorStoreIndex, ServiceContext
    from llama_index.llms import OpenAI

    llm = OpenAI(model="gpt-3.5-turbo-16k", temperature=0.2)
    service_context = ServiceContext.from_defaults(chunk_size=512, llm=llm)
    index = VectorStoreIndex.from_documents(documents, show_progress=True, service_context=service_context)
    index.storage_context.persist(persist_dir='./storage')

This will save the index in storage dir.

For Zep API, I changed it to:

    from llama_index import VectorStoreIndex, ServiceContext
    from llama_index.llms import OpenAI
    from llama_index.storage.storage_context import StorageContext
    from llama_index.vector_stores import ZepVectorStore

    zep_api_url = "http://localhost:9000"
    collection_name = "smconfidential"

    vector_store = ZepVectorStore(
        api_url=zep_api_url, collection_name=collection_name, embedding_dimensions=1536
    )

    storage_context = StorageContext.from_defaults(vector_store=vector_store)

    llm = OpenAI(model="gpt-3.5-turbo-16k", temperature=0.2)
    service_context = ServiceContext.from_defaults(chunk_size=512, llm=llm)

    index = VectorStoreIndex.from_documents(
        documents,
        show_progress=True,
        service_context=service_context,
        storage_context=storage_context,
    )
    index.storage_context.persist(persist_dir='./storage')

Sorry if I'm not answering the question properly. I am pretty new to Zep, LlamaIndex and chatbots in general, so I don't understand all the processes/terminology yet.

Thanks again.

danielchalef (Member) commented:

I've updated the zep-python client to do the document batching. Would you please do the following:

Update Zep:

docker compose down
docker compose pull
docker compose up

Update zep-python:

pip install -U zep-python

or use the Python package manager of your choice.
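[Editor's note: for readers who hit this before the client update, "batching" here means splitting the document list into smaller chunks and uploading each chunk in its own request, so no single insert overwhelms the Postgres wire protocol. A minimal sketch of the idea, where the batch size and the commented-out `collection.add_documents` call are illustrative assumptions rather than the exact zep-python API:

```python
def batched(items, batch_size=500):
    """Yield successive fixed-size batches from a list."""
    for i in range(0, len(items), batch_size):
        yield items[i:i + batch_size]

# Illustrative usage: upload each batch as its own request instead of
# sending one huge payload. `collection.add_documents` is a hypothetical
# stand-in for whatever client call performs the upload.
# for batch in batched(documents, batch_size=500):
#     collection.add_documents(batch)
```

The updated zep-python client does this splitting internally, which is why upgrading resolves the oversized-write failure.]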


byronz3d commented Oct 9, 2023

It works when building the index now. Thank you, @danielchalef!

danielchalef (Member) commented:

Great to hear!
