Hi,
I've been trying to integrate PostgresML into a FastAPI backend using Korvus, but I have been running into issues with GPU resources. The GPU I am using is a mobile NVIDIA GeForce RTX 2060 with 6 GB of VRAM.
On initialisation of the backend, I create a collection and a pipeline using the classes that Korvus provides (they are in separate functions, but I'll show them in order of execution):
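from korvus import Collection, Pipeline

vector_collection = Collection("file")
vector_pipeline = Pipeline(
    "splitter",
    {
        "file_contents": {
            # Split each document's file_contents into chunks, then embed the chunks
            "splitter": {"model": "recursive_character"},
            "semantic_search": {
                "model": "mixedbread-ai/mxbai-embed-large-v1",
            },
        },
    },
)
# Runs inside an async startup function
await vector_collection.add_pipeline(vector_pipeline)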
Every time I start up my Docker Compose stack (which includes a postgresml service and a Python FastAPI service), my GPU usage climbs to about 93% and then an OutOfMemory exception is thrown in the backend. After I restart the postgresml service, the GPU usage drops back down, and if I then restart the FastAPI service, it starts up fine. Occasionally the GPU usage maxes out again and I need to restart the postgresml service once more, but after that the endpoints that upsert documents and run vector searches work perfectly fine. Considering all of this, I am wondering whether the issue could be related to garbage collection, since resources that could be freed are not being freed. Are there any workarounds for this, or is my implementation incorrect?
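In case it helps with reproducing this, here is a minimal sketch of how the GPU memory figures could be logged around startup. It assumes `nvidia-smi` is on the PATH of whichever container or host runs it; the function names `parse_gpu_memory` and `query_gpu_memory` are hypothetical helpers, not part of Korvus or PostgresML.

```python
import subprocess


def parse_gpu_memory(csv_line: str) -> tuple[int, int, float]:
    """Parse one 'used, total' line (values in MiB) into (used, total, percent used)."""
    used_str, total_str = (part.strip() for part in csv_line.split(","))
    used, total = int(used_str), int(total_str)
    return used, total, round(100.0 * used / total, 1)


def query_gpu_memory() -> list[tuple[int, int, float]]:
    """Ask nvidia-smi for memory usage; returns one tuple per visible GPU."""
    out = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=memory.used,memory.total",
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return [parse_gpu_memory(line) for line in out.splitlines() if line.strip()]
```

Calling `query_gpu_memory()` before and after `add_pipeline`, and again after restarting the postgresml service, would show whether memory is actually being held between runs.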