So I use this in an async FastAPI app. My question is: does text embedding work concurrently, or does it process one request at a time? I see there are options for parallel and for threaded execution in the code. Can these be used to allow vectors to be created concurrently?

I guess my main question is: if I use text embedding and call one of the models, can it process concurrent requests? Does that require a particular setting, or does it literally produce one embedding at a time? I know you can send multiple texts to be vectorized in a single request, but to clarify, what I'm asking is more like: if 10 people were requesting embeddings from the API at once, given the appropriate resources, would it create those embeddings concurrently, or one at a time? And if it can run concurrently, is that the default, or do I have to change or add a setting somewhere?
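For context, here's a minimal sketch of the pattern I'm asking about: offloading a blocking embed call to a worker thread so the event loop can serve overlapping requests. `blocking_embed` is a hypothetical stand-in for the real model call (not the library's actual API), and the sleep just simulates inference time.

```python
import asyncio
import time


# Hypothetical stand-in for a synchronous, blocking embedding call
# (i.e. whatever model.embed(...) looks like in the real library).
def blocking_embed(text: str) -> list[float]:
    time.sleep(0.1)  # simulates model inference time
    return [float(len(text))]  # dummy vector


async def embed_async(text: str) -> list[float]:
    # Run the blocking call in a worker thread so the event loop stays
    # free; inside an `async def` FastAPI endpoint this is the usual way
    # to avoid serializing every request behind one blocking call.
    return await asyncio.to_thread(blocking_embed, text)


async def main() -> tuple[list[list[float]], float]:
    start = time.perf_counter()
    # Four "simultaneous users" requesting embeddings at once.
    results = await asyncio.gather(
        *(embed_async(f"doc {i}") for i in range(4))
    )
    return results, time.perf_counter() - start


results, elapsed = asyncio.run(main())
```

If the four calls ran one at a time, `elapsed` would be around 0.4 s; run concurrently in threads, it stays close to 0.1 s. My question is whether the library's own parallel/threaded options give me this behavior internally, or whether I need to wrap calls like this myself.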