RuntimeError: Already borrowed #1229
Haven't seen this one before. Can you share more of the logs? Is it in any way related to the limitation on concurrent requests (#1228)?
Yes, it is linked to the concurrent requests (#1228); it's actually part of a stress test. Here is what I see in my logs when I try simultaneous requests:

haystack-api_1 | [2021-06-28 13:32:49 +0000] [11] [ERROR] Exception in ASGI application
haystack-api_1 | Traceback (most recent call last):
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/uvicorn/protocols/http/httptools_impl.py", line 371, in run_asgi
haystack-api_1 | result = await app(self.scope, self.receive, self.send)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/uvicorn/middleware/proxy_headers.py", line 59, in __call__
haystack-api_1 | return await self.app(scope, receive, send)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/fastapi/applications.py", line 199, in __call__
haystack-api_1 | await super().__call__(scope, receive, send)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/applications.py", line 112, in __call__
haystack-api_1 | await self.middleware_stack(scope, receive, send)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/middleware/errors.py", line 181, in __call__
haystack-api_1 | raise exc from None
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/middleware/errors.py", line 159, in __call__
haystack-api_1 | await self.app(scope, receive, _send)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/middleware/cors.py", line 78, in __call__
haystack-api_1 | await self.app(scope, receive, send)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/exceptions.py", line 82, in __call__
haystack-api_1 | raise exc from None
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/exceptions.py", line 71, in __call__
haystack-api_1 | await self.app(scope, receive, sender)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/routing.py", line 580, in __call__
haystack-api_1 | await route.handle(scope, receive, send)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/routing.py", line 241, in handle
haystack-api_1 | await self.app(scope, receive, send)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/routing.py", line 52, in app
haystack-api_1 | response = await func(request)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/fastapi/routing.py", line 215, in app
haystack-api_1 | dependant=dependant, values=values, is_coroutine=is_coroutine
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/fastapi/routing.py", line 151, in run_endpoint_function
haystack-api_1 | return await run_in_threadpool(dependant.call, **values)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/starlette/concurrency.py", line 40, in run_in_threadpool
haystack-api_1 | return await loop.run_in_executor(None, func, *args)
haystack-api_1 | File "/usr/local/lib/python3.7/concurrent/futures/thread.py", line 57, in run
haystack-api_1 | result = self.fn(*self.args, **self.kwargs)
haystack-api_1 | File "/home/user/rest_api/controller/search.py", line 55, in query
haystack-api_1 | result = _process_request(PIPELINE, request)
haystack-api_1 | File "/home/user/rest_api/controller/search.py", line 75, in _process_request
haystack-api_1 | top_k_reader=request.top_k_reader)
haystack-api_1 | File "/home/user/haystack/pipeline.py", line 134, in run
haystack-api_1 | raise Exception(f"Exception while running node `{node_id}` with input `{node_input}`: {e}, full stack trace: {tb}")
haystack-api_1 | Exception: Exception while running node `TitleEmbRetriever` with input `{'pipeline_type': 'Query', 'query': 'document administratif ?', 'filters': {}, 'top_k_retriever': None, 'top_k_reader': None, 'node_id': 'TitleEmbRetriever'}`: Already borrowed, full stack trace: Traceback (most recent call last):
haystack-api_1 | File "/home/user/haystack/pipeline.py", line 131, in run
haystack-api_1 | node_output, stream_id = self.graph.nodes[node_id]["component"].run(**node_input)
haystack-api_1 | File "/home/user/haystack/retriever/base.py", line 181, in run
haystack-api_1 | output, stream = run_query_timed(**kwargs)
haystack-api_1 | File "/home/user/haystack/retriever/base.py", line 43, in wrapper
haystack-api_1 | ret = fn(*args, **kwargs)
haystack-api_1 | File "/home/user/haystack/retriever/base.py", line 198, in run_query
haystack-api_1 | documents = self.retrieve(query=query, filters=filters, top_k=top_k_retriever, index=index)
haystack-api_1 | File "/home/user/haystack/retriever/dense.py", line 513, in retrieve
haystack-api_1 | query_emb = self.embed_queries(texts=[query])
haystack-api_1 | File "/home/user/haystack/retriever/dense.py", line 529, in embed_queries
haystack-api_1 | return self.embedding_encoder.embed_queries(texts)
haystack-api_1 | File "/home/user/haystack/retriever/dense.py", line 650, in embed_queries
haystack-api_1 | return self.embed(texts)
haystack-api_1 | File "/home/user/haystack/retriever/dense.py", line 645, in embed
haystack-api_1 | emb = self.embedding_model.encode(texts, batch_size=200, show_progress_bar=self.show_progress_bar)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/sentence_transformers/SentenceTransformer.py", line 156, in encode
haystack-api_1 | features = self.tokenize(sentences_batch)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/sentence_transformers/SentenceTransformer.py", line 309, in tokenize
haystack-api_1 | return self._first_module().tokenize(texts)
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/sentence_transformers/models/Transformer.py", line 100, in tokenize
haystack-api_1 | output.update(self.tokenizer(*to_tokenize, padding=True, truncation='longest_first', return_tensors="pt", max_length=self.max_seq_length))
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 2305, in __call__
haystack-api_1 | **kwargs,
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 2490, in batch_encode_plus
haystack-api_1 | **kwargs,
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/transformers/tokenization_utils_fast.py", line 382, in _batch_encode_plus
haystack-api_1 | pad_to_multiple_of=pad_to_multiple_of,
haystack-api_1 | File "/usr/local/lib/python3.7/site-packages/transformers/tokenization_utils_fast.py", line 335, in set_truncation_and_padding
haystack-api_1 | self._tokenizer.enable_truncation(max_length, stride=stride, strategy=truncation_strategy.value)
haystack-api_1 | RuntimeError: Already borrowed
It could be related to the Transformers library, but I haven't dug that far yet :)
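For anyone trying to understand the error itself, here is a toy model of the failure mode. This is an assumed mechanism, not the actual tokenizers code: the Rust-backed fast tokenizer takes an exclusive "borrow" on its internal state while reconfiguring truncation/padding, and a second thread entering at the same time hits "Already borrowed". The `FakeFastTokenizer` class below is purely illustrative.

```python
import time

class FakeFastTokenizer:
    """Toy stand-in for a fast tokenizer that cannot be borrowed twice."""

    def __init__(self):
        self._borrowed = False  # models the exclusive Rust RefCell borrow

    def __call__(self, texts):
        if self._borrowed:
            # This mirrors the RuntimeError seen in the traceback above.
            raise RuntimeError("Already borrowed")
        self._borrowed = True
        try:
            time.sleep(0.01)  # simulate tokenization work
            return [t.split() for t in texts]
        finally:
            self._borrowed = False
```

With a single caller everything works; the error only appears when a second caller arrives while the first still holds the borrow.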
It seems related to UKPLab/sentence-transformers#794 and huggingface/tokenizers#537. I'm digging into the haystack code to see if we can instantiate the tokenizer in each FastAPI thread instead of globally, as seems to be done right now.
@MichaelBitard: If you find a solution I'd be highly interested as well. Loading the tokenizer on each call doesn't work for me, because it takes between 20 ms and 1 s 👍
I found a really ugly workaround. In rest_api/controller/search.py I replaced the PIPELINE and /query code with:

from random import randrange
from threading import Lock

NUMBER_OF_PIPELINES = 6

PIPELINES = []
MUTEXES = []
for i in range(NUMBER_OF_PIPELINES):
    logger.info(f"Loading pipeline: {i}")
    PIPELINES.append(Pipeline.load_from_yaml(Path(PIPELINE_YAML_PATH), pipeline_name=QUERY_PIPELINE_NAME))
    MUTEXES.append(Lock())
    logger.info(f"Loaded pipeline nodes: {PIPELINES[i].graph.nodes.keys()}")
PIPELINE = PIPELINES[0]

concurrency_limiter = RequestLimiter(CONCURRENT_REQUEST_PER_WORKER)

@router.post("/query", response_model=Response)
def query(request: Request):
    logger.error("QUERY got request")
    with concurrency_limiter.run():
        pipeline_to_use = randrange(NUMBER_OF_PIPELINES)
        logger.error(f"QUERY Running on pipeline {pipeline_to_use}")
        MUTEXES[pipeline_to_use].acquire()
        try:
            logger.error(f"QUERY concurrency limit not reached, using pipeline: {pipeline_to_use}")
            result = _process_request(PIPELINES[pipeline_to_use], request)
            logger.error("QUERY request processed")
            return result
        finally:
            MUTEXES[pipeline_to_use].release()

I needed a quick fix because my client has some tests coming soon, but that's really not a valid workaround.
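A possibly cleaner variant of the same pool-of-pipelines idea (a hypothetical sketch, not the actual haystack API): keep the pipeline copies in a `queue.Queue` and block until one is free, instead of picking one at random and then waiting on its mutex. `make_pool` and `run_with_pooled` are names invented for this sketch.

```python
import queue

def make_pool(factory, size):
    # Build `size` independent instances up front and store them in a
    # thread-safe queue that acts as the free list.
    pool = queue.Queue()
    for _ in range(size):
        pool.put(factory())
    return pool

def run_with_pooled(pool, fn):
    # Blocks until some instance is available, so a burst of requests
    # queues up instead of colliding on a randomly chosen instance.
    item = pool.get()
    try:
        return fn(item)
    finally:
        pool.put(item)  # always return the instance to the pool
```

The random-index approach can make a request wait on a busy pipeline while another sits idle; the queue hands out whichever instance frees up first.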
Thank you very much for posting that, @MichaelBitard! Will try it out.
This quick fix works in my case. Not sure we can close this issue, though...
Let's keep it open until we have a nicer fix.
A nicer way to fix this would be to implement the mutex around the tokenizer instead of duplicating the whole pipeline. But the real fix should be made in the tokenizers library.
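That mutex-around-the-tokenizer approach could look roughly like the wrapper below. This is a sketch under assumptions: `LockedEncoder` is an invented name, and `model` stands for any object with an `encode()` method, such as the sentence-transformers model used by the retriever in the traceback above.

```python
import threading

class LockedEncoder:
    """Serialize concurrent encode() calls on a shared embedding model.

    The fast tokenizer mutates shared state (truncation/padding settings)
    on every call, so only one thread may use it at a time.
    """

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def encode(self, texts, **kwargs):
        # Only the encode call is locked, so the rest of the pipeline
        # (retrieval, reading) can still run concurrently.
        with self._lock:
            return self._model.encode(texts, **kwargs)
```

This keeps a single pipeline in memory, at the cost of serializing only the tokenization/embedding step rather than the whole request.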
Can confirm that wrapping the transformers model in a mutex acquire/release block does the trick. Thanks @MichaelBitard
Looking at huggingface/tokenizers#537, it seems like the issue was mitigated (but not fully solved) in Can anybody confirm this is the case?
Closing for now. If anybody spots this error again, please open a new issue specifying which
Question

I am encountering a strange issue when I try to send multiple requests at the same time (I am doing a sort of stress test). I see this in the Haystack FastAPI logs:

RuntimeError: Already borrowed

Has anyone encountered this before?

Additional context

I am using a single machine: 32 GB of RAM, CPU only, no GPU.