Encountered BrokenProcessPool, exiting process #108

@kkyeer

🐛 Describe the bug

I encountered an error while running the program to process PDFs: "CRITICAL - Encountered BrokenProcessPool, exiting process.", followed by the error stack below. I did some research and tried switching machines and reinstalling the dependencies, but neither resolved it. My guess is that something is wrong with my PDF file (although it opens fine in a PDF reader), but I'm not sure, so I would appreciate some help.
Here is the error stack:
CRITICAL - Encountered BrokenProcessPool, exiting process.
2025-03-08 16:15:34,086 - main - INFO - Got cancellation request for SGLang server
ERROR:asyncio:Task was destroyed but it is pending!
task: <Task cancelling name='Task-1' coro=<main() done, defined at /root/uuu/olmocr/olmocr/pipeline.py:896> wait_for=<_GatheringFuture pending cb=[Task.task_wakeup()]> cb=[gather.<locals>._done_callback() at /datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/tasks.py:764]>
ERROR:asyncio:Task exception was never retrieved
future: <Task finished name='Task-52' coro=<worker() done, defined at /root/uuu/olmocr/olmocr/pipeline.py:424> exception=SystemExit(1)>

  + Exception Group Traceback (most recent call last):
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 344, in process_pdf
    | async with asyncio.TaskGroup() as tg:
    | File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/taskgroups.py", line 145, in __aexit__
    | raise me from None
    | ExceptionGroup: unhandled errors in a TaskGroup (7 sub-exceptions)
    +-+---------------- 1 ----------------
    | Traceback (most recent call last):
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 224, in process_page
    | query = await build_page_query(pdf_local_path, page_num, args.target_longest_image_dim, local_anchor_text_len, image_rotation=local_image_rotation)
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 119, in build_page_query
    | image_base64, anchor_text = await asyncio.gather(image_base64, anchor_text) # type: ignore
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
    +---------------- 2 ----------------
    | Traceback (most recent call last):
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 224, in process_page
    | query = await build_page_query(pdf_local_path, page_num, args.target_longest_image_dim, local_anchor_text_len, image_rotation=local_image_rotation)
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 119, in build_page_query
    | image_base64, anchor_text = await asyncio.gather(image_base64, anchor_text) # type: ignore
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
    +---------------- 3 ----------------
    | Traceback (most recent call last):
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 224, in process_page
    | query = await build_page_query(pdf_local_path, page_num, args.target_longest_image_dim, local_anchor_text_len, image_rotation=local_image_rotation)
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 119, in build_page_query
    | image_base64, anchor_text = await asyncio.gather(image_base64, anchor_text) # type: ignore
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
    +---------------- 4 ----------------
    | Traceback (most recent call last):
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 224, in process_page
    | query = await build_page_query(pdf_local_path, page_num, args.target_longest_image_dim, local_anchor_text_len, image_rotation=local_image_rotation)
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 119, in build_page_query
    | image_base64, anchor_text = await asyncio.gather(image_base64, anchor_text) # type: ignore
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
    +---------------- 5 ----------------
    | Traceback (most recent call last):
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 224, in process_page
    | query = await build_page_query(pdf_local_path, page_num, args.target_longest_image_dim, local_anchor_text_len, image_rotation=local_image_rotation)
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 119, in build_page_query
    | image_base64, anchor_text = await asyncio.gather(image_base64, anchor_text) # type: ignore
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
    +---------------- 6 ----------------
    | Traceback (most recent call last):
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 224, in process_page
    | query = await build_page_query(pdf_local_path, page_num, args.target_longest_image_dim, local_anchor_text_len, image_rotation=local_image_rotation)
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 119, in build_page_query
    | image_base64, anchor_text = await asyncio.gather(image_base64, anchor_text) # type: ignore
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
    +---------------- 7 ----------------
    | Traceback (most recent call last):
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 224, in process_page
    | query = await build_page_query(pdf_local_path, page_num, args.target_longest_image_dim, local_anchor_text_len, image_rotation=local_image_rotation)
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | File "/root/uuu/olmocr/olmocr/pipeline.py", line 119, in build_page_query
    | image_base64, anchor_text = await asyncio.gather(image_base64, anchor_text) # type: ignore
    | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    | concurrent.futures.process.BrokenProcessPool: A process in the process pool was terminated abruptly while the future was running or pending.
    +------------------------------------

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/runners.py", line 189, in run
with Runner(debug=debug) as runner:
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/runners.py", line 63, in __exit__
self.close()
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/runners.py", line 71, in close
_cancel_all_tasks(loop)
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/runners.py", line 201, in _cancel_all_tasks
loop.run_until_complete(tasks.gather(*to_cancel, return_exceptions=True))
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/base_events.py", line 641, in run_until_complete
self.run_forever()
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
self._run_once()
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
handle._run()
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/events.py", line 84, in _run
self._context.run(self._callback, *self._args)
File "/root/uuu/olmocr/olmocr/pipeline.py", line 440, in worker
async with asyncio.TaskGroup() as tg:
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/taskgroups.py", line 129, in __aexit__
raise self._base_error
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/base_events.py", line 641, in run_until_complete
self.run_forever()
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/base_events.py", line 608, in run_forever
self._run_once()
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/base_events.py", line 1936, in _run_once
handle._run()
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/asyncio/events.py", line 84, in _run
self._context.run(self._callback, *self._args)
File "/root/uuu/olmocr/olmocr/pipeline.py", line 371, in process_pdf
sys.exit(1)
SystemExit: 1
Exception ignored in: <function _ExecutorManagerThread.__init__.<locals>.weakref_cb at 0x7f27341d02c0>
Traceback (most recent call last):
File "/datadisk/miniconda3/envs/olmocr/lib/python3.11/concurrent/futures/process.py", line 308, in weakref_cb
AttributeError: 'NoneType' object has no attribute 'util'
/datadisk/miniconda3/envs/olmocr/lib/python3.11/multiprocessing/resource_tracker.py:254: UserWarning: resource_tracker: There appear to be 160 leaked semaphore objects to clean up at shutdown
warnings.warn('resource_tracker: There appear to be %d '
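
For context, `BrokenProcessPool` is raised by `concurrent.futures` when a worker process terminates abruptly (for example, it is killed from outside or crashes in native code) rather than raising a normal Python exception. The snippet below is a minimal sketch of how one might narrow down whether a specific page triggers the crash; it is not olmocr code. `render_page` is a hypothetical stand-in for the real per-page work, page 3 is made to crash artificially with `os._exit`, and the `fork` start method assumes a Unix-like system such as the Linux machine in the trace.

```python
import multiprocessing
import os
from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool


def render_page(page_num: int) -> str:
    """Hypothetical stand-in for per-page work (rendering, text extraction)."""
    if page_num == 3:
        # Simulate a hard crash: the process exits immediately without
        # raising a Python exception, as a segfault or OOM kill would.
        os._exit(1)
    return f"page {page_num} ok"


def find_crashing_pages(page_nums):
    """Return the pages whose worker process died abruptly.

    Each page runs in its own single-worker pool, so one crash cannot
    break the pool that the remaining pages depend on.
    """
    ctx = multiprocessing.get_context("fork")  # assumption: Unix-like OS
    crashed = []
    for page_num in page_nums:
        with ProcessPoolExecutor(max_workers=1, mp_context=ctx) as pool:
            try:
                pool.submit(render_page, page_num).result()
            except BrokenProcessPool:
                crashed.append(page_num)
    return crashed


if __name__ == "__main__":
    print(find_crashing_pages(range(1, 6)))
```

In a shared pool, a single dying worker marks the whole pool as broken, which would be consistent with all 7 sub-exceptions above reporting the same error at once. Isolating pages this way only tells you which input kills the worker; the kernel log or the worker's exit status is still needed to tell an out-of-memory kill from a crash in a native library.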

Versions

Python 3.11.11
aiohappyeyeballs==2.5.0
aiohttp==3.11.13
aiosignal==1.3.2
annotated-types==0.7.0
anthropic==0.49.0
anyio==4.8.0
asttokens==3.0.0
attrs==25.1.0
beaker-py==1.34.1
bleach==6.2.0
boto3==1.37.8
botocore==1.37.8
cached_path==1.6.7
cachetools==5.5.2
certifi==2025.1.31
cffi==1.17.1
charset-normalizer==3.4.1
click==8.1.8
cloudpickle==3.1.1
compressed-tensors==0.8.0
cryptography==44.0.2
cuda-bindings==12.8.0
cuda-python==12.8.0
datasets==3.3.2
decorator==5.2.1
decord==0.6.0
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
docker==7.1.0
einops==0.8.1
executing==2.2.0
fastapi==0.115.11
filelock==3.17.0
flashinfer @ file:///root/zcy/olmocr/olmocr/flashinfer-0.1.6%2Bcu124torch2.4-cp311-cp311-linux_x86_64.whl#sha256=19a01e2ec93662bc6b83819daaae277d93e7cc989343c5f8940af44a4cb66ba0
frozenlist==1.5.0
fsspec==2024.12.0
ftfy==6.3.1
gguf==0.10.0
google-api-core==2.24.1
google-auth==2.38.0
google-cloud-core==2.4.2
google-cloud-storage==2.19.0
google-crc32c==1.6.0
google-resumable-media==2.7.2
googleapis-common-protos==1.69.1
h11==0.14.0
hf_transfer==0.1.9
httpcore==1.0.7
httptools==0.6.4
httpx==0.28.1
huggingface-hub==0.27.1
idna==3.10
importlib_metadata==8.6.1
interegular==0.3.3
ipython==9.0.1
ipython_pygments_lexers==1.1.1
jedi==0.19.2
Jinja2==3.1.6
jiter==0.8.2
jmespath==1.0.1
jsonschema==4.23.0
jsonschema-specifications==2024.10.1
lark==1.2.2
lingua-language-detector==2.0.2
litellm==1.63.3
llvmlite==0.44.0
lm-format-enforcer==0.10.11
markdown-it-py==3.0.0
markdown2==2.5.3
MarkupSafe==3.0.2
matplotlib-inline==0.1.7
mdurl==0.1.2
mistral_common==1.5.3
modelscope==1.23.2
mpmath==1.3.0
msgpack==1.1.0
msgspec==0.19.0
multidict==6.1.0
multiprocess==0.70.16
nest-asyncio==1.6.0
networkx==3.4.2
numba==0.61.0
numpy==1.26.4
nvidia-cublas-cu12==12.4.5.8
nvidia-cuda-cupti-cu12==12.4.127
nvidia-cuda-nvrtc-cu12==12.4.127
nvidia-cuda-runtime-cu12==12.4.127
nvidia-cudnn-cu12==9.1.0.70
nvidia-cufft-cu12==11.2.1.3
nvidia-curand-cu12==10.3.5.147
nvidia-cusolver-cu12==11.6.1.9
nvidia-cusparse-cu12==12.3.1.170
nvidia-cusparselt-cu12==0.6.2
nvidia-ml-py==12.570.86
nvidia-nccl-cu12==2.21.5
nvidia-nvjitlink-cu12==12.4.127
nvidia-nvtx-cu12==12.4.127
-e git+https://github.com/allenai/olmocr.git@fc857f9c6dc24f92e986d3c66c9004c6e9cf1a60#egg=olmocr
openai==1.65.4
opencv-python-headless==4.11.0.86
orjson==3.10.15
outlines==0.0.46
packaging==24.2
pandas==2.2.3
parso==0.8.4
partial-json-parser==0.2.1.1.post5
pexpect==4.9.0
pillow==11.1.0
prometheus-fastapi-instrumentator==7.0.2
prometheus_client==0.21.1
prompt_toolkit==3.0.50
propcache==0.3.0
proto-plus==1.26.0
protobuf==5.29.3
psutil==7.0.0
ptyprocess==0.7.0
pure_eval==0.2.3
py-cpuinfo==9.0.0
pyairports==2.1.1
pyarrow==19.0.1
pyasn1==0.6.1
pyasn1_modules==0.4.1
pycountry==24.6.1
pycparser==2.22
pydantic==2.10.6
pydantic_core==2.27.2
Pygments==2.19.1
pypdf==5.3.1
pypdfium2==4.30.1
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-multipart==0.0.20
pytz==2025.1
PyYAML==6.0.2
pyzmq==26.2.1
ray==2.43.0
referencing==0.36.2
regex==2024.11.6
requests==2.32.3
rich==13.9.4
rpds-py==0.23.1
rsa==4.9
s3transfer==0.11.4
safetensors==0.5.3
sentencepiece==0.2.0
setproctitle==1.3.5
sgl-kernel==0.0.3.post1
sglang==0.4.2
six==1.17.0
smart-open==7.1.0
sniffio==1.3.1
stack-data==0.6.3
starlette==0.46.0
sympy==1.13.1
tiktoken==0.9.0
tokenizers==0.21.0
torch==2.5.1
torchao==0.9.0
torchvision==0.20.1
tqdm==4.67.1
traitlets==5.14.3
transformers==4.49.0
triton==3.1.0
typing_extensions==4.12.2
tzdata==2025.1
urllib3==2.3.0
uvicorn==0.34.0
uvloop==0.21.0
vllm==0.6.4.post1
watchfiles==1.0.4
wcwidth==0.2.13
webencodings==0.5.1
websockets==15.0.1
wrapt==1.17.2
xformers==0.0.28.post3
xgrammar==0.1.15
xxhash==3.5.0
yarl==1.18.3
zipp==3.21.0
zstandard==0.23.0
