Description
Before Asking
- I have pulled the latest code of main branch to run again and the problem still exists.
- Search before asking
Question
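I implemented a new operator whose main job is to call some large-model interfaces I wrote myself for model inference and evaluation tasks. It used to run normally, but for a period of time several tasks failed with the following error: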
Traceback (most recent call last):
  File "/mnt/jd_afsqh/gaominyu/code/jdh_data-jucier/data_juicer/core/data.py", line 216, in process
    dataset, resource_util_per_op = Monitor.monitor_func(
  File "/mnt/jd_afsqh/gaominyu/code/jdh_data-jucier/data_juicer/core/monitor.py", line 231, in monitor_func
    ret = func()
  File "/mnt/jd_afsqh/gaominyu/code/jdh_data-jucier/data_juicer/ops/base_op.py", line 350, in run
    new_dataset = dataset.map(
  File "/mnt/jd_afsqh/gaominyu/code/jdh_data-jucier/data_juicer/core/data.py", line 324, in map
    new_ds = NestedDataset(super().map(*args, **kargs))
  File "/mnt/afs2/zhy/my_conda_env/dj/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 557, in wrapper
    out: Union["Dataset", "DatasetDict"] = func(self, *args, **kwargs)
  File "/mnt/afs2/zhy/my_conda_env/dj/lib/python3.9/site-packages/datasets/arrow_dataset.py", line 3166, in map
    for rank, done, content in iflatmap_unordered(
  File "/mnt/afs2/zhy/my_conda_env/dj/lib/python3.9/site-packages/datasets/utils/py_utils.py", line 713, in iflatmap_unordered
    raise RuntimeError(
RuntimeError: One of the subprocesses has abruptly died during map operation.To debug the error, disable multiprocessing.
2025-05-27 07:34:04 | INFO | data_juicer.utils.logger_utils:230 - Processing finished with:
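The concurrency in the operator is set to 10, so it does not really look like the subprocess crashed because the concurrency was too high.
The error does not seem to have appeared again since then.
What is the likely cause of this error?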
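Following the hint in the error message, one way to narrow this down would be to run the operator's per-sample logic without multiprocessing and record any Python exception together with the sample that triggered it. The following is only a minimal sketch under stated assumptions: `debug_serially` and `fake_model_op` are hypothetical names used for illustration, not part of data_juicer or datasets, and `fake_model_op` stands in for the real operator logic.

```python
import traceback
from typing import Any, Callable, Dict, List, Tuple


def debug_serially(
    process_fn: Callable[[Dict[str, Any]], Dict[str, Any]],
    samples: List[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Tuple[int, str]]]:
    """Apply process_fn to each sample in the current process, one at a time.

    Returns the successful outputs and a list of (sample index, traceback)
    pairs for the samples that raised a Python exception.
    """
    outputs: List[Dict[str, Any]] = []
    failures: List[Tuple[int, str]] = []
    for idx, sample in enumerate(samples):
        try:
            outputs.append(process_fn(sample))
        except Exception:
            # Keep the full traceback so the failing sample can be inspected.
            failures.append((idx, traceback.format_exc()))
    return outputs, failures


def fake_model_op(sample: Dict[str, Any]) -> Dict[str, Any]:
    # Hypothetical stand-in for the custom operator's per-sample logic.
    if not sample["text"]:
        raise ValueError("empty input")
    return {"text": sample["text"], "length": len(sample["text"])}


outputs, failures = debug_serially(
    fake_model_op, [{"text": "hello"}, {"text": ""}]
)
print(len(outputs), "succeeded;", len(failures), "failed")
```

If a serial run like this reports no failures on the same data, the subprocess may have been terminated from outside the Python code (for example by the system under memory pressure) rather than by an exception in the operator, though that is only a guess that would need to be checked against system logs.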
Additional
No response