[Bug] config to import yapf causes 'EOFError: Ran out of input' when distributed training #1480

Closed
2 tasks done
DeclK opened this issue Jan 24, 2024 · 9 comments

@DeclK

DeclK commented Jan 24, 2024

Prerequisite

Environment

None

Reproduces the problem - code sample

None

Reproduces the problem - command or script

Just run torchrun --nproc_per_node 8 mmengine_train.py config.py --launcher pytorch

Reproduces the problem - error message

2024-01-24 17:16,Traceback (most recent call last):
2024-01-24 17:16,  File "mmengine_train.py", line 6, in <module>
2024-01-24 17:16,    from mmengine.config import Config
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/mmengine/__init__.py", line 3, in <module>
2024-01-24 17:16,    from .config import *
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/mmengine/config/__init__.py", line 2, in <module>
2024-01-24 17:16,    from .config import Config, ConfigDict, DictAction, read_base
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/mmengine/config/config.py", line 20, in <module>
2024-01-24 17:16,    import yapf
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/yapf/__init__.py", line 41, in <module>
2024-01-24 17:16,    from yapf.yapflib import yapf_api
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/yapf/yapflib/yapf_api.py", line 38, in <module>
2024-01-24 17:16,    from yapf.pyparser import pyparser
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/yapf/pyparser/pyparser.py", line 44, in <module>
2024-01-24 17:16,    from yapf.yapflib import format_token
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/yapf/yapflib/format_token.py", line 23, in <module>
2024-01-24 17:16,    from yapf.pytree import pytree_utils
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/yapf/pytree/pytree_utils.py", line 30, in <module>
2024-01-24 17:16,    from yapf_third_party._ylib2to3 import pygram
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/yapf_third_party/_ylib2to3/pygram.py", line 29, in <module>
2024-01-24 17:16,    python_grammar = driver.load_grammar(_GRAMMAR_FILE)
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/yapf_third_party/_ylib2to3/pgen2/driver.py", line 252, in load_grammar
2024-01-24 17:16,    g.load(gp)
2024-01-24 17:16,  File "/usr/local/lib/python3.8/dist-packages/yapf_third_party/_ylib2to3/pgen2/grammar.py", line 95, in load
2024-01-24 17:16,    d = pickle.load(f)
2024-01-24 17:16,EOFError: Ran out of input
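The last frame of the traceback is `pickle.load` failing on yapf's grammar file. As a minimal sketch, independent of yapf, the snippet below shows that `pickle.load` raises exactly this `EOFError` when the file it reads is empty. That several ranks race on the same grammar file (one reading it while another has created it but not yet written it) is an assumption about the cause here, and the file name used is hypothetical.

```python
import os
import pickle
import tempfile


def try_load(path):
    """Return the unpickled object, or a string describing the EOFError."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except EOFError as exc:
        return f"EOFError: {exc}"


tmpdir = tempfile.mkdtemp()
cache = os.path.join(tmpdir, "grammar.pickle")  # hypothetical file name

# Simulate a file that exists but has no content yet.
open(cache, "wb").close()
empty_result = try_load(cache)

# Once the file is fully written, the same load succeeds.
with open(cache, "wb") as f:
    pickle.dump({"ok": True}, f)
full_result = try_load(cache)

print(empty_result)
print(full_result)
```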

Additional information

No response

DeclK added the bug label on Jan 24, 2024
@DeclK
Author

DeclK commented Jan 24, 2024

I think this is a yapf problem. The same error has also been reported in the yapf repo: link

@Data-drone

I am getting this as well

@Data-drone

In my case, setup is:

"mmengine==0.10.2"
"mmcv==2.1.0"
"mmdet==3.3.0"

on ubuntu

@MohammedSB

MohammedSB commented Jan 29, 2024

Same problem with the older mmcv-full (1.3.0) and mmseg==0.11.0, using yapf==0.40.1.

Small update: pip install yapf==0.32 fixes this issue for me.
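To compare setups across reports like the ones above, it can help to print the installed yapf version without importing yapf itself, since the import is what fails. A minimal sketch using the standard-library `importlib.metadata` (Python 3.8+); the helper name `installed_version` is hypothetical.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Reads package metadata only; it does not import yapf or load its grammar.
print("yapf:", installed_version("yapf"))
```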

@DeclK
Author

DeclK commented Jan 30, 2024

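I used a wait function to stagger each rank's import, so that the ranks import the config module one by one. Crude, but it works:

import time
import os

def wait_before_import_config():
    # Delay each rank by 0.5 s per LOCAL_RANK so the imports start one after another.
    t = int(os.environ.get('LOCAL_RANK', 0))
    time.sleep(t * 0.5)

def wait_after_import_config():
    # Sleep for the remaining slots so every rank waits about the same total time.
    t = int(os.environ.get('WORLD_SIZE', 0)) - int(os.environ.get('LOCAL_RANK', 0))
    # max() guards against a negative sleep if WORLD_SIZE is unset.
    time.sleep(max(t, 0) * 0.5)

wait_before_import_config()
from mmengine.config import Config  # this import pulls in yapf
wait_after_import_config()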
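A sleep-based stagger depends on each import finishing within its 0.5 s slot. As a hedged alternative sketch (not something proposed in this thread), the ranks on one node could serialize the import with an exclusive file lock instead. This assumes a POSIX system (`fcntl` is Unix-only) and that the failure really is a race between ranks sharing one Python installation; the helper name `import_lock` and the lock file name are hypothetical.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def import_lock(name="mmengine_import.lock"):
    """Hold an exclusive lock on a file in the temp directory while the body runs."""
    path = os.path.join(tempfile.gettempdir(), name)
    # Append mode creates the file if needed and never truncates it.
    with open(path, "a") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)  # blocks until no other process holds the lock
        try:
            yield path
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)


# Intended use in the training script (commented out here, since it needs mmengine):
# with import_lock():
#     from mmengine.config import Config
```

With this, only one rank per node runs the import at a time, and the others proceed as soon as the lock is released rather than after a fixed delay. Ranks on different nodes do not share the lock file unless the temp directory is on a shared filesystem.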

@matthost

matthost commented Feb 2, 2024

This has been happening a lot for us, on older versions of mmcv.

@matthost

matthost commented Feb 7, 2024

@DeclK you are a hero!

DeclK closed this as completed on Feb 18, 2024
@whlook
Contributor

whlook commented Feb 28, 2024

Replying to DeclK's wait-function workaround above: please check this issue, it may be helpful for you: google/yapf#1204

@whlook
Contributor

whlook commented Feb 28, 2024

I ran into a similar problem when using mmengine, and found a way to fix it in this issue: google/yapf#1204
