Zero-Shot Video Captioning script issues #85

Open
shreyaskar123 opened this issue May 18, 2023 · 1 comment
shreyaskar123 commented May 18, 2023

I was trying to do zero-shot video captioning with mPLUG. I first downloaded the datasets from https://alice-open.oss-cn-zhangjiakou.aliyuncs.com/mPLUG/data.tar. The VATEX data there appears to be the same as the data on the official website. I then ran sh scripts/videocap_vatex_mplug_large.sh, but I ran into a few issues:

(1) pip install git+git://github.com/j-min/language-evaluation@master fails. Presumably this is because the GitHub link isn't correct, but when I ran it with https://github.com/j-min/language-evaluation instead, I still got the following traceback:

File "", line 1, in
File "C:\Users\shrey\anaconda3\lib\site-packages\language_evaluation\__init__.py", line 15, in
from language_evaluation.coco_caption_py3.pycocoevalcap.eval import COCOEvalCap
File "C:\Users\shrey\anaconda3\lib\site-packages\language_evaluation\coco_caption_py3\pycocoevalcap\eval.py", line 11, in
"METEOR": (Meteor(), "METEOR"),
File "C:\Users\shrey\anaconda3\lib\site-packages\language_evaluation\coco_caption_py3\pycocoevalcap\meteor\meteor.py", line 20, in __init__
self.meteor_p = subprocess.Popen(self.meteor_cmd,
File "C:\Users\shrey\anaconda3\lib\subprocess.py", line 951, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Users\shrey\anaconda3\lib\subprocess.py", line 1420, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

Exception ignored in: <function Meteor.__del__ at 0x0000028948E36700>
Traceback (most recent call last):
File "C:\Users\shrey\anaconda3\lib\site-packages\language_evaluation\coco_caption_py3\pycocoevalcap\meteor\meteor.py", line 78, in __del__
self.lock.acquire()
AttributeError: 'Meteor' object has no attribute 'lock'
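
A note on the traceback above, offered as a guess rather than a confirmed diagnosis: the FileNotFoundError is raised inside subprocess.Popen, which on Windows usually means the program being launched is not on PATH. It is an assumption here (the traceback does not show the contents of self.meteor_cmd) that the METEOR scorer launches java. The second AttributeError about 'lock' looks like a side effect of the constructor failing early. Separately, GitHub no longer serves the unauthenticated git:// protocol, so the git+https:// form of the pip URL is the one to use. A minimal sketch for checking whether an executable is visible to the current interpreter, where find_executable is a hypothetical helper name:

```python
import shutil


def find_executable(name="java"):
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


if __name__ == "__main__":
    # "java" is an assumption about what the METEOR scorer tries to start.
    path = find_executable("java")
    if path is None:
        print("java was not found on PATH; a subprocess that launches it would fail")
    else:
        print(f"java found at {path}")
```

If the check reports that java is missing, installing a Java runtime and reopening the terminal so that PATH is refreshed would be the first thing to try.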

(2) It seems that videocap_mplugx.py doesn't exist. videocap_mplug.py does, and I am guessing this is what was intended (the args match up nicely), but when I run it I get the following traceback (a module issue and a kubernetes issue). I am not sure whether this is because I could not install language_evaluation correctly and did not download COCO, or because I am running the wrong file. The full traceback is below for reference. The issue persists even after I run pip install ruamel.yaml. Thank you so much for all your help!

Traceback (most recent call last):
[W C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [kubernetes.docker.internal]:3223 (system error: 10049 - The requested address is not valid in its context.).
[W C:\actions-runner_work\pytorch\pytorch\builder\windows\pytorch\torch\csrc\distributed\c10d\socket.cpp:601] [c10d] The client socket has failed to connect to [kubernetes.docker.internal]:3223 (system error: 10049 - The requested address is not valid in its context.).
Traceback (most recent call last):
File "C:\Users\shrey\OneDrive\AliceMind\mPLUG\videocap_mplug.py", line 4, in
import ruamel_yaml as yaml
ModuleNotFoundError: No module named 'ruamel_yaml'
Traceback (most recent call last):
File "C:\Users\shrey\OneDrive\AliceMind\mPLUG\videocap_mplug.py", line 4, in
import ruamel_yaml as yaml
ModuleNotFoundError: No module named 'ruamel_yaml'
Traceback (most recent call last):
File "C:\Users\shrey\OneDrive\AliceMind\mPLUG\videocap_mplug.py", line 4, in
import ruamel_yaml as yaml
ModuleNotFoundError: No module named 'ruamel_yaml'
Traceback (most recent call last):
File "C:\Users\shrey\OneDrive\AliceMind\mPLUG\videocap_mplug.py", line 4, in
import ruamel_yaml as yaml
ModuleNotFoundError: No module named 'ruamel_yaml'
Traceback (most recent call last):
File "C:\Users\shrey\OneDrive\AliceMind\mPLUG\videocap_mplug.py", line 4, in
import ruamel_yaml as yaml
ModuleNotFoundError: No module named 'ruamel_yaml'
Traceback (most recent call last):
File "C:\Users\shrey\OneDrive\AliceMind\mPLUG\videocap_mplug.py", line 4, in
Traceback (most recent call last):
File "C:\Users\shrey\OneDrive\AliceMind\mPLUG\videocap_mplug.py", line 4, in
import ruamel_yaml as yaml
ModuleNotFoundError: No module named 'ruamel_yaml'
import ruamel_yaml as yaml
ModuleNotFoundError: No module named 'ruamel_yaml'
Traceback (most recent call last):
File "C:\Users\shrey\OneDrive\AliceMind\mPLUG\videocap_mplug.py", line 4, in
import ruamel_yaml as yaml
ModuleNotFoundError: No module named 'ruamel_yaml'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 5716) of binary: C:\Users\shrey\AppData\Local\Programs\Python\Python39\python.exe
Traceback (most recent call last):
File "C:\Users\shrey\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "C:\Users\shrey\AppData\Local\Programs\Python\Python39\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\shrey\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\distributed\launch.py", line 196, in
main()
File "C:\Users\shrey\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\distributed\launch.py", line 192, in main
launch(args)
File "C:\Users\shrey\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\distributed\launch.py", line 177, in launch
run(args)
File "C:\Users\shrey\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\distributed\run.py", line 785, in run
elastic_launch(
File "C:\Users\shrey\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\distributed\launcher\api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "C:\Users\shrey\AppData\Local\Programs\Python\Python39\lib\site-packages\torch\distributed\launcher\api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

videocap_mplug.py FAILED

Failures:
[1]:
time : 2023-05-18_00:22:07
host : university email address
rank : 1 (local_rank: 1)
exitcode : 1 (pid: 20760)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[2]:
time : 2023-05-18_00:22:07
host : university email address
rank : 2 (local_rank: 2)
exitcode : 1 (pid: 12436)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[3]:
time : 2023-05-18_00:22:07
host : university email address
rank : 3 (local_rank: 3)
exitcode : 1 (pid: 16444)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[4]:
time : 2023-05-18_00:22:07
host : university email address
rank : 4 (local_rank: 4)
exitcode : 1 (pid: 25304)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[5]:
time : 2023-05-18_00:22:07
host : university email address
rank : 5 (local_rank: 5)
exitcode : 1 (pid: 20196)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[6]:
time : 2023-05-18_00:22:07
host : university email address
rank : 6 (local_rank: 6)
exitcode : 1 (pid: 9708)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
[7]:
time : 2023-05-18_00:22:07
host : university email address
rank : 7 (local_rank: 7)
exitcode : 1 (pid: 21216)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html

Root Cause (first observed failure):
[0]:
time : 2023-05-18_00:22:07
host : university email address
rank : 0 (local_rank: 0)
exitcode : 1 (pid: 5716)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
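
Two observations on the log above, again as possibilities rather than a confirmed fix. First, pip install ruamel.yaml provides the module ruamel.yaml, while the script imports ruamel_yaml, which is the name used by the conda package; installing one does not make the other importable. Second, the first traceback runs under C:\Users\shrey\anaconda3 while this one runs under C:\Users\shrey\AppData\Local\Programs\Python\Python39, so packages may be going into a different interpreter than the one the launcher uses. A minimal sketch of a fallback import that also reports the interpreter in use, where import_first_available is a hypothetical helper and not part of mPLUG:

```python
import importlib
import sys


def import_first_available(names):
    """Import and return the first module in `names` that can be found."""
    for name in names:
        try:
            return importlib.import_module(name)
        except ModuleNotFoundError:
            continue
    raise ModuleNotFoundError(
        f"none of {list(names)!r} could be imported by {sys.executable}"
    )


if __name__ == "__main__":
    # Shows which Python is actually running, useful when several are installed.
    print("interpreter:", sys.executable)
    try:
        # Try the conda module name first, then the pip module name.
        yaml = import_first_available(["ruamel_yaml", "ruamel.yaml"])
        print("using YAML module:", yaml.__name__)
    except ModuleNotFoundError as exc:
        print(exc)
```

Whether the rest of videocap_mplug.py works unchanged with ruamel.yaml depends on the installed version, so this only addresses the import error at line 4.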

@kenhuang1964

Hey @shreyaskar123! Did you end up figuring out how to resolve the issue?
