Subprocess call hangs indefinitely in execution.py #3427

Open
Danivilanova opened this issue Apr 5, 2024 · 2 comments

Comments

@Danivilanova

Description

I encountered an issue with skypilot-nightly where a subprocess call hangs indefinitely. This occurs when trying to execute a task using the sky.exec method in my Python script. The script gets stuck at a specific subprocess call within the sky library, and I have to manually interrupt it to exit.

Code

import sky

# TASK_FILE, WORKDIR, and CLUSTER_NAME are constants defined elsewhere in the script.
task = sky.Task.from_yaml(TASK_FILE)
task.workdir = WORKDIR
task.num_nodes = 1  # One machine per run
task.run = "./venv/bin/python3 -m module_x"

sky.exec(
    task,
    cluster_name=CLUSTER_NAME,
    detach_run=True,
)

Expected Behavior

I expected the sky.exec call to execute the task and complete or fail without hanging.

Actual Behavior

The script gets stuck at a subprocess call in the sky library's execution.py file. Specifically, the call to subprocess_utils.run('sky status --no-show-spot-jobs --no-show-services', env=env) never returns, so I have to interrupt the process manually with Ctrl+C.
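One way to confirm the hang outside the library is to run the same command under a timeout and capture whatever it printed before stalling. The following is a minimal sketch, not SkyPilot code: `run_with_timeout` is a hypothetical helper built on the standard library's `subprocess.run`, and the 60-second default is an arbitrary assumption.

```python
import shlex
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=60.0, env=None):
    """Run cmd and return (returncode, stdout, stderr).

    Hypothetical diagnostic helper; it is not part of SkyPilot.
    returncode is None if the command did not finish within timeout_s.
    """
    # Accept either a shell-style string or an argument list.
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    try:
        proc = subprocess.run(
            args,
            env=env,
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child on timeout. The partial output
        # attached to the exception may be None, str, or bytes.
        print(f"{args[0]} timed out after {timeout_s}s", file=sys.stderr)
        return None, exc.stdout, exc.stderr
    return proc.returncode, proc.stdout, proc.stderr


# Example call, using the command from this report:
# rc, out, err = run_with_timeout(
#     "sky status --no-show-spot-jobs --no-show-services", timeout_s=60.0)
```

If the call returns `None` as the return code when launched from the script but a normal exit code when launched from a plain interpreter, that would point at the script's environment rather than at `sky status` itself.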

Environment

Python Version: 3.11.8
Library Version: skypilot-nightly==1.0.0.dev20240404
Operating System: macOS 14.4 (23E214) on a MacBook Air M2, 16 GB RAM

Additional Information

Issue occurs in the library code, specifically at:
venv/lib/python3.11/site-packages/sky/execution.py

env = dict(os.environ, **{env_options.Options.DISABLE_LOGGING.value: '1'})
subprocess_utils.run('sky status --no-show-spot-jobs --no-show-services', env=env)

Commenting out the problematic subprocess call allows the script to proceed, suggesting the issue lies with how the subprocess is executed or managed.
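To see where the process is blocked without editing the installed package, the standard library's `faulthandler` module can dump every thread's traceback after a delay. This is a minimal sketch under assumptions: the helper names `arm_hang_watchdog` and `disarm_hang_watchdog` are hypothetical, and the 30-second delay is a guess at how long a healthy run should take.

```python
import faulthandler
import sys


def arm_hang_watchdog(delay_s=30.0, file=None):
    """Dump all thread tracebacks to `file` if still running after delay_s.

    Hypothetical helper; `file` must be a real file object with a fileno().
    Defaults to sys.stderr. The process keeps running after the dump.
    """
    target = sys.stderr if file is None else file
    faulthandler.dump_traceback_later(delay_s, repeat=False, file=target, exit=False)


def disarm_hang_watchdog():
    """Cancel a pending dump, e.g. once the suspect call has returned."""
    faulthandler.cancel_dump_traceback_later()


# Example use around the call from this report:
# arm_hang_watchdog(30.0)
# sky.exec(task, cluster_name=CLUSTER_NAME, detach_run=True)
# disarm_hang_watchdog()
```

If the call hangs, the dump shows the Python frames the main thread is waiting in, which should tell whether it is blocked reading the child's output or waiting for the child to exit.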

Version & Commit info:

  • sky -v: skypilot, version 1.0.0.dev20240404
  • sky -c: skypilot, commit 495140e
@concretevitamin
Collaborator

Thanks for the report @Danivilanova! Question: does this happen consistently, e.g., if you rerun the script? What happens if you run sky status --no-show-spot-jobs --no-show-services manually in a terminal?

@Michaelvll maybe one enhancement we can make is to detect whether the entrypoint is the CLI or Python and, in the latter case, disable such logging-related calls.
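The entrypoint check suggested above could look roughly like the following. This is only an illustrative heuristic, not how SkyPilot implements or plans to implement it: `invoked_via_cli` is a hypothetical function, and it assumes the CLI is installed as a console script named `sky` (or `sky.exe`).

```python
import os
import sys


def invoked_via_cli(argv0=None, cli_name="sky"):
    """Heuristic: True if the process was started through the CLI script.

    Hypothetical helper. It only inspects the program name, so a process
    started as `python my_script.py` is treated as Python API usage.
    """
    if argv0 is None:
        argv0 = sys.argv[0] if sys.argv else ""
    name = os.path.basename(argv0)
    # Accept the bare console-script name and its Windows .exe wrapper.
    return name in (cli_name, cli_name + ".exe")


# Example: skip a CLI-oriented status refresh when called from a script.
# if invoked_via_cli():
#     refresh_status()  # hypothetical placeholder for the logging-related call
```

A name-based check is cheap but easy to fool (for example, a wrapper script that is itself called `sky`), so an explicit flag set by the CLI entrypoint would likely be more robust.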

@Danivilanova
Author

Hi @concretevitamin!

It happens every time I run the script. Running sky status --no-show-spot-jobs --no-show-services manually works both from the terminal and from the Python interpreter.
