run --scheduler local utils.echo --mesg fails using commit 8486264c #202

Closed
9 tasks
stevebyan opened this issue Sep 24, 2021 · 3 comments

@stevebyan

🐛 Bug

Module (check all that apply):

  • torchx.spec
  • torchx.component
  • torchx.apps
  • torchx.runtime
  • [x] torchx.cli
  • torchx.schedulers
  • torchx.pipelines
  • torchx.aws
  • torchx.examples
  • other

To Reproduce

Steps to reproduce the behavior:

  1. pip install -e git+https://github.com/pytorch/torchx.git#egg=torchx
  2. torchx run --scheduler local utils.echo --msg "Hellow world"
(venv) smb % pip install -e git+https://github.com/pytorch/torchx.git#egg=torchx 
Obtaining torchx from git+https://github.com/pytorch/torchx.git#egg=torchx
  Cloning https://github.com/pytorch/torchx.git to ./venv/src/torchx
  Running command git clone -q https://github.com/pytorch/torchx.git /Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx
  Resolved https://github.com/pytorch/torchx.git to commit 8486264c622b9d6d04bbb23e3c0b2387b24f5a46
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: pyre-extensions in ./venv/lib/python3.9/site-packages (from torchx) (0.0.22)
Requirement already satisfied: docstring-parser==0.8.1 in ./venv/lib/python3.9/site-packages (from torchx) (0.8.1)
Requirement already satisfied: pyyaml in ./venv/lib/python3.9/site-packages (from torchx) (5.4.1)
Requirement already satisfied: typing-inspect in ./venv/lib/python3.9/site-packages (from pyre-extensions->torchx) (0.6.0)
Requirement already satisfied: typing-extensions in ./venv/lib/python3.9/site-packages (from pyre-extensions->torchx) (3.10.0.0)
Requirement already satisfied: mypy-extensions>=0.3.0 in ./venv/lib/python3.9/site-packages (from typing-inspect->pyre-extensions->torchx) (0.4.3)
Installing collected packages: torchx
  Running setup.py develop for torchx
Successfully installed torchx-0.1.0rc0
(venv) smb % torchx run --scheduler local utils.echo --help              
usage: torchx run ...torchx_params... echo  [-h] [--msg MSG] [--image IMAGE]
                                            [--num_replicas NUM_REPLICAS]

App spec: Echos a message to stdout (calls /bin/echo)

optional arguments:
  -h, --help            show this help message and exit
  --msg MSG             message to echo
  --image IMAGE         image to use
  --num_replicas NUM_REPLICAS
                        number of replicas to run
(venv) smb % torchx run --scheduler local utils.echo --msg "Hellow world"
Traceback (most recent call last):
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/bin/torchx", line 33, in <module>
    sys.exit(load_entry_point('torchx', 'console_scripts', 'torchx')())
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/cli/main.py", line 62, in main
    args.func(args)
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/cli/cmd_run.py", line 157, in run
    self._run(runner, args)
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/cli/cmd_run.py", line 118, in _run
    result = runner.run_component(
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/runner/api.py", line 161, in run_component
    return self.run(app, scheduler, cfg)
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/runner/api.py", line 179, in run
    dryrun_info = self.dryrun(app, scheduler, cfg)
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/runner/api.py", line 269, in dryrun
    dryrun_info = sched.submit_dryrun(app, cfg or RunConfig())
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/schedulers/api.py", line 125, in submit_dryrun
    dryrun_info = self._submit_dryrun(app, resolved_cfg)
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/schedulers/local_scheduler.py", line 648, in _submit_dryrun
    request = self._to_popen_request(app, cfg)
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/schedulers/local_scheduler.py", line 687, in _to_popen_request
    img_root = image_provider.fetch(role.image)
  File "/Users/smb/Work/GIT/PrivateFederatedLearning/venv/src/torchx/torchx/schedulers/local_scheduler.py", line 141, in fetch
    raise ValueError(
ValueError: Invalid image name: ghcr.io/pytorch/torchx:0.1.0rc0, does not exist or is not a directory

Expected behavior

I expected the command to print the string "Hellow world".

Environment

  • torchx version (e.g. 0.1.0rc1): 0.1.0rc0
  • Python version: 3.9.6
  • OS (e.g., Linux): macOS 10.15.7
  • How you installed torchx (conda, pip, source, docker): pip install -e git+https://github.com/pytorch/torchx.git#egg=torchx
  • Docker image and tag (if using docker):
  • Git commit (if installed from source):
  • Execution environment (on-prem, AWS, GCP, Azure etc): on-prem
  • Any other relevant information:

Additional context

@aivanou aivanou self-assigned this Sep 25, 2021
@aivanou
Contributor

aivanou commented Sep 25, 2021

Hi @stevebyan !

Thank you for trying torchx!
My apologies that you weren't able to run it successfully. We are currently reworking some internal components of torchx to make them easier to use. As a result, we changed the way we work with images.

PR #203 will fix the bug you are experiencing.

In the meantime, can you please try:

torchx run --scheduler local utils.echo --image / --msg "Hellow world"
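
Judging only from the traceback above, the local scheduler appears to treat the image name as a directory path and rejects anything that is not an existing directory, which would explain why passing --image / gets past the check. A minimal sketch of that kind of check follows; fetch_local_image is a hypothetical stand-in written for illustration, not torchx's actual code:

```python
import os


def fetch_local_image(image: str) -> str:
    """Hypothetical stand-in for the directory check implied by the traceback.

    This is NOT torchx's implementation; it only mirrors the error message
    shown in the report.
    """
    if not os.path.isdir(image):
        raise ValueError(
            f"Invalid image name: {image}, does not exist or is not a directory"
        )
    return image


# The default image name is a container reference, not a local directory,
# so a directory check like this one rejects it.
try:
    fetch_local_image("ghcr.io/pytorch/torchx:0.1.0rc0")
except ValueError as err:
    print(err)

# The filesystem root is an existing directory, so the check passes.
print(fetch_local_image("/"))
```

Under that assumption, any existing directory would satisfy the check; / is simply one that is always present.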

-Aliaksandr

@aivanou
Contributor

aivanou commented Sep 25, 2021

One thing we still need to add is better error messaging; I will submit a PR for this.

@d4l3k
Contributor

d4l3k commented Oct 20, 2021

@stevebyan I believe this is fixed in the current release/trunk. Feel free to reopen if you're still hitting this issue.

tristanr@tristanr-arch2 ~> torchx run utils.echo --msg "Hellow world"
0.1.0rc2: Pulling from pytorch/torchx
Digest: sha256:f97df7bd3d2137b3dcb6b85e14f4eef9c1c7cdd826d12508cc7cafb08bb2f704
Status: Image is up to date for ghcr.io/pytorch/torchx:0.1.0rc2
ghcr.io/pytorch/torchx:0.1.0rc2
local_docker://torchx/echo_dfc8191d
torchx 2021-10-20 15:15:49 INFO     Waiting for the app to finish...
Hellow world
torchx 2021-10-20 15:15:50 INFO     Job finished: SUCCEEDED
tristanr@tristanr-arch2 ~> torchx run -s local_cwd utils.echo --msg "Hellow world"
Hellow world
local_cwd://torchx/echo_cfeb1ad7
torchx 2021-10-20 15:16:47 INFO     Waiting for the app to finish...
torchx 2021-10-20 15:16:47 INFO     Job finished: SUCCEEDED

@d4l3k d4l3k closed this as completed Oct 20, 2021