Launch interactive jobs for development/debugging #204

Open
stsievert opened this issue Mar 27, 2020 · 3 comments
Labels
enhancement New feature or request

Comments

@stsievert
Contributor

stsievert commented Mar 27, 2020

Is your feature request related to a problem? Please describe.
I have a job that requires a GPU and specifies a Docker image. I need to debug on this image.

Currently, my solution is to launch an EC2 machine, copy my files over, and then start developing/debugging.

Describe the solution you'd like
A method to land an interactive job on this image. Something like https://htcondor.readthedocs.io/en/latest/users-manual/submitting-a-job.html#interactive-jobs

Describe alternatives you've considered

  • Submitting my own submit file via condor_submit. However, that's inconvenient: I don't really want to write that file, or have to keep my submit file and MapOptions in sync.
@stsievert stsievert added the enhancement New feature or request label Mar 27, 2020
@stsievert
Contributor Author

I would imagine that HTMap has a submit file it uses internally. Could that submit file be used to generate a debugging/development submit file? I think I'd like something like this:

import htmap
options = {...}
future = htmap.map(..., map_options=htmap.MapOptions(**options))
future.debugging_submit_file

future.debugging_submit_file would specify my Docker image and transfer all the Python files HTMap needs. It'd hopefully have a comment detailing how to submit with condor_submit.

This would enable debugging on HTCondor without needing to rent an EC2 instance to debug (my current solution). Would that be possible?
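To make the request concrete, here is a minimal sketch of what generating such a file could look like. It does not use HTMap at all: `render_debug_submit` is a hypothetical helper, the options are a plain dict of HTCondor submit commands, and the image name is a placeholder. Whether an interactive submit works with a Docker image depends on the pool, so treat this as an illustration of the idea rather than how HTMap builds its submit description.

```python
from typing import Mapping, Sequence


def render_debug_submit(options: Mapping[str, str],
                        input_files: Sequence[str] = ()) -> str:
    """Render a submit description for an interactive debugging session.

    Hypothetical helper (not part of HTMap): `options` maps HTCondor
    submit commands to their values.
    """
    lines = [
        "# Debugging submit description (generated sketch).",
        "# Submit with: condor_submit -interactive <this file>",
    ]
    # One "key = value" line per submit command.
    for key, value in options.items():
        lines.append(f"{key} = {value}")
    # Ask HTCondor to ship the listed files along with the job.
    if input_files:
        lines.append("should_transfer_files = YES")
        lines.append("transfer_input_files = " + ", ".join(input_files))
    lines.append("queue")
    return "\n".join(lines) + "\n"


example = render_debug_submit(
    {
        "universe": "docker",
        "docker_image": "example/image:latest",  # placeholder image name
        "request_gpus": "1",
    },
    input_files=["train.py"],
)
print(example)
```

The header comment in the generated text is where the "how to submit this" hint would live.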

@stsievert stsievert changed the title from "Feature request: launch interactive jobs for debugging" to "Feature request: launch interactive jobs for development/debugging" Mar 27, 2020
@JoshKarpel JoshKarpel changed the title from "Feature request: launch interactive jobs for development/debugging" to "Launch interactive jobs for development/debugging" Mar 28, 2020
@JoshKarpel JoshKarpel self-assigned this Mar 28, 2020
@JoshKarpel
Contributor

I really like this idea! But, I want to be careful about how we implement it. It would be possible to generate the submit description for a single component, but I'd prefer a solution that "keeps you inside Python", since the intent is to wrap up the low-level HTCondor operations behind Python(ic) APIs.

I'm thinking of something like...

htmap.interactive(func, args, kwargs, map_options=...)

which would then connect you to the job (i.e., put you in a shell) once it starts running. I'll ask around about interactive submits and condor_ssh_to_job and see what's possible.
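As a rough sketch of the two HTCondor mechanisms mentioned here, the command lines a Python wrapper might end up running look like the following. The function names are hypothetical and nothing is executed; the code only builds the argument lists for `condor_submit -interactive` and `condor_ssh_to_job`, and it assumes the job is identified as `<cluster>.<proc>`.

```python
from typing import List


def interactive_submit_command(submit_file: str) -> List[str]:
    """Arguments for submitting `submit_file` as an interactive job."""
    return ["condor_submit", "-interactive", submit_file]


def ssh_to_job_command(cluster_id: int, proc_id: int = 0) -> List[str]:
    """Arguments for opening a shell next to an already-running job."""
    return ["condor_ssh_to_job", f"{cluster_id}.{proc_id}"]


print(interactive_submit_command("debug.sub"))
print(ssh_to_job_command(1234))
```

On a submit node, either list could be handed to `subprocess.run`, which would leave the terminal attached to the remote shell until the user exits.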

@stsievert
Contributor Author

I'd prefer a solution that "keeps you inside Python", ... I'm thinking of something like..

It'd be great to launch the interactive job from Python! That'd remove a lot of the HTCondor details.

I primarily use these interactive jobs for developing a single script, and would use bash on this remote machine to run the script over and over. I'd probably use it like this:

submit2:~ $ ls
launch.py  finished.py  train.py
submit2:~ $ python
>>> import htmap
>>> htmap.interactive(map_options=...)
# hangs while job launches
remote-machine:~ $ ls
launch.py  finished.py  train.py
remote-machine:~ $ python train.py
# make edits to train.py
remote-machine:~ $ python train.py
remote-machine:~ $ exit
>>> # back on submit2
