-
Notifications
You must be signed in to change notification settings - Fork 247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[hailctl] Add hailctl batch submit #12471
Conversation
I'm inclined to set |
local_file = file['from'] | ||
cloud_file = file['to'] | ||
in_file = b.read_input(cloud_file) | ||
j.command(f'ln -s {in_file} {local_file}') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might need to defend against users mistakenly providing fully qualified paths to --files
. I guess just os.basename is enough. Might want to verify the user didn't specify two files with the same name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed to use the relative path from the user's current directory, so as to be compatible with scripts reading from relative paths.
@@ -0,0 +1,58 @@ | |||
import asyncio |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah this is so delightfully short!
I'm happy to be overruled here but I like the "just copy their config" approach for configurations like these where both the local and batch backend could make sense, so the behavior is as consistent as is reasonable across environments. It will Just Work in the way you want if the user has the backend set to batch in their config ;) |
Let's enumerate the use-cases. In all cases, the point is to run the client script in the cloud.
I think use-case (4) is rare, though useful. If we're pitching I am somewhat frustrated that we have these different deployment strategies. It seems like unnecessary intellectual burden for a user to think about when they're just trying to run some Hail code. One option is to have different commands. Submitting a QoB job is done like this:
And submitting a batch job is done like this:
It pains me to think about trying to explain the difference between (1) and (4) to a scientist. At least with two distinct commands we can sort of ignore (4) and say: if you want to run Hail Query on Hail Batch use |
Or even |
Hmph, ya this seems annoyingly complicated, and I'd prefer to make one command with opinionated but configurable defaults than have different commands. One thing that feels inconsistent here is what we do in the Batch interface. We don't have the equivalent of |
I like this better than |
Re: testing, I wanted to wait on the spawned batch and ensure that it passed, but I had trouble doing that because it looks like the new rich progress bars are printed to stdout so I can't make use of json output and |
As an aside, we should definitely have a It seems to me that, given the precedent of We need a better name for local mode Spark or Query. I'm slowly realizing that lots of people don't realize you can use Hail on a laptop. Are there other tools that have already settled on terminology here? |
I think this is possible but I haven't looked into it. I'm opposed to stderr in Notebooks because I don't want it to appear with red background. I've recently realized that a lot of users find it concerning that we print so much red background text. You might also have a quiet mode? If you don't |
Ya the progress bars I mentioned were the file upload summary and the batch submission. I can try to make a quiet mode so it can cleanly print just the json. My intention was not to wait it in |
My inclination is to assume the arg parser works correctly and to just test calling a function that returns the information you need. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@danking Wait so shall I set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed I was too quick to approve.
@danking bump |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huzzah
CHANGELOG: Add
hailctl batch submit
command to run local scripts inside batch jobs.Can run the following
foo.py
:with the command
hailctl batch submit foo.py --name foobatch --files=baz