[hailctl] Add hailctl batch submit#12471
Conversation
|
I'm inclined to set |
| local_file = file['from'] | ||
| cloud_file = file['to'] | ||
| in_file = b.read_input(cloud_file) | ||
| j.command(f'ln -s {in_file} {local_file}') |
There was a problem hiding this comment.
You might need to defend against users mistakenly providing fully qualified paths to --files. I guess just os.basename is enough. Might want to verify the user didn't specify two files with the same name.
There was a problem hiding this comment.
Changed to use the relative path from the user's current directory, so as to be compatible with scripts reading from relative paths.
| @@ -0,0 +1,58 @@ | |||
| import asyncio | |||
There was a problem hiding this comment.
Ah this is so delightfully short!
I'm happy to be overruled here but I like the "just copy their config" approach for configurations like these where both the local and batch backend could make sense, so the behavior is as consistent as is reasonable across environments. It will Just Work in the way you want if the user has the backend set to batch in their config ;) |
|
Let's enumerate the use-cases. In all cases, the point is to run the client script in the cloud.
I think use-case (4) is rare, though useful. If we're pitching I am somewhat frustrated that we have these different deployment strategies. It seems like unnecessary intellectual burden for a user to think about when they're just trying to run some Hail code. One option is to have different commands. Submitting a QoB job is done like this: And submitting a batch job is done like this: It pains me to think about trying to explain the difference between (1) and (4) to a scientist. At least with two distinct commands we can sort of ignore (4) and say: if you want to run Hail Query on Hail Batch use |
|
Or even |
|
Hmph, ya this seems annoyingly complicated, and I'd prefer to make one command with opinionated but configurable defaults than have different commands. One thing that feels inconsistent here is what we do in the Batch interface. We don't have the equivalent of |
I like this better than |
|
Re: testing, I wanted to wait on the spawned batch and ensure that it passed, but I had trouble doing that because it looks like the new rich progress bars are printed to stdout so I can't make use of json output and |
|
As an aside, we should definitely have a It seems to me that, given the precedent of We need a better name for local mode Spark or Query. I'm slowly realizing that lots of people don't realize you can use Hail on a laptop. Are there other tools that have already settled on terminology here? |
I think this is possible but I haven't looked into it. I'm opposed to stderr in Notebooks because I don't want it to appear with red background. I've recently realized that a lot of users find it concerning that we print so much red background text. You might also have a quiet mode? If you don't |
|
Ya the progress bars I mentioned were the file upload summary and the batch submission. I can try to make a quiet mode so it can cleanly print just the json. My intention was not to wait it in |
|
My inclination is to assume the arg parser works correctly and to just test calling a function that returns the information you need. |
@danking Wait so shall I set |
danking
left a comment
There was a problem hiding this comment.
Indeed I was too quick to approve.
|
@danking bump |
CHANGELOG: Add
hailctl batch submitcommand to run local scripts inside batch jobs.Can run the following
foo.py:with the command
hailctl batch submit foo.py --name foobatch --files=baz