Convert run_with_beaker.py to use Blueprints. #1199

schmmd · 2018-05-11T22:17:13Z

A few breaking changes are required for this change. In addition, instead of running the experiment an experiment spec is output. This behavior can be changed if it's not desired, but I find it to be an improvement because it gives the user the opportunity to add additional flags that this script may not support.

A few breaking changes are required for this change. In addition, instead of running the experiment an experiment spec is output.

matt-gardner · 2018-05-12T15:24:50Z

Can we not change the behavior of this script? I use it frequently, and adding another step to running an experiment is annoying. I don't really care about blueprints vs. images, so that part of the change is fine with me, but not cluttering my file system with unnecessary files, or making me use two commands where I used to only need one. If you really want to output a file, could you make that a separate script, or a non-default option to this one?

schmmd · 2018-05-14T15:10:33Z

@matt-gardner I can mostly leave the script unchanged--although I'm going to remove some commands since they are deprecated and reportedly don't work.

One nice thing about the intermediate file is it gives you full control to be able to tweak your experiment. Without using the configuration, we need to keep adding arguments when we want additional functionality. Beaker is deprecating the run command, so some things might be challenging to tune via run.

But if that's undesirable to the people who actually use this script, I can cat the configuration directly to beaker.

matt-gardner · 2018-05-14T15:18:39Z

Sure, if there are arguments that are broken, by all means, great, let's fix them. Adding arguments for additional functionality is rare and a simple PR, and making that PR is much more desirable than adding a step for every invocation of this script. I submitted PRs just to shave a few seconds off of each invocation, if you remember ;).

ckarenz · 2018-05-14T18:45:34Z

To clarify: Beaker is not deprecating the experiment run command. We are transitioning the --file argument over to a new experiment create command. Create is the full-featured command which allows for arbitrarily complex (or simple) spec files. Run offers a simplified subset of Create by replacing the spec file with some short-hand arguments, and is purely for convenience.

ckarenz · 2018-05-14T18:49:51Z

scripts/ai2-internal/run_with_beaker.py

+        print(f"Using the specified blueprint: {blueprint}")
+    else:
+        print(f"Building the Docker image ({image})...")
+        subprocess.run(f'docker build -t {image} .', shell=True, check=True)


This pattern looks ideal. 👍

ckarenz · 2018-05-14T19:00:29Z

scripts/ai2-internal/run_with_beaker.py


 if __name__ == "__main__":
    parser = argparse.ArgumentParser()

    parser.add_argument('param_file', type=str, help='The model configuration file.')
-    parser.add_argument('--name', type=str, help='A name for the experiment.')
+    parser.add_argument('output_path', type=str, help='The destination to write the experiment spec.')


@matt-gardner makes a good point about convenience here. I'd recommend a slightly different pattern here. Drop output_path and instead echo the spec to STDOUT, so it can be piped into Beaker directly, like so:

./run_with_beaker.py <args...> | beaker experiment create --file=-

You can also switch on whether the spec is automatically run. One possible mechanism would be to offer a --dry-run flag. In this flag's absence, invoke Beaker directly. If --dry-run is present, then just echo the spec so the user can redirect to file and/or Beaker on their own.

Another option is to print the command you would invoke, and invoke this script with exec $(run_with_beaker.py). The output might be in the following format:

cat << EOF | beaker experiment create --file=- # Spec Goes Here... EOF

Either of these options also implies retaining the --name flag.

Funny, I just read this after I added a --dry-run flag ;-)

I actually personally think it's a good thing to have an extra step, but I'm happy to do what works best for other users (commit coming).

ckarenz · 2018-05-14T21:53:06Z

scripts/ai2-internal/run_with_beaker.py

+        },
+        "env": env
+    }
+    config_task = {"spec": config_spec}


You may consider adding "name": "<some value>" here. While optional, it would slightly simplify how this displays in the UI by hiding the task ID.

ckarenz

This seems reasonable to me, and Beaker usage and experiment spec look correct.

schmmd · 2018-05-14T22:36:10Z

The script now maintains the original IO, but it has an option argument for a blueprint (which skips creating a Docker image), an optional argument to output the spec to a specific location (rather than a temporary location), and a --dry-run flag.

Blueprints don't add much new functionality at this point, but we're ready for when they do and this cleans up and speeds up our script a bit by cutting ECR out of the loop. I understand that ECR is supported by not required.

matt-gardner · 2018-05-15T14:54:10Z

scripts/ai2-internal/run_with_beaker.py

 from allennlp.common.params import Params

-def main(param_file: str, extra_beaker_commands: List[str]):
-    ecr_repository = "896129387501.dkr.ecr.us-west-2.amazonaws.com"
+def main(param_file: str, args):


Can you give a type annotation for args?

matt-gardner · 2018-05-15T15:00:17Z

scripts/ai2-internal/run_with_beaker.py

-    subprocess.run(command, check=True)
+    dataset_mounts = []
+    for source in (args.source if args.source else []):
+        datasetId, containerPath = source.split(":") + "\n"


I wondered what this line did, so I tried it in an interpreter, and it crashed.

matt-gardner · 2018-05-15T15:06:35Z

scripts/ai2-internal/run_with_beaker.py

-    print(' '.join(command))
-    subprocess.run(command, check=True)
+    dataset_mounts = []
+    for source in (args.source if args.source else []):


Just add default=[] to the --source argument below, so you don't need this if/else block.

matt-gardner · 2018-05-15T15:07:19Z

scripts/ai2-internal/run_with_beaker.py

+            "datasetId": datasetId,
+            "containerPath": containerPath
+        })
+    for source in [f"{config_dataset_id}:/config.json"]:


Why not do sources = args.source + [f"{config_dataset_id}:/config.json"] (after getting rid of the if/else block above); then you only have one loop.

This works now that I have default.

matt-gardner · 2018-05-15T15:07:57Z

scripts/ai2-internal/run_with_beaker.py

+            "containerPath": containerPath
+        })
+
+    for var in (args.env if args.env else []):


Same here - use default=[] below instead of this if/else.

matt-gardner · 2018-05-15T15:19:21Z

scripts/ai2-internal/run_with_beaker.py

+    }
+    config_task = {"spec": config_spec}
+    if args.name:
+        config_task["name"] = args.name


You're putting the name in two places, where I think it just needs to be in one place. I don't think we want it here, we just want it on the experiment. Unless the beaker UI does something magical if the task name matches the experiment name? @ckarenz?

@ckarenz recommended putting it here for the UI.

Yes, and my guess is that he missed the fact that it was already given below.

@ckarenz from our earlier conversations I understood that name needed to be specified on the command-line, but from some experimentation it seems that I can specify it on the command line or in the configuration.

Interestingly, if I specify it on the command-line then it isn't part of the spec in the Beaker UI, but if I specify it in the configuration it is.

What is the best practice?

Ok, I looked at your beaker experiments, and the way you have it seems fine. Beaker has "experiments" and "tasks", where an "experiment" may contain several different "tasks". We only ever use a single "task" for our "experiments", so this distinction is a bit meaningless for us. Giving both the task and the experiment the same name seems fine.

If you look at your two most recent experiments, this one has both the experiment and the task name set to "food", while this one just has the experiment name set to "foo", with an empty task name, so the task shows up as an id. I don't have a strong preference between the two, either way is fine.

I think two different names have been conflated here. First we should clarify that there are two different concepts in play. A task is a minimal unit of work, or single process which is run once. An experiment is a graph of one or more tasks, tying them together with their dependencies. This script creates a single experiment containing a single task.

The name supplied to experiment create on the command line is the experiment name. It must be unique and therefore doesn't make sense to put in the spec; you can't use the same name twice without deleting your prior experiment or renaming it.

The name supplied in the file is the name of the task within that experiment. It must be unique within the experiment, but is not globally unique. It's used to link dependencies within the spec. When paired with an experiment name/ID it also serves as a human-readable substitute for the task's unique ID.

For your purposes the task name should probably remain constant, and be short.

matt-gardner · 2018-05-15T15:20:34Z

scripts/ai2-internal/run_with_beaker.py

+        experiment_command.append(args.name)
+
+    if args.dry_run:
+        print(f"This is a dry run (--dry-run).  Launch your job with the following command")


s/command/command:/

matt-gardner · 2018-05-15T15:22:05Z

scripts/ai2-internal/run_with_beaker.py

+    }
+
+    output_path = args.spec_output_path if args.spec_output_path else tempfile.mkstemp(".yaml",
+            "beaker-config")[1]


This should probably be beaker-config-, to separate the random numbers from "config".

DeNeutoy · 2018-05-15T19:48:08Z

scripts/ai2-internal/run_with_beaker.py

-        extra_beaker_commands.append(f'--desc={args.desc}')
-    if args.name:
-        # Remove spaces from the name, because Beaker doesn't allow them.
-        extra_beaker_commands.append(f'--name={args.name.replace(" ", "-")}')


Please don't change this behaviour, it is convenient

Apologies--the change was a mistake (I deleted the name argument and then re-added it).

Convert run_with_beaker.py to Blueprints.

12a8d6d

A few breaking changes are required for this change. In addition, instead of running the experiment an experiment spec is output.

schmmd requested a review from ckarenz May 11, 2018 22:17

ckarenz reviewed May 14, 2018

View reviewed changes

schmmd added 2 commits May 14, 2018 13:12

Preserve former behavior of script.

644b9e7

Fixup

f6b4afd

ckarenz reviewed May 14, 2018

View reviewed changes

ckarenz approved these changes May 14, 2018

View reviewed changes

schmmd added 2 commits May 14, 2018 15:22

Output name to experiment configuration.

95bddb2

Use JSON instead of yaml.

6f26b3b

schmmd assigned matt-gardner May 14, 2018

matt-gardner reviewed May 15, 2018

View reviewed changes

schmmd added 2 commits May 15, 2018 08:36

Address PR comments.

75e5164

Add type for args

15b4a4b

matt-gardner approved these changes May 15, 2018

View reviewed changes

DeNeutoy reviewed May 15, 2018

View reviewed changes

schmmd added 3 commits May 15, 2018 16:10

Make the spec name 'training'

0dbba08

Add back code that replaces spaces in names.

40f4764

Merge branch 'master' into blueprints

cf84af8

schmmd merged commit 8eb358b into allenai:master May 31, 2018

schmmd deleted the blueprints branch May 31, 2018 22:26

gabrielStanovsky pushed a commit to gabrielStanovsky/allennlp that referenced this pull request Sep 7, 2018

Convert run_with_beaker.py to use Blueprints. (allenai#1199)

a28d67e

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Convert run_with_beaker.py to use Blueprints. #1199

Convert run_with_beaker.py to use Blueprints. #1199

schmmd commented May 11, 2018

matt-gardner commented May 12, 2018

schmmd commented May 14, 2018

matt-gardner commented May 14, 2018

ckarenz commented May 14, 2018

ckarenz May 14, 2018

ckarenz May 14, 2018

schmmd May 14, 2018

ckarenz May 14, 2018

schmmd May 14, 2018

ckarenz left a comment

schmmd commented May 14, 2018

matt-gardner May 15, 2018

schmmd May 15, 2018

matt-gardner May 15, 2018

schmmd May 15, 2018

matt-gardner May 15, 2018

schmmd May 15, 2018

matt-gardner May 15, 2018

schmmd May 15, 2018

matt-gardner May 15, 2018

schmmd May 15, 2018

matt-gardner May 15, 2018

schmmd May 15, 2018

matt-gardner May 15, 2018

schmmd May 15, 2018

matt-gardner May 15, 2018

ckarenz May 15, 2018

matt-gardner May 15, 2018

schmmd May 15, 2018

matt-gardner May 15, 2018

schmmd May 15, 2018

DeNeutoy May 15, 2018

schmmd May 15, 2018

Convert run_with_beaker.py to use Blueprints. #1199

Convert run_with_beaker.py to use Blueprints. #1199

Conversation

schmmd commented May 11, 2018

matt-gardner commented May 12, 2018

schmmd commented May 14, 2018

matt-gardner commented May 14, 2018

ckarenz commented May 14, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ckarenz left a comment

Choose a reason for hiding this comment

schmmd commented May 14, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment