Testing and docs
wlandau-lilly committed Jan 8, 2024
1 parent d26fa38 commit 07027a5
Showing 11 changed files with 263 additions and 145 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -13,7 +13,7 @@ Description: In computationally demanding analysis projects,
'clustermq' by Schubert (2019) <doi:10.1093/bioinformatics/btz284>),
and 'batchtools' by Lang, Bischl, and Surmann (2017).
<doi:10.21105/joss.00135>.
Version: 0.0.2.9000
Version: 0.0.3
License: MIT + file LICENSE
URL: https://wlandau.github.io/crew.aws.batch/,
https://github.com/wlandau/crew.aws.batch
4 changes: 2 additions & 2 deletions NEWS.md
@@ -1,6 +1,6 @@
# crew.aws.batch 0.0.2.9000 (development)

# crew.aws.batch 0.0.3

* Move all job definition management methods to their own class. (See `crew_definition_aws_batch()`.)

# crew.aws.batch 0.0.2

19 changes: 2 additions & 17 deletions R/crew_definition_aws_batch.R
@@ -3,25 +3,10 @@
#' @family definition
#' @description Create an `R6` object to manage a job definition for AWS
#' Batch jobs.
#' @param job_queue Character of length 1, name of the AWS Batch
#' job queue.
#' @inheritParams crew_monitor_aws_batch
#' @param job_definition Character of length 1, name of the AWS Batch
#' job definition. The job definition might or might not exist
#' at the time `crew_definition_aws_batch()` is called. Either way is fine.
#' @param config Optional named list, `config` argument of
#' `paws.compute::batch()` with optional configuration details.
#' @param credentials Optional named list. `credentials` argument of
#' `paws.compute::batch()` with optional credentials (if not already
#' provided through environment variables such as `AWS_ACCESS_KEY_ID`).
#' @param endpoint Optional character of length 1. `endpoint`
#' argument of `paws.compute::batch()` with the endpoint to send HTTP
#' requests.
#' @param region Character of length 1. `region` argument of
#' `paws.compute::batch()` with an AWS region string such as `"us-east-2"`.
#' Serves as the region for both AWS Batch and CloudWatch. Tries to
#' default to `paws.common::get_config()$region`, then to
#' `Sys.getenv("AWS_REGION")` if unsuccessful, then
#' `Sys.getenv("AWS_REGION")`, then `Sys.getenv("AWS_DEFAULT_REGION")`.
crew_definition_aws_batch <- function(
job_queue,
job_definition = paste0(
@@ -596,7 +581,7 @@ crew_class_definition_aws_batch <- R6::R6Class(
arn = character(0L),
revision = integer(0L),
status = character(0L),
type =character(0L),
type = character(0L),
scheduling_priority = character(0L),
parameters = list(),
retry_strategy = list(),
3 changes: 1 addition & 2 deletions R/crew_monitor_aws_batch.R
@@ -6,8 +6,7 @@
#' @param job_queue Character of length 1, name of the AWS Batch
#' job queue.
#' @param job_definition Character of length 1, name of the AWS Batch
#' job definition. The job definition might or might not exist
#' at the time `crew_monitor_aws_batch()` is called. Either way is fine.
#' job definition.
#' @param log_group Character of length 1,
#' AWS Batch CloudWatch log group to get job logs.
#' The default log group is often `"/aws/batch/job"`, but not always.
59 changes: 52 additions & 7 deletions README.Rmd
@@ -93,14 +93,12 @@ str(groups$SecurityGroups[[1L]])
#> $ VpcId : chr "vpc-00000"
```

# Job management
# Managing job definitions

With `crew.aws.batch`, your `crew` controller automatically submits jobs to AWS Batch. These jobs may fail or linger for any number of reasons, which could impede work and increase costs. So before you use `crew_controller_aws_batch()`, please learn how to monitor and terminate AWS Batch jobs manually.

`crew.aws.batch` defines a "monitor" class to help you take control of jobs and job definitions. Create a monitor object with `crew_monitor_aws_batch()`. You will need to supply a job definition name and a job queue name.
Before submitting jobs, AWS Batch requires a job definition to describe the container image and resource requirements. You can do this through the AWS web console, the AWS command line interface (CLI), a software development kit (SDK) like the `paws` R package, or the job definition class in `crew.aws.batch`. For `crew.aws.batch`, first create a job definition object.

```{r}
monitor <- crew_monitor_aws_batch(
```r
definition <- crew_definition_aws_batch(
job_definition = "YOUR_JOB_DEFINITION_NAME",
job_queue = "YOUR_JOB_QUEUE_NAME"
)
@@ -109,13 +107,60 @@ monitor <- crew_monitor_aws_batch(
The job definition may or may not exist at this point. If it does not exist, you can register it with `register()`, a deliberately simple, limited-scope method which creates container-based job definitions with the `"awslogs"` log driver (for CloudWatch).^[The log group supplied to `crew_monitor_aws_batch()` must be valid. The default is `"/aws/batch/job"`, which may not exist if your system administrator has a custom logging policy.] Below, your container image can be as simple as a Docker Hub identifier (like `"alpine:latest"`) or a full URI of an ECR image.^[For the `crew` controller, you will definitely want an image with R and `crew` installed. For the purposes of testing the monitor, `"alpine:latest"` will work.]

```{r}
monitor$register(
definition$register(
image = "AWS_ACCOUNT_ID.dkr.ecr.AWS_REGION.amazonaws.com/ECR_REPOSITORY_NAME:IMAGE_TAG",
platform_capabilities = "EC2",
memory_units = "gigabytes",
memory = 8,
cpus = 2
)
#> # A tibble: 1 × 3
#> name revision arn
#> <chr> <int> <chr>
#> 1 YOUR_JOB_DEFINITION_NAME 81 arn:aws:batch:us-east-1:CENSORED:jo…
```
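
AWS Batch itself expresses container memory in MiB, while the call above passes `memory = 8` with `memory_units = "gigabytes"`. How `register()` converts between the two is not shown here, so the following is only an arithmetic sketch. It assumes a decimal gigabyte (10^9 bytes), a mebibyte of 2^20 bytes, and rounding up; `gigabytes_to_mib()` is a hypothetical helper and not part of `crew.aws.batch`.

```r
# Hypothetical helper, not part of crew.aws.batch.
# Assumes 1 gigabyte = 1e9 bytes and 1 MiB = 2^20 bytes,
# and rounds up so the request is never below the requested amount.
gigabytes_to_mib <- function(gigabytes) {
  stopifnot(is.numeric(gigabytes), all(gigabytes > 0))
  ceiling(gigabytes * 1e9 / 2^20)
}

gigabytes_to_mib(8)
#> [1] 7630
```

Under these assumptions, 8 gigabytes is slightly less than 8 GiB (8192 MiB), which matters if your compute environment has tight memory limits.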

The `describe()` method shows information about current and past revisions of the job definition. Set `active` to `TRUE` to see just the active revisions.


```{r}
definition$describe(active = TRUE)
#> # A tibble: 2 × 16
#> name arn revision status type scheduling_priority parameters
#> <chr> <chr> <int> <chr> <chr> <dbl> <list>
#> 1 YOUR_JOB_DEFIN… arn:… 82 active cont… 3 <list [0]>
#> 2 YOUR_JOB_DEFIN… arn:… 81 active cont… 3 <list [0]>
#> # ℹ 9 more variables: retry_strategy <list>, container_properties <list>,
#> # timeout <list>, node_properties <list>, tags <list>,
#> # propagate_tags <lgl>, platform_capabilities <chr>,
#> # eks_properties <list>, container_orchestration_type <chr>
```

Use `deregister()` to deregister a revision of a job definition. If a revision number is not supplied, then it defaults to the greatest active revision number.

```{r}
definition$deregister()
#> # A tibble: 1 × 16
#> name arn revision status type scheduling_priority parameters
#> <chr> <chr> <int> <chr> <chr> <dbl> <list>
#> 1 YOUR_JOB_DEFIN… arn:… 81 active cont… 3 <list [0]>
#> # ℹ 9 more variables: retry_strategy <list>, container_properties <list>,
#> # timeout <list>, node_properties <list>, tags <list>,
#> # propagate_tags <lgl>, platform_capabilities <chr>,
#> # eks_properties <list>, container_orchestration_type <chr>
```
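
The default rule described above (use the greatest active revision number) can be illustrated without touching AWS. The sketch below builds a small stand-in for the output of `describe()` with only three columns and picks the revision the rule would select. The data frame and the helper `greatest_active_revision()` are hypothetical and exist only for this illustration.

```r
# Hypothetical stand-in for the output of definition$describe(),
# reduced to the columns needed here.
revisions <- data.frame(
  name = rep("YOUR_JOB_DEFINITION_NAME", 3L),
  revision = c(82L, 81L, 80L),
  status = c("active", "active", "inactive"),
  stringsAsFactors = FALSE
)

# Hypothetical helper, not part of crew.aws.batch:
# return the greatest revision number whose status is active,
# or NA if no revision is active.
greatest_active_revision <- function(x) {
  active <- x$revision[tolower(x$status) == "active"]
  if (length(active) < 1L) {
    return(NA_integer_)
  }
  max(active)
}

greatest_active_revision(revisions)
#> [1] 82
```

To target a different revision on a live job definition, you would presumably pass the number explicitly, for example `definition$deregister(revision = 81L)`. The argument name `revision` is an assumption based on the description above, so check the method documentation before relying on it.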

# Monitoring and terminating jobs

With `crew.aws.batch`, your `crew` controller automatically submits jobs to AWS Batch. These jobs may fail or linger for any number of reasons, which could impede work and increase costs. So before you use `crew_controller_aws_batch()`, please learn how to monitor and terminate AWS Batch jobs manually.

`crew_monitor_aws_batch()` defines a "monitor" to help you manually list, inspect, and terminate jobs. You will need to supply a job definition name and a job queue name.

```{r}
monitor <- crew_monitor_aws_batch(
job_definition = "YOUR_JOB_DEFINITION_NAME",
job_queue = "YOUR_JOB_QUEUE_NAME"
)
```

You can submit individual AWS Batch jobs to test your computing environment.
82 changes: 67 additions & 15 deletions README.md
@@ -109,21 +109,17 @@ str(groups$SecurityGroups[[1L]])
#> $ VpcId : chr "vpc-00000"
```

# Job management
# Managing job definitions

With `crew.aws.batch`, your `crew` controller automatically submits jobs
to AWS Batch. These jobs may fail or linger for any number of reasons,
which could impede work and increase costs. So before you use
`crew_controller_aws_batch()`, please learn how to monitor and terminate
AWS Batch jobs manually.

`crew.aws.batch` defines a “monitor” class to help you take control of
jobs and job definitions. Create a monitor object with
`crew_monitor_aws_batch()`. You will need to supply a job definition
name and a job queue name.
Before submitting jobs, AWS Batch requires a job definition to describe
the container image and resource requirements. You can do this through
the AWS web console, the AWS command line interface (CLI), a software
development kit (SDK) like the `paws` R package, or the job definition
class in `crew.aws.batch`. For `crew.aws.batch`, first create a job
definition object.

``` r
monitor <- crew_monitor_aws_batch(
definition <- crew_definition_aws_batch(
job_definition = "YOUR_JOB_DEFINITION_NAME",
job_queue = "YOUR_JOB_QUEUE_NAME"
)
@@ -137,13 +133,69 @@ image can be as simple as a Docker Hub identifier (like
`"alpine:latest"`) or a full URI of an ECR image.[^4]

``` r
monitor$register(
definition$register(
image = "AWS_ACCOUNT_ID.dkr.ecr.AWS_REGION.amazonaws.com/ECR_REPOSITORY_NAME:IMAGE_TAG",
platform_capabilities = "EC2",
memory_units = "gigabytes",
memory = 8,
cpus = 2
)
#> # A tibble: 1 × 3
#> name revision arn
#> <chr> <int> <chr>
#> 1 YOUR_JOB_DEFINITION_NAME 81 arn:aws:batch:us-east-1:CENSORED:jo…
```

The `describe()` method shows information about current and past
revisions of the job definition. Set `active` to `TRUE` to see just the
active revisions.

``` r
definition$describe(active = TRUE)
#> # A tibble: 2 × 16
#> name arn revision status type scheduling_priority parameters
#> <chr> <chr> <int> <chr> <chr> <dbl> <list>
#> 1 YOUR_JOB_DEFIN… arn:… 82 active cont… 3 <list [0]>
#> 2 YOUR_JOB_DEFIN… arn:… 81 active cont… 3 <list [0]>
#> # ℹ 9 more variables: retry_strategy <list>, container_properties <list>,
#> # timeout <list>, node_properties <list>, tags <list>,
#> # propagate_tags <lgl>, platform_capabilities <chr>,
#> # eks_properties <list>, container_orchestration_type <chr>
```

Use `deregister()` to deregister a revision of a job definition. If a
revision number is not supplied, then it defaults to the greatest active
revision number.

``` r
definition$deregister()
#> # A tibble: 1 × 16
#> name arn revision status type scheduling_priority parameters
#> <chr> <chr> <int> <chr> <chr> <dbl> <list>
#> 1 YOUR_JOB_DEFIN… arn:… 81 active cont… 3 <list [0]>
#> # ℹ 9 more variables: retry_strategy <list>, container_properties <list>,
#> # timeout <list>, node_properties <list>, tags <list>,
#> # propagate_tags <lgl>, platform_capabilities <chr>,
#> # eks_properties <list>, container_orchestration_type <chr>
```

# Monitoring and terminating jobs

With `crew.aws.batch`, your `crew` controller automatically submits jobs
to AWS Batch. These jobs may fail or linger for any number of reasons,
which could impede work and increase costs. So before you use
`crew_controller_aws_batch()`, please learn how to monitor and terminate
AWS Batch jobs manually.

`crew_monitor_aws_batch()` defines a “monitor” to help you manually
list, inspect, and terminate jobs. You will need to supply a job
definition name and a job queue name.

``` r
monitor <- crew_monitor_aws_batch(
job_definition = "YOUR_JOB_DEFINITION_NAME",
job_queue = "YOUR_JOB_QUEUE_NAME"
)
```

You can submit individual AWS Batch jobs to test your computing
@@ -337,7 +389,7 @@ citation("crew.aws.batch")
To cite package 'crew.aws.batch' in publications use:

Landau WM (????). _crew.aws.batch: A Crew Launcher Plugin for AWS
Batch_. R package version 0.0.1,
Batch_. R package version 0.0.2.9000,
https://github.com/wlandau/crew.aws.batch,
<https://wlandau.github.io/crew.aws.batch/>.

@@ -346,7 +398,7 @@ A BibTeX entry for LaTeX users is
@Manual{,
title = {crew.aws.batch: A Crew Launcher Plugin for AWS Batch},
author = {William Michael Landau},
note = {R package version 0.0.1,
note = {R package version 0.0.2.9000,
https://github.com/wlandau/crew.aws.batch},
url = {https://wlandau.github.io/crew.aws.batch/},
}
1 change: 1 addition & 0 deletions inst/WORDLIST
@@ -124,6 +124,7 @@ www
ARN
deregister
deregistered
mebibytes
MiB
nolint
7 changes: 7 additions & 0 deletions man/crew_definition_aws_batch.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 1 addition & 2 deletions man/crew_monitor_aws_batch.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
