Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set AWS container properties from container options #2471

Merged
merged 31 commits into from Dec 12, 2021

Conversation

manuelesimi
Copy link
Collaborator

@manuelesimi manuelesimi commented Dec 1, 2021

This contribution addresses the requirements from #2282 and #2413.

With this change, the container properties of a job definition registered by Nextflow in AWS Batch are set starting from the container options defined in the Nextflow task.

Example

Given the following container options:

process sayHelloFirst {

  containerOptions '--env MYVAR2=foo2 --env MYVAR3=foo3 --memory-swap 240000 --memory-swappiness 20 --shm-size 16000000 --ulimit nofile=1280:2560 --ulimit nproc=16:32 --privileged'

}

the job is registered with the following properties on AWS Batch:

properties

Do note that:

  1. this mapping does not include properties already set by other means in Nextflow (like volumes, mount points, cpus, memory, etc.)
  2. not all the docker run options can be mapped into AWS. See AWS doc.

pditommaso and others added 2 commits November 30, 2021 17:52
…tainer properties.

Signed-off-by: Manuele Simi <manuele.simi@gmail.com>
Signed-off-by: Manuele Simi <manuele.simi@gmail.com>
@manuelesimi manuelesimi marked this pull request as draft December 2, 2021 15:21
Signed-off-by: Manuele Simi <manuele.simi@gmail.com>
@manuelesimi
Copy link
Collaborator Author

manuelesimi commented Dec 2, 2021

I just pushed a fix for tmpfs mounts. They are now correctly set in the container properties:

process sayHelloFirst {
  containerOptions '--tmpfs /run:rw,noexec,nosuid,size=64 --tmpfs /app:ro,size=128'
  ...
}

tmpfs mounts

The size must be in MiB.

@manuelesimi manuelesimi marked this pull request as ready for review December 2, 2021 16:34
Copy link
Member

@pditommaso pditommaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Manuale, this is welcome. Actually, I had gave a try some time ago but gave up when I realised container options should be managed at Job Definition level (what a design mistake ..)

My main comment is that this is a good opportunity to manage the containerOption as a map. Look at my commit that I've just pushed

5e9c1cc

@manuelesimi
Copy link
Collaborator Author

Great work (and fast!).
I noticed you pushed your change for the map options to another branch. How should I proceed? Merger yours in this one, wait until you merge yours into master and then merge master back into this, or else?

@pditommaso
Copy link
Member

I would say to merge my branch in your PR, then you can continue with that

@manuelesimi
Copy link
Collaborator Author

@pditommaso There is a problem with your parser. The Map doesn't consider options that can be repeated multiple times (like --env or --tmpfs). Each option should be mapped to a List of values.

@manuelesimi
Copy link
Collaborator Author

manuelesimi commented Dec 2, 2021

Also, '=' cannot be used as separator in the QuoteStringTokenizer. There are options that use '=' in their values. E.g. --env MY_VAR=value.

@pditommaso
Copy link
Member

It's getting complicated. Should repeatable options be supported? my original plan was only to support a subset of them.

@manuelesimi
Copy link
Collaborator Author

manuelesimi commented Dec 2, 2021

It's getting complicated. Should repeatable options be supported? my original plan was only to support a subset of them.

Parsing command line options is never easy. Unfortunately ulimit is one of the repeatable options that I expect users will require.

I made a change that seems to work:
From '--env VAR_FOO -e VAR_FOO2=value2 --env VAR_FOO3=value3 --user nf-user', I get:

list

Would that be OK with you?

What's probably more concerning is that we should support only the format "option value" (which is the one documented in the docker run help), not "option=value". So a value with an equal char is not separated (like in the screenshot above).

@pditommaso
Copy link
Member

Is basically a Map<String,List> ?

@manuelesimi
Copy link
Collaborator Author

Is basically a Map<String,List> ?

yes, it is.

@pditommaso
Copy link
Member

Maybe we should model it as a List of pairs, or even a custom holder class

@manuelesimi
Copy link
Collaborator Author

I'd go for a custom holder where we can add the lookup methods we will need.

@pditommaso
Copy link
Member

👍

@manuelesimi
Copy link
Collaborator Author

Question:

The method Map getContainerOptionsMap() in TaskConfig returns a Map. Hence, we have the following options:

  1. we change the method signature to return our custom holder class. As far as I see, it is used only here so it would have a low impact.
  2. our custom holder extends a Map (but then we need to cast back to the holder's class to access its methods, which I'd really dislike)
  3. we stick to a Map<String,List>

Which one would you favor?

@pditommaso
Copy link
Member

I would say 1.

@manuelesimi
Copy link
Collaborator Author

Also a combination of 1. and 2. would work well.

@manuelesimi
Copy link
Collaborator Author

I would say 1.

Sounds good.

@pditommaso
Copy link
Member

As you wish 😄

…ptions.

Signed-off-by: Manuele Simi <manuele.simi@gmail.com>
@pditommaso
Copy link
Member

Just realised that AwsContainerOptionsMapper was tested via indirectly the TaskHandler. Think it would be better to simplify this testing directly the class ie.

 when:
 def map = CmdLineHelper.parseGnuArgs('--ulimit nofile=1280:2560 --ulimit nproc=16:32')
 def properties = AwsContainerOptionsMapper.createContainerOpts(map) 
 then:
 properties.getUlimits().size() == 2
 properties.getUlimits().get(0).toString() == '{HardLimit: 2560,Name: nofile,SoftLimit: 1280}'
 properties.getUlimits().get(1).toString() == '{HardLimit: 32,Name: nproc,SoftLimit: 16}'

def 'should set env vars'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--env VAR_FOO -e VAR_FOO2=value2 --env VAR_FOO3=value3')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def environment = job.getContainerProperties().getEnvironment()
environment.size() == 3
environment.get(0).toString() == '{Name: VAR_FOO,}'
environment.get(1).toString() == '{Name: VAR_FOO3,Value: value3}'
environment.get(2).toString() == '{Name: VAR_FOO2,Value: value2}'
}
def 'should set ulimits'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--ulimit nofile=1280:2560 --ulimit nproc=16:32')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def properties = job.getContainerProperties()
properties.getUlimits().size() == 2
properties.getUlimits().get(0).toString() == '{HardLimit: 2560,Name: nofile,SoftLimit: 1280}'
properties.getUlimits().get(1).toString() == '{HardLimit: 32,Name: nproc,SoftLimit: 16}'
}
def 'should set privileged, readonly and user'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--privileged --read-only --user nf-user')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
job.getContainerProperties().getPrivileged() == true
job.getContainerProperties().getReadonlyRootFilesystem() == true
job.getContainerProperties().getUser() == 'nf-user'
}
def 'should set tmpfs linux params'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--tmpfs /run:rw,noexec,nosuid,size=64 --tmpfs /app:ro,size=128')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def params = job.getContainerProperties().getLinuxParameters()
params.getTmpfs().size() == 2
params.getTmpfs().get(0).toString() == '{ContainerPath: /run,Size: 64,MountOptions: [rw, noexec, nosuid]}'
params.getTmpfs().get(1).toString() == '{ContainerPath: /app,Size: 128,MountOptions: [ro]}'
}
def 'should set memory linux params'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--memory-swappiness 90 --memory-swap 2048 --shm-size 1024 ')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def params = job.getContainerProperties().getLinuxParameters()
params.getSharedMemorySize() == 1024
params.getMaxSwap() == 2048
params.getSwappiness() == 90
}
def 'should set exactly one param'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--privileged')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def properties = job.getContainerProperties()
properties.getLinuxParameters() == null
properties.getUlimits() == null
properties.getPrivileged() == true
properties.getReadonlyRootFilesystem() == null
properties.getUser() == null
}
def 'should set no params'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def properties = job.getContainerProperties()
properties.getLinuxParameters() == null
properties.getUlimits() == null
properties.getPrivileged() == null
properties.getReadonlyRootFilesystem() == null
properties.getUser() == null
}

@manuelesimi
Copy link
Collaborator Author

manuelesimi commented Dec 10, 2021

Just realised that AwsContainerOptionsMapper was tested via indirectly the TaskHandler. Think it would be better to simplify this testing directly the class ie.

 when:
 def map = CmdLineHelper.parseGnuArgs('--ulimit nofile=1280:2560 --ulimit nproc=16:32')
 def properties = AwsContainerOptionsMapper.createContainerOpts(map) 
 then:
 properties.getUlimits().size() == 2
 properties.getUlimits().get(0).toString() == '{HardLimit: 2560,Name: nofile,SoftLimit: 1280}'
 properties.getUlimits().get(1).toString() == '{HardLimit: 32,Name: nproc,SoftLimit: 16}'

def 'should set env vars'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--env VAR_FOO -e VAR_FOO2=value2 --env VAR_FOO3=value3')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def environment = job.getContainerProperties().getEnvironment()
environment.size() == 3
environment.get(0).toString() == '{Name: VAR_FOO,}'
environment.get(1).toString() == '{Name: VAR_FOO3,Value: value3}'
environment.get(2).toString() == '{Name: VAR_FOO2,Value: value2}'
}
def 'should set ulimits'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--ulimit nofile=1280:2560 --ulimit nproc=16:32')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def properties = job.getContainerProperties()
properties.getUlimits().size() == 2
properties.getUlimits().get(0).toString() == '{HardLimit: 2560,Name: nofile,SoftLimit: 1280}'
properties.getUlimits().get(1).toString() == '{HardLimit: 32,Name: nproc,SoftLimit: 16}'
}
def 'should set privileged, readonly and user'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--privileged --read-only --user nf-user')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
job.getContainerProperties().getPrivileged() == true
job.getContainerProperties().getReadonlyRootFilesystem() == true
job.getContainerProperties().getUser() == 'nf-user'
}
def 'should set tmpfs linux params'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--tmpfs /run:rw,noexec,nosuid,size=64 --tmpfs /app:ro,size=128')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def params = job.getContainerProperties().getLinuxParameters()
params.getTmpfs().size() == 2
params.getTmpfs().get(0).toString() == '{ContainerPath: /run,Size: 64,MountOptions: [rw, noexec, nosuid]}'
params.getTmpfs().get(1).toString() == '{ContainerPath: /app,Size: 128,MountOptions: [ro]}'
}
def 'should set memory linux params'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--memory-swappiness 90 --memory-swap 2048 --shm-size 1024 ')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def params = job.getContainerProperties().getLinuxParameters()
params.getSharedMemorySize() == 1024
params.getMaxSwap() == 2048
params.getSwappiness() == 90
}
def 'should set exactly one param'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '--privileged')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def properties = job.getContainerProperties()
properties.getLinuxParameters() == null
properties.getUlimits() == null
properties.getPrivileged() == true
properties.getReadonlyRootFilesystem() == null
properties.getUser() == null
}
def 'should set no params'() {
given:
def task = Mock(TaskRun)
task.getName() >> 'batch-task'
task.getConfig() >> new TaskConfig(containerOptions: '')
def handler = Spy(AwsBatchTaskHandler)
handler.task >> task
when:
def job = handler.makeJobDefRequest('repo/any_image:latest')
then:
1 * handler.getAwsOptions() >> { new AwsOptions(cliPath: '/bin/aws') }
def properties = job.getContainerProperties()
properties.getLinuxParameters() == null
properties.getUlimits() == null
properties.getPrivileged() == null
properties.getReadonlyRootFilesystem() == null
properties.getUser() == null
}

It is also tested there. When you asked for a test, I thought you wanted a dedicated test class. I agree and will adjust the tests as you suggested.

Signed-off-by: Manuele Simi <manuele.simi@gmail.com>
Signed-off-by: Manuele Simi <manuele.simi@gmail.com>
@manuelesimi
Copy link
Collaborator Author

Just realised that AwsContainerOptionsMapper was tested via indirectly the TaskHandler. Think it would be better to simplify this testing directly the class ie.

 when:
 def map = CmdLineHelper.parseGnuArgs('--ulimit nofile=1280:2560 --ulimit nproc=16:32')
 def properties = AwsContainerOptionsMapper.createContainerOpts(map) 
 then:
 properties.getUlimits().size() == 2
 properties.getUlimits().get(0).toString() == '{HardLimit: 2560,Name: nofile,SoftLimit: 1280}'
 properties.getUlimits().get(1).toString() == '{HardLimit: 32,Name: nproc,SoftLimit: 16}'

It is also tested there. When you asked for a test, I thought you wanted a dedicated test class. I agree and will adjust the tests as you suggested.

done, see d1d9da0

@manuelesimi
Copy link
Collaborator Author

I think I've addressed your latest comments. Pls, let me know if I missed something. Thanks!

@manuelesimi
Copy link
Collaborator Author

Hold on, there is a test failing. I know why. I can fix it later.

Signed-off-by: Manuele Simi <manuele.simi@gmail.com>
@manuelesimi
Copy link
Collaborator Author

manuelesimi commented Dec 11, 2021

I changed two GoogleLifeSciencesTaskHandlerTest tests that use the new config map. I hope I didn't misinterpret them, but basically I'd expect that when the entrypoint is not set in the map, the default entrypoint is returned. See 96c3fd4

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Copy link
Member

@pditommaso pditommaso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, Sunday PR approval! thanks for this, much appreciated!

@pditommaso pditommaso merged commit ac7f1e7 into master Dec 12, 2021
@pditommaso pditommaso deleted the aws/container_properties branch December 12, 2021 17:07
@manuelesimi
Copy link
Collaborator Author

Terrific! This was much needed for our AWS projects.
Thanks for all the help and patience along .

@pditommaso
Copy link
Member

Released along with version 21.12.1-edge

phue pushed a commit to phue/nextflow that referenced this pull request Jan 25, 2022
This commit adds the support for `containerOptions` directive for AWS Batch jobs 

This is useful to fine control the container execution properties using the same
the option used for the Docker container runtime.

Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
Co-authored-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@pditommaso pditommaso added this to the 22.04.0 milestone Feb 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants