
Add support for selecting minimum CPU platform on GCP #1633

Merged: 14 commits, Jun 22, 2020

Conversation

@hnawar (Contributor) commented Jun 16, 2020

This PR adds support for specifying the minimum CPU platform used by a run when using the Google Cloud Life Sciences executor.

@pditommaso (Member)

Thanks for submitting this PR. I was thinking this would make more sense at the process (job) level instead of pipeline-wide. What do you think?

@hnawar (Contributor, Author) commented Jun 17, 2020

I thought about that. There is currently no additional cost for specifying Skylake, and Cascade Lake is only available on N2 instances, so users would need to specify a different machineType altogether.

The setting specifies the minimum CPU platform; even when it is not specified, you are still likely to end up on Skylake.

It was mainly easier for me to implement this at the profile level rather than at the process level, and the benefit of process-level support did not seem worth it.

I'll check with Andrew to get his opinion.

@hnawar (Contributor, Author) commented Jun 18, 2020

I had a quick sanity check with our team, and I think pipeline-level config should be sufficient.
Let me know if you are happy with that.

@pditommaso (Member)

OK, that's reasonable. In that case, the value needs to be read from the Nextflow config file.

See for example here:

final boolean disableBinDir = config.navigate('google.lifeSciences.disableRemoteBinDir',false)
final preemptible = config.navigate("google.lifeSciences.preemptible", false) as boolean
final bootDiskSize = config.navigate('google.lifeSciences.bootDiskSize') as MemoryUnit
final sshDaemon = config.navigate('google.lifeSciences.sshDaemon', false) as boolean
final sshImage = config.navigate('google.lifeSciences.sshImage', DEFAULT_SSH_IMAGE) as String
final copyImage = config.navigate('google.lifeSciences.copyImage', DEFAULT_COPY_IMAGE) as String
final debugMode = config.navigate('google.lifeSciences.debug', System.getenv('NXF_DEBUG'))
final privateAddr = config.navigate('google.lifeSciences.usePrivateAddress') as boolean
final requesterPays = config.navigate('google.enableRequesterPaysBuckets') as boolean
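
Following the same pattern, the new option could be read with one more line; a minimal sketch, using the property name settled on later in this thread:

final cpuPlatform = config.navigate('google.lifeSciences.cpuPlatform') as String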

Also, unit tests should be added for the corresponding changes. It would also be nice to add an entry in the docs for the Google Life Sciences executor.

Tx!

@hnawar (Contributor, Author) commented Jun 18, 2020

Thanks,
I added the cpuPlatform option to the Groovy config.
I ran some small tests after compiling from scratch and verified that it is passed correctly to the API.
This is how it looks in my profile to specify Skylake:
google.lifeSciences.cpuPlatform = 'Intel Skylake'
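
For context, a fuller profile might look like the following; a minimal sketch, where the project, region and bucket names are placeholders:

// sketch of a nextflow.config using the new option; project, region and bucket are placeholders
google {
    project = 'my-project'
    region  = 'europe-west2'
    lifeSciences.cpuPlatform = 'Intel Skylake'
}
process.executor = 'google-lifesciences'
workDir = 'gs://my-bucket/work'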

I've added it to the documentation as an example, since it will be the most common option for most users.

I'll look at how to add unit tests as well, but I could certainly use some help on that.

[Review thread on docs/google.rst: outdated, resolved]
@pditommaso (Member)

Thanks for these changes. For the tests, it could be enough to verify that cpuPlatform is read correctly from the config; for example, look here:

def 'should set requester pays' () {
    when:
    def config = GoogleLifeSciencesConfig.fromSession0([google:[project:'foo', region:'x', lifeSciences: [:]]])
    then:
    config.enableRequesterPaysBuckets == false

    when:
    config = GoogleLifeSciencesConfig.fromSession0([google:[project:'foo', region:'x', enableRequesterPaysBuckets:true]])
    then:
    config.enableRequesterPaysBuckets == true
}
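
An analogous test for the new option could look like this; a minimal sketch, assuming cpuPlatform is exposed as a property on GoogleLifeSciencesConfig:

def 'should set cpu platform' () {
    when:
    def config = GoogleLifeSciencesConfig.fromSession0([google:[project:'foo', region:'x', lifeSciences: [:]]])
    then:
    config.cpuPlatform == null

    when:
    config = GoogleLifeSciencesConfig.fromSession0([google:[project:'foo', region:'x', lifeSciences: [cpuPlatform: 'Intel Skylake']]])
    then:
    config.cpuPlatform == 'Intel Skylake'
}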

and then that it is included in the request; for example, look here:

when:
def req = handler.createPipelineRequest()
then:
task.getConfig() >> new TaskConfig(machineType: 'n1-1234')
and:
req.machineType == 'n1-1234'
req.project == 'my-project'
req.zone == ['my-zone']
req.region == ['my-region']
req.diskName == GoogleLifeSciencesTaskHandler.DEFAULT_DISK_NAME
req.diskSizeGb == null
!req.preemptible
req.taskName == "nf-bad893071e9130b866d43a4fcabb95b6"
req.containerImage == 'my/image'
req.workDir.toUriString() == 'gs://my-bucket/work/dir'
req.sharedMount.getPath() == '/work/dir'
req.sharedMount.getDisk() == GoogleLifeSciencesTaskHandler.DEFAULT_DISK_NAME
!req.sharedMount.getReadOnly()
req.bootDiskSizeGb == null
req.entryPoint == GoogleLifeSciencesConfig.DEFAULT_ENTRY_POINT
!req.usePrivateAddress
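
For the new option, a matching assertion could be added alongside the existing ones; a sketch, assuming the submit request gains a cpuPlatform field populated from the pipeline-level config:

and:
// assumes the stubbed executor config sets cpuPlatform = 'Intel Skylake'
req.cpuPlatform == 'Intel Skylake'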

and

def 'should configure resources correctly'() {
    given:
    def SCOPES = ["https://www.googleapis.com/auth/cloud-platform"]
    def type = "testType"
    def zone = ["testZone1","testZone2"]
    def region = ["testRegion1","testRegion2"]
    def diskName = "testDisk"
    def preEmptible = true
    def acc = new AcceleratorResource(request: 4, type: 'nvidia-tesla-k80')
    def helper = new GoogleLifeSciencesHelper()

    when:
    def resources1 = helper.createResources(new GoogleLifeSciencesSubmitRequest(
            machineType:type,
            zone:zone,
            diskName: diskName,
            diskSizeGb: 100,
            preemptible: true))
    then:
    with(resources1) {
        getVirtualMachine().getMachineType() == type
        getZones() == zone
        getRegions() == null
        getVirtualMachine().getDisks().get(0).getName() == diskName
        getVirtualMachine().getDisks().get(0).getSizeGb() == 100
        getVirtualMachine().getServiceAccount().getScopes() == SCOPES
        getVirtualMachine().getPreemptible() == preEmptible
        !getVirtualMachine().getAccelerators()
        !getVirtualMachine().getNetwork()?.getUsePrivateAddress()
    }

    when:
    def resources2 = helper.createResources(new GoogleLifeSciencesSubmitRequest(
            machineType:type,
            region:region,
            diskName:diskName,
            diskSizeGb: 200,
            preemptible: true))
    then:
    with(resources2) {
        getVirtualMachine().getMachineType() == type
        getZones() == null
        getRegions() == region
        getVirtualMachine().getDisks().get(0).getName() == diskName
        getVirtualMachine().getDisks().get(0).getSizeGb() == 200
        getVirtualMachine().getServiceAccount().getScopes() == SCOPES
        getVirtualMachine().getPreemptible() == preEmptible
        !getVirtualMachine().getAccelerators()
        !getVirtualMachine().getNetwork()?.getUsePrivateAddress()
    }

    when:
    def resources3 = helper.createResources( new GoogleLifeSciencesSubmitRequest(
            machineType:type,
            zone:zone,
            diskName:diskName,
            preemptible: false,
            accelerator: acc,
            bootDiskSizeGb: 75,
            usePrivateAddress: true ))
    then:
    with(resources3) {
        getVirtualMachine().getMachineType() == type
        getZones() == zone
        getVirtualMachine().getDisks().get(0).getName() == diskName
        getVirtualMachine().getServiceAccount().getScopes() == SCOPES
        !getVirtualMachine().getDisks().get(0).getSizeGb()
        !getVirtualMachine().getPreemptible()
        getVirtualMachine().getAccelerators().size()==1
        getVirtualMachine().getAccelerators()[0].getCount()==4
        getVirtualMachine().getAccelerators()[0].getType()=='nvidia-tesla-k80'
        getVirtualMachine().getBootDiskSizeGb() == 75
        getVirtualMachine().getNetwork().getUsePrivateAddress()
    }
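
A fourth case could exercise the new field in the same style; a sketch, assuming GoogleLifeSciencesSubmitRequest carries a cpuPlatform field and the Life Sciences VirtualMachine model exposes it via getCpuPlatform():

    when:
    def resources4 = helper.createResources(new GoogleLifeSciencesSubmitRequest(
            machineType: type,
            zone: zone,
            diskName: diskName,
            cpuPlatform: 'Intel Skylake'))
    then:
    with(resources4) {
        // the minimum CPU platform should be propagated to the VM spec
        getVirtualMachine().getCpuPlatform() == 'Intel Skylake'
    }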

Also, please make sure to sign off the commits to fulfill the DCO bot requirement. Thanks!

hnawar added 13 commits June 20, 2020 20:24
Adding cpuPlatform parameter to the request

Signed-off-by: hnawar <hnawar@google.com>
Add support for CPU Platform selection

Signed-off-by: hnawar <hnawar@google.com>
Add support for CPU platform

Signed-off-by: hnawar <hnawar@google.com>
Adding support for CPU Platform selection

Signed-off-by: hnawar <hnawar@google.com>
Signed-off-by: hnawar <hnawar@google.com>
Signed-off-by: hnawar <hnawar@google.com>
Add the google.lifeSciences.cpuPlatform to the documentation

Signed-off-by: hnawar <hnawar@google.com>
Add quotes to the example of google.lifeSciences.cpuPlatform

Signed-off-by: hnawar <hnawar@google.com>
Add link to min CPU platform documentation

Signed-off-by: hnawar <hnawar@google.com>
Add unit test for cpuPlatform in config

Signed-off-by: hnawar <hnawar@google.com>
Signed-off-by: hnawar <hnawar@google.com>
Signed-off-by: hnawar <hnawar@google.com>
Fix Accidental deletion

Signed-off-by: hnawar <hnawar@google.com>
Signed-off-by: hnawar <hnawar@google.com>
@hnawar (Contributor, Author) commented Jun 20, 2020

I have added the unit tests and signed the commits for the DCO, and the Travis build passed.

@hnawar (Contributor, Author) commented Jun 21, 2020

I have also staged changes to add support for specifying the disk type per process. I can push these changes as well and put everything in one PR if that is more convenient, or submit a separate pull request.

@pditommaso (Member) left a comment

Thanks a lot for this contribution. Merging it.

@pditommaso merged commit afc4375 into nextflow-io:master on Jun 22, 2020
@pditommaso (Member)

As for specifying the disk type per process, feel free to draft a separate PR for it. However, I would like to keep a portable model for this, i.e. one that can also support other computing services.

@pditommaso (Member)

Any suggestions on how to retrieve the list of available CPU platforms for a given zone using the Java API?

Can't find an immediate way using the com.google.api.services.compute.Compute client.
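
One possible route is the Compute Engine zones.get call, whose Zone resource includes an availableCpuPlatforms list; a minimal sketch, assuming an already initialised Compute client and placeholder project/zone names:

// 'compute' is an initialised com.google.api.services.compute.Compute client;
// the project and zone below are placeholders
def zoneInfo = compute.zones().get('my-project', 'europe-west2-b').execute()
def platforms = zoneInfo.getAvailableCpuPlatforms()   // list of platform names, or null
println platforms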
