-
Notifications
You must be signed in to change notification settings - Fork 758
Azure Batch worker pool supports managed identity #5670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Azure Batch worker pool supports managed identity #5670
Conversation
This feature adds an option to tell Azure Batch the worker pool has a managed identity attached. Currently, the only feature is it passes the managed identity client id to the task as an environment variable but this could be extended to support file staging in or out. Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
✅ Deploy Preview for nextflow-docs-staging ready!
To edit notification comments on pull requests, go to your Netlify site configuration. |
No longer requires SAS tokens for file staging, it can use the managed identity to authenticate. Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
|
Hello, |
5a93547 to
27345a6
Compare
Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
| // This is a bad solution and breaks Fusion for everyone | ||
| if (!(pool.opts.managedIdentityId && it.key == "AZURE_STORAGE_SAS_TOKEN")) { | ||
| opts += "-e $it.key=$it.value " | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This breaks Fusion for everyone who isn't using managed identities. Ideally we take the pool.opts when creating the launcher but I've ran out of time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the rationale? remove AZURE_STORAGE_SAS_TOKEN when pool.opts.managedIdentityId is provided?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We were having authentication problems with Fusion using the SAS token instead of the identity, they might be resolved now.
Ideally, we would not generate a SAS token if we already have a managed identity to reduce our security surface further.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bentsherman and anyone who can help, I've spent ages trying to work out how to get the poolOpts into AzFusionEnv to disable creating a SAS token here, but I can't see a good way of doing it without doing a full Azure specific implementation of TaskBean, FusionScriptLauncher, etc. Is there a better way?
I wonder if I can create a class that is task + pool, kinda like the AzJobKey then use that in the Fusion env builder.
Anyway, all of this to say a proper solution for Fusion would require a bit of a refactor top-to-bottom and I don't know how far I should go with this before punting off to actual good developers 😆
Using azcopy is fine, it's basically ready to go.
| env.put('AZCOPY_AUTO_LOGIN_TYPE', 'MSI') // azcopy | ||
| env.put('AZCOPY_MSI_CLIENT_ID', pool.opts.managedIdentityId) // azcopy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These looks unrelated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what makes azcopy work.
The first iteration of this PR did not include Fusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should not be affected
| // This is a bad solution and breaks Fusion for everyone | ||
| if (!(pool.opts.managedIdentityId && it.key == "AZURE_STORAGE_SAS_TOKEN")) { | ||
| opts += "-e $it.key=$it.value " | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the rationale? remove AZURE_STORAGE_SAS_TOKEN when pool.opts.managedIdentityId is provided?
Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
Builds out tests for new fusion functionality for Azure Batch including managed identity support. Required a refactor of the AzBatchService to use the new createBatchTaskContent method for more precise testing but that should be generally helpful. Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
…ker and easier to write Signed-off-by: adamrtalbot <12817534+adamrtalbot@users.noreply.github.com>
|
Superseded by #6118 |
|
Closing in favour of #6118 |
This feature adds an option to tell Azure Batch the worker pool has a managed identity attached.
azcopy needs to be updated, since the current version isn't recent enough.
When using this feature, you can tell Nextflow that a node pool has a managed identity attached:
azure { storage { accountName = 'seqeralabs' } batch { location = 'eastus' accountName = 'seqeralabs' copyToolInstallMode = "off" pools { "hello-world-entra-mi" { vmType = "standard_e2ds_v5" managedIdentityId = "myManagedIdentityId" } } } }it will then use the native azcopy managed identity authentication to download and upload files.
I haven't handled all the old code yet, so Nextflow will generate a SAS which it only uses sometimes.
Importantly, this 'fixes' #5669, because it doesn't need a SAS token to access the files (tested).
To use:
bash -c "tar -xzvf azcopy.tar.gz && chmod +x azcopy*/azcopy && mkdir -p $AZ_BATCH_NODE_SHARED_DIR/bin/ && cp azcopy*/azcopy $AZ_BATCH_NODE_SHARED_DIR/bin/"azcopy.tar.gzmy-pool:To do: