AWS client settings aren't used in AWS batch instances. #2671

Closed

onuryukselen opened this issue Feb 23, 2022 · 5 comments

onuryukselen commented Feb 23, 2022

Bug report

Expected behavior and actual behavior

I have set s3Acl = "BucketOwnerFullControl" as an AWS client attribute in my nextflow.config file. The .command.sh file is transferred to S3 as expected; however, the Batch instance gets the following error:

Command exit status:
  -

Command output:
  (empty)

Command error:
  upload failed: - to s3://shared/run116/work/ff/d4b0123414871deb148c9143a4b6f5/.command.begin An error occurred (AccessDenied) when calling the PutObject operation: Access Denied
  upload failed: - to s3://shared/run116/work/ff/d4b0123414871deb148c9143a4b6f5/.exitcode An error occurred (AccessDenied) when calling the PutObject operation: Access Denied
  upload failed: ./.command.log to s3://shared/run116/work/ff/d4b0123414871deb148c9143a4b6f5/.command.log An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

This error happens when I don't use --acl "bucket-owner-full-control" in my aws CLI commands, so I suspect the specified AWS client attribute isn't applied inside the AWS Batch instance: when I log into that Batch instance and run the upload command with the aws CLI myself, it works as expected.
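
For reference, this is the kind of manual check I mean (a sketch: the S3 path is copied from the log above, and the flag value is the one named in this report):

# Run from inside the Batch instance; succeeds only when the ACL flag is present.
aws s3 cp ./.command.log s3://shared/run116/work/ff/d4b0123414871deb148c9143a4b6f5/.command.log --acl bucket-owner-full-control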

Steps to reproduce the problem

This is my config file:

process.executor = 'awsbatch'
process.container = '7058936963.dkr.ecr.us-east-1.amazonaws.com/test:1.0'
docker.enabled = true
aws {
    client {
        s3Acl = "BucketOwnerFullControl"
    }
}

Program output

This is my .nextflow.log file:

Feb-23 21:08:57.692 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 21.12.1-edge
Feb-23 21:08:58.400 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins default=[nf-amazon@1.4.0]
Feb-23 21:08:58.412 [main] INFO  org.pf4j.DefaultPluginStatusProvider - Enabled plugins: []
Feb-23 21:08:58.413 [main] INFO  org.pf4j.DefaultPluginStatusProvider - Disabled plugins: []
Feb-23 21:08:58.416 [main] INFO  org.pf4j.DefaultPluginManager - PF4J version 3.4.1 in 'deployment' mode
Feb-23 21:08:58.426 [main] INFO  org.pf4j.AbstractPluginManager - No plugins
Feb-23 21:08:58.426 [main] DEBUG nextflow.plugin.PluginUpdater - Installing plugin nf-amazon version: 1.4.0
Feb-23 21:08:58.432 [main] INFO  org.pf4j.AbstractPluginManager - Plugin 'nf-amazon@1.4.0' resolved
Feb-23 21:08:58.432 [main] INFO  org.pf4j.AbstractPluginManager - Start plugin 'nf-amazon@1.4.0'
Feb-23 21:08:58.449 [main] DEBUG nextflow.plugin.BasePlugin - Plugin started nf-amazon@1.4.0
Feb-23 21:08:58.460 [main] DEBUG nextflow.file.FileHelper - > Added 'S3FileSystemProvider' to list of installed providers [s3]
Feb-23 21:08:58.498 [main] DEBUG nextflow.Session - Session uuid: 311489f0-02c8-4472-9535-4680b32d287d
Feb-23 21:08:58.499 [main] DEBUG nextflow.Session - Run name: marvelous_knuth
Feb-23 21:08:58.499 [main] DEBUG nextflow.Session - Executor pool size: 32
Feb-23 21:08:58.508 [main] DEBUG nextflow.file.FileHelper - Creating a file system instance for provider: S3FileSystemProvider
Feb-23 21:08:58.516 [main] DEBUG nextflow.file.FileHelper - AWS S3 config details: {s3Acl=BucketOwnerFullControl, max_error_retry=5}

Feb-23 21:08:58.855 [main] DEBUG com.upplication.s3fs.AmazonS3Client - Setting S3 canned ACL=bucket-owner-full-control [BucketOwnerFullControl]
Feb-23 21:08:58.855 [main] DEBUG c.u.s3fs.S3FileSystemProvider - Using S3 serial downloader
Feb-23 21:08:58.888 [main] DEBUG nextflow.cli.CmdRun - 
  Version: 21.12.1-edge build 5653
  Created: 22-12-2021 07:53 UTC 
  System: Linux 5.4.0-1045-aws
  Runtime: Groovy 3.0.9 on OpenJDK 64-Bit Server VM 11.0.13+8-Ubuntu-0ubuntu1.18.04
  Encoding: UTF-8 (UTF-8)
  Process: 4891@ip-10-220-27-79 [10.220.27.79]
  CPUs: 32 - Mem: 124.4 GB (29.2 GB) - Swap: 0 (0)
Feb-23 21:08:59.244 [main] WARN  com.amazonaws.util.Base64 - JAXB is unavailable. Will fallback to SDK implementation which may be less performant.If you are using Java 9+, you will need to include javax.xml.bind:jaxb-api as a dependency.
Feb-23 21:08:59.253 [main] DEBUG nextflow.file.FileHelper - Can't check if specified path is NFS (1): /s3://shared/run116/work

Feb-23 21:08:59.253 [main] DEBUG nextflow.Session - Work-dir: s3://shared/run116/work [null]
Feb-23 21:08:59.254 [main] DEBUG nextflow.Session - Bucket-dir: s3://shared/run116/work
Feb-23 21:08:59.254 [main] DEBUG nextflow.Session - Script base path does not exist or is not a directory: /export/run116/bin
Feb-23 21:08:59.281 [main] DEBUG nextflow.executor.ExecutorFactory - Extension executors providers=[AwsBatchExecutor]
Feb-23 21:08:59.290 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory
Feb-23 21:08:59.327 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 33; maxThreads: 1000
Feb-23 21:08:59.378 [main] DEBUG nextflow.Session - Session start invoked
Feb-23 21:09:01.226 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
Feb-23 21:09:01.316 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: awsbatch
Feb-23 21:09:01.316 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'awsbatch'
Feb-23 21:09:01.317 [main] DEBUG nextflow.executor.Executor - [warm up] executor > awsbatch
Feb-23 21:09:01.328 [main] DEBUG nextflow.util.ThrottlingExecutor - Creating throttling executor with opts: nextflow.util.ThrottlingExecutor$Options(poolName:AWSBatch-executor, limiter:RateLimiter[stableRate=50.0qps], poolSize:160, maxPoolSize:160, queueSize:5000, maxRetries:10, keepAlive:1m, autoThrottle:true, errorBurstDelay:1s, rampUpInterval:100, rampUpFactor:1.2, rampUpMaxRate:1.7976931348623157E308, backOffFactor:2.0, backOffMinRate:0.0166666667, retryDelay:1s)
Feb-23 21:09:01.333 [main] DEBUG nextflow.util.ThrottlingExecutor - Creating throttling executor with opts: nextflow.util.ThrottlingExecutor$Options(poolName:AWSBatch-reaper, limiter:RateLimiter[stableRate=50.0qps], poolSize:160, maxPoolSize:160, queueSize:5000, maxRetries:10, keepAlive:1m, autoThrottle:true, errorBurstDelay:1s, rampUpInterval:100, rampUpFactor:1.2, rampUpMaxRate:1.7976931348623157E308, backOffFactor:2.0, backOffMinRate:0.0166666667, retryDelay:1s)
Feb-23 21:09:01.334 [main] DEBUG n.cloud.aws.batch.AwsBatchExecutor - Creating parallel monitor for executor 'awsbatch' > pollInterval=10s; dumpInterval=5m
Feb-23 21:09:01.392 [main] DEBUG n.cloud.aws.batch.AwsBatchExecutor - [AWS BATCH] Executor options=AwsOptions(cliPath:null, storageClass:null, storageEncryption:null, remoteBinDir:null, region:null, maxParallelTransfers:4, maxTransferAttempts:5, delayBetweenAttempts:10s, retryMode:standard, fetchInstanceType:false, jobRole:null, volumes:[], awsCli:aws)
Feb-23 21:09:01.404 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 33; maxThreads: 1000
Feb-23 21:09:01.490 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: awsbatch
Feb-23 21:09:01.490 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'awsbatch'
Feb-23 21:09:01.497 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: awsbatch
Feb-23 21:09:01.497 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'awsbatch'
Feb-23 21:09:01.509 [main] DEBUG nextflow.executor.ExecutorFactory - << taskConfig executor: awsbatch
Feb-23 21:09:01.509 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'awsbatch'
Feb-23 21:09:01.596 [main] DEBUG nextflow.executor.ExecutorFactory - >> processorType: 'awsbatch'
Feb-23 21:09:01.600 [main] DEBUG nextflow.script.ScriptRunner - > Await termination 
Feb-23 21:09:01.600 [main] DEBUG nextflow.Session - Session await
Feb-23 21:09:01.706 [Actor Thread 5] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'FileTransfer' minSize=4; maxSize=4; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false
Feb-23 21:09:02.579 [AWSBatch-executor-1] DEBUG n.c.aws.batch.AwsBatchTaskHandler - [AWS BATCH] submitted > job=485e83f0-0c9d-43d6-9de3-4416342084e3; work-dir=s3://shared/run116/work/12/c13c3d70ca7da3b0589919c4050afb
Feb-23 21:09:02.579 [AWSBatch-executor-1] INFO  nextflow.Session - [12/c13c3d] Submitted process > test
Feb-23 21:12:31.484 [Task monitor] DEBUG n.c.aws.batch.AwsBatchTaskHandler - [AWS BATCH] Cannot read exitstatus for task: `test` | /s3://shared/run116/work/12/c13c3d70ca7da3b0589919c4050afb/.exitcode
Feb-23 21:13:21.680 [Task monitor] DEBUG nextflow.processor.TaskRun - Unable to dump output of process 'null' -- Cause: java.nio.file.NoSuchFileException: /tmp/temp-s3-9280743837789757731/.command.out
Feb-23 21:13:21.682 [Task monitor] ERROR nextflow.processor.TaskProcessor - Error executing process > 'test'

Caused by:
  Essential container in task exited

Command executed:

  #!/usr/bin/env python 
...
Command exit status:
  -

Command output:
  (empty)

Command error:
  upload failed: - to s3://shared/run116/work/ff/d4b0123414871deb148c9143a4b6f5/.command.begin An error occurred (AccessDenied) when calling the PutObject operation: Access Denied
  upload failed: - to s3://shared/run116/work/ff/d4b0123414871deb148c9143a4b6f5/.exitcode An error occurred (AccessDenied) when calling the PutObject operation: Access Denied
  upload failed: ./.command.log to s3://shared/run116/work/ff/d4b0123414871deb148c9143a4b6f5/.command.log An error occurred (AccessDenied) when calling the PutObject operation: Access Denied

Work dir:
  s3://shared/run116/work/ff/d4b0123414871deb148c9143a4b6f5

Tip: when you have fixed the problem you can continue the execution adding the option `-resume` to the run command line

Environment

  • Nextflow version: 21.12.1-edge
  • Java version: openjdk version "11.0.13" 2021-10-19
  • Operating system: Linux
  • Bash version: GNU bash, version 4.4.20(1)-release (x86_64-pc-linux-gnu)
@onuryukselen onuryukselen changed the title AWS client settings don't use in AWS batch instances. AWS client settings aren't used in AWS batch instances. Feb 25, 2022

MrMarkW commented Apr 11, 2022

I'm having a similar issue: uploads fail because storageEncryption = 'AES256' isn't applied to all AWS Batch jobs. The setting doesn't seem to apply across the whole life cycle of the jobs Nextflow runs in AWS Batch. I have deny policies on my buckets that reject any S3 request that doesn't send the encryption header.

aws {
   client {
      storageEncryption = 'AES256'
   }
...
}
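
To illustrate what such a deny policy expects (a hedged sketch, not from this thread; the bucket name and key are hypothetical), an upload only passes when the server-side-encryption header is sent, which the aws CLI does via the --sse flag:

# Hypothetical example: passes a bucket policy that denies PutObject
# requests lacking the x-amz-server-side-encryption header.
aws s3 cp ./.command.log s3://my-encrypted-bucket/work/.command.log --sse AES256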


bolah commented Apr 26, 2022

I can confirm @MrMarkW's report; we have the same issue. Our workaround was to create a wrapper around the aws CLI and add the required parameter to the S3 copy/upload commands.
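
A minimal sketch of such a wrapper (the script path, real-binary path, and flag are illustrative, not from this thread). Installed ahead of the real binary and referenced via the aws.batch.cliPath setting, it appends the needed flag to every s3 cp call:

#!/bin/bash
# Hypothetical wrapper around the aws CLI, e.g. saved as /opt/bin/aws
# and referenced with aws.batch.cliPath = '/opt/bin/aws' in nextflow.config.
REAL_AWS=/usr/local/bin/aws   # adjust to where the real binary lives
if [ "$1" = "s3" ] && [ "$2" = "cp" ]; then
    # append whatever flag the bucket requires, e.g. an ACL or --sse AES256
    exec "$REAL_AWS" "$@" --acl bucket-owner-full-control
fi
exec "$REAL_AWS" "$@"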


stale bot commented Dec 21, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Dec 21, 2022
pditommaso added a commit that referenced this issue Dec 29, 2022
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
pditommaso added a commit that referenced this issue Dec 29, 2022
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
@pditommaso pditommaso added this to the 23.04.0 milestone Dec 29, 2022
@stale stale bot removed the stale label Dec 29, 2022
pditommaso (Member) commented

Thanks for reporting the problem with the ACL. A patch has been uploaded: a964491.

Other comments seem unrelated and should be reported as a separate issue.

abhi18av pushed a commit to abhi18av/nextflow that referenced this issue Jan 10, 2023
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
l-modolo pushed a commit to l-modolo/nextflow that referenced this issue Jan 25, 2023
Signed-off-by: Paolo Di Tommaso <paolo.ditommaso@gmail.com>
VivekTodur commented

Hello @pditommaso, I'm using the latest development build (version 23.03.0-edge build 5851). Even with

aws {
    client {
        s3Acl = 'BucketOwnerFullControl'
    }
}

I'm still getting the "Access Denied" error. Are there any additional configurations required?
