This repository has been archived by the owner on May 6, 2024. It is now read-only.

Do not use nextflow readLine() since it downloads files on cluster head nodes #61

Closed
harper357 opened this issue Oct 5, 2022 · 3 comments
Labels
bug Something isn't working

Comments


harper357 commented Oct 5, 2022

Description of the bug

I'm getting an odd error when trying to run 40 samples through the pipeline on AWS Batch.

Everything proceeds normally until the MZMLINDEXING step, when the head node crashes with the Java error “Failed to acquire stream chunk”.

The log where the error happens:

| 2022-10-05T11:20:19.906-07:00 | [51/a3778a] Submitted process > NFCORE_QUANTMS:QUANTMS:FILE_PREPARATION:MZMLINDEXING (file_32)
  | 2022-10-05T11:20:23.240-07:00 | Failed to acquire stream chunk
  | 2022-10-05T11:20:23.240-07:00 | -- Check script '/root/.nextflow/assets/[users_name]/nf-quantms/./workflows/../subworkflows/local/file_preparation.nf' at line: 32 or see '.nextflow.log' file for more details
  | 2022-10-05T11:20:23.263-07:00 | -[nf-core/quantms] Pipeline completed with errors-
  | 2022-10-05T11:20:23.267-07:00 | WARN: Killing running tasks (39)
  | 2022-10-05T11:20:23.469-07:00 | WARN: Unable to get file attributes file: s3://[users_bucket]/_nextflow/runs/39/df0b79873b070c19eddd53c33b8288/versions.yml -- Cause: com.amazonaws.SdkClientException: Failed to sanitize XML document destined for handler class com.amazonaws.services.s3.model.transform.XmlResponsesSaxParser$ListBucketHandler
  | 2022-10-05T11:20:27.115-07:00 | Failed to acquire stream chunk
  | 2022-10-05T11:20:39.624-07:00 | === Running Cleanup ===

The code referenced:

    nonIndexedMzML: file(it[1]).withReader {
        f = it;
        1.upto(5) {
            if (f.readLine().contains("indexedmzML")) return false;
        }
        return true;
    }

The head node seems to crash at a different file number if I change the amount of memory I assign to the head node. Is all the data passing through the head node somewhere? I've never had this problem with any of my NGS pipelines, even though they use more and larger files, so I am a little confused by this crash.
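For reference, the intent of the check quoted above (inspect only the first five lines for the `indexedmzML` marker) can be sketched in plain Java. This is only an illustration under stated assumptions: the class name `MzmlHeaderCheck` and the method `isNonIndexed` are hypothetical and not part of quantms or Nextflow. A plain loop is used because in Groovy a `return` inside the `1.upto(5) { ... }` closure leaves only that inner closure, and because `readLine()` returns `null` once a short file is exhausted. The sketch does not by itself avoid the head-node download: the stack trace further down shows the S3 stream allocating 10 MiB chunk buffers as soon as it is read, however few lines are requested.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of quantms or Nextflow.
final class MzmlHeaderCheck {

    /** Returns true when none of the first maxLines lines contains "indexedmzML". */
    static boolean isNonIndexed(Reader source, int maxLines) {
        try (BufferedReader reader = new BufferedReader(source)) {
            for (int i = 0; i < maxLines; i++) {
                String line = reader.readLine();
                if (line == null) {
                    break; // file has fewer than maxLines lines
                }
                if (line.contains("indexedmzML")) {
                    return false; // marker found: the file is already indexed
                }
            }
            return true;
        } catch (IOException e) {
            // Rethrow unchecked so callers do not need a throws clause.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String indexed = "<?xml version=\"1.0\"?>\n<indexedmzML>\n<mzML>\n";
        String plain = "<?xml version=\"1.0\"?>\n<mzML>\n";
        System.out.println(isNonIndexed(new StringReader(indexed), 5)); // false
        System.out.println(isNonIndexed(new StringReader(plain), 5));   // true
    }
}
```

The in-memory strings in `main` stand in for real mzML headers; in a pipeline the same logic would more safely run inside a task on a worker node than in a channel operator on the head node.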

Command used and terminal output

No response

Relevant files

The log file where it crashes (it is dated differently, but this is the same error that always appears):

Oct-10 23:03:07.744 [Actor Thread 15] ERROR nextflow.extension.DataflowHelper - @unknown
java.io.IOException: Failed to acquire stream chunk
	at com.upplication.s3fs.ng.FutureInputStream.nextBuffer(FutureInputStream.java:78)
	at com.upplication.s3fs.ng.FutureInputStream.read(FutureInputStream.java:63)
	at java.base/sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:270)
	at java.base/sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:313)
	at java.base/sun.nio.cs.StreamDecoder.read(StreamDecoder.java:188)
	at java.base/java.io.InputStreamReader.read(InputStreamReader.java:177)
	at java.base/java.io.BufferedReader.fill(BufferedReader.java:162)
	at java.base/java.io.BufferedReader.readLine(BufferedReader.java:329)
	at java.base/java.io.BufferedReader.readLine(BufferedReader.java:396)
	at java_io_BufferedReader$readLine.call(Unknown Source)
	at Script_d4bc0d6a$_runScript_closure1$_closure2$_closure5$_closure8$_closure9$_closure11.doCall(Script_d4bc0d6a:32)
	at jdk.internal.reflect.GeneratedMethodAccessor276.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
	at groovy.lang.Closure.call(Closure.java:412)
	at groovy.lang.Closure.call(Closure.java:428)
	at org.codehaus.groovy.runtime.DefaultGroovyMethods.upto(DefaultGroovyMethods.java:16406)
	at org.codehaus.groovy.runtime.dgm$875.doMethodInvoke(Unknown Source)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1268)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
	at groovy.runtime.metaclass.NumberDelegatingMetaClass.invokeMethod(NumberDelegatingMetaClass.java:60)
	at org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:44)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:148)
	at Script_d4bc0d6a$_runScript_closure1$_closure2$_closure5$_closure8$_closure9.doCall(Script_d4bc0d6a:31)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
	at groovy.lang.Closure.call(Closure.java:412)
	at groovy.lang.Closure.call(Closure.java:428)
	at org.codehaus.groovy.runtime.IOGroovyMethods.withReader(IOGroovyMethods.java:1160)
	at org.apache.groovy.nio.extensions.NioExtensions.withReader(NioExtensions.java:1434)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.runtime.metaclass.ReflectionMetaMethod.invoke(ReflectionMetaMethod.java:54)
	at org.codehaus.groovy.runtime.metaclass.NewInstanceMetaMethod.invoke(NewInstanceMetaMethod.java:54)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1268)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
	at groovy.runtime.metaclass.NextflowDelegatingMetaClass.invokeMethod(NextflowDelegatingMetaClass.java:66)
	at org.codehaus.groovy.runtime.callsite.PojoMetaClassSite.call(PojoMetaClassSite.java:44)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:139)
	at Script_d4bc0d6a$_runScript_closure1$_closure2$_closure5$_closure8.doCall(Script_d4bc0d6a:30)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
	at groovy.lang.Closure.call(Closure.java:412)
	at groovy.lang.Closure.call(Closure.java:428)
	at nextflow.extension.BranchOp.doNext(BranchOp.groovy:55)
	at jdk.internal.reflect.GeneratedMethodAccessor258.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1268)
	at groovy.lang.MetaClassImpl.invokeMethodClosure(MetaClassImpl.java:1048)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1142)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
	at groovy.lang.Closure.call(Closure.java:412)
	at groovy.lang.Closure.call(Closure.java:428)
	at groovy.lang.Closure$call.call(Unknown Source)
	at nextflow.extension.DataflowHelper$_subscribeImpl_closure2.doCall(DataflowHelper.groovy:285)
	at jdk.internal.reflect.GeneratedMethodAccessor202.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.reflection.CachedMethod.invoke(CachedMethod.java:107)
	at groovy.lang.MetaMethod.doMethodInvoke(MetaMethod.java:323)
	at org.codehaus.groovy.runtime.metaclass.ClosureMetaClass.invokeMethod(ClosureMetaClass.java:274)
	at groovy.lang.MetaClassImpl.invokeMethod(MetaClassImpl.java:1035)
	at groovy.lang.Closure.call(Closure.java:412)
	at groovyx.gpars.dataflow.operator.DataflowOperatorActor.startTask(DataflowOperatorActor.java:120)
	at groovyx.gpars.dataflow.operator.DataflowOperatorActor.onMessage(DataflowOperatorActor.java:108)
	at groovyx.gpars.actor.impl.SDAClosure$1.call(SDAClosure.java:43)
	at groovyx.gpars.actor.AbstractLoopingActor.runEnhancedWithoutRepliesOnMessages(AbstractLoopingActor.java:293)
	at groovyx.gpars.actor.AbstractLoopingActor.access$400(AbstractLoopingActor.java:30)
	at groovyx.gpars.actor.AbstractLoopingActor$1.handleMessage(AbstractLoopingActor.java:93)
	at groovyx.gpars.util.AsyncMessagingCore.run(AsyncMessagingCore.java:132)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: Cannot reserve 10485760 bytes of direct buffer memory (allocated: 1070363393, limit: 1073741824)
	at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
	at com.upplication.s3fs.ng.FutureInputStream.nextBuffer(FutureInputStream.java:75)
	... 94 common frames omitted
Caused by: java.lang.OutOfMemoryError: Cannot reserve 10485760 bytes of direct buffer memory (allocated: 1070363393, limit: 1073741824)
	at java.base/java.nio.Bits.reserveMemory(Bits.java:178)
	at java.base/java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:121)
	at java.base/java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:332)
	at com.upplication.s3fs.ng.ChunkBuffer.<init>(ChunkBuffer.java:41)
	at com.upplication.s3fs.ng.ChunkBufferFactory.create(ChunkBufferFactory.java:65)
	at com.upplication.s3fs.ng.S3ParallelDownload.doDownload(S3ParallelDownload.java:136)
	at com.upplication.s3fs.ng.S3ParallelDownload.lambda$safeDownload$1(S3ParallelDownload.java:127)
	at dev.failsafe.Functions.lambda$toCtxSupplier$11(Functions.java:236)
	at dev.failsafe.Functions.lambda$get$0(Functions.java:46)
	at dev.failsafe.internal.RetryPolicyExecutor.lambda$apply$0(RetryPolicyExecutor.java:75)
	at dev.failsafe.SyncExecutionImpl.executeSync(SyncExecutionImpl.java:176)
	at dev.failsafe.FailsafeExecutor.call(FailsafeExecutor.java:437)
	at dev.failsafe.FailsafeExecutor.get(FailsafeExecutor.java:115)
	at com.upplication.s3fs.ng.S3ParallelDownload.safeDownload(S3ParallelDownload.java:127)
	at com.upplication.s3fs.ng.FutureIterator.lambda$init$0(FutureIterator.java:59)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
	... 3 common frames omitted
Oct-10 23:03:07.773 [Actor Thread 15] DEBUG nextflow.Session - Session aborted -- Cause: Failed to acquire stream chunk
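The numbers in the `OutOfMemoryError` above can be checked by hand: the limit is 1 GiB, each requested chunk is 10 MiB, and only about 3.2 MiB of direct buffer memory was still free when the request failed. The following sketch reproduces that arithmetic. The class name `DirectBufferHeadroom` and its methods are hypothetical, and the `canReserve` comparison is a simplification of what the JVM does internally, not its actual implementation.

```java
// Hypothetical helper that redoes the arithmetic from the error message.
final class DirectBufferHeadroom {

    /** Bytes of direct buffer memory still available below the limit. */
    static long headroom(long limit, long allocated) {
        return limit - allocated;
    }

    /** Simplified check: does the request still fit under the limit? */
    static boolean canReserve(long limit, long allocated, long request) {
        return allocated + request <= limit;
    }

    public static void main(String[] args) {
        long limit = 1_073_741_824L;     // "limit" from the log, 1 GiB
        long allocated = 1_070_363_393L; // "allocated" from the log
        long request = 10_485_760L;      // one 10 MiB chunk buffer

        System.out.println(headroom(limit, allocated));            // 3378431
        System.out.println(canReserve(limit, allocated, request)); // false
    }
}
```

When `-XX:MaxDirectMemorySize` is not set, the JVM derives the direct memory limit from the maximum heap size, which would be consistent with the crash moving to a different file when the head node's memory allocation changes. Whether raising that limit is a sufficient workaround here has not been tested in this thread.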

System information

Nextflow version (eg. 22.04.5)
Hardware AWS
Executor awsbatch
Container engine: default
OS AWSLinux
Version of nf-core/quantms v1.1dev

@harper357 added the bug label on Oct 5, 2022
@jpfeuffer
Collaborator

Hmm now, looking at the line, this check might actually be performed by nextflow itself and therefore on the head node. But I kind of thought that reading the first X lines should be doable.

@jpfeuffer changed the title from "Head node crashing on AWS" to "Do not use nextflow readLine() since it downloads files on cluster head nodes" on Mar 7, 2023
@ypriverol
Member

This has been open for more than a year now.

@jpfeuffer
Collaborator

We don't index by default anymore and therefore do not check the files anymore. If you provide raw files or indexedMzml, this should not happen anymore.
