Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Class used in plugin can't be used in a resume run #23

Open
nvnieuwk opened this issue Jun 9, 2023 · 18 comments
Open

Class used in plugin can't be used in a resume run #23

nvnieuwk opened this issue Jun 9, 2023 · 18 comments

Comments

@nvnieuwk
Copy link
Collaborator

nvnieuwk commented Jun 9, 2023

Bug report

Expected behavior and actual behavior

In the nf-validation plugin we use an extended class of LinkedHashMap called ImmutableMap for the meta fields. This class works fine when running in a normal run, but gives these warnings during a resumed run and reruns everything:

WARN: [PLUS (1)] Unable to resume cached task -- See log file for details

The log shows the following:

com.esotericsoftware.kryo.KryoException: Unable to find class: nextflow.validation.ImmutableMap
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:138)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
	at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:641)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:752)
	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:143)
	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:21)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
	at com.esotericsoftware.kryo.Kryo$readClassAndObject$2.call(Unknown Source)
	at nextflow.util.KryoHelper.deserialize(SerializationHelper.groovy:181)
	at nextflow.util.KryoHelper.deserialize(SerializationHelper.groovy)
	at nextflow.util.KryoHelper$deserialize.call(Unknown Source)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:139)
	at nextflow.processor.TaskContext.deserialize(TaskContext.groovy:202)
	at nextflow.cache.CacheDB.getTaskEntry(CacheDB.groovy:88)
	at nextflow.processor.TaskProcessor.checkCachedOrLaunchTask(TaskProcessor.groovy:770)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.codehaus.groovy.runtime.callsite.PlainObjectMetaMethodSite.doInvoke(PlainObjectMetaMethodSite.java:48)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite$PogoCachedMethodSiteNoUnwrapNoCoerce.invoke(PogoMetaMethodSite.java:189)
	at org.codehaus.groovy.runtime.callsite.PogoMetaMethodSite.callCurrent(PogoMetaMethodSite.java:57)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCallCurrent(CallSiteArray.java:51)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:171)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callCurrent(AbstractCallSite.java:203)
	at nextflow.processor.TaskProcessor.invokeTask(TaskProcessor.groovy:618)
	at nextflow.processor.InvokeTaskAdapter.call(InvokeTaskAdapter.groovy:52)
	at groovyx.gpars.dataflow.operator.DataflowOperatorActor.startTask(DataflowOperatorActor.java:120)
	at groovyx.gpars.dataflow.operator.ForkingDataflowOperatorActor.access$001(ForkingDataflowOperatorActor.java:35)
	at groovyx.gpars.dataflow.operator.ForkingDataflowOperatorActor$1.run(ForkingDataflowOperatorActor.java:58)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
	at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.lang.ClassNotFoundException: nextflow.validation.ImmutableMap
	at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:641)
	at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:188)
	at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:520)
	at java.base/java.lang.Class.forName0(Native Method)
	at java.base/java.lang.Class.forName(Class.java:467)
	at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:136)

Steps to reproduce the problem

You can clone this repository and run it with nextflow run main.nf. When rerunning the mini pipeline with the -resume flag, you'll see the errors/warnings.

Environment

  • Nextflow version: 23.04.0
  • Java version: openjdk 17.0.3-internal 2022-04-19
  • Operating system: Linux (Ubuntu 20.04 LTS)
  • Bash version: 5.0.17
@pditommaso pditommaso transferred this issue from nextflow-io/nextflow Jun 9, 2023
@pditommaso
Copy link
Member

The ImmutableMap should implement CacheFunnel interface

@nvnieuwk
Copy link
Collaborator Author

nvnieuwk commented Jun 9, 2023

Thanks for the suggestion @pditommaso! I've sadly been unable to fix it with CacheFunnel. I implemented it like this:

// A class that works like Map, but returns an immutable copy with each method
public class ImmutableMap extends LinkedHashMap implements CacheFunnel {

    Map internalMap

    ImmutableMap(Map initialMap) {
        internalMap = initialMap
    }

    // Override the methods of the Map interface

    @Override
    Hasher funnel(Hasher hasher, HashMode mode) {
        hasher.putUnencodedChars(internalMap)
        return hasher
    }

    // Rest of the class

This still gives this error in the log: Caused by: java.lang.ClassNotFoundException: nextflow.validation.ImmutableMap

Did I implement it wrong?

@pditommaso
Copy link
Member

Likely the best thing to do is to use Collections.unmodifiableMap instead of implement your own Immutable class

https://chat.openai.com/share/b0e9f648-21a6-4069-b474-a4a60e8f334d

@nvnieuwk
Copy link
Collaborator Author

nvnieuwk commented Jun 9, 2023

We've investigated that and the main problem was that when copying an unmodifiableMap during a .map for example, the return type is again modifiable. This custom class made sure that it always returns immutable maps.

@nvnieuwk
Copy link
Collaborator Author

nvnieuwk commented Jun 9, 2023

You can see that discussion here: nextflow-io#32

@pditommaso
Copy link
Member

I think it's not a plugin role to change the semantic of nextflow operator. Therefore you should use usual Map objects instead of ImmutableMap

@nvnieuwk
Copy link
Collaborator Author

Okay thank you for all the help, too bad there is no way to enforce the immutability of the meta map, but I understand your point on this

@ewels
Copy link
Member

ewels commented Jun 13, 2023

@pditommaso is there any other way to achieve what we want here?

The immutable maps feature came after @robsyme gave a talk about the dangers of meta map mutability: https://nf-co.re/events/2023/bytesize_workflow_safety

We can drop back to a regular map again, but it would be nice to protect users (and devs) from this problem if possible, somehow.

@pditommaso
Copy link
Member

We'll take this into account for DSL3

@robsyme
Copy link
Collaborator

robsyme commented Jun 13, 2023

Perhaps we can use the @valueobject decoration to automatically implement the required interface. I'll try and test this week.

@mirpedrol
Copy link
Collaborator

mirpedrol commented Jul 3, 2023

Hello!
I am starting to add fromSamplesheet() to some nf-core pipelines, and it would be good to solve this before I merge any PR. Were you able to test if the decorator works @robsyme?

@pditommaso
Copy link
Member

👉 #23

@robsyme
Copy link
Collaborator

robsyme commented Jul 3, 2023

Paolo - the goal here is not to change in any way how Nextflow's operators work. The only goal here is to construct an object with the following properties:

  1. Presents a Map-like (aka 🐍 dict) interface for holding metadata
  2. Users can use the plus() operator to append new maps
    a) When they do this, a new object is returned rather than modifying the original object
  3. Can be serialised by Kryo

@robsyme
Copy link
Collaborator

robsyme commented Jul 4, 2023

I've tried a couple of approaches today and hit some interesting and instructive road blocks 😆

Let's say we want these properties:

  • Map-like interface
  • Returns the same class when using plus method
  • Serialized by Kryo
  • Immutable

Example 1

We can put together a very simple example class to illustrate the challenges. We might do something like:

import nextflow.util.KryoHelper

class Meta {
    @Delegate Map internal = new LinkedHashMap()

    static { KryoHelper.register(Meta) }

    Object put(Object key, Object value) { internal.put(key, value) }

    Meta plus(Map right) { new Meta(internal: internal + right) }

    // ... and any other methods (minus, etc)
}
  • ✅ Map-like interface
  • ✅ Returns the same class when using plus method
  • ✅ Serialized by Kryo
  • ❌ Immutable

Example 2

We can add immutability in a number of different ways, but let's say we go for Nextflow's built-in ValueObject annotation:

import nextflow.util.KryoHelper
import nextflow.io.ValueObject

@ValueObject
class Meta {
    @Delegate Map internal = new LinkedHashMap()

    static { KryoHelper.register(Meta) }

    Object put(Object key, Object value) { internal.put(key, value) }

    Meta plus(Map right) { new Meta(internal: internal + right) }
}

Making the object immutable breaks Kryo serialization:

  • ✅ Map-like interface
  • ✅ Returns the same class when using plus method
  • ❌ Serialized by Kryo
  • ✅ Immutable

The Nextflow logs report:

Jul.-04 10:48:46.677 [Actor Thread 7] WARN  nextflow.processor.TaskProcessor - [TestCache (1)] Unable to resume cached task -- See log file for details
java.lang.UnsupportedOperationException: null
	at java.base/java.util.Collections$UnmodifiableMap.put(Collections.java:1505)
	at java_util_Map$put$1.call(Unknown Source)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:47)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:125)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:148)
	at Meta.put(Meta.groovy:10)
	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:144)
	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:21)
	at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:761)
	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:143)
	at com.esotericsoftware.kryo.serializers.MapSerializer.read(MapSerializer.java:21)

Why does is this no longer serializable?

Kryo MapSerializer

Our Meta class is a Map of sorts, and so the object is deserialized from a byte string to a new instance using the Kryo built-in MapSerializer. In this Serializer class, it first creates a new, empty object and then adds each key+value pair to the empty Map as they are read out of the serialized form.

This work of incrementally adding new key+value pairs is not supported by our newly immutable class, which is why the Kryo serialization breaks.

It's certainly possible to create a custom Serializer class, but I'm starting to think that this immutability feature is starting to feel a little over-engineered. There's a possibility that we're introducing a little too much "magic" - maybe using a vanailla LinkedHashMap class and then providing guidance in documentation is a better approach.

@pditommaso
Copy link
Member

The only goal here is to construct an object with the following properties:

Why this plugin should use a "magic" object and not just a plain Map?

@ewels
Copy link
Member

ewels commented Jul 4, 2023

Because @robsyme gave a nf-core/bytesize talk that put the fear of god into us all about mutable map objects 😆

https://nf-co.re/events/2023/bytesize_workflow_safety

https://www.youtube.com/watch?v=A357C-ux6Dw

@robsyme
Copy link
Collaborator

robsyme commented Jul 4, 2023

Haha. It was certainly not my intention to scare anybody! 😆

Paolo: We had people modifying the map in flight, which can lead to results that depend on task execution timing. For example:

workflow {
    nums = Channel.of(1..10) | map { [val:it] }

    nums
    | Foo
    | map { meta -> meta.val += 1 }

    nums 
    | VariableProcess
    | DoSomethingElse
}

In this example, modification of the meta object in the closure modifies the same object being passed to VariableProcess and DoSomethingElse. If VariableProcess happens to finish quickly, DoSomethingElse might be launched before the val property is incremented. If VariableProcess takes longer than Foo, then the increment will happen beforehand. This can lead to unpredictable results and unusual behaviour where process caches might change, etc.

@robsyme
Copy link
Collaborator

robsyme commented Jul 4, 2023

I think that implementing the CacheFunnel interface is the easiest path forward, but because the object is also a Map, it's cachefunnel implementation will never be used. I've just opened up a Nextflow PR that would remedy this: nextflow-io/nextflow#4077

@nvnieuwk nvnieuwk transferred this issue from nextflow-io/nf-validation Apr 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants