Distinct with JSONL would cause JVM core dump #1138

@Yicong-Huang

Description

The core dump happens partway through execution of a workflow that combines the Distinct operator with JSONL. It could be related to null fields produced by the JSONL operator (when the source data is nullable), which may trigger a bug in the LinkedHashSet that the Distinct operator uses internally.

An error like the following is logged:

[05-06-2021 03:20:44.035] [WARN ] [WorkerActorVirtualIdentity(Layer(3,Distinct-operator-8be11d7b-89d1-4a95-b2dc-48f97a10abf2,main)[0]).logWarning(40)] - Receiver class scala/collection/mutable/HashSet must be the current class or a subtype of interface scala/collection/mutable/HashTable
scala.collection.mutable.HashTable.findOrAddEntry$(HashTable.scala:167)
scala.collection.mutable.LinkedHashSet.findOrAddEntry(LinkedHashSet.scala:44)
scala.collection.mutable.LinkedHashSet.add(LinkedHashSet.scala:68)
edu.uci.ics.texera.workflow.operators.distinct.DistinctOpExec.processTexeraTuple(DistinctOpExec.scala:18)
edu.uci.ics.texera.workflow.common.operators.OperatorExecutor.processTuple(OperatorExecutor.scala:14)
edu.uci.ics.texera.workflow.common.operators.OperatorExecutor.processTuple$(OperatorExecutor.scala:10)
edu.uci.ics.texera.workflow.operators.distinct.DistinctOpExec.processTuple(DistinctOpExec.scala:10)
edu.uci.ics.amber.engine.architecture.worker.DataProcessor.processInputTuple(DataProcessor.scala:95)
edu.uci.ics.amber.engine.architecture.worker.DataProcessor.handleInputTuple(DataProcessor.scala:190)
edu.uci.ics.amber.engine.architecture.worker.DataProcessor.edu$uci$ics$amber$engine$architecture$worker$DataProcessor$$runDPThreadMainLogic(DataProcessor.scala:144)
edu.uci.ics.amber.engine.architecture.worker.DataProcessor$$anon$1.run(DataProcessor.scala:42)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:748) 
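
To help check the null-field hypothesis in isolation, here is a minimal sketch of an order-preserving distinct over rows that contain null fields. It is not Texera's DistinctOpExec: the class name `DistinctSketch`, the `distinct` method, and the row representation (`List<Object>`) are hypothetical, and it uses `java.util.LinkedHashSet` instead of the Scala `LinkedHashSet` seen in the trace. It only shows the baseline behavior to expect when element `hashCode` and `equals` are null-safe.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctSketch {
    // Hypothetical stand-in for a Distinct operator:
    // keeps the first occurrence of each row, in arrival order.
    static List<List<Object>> distinct(List<List<Object>> rows) {
        Set<List<Object>> seen = new LinkedHashSet<>();
        for (List<Object> row : rows) {
            // List.hashCode/equals treat null elements safely,
            // so rows with null fields can be hashed and compared.
            seen.add(row);
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<List<Object>> rows = new ArrayList<>();
        rows.add(Arrays.<Object>asList("a", null)); // nullable field
        rows.add(Arrays.<Object>asList("a", null)); // duplicate of the first row
        rows.add(Arrays.<Object>asList(null, 1));
        rows.add(Arrays.<Object>asList("b", 2));

        // Prints [[a, null], [null, 1], [b, 2]]
        System.out.println(distinct(rows));
    }
}
```

If an equivalent test against the real tuple type also passes, the null fields alone are probably not the cause, and it may be worth looking at how the tuple class computes its hash code or at the runtime environment instead.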
