Read time out while fetching all edges of a SuperNode #1467

Krittam · 2019-03-04T17:16:57Z

I have a graph in which a few nodes have a very large number of incoming edges (supernodes). All the edges have the same type/label. One query needs to report the total number of incoming edges.
I'm using cassandrathrift as the storage backend.
g.V().has('vid','qwerty').inE().count().next()
This fails with:

14108889 [gremlin-server-session-1] WARN org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor - Exception processing a script on request [RequestMessage{, requestId=c3d902b7-0fdd-491d-8639-546963212474, op='eval', processor='session', args={gremlin=g.V().has('vid','qwerty').inE().count().next(), session=2831d264-4566-4d15-99c5-d9bbb202b1f8, bindings={}, manageTransaction=false, batchSize=64}}].
TimedOutException()
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14696)
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14633)
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14559)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:741)
at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:725)
at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:143)
at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:100)
at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:82)
at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:129)
at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:288)
at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:285)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
at org.janusgraph.diskstorage.BackendTransaction.edgeStoreMultiQuery(BackendTransaction.java:285)
at org.janusgraph.graphdb.database.StandardJanusGraph.edgeMultiQuery(StandardJanusGraph.java:441)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.lambda$executeMultiQuery$3(StandardJanusGraphTx.java:1054)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.executeMultiQuery(StandardJanusGraphTx.java:1054)
at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.execute(MultiVertexCentricQueryBuilder.java:113)
at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.edges(MultiVertexCentricQueryBuilder.java:133)
at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.initialize(JanusGraphVertexStep.java:95)
at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:101)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:113)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:200)
at java_util_Iterator$next.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117)
at Script13.run(Script13.groovy:1)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:843)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:548)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)
at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120)
at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:290)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

However, g.V().has('vid','qwerty').inE().limit(10000).count().next() returns
==>10000

If I wanted to filter the edges on some condition, I would use vertex-centric indexes, but I simply want all the incoming edges.
The vertex in question is expected to have millions of such edges.
Please help.

porunov · 2019-03-04T17:48:50Z

@Krittam, what happens when you run the following query?
g.V().has('vid','qwerty').inE().limit(Long.MAX_VALUE).count().next()

Krittam · 2019-03-04T18:57:51Z

@porunov that query produced this exception:

TimedOutException()
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14696)
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14633)
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14559)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:741)
at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:725)
at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:143)
at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:100)
at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:82)
at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:129)
at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:288)
at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:285)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
at org.janusgraph.diskstorage.BackendTransaction.edgeStoreMultiQuery(BackendTransaction.java:285)
at org.janusgraph.graphdb.database.StandardJanusGraph.edgeMultiQuery(StandardJanusGraph.java:441)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.lambda$executeMultiQuery$3(StandardJanusGraphTx.java:1054)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.executeMultiQuery(StandardJanusGraphTx.java:1054)
at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.execute(MultiVertexCentricQueryBuilder.java:113)
at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.edges(MultiVertexCentricQueryBuilder.java:133)
at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.initialize(JanusGraphVertexStep.java:95)
at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:101)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:37)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:113)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:200)
at java_util_Iterator$next.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117)
at Script5.run(Script5.groovy:1)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:843)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:548)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)
at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120)
at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:290)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

Krittam · 2019-03-13T06:18:46Z

I am using JanusGraph 0.2.0. I also tried running an OLAP query with Spark, but I submit Spark jobs to my Hadoop cluster through YARN, and there is not enough documentation available for that setup.
Also, 0.2.0 is missing many jars, which makes the problem even worse.

porunov · 2019-03-13T08:46:27Z

Can you reproduce this issue with JanusGraph 0.3.1?

Krittam · 2019-03-13T11:10:23Z

On JanusGraph 0.3.1 the query fails with this exception:

Frame size (124696054) larger than max length (15728640)!

After increasing the storage.cassandra.frame-size-mb property to 512, the query starts executing but doesn't complete (I waited around 45 minutes before giving up).
Also note that although I tried this with the JanusGraph 0.3.1 release, the underlying Cassandra database was unchanged (it is the same one created by the original 0.2.0 release).
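
For scale, the two numbers in the "Frame size" error above can be converted to megabytes. The following is only an arithmetic sketch in Python, not JanusGraph code. It assumes the limit is counted in units of 1024 * 1024 bytes, which is suggested by 15728640 being exactly 15 * 1024 * 1024 but is not stated in this thread. The helper name min_frame_size_mb is hypothetical.

```python
MIB = 1024 * 1024  # assumed unit; 15728640 == 15 * MIB


def min_frame_size_mb(frame_bytes: int) -> int:
    """Smallest whole-MiB limit that would admit a frame of frame_bytes (hypothetical helper)."""
    return (frame_bytes + MIB - 1) // MIB  # integer ceiling division


reported_frame = 124696054  # "Frame size" from the error message
reported_limit = 15728640   # "max length" from the error message

print(reported_limit // MIB)              # 15
print(min_frame_size_mb(reported_frame))  # 119
```

Under that assumption, this one response was roughly eight times the limit in effect, so a value of 512 removes the frame-size error for this frame but says nothing about how long the read itself takes.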

porunov · 2019-03-25T13:33:14Z

Confirming the performance issue in 0.3.1 as well.
I created 1 vertex with 1 million incoming vertices. Counting the 1 million vertices took 6 seconds.
Then I created 1 vertex with 16 million incoming vertices. The count did not finish within 5 minutes.
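
As a rough reading of the two measurements above: if the count scaled linearly with the number of incoming vertices (an assumption made only for comparison, not something measured here), the 16 million case could be extrapolated from the 1 million case. The sketch below is plain Python arithmetic, and the function name linear_estimate_seconds is hypothetical.

```python
def linear_estimate_seconds(base_count: int, base_seconds: float, target_count: int) -> float:
    """Extrapolate a run time assuming cost grows linearly with the element count."""
    return base_seconds * target_count / base_count


# Figures from the comment above: 1 million counted in 6 seconds.
estimate = linear_estimate_seconds(1_000_000, 6, 16_000_000)
waited = 5 * 60  # the 16 million count was abandoned after 5 minutes

print(estimate)           # 96.0
print(estimate < waited)  # True
```

A linear estimate of about 96 seconds is well under the 5 minutes waited, which suggests the slowdown at 16 million is worse than linear or that some other limit is being hit. The thread does not establish which.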

porunov · 2019-03-25T16:09:40Z

So far I haven't found a good way to count edges. Vertex-centric indexes don't help with the count operation either.

kverma12 · 2020-03-02T07:55:32Z

Changing the storage backend to cql, and setting the other properties relevant to cql, solved the issue for me.
Here are the properties I used:

storage.backend=cql
storage.cql.keyspace=t_graph
storage.cql.read-consistency-level=ONE

farodin91 added the kind/performance label Mar 4, 2019
