Calling first on route is slow #84
Comments
The route is not executed until you try to iterate over it or otherwise force it to evaluate. I'd expect this performance issue to be something related to Titan itself. Regards,
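To illustrate the point about deferred execution, here is a minimal plain-Ruby analogy using `Enumerator::Lazy`. It is not Pacer code, and the pipeline and variable names are made up for the example. It only shows the general pattern: building a chain does no work, and the cost appears at whichever call first forces a result.

```ruby
# Plain-Ruby analogy for lazy route evaluation (this is NOT Pacer's API).
evaluated = []

# Building the pipeline does no work: nothing is appended to `evaluated` yet.
pipeline = (1..1_000).lazy
  .map { |n| evaluated << n; n * 2 }
  .select { |n| (n % 3).zero? }

evaluated_before = evaluated.size # nothing has been pulled through yet

# Forcing one result pulls only as many source elements as are needed.
first_match = pipeline.first
evaluated_after = evaluated.size
```

In a timing comparison this means the step that builds the chain looks free, and the call that forces it (such as `first`) is charged for all of the underlying work.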
Thanks Darrick - I have been chasing ghosts all week and wasn't thinking straight here. Regarding what Titan is doing - you raised some concerns about whether pacer-titan is using indexing properly. How can I tell if the route is using a vertex-centric index on edge properties? My hunch is that these routes are iterating over the store's edges rather than doing a lookup in the index.
Hi David, Thanks for your work with the performance benchmarks. I’ve not had time to optimise the Titan adapter much at this point, as we are not currently using it in our commercial product. Your route #<V-TitanQuery ([["label", "store"], ["store_pretty_url", "beach-chalet-brewery-and-restaurant-san-francisco"]]) -> V -> outE(:tickets) -> E-Property(date=="2015-01-10", ticket_id=="134063") -> inV -> V-Property(Guestly::Extensions::Ticket)> is probably not using the vertex-centric index. You would need to use pacer-titan’s experimental and crude vertex_query route extension, something like: store.vertex_query('tickets', :out) { interval('date', graph.encode_property(date1), graph.encode_property(date2)) }.in_v to get all tickets between date1 and date2 using the index. This sort of thing is apparently much easier with TP3’s Gremlin implementation, but I’ve not looked into how much work would be needed to port Pacer to support TP3. Cheers,
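As a rough illustration of the difference between filtering edges after outE and an index-backed interval lookup, here is a plain-Ruby sketch. It does not use Titan or pacer-titan: the `Edge` struct, the `scan` and `indexed` helpers, and the sample data are all hypothetical, and both interval bounds are treated as inclusive purely as a choice of this sketch.

```ruby
require 'date'

# Hypothetical in-memory stand-in for one store vertex's outgoing :tickets edges.
Edge = Struct.new(:date, :ticket_id)

edges = (0...365).map { |i| Edge.new((Date.new(2015, 1, 1) + i).iso8601, i) }

# Full scan: comparable to a property filter applied after outE, which
# has to touch every edge on the vertex.
def scan(edges, from, to)
  edges.select { |e| e.date >= from && e.date <= to }
end

# Index-style lookup: the edges are kept sorted by date, so a binary search
# finds both ends of the interval without visiting the other edges.
def indexed(sorted_edges, from, to)
  lo = sorted_edges.bsearch_index { |e| e.date >= from } || sorted_edges.size
  hi = sorted_edges.bsearch_index { |e| e.date > to } || sorted_edges.size
  sorted_edges[lo...hi]
end

sorted = edges.sort_by(&:date)
from = '2015-01-10'
to = '2015-01-12'
by_scan = scan(edges, from, to)
by_index = indexed(sorted, from, to)
```

Both approaches return the same edges; the difference is only in how many edges are inspected to find them. Whether a real backend shows the same benefit depends on its per-query overhead.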
Thanks Ilya
As far as I know, there haven't been any other attempts at vertex queries so far. I haven't looked at the ones in pacer-titan, but I do think it should be possible to create something similar to the key index code. In that code I support using Range to express intervals and Set to express 'or' queries on a single property. I also try to always keep the property encoding stuff internal.
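The Range and Set convention could look roughly like the following plain-Ruby sketch. This is not the actual key index code: `matches?` and `filter_edges` are hypothetical helpers, and the edges are plain hashes standing in for edge properties.

```ruby
require 'set'

# Hypothetical helper, not Pacer's key index code: interpret a condition as
# an interval (Range), an 'or' over values (Set), or an exact match.
def matches?(value, condition)
  case condition
  when Range then condition.cover?(value)
  when Set   then condition.include?(value)
  else            condition == value
  end
end

# Keep only the edges whose properties satisfy every condition.
def filter_edges(edges, conditions)
  edges.select do |props|
    conditions.all? { |key, cond| matches?(props[key], cond) }
  end
end

edges = [
  { date: '2015-01-09', ticket_id: '134062' },
  { date: '2015-01-10', ticket_id: '134063' },
  { date: '2015-01-11', ticket_id: '134064' }
]

in_range = filter_edges(edges, { date: '2015-01-10'..'2015-01-11' })
either = filter_edges(edges, { ticket_id: Set['134062', '134064'] })
exact = filter_edges(edges, { date: '2015-01-10', ticket_id: '134063' })
```

A real vertex query would translate each condition into an index operation instead of filtering in Ruby, and would keep any property encoding out of the caller's sight.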
Okay - set up some test code and got some interesting results: Java::ComThinkaureliusTitanGraphdbQueryVertex::VertexCentricQueryBuilder#edges is too expensive in the vertex-centric index query unless you are calling it fewer than 10 times or on an extremely large number of edges. Getting all of the edges and using the VertexQueryPipe with the default out_e method seems to be faster right now.
Those are some incredibly surprising results!
I guess one thing to keep in mind is that my backing data store is an in-memory implementation of DynamoDB running on localhost. Run these tests on AWS EC2 with DynamoDB as a service and both the latency and bandwidth characteristics will be very different. Any thoughts on the call stack profiles in the benchmark folder?
Checked a couple of them and they look fine to me.
Tried it with Gremlin and Titan 1.0.0. Looks like even the latest Titan/Gremlin is no better than what I saw in the testing I did in Pacer.
Found it was the vertex partition on store that was crushing the performance on the vertex query.
Seeing a pathological performance issue when calling first on a route.
Can anyone help me understand what I am seeing here?
Without First:
VS
With First:
Here is the query execution plan:
The Store route extension's tickets method is:
We are running: