Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Calling first on route is slow #84

Closed
dstuebe opened this issue Dec 13, 2015 · 11 comments
Closed

Calling first on route is slow #84

dstuebe opened this issue Dec 13, 2015 · 11 comments

Comments

@dstuebe
Copy link
Contributor

dstuebe commented Dec 13, 2015

Seeing pathologic performance issue with calling first on a route.
Can anyone help me understand what I am seeing here?

Without First:

irb(main):007:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
       user     system      total        real
   0.000000   0.000000   0.000000 (  0.002000)
=> [#<Benchmark::Tms:0x79c4f23b @stime=0.0, @label="", @cstime=0.0, @real=0.002000093460083008, @total=0.0, @cutime=0.0, @utime=0.0>]
irb(main):008:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
       user     system      total        real
   0.010000   0.000000   0.010000 (  0.004000)
=> [#<Benchmark::Tms:0x9147ba2 @stime=0.0, @label="", @cstime=0.0, @real=0.003999948501586914, @total=0.00999999999999801, @cutime=0.0, @utime=0.00999999999999801>]
irb(main):009:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
       user     system      total        real
   0.010000   0.000000   0.010000 (  0.003000)
=> [#<Benchmark::Tms:0x5341641d @stime=0.0, @label="", @cstime=0.0, @real=0.003000020980834961, @total=0.00999999999999801, @cutime=0.0, @utime=0.00999999999999801>]
irb(main):010:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
       user     system      total        real
   0.010000   0.000000   0.010000 (  0.005000)
=> [#<Benchmark::Tms:0x24018c8b @stime=0.0, @label="", @cstime=0.0, @real=0.005000114440917969, @total=0.00999999999999801, @cutime=0.0, @utime=0.00999999999999801>]
irb(main):011:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
       user     system      total        real
   0.040000   0.000000   0.040000 (  0.019000)
=> [#<Benchmark::Tms:0x5bde6148 @stime=0.0, @label="", @cstime=0.0, @real=0.01900005340576172, @total=0.03999999999999915, @cutime=0.0, @utime=0.03999999999999915>]
irb(main):012:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
       user     system      total        real
   0.010000   0.000000   0.010000 (  0.003000)
=> [#<Benchmark::Tms:0x16fc5622 @stime=0.0, @label="", @cstime=0.0, @real=0.0029997825622558594, @total=0.010000000000005116, @cutime=0.0, @utime=0.010000000000005116>]
irb(main):013:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061') }}
       user     system      total        real
   0.000000   0.000000   0.000000 (  0.003000)

VS

With First:

irb(main):016:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
       user     system      total        real
   1.310000   0.080000   1.390000 (  0.922000)
=> [#<Benchmark::Tms:0x75e4fe3c @stime=0.08000000000000007, @label="", @cstime=0.0, @real=0.9219999313354492, @total=1.3899999999999952, @cutime=0.0, @utime=1.3099999999999952>]
irb(main):017:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
       user     system      total        real
   0.060000   0.000000   0.060000 (  0.033000)
=> [#<Benchmark::Tms:0x13f4048e @stime=0.0, @label="", @cstime=0.0, @real=0.03299999237060547, @total=0.060000000000002274, @cutime=0.0, @utime=0.060000000000002274>]
irb(main):018:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
       user     system      total        real
   0.040000   0.000000   0.040000 (  0.025000)
=> [#<Benchmark::Tms:0x638977e0 @stime=0.0, @label="", @cstime=0.0, @real=0.02500009536743164, @total=0.04000000000000625, @cutime=0.0, @utime=0.04000000000000625>]
irb(main):019:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
       user     system      total        real
   0.060000   0.000000   0.060000 (  0.034000)
=> [#<Benchmark::Tms:0x8cb7185 @stime=0.0, @label="", @cstime=0.0, @real=0.03399991989135742, @total=0.060000000000002274, @cutime=0.0, @utime=0.060000000000002274>]
irb(main):020:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
       user     system      total        real
   0.050000   0.000000   0.050000 (  0.041000)

Here is the query execution plan:

[24] pry(main)> store.tickets(date: '2015-01-10', ticket_id: '134063')
=> #<V[9021833224]>
Total: 1
#<V-TitanQuery ([["label", "store"], ["store_pretty_url", "beach-chalet-brewery-and-restaurant-san-francisco"]]) -> V -> outE(:tickets) -> E-Property(date=="2015-01-10", ticket_id=="134063") -> inV -> V-Property(Guestly::Extensions::Ticket)>

The Store route extensions tickets method is:

        def tickets(opts = {})
          out_e(:tickets, opts).in_v(Guestly::Extensions::Ticket)
        end

We are running:

  • Using pacer-titan 0.0.7
  • Using pacer 2.0.22
@pangloss
Copy link
Owner

The route is not executed until you try to iterate over or otherwise force
it to materialize its results.

I'd expect this performance issue to be something related to Titan itself.

Regards,
Darrick
On Dec 13, 2015 12:04 PM, "David Stuebe" notifications@github.com wrote:

Seeing pathologic performance issue with calling first on a route.
Can anyone help me understand what I am seeing here?

Without First:

irb(main):007:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
user system total real
0.000000 0.000000 0.000000 ( 0.002000)
=> [#<Benchmark::Tms:0x79c4f23b @stime=0.0, @Label="", @CsTime=0.0, @real=0.002000093460083008, @ToTal=0.0, @cutime=0.0, @utime=0.0>]
irb(main):008:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
user system total real
0.010000 0.000000 0.010000 ( 0.004000)
=> [#<Benchmark::Tms:0x9147ba2 @stime=0.0, @Label="", @CsTime=0.0, @real=0.003999948501586914, @ToTal=0.00999999999999801, @cutime=0.0, @utime=0.00999999999999801>]
irb(main):009:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
user system total real
0.010000 0.000000 0.010000 ( 0.003000)
=> [#<Benchmark::Tms:0x5341641d @stime=0.0, @Label="", @CsTime=0.0, @real=0.003000020980834961, @ToTal=0.00999999999999801, @cutime=0.0, @utime=0.00999999999999801>]
irb(main):010:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
user system total real
0.010000 0.000000 0.010000 ( 0.005000)
=> [#<Benchmark::Tms:0x24018c8b @stime=0.0, @Label="", @CsTime=0.0, @real=0.005000114440917969, @ToTal=0.00999999999999801, @cutime=0.0, @utime=0.00999999999999801>]
irb(main):011:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
user system total real
0.040000 0.000000 0.040000 ( 0.019000)
=> [#<Benchmark::Tms:0x5bde6148 @stime=0.0, @Label="", @CsTime=0.0, @real=0.01900005340576172, @ToTal=0.03999999999999915, @cutime=0.0, @utime=0.03999999999999915>]
irb(main):012:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134063') }}
user system total real
0.010000 0.000000 0.010000 ( 0.003000)
=> [#<Benchmark::Tms:0x16fc5622 @stime=0.0, @Label="", @CsTime=0.0, @real=0.0029997825622558594, @ToTal=0.010000000000005116, @cutime=0.0, @utime=0.010000000000005116>]
irb(main):013:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061') }}
user system total real
0.000000 0.000000 0.000000 ( 0.003000)

VS

With First:

irb(main):016:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
user system total real
1.310000 0.080000 1.390000 ( 0.922000)
=> [#<Benchmark::Tms:0x75e4fe3c @stime=0.08000000000000007, @Label="", @CsTime=0.0, @real=0.9219999313354492, @ToTal=1.3899999999999952, @cutime=0.0, @utime=1.3099999999999952>]
irb(main):017:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
user system total real
0.060000 0.000000 0.060000 ( 0.033000)
=> [#<Benchmark::Tms:0x13f4048e @stime=0.0, @Label="", @CsTime=0.0, @real=0.03299999237060547, @ToTal=0.060000000000002274, @cutime=0.0, @utime=0.060000000000002274>]
irb(main):018:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
user system total real
0.040000 0.000000 0.040000 ( 0.025000)
=> [#<Benchmark::Tms:0x638977e0 @stime=0.0, @Label="", @CsTime=0.0, @real=0.02500009536743164, @ToTal=0.04000000000000625, @cutime=0.0, @utime=0.04000000000000625>]
irb(main):019:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
user system total real
0.060000 0.000000 0.060000 ( 0.034000)
=> [#<Benchmark::Tms:0x8cb7185 @stime=0.0, @Label="", @CsTime=0.0, @real=0.03399991989135742, @ToTal=0.060000000000002274, @cutime=0.0, @utime=0.060000000000002274>]
irb(main):020:0> Benchmark.bm { |bm| bm.report { store.tickets(date: '2015-01-10', ticket_id: '134061').first }}
user system total real
0.050000 0.000000 0.050000 ( 0.041000)

Here is the query execution plan:

[24] pry(main)> store.tickets(date: '2015-01-10', ticket_id: '134063')
=> #<V[9021833224]>Total: 1#<V-TitanQuery ([["label", "store"], ["store_pretty_url", "beach-chalet-brewery-and-restaurant-san-francisco"]]) -> V -> outE(:tickets) -> E-Property(date=="2015-01-10", ticket_id=="134063") -> inV -> V-Property(Guestly::Extensions::Ticket)>

The Store route extensions tickets method is:

    def tickets(opts = {})
      out_e(:tickets, opts).in_v(Guestly::Extensions::Ticket)
    end

We are running:

  • Using pacer-titan 0.0.7
  • Using pacer 2.0.22


Reply to this email directly or view it on GitHub
#84.

@dstuebe
Copy link
Contributor Author

dstuebe commented Dec 13, 2015

Thanks Darrick - I have been chasing ghosts all week and wasn't thinking straight here.

Regarding what Titan is doing - you raised some concerns about the whether pacer-titan is using indexing properly. How can I tell if the route is using a vertex centric index of edge properties? My hunch is that these routes are doing iteration on the store's edges rather than a lookup in the index.

@ilyakard
Copy link

Hi David,

Thanks for your work with the performance benchmarks, I’ve not had time to optimise the Titan adapter much at this point as we are not currently using it in our commercial product.

Your route

#<V-TitanQuery ([["label", "store"], ["store_pretty_url", "beach-chalet-brewery-and-restaurant-san-francisco"]]) -> V -> outE(:tickets) -> E-Property(date=="2015-01-10", ticket_id=="134063") -> inV -> V-Property(Guestly::Extensions::Ticket)>

is probably not using the vertex-centric index, you would need to use pacer-titan’s experimental and crude vertex_query route extension, something like:

store.vertex_query(‘tickets’, :out) { interval(‘date’, graph.encode_property(date1), graph.encode_property(date2)) }.in_v

to get all tickets between date1 and date2 using the index. This sort of thing is apparently much easier with TP3’s Gremlin implementation but I’ve not looked into how much work would need to be done to port Pacer to support TP3.

Cheers,
Ilya

On 14 Dec 2015, at 5:48 AM, David Stuebe notifications@github.com wrote:

Thanks Darrick - I have been chasing ghosts all week and wasn't thinking straight here.

Regarding what Titan is doing - you raised some concerns about the whether pacer-titan is using indexing properly. How can I tell if the route is using a vertex centric index of edge properties? My hunch is that these routes are doing iteration on the store's edges rather than a lookup in the index.


Reply to this email directly or view it on GitHub #84 (comment).

@dstuebe
Copy link
Contributor Author

dstuebe commented Dec 14, 2015

Thanks Ilya
I will try that and let you know what I find. Is there a point of comparison for a less crude vertex query in one of the other pacer plugin adapters that we can look at for guidance on implementation if we want to move it forward?
David

@pangloss
Copy link
Owner

As far as I know, there haven't been any other attempts at vertex queries so far. I haven't looked at the ones in pacer-titan, but I do think it should be possible to create something similar to the key index code. In that code I support using Range to express intervals and Set to express 'or' queries on a single property. I also try to always keep the property encoding stuff internal.

@dstuebe
Copy link
Contributor Author

dstuebe commented Dec 15, 2015

Okay - set up some test code and got some interesting results:
https://github.com/swipely/graph_performance#vci

The Java::ComThinkaureliusTitanGraphdbQueryVertex::VertexCentricQueryBuilder#edges is too expensive in the Vertex Centric Index query unless you are calling it less than 10 times or on an extremely large number of edges.

Getting all of the edges and using the VertexQueryPipe with the default out_e method seems to be faster right now.
Comments and suggestions welcome.

@pangloss
Copy link
Owner

Those are some incredibly surprising results!

@dstuebe
Copy link
Contributor Author

dstuebe commented Dec 15, 2015

I guess one thing to keep in mind is that my backing data store is an inmemory implementation of dynamodb running on local host - run these tests on AWS EC2 with Dynamo as a service and both the latency and bandwidth characteristics will be very different.

Any thoughts on the call stack profiles in the benchmark folder?
https://github.com/swipely/graph_performance/tree/master/benchmark_results/dstuebe_2015-12-15
Do you see anything unexpected?

@pangloss
Copy link
Owner

Checked a couple of them and they look fine to me.

@dstuebe
Copy link
Contributor Author

dstuebe commented Dec 17, 2015

Tried it with Gremlin and Titan 1.0.0
https://gist.github.com/dstuebe/415fd0ca883af76d7787
For a Store with 10,000 Tickets
Grabbing 10 tickets took 1.187s
Grabbing 100 tickets took 8.579s
Grabbing 1000 tickets took 66.479s

Looks like even the latest Titan/Gremlin is no better that the testing I did in Pacer.

@dstuebe
Copy link
Contributor Author

dstuebe commented Dec 17, 2015

Found it was the vertex partition on store that was crushing the performance on the vertex query.
https://github.com/swipely/graph_performance#testing-10000100-with-different-config-and-indexing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants