Slow query(and it gets slower with readMode: "outdated") #5195

Open
thelinuxlich opened this issue Dec 11, 2015 · 12 comments

Comments

@thelinuxlich

So I have this query which returns 13k items before the .group() and 569 after:

r.table("pageviews").between([7,r.time(2015,12,1,"-02:00")],[7,r.now()],{index: "client_id_created_at"})
.eqJoin("session_id",r.table("sessions")).zip().merge(row => {
  return {
                            point: row("point").default(null),
                            pageview_count: row("pageviews"),
                            is_unique: row("is_unique"),
                            is_rejected: row("pageviews").lt(2),
                            session_duration: row("updated_at").toEpochTime().sub(row("created_at").toEpochTime()),
                            session_properties: row("properties"),
                            session_referer_type: row("referer")("type").default(null),
                            session_referer_domain: row("referer")("domain").default(null),
                            session_referer_query: row("referer")("query").default(null),
                            session_referer_path: row("referer")("path").default(null),
                            origin: row("origin").default(null),
                            weather: row("weather").default({
                                temperature: -1,
                                type: "null",
                                weather_icon: "null"
                            }),
                            device: row("device"),
                            is_multi_device: row.hasFields("next_device"),
                            geo: row("geo").default({
                                city: "null",
                                country: "null",
                                country_code: "null",
                                latitude: -1,
                                longitude: -1,
                                region: "null",
                                region_code: "null"
                            })
};
}).group("href")

This takes approximately 5 seconds when run from a Node.js script on the same machine as the database. I have a cluster of 3 n1-standard-4 Google Cloud instances, each with a 100 GB SSD disk (the database is using 4 GB on disk, and the cache is using 10 GB of the 15 GB of available RAM).

This is supposed to run on hundreds of millions of records in the worst case, so it's alarming that it takes 5 seconds for such a small dataset.
Also, I tried setting readMode to "outdated" and, to my surprise, it slowed the query down to 7 seconds.

I'm using the latest RethinkDB version and rethinkdbdash.
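For reference, a minimal rethinkdbdash timing harness along these lines reproduces the measurement (a sketch only: the host/port values and the trimmed merge body are assumptions, not the exact script used):

// Sketch: connection options are assumptions; only two merged fields are shown,
// the full merge body is in the query above.
const r = require("rethinkdbdash")({db: "aidax", host: "localhost", port: 28015});

const start = Date.now();
r.table("pageviews")
  .between([7, r.time(2015, 12, 1, "-02:00")], [7, r.now()], {index: "client_id_created_at"})
  .eqJoin("session_id", r.table("sessions")).zip()
  .merge(row => ({
    is_rejected: row("pageviews").lt(2),
    session_duration: row("updated_at").toEpochTime().sub(row("created_at").toEpochTime())
  }))
  .group("href")
  .run()
  .then(result => {
    // rethinkdbdash returns grouped data as an array of {group, reduction} objects
    console.log("groups:", result.length, "elapsed:", Date.now() - start, "ms");
    return r.getPoolMaster().drain();
  });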

@thelinuxlich
Author

Running it successive times doesn't change the speed. Removing the eqJoin().zip() part reduces the time to 1.8 seconds; with readMode: "outdated" it rises to 2.1 seconds.

@thelinuxlich
Author

An average document of the pageviews table:

{
  "actions": 0,
  "campaign_params": {},
  "category": null,
  "client_id": 7,
  "created_at": Thu Dec 03 2015 16:15:18 GMT-02:00,
  "deep_link": false,
  "domain": "delivery.veran.com.br",
  "end_ts": 1449170115516,
  "ended_at": Thu Dec 03 2015 16:15:36 GMT-02:00,
  "href": "http://delivery.veran.com.br/produto…",
  "id": "000147a8-5b2b-4c16-9e4c-4eac40664562-7-browser031220151600-1449170096530",
  "nav_type": "normal",
  "referer": {
    "domain": "delivery.veran.com.br",
    "path": "/",
    "query": null,
    "type": "internal"
  },
  "scroll": [
    {"percentage": {"pixel_depth": 763, "timing": 7.32, "value": 25}},
    {"percentage": {"pixel_depth": 0, "timing": 7.32, "value": 50}},
    {"percentage": {"pixel_depth": 0, "timing": 7.32, "value": 75}},
    {"percentage": {"pixel_depth": 1063, "timing": 7.82, "value": 100}}
  ],
  "session_id": "000147a8-5b2b-4c16-9e4c-4eac40664562-7-browser031220151600",
  "ssl": false,
  "timing": {
    "ttdr": 2.71,
    "ttfb": 0.28,
    "tti": 1.98
  },
  "title": "supermercado delivery veran - agora também somos delivery!",
  "ts": 1449170096530,
  "visitor_id": "000147a8-5b2b-4c16-9e4c-4eac40664562"
}

Session documents should be roughly half that size.

Also, when running this query, all four cores go to ~90% while memory usage stays the same (8 GB consumed). This is under a write load of ~50 writes/sec.

It is also impossible to run this query in the Data Explorer; it always hangs.

@thelinuxlich
Author

I moved the secondary index into the primary id as an array (on a temporary table, just for testing), but client_id is incremental, so all the data went into only one shard (could this be an issue for secondary indexes too?).
Running the same query, I now see a slight performance increase with readMode: "outdated" (4.2 seconds), and without it the time is about the same (5.1 seconds).
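A sketch of what that test setup might look like ("pageviews_by_client" is a hypothetical name for the temporary table, the connection defaults are assumptions, and key collisions between pageviews sharing the same [client_id, created_at] pair are ignored):

// Sketch only: copy pageviews into a test table keyed on [client_id, created_at].
const r = require("rethinkdbdash")({db: "aidax"});

r.tableCreate("pageviews_by_client").run()
  .then(() => r.table("pageviews_by_client").insert(
    // turn the secondary-index pair into an array primary key
    r.table("pageviews").merge(row => ({id: [row("client_id"), row("created_at")]}))
  ).run())
  .then(() => r.table("pageviews_by_client")
    // between() now runs against the primary index, so no {index: ...} optarg is needed
    .between([7, r.time(2015, 12, 1, "-02:00")], [7, r.now()])
    .count()
    .run())
  .then(count => {
    console.log("rows in range:", count);
    return r.getPoolMaster().drain();
  });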

@thelinuxlich
Author

I rebalanced the test table and ran the query again: 4.2 seconds without and 4.6 seconds with readMode: "outdated". It seems outdated reads will always be slower when the data is evenly distributed among the shards.
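(For reference, rebalancing is a single ReQL call on the table; shown here against the hypothetical test table name from the sketch above:)

// Sketch: redistribute the existing data evenly across the table's shards.
r.table("pageviews_by_client").rebalance()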

@danielmewes
Member

To benefit from outdated reads, you could try this:

  • Replicate the table "sessions" so that each server hosting a primary of "pageviews" also has a full replica of the "sessions" table.
  • Use read_mode="outdated" for "sessions", but the normal "single" read mode for "pageviews".

i.e. something like this:

r.table("pageviews").between([7,r.time(2015,12,1,"-02:00")],[7,r.now()],{index: "client_id_created_at"})
  .eqJoin("session_id", r.table("sessions", {readMode: "outdated"})).zip().merge(........

This will allow the eqJoin queries to be served locally, rather than having to go to another server, since each server holding a shard of "pageviews" will also have a full copy of the "sessions" table.
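For the first bullet, a reconfigure along these lines would put a full replica of "sessions" on each server (a sketch; shards: 1 and replicas: 3 are assumptions matching the 3-server cluster described above):

// Sketch: on a 3-server cluster, replicas: 3 places a copy of "sessions" on every server,
// so eqJoin lookups from any "pageviews" shard can be answered locally.
r.table("sessions").reconfigure({shards: 1, replicas: 3})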

I suspect that the eqJoin slowdown will be addressed by #5115 .
I'm unsure why this is still taking 1.8 seconds with that part removed.

Could you post the full query you tested when you removed the .eqJoin.zip and got the 1.8 seconds?

@danielmewes added this to the 2.3-polish milestone Dec 11, 2015
@thelinuxlich
Author

All tables are already replicated evenly across all servers.

@thelinuxlich
Author

With readMode: "outdated" on the sessions table and removing this optarg from pageviews I get 3,6 seconds

@thelinuxlich
Author

The query that returned in 1.8 seconds:

r.db("aidax").table("pageviews").between([7,r.time(2015,12,1,"-02:00")],[7,r.now()],{index: "client_id_created_at"})
.merge({session: r.db("aidax").table("sessions").get(r.row("session_id"))})
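For comparison, the same merge-based variant with the suggested outdated read mode applied to "sessions" only would look roughly like this (a sketch combining the query above with the earlier suggestion):

// Sketch: outdated reads on "sessions" only; "pageviews" keeps the default read mode.
r.db("aidax").table("pageviews")
  .between([7, r.time(2015,12,1,"-02:00")], [7, r.now()], {index: "client_id_created_at"})
  .merge({session: r.db("aidax").table("sessions", {readMode: "outdated"}).get(r.row("session_id"))})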

@thelinuxlich
Author

After talking with @danielmewes, it seems I should increase the CPU cores of my cluster and rerun the queries. I will do that and report back soon, but this is really sad; I didn't know RethinkDB wasn't suitable for big datasets (although these days a million records is nothing).

@danielmewes
Member

I believe this is just joins being slow, i.e. #5115.

RethinkDB can generally handle billions of rows, though it's probably not the right system for running analytics at such scales. It's better suited for OLTP-style queries where you only touch a small selection of rows at a time.

@thelinuxlich
Author

I don't intend to process billions of rows in a single query, but my point stands: in 2015, a million rows is nothing.

@thelinuxlich
Author

@danielmewes, I've replaced the three servers of my cluster with 16-core instances (they were 4-core); running the first query from this issue gives me roughly the same time :(
