PostgreSQL: Type cast via ruby-pg #17650

Closed
wants to merge 6 commits into
from

Projects

None yet

6 participants

@larskanis
Contributor

This pull request moves type casts for several data types into the scope of ruby-pg. This is done in three parts:

  1. As generic type casts of common Ruby classes for database inputs and PostgreSQL outputs. This is similar to the type casting of mysql2 and sqlite3 gems.
  2. As cached type maps alongside the prepared query statement.
  3. As explicit type en/decoder for more complex data types like Array types.

This simple benchmark should get a speedup of 50%: https://gist.github.com/larskanis/aec0a74cb3454f696b08

@rafaelfranca rafaelfranca and 1 other commented on an outdated diff Nov 17, 2014
@@ -81,7 +81,7 @@ platforms :ruby do
gem 'sqlite3', '~> 1.3.6'
group :db do
- gem 'pg', '>= 0.15.0'
+ gem 'pg', '>= 0.18.0.pre20141117110243'
@rafaelfranca
rafaelfranca Nov 17, 2014 Member

Is this version bump required?

@larskanis
larskanis Nov 17, 2014 Contributor

Yes, the latest prerelease of the pg gem is required.

@rafaelfranca
rafaelfranca Nov 17, 2014 Member

I see. I guess we are too close to Rails 4.2 to merge this on it. So we will have to push it to Rails 5.

@larskanis
larskanis Nov 17, 2014 Contributor

The next pg gem will surely be released before Rails 4.2. But Rails 5 is equally OK.

@rafaelfranca rafaelfranca commented on an outdated diff Nov 17, 2014
activerecord/lib/active_record/log_subscriber.rb
value = value ? "<#{value.bytesize} bytes of binary data>" : "<NULL binary data>"
end
-
@rafaelfranca
rafaelfranca Nov 17, 2014 Member

keep this line.

@rafaelfranca rafaelfranca and 2 others commented on an outdated diff Nov 17, 2014
...connection_adapters/postgresql/database_statements.rb
@@ -161,20 +161,42 @@ def substitute_at(column, index = 0)
end
def exec_query(sql, name = 'SQL', binds = [])
- execute_and_clear(sql, name, binds) do |result|
- types = {}
- fields = result.fields
- fields.each_with_index do |fname, i|
- ftype = result.ftype i
- fmod = result.fmod i
- types[fname] = get_oid_type(ftype, fmod, fname)
+ execute_and_clear(sql, name, binds) do |result, pe|
@rafaelfranca
rafaelfranca Nov 17, 2014 Member

What means pe?

@larskanis
larskanis Nov 17, 2014 Contributor

statement pool entry

@rafaelfranca
rafaelfranca Nov 17, 2014 Member

So it is better to use a more descriptive variable. pe can mean a lot of things.

@larskanis
larskanis Nov 17, 2014 Contributor

Sure!

@sgrif
sgrif Nov 17, 2014 Member

We will still need to give the types to the result set, even if the values have been cast for us.

@sgrif
sgrif Nov 17, 2014 Member

Never mind I scrolled down. >_<

@rafaelfranca rafaelfranca commented on an outdated diff Nov 17, 2014
...connection_adapters/postgresql/database_statements.rb
@@ -55,7 +55,7 @@ def select_value(arel, name = nil, binds = [])
def select_values(arel, name = nil)
arel, binds = binds_from_relation arel, []
sql = to_sql(arel, binds)
- execute_and_clear(sql, name, binds) do |result|
+ execute_and_clear(sql, name, binds) do |result, |
@rafaelfranca
rafaelfranca Nov 17, 2014 Member

We prefer |result, _|

@rafaelfranca rafaelfranca added this to the 5.0.0 milestone Nov 17, 2014
@sgrif
Member
sgrif commented Nov 17, 2014

Very happy with this direction. I'm not a fan of how much the encoder and decoders permeated through the system. This will take some time to go through, I'll have more specific feedback soon. Thanks for working on this! Very excited to get this in for Rails 5.

@sgrif sgrif and 1 other commented on an outdated diff Nov 17, 2014
...ve_record/connection_adapters/postgresql/oid/float.rb
@@ -5,14 +5,13 @@ module OID # :nodoc:
class Float < Type::Float # :nodoc:
include Infinity
- def cast_value(value)
@sgrif
sgrif Nov 17, 2014 Member

This is used for user input, not just database input.

@larskanis
larskanis Nov 30, 2014 Contributor

True, I fixed this and added a test case for this here .

@SamSaffron
Contributor

awesome, so happy to have these changes in pg. time casting is way to expensive now.

though @larskanis casting must be deferred, people using AR very sadly over-select.

stuff like Topic.first.title is very common, no point doing all the date casting there, we need to ensure this change defers all casting till first use of column.

@larskanis
Contributor

@SamSaffron Time casting is still not implemented in ruby-pg. However since you're always hunting for performance, can you do a check of this patch with your application, too?

@SamSaffron
Contributor

I will, will give it a shot today and let you know

On Thu, Nov 20, 2014 at 9:54 PM, Lars Kanis notifications@github.com
wrote:

@SamSaffron https://github.com/SamSaffron Time casting is still not
implemented in ruby-pg. However since you're always hunting for
performance, can you do a check of this patch with your application, too?


Reply to this email directly or view it on GitHub
#17650 (comment).

@SamSaffron
Contributor

I just benched it using Discourse bench and it is coming up somewhat slower (even though memory seems better)

Before patch:

Your Results: (note for timings- percentile is first, duration is second in millisecs)
---
categories_admin:
  50: 77
  75: 84
  90: 141
  99: 231
home_admin:
  50: 67
  75: 74
  90: 127
  99: 232
topic_admin:
  50: 30
  75: 31
  90: 33
  99: 116
categories:
  50: 67
  75: 74
  90: 139
  99: 233
home:
  50: 51
  75: 55
  90: 66
  99: 202
topic:
  50: 16
  75: 17
  90: 17
  99: 98
timings:
  load_rails: 3435
ruby-version: 2.1.2-p95
rss_kb: 262364
pss_kb: 255786
memorysize: 5.89 GB
operatingsystem: Ubuntu
architecture: amd64
processor0: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
physicalprocessorcount: '1'
kernelversion: 3.13.0
virtual: vmware

After patch

---
categories_admin:
  50: 81
  75: 90
  90: 102
  99: 202
home_admin:
  50: 73
  75: 82
  90: 98
  99: 196
topic_admin:
  50: 31
  75: 35
  90: 39
  99: 53
categories:
  50: 72
  75: 86
  90: 125
  99: 236
home:
  50: 54
  75: 59
  90: 118
  99: 225
topic:
  50: 15
  75: 16
  90: 19
  99: 86
timings:
  load_rails: 3199
ruby-version: 2.1.2-p95
rss_kb: 238716
pss_kb: 232126
memorysize: 5.89 GB
operatingsystem: Ubuntu
architecture: amd64
processor0: Intel(R) Core(TM) i7-4770K CPU @ 3.50GHz
physicalprocessorcount: '1'
kernelversion: 3.13.0
virtual: vmware

So for example home page in admin mode moved up to 73ms from 67ms.

This technique will make stuff faster its just that we need a clean pattern for deferring all casts, ideally AR::Result can hold off on even going to the PG::Result till we start accessing stuff. The change is quite deep afaik.

@larskanis
Contributor

@SamSaffron There must be something wrong with your environment. I did a run of the (very nice) Discourse bench on my system, and got a noticeable speedup. Not the 50% as in a pure ActiveRecord benchmark, but measurable better.

I also noticed that Discourse uses a lot of queries without binds. These are completely uncached currently and therefore don't benefit from this pull request. So I turned caching on for all queries, even these without query parameters, and got another speedup.

Your Results: (note for timings- percentile is first, duration is second in millisecs)

master-dcb3ac2              typecast-ruby-pg            all queries cached
---                         ---                         ---
categories_admin:           categories_admin:           categories_admin:
  50: 107                     50: 104                     50: 97
  75: 112                     75: 109                     75: 102
  90: 158                     90: 155                     90: 109
  99: 219                     99: 176                     99: 167
home_admin:                 home_admin:                 home_admin:
  50: 99                      50: 92                      50: 87
  75: 103                     75: 96                      75: 90
  90: 108                     90: 104                     90: 96
  99: 131                     99: 158                     99: 122
topic_admin:                topic_admin:                topic_admin:
  50: 51                      50: 47                      50: 43
  75: 54                      75: 49                      75: 46
  90: 57                      90: 52                      90: 49
  99: 65                      99: 133                     99: 74
categories:                 categories:                 categories:
  50: 86                      50: 81                      50: 78
  75: 90                      75: 86                      75: 82
  90: 107                     90: 136                     90: 89
  99: 160                     99: 171                     99: 142
home:                       home:                       home:
  50: 60                      50: 59                      50: 53
  75: 62                      75: 61                      75: 55
  90: 68                      90: 65                      90: 60
  99: 123                     99: 129                     99: 121
topic:                      topic:                      topic:
  50: 19                      50: 18                      50: 15
  75: 20                      75: 19                      75: 16
  90: 22                      90: 22                      90: 19
  99: 92                      99: 101                     99: 87
timings:                    timings:                    timings:
  load_rails: 4306            load_rails: 4282            load_rails: 4317
ruby-version: 2.1.1-p76     ruby-version: 2.1.1-p76     ruby-version: 2.1.1-p76
rss_kb: 280872              rss_kb: 335780              rss_kb: 301032
pss_kb: 276609              pss_kb: 331827              pss_kb: 297034
virtual: physical           virtual: physical           virtual: physical
architecture: amd64         architecture: amd64         architecture: amd64
operatingsystem: Ubuntu     operatingsystem: Ubuntu     operatingsystem: Ubuntu
physicalprocessorcount: 1   physicalprocessorcount: 1   physicalprocessorcount: 1
processor0: Intel(R) Core(TMprocessor0: Intel(R) Core(TMprocessor0: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz
kernelversion: 3.13.0       kernelversion: 3.13.0       kernelversion: 3.13.0
memorysize: 5.72 GB         memorysize: 5.72 GB         memorysize: 5.72 GB

So I'll try to add some caching for queries without parameters. This must be more hesitant than the current caching, to avoid blowing out all prepared statements from the cache with some IN (lot of ids) kind of queries.

Regarding deferred retrieval of fields from the PG::Result object, I did some experimentation already here . This is more for streaming result data, but could be the base for deferred type casts of single attributes, too.

@sgrif
Member
sgrif commented Nov 21, 2014

@SamSaffron This is showing performance improvements for me as well (on Discourse). Make sure you rebase onto master and point pg for discourse at the pre-release.

@sgrif
Member
sgrif commented Nov 21, 2014

I do, however, see a performance regression when instantiating multiple records as a micro benchmark

1000.times do |n|
  Person.create! name: "Aaron#{n}", fingers: 10 + n, group_id: 123 + n, height: 1.9 * n, committer: n % 2 == 0
end

require 'benchmark/ips'

Benchmark.ips do |x|
  x.report('all')         { Person.all.to_a }
end
@SamSaffron
Contributor

@larskanis / @sgrif , this is really strange, its exactly what I did.

Checked out master, benched and the merged in typecast-ruby-pg and benched (both cases had the pre release of pg installed)

@larskanis can you rebase against master and I will repeat it, I must have done something wrong or have something weird in my env.

The very big rss increase in your bench is a concern rss_kb: 280872 -> rss_kb: 335780 , worth hunting down if some sort of leak is in place.

Back in a previous life I solved the IN problem. What I did in Dapper (https://github.com/StackExchange/dapper-dot-net) was cached multiple flavors of in queries so you have IN ($1 , $2 , $3) cached and IN ($1, $2) cached and so on, it stabalises really quick in prod and you can always chuck an LRU cache in front to ensure you don't bloat prepared statements, (https://github.com/SamSaffron/lru_redux)

@larskanis
Contributor

Sorry for the delay. Now I rebased the branch and wired all given notes in. I also added another commit, that makes use of prepared statements for queries without binds. This improves query times of the discourse benchmarks as seen in the rightmost column of the above table . This kind of CountedStatementPool probably could be used by the sqlite3 adapter equally.

I also did a try to do prepare and dealloc of statements asynchronously, but although it worked, I didn't get a noticeable performance improvement. Nevertheless it's here .

@sgrif sgrif self-assigned this Nov 30, 2014
@SamSaffron
Contributor

I can confirm a 5-10% improvement here across the board 👍

@sgrif
Member
sgrif commented Dec 1, 2014

Cool, I'll do a more in depth code review this week

On Sun, Nov 30, 2014, 9:18 PM Sam notifications@github.com wrote:

I can confirm a 5-10% improvement here across the board [image: 👍]


Reply to this email directly or view it on GitHub
#17650 (comment).

@SamSaffron
Contributor

@larskanis I was thinking about memory consumption in pg and was wondering if we could start freezing column names and making use of frozen string pools in the gem, is that something you looked at yet? I mention this cause I notice strings like "created_at" can be allocated 100 times or so during a request in some conditions.

@larskanis
Contributor

@SamSaffron This is already solved with the pull request. In pg-0.18.0 the field names are frozen and allocated only once for each PG::Result, and they are reused for all methods, which return field names. However there is no reuse for the next query.

I did some experiments with frozen string pooling, but found out, that String allocation is very fast in MRI (at least for embedded strings <=23 bytes). A string pool would actually slow down pg, because of st_table lookups are more expensive.

This all doesn't really matter with this pull request, because all queries are cached now on the second usage, and the field names and (more importantly) the field types are stored alongside the prepared statement handle. So for repeated queries the PG::Result object receives only two calls: PG::Result#type_map= and #values.

So you should not see these string allocations with this patch anymore.

@SamSaffron
Contributor

@Lars sorry I mean simply using string #freeze on the column names no need
for too much fancy here... see:
http://www.sitepoint.com/unraveling-string-key-performance-ruby-2-2/

On Tue, Dec 2, 2014 at 7:08 PM, Lars Kanis notifications@github.com wrote:

@SamSaffron https://github.com/SamSaffron This is already solved with
the pull request. In pg-0.18.0 the field names are frozen and allocated
only once for each PG::Result, and they are reused for all methods, which
return field names. However there is no reuse for the next query.

I did some experiments with frozen string pooling, but found out, that
String allocation is very fast in MRI (at least for embedded strings <=23
bytes). A string pool would actually slow down pg, because of st_table
lookups are more expensive.

This all doesn't really matter with this pull request, because all queries
are cached now on the second usage, and the field names and (more
importantly) the field types are stored alongside the prepared statement
handle. So for repeated queries the PG::Result object receives only two
calls: PG::Result#type_map= and #values.

So you should not see these string allocations with this patch anymore.


Reply to this email directly or view it on GitHub
#17650 (comment).

@SamSaffron
Contributor

nice about that change though ... will profile memory and confirm.

On Tue, Dec 2, 2014 at 7:48 PM, Sam Saffron sam.saffron@gmail.com wrote:

@Lars sorry I mean simply using string #freeze on the column names no need
for too much fancy here... see:
http://www.sitepoint.com/unraveling-string-key-performance-ruby-2-2/

On Tue, Dec 2, 2014 at 7:08 PM, Lars Kanis notifications@github.com
wrote:

@SamSaffron https://github.com/SamSaffron This is already solved with
the pull request. In pg-0.18.0 the field names are frozen and allocated
only once for each PG::Result, and they are reused for all methods, which
return field names. However there is no reuse for the next query.

I did some experiments with frozen string pooling, but found out, that
String allocation is very fast in MRI (at least for embedded strings <=23
bytes). A string pool would actually slow down pg, because of st_table
lookups are more expensive.

This all doesn't really matter with this pull request, because all
queries are cached now on the second usage, and the field names and (more
importantly) the field types are stored alongside the prepared statement
handle. So for repeated queries the PG::Result object receives only two
calls: PG::Result#type_map= and #values.

So you should not see these string allocations with this patch anymore.


Reply to this email directly or view it on GitHub
#17650 (comment).

@larskanis larskanis referenced this pull request Feb 19, 2015
@sgrif sgrif Remove most PG specific type subclasses
The latest version of the PG gem can actually convert the primitives for
us in C code, which gives a pretty substantial speed up. A few cases
were only there to add the `infinity` method, which I just put on the
range type (which is the only place it was used). Floats also needed to
parse `Infinity` and `NaN`, but it felt reasonable enough to put that on
the generic form.
aafee23
larskanis added some commits Mar 21, 2015
@larskanis larskanis PostgreSQL, Fix OID based type casts in C for primitive types.
The type map was introduced in aafee23, but wasn't properly filled.

This mainly adjusts many locations, that expected strings instead of
integers or boolean.

add_pg_decoders is moved after setup of the StatementPool, because
execute_and_clear could potentially make use of it.
1d8d5a7
@larskanis larskanis PostgreSQL, Add input type casts for primitive types.
Ruby-pg's default way to serialize values for transmission to the database is to
call #to_s . This however creates a temporary String object for each value.
Setting a class based type map avoids the allocation of this additional String.

The performance benefit is measurable in microbenchmarks, but not with
the overhead of activerecord. However it's free to use and has no drawback.
9f2f268
@larskanis larskanis PostgreSQL, Move logging after send_query, to better use the idle time
while waiting for the query result.

This gives a little speedup of around 5%.

Also restore the ability to CTRL-C a running prepared query.
f7c80ff
@larskanis larskanis PostgreSQL, Use ruby-pg's built-in capabilities for array en-/decodin…
…g in C.

This obsoletes the ruby based implementations.
647eb2c
@larskanis larskanis PostgreSQL, Add infrastructure for column based type casts in ruby-pg.
This allows us to move serialisation and/or deserialisation of types into
the C functions for query execution. Doing this avoids creation of temporary
String objects with the (de-)serialized value and it can avoid unnecessary
copying of string data internally in ruby-pg.

For now, this is only used for the bytea type directly, so that we no longer
need the Hash form (with format: 1). This makes the bytea handling
symmetric (binary in and out), in contrast to the previous asymmetric handling
(binary in and text out). The received binary data is therefore no longer
copied and stored as text *and* as binary ruby String object, but directly
returned as binary String form C.

For Array types this allows more efficient (de-)serialisation of float,
boolean and integer elements. Since many test cases depends on serialisation
and deserialisation within OID::Array, I however did not move the (de-)
serialisation of array types to the C functions for query execution.
This also makes sense by the fact, that (in contrast to the bytea type)
deserialisation of array types can make the query result retrieval slower
compared to the on-demand deserialisation in OID::Array.

This patch extends the statement pool, so that it stores several non changing
parameters in addition to the prepared statement key of the cached query. This
way we can also cache field_names, types and the new column based type maps.

Unescaping of bytea data is done in ruby-pg now, with no need to alter
the encoding in deserialize. This therefore moves the bytea result encoding
check to test_read_value.
366d26d
@larskanis @larskanis larskanis PostgreSQL, Use prepared statements for queries with and without bind…
…s, based

on the usage count of the given SQL string.

Currently all queries without binds are not cached as prepared statement.
On the other hand all queries with binds are cached as prepared statement on
it's first use. Using prepared statements based on the usage count of a given
query, reduces network round trips for preparation and deallocation of
single-use queries. Most notably however, it allows creation of prepared
statements of repeated queries without binds.

The CountedStatementPool is optimized for performance and low memory footprint.
It is a very simple hash implementation based on an Array.
It takes 8 bytes per entry instead of around 50 of a Hash based one on 64-bit MRI.
It depends on String#hash, so that collisions are expected and can occur.
But they have a minor performance impact only, so they are no real issue.

There are two new configuration variables, that are hard coded for now:
The size of the CountedStatementPool (8192) and the usage count of a query
to be moved into the PreparedStatementPool (2). These could be added as another
connection parameter.

This patch also tunes PreparedStatementPool insofar, that queries are removed
in LRU order, when the pool is full, which typically should be a little bit
faster than the current least-recently-created method. And the deallocation
is done in bulks of each 5% of the pool size, in order to save database
round-trips.
41b1401
@larskanis
Contributor

I've done a rebase to master and added much better descriptions to the single commits. This hopefully give a better understanding of why these changes should be done. The commits are also rearranged, so that the least controvertible comes first (according to my understanding of course). I made sure, that each single commit passes the activerecord test suite, in case you don't like all of them.

The performance improvements are still equal to that of November (column all queries cached) . That means 5% for the first 5 commits and another 5% for the CountedStatementPool implementation measured with the Discourse benchmark.

@sgrif
Member
sgrif commented Mar 25, 2015

@larskanis Thank you for your hard work on this. I'll give it a deeper look today.

@SamSaffron
Contributor

I am super excited about these changes! awesome work

On Thu, Mar 26, 2015 at 1:42 AM, Sean Griffin notifications@github.com
wrote:

@larskanis https://github.com/larskanis Thank you for your hard work on
this. I'll give it a deeper look today.


Reply to this email directly or view it on GitHub
#17650 (comment).

@larskanis
Contributor

I think, it's more the hard work of Sean and Yves to continuously cleanup and revamp the type system since the days of rails-4.0. The type cast optimizations wouldn't be possible without that.

@sgrif
sgrif commented on 9f2f268 Mar 26, 2015

However it's free to use and has no drawback.

I disagree. Adding code that serves no immediate purpose is a drawback by itself.

On the other hand it looks like I merged this one on accident, and I'm too lazy to revert. So you win. ;)

@sgrif sgrif added a commit that referenced this pull request Mar 26, 2015
@sgrif sgrif Partially merge #17650
Merges 1d8d5a7. The pull request as a
whole is quite large, and I'm reviewing the smaller pieces individually.
3b50a7a
@sgrif
sgrif commented on f7c80ff Mar 26, 2015

This seems fine to me, but I'd rather get some other eyes on it. Does other code in your PR directly depend on this change? It seems pretty isolated. I'd rather have a separate PR here so we can discuss in isolation.

@sgrif sgrif added a commit that referenced this pull request Mar 26, 2015
@sgrif sgrif Partially merge #17650
Merges 647eb2c. The pull request as a
whole is quite large, and I'm reviewing the smaller pieces individually.
cd09261
@sgrif
sgrif commented on 366d26d Mar 26, 2015

The goal has been to reduce the number of database specific type classes for generic primitives. This is in direct opposition to that. I think we should separate out the usage of PG encoder/decoders from the Active Record type system, and move it into something more directly tied to the column and connection adapter.

@sgrif
sgrif commented on 41b1401 Mar 26, 2015

I'm going to hold off on reviewing this one, since I'm pretty sure it's affected by my comments on the last commit.

@sgrif
Member
sgrif commented Mar 26, 2015

@larskanis If you want to go ahead and rebase this, I merged all of the commits that were good to go, and commented on the ones that weren't.

@sgrif
Member
sgrif commented Oct 20, 2015

I'm going to close this due to inactivity. The biggest pieces of this were merged in isolation. If you or someone else wants to take the reigns on picking the other pieces back up, feel free to open new PRs (the remaining commits have comments on what would need to be changed, and they should each be able to be a separate PR)

@sgrif sgrif closed this Oct 20, 2015
@rafaelfranca rafaelfranca modified the milestone: 5.0.0 [temp], 5.0.0 Dec 30, 2015
@mattclar

@larskanis I understand that this is advantageous to many activerecord users as the C implementation of this will be speedier. HOWEVER this completely breaks array support for jruby users! could you please look at reinstating the old behaviour for jruby users?!

@larskanis I think this is related to #26991

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment