Add a numeric decoder written in C #25
This is about 10% faster than a pure-ruby decoder.

Passing an Integer to BigDecimal is not significantly different from passing a String, and passing a Flonum is actually slower. Integers probably aren't faster because BigDecimal converts them to cstrs internally:

https://github.com/ruby/ruby/blob/trunk/ext/bigdecimal/bigdecimal.c#L290-L292
https://github.com/ruby/ruby/blob/trunk/ext/bigdecimal/bigdecimal.c#L302-L306

Flonums are slower because BigDecimal converts them to Rational and then does some processing on the Rational:

https://github.com/ruby/ruby/blob/trunk/ext/bigdecimal/bigdecimal.c#L247-L278

Here's the code I originally had that tested Integer and Flonum approaches (I removed it when I benchmarked and found those approaches were the same speed or slower):

```
VALUE bd;
char *found;
if ((found = strchr(val, '.'))) {
  if (len <= 16) {
    bd = DBL2NUM(rb_cstr_to_dbl(val, 1));
    len -= found - val - 1;
    return rb_funcall(rb_cObject, s_id_BigDecimal, 2, bd, INT2NUM(len));
  }
  bd = rb_str_new(val, len);
} else {
  bd = pg_text_dec_integer(conv, val, len, tuple, field, enc_idx);
}
return rb_funcall(rb_cObject, s_id_BigDecimal, 1, bd);
```

This also includes a text encoder written in ruby.
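For readers who want to check the String vs. Integer vs. Flonum observation without a database, here is a minimal sketch that times the three `BigDecimal` constructor inputs in plain Ruby. It is not part of the patch: `CONSTRUCTORS` and `time_constructors` are hypothetical names, the sample value is the `small` string from the benchmark script, the Float precision of 14 digits is an assumption, and it uses the stdlib `Benchmark` module rather than `benchmark/ips`. Timings will vary by Ruby and bigdecimal version.

```ruby
require 'bigdecimal'
require 'benchmark'

# Three ways of building a BigDecimal from roughly the same value.
# The Float variant needs an explicit precision (14 significant digits assumed here).
CONSTRUCTORS = {
  'String'  => -> { BigDecimal('123456790123.12') },
  'Integer' => -> { BigDecimal(123456790123) },
  'Float'   => -> { BigDecimal(123456790123.12, 14) },
}.freeze

# Returns a Hash of constructor name => elapsed wall-clock seconds.
def time_constructors(iterations)
  CONSTRUCTORS.map do |name, build|
    [name, Benchmark.realtime { iterations.times { build.call } }]
  end.to_h
end

if $PROGRAM_NAME == __FILE__
  time_constructors(200_000).each do |name, seconds|
    puts format('%-8s %.3fs', name, seconds)
  end
end
```

Running the file directly prints one timing per input type; the relative order is what matters, not the absolute numbers.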
Example test/benchmark code:

```ruby
require 'pg'
require 'bigdecimal'
require 'benchmark/ips'

int_max = 10**18
small_float_range = 0...15
large_float_range = 0...1000
log_cols = ((ENV['BENCH_LOG_COLS'] || 6).to_i)
log_rows = ((ENV['BENCH_LOG_ROWS'] || 6).to_i)
cols = 2**log_cols
raise "BENCH_LOG_COLS must be <= 10" if log_cols > 10
raise "BENCH_LOG_ROWS must be <= 10" if log_rows > 10

if ENV['PURE_RUBY'] == '1'
  class NumericDecoder < PG::SimpleDecoder
    def decode(string, tuple=nil, field=nil)
      BigDecimal(string)
    end
  end

  class NumericEncoder < PG::SimpleEncoder
    def encode(decimal)
      decimal.to_s('F')
    end
  end

  PG::BasicTypeRegistry.register_type(0, 'numeric', NumericEncoder, NumericDecoder)
end

conn = PG.connect
unless ENV['ALL_STRINGS'] == '1'
  conn.type_map_for_queries = PG::BasicTypeMapForQueries.new conn
  conn.type_map_for_results = PG::BasicTypeMapForResults.new conn
end

conn.exec("BEGIN")
at_exit{conn.exec("ROLLBACK")}

conn.exec("CREATE TABLE int_numeric_test (#{(0...cols).map{|i| "d#{i} numeric(40, 2) DEFAULT '#{(rand*int_max).to_i}'"}.join(', ')})")
conn.exec("CREATE TABLE small_float_numeric_test (#{(0...cols).map{|i| "d#{i} numeric(15, 2) DEFAULT '#{s = small_float_range.map{rand(10)}.join; s[-3] = '.'; s}'"}.join(', ')})")
conn.exec("CREATE TABLE large_float_numeric_test (#{(0...cols).map{|i| "d#{i} numeric(1000, 10) DEFAULT '#{s = large_float_range.map{rand(10)}.join; s[-10] = '.'; s}'"}.join(', ')})")

conn.exec("INSERT INTO int_numeric_test DEFAULT VALUES")
conn.exec("INSERT INTO small_float_numeric_test DEFAULT VALUES")
conn.exec("INSERT INTO large_float_numeric_test DEFAULT VALUES")

log_rows.times do
  conn.exec("INSERT INTO int_numeric_test SELECT * FROM int_numeric_test")
  conn.exec("INSERT INTO small_float_numeric_test SELECT * FROM small_float_numeric_test")
  conn.exec("INSERT INTO large_float_numeric_test SELECT * FROM large_float_numeric_test")
end

['int_numeric_test', 'small_float_numeric_test', 'large_float_numeric_test'].each do |v|
  conn.exec( "SELECT d0 FROM #{v} LIMIT 1" ) do |res|
    v = res.getvalue(0, 0)
    print "Example #{v} value: #{v.is_a?(BigDecimal) ? v.to_s('F') : v}\n"
  end
end

puts
small = '123456790123.12'
large = ('123456790'*10) << '.' << ('012345679')
puts "Basic Tests"
numeric_tests = [
  '1',
  '1.0',
  '1.2',
  small,
  large,
]
numeric_tests.each do |d|
  conn.exec("SELECT #{d}::numeric") do |res|
    v = res.getvalue(0, 0)
    print "Test decimal value: #{d} should equal #{v.is_a?(BigDecimal) ? v.to_s('F') : v}\n"
  end
end

conn.exec_params("SELECT $1::numeric, $2::numeric", [BigDecimal(1), BigDecimal(large)]) do |res|
  v = res.getvalue(0, 0)
  print "Test bigdecimal text encoder values: 1 should equal #{v.is_a?(BigDecimal) ? v.to_s('F') : v}\n"
  v = res.getvalue(0, 1)
  print "Test bigdecimal text encoder values: #{large} should equal #{v.is_a?(BigDecimal) ? v.to_s('F') : v}\n"
end

Benchmark.ips do |x|
  x.warmup = -1
  ['int_numeric_test', 'small_float_numeric_test', 'large_float_numeric_test'].each do |v|
    sql = "SELECT * FROM #{v}"
    x.report(v) do
      conn.exec(sql) do |res|
        ntuples = res.ntuples
        recnum = 0
        while recnum < ntuples
          converted_rec = {}
          fieldnum = 0
          while fieldnum < cols
            res.getvalue(recnum, fieldnum)
            fieldnum += 1
          end
          recnum += 1
        end
      end
    end
  end
end

=begin
Integer Numeric
Strings: ~275 ips
Pure Ruby BigDecimal: ~110 ips
C BigDecimal: ~120 ips

Small Float Numeric
Strings: ~300 ips
Pure Ruby BigDecimal: ~115 ips
C BigDecimal: ~126 ips

Large Float Numeric
Strings: ~10 ips
Pure Ruby BigDecimal: ~7.3 ips
C BigDecimal: ~7.3 ips
=end
```
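As a quick check of the "about 10% faster" figure against the numbers reported at the end of the script, the following sketch computes the relative speedup of the C decoder over the pure-Ruby one for each table. The ips values are copied from the results above; `RESULTS` and `speedup_percent` are illustrative names, not part of the patch.

```ruby
# Approximate iterations per second reported in the benchmark results.
RESULTS = {
  'Integer Numeric'     => { ruby: 110.0, c: 120.0 },
  'Small Float Numeric' => { ruby: 115.0, c: 126.0 },
  'Large Float Numeric' => { ruby: 7.3,   c: 7.3 },
}.freeze

# Percentage by which the C decoder outperforms the pure-Ruby decoder.
def speedup_percent(ruby_ips, c_ips)
  ((c_ips / ruby_ips - 1.0) * 100).round(1)
end

RESULTS.each do |name, ips|
  puts "#{name}: #{speedup_percent(ips[:ruby], ips[:c])}% faster in C"
end
```

On these figures the gain is a little over 9% for the integer and small-float cases and not measurable for the 1000-digit values, where the cost of constructing the BigDecimal itself presumably dominates.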
Another important note here is that this is the first time PG is getting default decimal mapping, which is big cause
spec/pg/basic_type_mapping_spec.rb
@@ -166,6 +179,27 @@
end
end

it "should do numeric type conversions" do
[0].each do |format|
small = '123456790123.12' |
Something's up with the indentation here.
Thanks, I just pushed another commit to fix that.
Thank you - merged! Out of curiosity: Is this related to Sequel in any way?
No, Sequel doesn't use
Anyway, your contributions are greatly appreciated! Sequel still uses the query params typecasting (which is
I wouldn't be against accepting a patch for that, and may implement it if I have time. Sequel still supports old versions of pg, though, so it would have to test for support before using it.
This works since pg-0.18.0, but compatibility with postgres-pr and jdbc surely needs to be considered.
I think Sequel supports back to pg-0.8.0. postgres-pr and jdbc don't matter in this case as parameters are not used on postgres-pr, and jdbc uses the jdbc adapter and not the postgres adapter.
But sequel_pg requires pg >= 0.18.0, so couldn't it be added there?
sequel_pg doesn't handle anything related to input parameters, it only handles decoding results, so it wouldn't make sense to add it to sequel_pg. It should be added to Sequel, but made conditional (
Add Ruby 2.5, and add ruby-head as allow_failure.
This is required for rails#39063 to use `PG::TextDecoder::Numeric`. Ref ged/ruby-pg#25. The pg gem 1.1.0 was released on August 24, 2018, so I think it is a good time to bump the required version in order to improve and clean up the code base. https://rubygems.org/gems/pg/versions
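For code that cannot bump its minimum pg version, one alternative is to test for the decoder at load time, in the spirit of the "test for support before using it" remark in the thread above. The sketch below is only an illustration under stated assumptions: `FallbackNumericDecoder` and `numeric_decoder` are hypothetical names, and the fallback is a plain Ruby object mirroring the pure-Ruby decoder from the benchmark script rather than anything shipped by pg, Sequel, or Rails.

```ruby
require 'bigdecimal'

begin
  require 'pg'
rescue LoadError
  # pg is not installed; only the pure-Ruby fallback below is available.
end

# Hypothetical stand-in with the same decode signature as a PG::SimpleDecoder.
class FallbackNumericDecoder
  def decode(string, tuple = nil, field = nil)
    BigDecimal(string)
  end
end

# Prefer the C decoder when the loaded pg provides it, else fall back.
# defined? returns nil (without raising) if PG or the nested constant is missing.
def numeric_decoder
  if defined?(PG::TextDecoder::Numeric)
    PG::TextDecoder::Numeric.new
  else
    FallbackNumericDecoder.new
  end
end

puts numeric_decoder.decode('123456790123.12').to_s('F')
```

Either branch returns an object whose `decode` turns PostgreSQL's text representation of a numeric into a `BigDecimal`, so callers do not need to know which one they got.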