New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use `ActiveRecord::ConnectionAdapters::Quoting.quote` to quote bounds for a PostgreSQL range #30725

Merged
merged 1 commit into from Sep 27, 2017

Conversation

Projects
None yet
6 participants
@tcannonfodder
Contributor

tcannonfodder commented Sep 26, 2017

Summary

PostgreSQL 9.1+ introduced range types, and Rails added support for using this datatype in ActiveRecord. However, the serialization of PostgreSQL::OID::Range was incomplete, because it did not properly quote the bounds that make up the range. A clear example of this is a tsrange.

Normally, ActiveRecord quotes Date/Time objects to include the milliseconds. However, the way PostgreSQL::OID::Range serialized its bounds, the milliseconds were dropped. This meant that the value was incomplete and not equal to the submitted value.

An example of normal timestamps vs. a tsrange. Note how the bounds for the range do not include their milliseconds (they were present in the ruby Range):

    UPDATE "timespans" SET "updated_at" = $1, "range" = $2 WHERE
    "timespans"."id" = $3
    [["updated_at", "2017-09-23 17:07:01.304864"],
    ["range", "[2017-09-23 00:00:00 UTC,2017-09-23 23:59:59 UTC]"],
    ["id", 1234]]

PostgreSQL::OID::Range serialized the range by interpolating a string for the range, which works for most cases, but does not work for timestamps:

    def serialize(value)
      if value.is_a?(::Range)
        from = type_cast_single_for_database(value.begin)
        to = type_cast_single_for_database(value.end)
        "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
      else
        super
      end
    end

    (byebug) from = type_cast_single_for_database(value.begin)
    2010-01-01 13:30:00 UTC

    (byebug) to = type_cast_single_for_database(value.end)
    2011-02-02 19:30:00 UTC

    (byebug) "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
    "[2010-01-01 13:30:00 UTC,2011-02-02 19:30:00 UTC)"

By using ActiveRecord::ConnectionAdapters::Quoting.quote, we ensure that the range bounds are properly quoted for the database:

    def serialize(value)
      if value.is_a?(::Range)
        from = type_cast_single_for_database(value.begin)
        to = type_cast_single_for_database(value.end)
        "[#{quote(from)},#{quote(to)}#{value.exclude_end? ? ')' : ']'}"
      else
        super
      end
    end

    (b) from
    2010-01-01 13:30:00 UTC
    (b) to
    2011-02-02 19:30:00 UTC

    "[#{quote(from)},#{quote(to)}#{value.exclude_end? ? ')' : ']'}"
    #=> "['2010-01-01 13:30:00.670277','2011-02-02 19:30:00.745125')"

Other Information

This commit includes tests to make sure the milliseconds are preserved in tsrange and tstzrange columns.

@rails-bot

This comment has been minimized.

rails-bot commented Sep 26, 2017

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @georgeclaghorn (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@matthewd

This comment has been minimized.

Member

matthewd commented Sep 26, 2017

Pulling in Quoting directly seems problematic.. like it might work for this type, but may not for others.

r? @sgrif

@rails-bot rails-bot assigned sgrif and unassigned georgeclaghorn Sep 26, 2017

@sgrif

This comment has been minimized.

Member

sgrif commented Sep 26, 2017

Yeah, quoting at all is definitely wrong here. I'm not sure what I was thinking in 02579b5, but what this is doing is definitely in the wrong place. It should probably just be returning a range of subtype.serialize(value.begin) and subtype.serialize(value.end), and letting the adapter handle the rest.

@tcannonfodder

This comment has been minimized.

Contributor

tcannonfodder commented Sep 26, 2017

@sgrif Thanks for the feedback! I just implemented your suggestion in 25c282c ; the code is a lot more sensible now :)

activerecord/lib/active_record/connection_adapters/postgresql/quoting.rb Outdated
@@ -133,6 +137,10 @@ def encode_array(array_data)
result
end
def encode_range(range)
"[#{quote(range.first)},#{quote(range.last)}#{range.exclude_end? ? ')' : ']'}"

This comment has been minimized.

@sgrif

sgrif Sep 26, 2017

Member

I don't think we need to call quote here

This comment has been minimized.

@tcannonfodder

tcannonfodder Sep 26, 2017

Contributor

Calling quote makes sure the values are properly formatted for queries. For example: removing quote, causes the microseconds to be dropped

(byebug) from = type_cast_single_for_database(range.first)
2010-01-01 13:30:00 UTC

(byebug) to = type_cast_single_for_database(range.last)
2011-02-02 19:30:00 UTC

(byebug) "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
"[2010-01-01 13:30:00 UTC,2011-02-02 19:30:00 UTC)"

(byebug) "[#{quote(from)},#{quote(to)}#{value.exclude_end? ? ')' : ']'}"
"['2010-01-01 13:30:00.670277','2011-02-02 19:30:00.745125')"

Let me know if there's another place where we should call quote on the values, but we need to call quote in order to make sure the values are properly formatted and nothing is lost. I think calling it in encode_range makes the most sense, because it's part of building the range.

This comment has been minimized.

@sgrif

sgrif Sep 26, 2017

Member

Calling type_cast should do the same thing

activerecord/test/cases/adapters/postgresql/quoting_test.rb Outdated
def test_quote_range_date
range = (Date.parse("2010-01-01")...Date.parse("2010-01-01"))
type = OID::Range.new(Type::DateTime.new, :tsrange)
assert_equal "'[''2010-01-01'',''2010-01-01'')'", @conn.quote(type.serialize(range))

This comment has been minimized.

@sgrif

sgrif Sep 26, 2017

Member

The round trip tests are sufficient, I don't think we need to test this quoting behavior.

This comment has been minimized.

@tcannonfodder

tcannonfodder Sep 26, 2017

Contributor

Gotcha, so remove all the tests introduced in this commit?

This comment has been minimized.

@sgrif

sgrif Sep 26, 2017

Member

Just the ones in this file. The ones in range_test.rb all look good

@sgrif

This comment has been minimized.

Member

sgrif commented Sep 26, 2017

Don't forget that you have a failing test, too

@tcannonfodder

This comment has been minimized.

Contributor

tcannonfodder commented Sep 26, 2017

@sgrif removed the extra quoting_tests.rb and fixed the failing test in 1fb70c1

@sgrif

This comment has been minimized.

Member

sgrif commented Sep 26, 2017

LGTM. Can you please squash your commits?

@sgrif

This comment has been minimized.

Member

sgrif commented Sep 26, 2017

This needs a changelog entry as well (probably just mention that tsrange now preserves subsecond precision)

@sgrif

This comment has been minimized.

Member

sgrif commented Sep 27, 2017

Can you rebase as well so it's a single commit?

`Postgres::OID::Range` serializes to a `Range`, quote in `Quoting`
PostgreSQL 9.1+ introduced range types, and Rails added support for
using this datatype in ActiveRecord. However, the serialization of
`PostgreSQL::OID::Range` was incomplete, because it did not properly
quote the bounds that make up the range. A clear example of this is a
`tsrange`.

Normally, ActiveRecord quotes Date/Time objects to include the
milliseconds. However, the way `PostgreSQL::OID::Range` serialized its
bounds, the milliseconds were dropped. This meant that the value was
incomplete and not equal to the submitted value.

An example of normal timestamps vs. a `tsrange`. Note how the bounds
for the range do not include their milliseconds (they were present in
the ruby Range):

    UPDATE "iterations" SET "updated_at" = $1, "range" = $2 WHERE
    "iterations"."id" = $3
    [["updated_at", "2017-09-23 17:07:01.304864"],
    ["range", "[2017-09-23 00:00:00 UTC,2017-09-23 23:59:59 UTC]"],
    ["id", 1234]]

`PostgreSQL::OID::Range` serialized the range by interpolating a
string for the range, which works for most cases, but does not work
for timestamps:

    def serialize(value)
      if value.is_a?(::Range)
        from = type_cast_single_for_database(value.begin)
        to = type_cast_single_for_database(value.end)
        "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
      else
        super
      end
    end

    (byebug) from = type_cast_single_for_database(value.begin)
    2010-01-01 13:30:00 UTC

    (byebug) to = type_cast_single_for_database(value.end)
    2011-02-02 19:30:00 UTC

    (byebug) "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
    "[2010-01-01 13:30:00 UTC,2011-02-02 19:30:00 UTC)"

@sgrif (the original implementer for Postgres Range support) provided
some feedback about where the quoting should occur:

  Yeah, quoting at all is definitely wrong here. I'm not sure what I
  was thinking in 02579b5, but what this is doing is definitely in the
  wrong place. It should probably just be returning a range of
  subtype.serialize(value.begin) and subtype.serialize(value.end), and
  letting the adapter handle the rest.

`Postgres::OID::Range` now returns a `Range` object, and
`ActiveRecord::ConnectionAdapters::PostgreSQL::Quoting` can now encode
and quote a `Range`:

    def encode_range(range)
      "[#{type_cast(range.first)},#{type_cast(range.last)}#{range.exclude_end? ? ')' : ']'}"
    end

    ...

    encode_range(range)
    #=> "['2010-01-01 13:30:00.670277','2011-02-02 19:30:00.745125')"

This commit includes tests to make sure the milliseconds are
preserved in `tsrange` and `tstzrange` columns
@tcannonfodder

This comment has been minimized.

Contributor

tcannonfodder commented Sep 27, 2017

Yep, sorry about that!

@sgrif sgrif merged commit 48ef5e0 into rails:master Sep 27, 2017

1 of 2 checks passed

continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
codeclimate All good!
Details
@alexdilley

This comment has been minimized.

alexdilley commented Dec 7, 2017

Leaving here for reference in case anyone else stumbles upon something similar.

Prior to this change I had the following:

scope :overlaps, (lambda { |tsrange|
  where Arel::Nodes::InfixOperation.new(
    '&&',
    arel_table[:interval],
    Arel.sql("'#{arel_table[:interval].type_cast_for_database(tsrange)}'")
  )
})

In Rails 5.2 this will need to be more like:

scope :overlaps, (lambda { |tsrange|
  where Arel::Nodes::InfixOperation.new(
    '&&',
    arel_table[:interval],
    Arel::Nodes::Quoted.new(tsrange)
  )
})

*which admittedly is a cleaner solution anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment