Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ActiveRecord::ConnectionAdapters::Quoting.quote to quote bounds for a PostgreSQL range #30725

Merged

Conversation

tcannonfodder
Copy link
Contributor

@tcannonfodder tcannonfodder commented Sep 26, 2017

Summary

PostgreSQL 9.1+ introduced range types, and Rails added support for using this datatype in ActiveRecord. However, the serialization of PostgreSQL::OID::Range was incomplete, because it did not properly quote the bounds that make up the range. A clear example of this is a tsrange.

Normally, ActiveRecord quotes Date/Time objects to include the milliseconds. However, the way PostgreSQL::OID::Range serialized its bounds, the milliseconds were dropped. This meant that the value was incomplete and not equal to the submitted value.

An example of normal timestamps vs. a tsrange. Note how the bounds for the range do not include their milliseconds (they were present in the ruby Range):

    UPDATE "timespans" SET "updated_at" = $1, "range" = $2 WHERE
    "timespans"."id" = $3
    [["updated_at", "2017-09-23 17:07:01.304864"],
    ["range", "[2017-09-23 00:00:00 UTC,2017-09-23 23:59:59 UTC]"],
    ["id", 1234]]

PostgreSQL::OID::Range serialized the range by interpolating a string for the range, which works for most cases, but does not work for timestamps:

    def serialize(value)
      if value.is_a?(::Range)
        from = type_cast_single_for_database(value.begin)
        to = type_cast_single_for_database(value.end)
        "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
      else
        super
      end
    end

    (byebug) from = type_cast_single_for_database(value.begin)
    2010-01-01 13:30:00 UTC

    (byebug) to = type_cast_single_for_database(value.end)
    2011-02-02 19:30:00 UTC

    (byebug) "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
    "[2010-01-01 13:30:00 UTC,2011-02-02 19:30:00 UTC)"

By using ActiveRecord::ConnectionAdapters::Quoting.quote, we ensure that the range bounds are properly quoted for the database:

    def serialize(value)
      if value.is_a?(::Range)
        from = type_cast_single_for_database(value.begin)
        to = type_cast_single_for_database(value.end)
        "[#{quote(from)},#{quote(to)}#{value.exclude_end? ? ')' : ']'}"
      else
        super
      end
    end

    (b) from
    2010-01-01 13:30:00 UTC
    (b) to
    2011-02-02 19:30:00 UTC

    "[#{quote(from)},#{quote(to)}#{value.exclude_end? ? ')' : ']'}"
    #=> "['2010-01-01 13:30:00.670277','2011-02-02 19:30:00.745125')"

Other Information

This commit includes tests to make sure the milliseconds are preserved in tsrange and tstzrange columns.

@rails-bot
Copy link

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @georgeclaghorn (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@matthewd
Copy link
Member

Pulling in Quoting directly seems problematic.. like it might work for this type, but may not for others.

r? @sgrif

@rails-bot rails-bot assigned sgrif and unassigned georgeclaghorn Sep 26, 2017
@sgrif
Copy link
Contributor

sgrif commented Sep 26, 2017

Yeah, quoting at all is definitely wrong here. I'm not sure what I was thinking in 02579b5, but what this is doing is definitely in the wrong place. It should probably just be returning a range of subtype.serialize(value.begin) and subtype.serialize(value.end), and letting the adapter handle the rest.

@tcannonfodder
Copy link
Contributor Author

@sgrif Thanks for the feedback! I just implemented your suggestion in 25c282c ; the code is a lot more sensible now :)

@@ -133,6 +137,10 @@ def encode_array(array_data)
result
end

def encode_range(range)
"[#{quote(range.first)},#{quote(range.last)}#{range.exclude_end? ? ')' : ']'}"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to call quote here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling quote makes sure the values are properly formatted for queries. For example: removing quote, causes the microseconds to be dropped

(byebug) from = type_cast_single_for_database(range.first)
2010-01-01 13:30:00 UTC

(byebug) to = type_cast_single_for_database(range.last)
2011-02-02 19:30:00 UTC

(byebug) "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
"[2010-01-01 13:30:00 UTC,2011-02-02 19:30:00 UTC)"

(byebug) "[#{quote(from)},#{quote(to)}#{value.exclude_end? ? ')' : ']'}"
"['2010-01-01 13:30:00.670277','2011-02-02 19:30:00.745125')"

Let me know if there's another place where we should call quote on the values, but we need to call quote in order to make sure the values are properly formatted and nothing is lost. I think calling it in encode_range makes the most sense, because it's part of building the range.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling type_cast should do the same thing

def test_quote_range_date
range = (Date.parse("2010-01-01")...Date.parse("2010-01-01"))
type = OID::Range.new(Type::DateTime.new, :tsrange)
assert_equal "'[''2010-01-01'',''2010-01-01'')'", @conn.quote(type.serialize(range))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The round trip tests are sufficient, I don't think we need to test this quoting behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, so remove all the tests introduced in this commit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the ones in this file. The ones in range_test.rb all look good

@sgrif
Copy link
Contributor

sgrif commented Sep 26, 2017

Don't forget that you have a failing test, too

@tcannonfodder
Copy link
Contributor Author

@sgrif removed the extra quoting_tests.rb and fixed the failing test in 1fb70c1

@sgrif
Copy link
Contributor

sgrif commented Sep 26, 2017

LGTM. Can you please squash your commits?

@sgrif
Copy link
Contributor

sgrif commented Sep 26, 2017

This needs a changelog entry as well (probably just mention that tsrange now preserves subsecond precision)

@tcannonfodder tcannonfodder force-pushed the record-milliseconds-in-postgres-timestamp-ranges branch 2 times, most recently from 9d0d840 to 23d9d5b Compare September 26, 2017 23:53
@sgrif
Copy link
Contributor

sgrif commented Sep 27, 2017

Can you rebase as well so it's a single commit?

PostgreSQL 9.1+ introduced range types, and Rails added support for
using this datatype in ActiveRecord. However, the serialization of
`PostgreSQL::OID::Range` was incomplete, because it did not properly
quote the bounds that make up the range. A clear example of this is a
`tsrange`.

Normally, ActiveRecord quotes Date/Time objects to include the
milliseconds. However, the way `PostgreSQL::OID::Range` serialized its
bounds, the milliseconds were dropped. This meant that the value was
incomplete and not equal to the submitted value.

An example of normal timestamps vs. a `tsrange`. Note how the bounds
for the range do not include their milliseconds (they were present in
the ruby Range):

    UPDATE "iterations" SET "updated_at" = $1, "range" = $2 WHERE
    "iterations"."id" = $3
    [["updated_at", "2017-09-23 17:07:01.304864"],
    ["range", "[2017-09-23 00:00:00 UTC,2017-09-23 23:59:59 UTC]"],
    ["id", 1234]]

`PostgreSQL::OID::Range` serialized the range by interpolating a
string for the range, which works for most cases, but does not work
for timestamps:

    def serialize(value)
      if value.is_a?(::Range)
        from = type_cast_single_for_database(value.begin)
        to = type_cast_single_for_database(value.end)
        "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
      else
        super
      end
    end

    (byebug) from = type_cast_single_for_database(value.begin)
    2010-01-01 13:30:00 UTC

    (byebug) to = type_cast_single_for_database(value.end)
    2011-02-02 19:30:00 UTC

    (byebug) "[#{from},#{to}#{value.exclude_end? ? ')' : ']'}"
    "[2010-01-01 13:30:00 UTC,2011-02-02 19:30:00 UTC)"

@sgrif (the original implementer for Postgres Range support) provided
some feedback about where the quoting should occur:

  Yeah, quoting at all is definitely wrong here. I'm not sure what I
  was thinking in 02579b5, but what this is doing is definitely in the
  wrong place. It should probably just be returning a range of
  subtype.serialize(value.begin) and subtype.serialize(value.end), and
  letting the adapter handle the rest.

`Postgres::OID::Range` now returns a `Range` object, and
`ActiveRecord::ConnectionAdapters::PostgreSQL::Quoting` can now encode
and quote a `Range`:

    def encode_range(range)
      "[#{type_cast(range.first)},#{type_cast(range.last)}#{range.exclude_end? ? ')' : ']'}"
    end

    ...

    encode_range(range)
    #=> "['2010-01-01 13:30:00.670277','2011-02-02 19:30:00.745125')"

This commit includes tests to make sure the milliseconds are
preserved in `tsrange` and `tstzrange` columns
@tcannonfodder tcannonfodder force-pushed the record-milliseconds-in-postgres-timestamp-ranges branch from ede37b4 to 51b6c34 Compare September 27, 2017 00:04
@tcannonfodder
Copy link
Contributor Author

Yep, sorry about that!

@sgrif sgrif merged commit 48ef5e0 into rails:master Sep 27, 2017
@alexdilley
Copy link

Leaving here for reference in case anyone else stumbles upon something similar.

Prior to this change I had the following:

scope :overlaps, (lambda { |tsrange|
  where Arel::Nodes::InfixOperation.new(
    '&&',
    arel_table[:interval],
    Arel.sql("'#{arel_table[:interval].type_cast_for_database(tsrange)}'")
  )
})

In Rails 5.2 this will need to be more like:

scope :overlaps, (lambda { |tsrange|
  where Arel::Nodes::InfixOperation.new(
    '&&',
    arel_table[:interval],
    Arel::Nodes::Quoted.new(tsrange)
  )
})

*which admittedly is a cleaner solution anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants