Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix SchemaStatements#initialize_type_map #309

Merged
merged 1 commit into from
Aug 10, 2020

Conversation

thibaudgg
Copy link
Contributor

Since the Rails 6 upgrade, we started seeing some random PostGIS
serialization/deserialization issues, debugging the issues shows that the
ActiveRecord connection type_map isn't initialized properly.
It does not contain the PostGIS OID type mappings after initialize_type_map has ended.

Usually, Rails would add missing OID mappings when encountered, however,
there are code paths that go around this (See lookup_cast_type_from_column).
This is most likely a bug in Rails internals.

This patch fixes the initialize_type_map method by ensuring that
the specific PostGIS types are registered before calling super which
will then call load_additional_types and registers the other PostGIS OIDs.

This patch should solve #307 and #308.

Since the Rails 6 upgrade, we started seeing some random PostGIS
serialization/deserialization issues, debugging the issues shows that the
ActiveRecord connection `type_map` isn't initialized properly.
It does not contain the PostGIS OID type mappings after `initialize_type_map` has ended.

Usually, Rails would add missing OID mappings when encountered, however,
there are code paths that go around this (See [`lookup_cast_type_from_column`](https://github.com/rails/rails/blob/15748f6a052d4df68b6acf66456c181b42d9fe25/activerecord/lib/active_record/connection_adapters/postgresql/quoting.rb#L78)).
This is most likely a bug in Rails internals.

This patch fixes the [`initialize_type_map`](https://github.com/rgeo/activerecord-postgis-adapter/blob/cc0e3f53520feed0b43800d8ee86561ee03de413/lib/active_record/connection_adapters/postgis/schema_statements.rb#L87-L105) method by ensuring that
the specific PostGIS types are registered before calling `super` which
will then call [`load_additional_types`](https://github.com/rgeo/activerecord-postgis-adapter/blob/cc0e3f53520feed0b43800d8ee86561ee03de413/lib/active_record/connection_adapters/postgis/schema_statements.rb#L87-L105) and registers the other PostGIS OIDs.

This patch should solve rgeo#307 and rgeo#308.
senny added a commit to rails/rails that referenced this pull request Feb 25, 2020
…illaume-Gentil & Yves Senn]

After upgrading to Rails 6 we have encountered sporadic type casting
issues in our staging system. Our production installation did not
exhibit the same behavior.

The error behavior looked something like this:

```
TypeError: can't quote RGeo::Geographic::SphericalPointImpl

[GEM_ROOT]/gems/activerecord-6.0.2.1/lib/active_record/connection_adapters/abstract/quoting.rb:231:in `_quote'
[GEM_ROOT]/gems/activerecord-6.0.2.1/lib/active_record/connection_adapters/postgresql/quoting.rb:144:in `_quote'
[GEM_ROOT]/gems/activerecord-6.0.2.1/lib/active_record/connection_adapters/abstract/quoting.rb:18:in `quote'
[GEM_ROOT]/gems/activerecord-postgis-adapter-6.0.0/lib/active_record/connection_adapters/postgis_adapter.rb:105:in `quote'
[GEM_ROOT]/gems/activerecord-6.0.2.1/lib/arel/collectors/substitute_binds.rb:17:in `add_bind'
...
```

or this:

```
NoMethodError: undefined method `factory' for "0101000020E61000006094A0BFD023014089601C5C3AB64440":String
[GEM_ROOT]/gems/rgeo-1.1.1/lib/rgeo/feature/types.rb:160:in `cast'
[GEM_ROOT]/gems/rgeo-1.1.1/lib/rgeo/geographic/spherical_feature_methods.rb:31:in `distance'
...
```

Both indicate that data for PostGIS columns was not properly type
casted. In the first case when writing/querying and in the second when
reading.

Through instrumentation on the staging system, we were able to
replicate the flow of events that caused our model to enter this
state.

1. PostGIS ActiveRecord Adapter

We've identified a [bug in the
`activerecord-postgis-adapter`](rgeo/activerecord-postgis-adapter#309)
that results in a partially initialized `type_map`. This is a
prerequisite for the bug outlined in this ticket to ocurr. Note that
the bug is not exclusive to PostGIS, it could happen with other custom
defined types in PostgreSQL as well. Important is that the `type_map`
does not include a mappin for that OID type after
[`initialize_type_map`](https://github.com/rails/rails/blob/8d57cb39a88787bb6cfb7e1c481556ef6d8ede7a/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L526)
ran.

2. `type_map` lookups

There are two instances where the PostgreSQL adapter uses the
`type_map`:

- (A) [`get_oid_type`](https://github.com/rails/rails/blob/master/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L514-L516)
- (B) [`lookup_cast_type_from_column`](https://github.com/rails/rails/blob/15748f6a052d4df68b6acf66456c181b42d9fe25/activerecord/lib/active_record/connection_adapters/postgresql/quoting.rb#L77-L79)

What's important to note here is that `get_oid_type` will attmpt to
add missing OID mappings, while `lookup_cast_type_from_column` assums
the `type_map` is complete and does a simple access.

3. Attribute definitions in ModelSchema.

The attributes of a model are defined on first access in
[`load_schema!`](https://github.com/rails/rails/blob/master/activerecord/lib/active_record/model_schema.rb#L491-L498)
using `lookup_cast_type_from_column` (the `type_map` lookup that does
not try to load missing OID mappings). However, in most scenarios,
this does not cause an issue because of the preceeding line
[`connection.schema_cache.columns_hash(table_name)`](https://github.com/rails/rails/blob/master/activerecord/lib/active_record/model_schema.rb#L490). Fetching
the table structure will go through `get_oid_type` and add the missing
OID just before `lookup_cast_type_from_column` happens.

4. Shared data

The issue only occurs because some information is shared between
threads, while other data is local to a connection:

- `type_map`: Local to a connection. Fresh connections can lack
  certain OID mappings.
- `schema_cache`: Per connection pool.
- `ModelSchema`: Global.

5. Flow of events that cause the problem

Given the behavior outlined above, there is a sequence of events that
can cause the ModelSchema to define an attribute with
`ActiveModel::Type::Value` (default fallback) instead of a specific
OID type. The sequence that we have seen goes something like this:

1. Connection A caches the `columns_hash` for table `examples` in the
`schema_cache`. It does not yet define the `ModelSchema` for
Example. Loading the column definitions goes through `get_oid_type`
and registeres the missing OID.
2. Connection B (a fresh connection) defines the `ModelSchema` for `Example`. Since
`schema_cache` returns cached data for `examples` (`get_oid_type` is
not used). The `type_map` lookup in `lookup_cast_type_from_column`
will not find the OID and use `ActiveModel::Type::Value` instead.

It's very likely that Connection B later goes through `get_oid_type`
and completes it's `type_map`. However, since the attribute
definitions are shared for all threads. This process is now in a
corrupted state with invalid type casting.

6. The patch

Making sure that `lookup_cast_type_from_column` has the same semantics
as `get_oid_type` will prevent `type_map` lookups of OIDs that could
be loaded by Rails. It's worth noting that in our case, the main issue
was that the activerecord-postgis-adapter left the `type_map` only
partially initialized. Fixing that, will shadow the potential conflict
outlined in this ticket.

Co-authored-by: Thibaud Guillaume-Gentil <thibaud@thibaud.gg>
@ugisozols
Copy link

I've been using this patch for a couple of days and haven't seen the random serialization/deserialization issue since. TY for fixing it!

@thibaudgg
Copy link
Contributor Author

@teeparham anything I could do to help to merge this?

@frankhock
Copy link

Would love to see this merged

@scottfromsf
Copy link

I also would love to see this merged. The fix is small and straight-forward (literally just moving a call to super) so I hope it can get past review soon.

@annikoff
Copy link

I am also waiting for this to be merged.

@phlegx
Copy link
Contributor

phlegx commented Jul 8, 2020

Please merge this PR. Thx!

@Zelnox
Copy link

Zelnox commented Aug 10, 2020

I think I'm getting random failures in CI caused this.

@keithdoggett keithdoggett merged commit 20b6b4f into rgeo:master Aug 10, 2020
@keithdoggett
Copy link
Member

Thanks for the patience on this everyone and thanks for the PR @thibaudgg

@Zelnox
Copy link

Zelnox commented Aug 10, 2020

Do we know if there will be a new gem version soonish? I can switch to use master for now.

@thibaudgg
Copy link
Contributor Author

Thanks for merging it @keithdoggett! 🎉

@keithdoggett
Copy link
Member

@Zelnox I'd like to get a release out by the end of this week

@keithdoggett
Copy link
Member

keithdoggett commented Aug 16, 2020

Just released v6.0.1 with this patch 🎉

@thibaudgg
Copy link
Contributor Author

Awesome, thanks @keithdoggett!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants