-
Notifications
You must be signed in to change notification settings - Fork 21.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change sqlite3 boolean serialization to use 1 and 0 #29699
Conversation
Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @rafaelfranca (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review. Please see the contribution instructions for more information. |
a5bc765
to
ebd461d
Compare
PostgreSQL has a real boolean type; it shouldn't need any configuration or migration (unless we're particularly concerned about people who've explicitly marked a I'm generally dubious about going out of our way to unify the adapters on a particular non-standard serialization, though. Strictly speaking the most correct form for the abstract adapter would be the (unquoted) TRUE and FALSE literals from SQL:1999... but few backends support that. I guess my claim is that while a lot of DBMSes choose to emulate booleans with some form of 1/0 pair, it's still a decision they're each making individually. |
You're right. Testing on PostgreSQL
shows that the boolean type means that old data will work. I do think the change is worthwhile for SQLite for the reasons outlined in the original issue, and I think it's worthwhile to make the change at the abstract adapter level. I don't think it's going very far out of our way, and I contest the idea that 1 and 0 are a non-standard serialization of boolean values - they are the most standard serialization. I'll write up a change to my documentation to make clear the differences between SQLite and PostgreSQL. |
To be clear, I definitely agree we should make the change for SQLite -- I'm only questioning which adapter layer's responsibility it should be.
They are the most common serialization; SQL:1999 (through SQL:2016) defines the standard representation. While a conforming BOOLEAN type is rare amongst the major RDBMSes, some research suggests that MySQL does provide enough aliases to pretty effectively emulate it. Given that, I'm inclined to argue we should in fact be using TRUE and FALSE in the abstract adapter, and overriding them to 1 & 0 only for SQLite: we would thus be giving properly-conformant PostgreSQL what it wants, and be delegating the decision on how to emulate it to MySQL. Thus we only dirty ourselves with a manual approximation for SQLite. To break out my thinking a little: I'd prefer not to default to assuming abstract adapter subclasses want to use 1 & 0 to represent booleans (though a NumericBooleans module, or something, would seem like a fine idea). I'd strongly prefer not to give Postgres ones and zeroes.. it just feels wrong to force it to accomodate the lowest common denominator, when it has a real actual type available -- and it seems too easy for the abstraction to leak anyway: anyone truly storing 0/1 will be happy with |
@matthewd Thanks for the extra explanation. My one concern with standardizing on def _quote(value)
case value
when true then unquoted_true
when false then unquoted_false
...
end which feels really wrong. What do you see as the best path forwards? |
Ah, excellent point. Hmm... I think it would be right for Interestingly, it looks like the MySQL adapter already does a variant on that: As for SQLite would indeed want what MySQL currently has: |
a5ae466
to
a93fffa
Compare
@matthewd thanks for all your help with this - I think it's in a far better state now. |
46437ed
to
033b1bb
Compare
Don't re-review yet. I'm working on changing the configuration to use |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, this looks great!
Thank you for doing all the work -- the "sit on the side-lines and throw opinions" role is always much easier 😊
@@ -4,6 +4,8 @@ | |||
module ActiveRecord | |||
module ConnectionAdapters # :nodoc: | |||
module Quoting | |||
QUOTED_TRUE, QUOTED_FALSE = "TRUE".freeze, "FALSE".freeze |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's skip the constants.. with modern string freezing, it's actually (infinitesimally) slower than just doing it inline
# true. Conversion can be accomplshed by setting up a rake task which runs | ||
# | ||
# ExampleModel.where(boolean_column: true).update_all(boolean_column: 1) | ||
# ExampleModel.where(boolean_column: false).update_all(boolean_column: 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth using "boolean_column = 't'"
here, so it's valid regardless of the setting? (And actually, does this definitely work? Doesn't update_all
do a typecast on the value?)
Either way, maybe we should plan to include a gist in the upgrade guide that'll generate this by scanning a schema. Especially as this only affects SQLite, it seems reasonable to assume their DB is small enough to obviate any more complex piecemeal migration strategy.
Perhaps a single update per table, in fact? UPDATE example_models SET boolean_one = CASE boolean_one WHEN 't' THEN 1 WHEN 'f' THEN 0 ELSE boolean_one END, boolean_two = CASE .. END;
? It's uglier, but presumably more efficient... especially if no-one ever has to actually see it. 🤷♀️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example code definitely works, I've used it against a SQLite db. (Including your suggested modification - I like that it removes the order dependency between running it and setting the flag.) My concern with providing a gist that generates the code by examining the schema is that it necessarily makes assumptions that I'm not comfortable making. By providing example code, the responsibility of writing the conversion is with the people who understand their particular configuration, but we give them a good starting point.
initializer "active_record.check_represent_sqlite3_boolean_as_integer" do | ||
unless config.active_record.represent_sqlite3_boolean_as_integer | ||
ActiveSupport::Deprecation.warn <<-MSG | ||
Setting the flag `represent_sqlite3_boolean_as_integer` to false is deprecated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This sounds like our usual phrasing, but it occurs to me it could be confusing for people: almost everyone seeing this will be seeing it because they're not setting the flag at all -- so their (well, my, at least) first-instinct response to such a message (grep
) won't help.
Any thoughts on an alternate phrasing? "Leaving the flag ..
set to false", perhaps?
Setting the flag `represent_sqlite3_boolean_as_integer` to false is deprecated. | ||
SQLite databases have used 't' and 'f' to serialize boolean values and must have | ||
old data converted to 1 and 0 (its native boolean serialization) before setting | ||
this flag to true. Conversion can be accomplshed by setting up a rake task which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: accomplshed
assert_equal "f", @conn.type_cast(false) | ||
|
||
ActiveRecord::Base.represent_sqlite3_boolean_as_integer = true | ||
assert_equal 0, @conn.type_cast(false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These tests should make sure to leave the flag in the state they found it.. while it's likely nothing else cares now, it could cause an accidental test-ordering dependency later.
0d5bfe0
to
cc8d96b
Compare
@matthewd ready for another look. 🙂 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sync up the three descriptive blobs, and add a changelog entry, and I think this is good to go 👌🏻
cc8d96b
to
9ba5a33
Compare
Updated. |
Do we need another changelog entry here describing how the default serialization for booleans provided by I don't know how changes that only affect third-party database adapters are typically documented, so the answer to my question may well be "no". |
As a one of third party adapter Oracle enhanced adapter maintainer, let me show one of 3rd party adapter implementation. Just FYI. |
@yahonda That probably means you already override quoted/unquoted true/false and will be unaffected? |
Yes. Oracle enhanced adapter implements (overrides) these 4 methods. |
I think in this case it is fine to leave it outside the CHANGELOG. If a 3rd-party does not support native boolean type they would be already overriding the abstract adapter implementation, so it would be surprising if changing the behavior of the abstract adapter affects them. |
guides/source/configuring.md
Outdated
|
||
``` | ||
ActiveRecord::ConnectionAdapters::SQLite3Adapter.represent_boolean_as_integer = true | ||
``` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for working on this! I think the indentation coupled with the backticks are creating odd markdown for this part. If you click "view" on this file it shows the backticks inside a codeblock. I think if you un-indent this the extra backticks will go away. If not it's a bug in github's markdown and I'll report it to the appropriate team.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for that catch! The problem was the unindented paragraph above this block which broke out of the list item. The format of the code block was correct for inside a list item, but incorrect for outside of one.
Abstract boolean serialization has been using 't' and 'f', with MySQL overriding that to use 1 and 0. This has the advantage that SQLite natively recognizes 1 and 0 as true and false, but does not natively recognize 't' and 'f'. This change in serialization requires a migration of stored boolean data for SQLite databases, so it's implemented behind a configuration flag whose default false value is deprecated. The flag itself can be deprecated in a future version of Rails. While loaded models will give the correct result for boolean columns without migrating old data, where() clauses will interact incorrectly with old data. While working in this area, also change the abstract adapter to use `"TRUE"` and `"FALSE"` as quoted values and `true` and `false` for unquoted. These are supported by PostreSQL, and MySQL remains overriden.
9ba5a33
to
52e050e
Compare
👏 |
Awesome, thanks for doing this. We no longer use SQLite3 for testing but this is a good change none-the-less and should fix a host of issues (anyone trying to do boolean queries in the DB). |
Since rails#29699, abstract boolean serialization has been changed to use `TRUE` and `FALSE` literals. MySQL also support the literals. So we can use the abstract boolean serialization even for MySQL.
Just migrated to Looking at the PR it is only set to true in |
@thisismydesign thanks for the note. That should only be set if you call |
@matthewd Thanks. Yes I did. I think rather the default value of the setting is confusing. Going through the docs again it doesn't explicitly state what's the current default but the wording ("migrate and then explicitly set to |
Yeah, the ambiguity comes from the fact that the default is intended to differ between people generating a new 5.2 application (
|
Thanks everyone for your effort to fix this issue. It looks like the final solution is a great one given the context. |
# (its native boolean serialization) before setting this flag to true. | ||
# Conversion can be accomplished by setting up a rake task which runs | ||
# | ||
# ExampleModel.where("boolean_column = 't'").update_all(boolean_column: 1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note for anyone else that might have run into this and have tried to migrate a databare from f/t to 0/1:
If your migration specified a default for your boolean column this will continue to produce f/t. The columns default value also needs to be updated.
Summary
Abstract boolean serialization has been using 't' and 'f', with MySQL overriding that to use 1 and 0. This change is a first step towards unifying the behaviour in the abstract adapter to always use 1 and 0. Fixes #17062
Other Information
Beyond the goal of unifying the representations, this has the advantage that SQLite natively recognizes 1 and 0 as true and false, but does not natively recognize 't' and 'f'. PostgreSQL natively recognizes both representations (and others).
This change in serialization requires a migration of stored boolean data for SQLite/PostgreSQL databases, so it's implemented behind a configuration flag whose default false value is deprecated. The flag itself can be deprecated in a future version of Rails. While loaded models will give the correct result for boolean columns without migrating old data, where() clauses will interact incorrectly with old data.
r? @rafaelfranca