-
Notifications
You must be signed in to change notification settings - Fork 21.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Schema cache in YAML #27042
Schema cache in YAML #27042
Conversation
r? @eileencodes (@rails-bot has picked a reviewer for you, use r? to override) |
It would be great to get some feedback about the code while I'm fixing the CI failures. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if it is a good thing to have this PersistedSchemaCache
class. It is duplicating a lot of code that we already have in SchemaCache
and there is no real gain for the users and neither in terms of code maintenance in the Rails codebase. Now instead of one place to change the schema cache you have to change two places.
def data_source_exists?(name) | ||
@data_sources.include?(name) | ||
end | ||
alias table_exists? data_source_exists? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you don't need those aliases here.
end | ||
|
||
# Clears out internal caches | ||
def clear! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not being called so why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because it's a public method that someone may call, and since this class is immutable, we'd like to avoid that call.
Another thing. Instead of custom methods we should implement |
That's a good point that I was going to discuss here. That said, to make In a contrast, Another reason that we can't simply implement |
That is exactly the behavior we want. If the SchemaCache is loaded by a old dump we don't want the application to raise an exception, so it is better that it query that table. The schema cache should behave like a cache, not as the ultimate source of truth. The database should be the ultimate source of truth.
I'd expect that the SchemaCache already have all information about the tables of that connection when duping and what is exactly what was being done https://github.com/rails/rails/pull/27042/files#diff-28a5ae383b291583c513ad8eeed99a3aL274, you just moved this "hack" to inside of the new class.
I don't get it. We can write any format we want with |
Imagine we get rid of
This one would work, beside the fact that we'll still need to query But I don't see a way to use def init_with(coder)
columns_by_table = {}
columns_hash = {}
payload.fetch(:columns).each do |table_name, columns|
columns_by_table[table_name] = columns.map do |column_meta|
# here, we need to call `new_column_from_field` but we don't have access to connection
# from init_with
connection.new_column_from_field(table_name, column_meta)
end
columns_hash[table_name] = Hash[columns_by_table[table_name].map { |col|
[col.name, col]
}]
end
end Because |
self.connection_pool.schema_cache = cache.dup | ||
cache = YAML.load(File.read(filename)) | ||
if cache.fetch(:version) == ActiveRecord::Migrator.current_version | ||
schema_cache = ActiveRecord::ConnectionAdapters::PersistedSchemaCache.build_from_yaml(connection, cache) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't think we need this line anymore
self.connection.schema_cache = cache | ||
self.connection_pool.schema_cache = cache.dup | ||
cache = YAML.load(File.read(filename)) | ||
if cache.fetch(:version) == ActiveRecord::Migrator.current_version |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Think this is just cache.version
again
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code wise LGTM
Is there any backward compatibility concerns by replacing the dump file?
@@ -267,17 +267,17 @@ db_namespace = namespace :db do | |||
namespace :cache do | |||
desc "Creates a db/schema_cache.dump file." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
schema_cache.dump
-> schema_cache.yml
open(filename, "wb") { |f| f.write(Marshal.dump(con.schema_cache)) } | ||
conn.schema_cache.clear! | ||
conn.data_sources.each { |table| conn.schema_cache.add(table) } | ||
open(filename, "wb") { |f| f.write(YAML.dump(conn.schema_cache)) } | ||
end | ||
|
||
desc "Clears a db/schema_cache.dump file." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
schema_cache.dump
-> schema_cache.yml
@eugeneius thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of rubocop rules are broken in this PR. autocorrect should fix them.
Is there any backward compatibility concerns by replacing the dump file?
No. This actually make it backwards compatible. The only non-backward compatible problem we have now is Rails 5.1 not being able to load the dump file using marshal and the first deploy will have to generate the yaml version.
private | ||
|
||
def schema_dump_path | ||
'test/assets/schema_dump_50.yml' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name of this file should be 'test/assets/schema_dump_5_1.yml'
. I don't think it is going to be safe to backport this.
@rafaelfranca got it 👍 |
@rafaelfranca thanks for review! I updated the PR. |
ping @rafaelfranca |
BAM 🖐 |
WTF is schema cache: Rails has support for storing the schema information in
db/schema_cache.dump
to avoid hitting database withSHOW FULL FIELDS
. This feature was introduced in #5162.The problem now is that we dump all many things into Marshal, including internal column classes.
When you try to read schema cache generated by Rails 4.2 in Rails 5.0:
(because obviously, internal AR classes have been refactored).
Solution proposal
By serializing basic schema information into YAML instead of Marshal dump, we could make schema cache compatible between Rails versions and avoid exceptions like a described above.
Concerns
connection.column_definitions
method publicSchemaCache
classes and its dynamic cache methods, I introducedPersistedSchemaCache
that would be immutable.review @rafaelfranca @sgrif