[Proposal] Schema cache dump #5162

Merged
merged 5 commits into from about 2 years ago

4 participants

Toshinori Kajihara Aaron Patterson David Heinemeier Hansson José Valim
Toshinori Kajihara
Collaborator

In my experience, Rails boots slowly when an application has many models (for example, one hundred).
According to our production log, loading Active Record's schema data is especially slow.

So I've implemented schema cache dumping. Please review it.
I expect this implementation still has many points to fix ;)

Usage:

$ edit config/environments/production.rb
config.use_schema_cache_dump = true
$ RAILS_ENV=production bundle exec rake db:schema:cache:dump
=> generate db/schema_cache.dump
$ RAILS_ENV=production rails s
Aaron Patterson
Owner

I like this idea, but can we change a few things?

First, can we just implement marshal_dump and marshal_load on the SchemaCache object? Second, I'm not sure that loading every model is the best idea for the schema cache. What about asking for all the tables and populating the cache that way? For example:

schema_cache.connection.tables.each do |table|
  schema_cache.populate(table)
end

Maybe not a populate method, but something. I don't really like the idea of requiring every model in order to get the schema cache.

I have another idea that is related to this: can we enable schema caching by default? We can use the migration version to determine if the cache should be expired. Maybe add a version method to the schema cache object.

Anyway, I really like this feature.
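
The two suggestions above (custom `marshal_dump`/`marshal_load`, plus a version used to expire the cache) can be sketched in plain Ruby. This is only an illustration under stated assumptions: `VersionedCache` and `load_cache` are hypothetical names, not Rails classes, and the integer versions stand in for a migration version.

```ruby
# Hypothetical stand-in for a schema cache; this is NOT the Rails SchemaCache.
class VersionedCache
  attr_reader :version, :tables

  def initialize(version)
    @version = version
    @tables  = {}
  end

  def add(table, columns)
    @tables[table] = columns
  end

  # Marshal calls this to obtain the data that should be serialized.
  def marshal_dump
    [@version, @tables]
  end

  # Marshal calls this on a freshly allocated object when loading.
  def marshal_load(array)
    @version, @tables = array
  end
end

# Hypothetical helper: return the cache only if its version matches
# the current one, otherwise treat the dump as expired.
def load_cache(bytes, current_version)
  cache = Marshal.load(bytes)
  cache.version == current_version ? cache : nil
end

cache = VersionedCache.new(20120307000000)
cache.add("posts", %w[id title])
bytes = Marshal.dump(cache)

fresh = load_cache(bytes, 20120307000000) # versions match, cache is used
stale = load_cache(bytes, 20120308000000) # versions differ, cache is ignored
```

Storing the version inside the dump means the expiry check needs nothing beyond the file itself and the current version.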

Toshinori Kajihara
Collaborator

Thank you for the comment! I'll improve the implementation :)

Toshinori Kajihara
Collaborator

Hi @tenderlove

Done!
Please review the new commits.

Toshinori Kajihara

A hash with default_proc can't be dumped.
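
The limitation named in this commit message can be reproduced in plain Ruby. The sketch below is a minimal illustration, not the code from the pull request: the variable names are hypothetical, and upcasing the key stands in for a database lookup. It shows `Marshal.dump` rejecting a hash that has a default proc, and the workaround of dumping a plain copy and re-attaching the proc after loading.

```ruby
# A lazily populated hash; upcasing the key stands in for a database lookup.
lazy = Hash.new { |hash, key| hash[key] = key.upcase }
lazy["posts"] # triggers the default proc and stores the value

# Marshal refuses to serialize a hash that carries a default proc.
dump_error = begin
  Marshal.dump(lazy)
  nil
rescue TypeError => e
  e
end

# Workaround: copy the entries into a plain hash before dumping.
plain = {}
lazy.each { |key, value| plain[key] = value }
restored = Marshal.load(Marshal.dump(plain))

# Re-attach the default proc so later lookups are lazy again.
restored.default_proc = proc { |hash, key| hash[key] = key.upcase }
restored["comments"]
```

This is the same shape as the `marshal_dump` / `prepare_default_proc` pair in the diff: plain hashes go into the dump, and the procs are restored after loading.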

Aaron Patterson tenderlove merged commit 447ecb0 into from March 07, 2012
Aaron Patterson tenderlove closed this March 07, 2012
José Valim

This configuration should not be here. It is specific to Active Record and therefore should be defined in Active Record railtie.

Toshinori Kajihara
Collaborator

Certainly, I agree with you. Do you mean kennyj@82bd05a?

David Heinemeier Hansson
Owner

Can you provide some benchmarks for this optimization? How much does it actually speed things up?

Toshinori Kajihara
Collaborator

I'll provide them, but I have a lot of work this week. Please wait a few days.

Toshinori Kajihara
Collaborator

Sorry to keep you waiting for this reply.

I tested the performance, but the result was not what I expected.

・Steps to build the environment
https://gist.github.com/3730757
・Test result
https://gist.github.com/3730759

In my experience with Oracle, queries against the data dictionary were very slow when there was a lot of data, and we solved that problem with a similar approach.

I'll research this a little more.

23  activerecord/CHANGELOG.md
@@ -1,5 +1,28 @@
 ## Rails 4.0.0 (unreleased) ##
 
+*   Added the schema cache dump feature.
+
+    The `Schema cache dump` feature was implemented. This feature can dump/load the internal state of a `SchemaCache` instance
+    because we want to boot Rails more quickly when we have many models.
+
+    Usage notes:
+
+      1) Execute the rake task.
+      RAILS_ENV=production bundle exec rake db:schema:cache:dump
+      => generates db/schema_cache.dump
+
+      2) Add config.use_schema_cache_dump = true in config/environments/production.rb. BTW, true is the default.
+
+      3) Boot Rails.
+      RAILS_ENV=production bundle exec rails server
+      => uses db/schema_cache.dump
+
+      4) To remove the dumped cache, execute the rake task.
+      RAILS_ENV=production bundle exec rake db:schema:cache:clear
+      => removes db/schema_cache.dump
+
+    *kennyj*
+
 *   Added support for partial indices to PostgreSQL adapter
 
     The `add_index` method now supports a `where` option that receives a
5  activerecord/lib/active_record/connection_adapters/abstract_adapter.rb
@@ -86,6 +86,11 @@ def lease
         end
       end
 
+      def schema_cache=(cache)
+        cache.connection = self
+        @schema_cache = cache
+      end
+
       def expire
         @in_use = false
       end
64  activerecord/lib/active_record/connection_adapters/schema_cache.rb
@@ -1,26 +1,17 @@
 module ActiveRecord
   module ConnectionAdapters
     class SchemaCache
-      attr_reader :columns, :columns_hash, :primary_keys, :tables
-      attr_reader :connection
+      attr_reader :columns, :columns_hash, :primary_keys, :tables, :version
+      attr_accessor :connection
 
       def initialize(conn)
         @connection = conn
-        @tables     = {}
 
-        @columns = Hash.new do |h, table_name|
-          h[table_name] = conn.columns(table_name)
-        end
-
-        @columns_hash = Hash.new do |h, table_name|
-          h[table_name] = Hash[columns[table_name].map { |col|
-            [col.name, col]
-          }]
-        end
-
-        @primary_keys = Hash.new do |h, table_name|
-          h[table_name] = table_exists?(table_name) ? conn.primary_key(table_name) : nil
-        end
+        @columns      = {}
+        @columns_hash = {}
+        @primary_keys = {}
+        @tables       = {}
+        prepare_default_proc
       end
 
       # A cached lookup for table existence.
@@ -30,12 +21,22 @@ def table_exists?(name)
         @tables[name] = connection.table_exists?(name)
       end
 
+      # Add internal cache for table with +table_name+.
+      def add(table_name)
+        if table_exists?(table_name)
+          @primary_keys[table_name]
+          @columns[table_name]
+          @columns_hash[table_name]
+        end
+      end
+
       # Clears out internal caches
       def clear!
         @columns.clear
         @columns_hash.clear
         @primary_keys.clear
         @tables.clear
+        @version = nil
       end
 
       # Clear out internal caches for table with +table_name+.
@@ -45,6 +46,37 @@ def clear_table_cache!(table_name)
         @primary_keys.delete table_name
         @tables.delete table_name
       end
+
+      def marshal_dump
+        # If we get the current version during initialization, a stack overflow happens.
+        @version = ActiveRecord::Migrator.current_version
+        [@version] + [:@columns, :@columns_hash, :@primary_keys, :@tables].map do |val|
+          self.instance_variable_get(val).inject({}) { |h, v| h[v[0]] = v[1]; h }
+        end
+      end
+
+      def marshal_load(array)
+        @version, @columns, @columns_hash, @primary_keys, @tables = array
+        prepare_default_proc
+      end
+
+      private
+
+      def prepare_default_proc
+        @columns.default_proc = Proc.new do |h, table_name|
+          h[table_name] = connection.columns(table_name)
+        end
+
+        @columns_hash.default_proc = Proc.new do |h, table_name|
+          h[table_name] = Hash[columns[table_name].map { |col|
+            [col.name, col]
+          }]
+        end
+
+        @primary_keys.default_proc = Proc.new do |h, table_name|
+          h[table_name] = table_exists?(table_name) ? connection.primary_key(table_name) : nil
+        end
+      end
     end
   end
 end
17  activerecord/lib/active_record/railtie.rb
@@ -107,7 +107,7 @@ class Railtie < Rails::Railtie
       config.watchable_files.concat ["#{app.root}/db/schema.rb", "#{app.root}/db/structure.sql"]
     end
 
-    config.after_initialize do
+    config.after_initialize do |app|
       ActiveSupport.on_load(:active_record) do
         ActiveRecord::Base.instantiate_observers
 
@@ -115,6 +115,21 @@ class Railtie < Rails::Railtie
           ActiveRecord::Base.instantiate_observers
         end
       end
+
+      ActiveSupport.on_load(:active_record) do
+        if app.config.use_schema_cache_dump
+          filename = File.join(app.config.paths["db"].first, "schema_cache.dump")
+          if File.file?(filename)
+            cache = Marshal.load(open(filename, 'rb') { |f| f.read })
+            if cache.version == ActiveRecord::Migrator.current_version
+              ActiveRecord::Base.connection.schema_cache = cache
+            else
+              warn "schema_cache.dump is expired. Current version is #{ActiveRecord::Migrator.current_version}, but cache version is #{cache.version}."
+            end
+          end
+        end
+      end
+
     end
   end
 end
19  activerecord/lib/active_record/railties/databases.rake
@@ -372,6 +372,25 @@ db_namespace = namespace :db do
     task :load_if_ruby => 'db:create' do
       db_namespace["schema:load"].invoke if ActiveRecord::Base.schema_format == :ruby
     end
+
+    namespace :cache do
+      desc 'Create a db/schema_cache.dump file.'
+      task :dump => :environment do
+        con = ActiveRecord::Base.connection
+        filename = File.join(Rails.application.config.paths["db"].first, "schema_cache.dump")
+
+        con.schema_cache.clear!
+        con.tables.each { |table| con.schema_cache.add(table) }
+        open(filename, 'wb') { |f| f.write(Marshal.dump(con.schema_cache)) }
+      end
+
+      desc 'Clear a db/schema_cache.dump file.'
+      task :clear => :environment do
+        filename = File.join(Rails.application.config.paths["db"].first, "schema_cache.dump")
+        FileUtils.rm(filename) if File.exists?(filename)
+      end
+    end
+
   end
 
   namespace :structure do
15  activerecord/test/cases/connection_adapters/schema_cache_test.rb
@@ -39,6 +39,21 @@ def test_clearing
         assert_equal 0, @cache.tables.size
         assert_equal 0, @cache.primary_keys.size
       end
+
+      def test_dump_and_load
+        @cache.columns['posts']
+        @cache.columns_hash['posts']
+        @cache.tables['posts']
+        @cache.primary_keys['posts']
+
+        @cache = Marshal.load(Marshal.dump(@cache))
+
+        assert_equal 12, @cache.columns['posts'].size
+        assert_equal 12, @cache.columns_hash['posts'].size
+        assert @cache.tables['posts']
+        assert_equal 'id', @cache.primary_keys['posts']
+      end
+
     end
   end
 end
3  railties/lib/rails/application/configuration.rb
@@ -11,7 +11,7 @@ class Configuration < ::Rails::Engine::Configuration
                     :force_ssl, :helpers_paths, :logger, :log_tags, :preload_frameworks,
                     :railties_order, :relative_url_root, :secret_token,
                     :serve_static_assets, :ssl_options, :static_cache_control, :session_options,
-                    :time_zone, :reload_classes_only_on_change
+                    :time_zone, :reload_classes_only_on_change, :use_schema_cache_dump
 
       attr_writer :log_level
       attr_reader :encoding
@@ -41,6 +41,7 @@ def initialize(*)
         @file_watcher                  = ActiveSupport::FileUpdateChecker
         @exceptions_app                = nil
         @autoflush_log                 = true
+        @use_schema_cache_dump         = true
 
         @assets = ActiveSupport::OrderedOptions.new
         @assets.enabled                  = false
26  railties/test/application/initializers/frameworks_test.rb
@@ -193,5 +193,31 @@ def from_bar_helper
       require "#{app_path}/config/environment"
       assert_nil defined?(ActiveRecord::Base)
     end
+
+    test "use schema cache dump" do
+      Dir.chdir(app_path) do
+        `rails generate model post title:string`
+        `bundle exec rake db:migrate`
+        `bundle exec rake db:schema:cache:dump`
+      end
+      require "#{app_path}/config/environment"
+      ActiveRecord::Base.connection.drop_table("posts") # force drop posts table for test.
+      assert ActiveRecord::Base.connection.schema_cache.tables["posts"]
+    end
+
+    test "expire schema cache dump" do
+      Dir.chdir(app_path) do
+        `rails generate model post title:string`
+        `bundle exec rake db:migrate`
+        `bundle exec rake db:schema:cache:dump`
+
+        `bundle exec rake db:rollback`
+      end
+      silence_warnings {
+        require "#{app_path}/config/environment"
+        assert !ActiveRecord::Base.connection.schema_cache.tables["posts"]
+      }
+    end
+
   end
 end
19  railties/test/application/rake_test.rb
@@ -138,5 +138,24 @@ def test_rake_dump_structure_should_respect_db_structure_env_variable
       end
       assert File.exists?(File.join(app_path, 'db', 'my_structure.sql'))
     end
+
+    def test_rake_dump_schema_cache
+      Dir.chdir(app_path) do
+        `rails generate model post title:string`
+        `rails generate model product name:string`
+        `bundle exec rake db:migrate`
+        `bundle exec rake db:schema:cache:dump`
+      end
+      assert File.exists?(File.join(app_path, 'db', 'schema_cache.dump'))
+    end
+
+    def test_rake_clear_schema_cache
+      Dir.chdir(app_path) do
+        `bundle exec rake db:schema:cache:dump`
+        `bundle exec rake db:schema:cache:clear`
+      end
+      assert !File.exists?(File.join(app_path, 'db', 'schema_cache.dump'))
+    end
+
   end
 end