Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix db/schema.rb generation #504

Merged
merged 5 commits into from
Sep 24, 2020
Merged

Conversation

NickLaMuro
Copy link
Member

@NickLaMuro NickLaMuro commented Aug 26, 2020

This is a first pass at the "Fix rake db:schema:load" option for #502

What this does

The first part of the changes in this PR are a refactoring that takes much of what is coded into individual migrations and the lib/migration_helper.rb, and applies them to the ActiveRecord::Base.connection.

The other part of the changes is creating introspection for the db/schema.rb generation via the MiqSchemaDumper, which is a collection of additions for everything that was refactored into the .connection to generate the schema.rb with our custom functions/contrainst/views/etc.

The functions that were created on (moved to) the .connection:

  • .add_trigger
  • .create_miq_metric_view
  • .add_miq_metric_table_inheritance

(and the drop_* equivalent functions for each of those)

The in addtion, there were a few introspection methods added there as well:

  • .connection.triggers
  • .connection.views

Which can be used to query the database to determine what exist for the triggers and views.

In addtion there is a additional introspection method in the MiqSchemaDumper called determine_table_parent, which is used to determine the table parent for a given metrics partition table.

Things for Reviewers to consider

  • Do we just want to leverage gems for things (like scenic for VIEWS) instead, or are the MIQ specifics enough to warrant me rolling it ourselves?
  • Could some of my introspection queries be done better in a more "PostgreSQL DB Admin Approved" fashion?
  • I used two different naming conventions for methods (with/without the miq_ prefix). Should we standardize?
  • Are things like the METRIC_ROLLUP_TABLE_REGEXP too specific and do we want to make them more generic, or is it fine to specifically target tables for these additions to be more "surgical" about these changes?

Testing

I have been testing by just running the following two commands:

$ RAILS_ENV=development rake db:environment:set evm:db:destroy db:migrate > /dev/null
$ RAILS_ENV=development rake db:environment:set evm:db:destroy db:schema:load db:seed > /dev/null

And also inspected the contents of the db/schema.rb, which should now include entries for the above functions. It should look something like this at the bottom:

  add_miq_metric_table_inheritance "metric_rollups_01", "metric_rollups_base", :conditions => ["capture_interval_name != ? AND EXTRACT(MONTH FROM timestamp) = ?", "realtime", "01"]
  add_miq_metric_table_inheritance "metric_rollups_02", "metric_rollups_base", :conditions => ["capture_interval_name != ? AND EXTRACT(MONTH FROM timestamp) = ?", "realtime", "02"]
  add_miq_metric_table_inheritance "metric_rollups_03", "metric_rollups_base", :conditions => ["capture_interval_name != ? AND EXTRACT(MONTH FROM timestamp) = ?", "realtime", "03"]
  # ...
  add_miq_metric_table_inheritance "metrics_00", "metrics_base", :conditions => ["capture_interval_name = ? AND EXTRACT(HOUR FROM timestamp) = ?", "realtime", "00"]
  add_miq_metric_table_inheritance "metrics_01", "metrics_base", :conditions => ["capture_interval_name = ? AND EXTRACT(HOUR FROM timestamp) = ?", "realtime", "01"]
  add_miq_metric_table_inheritance "metrics_02", "metrics_base", :conditions => ["capture_interval_name = ? AND EXTRACT(HOUR FROM timestamp) = ?", "realtime", "02"]
  add_miq_metric_table_inheritance "metrics_03", "metrics_base", :conditions => ["capture_interval_name = ? AND EXTRACT(HOUR FROM timestamp) = ?", "realtime", "03"]
  add_miq_metric_table_inheritance "metrics_04", "metrics_base", :conditions => ["capture_interval_name = ? AND EXTRACT(HOUR FROM timestamp) = ?", "realtime", "04"]
  # ...

  create_miq_metric_view "metrics"

  create_miq_metric_view "metric_rollups"

  add_trigger "insteadof", "metrics", "metrics_partition", <<-SQL
          CASE EXTRACT(HOUR FROM NEW.timestamp)
            WHEN 0 THEN
              INSERT INTO metrics_00 VALUES (NEW.*);
           ...
  SQL

  add_trigger "insteadof", "metric_rollups", "metric_rollups_partition", <<-SQL
          CASE EXTRACT(MONTH FROM NEW.timestamp)
            WHEN 1 THEN
              INSERT INTO metric_rollups_01 VALUES (NEW.*);
          ....
  SQL

@miq-bot miq-bot changed the title [POC] [WIP] Fix db/schema.rb generation [WIP] [POC] [WIP] Fix db/schema.rb generation Aug 26, 2020
@miq-bot miq-bot added the wip label Aug 26, 2020
@NickLaMuro NickLaMuro changed the title [WIP] [POC] [WIP] Fix db/schema.rb generation [POC] [WIP] Fix db/schema.rb generation Aug 26, 2020
@@ -0,0 +1,21 @@
module ManageIQ
module Schema
module MiqCommandRecorder
Copy link
Member Author

@NickLaMuro NickLaMuro Aug 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: Most of these are not used, but I think this is intended for def change methods in a migration, so that they can be inverted if needed. So they are here more to make this "feature complete" for future migrations, if any of these methods were to be used again, but arguably that is outside the scope of this PR.

lib/migration_helper.rb Outdated Show resolved Hide resolved
@Fryguy
Copy link
Member

Fryguy commented Sep 2, 2020

I love this PR...let's get this over the finish line.

@Fryguy
Copy link
Member

Fryguy commented Sep 2, 2020

Is it possible for us to remove the miq-specific stuff from the schema dumper? So, for example, we have add_miq_metric_table_inheritance, but nothing in that method seems miq specific, so can we name it add_table_inheritance? The only thing seems to be the name of the actual constraint, but generically that can be a constraint_name parameter that defaults to #{table}_inheritance_check if not passed (or we just don't allow passing it at all going the convention-over-configuration route)

Do we just want to leverage gems for things (like scenic for VIEWS) instead, or are the MIQ specifics enough to warrant me rolling it ourselves?

As I'm reading this PR I'm seeing a gem to do this generically. So I'd expect we use gems for anything that exists like views, and write our own gem for other things (like maybe triggers if there's no gem for it). Even so, I like what's here now, so I'd prefer merging what we have, let it get some battle-testing in, then extract / change to gems after.

@Fryguy
Copy link
Member

Fryguy commented Sep 2, 2020

Testing

Can you also verify running db:migrate to a particular VERSION= dumps the schema appropriately at that point in time?

@NickLaMuro
Copy link
Member Author

Is it possible for us to remove the miq-specific stuff from the schema dumper? ...

@Fryguy This one is a bit of a mind melter, so I have to get my head back into the headspace for this one, but I think I specifically chose to do this for add_miq_metric_table_inheritance since it was something that I was thinking might need the METRIC_ROLLUP_TABLE_REGEXP treatment done to it as well. That regex was the part of this PR that I was least proud of, but couldn't really find a better solution for it.

However, I agree that the add_miq_ bit is kinda gross, but I wanted to make sure we were distinguishing via method names what is unique to MIQ, and that the current implementation is very specific to our needs. I do think this is code that needs to be pretty static, since the thought of editing previous migrations due to a bug makes my skin crawl. However, maybe we could do something like what ActiveRecord does and have "versioned" methods for this:

https://github.com/rails/rails/blob/90551566/activerecord/lib/active_record/migration/compatibility.rb

Anyway, I will look into this some more later today.

I love this PR...let's get this over the finish line.

I was actually thinking that based on this #502 (comment) you would have had issue with this, so in this case, feels good to be "wrong" 😄

@Fryguy
Copy link
Member

Fryguy commented Sep 2, 2020

I actually want both since they are 2 different things. This PR, for me, is less about performance, and more about getting back to "normal Rails" where you just use the dumped schema directly. That you get a performance boost in using normal Rails is more of a side effect. Completely separately is the performance aspect of a migration from scratch and the collapsed initial migration, which I still think is good to do also.

My intent in #502 (comment) was that the collapsed initial migration will, I think, actually remove some of the work in this PR. That is, the migration I will collapse into is 20190726204302_remove_local_default_embedded_ansible_repos.rb, which comes after 20190122213042_use_views_for_metrics.rb. So, I was thinking this might then remove or perhaps simplify some of the metrics table stuff, so it would be easier to do this PR after the collapse.

@NickLaMuro NickLaMuro changed the title [POC] [WIP] Fix db/schema.rb generation [WIP] Fix db/schema.rb generation Sep 2, 2020
By overriding the `ActiveRecord::SchemaDumper`, we can ensure that the
schema for the inherited metrics tables definitions:

    create_table "metric_rollups_01", id: :bigint, default: -> { "nextval('metric_rollups_base_id_seq'::regclass)" }, force: :cascade do |t|
      t.datetime "timestamp"
      t.integer "capture_interval"
      t.string "resource_type"
      t.bigint "resource_id"
      t.float "cpu_usage_rate_average"
      t.float "cpu_usagemhz_rate_average"
      t.float "mem_usage_absolute_average"
      t.float "disk_usage_rate_average"
      t.float "net_usage_rate_average"
      # ...
    end

and produces the following error:

    $ rake db:schema:load
    ...
    -- create_table("metric_rollups_01", {:id=>:bigserial, :default=>#<Proc:...>, :force=>:cascade})
    NOTICE:  table "metric_rollups_01" does not exist, skipping
    rake aborted!
    ActiveRecord::StatementInvalid: PG::SyntaxError: ERROR:  multiple default values specified for column "id" of table "metric_rollups_01"
    : CREATE TABLE "metric_rollups_01" (
      "id" bigserial DEFAULT nextval('metric_rollups_base_id_seq'::regclass) NOT NULL PRIMARY KEY,
      "timestamp" timestamp,
      "capture_interval" integer,
      "resource_type" character varying,
      ...
    )
    activerecord-5.2.4.3/lib/active_record/connection_adapters/postgresql/database_statements.rb:75:in `exec'
    activerecord-5.2.4.3/lib/active_record/connection_adapters/postgresql/database_statements.rb:75:in `block (2 levels) in execute'
    activesupport-5.2.4.3/lib/active_support/dependencies/interlock.rb:48:in `block in permit_concurrent_loads'
    activesupport-5.2.4.3/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares'

doesn't add the `:default` key for the `create_table` definition.

* * *

Note:  This doesn't fix the missing table alterations, views, and
triggers that are still needed for a proper `db:schema:load`, but it is
a start of how we could fix it, and where we can go from here to address
it further using this approach.
In short:  this change moves the `add_trigger` and `drop_trigger`
functionality into the `ActiveRecord::Base.connection` directly (using
some shared conventions from other gems), and also adds support into the
SchemaDumper for database introspection to allow support for triggers
via the `db_schema` (using the same functions now available on the
`.connection` object).

Specifically, two methods needed to be added to support this:

- `ActiveRecord::Base.connection.triggers`
- `ActiveRecord::ConnectionAdapters::SchemaDumper.triggers`

The former provides introspection into the existing database to fetch
the existing triggers that have been created for our database, including
the `direction`, `table`, `name`, and `body`, all of which are the
arguments for our `ActiveRecord::Base.connection.add_trigger` function.

The latter writes the `trigger` statements to the `db/schema.rb` in the
format that the function takes them, formatting the data from the db
into a format that `add_trigger` expects.

* * *

This is a bit dense, but it follows a lot of the conventions for
modifying activerecord that were done in the gem `scenic` to add SQL
view support to postgres:

  https://github.com/scenic-views/scenic/tree/048e0805

So perusing the source code there might help.
Similar to the additions of `add_trigger` and `drop_trigger`
functionality to the `ActiveRecord::Base.connection` and
`ActiveRecord::ConnectionAdapters::SchemaDumper`, this adds new methods
`create_miq_metric_view` and `drop_miq_metric_view`.

* * *

This is a bit dense, but it follows a lot of the conventions for
modifying activerecord that were done in the gem `scenic` to add SQL
view support to postgres:

  https://github.com/scenic-views/scenic/tree/048e0805

So perusing the source code there might help
Similar to the additions of `add_trigger` and `drop_trigger`
functionality to the `ActiveRecord::Base.connection` and
`ActiveRecord::ConnectionAdapters::SchemaDumper`, this adds the
following two new methods:

- `add_miq_metric_table_inheritance`
- `drop_miq_metric_table_inheritance`

Since these functions where incredibly "metrics specific", it didn't
seem correct to name them `.add_table_inheritance` as a generic
function, so prefixing with `add_miq...` was how that was handled.

This might make sense for other functions, but left it isolated to this
commit for comparing and contrasting.

* * *

This is a bit dense, but it follows a lot of the conventions for
modifying activerecord that were done in the gem `scenic` to add SQL
view support to postgres:

  https://github.com/scenic-views/scenic/tree/048e0805

So perusing the source code there might help
@NickLaMuro
Copy link
Member Author

NickLaMuro commented Sep 2, 2020

Testing

Can you also verify running db:migrate to a particular VERSION= dumps the schema appropriately at that point in time?

@Fryguy So I ran the following:

VERSION=20190726204302 RAILS_ENV=development rake db:environment:set evm:db:destroy db:migrate

And then compared it with your collapsed migration. They were pretty close minus a few things you must have changed manually.

db/schema.rb vs 20190726204302_collapsed_initial_migration.rb diff
--- db/schema.rb	2020-09-02 18:06:26.000000000 -0500
+++ json_migration.rb	2020-09-02 18:12:44.000000000 -0500
@@ -1,18 +1,3 @@
-# This file is auto-generated from the current state of the database. Instead
-# of editing this file, please use the migrations feature of Active Record to
-# incrementally modify your database, and then regenerate this schema definition.
-#
-# Note that this schema.rb definition is the authoritative source for your
-# database schema. If you need to create the application database on another
-# system, you should be using db:schema:load, not running all the migrations
-# from scratch. The latter is a flawed and unsustainable approach (the more migrations
-# you'll amass, the slower it'll run and the greater likelihood for issues).
-#
-# It's strongly recommended that you check this file into your version control system.
-
-ActiveRecord::Schema.define(version: 2019_07_26_204302) do
-
-  # These are extensions that must be enabled in order to support this database
   enable_extension "plpgsql"
 
   create_table "accounts", force: :cascade do |t|
@@ -1203,6 +1188,7 @@
     t.index ["resource_id", "resource_type"], name: "index_conversion_hosts_on_resource_id_and_resource_type"
     t.index ["resource_type", "resource_id"], name: "index_conversion_hosts_on_resource_type_and_resource_id"
   end
+  change_column_comment "conversion_hosts", "id", "The internal database ID."
 
   create_table "custom_attributes", force: :cascade do |t|
     t.string "section"
@@ -5972,6 +5958,7 @@
     t.string "zone", comment: "The zone in which the task should be executed, or any zone if null."
     t.index ["miq_server_id"], name: "index_miq_tasks_on_miq_server_id"
   end
+  change_column_comment "miq_tasks", "id", "[builtin] The internal record ID. This is the primary key."
 
   create_table "miq_user_roles", force: :cascade do |t|
     t.string "name"
@@ -7503,6 +7490,7 @@
     t.index ["type"], name: "index_vms_on_type"
     t.index ["uid_ems"], name: "index_vms_on_vmm_uuid"
   end
+  change_column_comment "vms", "id", "The internal database ID."
 
   create_table "volumes", force: :cascade do |t|
     t.string "name"
@@ -7579,7 +7567,7 @@
 
   create_miq_metric_view "metric_rollups"
 
-  add_trigger "insteadof", "metrics", "metrics_partition", <<-SQL
+  add_trigger "insteadof", "metrics", "metrics_partition", <<~SQL
           CASE EXTRACT(HOUR FROM NEW.timestamp)
             WHEN 0 THEN
               INSERT INTO metrics_00 VALUES (NEW.*);
@@ -7633,7 +7621,7 @@
           RETURN NEW;
   SQL
 
-  add_trigger "insteadof", "metric_rollups", "metric_rollups_partition", <<-SQL
+  add_trigger "insteadof", "metric_rollups", "metric_rollups_partition", <<~SQL
           CASE EXTRACT(MONTH FROM NEW.timestamp)
             WHEN 1 THEN
               INSERT INTO metric_rollups_01 VALUES (NEW.*);
@@ -7662,5 +7650,3 @@
           END CASE;
           RETURN NEW;
   SQL
-
-end

Note: I did delete some lines (the def down contents for example) from the collapsed migration file I had locally and unindented it as well so it would diff -u nicely, but other than that, I didn't modify anything in the schema.

@Fryguy
Copy link
Member

Fryguy commented Sep 2, 2020

Yeah so it turns out Rails schema dumper doesn't dump comments on the id column. We happened to have a spec for that which is what caught it, so I added it manually.

@NickLaMuro
Copy link
Member Author

Yeah so it turns out Rails schema dumper does dump comments on the id column.

s/does dump/does drop/ ?

Is this something that I should override as part of #504 ?

@kbrock
Copy link
Member

kbrock commented Sep 4, 2020

the collapsed initial migration will, I think, actually remove some of the work in this PR
-- Fryguy

good catch. removing the old metrics migration will buy us a lot

And then compared it with your collapsed migration. They were pretty close minus a few things you must have changed manually.
-- NickLaMuro

that is one beautiful diff

this is a very exciting pr

@Fryguy
Copy link
Member

Fryguy commented Sep 9, 2020

Ugh..typing fail ... s/does dump/doesn't dump/ (edited original comment)

Yes, I think that is something the schema dumper in this PR will need to add back in (in fact, I'd consider it a bug to be pushed into upstream Rails too)

@NickLaMuro
Copy link
Member Author

NickLaMuro commented Sep 11, 2020

@Fryguy Sorry for taking a while to fix, but addressed the above with the most most recent commit. Ran the QA steps again with that code in place to confirm that db:schema:load re-adds the comment:

$ bin/rails c
** ManageIQ master, codename: Kasparov
Loading development environment (Rails 5.2.4.3)
irb(main):001:0> Vm.columns.detect {|c| c.name == ActiveRecord::Base.connection.primary_key("vms")}.comment
PostgreSQLAdapter#log_after_checkout, connection_pool: size: 5, connections: 1, in use: 1, waiting_in_queue: 0
#=> "The internal database ID."
irb(main):002:0> MiqTask.columns.detect {|c| c.name == ActiveRecord::Base.connection.primary_key("miq_tasks")}.comment
#=> "[builtin] The internal record ID. This is the primary key."
irb(main):003:0>

@NickLaMuro NickLaMuro changed the title [WIP] Fix db/schema.rb generation Fix db/schema.rb generation Sep 11, 2020
@miq-bot miq-bot removed the wip label Sep 11, 2020
Copy link
Member

@kbrock kbrock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I think I said before, this is one great PR

lib/manageiq/schema/schema_dumper.rb Show resolved Hide resolved
Comment on lines +15 to +16
add_id_column_comment(table, stream)
track_miq_metric_table_inheritance(table)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a little surprised that you only needed to add comments to the id.
If rails already takes care of comments for all columns but the id, does it make sense to just drop our comments on the id and just use what is built into rails?

Sometimes nice to just go with the flow of rails

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think the reason :id isn't include because also the "table description" is included as part of the create_table method arguments AND that method signature is usually what defines our :id columns.

For example, for conversion_hosts:

  create_table "conversion_hosts", comment: "Conversion Hosts", force: :cascade do |t|
    t.string "name", comment: "A symbolic name for the conversion host."
    t.string "address", comment: "The IP address for the conversion host. If present, must be one of the associated resource's IP addresses."
    
    # ...

  end

  change_column_comment "conversion_hosts", "id", "The internal database ID."

The comment: that is there is for the table itself, and the id column is just inferred by the table definition.

lib/manageiq/schema/schema_dumper.rb Outdated Show resolved Hide resolved
lib/manageiq/schema/schema_dumper.rb Outdated Show resolved Hide resolved
For some reason this isn't handled by Rails, so this fixes that
oversight/bug.
@miq-bot
Copy link
Member

miq-bot commented Sep 14, 2020

Some comments on commits NickLaMuro/manageiq-schema@ca2a5db~...44e0fc8

lib/manageiq/schema/schema_dumper.rb

  • ⚠️ - 26 - Detected puts. Remove all debugging statements.
  • ⚠️ - 27 - Detected puts. Remove all debugging statements.
  • ⚠️ - 44 - Detected puts. Remove all debugging statements.
  • ⚠️ - 49 - Detected puts. Remove all debugging statements.
  • ⚠️ - 54 - Detected puts. Remove all debugging statements.
  • ⚠️ - 55 - Detected puts. Remove all debugging statements.
  • ⚠️ - 78 - Detected puts. Remove all debugging statements.
  • ⚠️ - 82 - Detected puts. Remove all debugging statements.

@miq-bot
Copy link
Member

miq-bot commented Sep 14, 2020

Checked commits NickLaMuro/manageiq-schema@ca2a5db~...44e0fc8 with ruby 2.6.3, rubocop 0.69.0, haml-lint 0.28.0, and yamllint
5 files checked, 3 offenses detected

lib/manageiq/schema/schema_dumper.rb

lib/manageiq/schema/schema_statements.rb

@Fryguy Fryguy merged commit 9c15e8f into ManageIQ:master Sep 24, 2020
@NickLaMuro NickLaMuro mentioned this pull request Oct 5, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants