New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes #21691 - clean installed packages on upgrade #7069
Conversation
Issues: #21691 |
@@ -0,0 +1,56 @@ | |||
class CleanupInstalledPackages < ActiveRecord::Migration | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Extra empty line detected at class body beginning.
2ef951a
to
ec7d97b
Compare
Setting.where(:name => 'bulk_query_installed_packages', :category => 'Setting::Content').delete_all | ||
end | ||
|
||
def down |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am debating if you should do
def down
raise ActiveRecord::IrreversibleMigration
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that is needed? The db should be in the same state as before, just with de-duplicate data.
end | ||
|
||
#Copy unique entires to new table | ||
ActiveRecord::Base.connection.execute('insert into katello_installed_packages_new(nvra, name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couldn't you just run execute(....)
instead of ActiveRecord::Base.connection.execute
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess you can!
select distinct(nvra) as nvra, name from katello_installed_packages') | ||
|
||
#copy associations | ||
ActiveRecord::Base.connection.execute(' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here
def up | ||
create_table "katello_installed_packages_new", force: :cascade do |t| | ||
t.string "name", limit: 255, null: false | ||
t.string "nvra", limit: 255, null: false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding a unique index to this column to speed up the search
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that it would help, but i can time it and see
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tested and with the index it took about ~4 minutes more.
t.integer "host_id", null: false | ||
t.integer "installed_package_id", null: false | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about a unique index for the combination columns here. This should speed up the lookup.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There arne't any lookups being done on this table?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I guess add it on the katello_installed_packages_new
Can you give me like general estimate on how long it took to run this on a huge packages DB, like 10s of millions of rows. I am trying to figure out if we should spend cycles trying to fine tune this. |
@parthaa using a db with 100 million rows in katello_installed_packages it took about 2 hours. I'll see if adding that one index prior will help. |
sorry, hit enter too soon. I tried a variety of other methods. Attempting to cache more data resulted in OOMs, trying to do stuff in 'chunks' took wayyyyyy longer (still running after ~10 hours). My box only had ~7GB, so a user with 100Million rows is likely to have a lot more, and hopefully it would take less time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have any real complaints about the code. I did test the migration and the rollback (with host packages, of course), and it appears to work fine. ACK pending another reviewer.
ec7d97b
to
d918a59
Compare
ACK |
No description provided.