-
Notifications
You must be signed in to change notification settings - Fork 873
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add indexes to improve performance #4706
base: master
Are you sure you want to change the base?
Conversation
Hey, can you validate that adding : This index: CREATE INDEX "nn_FilesetentryFile" on FilesetEntry ("FilesetID","FileID"); should meet the same responsibility as your FilesetEntryFilesetID index. Also any new database changes need to be in their own subsequent .sql file. So you should add your changes to a 13. Additional index.sql or something as any existing databases created with the current canary would need to have another version so they know to apply the new indexes. I think as long as that .sql file hasn't been added to a release it is ok to include it in the same file but that isn't the case for this. |
Thanks for the feedback. I have moved the indexes to a separate file and incremented the db version to 13. Yes, as mentioned in the comments for Issue 4645, these new queries are necessary. I figured your nn_FilesetentryFile index would cover the same ground as my FilesetEntryFilesetID index, but for whatever reason, mine is necessary as well. I'm curious whether you see this using the query I posted in 4645 against your 10G db. |
@jedthe3rd Running |
This is what I was thinking was the case. Ya, I definitely think we should run analyze before each backup. I actually made this change in my code but haven't gotten around to creating the pull request because I am working on a few changes I wanted to batch. Just change this request to add analyze to run before the backup runs and that should be good. This should have a minimal time impact overall. Can you also see if you can update the backup status to show it's "optimizing the database" as a step. I think that would be a good addition to this change. If it seems like a bunch of work then we can leave that for a different time. |
I'm not in love with the idea of analyzing before each run. It's not best practice and could set a bad precedent for others looking to do something similar. While analyzing before each run may not be excessive overhead, ideally the analyze would occur immediately after adding the index. But since your PR has already been released to canary, I think that ship has sailed. As goofy as it looks, I think adding version 13 to do an analyze is the most viable option. Hmm. |
Ya, I didn't think that thought all the way through. Actually the best bet is probably to add an analyze after a database is upgraded here: https://github.com/duplicati/duplicati/blob/master/Duplicati/Library/SQLiteHelper/DatabaseUpgrader.cs |
@AxeOfMen & @jedthe3rd Do you want to include the changes here, or did you have a better fix in mind? |
These indexes are basically duplicates. The real solution to the issue he is facing is to add an analyze when the database is upgraded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this still relevant in the same way, the comments on the pull request seem to indicate the addition of analyze on upgrade would be useful. That is true, should that be part of this chrnge?
Also note that vacuum appears to be called during operation startup as well, I cant recall without looking it uf if that does any analysis work.
@@ -82,6 +82,7 @@ CREATE TABLE "FilesetEntry" ( | |||
/* Improved reverse lookup for joining Fileset and File table */ | |||
CREATE INDEX "FilesetentryFileIdIndex" on "FilesetEntry" ("FileID"); | |||
CREATE INDEX "nn_FilesetentryFile" on FilesetEntry ("FilesetID","FileID"); | |||
CREATE INDEX "FilesetEntryFilesetID" ON "FilesetEntry" ("FilesetID") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The line above has an index on the same leading column. The cost of maintaining this index will be higher shan using the original one as it will skipscan.
Is there a specific reason for this index or crn it be removed?
@@ -0,0 +1,4 @@ | |||
CREATE INDEX "FileLookupMetadataID" ON "FileLookup" ("MetadataID"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is value in adding sql comments indicating the operations this is known to improve so others working on this later can see the reason for the addition.
This pull request has been mentioned on Duplicati. There might be relevant details there: |
Add performance index