Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve query to cleanup the previews #40974

Merged
merged 7 commits into from
Sep 27, 2023
Merged

Improve query to cleanup the previews #40974

merged 7 commits into from
Sep 27, 2023

Conversation

jvillafanez
Copy link
Member

Description

Optimize the query to cleanup the previews to improve the performance of the job

Related Issue

https://github.com/owncloud/enterprise/issues/5978
#40971

Motivation and Context

Previous query was taking too much time to run, and it was causing problems.

How Has This Been Tested?

Manually tested with mysql. Need to test with postgresql and oracle.

  1. Upload some images to multiple users
  2. Remove some of those images, also from the trashbin.
  3. Run the command.

Screenshots (if appropriate):

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Database schema changes (next release will require increase of minor version instead of patch)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Technical debt
  • Tests only (no source changes)

Checklist:

  • Code changes
  • Unit tests added
  • Acceptance tests added
  • Documentation ticket raised:
  • Changelog item, see TEMPLATE

@jvillafanez jvillafanez force-pushed the improve_preview_cleanup branch 2 times, most recently from 570c551 to b8a39ec Compare September 7, 2023 10:40
@jvillafanez
Copy link
Member Author

@phil-davis could you check the test "the previews from shared files are cleaned up" in https://drone.owncloud.com/owncloud/core/39041/112/16 ? There might something wrong with that test, at least now with the code in the PR.

I've run the test manually an the result I get is the following:

root@8bb6dc4c97db:/var/www/owncloud# occ previews:cleanup
86 - Brian: deleted
87 - Brian: deleted
2 orphaned previews deleted
root@8bb6dc4c97db:/var/www/owncloud# occ previews:cleanup
68 - Alice: deleted
69 - Alice: deleted
68 - Brian: deleted
69 - Brian: deleted
4 orphaned previews deleted

The first execution is when Brian has deleted the 2 files from the trashbin. I guess those are the thumbnails generated when Brian deletes the files from the trashbin, because the files in the Brian's trashbin are copies of the original, so their file id is different.
Maybe the test doesn't access / generate those previews, so the test passes having deleted 0 thumbnails. This should be ok.

For the second execution, the Alice's thumbnails are deleted as expected for both files, and Brian's thumbnails should be the ones generated when the files were still shared. Since they were still linked to a valid file id, it's ok that they're deleted now.
So it makes sense that the test removes 4 previews, but I don't know is where the other 2 missing previews come from.

@jvillafanez
Copy link
Member Author

Same manually steps with master branch:

root@6d9676c87804:/var/www/owncloud# occ previews:cleanup
58 - Brian: deleted
57 - Brian: deleted
2 orphaned previews deleted
root@6d9676c87804:/var/www/owncloud# occ previews:cleanup
20 - Alice: deleted
21 - Alice: deleted
20 - Brian: deleted
21 - Brian: deleted
20 - Brian: cache cleared
21 - Brian: cache cleared
6 orphaned previews deleted

The reason for the different results is due to how the query is made.
Old query

+--------+------+---------+
| fileid | name | user_id |
+--------+------+---------+
|     29 | 26   | Alice   |
|     72 | 71   | Alice   |
|     29 | 26   | Brian   |
|     46 | 26   | Brian   |
|     72 | 71   | Brian   |
|     75 | 71   | Brian   |
+--------+------+---------+

Note that file ids 29 and 72 appear for Alice and Brian. This happens because Brian can access to those file ids by the created share

For the new query:

+--------+------+---------+
| fileid | name | storage |
+--------+------+---------+
|     29 | 26   |       5 |
|     72 | 71   |       5 |
|     46 | 26   |       7 |
|     75 | 71   |       7 |
+--------+------+---------+

In this case we're interested in the storage (we can figure out the owner later), so the entries coming from the share don't appear.

That should explain the different results, so it makes sense to adjust the test

@jvillafanez jvillafanez force-pushed the improve_preview_cleanup branch 2 times, most recently from 9c81c62 to 6978f0b Compare September 7, 2023 12:30
@jvillafanez jvillafanez force-pushed the improve_preview_cleanup branch from 6978f0b to 2f1902c Compare September 7, 2023 12:38
@phil-davis phil-davis self-assigned this Sep 8, 2023
@jvillafanez
Copy link
Member Author

@phil-davis I think I can adjust the test to make it pass, or do you want to change the tests in another way?
Other than that, I think this is ready. There is another failing test, but there is a connection problem to the external server at the moment.

@jvillafanez
Copy link
Member Author

Rise chunk size to 500. Cron job is expected to run up to 1 minute approximately.

@phil-davis
Copy link
Contributor

I think I can adjust the test to make it pass, or do you want to change the tests in another way?

I am quite busy with other things, so yes, adjust the tests to expect what you mentioned in the comments above. Ping me if you need advice...

@phil-davis
Copy link
Contributor

Note: issue #40979
We know that the core php sqlite Windows CI pipeline is not working. So the fail of that pipeline is "expected". Hopefully a decision can be made about what to do for that "real soon now".

@pako81
Copy link

pako81 commented Sep 14, 2023

@phil-davis @jvillafanez should we force-merge this?

@jvillafanez
Copy link
Member Author

If we need to rush it, I think we can force merge it, but I'd rather wait until CI is fixed if possible.

@phil-davis
Copy link
Contributor

@phil-davis @jvillafanez should we force-merge this?

only the known sqlite-windows unit test pipeline is failing, and I don't think that this PR has any impact on that.
IMO someone with the needed privs can force-merge.

@jnweiger
Copy link
Contributor

I've re-triggered the CI. The sqlite-windows unit test pipeline should now succeed! 👓

@jnweiger
Copy link
Contributor

@mrow4a @DeepDiver1975 Review here? CI is passing now.

@pako81
Copy link

pako81 commented Sep 20, 2023

@jvillafanez changelog is missing.

@jvillafanez
Copy link
Member Author

Changelog added

@owncloud owncloud deleted a comment from update-docs bot Sep 25, 2023
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

88.0% 88.0% Coverage
0.0% 0.0% Duplication

@pako81 pako81 requested a review from IljaN September 27, 2023 14:23
@pako81 pako81 merged commit 8b68054 into master Sep 27, 2023
@delete-merged-branch delete-merged-branch bot deleted the improve_preview_cleanup branch September 27, 2023 14:36
@IljaN
Copy link
Contributor

IljaN commented Sep 27, 2023

Tested the query with pgsql and cast to BIGINT ✅

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants