Improve performance of DB Metadata impl #22779

Closed
wezell opened this issue Aug 17, 2022 · 5 comments

Comments

@wezell
Contributor

wezell commented Aug 17, 2022

When we rolled out the metadata changes, we provided two implementations: a file-system-based one and a DB-based one. The DB implementation was intended to be our default because it lowers our dependency on the NFS share, but in practice it caused a bottleneck: it had to do a DB lookup and then write a temp file before it could return the metadata.

We should tune this implementation so that it can become our default and return the metadata with a single DB query, without having to write a temp file.
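
To illustrate the idea only, here is a minimal sketch in Python against an in-memory SQLite database. The `metadata` table, its columns, and the `load_metadata` helper are hypothetical names made up for this example; this is not dotCMS code or its schema. It just shows the shape of "one query, parse the row in memory, no temp file".

```python
import json
import sqlite3


def load_metadata(conn, asset_id):
    """Return stored metadata as a dict using one query and no temp file."""
    row = conn.execute(
        "SELECT payload FROM metadata WHERE asset_id = ?", (asset_id,)
    ).fetchone()
    # Parse the JSON straight from the result row; nothing is written to disk.
    return json.loads(row[0]) if row else None


# Hypothetical schema and sample row, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE metadata (asset_id TEXT PRIMARY KEY, payload TEXT)")
conn.execute(
    "INSERT INTO metadata VALUES (?, ?)",
    ("abc", json.dumps({"contentType": "image/png", "length": 1024})),
)

result = load_metadata(conn, "abc")
```

An unknown id returns `None` here, so the caller can decide whether to fall back to generating the metadata.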

@wezell
Contributor Author

wezell commented Aug 17, 2022

@jdotcms
Contributor

jdotcms commented Oct 25, 2022

Waiting for Fabro's QA

@fabrizzio-dotCMS
Contributor

PR is #22834

@fabrizzio-dotCMS
Contributor

fabrizzio-dotCMS commented Oct 29, 2022

I ran a load test using a tool called "siege":
https://medium.com/@guy.callaghan/load-testing-with-siege-3fa6a1e8118d
I fed it a file with 103 transactions. Each transaction is meant to generate metadata for 50 different binaries.
metadata-load-testing.txt

The test environment was seeded with 5300 separate file assets.
The test was executed against both implementations (DB and FILE_SYSTEM) in 3 rounds. On every round, the intermediary files were completely removed and the whole cache was flushed.
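
As a rough sketch of how an input file of that shape could be built (this is not the attached file): the Python below batches asset ids into 103 lines of 50 ids each. The base URL, the `ids` query parameter, and the `asset-N` identifiers are all hypothetical, and it assumes no asset is repeated across transactions, which the description above does not state. 103 x 50 = 5150 ids, which fits within the 5300 seeded assets.

```python
def build_transactions(asset_ids, per_transaction=50, transactions=103,
                       base_url="http://localhost:8080/hypothetical/metadata"):
    """Build one URL line per transaction, each covering a distinct batch of ids."""
    lines = []
    for i in range(transactions):
        # Take the next non-overlapping slice of ids for this transaction.
        batch = asset_ids[i * per_transaction:(i + 1) * per_transaction]
        lines.append(f"{base_url}?ids={','.join(batch)}")
    return lines


# Placeholder ids standing in for the seeded file assets.
asset_ids = [f"asset-{n}" for n in range(5300)]
lines = build_transactions(asset_ids)
```

Written one per line to a text file, these could then be passed to siege with its `-f` option.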

The results can be seen here: https://docs.google.com/spreadsheets/d/1Qr0EByhNI_RUmtmhd8D6_oXRB0B24dQFyr8zaXq9Ako/edit#gid=0
In summary:

Apparently it is now cheaper to copy the previously generated metadata from the database using the intermediate files than to generate it directly from Felix, write it out as files, and then also put it in the cache.

@stale

stale bot commented Jan 2, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jan 2, 2023
jdotcms added a commit that referenced this issue Jan 10, 2023
dsilvam pushed a commit that referenced this issue Jan 30, 2023
* #22779 adding first changes

* #22779 the file can holds a binary buffer or a file

* #22779 adding more changes

* #22779 just adding a comment for future

* #22779 adding a normalization of the file repo manager, now it is grouping on directories

* #22779 minor changes to handle the hashes on folders

* #22779 adding more unit test

* #22779 adding more unit test

* #22779 adding a fix for an unit test

* #22779 adding more doc

---------

Co-authored-by: fabrizzio-dotCMS <fabrizzio@dotCMS.com>