Refactor scanners #447

Merged
merged 44 commits into from Sep 23, 2022

vgarleanu (Member) commented May 20, 2022

This PR refactors the scanners to use sqlite transactions more intelligently and efficiently. The new scanner improves scanning throughput (how many items we can scan in a given time) and has been re-architected to be easily extensible with metadata providers and filename parsers.

Some functionality has been removed, such as inserting posters into the asset fetching queue and sending websocket events to clients. This has been done in favour of implementing a CDC-based (change data capture) approach to sending events using sqlite commit hooks.

Potential future improvements:

  • Have a shared instance of all metadata providers accessible at the global level, so that rate limiting and caching work globally rather than per scanner instance. (This would also decrease the chance of us being rate-limited.)
  • Instead of inserting all mediafiles from a directory into the db and then scanning them, we could do both in parallel to give clients more feedback on progress. This could be achieved by using message channels and some sort of buffering middleware.
  • Performance could be improved further by having a scheduler manage the scanning of video and audio metadata using ffprobe. The scanner's main job would then simply be fetching metadata, and video metadata would be inserted in batches as a separate step. This might actually decrease throughput as we'd be using separate sqlite transactions, but it could perhaps be optimized with BEGIN CONCURRENT. BEGIN CONCURRENT would be an appropriate candidate here because there is no possibility of racy insertion of mediafiles: they already exist, and the only thing we'd be doing is modifying existing database rows.
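
To make the second idea above (message channels plus a buffering middleware) a bit more concrete, here is a minimal sketch using only the standard library. It is not code from this PR: `MediaFile`, `batch_stage` and `run_pipeline` are hypothetical names, it uses `std::sync::mpsc` and plain threads rather than whatever async channels dim would actually use, and the point where a batch would be committed to sqlite is only marked with a comment.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a media file discovered while walking a directory.
#[derive(Debug, Clone, PartialEq)]
struct MediaFile {
    path: String,
}

/// Buffering middleware: groups incoming items into batches of at most
/// `batch_size` and forwards each batch downstream. Any partial batch is
/// flushed once the upstream sender is dropped.
fn batch_stage<T: Send + 'static>(
    rx: mpsc::Receiver<T>,
    batch_size: usize,
) -> mpsc::Receiver<Vec<T>> {
    assert!(batch_size > 0, "batch_size must be non-zero");
    let (tx, out) = mpsc::channel();
    thread::spawn(move || {
        let mut buf = Vec::with_capacity(batch_size);
        // Iterating a Receiver blocks until an item arrives and ends
        // when every Sender has been dropped.
        for item in rx {
            buf.push(item);
            if buf.len() == batch_size {
                let full = std::mem::replace(&mut buf, Vec::with_capacity(batch_size));
                if tx.send(full).is_err() {
                    return; // downstream went away, stop early
                }
            }
        }
        if !buf.is_empty() {
            let _ = tx.send(buf);
        }
    });
    out
}

/// Runs the pipeline over `paths` and returns the batches in the order
/// they were received by the consumer.
fn run_pipeline(paths: Vec<String>, batch_size: usize) -> Vec<Vec<MediaFile>> {
    let (tx, rx) = mpsc::channel();

    // Producer: stands in for the directory walker.
    let producer = thread::spawn(move || {
        for path in paths {
            if tx.send(MediaFile { path }).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which closes the channel.
    });

    let batches = batch_stage(rx, batch_size);

    // Consumer: in a real scanner each batch could be inserted in one
    // transaction and then scanned, while the walker keeps producing.
    let mut committed = Vec::new();
    for batch in batches {
        committed.push(batch);
    }

    producer.join().expect("producer thread panicked");
    committed
}
```

With five paths and a batch size of 2, `run_pipeline` returns three batches of sizes 2, 2 and 1, in discovery order. Batching per channel message keeps the number of transactions low while still letting clients see progress before the whole directory has been walked.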

Breaking changes:

  • This PR introduces no breaking REST API changes, but it does modify some old schema migrations, which means that after this PR old databases will need to be deleted.
  • The code in this PR makes use of some unstable features to make everything work and look nice, namely min_specialization. As such, a rust-toolchain file has been added to ensure dim is compiled with the latest nightly build.

Notes:
The rustc version is currently pinned to nightly-2022-05-31 due to an ICE (internal compiler error) caused by a MIR bug in the latest nightly, which causes dim to fail to compile in release mode. Once this is fixed, the pinned version will be bumped to the latest nightly.

vgarleanu force-pushed the rfc/scanner-rework branch 7 times, most recently from 81664f6 to a1893f5 (June 27, 2022 22:24)
vgarleanu changed the title from "Draft: Refactor scanners" to "Refactor scanners" (Sep 23, 2022)
vgarleanu merged commit dffbcf3 into master (Sep 23, 2022)
vgarleanu deleted the rfc/scanner-rework branch (September 25, 2022 16:03)