Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Search: Performance Optimizations #1438

Closed
graciousgrey opened this issue Jul 19, 2021 · 7 comments
Closed

Search: Performance Optimizations #1438

graciousgrey opened this issue Jul 19, 2021 · 7 comments
Assignees
Labels
enhancement Refactoring, improvement or maintenance task performance Performance Optimization released Available in the stable release

Comments

@graciousgrey
Copy link
Member

graciousgrey commented Jul 19, 2021

As a user I want to have the best performance possible, so that I enjoy scrolling through my library.

This ticket is meant to collect possible performance optimizations:

@Dulanic
Copy link

Dulanic commented Oct 19, 2021

Just adding some recommendations I am seeing for commonly used queries. Will edit and add more as I find them.

  1. Change this where to a in, OR is horrible for performance. (files.file_type = 'jpg' OR files.file_video = 1) should be for example... file_type IN ('jpg','mov','mp4','mts','wmv')
  2. Get rid of offset etc... maybe a subquery would be better? Or pull the list of Id's and store it in the program and pull just those 60 Id's you want instead of all and then pulling back just 60 from that? Or you could use a cursor, but I'm not sure how well SQLLite works with cursors. EX: You could pull back the Id's in batches of 1000 store the Ids in memory, that would allow people to scroll up to 1000 before you needed to request more ids? I'd say use subqueries, but Maria DB doesn't like order by/limit with subqueries yet.

You load 1000 photos to memory /w a simple query like so:

SELECT id, taken_at
FROM photos
order by taken_at
LIMIT 1000;

Then your query gets an additional where... which would run 10x faster than the current queries.
AND photos.id IN (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14)

@graciousgrey graciousgrey changed the title Performance Optimizations Performance: Optimizations Nov 2, 2021
@nautatva
Copy link

nautatva commented Jan 15, 2022

User needs to wait for a couple of seconds before the archive button tick archives the photo. What I think can be done is showing the change as done on front-end and roll it back to view incase of error. This change can be applied for all the operations (Face release is one other I see takes time).

@lastzero
Copy link
Member

One of the issues here is that many caches need to be invalidated as well as thumbnails / covers replaced / checked. Also every user should see the same and not one user the change as done and others not. It's a pretty complex problem we are happy to further optimize when we reach our funding goal and had time to work on the items already on our roadmap.

@lastzero lastzero self-assigned this Mar 30, 2022
lastzero added a commit that referenced this issue Mar 31, 2022
In addition, you can now change the batch size in settings.yml.
@lastzero lastzero added please-test Ready for acceptance test performance Performance Optimization labels Mar 31, 2022
@lastzero lastzero changed the title Performance: Optimizations Search: Performance Optimizations Mar 31, 2022
@lastzero
Copy link
Member

The new development preview is up to 5x faster than the previous version:

Screenshot 2022-03-31 at 10 54 52

lastzero added a commit that referenced this issue Mar 31, 2022
lastzero added a commit that referenced this issue Mar 31, 2022
lastzero added a commit that referenced this issue Apr 4, 2022
lastzero added a commit that referenced this issue Apr 4, 2022
Since caching all subject data proved too complex in the time available,
this implementation uses a simple key/value lookup table to cache
subject names and perform backward searches by uid.
lastzero added a commit that referenced this issue Apr 5, 2022
The file query functions have been refactored and now return pointers.
@rdevries20
Copy link

Hi,

First of all thanks for the great work, I really like the application. I'm also experiencing performance problems with ~ 20.000 images and SQL (MariaDB) database. Default docker setup. For example this query seems to load every time the app is visited in browser:

SELECT SUM(photo_type = 'video' AND photo_quality >= 0 AND photo_private = 0) AS videos, SUM(photo_type IN ('image','raw','live') AND photo_quality < 3 AND photo_quality >= 0 AND photo_private = 0) AS review, SUM(photo_quality = -1) AS hidden, SUM(photo_type IN ('image','raw','live') AND photo_private = 0 AND photo_quality >= 0) AS photos, SUM(photo_favorite = 1 AND photo_quality >= 0) AS favorites, SUM(photo_private = 1 AND photo_quality >= 0) AS private FROM `photos` WHERE (photos.id NOT IN (SELECT photo_id FROM files WHERE file_primary = 1 AND (file_missing = 1 OR file_error <> ''))) AND (deleted_at IS NULL) LIMIT 1;

According to MariaDB this query takes 1 minute and 47 seconds to complete. In that time the GUI already performs the same query again further adding to the delay.

I'm not that good with SQL but the following query reduces the load time to 7 seconds using a left join.

SELECT SUM(photo_type = 'video' AND photo_quality >= 0 AND photo_private = 0) AS videos, SUM(photo_type IN ('image','raw','live') AND photo_quality < 3 AND photo_quality >= 0 AND photo_private = 0) AS review, SUM(photo_quality = -1) AS hidden, SUM(photo_type IN ('image','raw','live') AND photo_private = 0 AND photo_quality >= 0) AS photos, SUM(photo_favorite = 1 AND photo_quality >= 0) AS favorites, SUM(photo_private = 1 AND photo_quality >= 0) AS private FROM `photos` LEFT JOIN files b ON photos.id = b.photo_id AND (b.file_primary  = 1 AND (b.file_missing = 1 OR file_error <> ''))  WHERE b.photo_id IS NULL AND (photos.deleted_at IS NULL) LIMIT 1;

The query maybe can be further improved using indexes?

@lastzero
Copy link
Member

Tested with our preview build? We have 100k+ files for testing and do not observe such long query times.

@rdevries20
Copy link

Was using the docker default (latest), switching to preview build (Build 220510-16c30fff-Linux-AMD64) via the docker-compose file fixed the long query time. GUI is now responding within seconds and no longer 100% CPU load on MariaDB. Thanks!

@graciousgrey graciousgrey added released Available in the stable release and removed please-test Ready for acceptance test labels May 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Refactoring, improvement or maintenance task performance Performance Optimization released Available in the stable release
Projects
Status: Release 🌈
Development

No branches or pull requests

5 participants