Performance Improvements #350
Comments
Hi @amaleki -- thanks for this change. Do you mind opening a pull request? That will trigger the tests to run. Meanwhile, I will have a look at your changes.
@amaleki (as mentioned in the Raspberry Pi performance thread) I tried this by importing 86,000 images yesterday. It finished in 12 hours, whereas on the master branch it had been running for over 2 days and was still going. It wasn't an isolated test, however: I also used the --trash flag and the changes I've made in my fork to use a local DB for location requests instead of MapQuest. On a quick visual inspection it appears all my files still exist and the metadata extracted by exiftool looks correct. Good work :)
Hi @jmathai, I've created a pull request. Looks like some tests have failed; I'll take a look to see what went wrong.
Hey guys, I did a comparison using the latest commits on both forks:

Setup

Results

amaleki
SUMMARY
Metric  Count
Files:

jmathai
SUMMARY
Metric  Count
Files:
Similar improvements when running unit tests.
amaleki: Ran 310 tests in 14.489s
All PR #352 test issues have been resolved. I moved all changes to a branch off amaleki/elodie. You can find it here. Thanks @evanjt for benchmarking both commits.
Thanks @amaleki for adding a technique to my armoury; I straight away found a superfluous file write in one of my programs and saved several seconds of running time 👍
@TonyWhitley you bet! @evanjt I've created a new PR #353 with some minor cache improvements. From my testing I see a ~15% reduction in time for long-running imports. Can you see if you get similar processing time improvements?
Hi amaleki, I don't have the exact same folder for testing the images, but here's another test:

Original folder:

Last commit on amaleki:fs-process-file-media-set-order (0c1142e)
Folder: Success 3795
83.34s user

Last commit to jmathai:master (d8cee15)
Folder: Success 3795
89.21s user
I have been using Elodie to organize my photo library. A very impressive command line utility.
I did some profiling to see if I could improve the per image processing time.
Import 1 file with master branch - 2.57s
python -m cProfile -o elodie.prof elodie.py import --destination="" <file>
"get_metadata" takes 74% of run time. Every image makes 8 calls to exiftool. Each call to exiftool costs ~200-300ms
Import 24 files with master branch - 73.7s
python -m cProfile -o elodie.prof elodie.py import --destination="<dst-dir>" --source="<src-dir>"
"get_metadata" takes 84% of run time. Again every image makes 8 calls to exiftool with each call taking ~200-300ms.
After looking over the code, two things can be done to improve performance significantly.
I've implemented these changes in a fork I created:
amaleki@afd5766#diff-5a2b1bdaf59cbc881a6ad218eaabdc42
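To give a rough idea of the direction, here is an illustrative sketch only -- not the actual diff in the commit above, and the function names are made up: read all of a file's tags with a single `exiftool -json` call and cache the parsed result, so the remaining per-image lookups never spawn another exiftool process.

```python
# Illustrative sketch (not the code from the linked commit): fetch every tag
# for a file with one `exiftool -json` invocation and memoize the result, so
# repeated metadata lookups for the same image avoid the ~200-300ms cost of
# starting another exiftool process. Assumes exiftool is on the PATH.
import json
import subprocess
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_exiftool_metadata(path):
    output = subprocess.check_output(["exiftool", "-json", path])
    return json.loads(output)[0]  # exiftool -json returns a one-element list


def get_date_taken(path):
    # Individual accessors read from the cached dict instead of shelling out.
    return cached_exiftool_metadata(path).get("DateTimeOriginal")
```

Another common way to amortize the same startup cost is exiftool's -stay_open batch mode, which keeps one long-lived exiftool process serving all requests instead of launching a new process per call.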
Here are some results from my testing:
Importing 1 file with the above commits - 0.690s (73% Improvement)
"get_metadata" now takes 32% of run time.
Importing 24 files with the above commits - 3.17s (95% Improvement)
"get_metadata" now takes 37% of run time.
Results are from a Windows Docker Desktop container running Alpine Linux on a Xeon E-2176M. Visualizations are from SnakeViz.
I would love to have the community review the code and see what kind of results they see. I'm hoping these aren't "too good to be true"!
elodie-master-profile.zip
elodie-amaleki-branch-commit-afd5766.zip