Extremely slow execution for torch models after pyinstaller packing #8211
Does it make any difference if you divert the program flow in the multiprocessing worker sub-processes as soon as possible, instead of going through the heavy imports first? In a frozen application, each spawned worker re-executes the entry-point module, so calling multiprocessing.freeze_support() before the torch/ultralytics imports lets the workers divert without paying the import cost. I.e., if you reorganize the code as follows:

# Divert the program in multiprocessing workers as soon as possible...
if __name__ == "__main__":
    import multiprocessing
    multiprocessing.freeze_support()

import time
import cv2
import numpy as np
import base64
from ultralytics import YOLO

if __name__ == "__main__":
    with open('img.png', 'rb') as f:
        image_data_binary = f.read()
    image_puzzle = (base64.b64encode(image_data_binary)).decode('ascii')
    nparr = np.frombuffer(base64.b64decode(image_puzzle), np.uint8)
    img_np = cv2.imdecode(nparr, cv2.IMREAD_COLOR)

    model = YOLO('yolov8n.pt')  # this file will be downloaded automatically from github repo
    print('model loaded')

    start_time = time.time()
    results = model.predict(source=img_np, show=False, save=False, save_conf=False, show_conf=False, save_txt=False)
    print("predicted in %.3f seconds" % (time.time() - start_time))

    time.sleep(5)
@rokm thank you very much, prediction is now ~3.1 seconds. So it's ~1.3 seconds slower than the original. Is there anything else we can do to speed it up? The difference is 2x, so it's possible to work with that, but, of course, higher speed is always good.
Hmm, afraid I'm out of other obvious optimization opportunities. FWIW, I can reproduce the original slowdown (although not to such a drastic extent) on my Win10 x64 laptop, but the modified program seems to perform comparably to the original unfrozen code. Then again, I'm using python.org python with all dependencies installed via pip. Full reproduction steps:
This installs:
Now, if I run the unfrozen script 5 times (on an otherwise idle system):
With the exception of the very first run, the total measured prediction time is ~2.2 seconds. If I freeze the original version of the program
and run the resulting executable five times:
The measured total prediction time is ~5.7 s. Finally, if I move the heavy imports behind the early multiprocessing divert, as in the reorganized version above, rebuild, and run five times:
This time, I get a prediction time that is comparable to (if not slightly better than) the unfrozen version.
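For anyone wanting to replicate the measurement loop, here is a minimal sketch; the executable path dist/script/script.exe and the run count are illustrative assumptions, not taken from the thread:

import subprocess

# Run the frozen executable five times and collect the reported prediction times.
for i in range(5):
    result = subprocess.run(
        ["dist/script/script.exe"],  # hypothetical output path of the --onedir build
        capture_output=True, text=True,
    )
    # The script prints "predicted in X.XXX seconds" on stdout.
    for line in result.stdout.splitlines():
        if line.startswith("predicted in"):
            print(f"run {i + 1}: {line}")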
Can you try with pure python.org python + pip? I suppose it would also be interesting to compare, on the same hardware, the speed of the unfrozen script in the python.org + pip environment vs. the conda environment that you are using.
@rokm I tested on that env and the speed jumps to 2-4 seconds on a clean script start. Very weird. After packing with pyinstaller, it's ~3 seconds. Anyway, thank you, I will keep testing. What Python version did you use?
I used python.org python 3.11.7. If python.org + pip python is slower in the unfrozen variant compared to conda, it could be that the conda-installed packages are built with different optimizations. I suppose I'll also check what happens on my laptop with a (mini)conda environment.
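One way to check for such build differences is to compare the build configuration that numpy and torch report in each environment; both helpers below are standard introspection calls in those libraries:

import numpy as np
import torch

# Print the BLAS/LAPACK backends this numpy build is linked against.
np.show_config()

# Print the compile-time options (BLAS, MKL, CUDA, etc.) of this torch build.
print(torch.__config__.show())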
I've tested with a miniconda-installed environment following your instructions, and on my laptop, the performance is largely unchanged. For reference, here is the environment listing:
And the results (using the same image): five runs of the unfrozen script:
And five runs of the frozen application rebuilt in this environment (which, as a side note, is now 1.19 GB, compared to 618 MB when built with python.org and pip-installed dependencies):
So I get results that are largely consistent, both between the frozen and unfrozen versions and with the earlier results from the python.org + pip environment. No idea why it differs in your case.
OS: Windows 11 x64
python: 3.11.7 (latest)
torch: 2.1.2 (latest)
torchvision: 0.16.2 (latest)
pyinstaller: 6.3.0 (latest)
ultralytics: 8.0.231 (latest)
Ultralytics is a torch-based library for working with images and machine learning. When running a torch/ultralytics model.predict() call (making machine-learning image predictions) after pyinstaller packing, it is extremely slow: ~12 seconds. Without packing: ~1.8 seconds.
Any ideas what could be wrong? At ~12 seconds, it is unworkable.
How I pack:
$ pyinstaller --onedir --hidden-import 'torch.jit' --collect-all ultralytics --collect-all torch --collect-all torchvision script.py
Minimal Reproducible Example
script.py:
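(The script body was attached in the original issue; judging from the reorganized version in the comments above, it was presumably the following, without the early multiprocessing divert. A sketch under that assumption:)

import time
import cv2
import numpy as np
import base64
from ultralytics import YOLO

if __name__ == "__main__":
    # Read the test image and round-trip it through base64, as in the thread.
    with open('img.png', 'rb') as f:
        image_data_binary = f.read()
    image_puzzle = (base64.b64encode(image_data_binary)).decode('ascii')
    nparr = np.frombuffer(base64.b64decode(image_puzzle), np.uint8)
    img_np = cv2.imdecode(nparr, cv2.IMREAD_COLOR)

    model = YOLO('yolov8n.pt')  # downloaded automatically on first run
    print('model loaded')

    start_time = time.time()
    results = model.predict(source=img_np, show=False, save=False, save_conf=False, show_conf=False, save_txt=False)
    print("predicted in %.3f seconds" % (time.time() - start_time))

    time.sleep(5)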
img.png can be any PNG image.