Inference time 16 bit vs 32 bit #22
Comments
Full precision is the default setting. To run with half precision, pass the --half argument.
With and without --half, the model takes the same time for a forward pass. Isn't the forward pass supposed to be faster with --half? GPU: GeForce RTX 3080
I thought so too, but I got the same result on my Titan Xp.
So: is it because the weights are half precision by default, and therefore I can't measure how long the model would take at 32-bit?
I wouldn't think so, but maybe? You could test it by instantiating a new model with half and full precision in the main of yolo.py and timing the forward pass there.
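A minimal sketch of that experiment, with a placeholder network standing in for the Model built in yolo.py (the real constructor and input size come from the repo):

```python
import copy
import torch
import torch.nn as nn

device = torch.device('cuda')

# Placeholder network standing in for the Model built in yolo.py's __main__;
# swap in the repo's actual model construction.
model = nn.Sequential(nn.Conv2d(3, 32, 3, padding=1), nn.SiLU(),
                      nn.Conv2d(32, 64, 3, padding=1), nn.SiLU()).to(device)
x = torch.randn(1, 3, 1280, 1280, device=device)

# .half() casts in place (and returns self), so deep-copy first to keep both.
model_fp32 = model                        # float32 is PyTorch's default dtype
model_fp16 = copy.deepcopy(model).half()

with torch.no_grad():
    y32 = model_fp32(x)                   # fp32 weights, fp32 input
    y16 = model_fp16(x.half())            # fp16 weights need an fp16 input
```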
I tried that now:
Observations:
The first pass is always slow. Ignore the first pass and then average over many passes (e.g., 100) using a for loop. Why are you expecting 1.5 ms?
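Something like this, as a rough sketch of the suggested protocol (CUDA kernels launch asynchronously, so the timer has to synchronize before and after the loop):

```python
import time
import torch

def time_forward(model, x, n=100):
    """Average forward-pass latency in ms, discarding the first (warm-up) pass."""
    with torch.no_grad():
        model(x)                      # warm-up: first pass pays CUDA init / autotune cost
        torch.cuda.synchronize()
        t0 = time.time()
        for _ in range(n):
            model(x)
        torch.cuda.synchronize()      # wait for all queued kernels to finish
    return (time.time() - t0) / n * 1000

# e.g., with the fp32/fp16 pair from the earlier snippet:
# print(time_forward(model_fp32, x), time_forward(model_fp16, x.half()))
```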
Sorry. I was expecting 15 ms. |
So it's 11 ms for half and 20 ms for 32-bit.
Interesting. On my Titan Xp I got 20 ms for half and 24 ms for full, so your hypothesis might be correct. Note that the numbers I report here are slower than those in Table 1, because Table 1 was computed using rectangular images (only 1280 on one side; the other side is smaller).
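Back-of-the-envelope reasoning for the gap: convolution cost scales roughly with pixel count, so a rectangular letterboxed frame is proportionally cheaper than a 1280x1280 square one (the 768 short side below is just an illustrative value; the actual short side depends on the input aspect ratio and stride padding):

```python
square = 1280 * 1280
rect = 1280 * 768      # illustrative short side for a wide frame
print(rect / square)   # 0.6 -> ~40% fewer pixels, hence roughly that much less conv work
```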
What I reported is for KAPAO-S. Another question:
All models in Table 1 were evaluated using float32.
When I use the video.py demo, I hardcode half = False in the file so as to avoid any conversion to half precision, but the inference time is still the same. Is it because the weights are half precision by default, and therefore I cannot measure how much time your model would take if it were 32-bit?
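One way to check that hypothesis directly is to inspect the dtype of the loaded weights rather than relying on the half flag. A sketch, assuming a YOLOv5-style checkpoint that stores the model object under a 'model' key (the kapao_s_coco.pt filename is an assumption; adjust to your checkout):

```python
import torch

ckpt = torch.load('kapao_s_coco.pt', map_location='cpu')
model = ckpt['model']                   # YOLOv5-style checkpoints keep the module here (assumption)
print(next(model.parameters()).dtype)   # torch.float16 if the weights were saved in half precision

model = model.float()                   # force a genuine fp32 model before timing
print(next(model.parameters()).dtype)   # torch.float32
```

If the first print shows float16 even with half = False, the model never actually ran in fp32 without an explicit .float() cast, which would explain the identical timings.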