Questions on inference latency/cost #72
Comments
Great questions!
Hello, "one-at-a-time" means we cannot use batch size > 1, say 50, get the time spent on that batch, and then divide it by 50, right? @codyaustun Thanks! |
Yes, that is correct. For latency, you must use a batch size of 1.
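To make this concrete, a minimal sketch of such a batch-size-1 timing loop might look like the following. This assumes a PyTorch-style setup; `model` and `val_loader` are hypothetical stand-ins (a classifier with ≥ 93% top-5 accuracy and a loader over the 50,000 ImageNet validation images with `batch_size=1`), not part of any official DAWNBench harness:

```python
import time
import torch

def mean_latency_ms(model: torch.nn.Module, val_loader) -> float:
    """Average per-image latency over a loader that yields batches of size 1."""
    model.eval()
    total_seconds = 0.0
    num_images = 0
    with torch.no_grad():
        for image, _ in val_loader:            # exactly one image per iteration (batch size 1)
            start = time.perf_counter()
            _ = model(image)                   # classify a single image
            if image.is_cuda:
                torch.cuda.synchronize()       # finish any GPU work before stopping the clock
            total_seconds += time.perf_counter() - start
            num_images += 1
    return 1000.0 * total_seconds / num_images  # total time divided by 50,000 for the full set
```

Averaging over all 50,000 images, as the rule specifies, smooths out per-image timing noise without ever batching more than one image at a time.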
Thanks @codyaustun for your kind explanation. |
No problem
Original issue:
Hello,
I am trying to understand the latency rule in DAWNBench:
• Latency: Use a model that has a top-5 validation accuracy of 93% or greater. Measure the total time needed to classify all 50,000 images in the ImageNet validation set one-at-a-time, and then divide by 50,000.
I am not sure how to interpret "one-at-a-time" here, so I raised some questions (see the comments above) and would appreciate your confirmation.
Thanks.