Inference performance files #213
Conversation
Comments about code
Overall the code looks okay, but there are many comments that don't need to be there, and repeated code that could be consolidated into a single file.
The suggested implementation in #201 states:
@Aniket-Parlikar I see several comments are not yet addressed; please let me know when this is ready for a second look.
In regard to this comment: multi_models_logs.csv contains the performance parameters of multiple models deployed in a Docker container, while cloudrun_logs.txt contains the performance parameters of multiple models deployed on the Google Cloud Run service.
Force-pushed from e6419f0 to 5d3e98a
CPU performance: 53% for 10 requests and around 80% for 1000 requests
Container count: 30 for 10 requests and around 139 for 1000 requests
Should this be updated based on previous comment?
Yes, there are 10 active containers for 10 requests
Yes, these can be discarded
Request latencies: 10.69 min
Container CPU utilization: 75.39%
Container memory utilization: 46.55%
Container startup latency: 4.68 s
Nice summary!!
while process.is_running():
    # sample at an interval of 0.5 seconds while the process is alive
    time.sleep(0.5)
Just wondering why is it 0.5 here and 1 above?
It was set this way because test_models.py executes much faster than concurrent_inference.py, so a shorter sampling interval lets us capture the performance parameters more accurately while profiling the program.
Thanks for the updates, this is a good comprehensive overview!
I think the notebook may need to be updated to remove the duplicate result.
I left a couple of small comments; once they are addressed, this is ready to merge!
Time taken: 300 seconds
Container instance count: max active - 151
Request latencies: 10.69 min
This request latency is probably the biggest issue! I wonder why it is growing so fast.
Updated the inference_log_files.txt by removing unwanted logs.
Added new files for measuring inference performance