Add instructions to run benchmarks #480
Conversation
✅ Deploy Preview for gateway-api-inference-extension ready!
/hold
```
@@ -0,0 +1,60 @@
apiVersion: apps/v1
kind: Deployment
```
@achandrasekar How would one start another run? Should we use a Job here instead, something that runs to completion?
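For illustration, a Job that runs to completion could look roughly like the sketch below. This is not the manifest from this PR; the name and image are placeholders.

```sh
# Sketch only: a Job alternative to the Deployment discussed above.
# The name and image are placeholders, not taken from this PR.
kubectl apply -f - <<EOF
apiVersion: batch/v1
kind: Job
metadata:
  name: benchmark-run
spec:
  backoffLimit: 0            # do not retry a failed benchmark run
  template:
    spec:
      restartPolicy: Never   # run once to completion
      containers:
      - name: benchmark
        image: BENCHMARK_IMAGE   # placeholder for the lpg benchmark image
EOF
```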
I thought about this as well. A Deployment is convenient in that it keeps the pods running so we can download the result files from the pod; otherwise we would need to set up persistent storage such as S3 or GCS, and not every user has access to those. This also aligns with the user guide of the lpg tool.
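For context, keeping the pod running is what makes a direct copy possible. A minimal sketch, assuming an `app=benchmark-tool` label and a `/results` directory (both illustrative):

```sh
# Copy results out of the still-running benchmark pod.
# The label selector and results path are assumptions for illustration.
POD=$(kubectl get pods -l app=benchmark-tool -o jsonpath='{.items[0].metadata.name}')
kubectl cp "${POD}:/results" ./benchmark-results   # kubectl cp needs a running container
```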
You can give users an option to export the results to S3 or GCS in the job.
I think the pod/job/files stay around after it completes, so we should still be able to download the results?
> You can give users an option to export the results to S3 or GCS in the job.

I took the approach that requires minimal dependencies. Yes, using persistent storage such as S3 works as well, but it requires additional configuration. We can add that option later.

> I think the pod/job/files stay around after it completes, so we should still be able to download the results?

You will need some persistent volume.
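As a sketch of the export option discussed above: a Job could run an upload step after the benchmark finishes. The bucket names are placeholders, and the pod would need credentials for the bucket.

```sh
# Optional upload step a benchmark Job could run after it finishes.
# Bucket names are placeholders; requires GCS or S3 credentials in the pod.
gsutil cp -r /results "gs://YOUR_BUCKET/benchmark-$(date +%Y%m%d-%H%M%S)/"
# or, for S3:
# aws s3 cp --recursive /results "s3://YOUR_BUCKET/benchmark-results/"
```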
I updated the download-benchmark-result.sh script to tear down the deployment after it downloads the results.
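The flow that script implements is roughly the following; this is an outline with illustrative names, see download-benchmark-result.sh in the PR for the real version.

```sh
# Rough outline of the download-then-teardown flow; names are illustrative.
POD=$(kubectl get pods -l app=benchmark-tool -o jsonpath='{.items[0].metadata.name}')
kubectl cp "${POD}:/results" ./benchmark-results
kubectl delete deployment benchmark-tool   # tear down once results are saved
```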
This worked for me, thanks! A couple of things to improve:
A couple more things:
We can now link to the configuration options: https://github.com/AI-Hypercomputer/inference-benchmark?tab=readme-ov-file#configuring-the-benchmark
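For example, benchmark parameters could be overridden on the Deployment via environment variables. The variable names below are hypothetical; see the linked README for the actual options.

```sh
# Hypothetical parameter names, for illustration only; see the README above
# for the real configuration options.
kubectl set env deployment/benchmark-tool REQUEST_RATE=10 NUM_PROMPTS=1000
kubectl rollout status deployment/benchmark-tool   # wait for the updated pod
```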
You need to re-deploy the benchmark tool. Note that further orchestration of the tool, such as automating multiple runs, is out of scope for this PR, though we can revisit.
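Re-deploying could look roughly like this; the manifest path is illustrative.

```sh
# Start another run by re-creating the benchmark Deployment.
# The manifest path is illustrative.
kubectl delete -f benchmark/benchmark.yaml --ignore-not-found
kubectl apply -f benchmark/benchmark.yaml
```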
Updated README
Benchmark works for me! Thanks! Some smaller items to consider:
I've taken the files and run them. Seems to work. Left some comments. Thanks!
```
@@ -0,0 +1,60 @@
apiVersion: apps/v1
kind: Deployment
```
> I think the pod/job/files stay around after it completes, so we should still be able to download the results?
Thanks! Updated as you suggested.
/approve

One nit to fix the link. However, this is still not very practical; we need a published Helm chart so that users can run the benchmark without needing to fork.
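A published chart would reduce the setup to something like the following sketch. The repo URL, chart name, and value key are hypothetical; no such chart exists yet (see the follow-up below).

```sh
# Hypothetical chart coordinates; no published chart exists yet (see #528 below).
helm repo add inference-extension https://example.com/charts
helm install benchmark inference-extension/benchmark \
  --set benchmark.requestRate=10   # hypothetical value key
```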
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahg-g, liu-cong

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment.
Makes sense, created #528 to follow up on this.
/unhold
/lgtm |