[SGLang Workflow] Upload benchmark results to AWS S3 #69
Conversation
echo "⚠️ No benchmark results found in ${BENCHMARK_RESULTS}" >> $GITHUB_STEP_SUMMARY | ||
fi | ||
python3 .github/scripts/upload_benchmark_results.py \ |
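For context, the step this diff touches might look roughly like the sketch below. Only the three lines shown in the diff are from the PR; the step name, the existence check, and the value of BENCHMARK_RESULTS are assumptions, and the upload script's arguments are elided in the diff so none are shown.

```yaml
# Rough reconstruction of the upload step; everything beyond the three diff lines is assumed.
- name: Upload benchmark results
  env:
    BENCHMARK_RESULTS: benchmark_results   # assumed directory name
  run: |
    if [ ! -d "${BENCHMARK_RESULTS}" ]; then
      echo "⚠️ No benchmark results found in ${BENCHMARK_RESULTS}" >> $GITHUB_STEP_SUMMARY
    fi
    # The script's arguments are elided in the diff, so none are shown here
    python3 .github/scripts/upload_benchmark_results.py
```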
I notice that this workflow doesn't have a schedule yet. Are you planning to add one? I'm thinking daily, biweekly, or weekly, depending on how frequently we need to look at SGLang results.
Thanks for pointing it out. I have added it as weekly for now, as we discussed initially. SGLang releases new stable versions monthly, but their pre-release versions mostly come out every week, so I think weekly is fine for now. In the future, we can make it less or more frequent based on feedback.
I think weekly is a good starting point
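For reference, a weekly trigger in the workflow could look something like the sketch below; the exact day and time are assumptions, not necessarily what was committed in this PR.

```yaml
# Illustrative weekly trigger; the actual cron expression in the PR may differ.
on:
  schedule:
    - cron: "0 0 * * 1"   # Mondays at 00:00 UTC in this example
  workflow_dispatch: {}    # keep manual runs possible (assumption)
```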
"dataset_path": "./ShareGPT_V3_unfiltered_cleaned_split.json", | ||
"num_prompts": 200 | ||
} | ||
}, |
These changes upload the SGLang benchmark results to the AWS S3 bucket, which in turn triggers the AWS Lambda function that loads them into the ClickHouse database, from where they are eventually rendered in the HUD dashboard.

For SGLang benchmarking, the `vllm bench serve` command currently generates a `test.pytorch.json` file containing the PyTorch-formatted benchmark results, but it hardcodes the benchmark name to `vllm benchmark`. Because of that, every SGLang benchmarking run was getting the same benchmark name. This caused issues in the HUD dashboard, where the ClickHouse SQL queries look for the benchmark name `SGLang benchmark`, so the results came back as an empty array. To fix this, a check was added in the bash script to replace the benchmark name with the correct one, since the original implementation lives in the vLLM code repository.
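For illustration, the kind of check described above could look something like the sketch below. The file name and the two benchmark names come from this description; the step layout and the use of `sed` are assumptions rather than the actual change.

```yaml
# Illustrative only: rewrite the hardcoded benchmark name in the results file
# before uploading; the sed-based approach here is an assumption.
- name: Fix benchmark name in results
  run: |
    if [ -f test.pytorch.json ]; then
      # Replace the name hardcoded by `vllm bench serve` with the one the
      # ClickHouse queries expect (values taken from the PR description).
      sed -i 's/vllm benchmark/SGLang benchmark/g' test.pytorch.json
    fi
```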
Testing:
Verified from the GitHub workflow that the results are uploaded correctly to AWS S3.
Verified from the flambeau dashboard that the results are uploaded to the ClickHouse database.