Target Determinator Indexer Workflow #118824
Conversation
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/118824. Note: links to docs will display an error until the docs builds have completed. ✅ No failures as of commit 781075a with merge base 55483fc. (This comment was automatically generated by Dr. CI and updates every 15 minutes.)
```shell
cd "${GITHUB_WORKSPACE}"/llm-target-determinator/assets
zip -r indexer-files.zip indexer-files
aws s3 cp \
```
Just a note: it makes sense to add a timestamp to the indexer zip file name here. We could also set up a retention rule of N days on S3 to clean up old files. This could be done later, though.
Good point!
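A minimal sketch of the timestamping idea from the comment above; the exact file-name pattern is a hypothetical choice, not what the PR implements:

```shell
# Embed a UTC timestamp in the archive name so successive index uploads
# don't overwrite each other and old copies can age out via an S3
# lifecycle (retention) rule.
STAMP="$(date -u +%Y%m%dT%H%M%SZ)"
ZIPNAME="indexer-files-${STAMP}.zip"
echo "${ZIPNAME}"
```

With unique names per run, the cleanup side becomes a one-time S3 lifecycle rule expiring objects after N days, rather than logic in the workflow itself.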
```yaml
jobs:
  index:
    runs-on: linux.g5.4xlarge.nvidia.gpu # 1 GPU A10G 24GB each
```
I think you will need to call setup-linux (https://github.com/pytorch/pytorch/blob/main/.github/workflows/_linux-test.yml#L71-L72) and install the NVIDIA driver (https://github.com/pytorch/pytorch/blob/main/.github/workflows/_linux-test.yml#L94-L97) here. If this job happens to run on a newly launched g5 runner, it would not have the CUDA driver IIRC, so the indexing step would fail.
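A hedged sketch of what the suggested setup steps might look like in this job. The action paths below are assumptions modeled on what `_linux-test.yml` references, not verified against this PR:

```yaml
jobs:
  index:
    runs-on: linux.g5.4xlarge.nvidia.gpu # 1 GPU A10G 24GB each
    steps:
      # Assumed action locations; check _linux-test.yml for the real ones.
      - name: Setup Linux
        uses: pytorch/test-infra/.github/actions/setup-linux@main
      - name: Install NVIDIA driver
        uses: pytorch/test-infra/.github/actions/setup-nvidia@main
      # ... checkout, dependency install, and indexing steps follow ...
```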
Hmm that's strange, looks like indexing passed in this job on GPU: https://github.com/pytorch/pytorch/actions/runs/7849903772/job/21424841173?pr=118824. Will investigate in a follow-up.
Oh, the runner is reusable: as long as the indexing job is not the first one to run on it, the driver will be there. However, my understanding might be outdated; maybe we already include the CUDA driver in the image (although I doubt that).
Overall LGTM! There is a small bug w.r.t NVIDIA driver installation, but that's an easy fix.
@pytorchbot merge

Merge started. Your change will be merged once all checks pass (ETA 0-4 hours). Learn more about merging in the wiki. Questions? Feedback? Please reach out to the PyTorch DevX Team.
As described in this talk and this repo, we are experimenting with using CodeLlama-powered information retrieval for target determination.
The idea is that we create embeddings for PyTorch test functions, and store this index in S3. Then when a new PR comes in, we create embedding(s) for that PR, compare them to the index of test embeddings, and run only the most relevant tests.
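The retrieval step described above can be sketched as a nearest-neighbor lookup over embeddings. This is an illustrative sketch only; `top_k_tests` and its signature are hypothetical, not the actual llm-target-determinator API:

```python
import numpy as np

def top_k_tests(pr_embedding, test_embeddings, test_names, k=3):
    """Rank test functions by cosine similarity to a PR embedding.

    pr_embedding:    1-D array, the embedding of the incoming PR.
    test_embeddings: 2-D array, one row per indexed test function.
    test_names:      names corresponding to the rows of test_embeddings.
    """
    # Normalize both sides so a plain dot product is cosine similarity.
    pr = pr_embedding / np.linalg.norm(pr_embedding)
    tests = test_embeddings / np.linalg.norm(test_embeddings, axis=1, keepdims=True)
    scores = tests @ pr
    # Highest-scoring tests first; keep only the top k.
    order = np.argsort(scores)[::-1][:k]
    return [(test_names[i], float(scores[i])) for i in order]
```

In the real system the test-function embeddings would be the index loaded from S3, and only the top-ranked tests would be scheduled for the PR.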
This PR creates a workflow that does the indexing part (creating embeddings for test functions and storing them in S3). All the logic for running the indexer is in osalpekar/llm-target-determinator. This workflow just checks out the relevant repos, installs the dependencies, runs the torchrun command to trigger indexing, and uploads the artifacts to S3.