
Conversation

@osalpekar (Member) commented Feb 1, 2024

As described in this talk and this repo, we are experimenting with using CodeLlama-powered information retrieval for target determination.

The idea is that we create embeddings for PyTorch test functions, and store this index in S3. Then when a new PR comes in, we create embedding(s) for that PR, compare them to the index of test embeddings, and run only the most relevant tests.
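
To make the retrieval step concrete, here is a minimal sketch of the query-time ranking, assuming the index is a tensor of test-function embeddings plus a parallel list of test names. The encoder call (embed_text) is a hypothetical stand-in, not the actual llm-target-determinator API:

import torch
import torch.nn.functional as F

def rank_tests(pr_diff, test_embeddings, test_names, embed_text, top_k=50):
    """Return the names of the top_k test functions most similar to the PR diff."""
    query = embed_text(pr_diff)                        # 1-D embedding of the PR diff
    # Cosine similarity between the PR embedding and every test embedding.
    scores = F.cosine_similarity(query.unsqueeze(0), test_embeddings, dim=1)
    top = torch.topk(scores, k=min(top_k, len(test_names)))
    return [test_names[i] for i in top.indices.tolist()]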

This PR creates a workflow that does the indexing part (creating embeddings for test functions and storing them in S3). All the logic for running the indexer lives in osalpekar/llm-target-determinator; this workflow just checks out the relevant repos, installs the dependencies, runs the torchrun command to trigger indexing, and uploads the artifacts to S3.
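
For reference, a rough sketch of what the indexing job conceptually produces: one embedding per test function plus the test names, saved as an artifact the workflow can zip and upload to S3. collect_test_functions and embed_text are hypothetical stand-ins for the real indexer logic in osalpekar/llm-target-determinator:

import torch

def build_index(embed_text, collect_test_functions, out_path="index.pt"):
    names, vectors = [], []
    for name, source in collect_test_functions():   # e.g. ("test_foo", "def test_foo(): ...")
        names.append(name)
        vectors.append(embed_text(source))           # one embedding per test function
    torch.save({"names": names, "embeddings": torch.stack(vectors)}, out_path)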

@pytorch-bot added the topic: not user facing label Feb 1, 2024

pytorch-bot commented Feb 1, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/118824

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 781075a with merge base 55483fc:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Between February 1 and February 9, 2024, @osalpekar and @clee2000 made repeated GitHub Actions deployments to target-determinator-env; most attempts failed or errored before the later deployments went through (now inactive).
cd "${GITHUB_WORKSPACE}"/llm-target-determinator/assets

zip -r indexer-files.zip indexer-files
aws s3 cp \
A contributor commented:

Just a note: it makes sense to add a timestamp to the indexer zip file name here. We could also set up an S3 retention rule of N days to clean up old files. This could be done later, though.
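
As a rough sketch of that suggestion (the bucket name, prefix, and retention period below are placeholders, not the workflow's actual configuration):

import datetime
import boto3

s3 = boto3.client("s3")

# Timestamped object key so successive index uploads do not overwrite each other.
stamp = datetime.datetime.now(datetime.timezone.utc).strftime("%Y%m%dT%H%M%SZ")
s3.upload_file("indexer-files.zip", "example-td-bucket", f"indexer-files/indexer-files-{stamp}.zip")

# Optional: expire old index archives after N days via a lifecycle rule.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-td-bucket",
    LifecycleConfiguration={
        "Rules": [{
            "ID": "expire-old-indexes",
            "Filter": {"Prefix": "indexer-files/"},
            "Status": "Enabled",
            "Expiration": {"Days": 30},
        }]
    },
)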

Member Author replied:

Good point!


jobs:
  index:
    runs-on: linux.g5.4xlarge.nvidia.gpu # 1 GPU A10G 24GB each
@huydhn (Contributor) commented Feb 13, 2024:

I think you will need to call setup-linux https://github.com/pytorch/pytorch/blob/main/.github/workflows/_linux-test.yml#L71-L72 and install the NVIDIA driver https://github.com/pytorch/pytorch/blob/main/.github/workflows/_linux-test.yml#L94-L97 here. If this job happens to run on a newly launched g5 runner, it would not have the CUDA driver IIRC, so the indexing step would fail.

Member Author replied:

Hmm that's strange, looks like indexing passed in this job on GPU: https://github.com/pytorch/pytorch/actions/runs/7849903772/job/21424841173?pr=118824. Will investigate in a follow-up.

@huydhn (Contributor) replied Feb 14, 2024:

Oh, the runner is reusable; as long as the indexing job is not the first one to run on it, the driver will be there. However, my understanding might be outdated; maybe we have already included the CUDA driver in the image (although I doubt that).
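
For context, on a runner without the driver the indexing step would fail with a CUDA initialization error partway through; a hypothetical fail-fast guard for the indexer entry point (not part of the actual script) could surface that earlier:

import sys
import torch

def require_cuda():
    # On a runner without a working NVIDIA driver, torch.cuda.is_available() is False.
    if not torch.cuda.is_available():
        sys.exit("No usable CUDA device found; was the NVIDIA driver installed on this runner?")

if __name__ == "__main__":
    require_cuda()
    print(f"Indexing on {torch.cuda.get_device_name(0)}")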

@huydhn (Contributor) left a review comment:

Overall LGTM! There is a small bug w.r.t NVIDIA driver installation, but that's an easy fix.

@osalpekar (Member Author) commented:

@pytorchbot merge

@pytorch-bot added the ciflow/trunk label Feb 14, 2024
@pytorchmergebot (Collaborator) commented:

Merge started

Your change will be merged once all checks pass (ETA 0-4 Hours).

Learn more about merging in the wiki.

Questions? Feedback? Please reach out to the PyTorch DevX Team

Advanced Debugging: check the merge workflow status here.


Labels: ciflow/trunk, Merged, topic: not user facing

5 participants