This issue tracks some recent cases of wrong flaky detection, most likely caused by wrong or overly generic failures extracted by log classification.
Fixes #4741
This strengthens Dr.CI flaky classification in the case of the
generic GHA `Process completed with exit code 1` failure by comparing
the failure context, specifically the last command executed, in addition
to the failure itself. The error string alone carries no information in
this case. The failure context has been gathered for a while and is stored
in Rockset under `job.torchci_classification.context`. It is now time to
start using it. The context is a list of the last N commands executed,
traced backward from where the failure occurs, for example:
```
[
"+ python test/run_test.py --exclude-jit-executor --exclude-distributed-tests --shard 1 5 --verbose",
"+ [[ -z 5 ]]",
"+ test_python_shard 1",
"+ '[' -n '' ']'",
"+ pip install --progress-bar off --no-use-pep517 --user git+https://github.com/pytorch/vision.git@893b4abdc0c9df36c241c58769810f69e35dab48",
"+ pip_install --no-use-pep517 --user git+https://github.com/pytorch/vision.git@893b4abdc0c9df36c241c58769810f69e35dab48",
"+ '[' -n '' ']'",
"+ orig_preload=",
"+ commit=893b4abdc0c9df36c241c58769810f69e35dab48",
"++ cat .github/ci_commit_pins/vision.txt",
"++ get_pinned_commit vision",
"+ local commit",
]
```
This change extracts and compares the last command (the first entry of
the context), i.e. `+ python test/run_test.py --exclude-jit-executor
--exclude-distributed-tests --shard 1 5 --verbose`, in addition to the
job name and the failure string.
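
The comparison described above can be sketched roughly as follows. This is a minimal Python illustration, not the actual Dr.CI code: the `Failure` class, its field names, and the helpers `last_command` and `is_same_failure` are hypothetical names introduced here. It assumes the context list is ordered as in the example, with the most recently executed command first.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Failure:
    # Hypothetical record; the field names are assumptions for illustration.
    job_name: str
    failure_line: str
    # Last N commands, traced backward from the failure (most recent first).
    context: List[str] = field(default_factory=list)


def last_command(context: List[str]) -> str:
    """Return the last command executed, i.e. the first context entry."""
    return context[0].strip() if context else ""


def is_same_failure(a: Failure, b: Failure) -> bool:
    """Match on job name and failure string, then also on the last command."""
    if a.job_name != b.job_name or a.failure_line != b.failure_line:
        return False
    # A generic failure string such as "Process completed with exit code 1"
    # says nothing by itself, so the last command must also agree.
    return last_command(a.context) == last_command(b.context)
```

With this check, two jobs that both end in `Process completed with exit code 1` are treated as the same failure only if they also failed while running the same command.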
### Testing
Try this out on pytorch/pytorch#112504, which has failures:
```
curl --request POST \
--url "http://localhost:3000/api/drci/drci?prNumber=112504" \
--header "Authorization: TOKEN" \
--data 'repo=pytorch'
```