Description
_test_pat_against_repo in gittensor/validator/pat_handler.py collapses every non-200 status and every requests.RequestException into a single permanent error return. Callers (handle_pat_check, handle_pat_broadcast) then treat any error as a permanent invalid-PAT result.
This is the same asymmetry that #1106 documented for the /user (identity) side — fixed by #1107 — but on the sibling GraphQL test-query path. #1106 explicitly scoped itself to /user and did not cover the test-query call, so the gap remains: a GitHub 5xx burst, DNS hiccup, connection reset, or read timeout during the test query is reported to miners as an invalid PAT.
After #1107:
- Identity-check transient failure →
pat_valid = None, retry-oriented message. ✅
- Test-query transient failure →
pat_valid = False, "PAT test query failed: ..." message. ❌
In gitt miner check, the second branch tells the miner their PAT is invalid during a temporary GitHub problem — prompting them to rotate a working PAT.
Steps to Reproduce
- Patch
requests.post (the one inside gittensor.validator.pat_handler) to return a response with status_code=503.
- Store a valid PAT entry for
(uid=1, hotkey='hotkey_1', github_id='github_42') via pat_storage.save_pat(...).
- Call
handle_pat_check with a PatCheckSynapse whose dendrite.hotkey == 'hotkey_1'.
- Observe:
synapse.has_pat is True
synapse.pat_valid is False # ← should be None (inconclusive)
synapse.rejection_reason == 'PAT test query failed: GitHub GraphQL API returned 503'
Same shape applies for requests.Timeout, requests.ConnectionError, and HTTP 500/502/504.
For the broadcast path, the analogous reproduction patches requests.post to raise/return-5xx and calls handle_pat_broadcast with a valid PAT; the broadcast is rejected with "PAT test query failed: ..." even though the PAT itself was never actually evaluated against the GraphQL response.
Expected Behavior
Mirror the convention #1107 established for the identity path. Transient transport failures should be surfaced as inconclusive — not as permanent invalid-PAT results.
handle_pat_check on a transient test-query failure:
synapse.has_pat = True
synapse.pat_valid = None
synapse.rejection_reason = 'GitHub API temporarily unavailable; retry the check in a few minutes.'
pat_valid=None already fits the CLI's _pat_check_row_category() mapping → no_response, not invalid_pat.
handle_pat_broadcast on a transient test-query failure: reject with a retry-oriented message that does not imply the PAT is invalid; do not store the PAT (the existing save_pat call is already gated on a successful test query, so no storage-side behavior change is needed — only the rejection message).
Classification suggestion:
| Outcome |
error |
transient_failure |
| 200, viewer present |
None |
False |
| HTTP 5xx (500-504) |
set |
True |
requests.RequestException (timeout, ConnectionError, DNS) |
set |
True |
| HTTP 4xx |
set |
False |
200 with errors[0] |
set |
False |
200 with data.viewer == null (scope-less fine-grained PAT) |
set |
False |
Actual Behavior
_test_pat_against_repo returns Optional[str] — None on success, error string otherwise — with no transient distinction. Callers collapse every non-None return into:
handle_pat_check: synapse.pat_valid = False, synapse.rejection_reason = f'PAT test query failed: {test_error}'
handle_pat_broadcast: rejects with f'PAT test query failed: {test_error}'
Result: during a GitHub partial outage, miners running gitt miner check see their stored PAT marked invalid; miners running gitt miner post to broadcast a fresh PAT see the broadcast rejected with a message that points the finger at the PAT.
Environment
- OS: any (issue is in validator-side handler logic, OS-independent)
- Python version: matches project (3.10+)
- Commit/Version: reproducible against
test at the current HEAD (gittensor/validator/pat_handler.py:170-193)
Additional Context
- Identity-side equivalent: #1106 (closed by #1107).
- Recent unrelated refactor on the same function: #1164 (consolidated header construction; did not touch error-classification semantics).
- Suggested fix mirrors
GitHubCredentialValidation shape: return a PatTestResult dataclass with error: Optional[str] and transient_failure: bool so callers can branch the same way they do for validation.transient_failure after #1107.
- Economic impact is UX rather than emissions — miners with already-stored PATs continue earning regardless of
gitt miner check output. The harm is misleading miners into rotating working PATs and rejecting fresh broadcasts during transient outages.
Description
_test_pat_against_repoingittensor/validator/pat_handler.pycollapses every non-200 status and everyrequests.RequestExceptioninto a single permanent error return. Callers (handle_pat_check,handle_pat_broadcast) then treat any error as a permanent invalid-PAT result.This is the same asymmetry that #1106 documented for the
/user(identity) side — fixed by #1107 — but on the sibling GraphQL test-query path. #1106 explicitly scoped itself to/userand did not cover the test-query call, so the gap remains: a GitHub 5xx burst, DNS hiccup, connection reset, or read timeout during the test query is reported to miners as an invalid PAT.After #1107:
pat_valid = None, retry-oriented message. ✅pat_valid = False,"PAT test query failed: ..."message. ❌In
gitt miner check, the second branch tells the miner their PAT is invalid during a temporary GitHub problem — prompting them to rotate a working PAT.Steps to Reproduce
requests.post(the one insidegittensor.validator.pat_handler) to return a response withstatus_code=503.(uid=1, hotkey='hotkey_1', github_id='github_42')viapat_storage.save_pat(...).handle_pat_checkwith aPatCheckSynapsewhosedendrite.hotkey == 'hotkey_1'.Same shape applies for
requests.Timeout,requests.ConnectionError, and HTTP 500/502/504.For the broadcast path, the analogous reproduction patches
requests.postto raise/return-5xx and callshandle_pat_broadcastwith a valid PAT; the broadcast is rejected with"PAT test query failed: ..."even though the PAT itself was never actually evaluated against the GraphQL response.Expected Behavior
Mirror the convention #1107 established for the identity path. Transient transport failures should be surfaced as inconclusive — not as permanent invalid-PAT results.
handle_pat_checkon a transient test-query failure:pat_valid=Nonealready fits the CLI's_pat_check_row_category()mapping →no_response, notinvalid_pat.handle_pat_broadcaston a transient test-query failure: reject with a retry-oriented message that does not imply the PAT is invalid; do not store the PAT (the existingsave_patcall is already gated on a successful test query, so no storage-side behavior change is needed — only the rejection message).Classification suggestion:
errortransient_failureNoneFalseTruerequests.RequestException(timeout, ConnectionError, DNS)TrueFalseerrors[0]Falsedata.viewer == null(scope-less fine-grained PAT)FalseActual Behavior
_test_pat_against_reporeturnsOptional[str]—Noneon success, error string otherwise — with no transient distinction. Callers collapse every non-Nonereturn into:handle_pat_check:synapse.pat_valid = False,synapse.rejection_reason = f'PAT test query failed: {test_error}'handle_pat_broadcast: rejects withf'PAT test query failed: {test_error}'Result: during a GitHub partial outage, miners running
gitt miner checksee their stored PAT marked invalid; miners runninggitt miner postto broadcast a fresh PAT see the broadcast rejected with a message that points the finger at the PAT.Environment
testat the currentHEAD(gittensor/validator/pat_handler.py:170-193)Additional Context
GitHubCredentialValidationshape: return aPatTestResultdataclass witherror: Optional[str]andtransient_failure: boolso callers can branch the same way they do forvalidation.transient_failureafter #1107.gitt miner checkoutput. The harm is misleading miners into rotating working PATs and rejecting fresh broadcasts during transient outages.