feat: add scheduled issue labeler for type/domain triage #251
fengzhangchi-bytedance merged 12 commits into larksuite:main
Add a manual GitHub Actions workflow and script to poll issues and apply type/domain labels.
Improve domain detection and add safeguards to avoid overriding manual type triage by default. Refresh regression samples from real issues and document usage.
Run hourly on schedule with write mode by default while keeping manual dispatch dry-run by default.
Reduce scheduled scan window while keeping overlap for missed runs.
Set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24 and use Node 24 for the script runtime to avoid upcoming Node 20 deprecation warnings.
Allow workflow_dispatch to override lookback_hours while keeping hourly schedule fixed.
Use actions/checkout@v6 and actions/setup-node@v6 to align with Node 24 runtime and avoid Node 20 deprecation warnings.
📝 Walkthrough
Adds a scheduled/manual GitHub Actions workflow and a new Node.js issue-labeler with heuristics for assigning single-select Type and multi-select Domain labels, plus sample data, a test runner, and README documentation.
Sequence Diagram
sequenceDiagram
actor Trigger as Cron / Manual
participant Workflow as "GitHub Actions"
participant Script as "Node.js Labeler"
participant API as "GitHub REST API"
participant Repo as "GitHub Repo"
Trigger->>Workflow: start workflow
Workflow->>Workflow: evaluate inputs (dry_run, max-issues)
Workflow->>Script: run `node scripts/issue-labels/index.js --max-issues 300 [--dry-run --json]`
Script->>API: search/list unlabeled issues (paged)
API-->>Script: issues[]
loop per issue
Script->>Script: classifyIssueText(title, body)
Script->>Script: collectDomainsFromText(...)
Script->>Script: planIssueLabelChanges(...)
end
alt dry-run
Script-->>Workflow: emit JSON report / summary
else apply changes
Script->>API: add labels to issue
API-->>Script: add responses
Script->>API: remove labels from issue (handle 404)
API-->>Script: remove responses
end
Script-->>Workflow: final summary (scanned, updated, skipped)
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks: ✅ 3 passed
Greptile Summary
This PR adds an automated, hourly issue-labeling workflow that classifies unlabeled GitHub issues into Type and Domain labels. All concerns from the prior review thread have been resolved in the current code.
Remaining findings are all P2.
Confidence Score: 5/5
Safe to merge; all previously flagged P0/P1 issues are resolved and only P2 style/heuristic suggestions remain.
The three critical issues from the prior review thread (shell injection, secondary rate-limit back-off, parseArgs ReferenceError) are all fixed in the current diff. The missing-label pre-flight guard prevents 422 errors. The remaining findings are heuristic tuning suggestions and an action-pinning best practice; none block correct execution or cause data loss.
scripts/issue-labels/index.js: the loose title-match exclusion list and the broad config keyword pattern are worth a second look before the labeler runs on a large backlog.
Sequence Diagram
sequenceDiagram
participant GHA as GitHub Actions (hourly / dispatch)
participant Script as index.js
participant GH_API as GitHub REST API
GHA->>Script: node index.js --max-issues 300 [--dry-run --json]
Script->>GH_API: GET /repos/{repo}/labels (paginated)
GH_API-->>Script: existing labels
Script->>GH_API: GET /search/issues?q=repo:X is:issue no:label (paginated)
GH_API-->>Script: up to 300 unlabeled issues
loop For each issue
Script->>Script: classifyIssueText(title, body)
Script->>Script: planIssueLabelChanges()
alt Labels missing in repo
Script->>Script: skip + warn
else dry-run mode
Script->>Script: log planned changes only
else write mode
Script->>GH_API: POST /issues/{n}/labels
Script->>GH_API: DELETE /issues/{n}/labels/{name}
end
end
Script->>GHA: stdout summary (text or JSON)
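The per-issue branch in the diagram (skip on missing labels, log only in dry-run, otherwise add then remove) might look roughly like the sketch below. This is not the code from `index.js`: the function name `processIssue`, the `plan.add` / `plan.remove` shape, and the injected `api` object are all assumptions made so the example runs without network access.

```javascript
// Hypothetical sketch of the per-issue branch shown in the diagram.
// `repoLabels` is a Set of label names that exist in the repository;
// `api` is an injected object with addLabels/removeLabel (assumed names).
async function processIssue(issue, plan, repoLabels, api, { dryRun }) {
  const missing = plan.add.filter((name) => !repoLabels.has(name));
  if (missing.length > 0) {
    // Pre-flight guard: skip instead of asking the API for unknown labels.
    return { number: issue.number, status: "skipped", missing };
  }
  if (plan.add.length === 0 && plan.remove.length === 0) {
    return { number: issue.number, status: "unchanged" };
  }
  if (dryRun) {
    // Dry-run: report the plan, touch nothing.
    return { number: issue.number, status: "planned", ...plan };
  }
  if (plan.add.length > 0) {
    await api.addLabels(issue.number, plan.add);
  }
  for (const name of plan.remove) {
    try {
      await api.removeLabel(issue.number, name);
    } catch (err) {
      // A 404 means the label is already gone; anything else is rethrown.
      if (err.status !== 404) throw err;
    }
  }
  return { number: issue.number, status: "updated", ...plan };
}
```

Injecting `api` keeps the decision logic testable on its own; a real implementation would presumably wrap the REST calls listed in the diagram.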
Reviews (6): Last reviewed commit: "docs(issue-labels): clarify one-shot unl..."
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.github/workflows/issue-labels.yml:
- Around line 39-42: The job "Sync managed issue labels" (id: sync_issue_labels)
currently sets continue-on-error: true which hides failures; remove that key (or
set it to false) so the scheduled write-mode job surfaces errors instead of
appearing successful, ensuring failed label syncs are visible and can be acted
on.
In `@scripts/issue-labels/index.js`:
- Around line 151-155: The bug keyword matcher array (the "bug" entry in
scripts/issue-labels/index.js) is missing the Chinese term "错误", so add it to
that matcher list (e.g., include a /\b错误\b/ or plain /错误/ pattern alongside the
existing Chinese terms) so the Chinese sample (used by node
scripts/issue-labels/test.js, e.g. sample `#185`) is correctly classified as a
bug; update the "bug" array in scripts/issue-labels/index.js accordingly.
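One caveat when choosing between the two suggested patterns: in JavaScript regular expressions `\b` is defined in terms of ASCII word characters, so it does not mark a boundary between CJK characters. The short check below (the issue title is a made-up example, not sample `#185`) suggests the plain `/错误/` form is the one that would actually match.

```javascript
// \b only sees [A-Za-z0-9_] as word characters. CJK characters count as
// non-word, so there is no boundary before or after "错误".
const withBoundary = /\b错误\b/;
const plain = /错误/;

const title = "执行命令时出现错误"; // hypothetical issue title

console.log(withBoundary.test(title)); // false
console.log(withBoundary.test("错误")); // false, even for the bare word
console.log(plain.test(title)); // true
```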
In `@scripts/issue-labels/README.md`:
- Around line 28-30: The README wording for the --sync-domains flag overstates
behavior; update the two lines describing default behavior and --sync-domains to
clarify it is not an exact-sync mode by stating that the script only removes
existing domain/* labels when at least one domain in the issue matches (i.e., it
will not remove all stale domain labels if none of the configured domains
match), and add a short note warning operators that --sync-domains may remove
some manual labels but will not guarantee full cleanup of all stale domain/*
labels.
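The behavior this comment describes can be illustrated with a small sketch. It is an interpretation of the review text, not the script's actual code; `planDomainLabels` and the `syncDomains` option name are hypothetical.

```javascript
// Hypothetical sketch: stale domain/* labels are removed only when sync mode
// is on AND at least one domain was detected in the issue text.
function planDomainLabels(existingLabels, detectedDomains, { syncDomains = false } = {}) {
  const existingDomain = existingLabels.filter((l) => l.startsWith("domain/"));
  const wanted = detectedDomains.map((d) => `domain/${d}`);
  const add = wanted.filter((l) => !existingLabels.includes(l));
  const remove =
    syncDomains && wanted.length > 0
      ? existingDomain.filter((l) => !wanted.includes(l))
      : [];
  return { add, remove };
}

// Default mode: add-only, the manual domain/im label survives.
console.log(planDomainLabels(["domain/im"], ["mail"]));
// Sync mode with a match: domain/im is treated as stale and removed.
console.log(planDomainLabels(["domain/im"], ["mail"], { syncDomains: true }));
// Sync mode with no match: nothing is removed, so stale labels can remain.
console.log(planDomainLabels(["domain/im"], [], { syncDomains: true }));
```

The third call is the case the README wording should warn about: sync mode is not a guarantee of full cleanup.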
In `@scripts/issue-labels/test.js`:
- Around line 28-32: The test currently treats only undefined expectations as
assertions; change the logic to treat explicit expectations (including null for
expected_type and empty array for expected_domains) as real assertions by
checking whether the sample actually contains the keys rather than only whether
their values are undefined. For expected_type, use a presence check (e.g.,
sample.hasOwnProperty('expected_type')) and if present assert (result.type ||
null) === sample.expected_type (so explicit null requires result.type to be
null); for expected_domains, check presence of sample.expected_domains and, if
present, assert equality/emptiness by comparing sorted arrays (using sortArray)
or using isSubset appropriately so that an explicit [] requires result.domains
to be empty; update the matchType and matchDomains logic to follow these
presence-based assertions (referencing expectedType, matchType, expectedDomains,
matchDomains, sample.expected_type, sample.expected_domains, result.type,
result.domains, sortArray, isSubset).
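A minimal sketch of the presence-based check, assuming exact comparison of sorted arrays for domains (the comment also allows an `isSubset` variant, which is not shown). `checkSample` is a hypothetical name, and this `sortArray` is a stand-in for the helper the comment refers to.

```javascript
// Stand-in for the test runner's sortArray helper (assumed behavior).
function sortArray(values) {
  return [...values].sort();
}

// Hypothetical sketch: a key that is present is always asserted, even when
// its value is null or [].
function checkSample(sample, result) {
  const has = (key) => Object.prototype.hasOwnProperty.call(sample, key);

  // Explicit null means "this issue must get no type label".
  const matchType =
    !has("expected_type") || (result.type || null) === sample.expected_type;

  // Explicit [] means "this issue must get no domain labels".
  const matchDomains =
    !has("expected_domains") ||
    JSON.stringify(sortArray(result.domains || [])) ===
      JSON.stringify(sortArray(sample.expected_domains));

  return { matchType, matchDomains };
}

// A sample that says "no type expected" now fails if a type is assigned.
console.log(checkSample({ expected_type: null }, { type: "bug", domains: [] }));
```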
📒 Files selected for processing (5)
- .github/workflows/issue-labels.yml
- scripts/issue-labels/README.md
- scripts/issue-labels/index.js
- scripts/issue-labels/samples.json
- scripts/issue-labels/test.js
Actionable comments posted: 1
🧹 Nitpick comments (1)
scripts/issue-labels/index.js (1)
25-42: Derive domain regex alternations from `DOMAIN_SERVICES` to prevent drift.
The service alternation is duplicated in regex literals; a future service update can silently desync matching behavior from `DOMAIN_SERVICES`.
♻️ Suggested refactor
 const DOMAIN_SERVICES = [
@@
 ];
+const DOMAIN_ALIASES = ["docs"];
+const DOMAIN_REGEX_ALTERNATION = [...DOMAIN_SERVICES, ...DOMAIN_ALIASES].join("|");
 const DOMAIN_LABELS = DOMAIN_SERVICES.map((s) => `domain/${s}`);
@@
-  const explicit = /\bdomain\/(im|doc|docs|base|sheets|calendar|mail|task|vc|whiteboard|minutes|wiki|event|auth|core)\b/gi;
+  const explicit = new RegExp(`\\bdomain\\/(${DOMAIN_REGEX_ALTERNATION})\\b`, "gi");
@@
-  const cmd = /\blark[-\s]?cli\s+(im|doc|docs|base|sheets|calendar|mail|task|vc|whiteboard|minutes|wiki|event|auth|core)\b/gi;
+  const cmd = new RegExp(`\\blark[-\\s]?cli\\s+(${DOMAIN_REGEX_ALTERNATION})\\b`, "gi");
Also applies to: 144-152
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@scripts/issue-labels/index.js` around lines 25 - 42, The code duplicates the service alternation in regex literals which can drift from DOMAIN_SERVICES; update the regex construction to derive alternations from DOMAIN_SERVICES (e.g., const serviceAlternation = DOMAIN_SERVICES.join('|')) and use that value to build the RegExp instances (or dynamic regex strings) instead of hard-coded alternations; replace the literal alternations found around the regex uses (including the occurrences referenced by DOMAIN_LABELS/DOMAIN_LABEL_SET and the second occurrence at lines ~144-152) with RegExp constructed from serviceAlternation so the domain matching always follows DOMAIN_SERVICES.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/issue-labels/index.js`:
- Around line 589-623: The parsing loop currently consumes argv[++i] for
value-taking flags (e.g., "--token", "--repo", "--max-pages", "--max-issues",
"--state"), which can swallow the next flag; update the logic so before
incrementing you peek at argv[i+1] and only consume it if it exists and does not
start with "-" (treat as missing/keep default or report error otherwise),
reusing toInt for numeric flags (--max-pages, --max-issues) and String() for
string flags; add a small helper or inline check (e.g., next = argv[i+1]; if
(!next || next.startsWith("-")) { /* handle missing */ } else { i++; args.xxx =
... }) to avoid flipping behavior by accidentally ingesting other flags.
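The "peek before consuming" check could be sketched as below. This is an illustration under assumptions, not the script's `parseArgs`: only three flags are handled, the defaults are invented, and `Number.parseInt` stands in for the script's own `toInt` helper. The sketch chooses to report an error on a missing value, which is one of the two options the comment mentions.

```javascript
// Hypothetical sketch: a value-taking flag only consumes argv[i + 1] when
// that entry exists and does not look like another flag.
function parseArgs(argv) {
  const args = { repo: "", maxIssues: 300, dryRun: false }; // assumed defaults
  for (let i = 0; i < argv.length; i += 1) {
    const flag = argv[i];
    if (flag === "--dry-run") {
      args.dryRun = true;
    } else if (flag === "--repo" || flag === "--max-issues") {
      const next = argv[i + 1];
      if (next === undefined || next.startsWith("-")) {
        throw new Error(`Missing value for ${flag}`);
      }
      i += 1; // consume the value only after it has been validated
      if (flag === "--repo") {
        args.repo = String(next);
      } else {
        args.maxIssues = Number.parseInt(next, 10);
      }
    }
  }
  return args;
}

// "--repo --dry-run" now fails loudly instead of setting repo to "--dry-run".
try {
  parseArgs(["--repo", "--dry-run"]);
} catch (err) {
  console.log(err.message); // Missing value for --repo
}
```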
---
Nitpick comments:
In `@scripts/issue-labels/index.js`:
- Around line 25-42: The code duplicates the service alternation in regex
literals which can drift from DOMAIN_SERVICES; update the regex construction to
derive alternations from DOMAIN_SERVICES (e.g., const serviceAlternation =
DOMAIN_SERVICES.join('|')) and use that value to build the RegExp instances (or
dynamic regex strings) instead of hard-coded alternations; replace the literal
alternations found around the regex uses (including the occurrences referenced
by DOMAIN_LABELS/DOMAIN_LABEL_SET and the second occurrence at lines ~144-152)
with RegExp constructed from serviceAlternation so the domain matching always
follows DOMAIN_SERVICES.
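A runnable version of the idea might look like this. The contents of `DOMAIN_SERVICES` are inferred from the alternation quoted in the review (with `docs` treated as an alias, as the suggested refactor does), and `findExplicitDomains` plus the `docs` to `doc` folding are hypothetical.

```javascript
// Assumed service list, inferred from the regex literals quoted in the review.
const DOMAIN_SERVICES = [
  "im", "doc", "base", "sheets", "calendar", "mail", "task", "vc",
  "whiteboard", "minutes", "wiki", "event", "auth", "core",
];
const DOMAIN_ALIASES = ["docs"];

// Single source of truth for every regex that needs the service names.
const serviceAlternation = [...DOMAIN_SERVICES, ...DOMAIN_ALIASES].join("|");
const explicitPattern = new RegExp(`\\bdomain\\/(${serviceAlternation})\\b`, "gi");

// Hypothetical helper: collect explicit domain/<service> mentions in text.
function findExplicitDomains(text) {
  const found = new Set();
  // matchAll requires the g flag and iterates over every occurrence.
  for (const match of text.matchAll(explicitPattern)) {
    const name = match[1].toLowerCase();
    found.add(name === "docs" ? "doc" : name); // fold the alias (assumption)
  }
  return [...found];
}

console.log(findExplicitDomains("see domain/docs and DOMAIN/mail"));
```

The service names here contain no regex metacharacters; if one ever did, it would need escaping before being joined into the pattern.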
📒 Files selected for processing (4)
- .github/workflows/issue-labels.yml
- scripts/issue-labels/README.md
- scripts/issue-labels/index.js
- scripts/issue-labels/test.js
✅ Files skipped from review due to trivial changes (1)
- .github/workflows/issue-labels.yml
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@scripts/issue-labels/index.js`:
- Around line 604-613: The helper function readFlagValue incorrectly references
and mutates the loop variable i which is block-scoped inside the for loop,
causing a ReferenceError; fix this by declaring let i; (or let i = 0;) in the
outer scope before the for (...) so readFlagValue and the loop share the same i
variable, leaving the rest of readFlagValue (checks, throw, i += 1, return
String(value)) and the for loop intact; alternatively, you can refactor
readFlagValue to accept the current index and return the new index/value pair,
but the simplest fix is moving the i declaration outside the for loop so
readFlagValue can access and increment it.
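The alternative mentioned at the end of the comment, passing the index in and returning an index/value pair, avoids sharing a mutable `i` altogether. A minimal sketch under that assumption follows; the return shape `{ value, nextIndex }` and the reduced flag list are illustrative, not taken from the script.

```javascript
// Hypothetical sketch: the helper receives the index explicitly instead of
// reaching for the loop's block-scoped `i`.
function readFlagValue(argv, index) {
  const value = argv[index + 1];
  if (value === undefined || value.startsWith("-")) {
    throw new Error(`Missing value for ${argv[index]}`);
  }
  return { value: String(value), nextIndex: index + 1 };
}

function parseArgs(argv) {
  const args = {};
  for (let i = 0; i < argv.length; i += 1) {
    const flag = argv[i];
    if (flag === "--token" || flag === "--repo" || flag === "--state") {
      const { value, nextIndex } = readFlagValue(argv, i);
      args[flag.slice(2)] = value;
      i = nextIndex; // skip the value that was just consumed
    }
  }
  return args;
}

console.log(parseArgs(["--repo", "larksuite/cli", "--state", "open"]));
```

Compared with hoisting `let i` out of the loop, this keeps the helper free of hidden state at the cost of one extra assignment per flag.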
📒 Files selected for processing (1)
scripts/issue-labels/index.js
🚀 PR Preview Install Guide
🧰 CLI update
npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@e5f2d2f28e47f388ac931c7a1d5481bd1ff749d9
🧩 Skill update
npx skills add williamfzc/cli#main -y -g
Summary
This PR adds an automated issue labeling workflow that polls issues and applies Type and Domain labels based on heuristics.
Changes
.github/workflows/issue-labels.yml
- Runs on `schedule` and writes labels by default
- `workflow_dispatch` with `dry_run` (default: true) and `lookback_hours` override
- `actions/checkout@v6` and `actions/setup-node@v6` for Node 24 runtime alignment
scripts/issue-labels/index.js
- Domain labels as `domain/<service>`; add-only by default, optional strict sync via `--sync-domains`
- `--override-type` to enforce computed type labels
scripts/issue-labels/samples.json
scripts/issue-labels/test.js
scripts/issue-labels/README.md
Safety / Behavior Notes
- Domain labels are add-only unless `--sync-domains` is explicitly enabled.
Test Plan
node scripts/issue-labels/test.js