fix(issue-labels): reduce mislabeling and handle missing labels by fengzhangchi-bytedance · Pull Request #288 · larksuite/cli

fengzhangchi-bytedance · 2026-04-07T07:22:10Z

Summary

Make the issue labeler more conservative to reduce mislabeling, and avoid skipping entire issues when some managed labels are missing in the repo.

Changes

scripts/issue-labels/index.js:55: Rework type classification into a weighted + conservative scoring strategy (min score + min margin). Prefer returning null over applying an ambiguous type label.
scripts/issue-labels/index.js:720: When some managed labels do not exist in the repository, drop only the missing labels and still apply the rest (also record missingLabels in JSON output).
scripts/issue-labels/samples.json:1: Add more real-world issue samples for regression coverage (including labeled/unlabeled and previously disputed cases) and remove duplicate samples for the same source_url.
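
The "min score + min margin" decision described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `scripts/issue-labels/index.js`: the names `chooseTypeFromScores`, `TYPE_MIN_SCORE` and `TYPE_MIN_MARGIN` come from the review walkthrough on this PR, the values 2 and 1 are the ones quoted in the Greptile summary, and the shape of `scores` (a plain object of type name to number) and the function body are assumed.

```javascript
// Illustrative sketch only; not the code from scripts/issue-labels/index.js.
const TYPE_MIN_SCORE = 2;  // value quoted in the review summary
const TYPE_MIN_MARGIN = 1; // value quoted in the review summary

// `scores` is assumed to look like { bug: 3, question: 1 }.
function chooseTypeFromScores(scores) {
  // Rank candidate types from highest to lowest score.
  const ranked = Object.entries(scores).sort((a, b) => b[1] - a[1]);
  if (ranked.length === 0) return null;

  const [topType, topScore] = ranked[0];
  const secondScore = ranked.length > 1 ? ranked[1][1] : 0;

  // Too little evidence overall: prefer no label.
  if (topScore < TYPE_MIN_SCORE) return null;
  // Winner is not clearly ahead of the runner-up: prefer no label.
  if (topScore - secondScore < TYPE_MIN_MARGIN) return null;

  return topType;
}

console.log(chooseTypeFromScores({ bug: 3, question: 1 }));    // bug
console.log(chooseTypeFromScores({ bug: 1, question: 0 }));    // null
console.log(chooseTypeFromScores({ bug: 2, enhancement: 2 })); // null
```

Returning `null` in the last two cases is the conservative choice: an unlabeled issue can still be triaged by hand, while a wrong type label has to be noticed and removed.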

Test Plan

node scripts/issue-labels/test.js

Summary by CodeRabbit

Bug Fixes
- More conservative and accurate issue-type detection with weighted signals and clearer tie/threshold handling.
- Labeling now proceeds when some repository labels are missing: missing adds are skipped (logged and recorded) while other add/remove actions still apply.
Tests
- Updated and expanded sample data to validate revised classification behavior across types and languages.
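
As a rough sketch of the "skip only the missing adds" behavior summarized above: the helper name `splitMissingLabels`, its return shape, and the label names in the usage lines are all hypothetical and chosen for illustration; only the idea of dropping unavailable labels while keeping the rest, and the `missingLabels` field name, come from this PR.

```javascript
// Hypothetical helper; names and return shape are assumptions for illustration.
function splitMissingLabels(toAdd, repoLabelNames) {
  const available = new Set(repoLabelNames);
  return {
    // Labels that exist in the repository and can be applied.
    effectiveToAdd: toAdd.filter((name) => available.has(name)),
    // Labels the labeler wanted but the repository does not define.
    missingLabels: toAdd.filter((name) => !available.has(name)),
  };
}

// Made-up label names: "domain/im" is not defined in this pretend repository.
const plan = splitMissingLabels(["bug", "domain/im"], ["bug", "question"]);
console.log(plan.effectiveToAdd); // [ 'bug' ]
console.log(plan.missingLabels);  // [ 'domain/im' ]
```

The issue still receives `bug`, and the missing name is kept so it can be logged and recorded instead of causing the whole issue to be skipped.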

Make type classification more conservative to avoid incorrect labels, and avoid skipping entire issues when some managed labels are missing.

Add labeled/unlabeled issue examples to cover question/bug/enhancement and domain inference.

Keep one sample per source_url to reduce confusion and maintain stable regression coverage.

CLAassistant · 2026-04-07T07:22:24Z

All committers have signed the CLA.

coderabbitai · 2026-04-07T07:22:28Z

📝 Walkthrough

Replaced simple keyword checks with a weighted, regex-rule scoring classifier for issue types and added conservative decision thresholds. Title heuristics adjusted. Label-application now filters missing repository labels instead of skipping the issue. Samples dataset updated and expanded.

Changes

Cohort / File(s)	Summary
Issue Type Classification & Labeling `scripts/issue-labels/index.js`	Replaced boolean regex increments with weighted `{re, w}` scoring across types; added `TYPE_MIN_SCORE` and `TYPE_MIN_MARGIN` and changed `chooseTypeFromScores()` to return `null` for low/ambiguous scores. Strengthened title heuristics for bug/enhancement/docs. Changed main labeling flow to filter out missing `toAdd` labels (log and record missing labels) while still applying available adds and all removes; adjusted JSON output and skip/hasChange logic.
Test Samples / Expectations `scripts/issue-labels/samples.json`	Updated `expected_type` for multiple existing samples (various null↔bug/enhancement/question/documentation changes), removed one sample `body`, and added five new sample issue entries for classifier validation.
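
The weighted `{re, w}` scoring mentioned in the table can be pictured with a small sketch. The rule table below is invented for illustration (the real patterns and weights in `scripts/issue-labels/index.js` are not shown on this page); `scoreTypeFromText` is the function name that appears in the Greptile flowchart, and its body here is an assumption.

```javascript
// Invented rules for illustration; the real patterns and weights may differ.
const TYPE_RULES = {
  bug: [
    { re: /\b(crash|exception|stack trace)\b/i, w: 2 },
    { re: /\b(error|fails?|broken)\b/i, w: 1 },
  ],
  enhancement: [
    { re: /\bfeature request\b/i, w: 2 },
    { re: /\b(support|add)\b/i, w: 1 },
  ],
  question: [
    { re: /\bhow (do|can|to)\b/i, w: 2 },
    { re: /\?\s*$/m, w: 1 },
  ],
};

// Sum the weight of every rule whose regex matches the text, per type.
function scoreTypeFromText(text) {
  const scores = {};
  for (const [type, rules] of Object.entries(TYPE_RULES)) {
    scores[type] = rules.reduce(
      (sum, { re, w }) => sum + (re.test(text) ? w : 0),
      0
    );
  }
  return scores;
}

console.log(scoreTypeFromText("CLI crash with error on login"));
// { bug: 3, enhancement: 0, question: 0 }
console.log(scoreTypeFromText("How do I add a bot?"));
// { bug: 0, enhancement: 1, question: 3 }
```

Compared with incrementing a counter once per matching keyword, weights let a strong signal (a stack trace, an explicit "feature request") outvote several weak ones.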

Sequence Diagram(s)

sequenceDiagram
  participant Runner as Issue-Labeler Script
  participant Scorer as Scoring Engine
  participant RepoAPI as Repository (labels & issues)
  participant Output as JSON/Logs

  Runner->>Scorer: parse title/body -> compute weighted scores
  Scorer-->>Runner: scores per type
  Runner->>Scorer: chooseTypeFromScores(scores, TYPE_MIN_*)
  Scorer-->>Runner: chosenType or null
  Runner->>RepoAPI: fetch existing labels
  RepoAPI-->>Runner: availableLabels
  Runner->>Runner: compute toAdd/toRemove (filter missing toAdd)
  Runner->>RepoAPI: apply add/remove label operations
  RepoAPI-->>Runner: apply results
  Runner->>Output: log missing labels, record changes (JSON mode)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

feat: add scheduled issue labeler for type/domain triage #251: Directly modifies the same issue-labeler script (scoring rules, thresholds, and missing-label handling), making it highly related.

Poem

🐰 I sniffed the text, then gave it weight,
Rules and thresholds kept confusion straight,
Missing tags I gently skip, not halt the run,
Samples multiplied beneath the sun,
A tiny hop — the classifier's done! 🌿

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly summarizes the main changes: reducing mislabeling via conservative scoring and handling missing labels gracefully.
Description check	✅ Passed	The description covers all required template sections with concrete details about changes and includes a test plan, though it omits the standard checklist format.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

github-actions · 2026-04-07T07:26:49Z

🚀 PR Preview Install Guide

🧰 CLI update

npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@0c3b6b0d6d195a045cb3ce2fe242fe28dcf29dd2

🧩 Skill update

npx skills add fengzhangchi-bytedance/larksuite-cli-fork#fix/issue-labeler-reduce-mislabels -y -g

greptile-apps · 2026-04-07T07:27:38Z

Greptile Summary

This PR makes the issue type classifier more conservative by requiring a minimum score of 2 and a margin of 1 between the top two candidates before applying a type label (preferring null over an ambiguous guess), and changes the missing-label behavior from skipping entire issues to dropping only the unavailable labels and continuing. The samples.json regression suite is also expanded with real-world cases.

The tie-breaker code in chooseTypeFromScores (lines 327–332) is dead when TYPE_MIN_MARGIN >= 1 and can be simplified.
In --process-all mode, records for issues where all desired labels are missing lack skipped/reason metadata, making them structurally inconsistent with the --only-missing path.
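
The dead tie-breaker observation can be checked with a short brute-force sketch. It assumes integer scores and that the score and margin guards run before any tie-breaking; both are assumptions about `index.js`, and `passesGuards` is a hypothetical name.

```javascript
// Values quoted in the summary above.
const TYPE_MIN_SCORE = 2;
const TYPE_MIN_MARGIN = 1;

// Hypothetical stand-in for the two guards in chooseTypeFromScores.
function passesGuards(top, second) {
  return top >= TYPE_MIN_SCORE && top - second >= TYPE_MIN_MARGIN;
}

// Count (top, second) pairs that are tied and still get past the guards.
let tiesReachingTieBreaker = 0;
for (let top = 0; top <= 10; top += 1) {
  for (let second = 0; second <= top; second += 1) {
    if (top === second && passesGuards(top, second)) {
      tiesReachingTieBreaker += 1;
    }
  }
}

console.log(tiesReachingTieBreaker); // 0
```

A tie has a margin of 0, which is always below a margin threshold of 1 or more, so code that only runs on ties after the guards cannot be reached.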

Confidence Score: 5/5

Safe to merge; all remaining findings are non-blocking P2 style issues

The core logic changes are sound — minimum-score and minimum-margin guards reduce mislabeling, and the partial missing-label fallback correctly applies whichever labels are available. Two P2 issues (dead tie-breaker code and missing JSON metadata in --process-all mode) do not affect correctness on the default usage path.

scripts/issue-labels/index.js: dead tie-breaker branches (lines 327–332) and --process-all JSON metadata gap (lines 831–866)

Important Files Changed

Filename	Overview
scripts/issue-labels/index.js	Reworked type scoring (min-score=2, min-margin=1) and graceful missing-label partial-apply; two P2 issues found: dead tie-breaker branches and inconsistent JSON metadata in --process-all mode
scripts/issue-labels/samples.json	Expanded with real-world regression samples including labeled/unlabeled and previously disputed cases; duplicates removed; no issues found

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Issue from searchUnlabeledIssues] --> B[classifyIssueText]
    B --> C[scoreTypeFromText]
    C --> D[chooseTypeFromScores]
    D -- score < MIN_SCORE=2 --> E[type = null]
    D -- margin < MIN_MARGIN=1 --> E
    D -- clear winner --> F[type = winner]
    B --> G[collectDomainsFromText]
    F --> H[planIssueLabelChanges]
    G --> H
    E --> H
    H --> I{managed labels\nmissing from repo?}
    I -- yes --> J[effectiveToAdd = toAdd minus missing]
    I -- no --> K[effectiveToAdd = toAdd]
    J --> L{hasChange?}
    K --> L
    L -- no AND onlyMissing\nAND json AND missingForIssue --> M[emit skipped record\nwith reason + missingLabels]
    L -- no AND onlyMissing --> N[continue silently]
    L -- yes --> O[build record, apply labels]
    O --> P{dryRun?}
    P -- no --> Q[addIssueLabels + removeIssueLabel]
    P -- yes --> R[skip API calls]

Reviews (2): Last reviewed commit: "fix(issue-labels): include missing-label..."

scripts/issue-labels/index.js

Keep stderr and JSON output consistent under --only-missing when desired labels are missing from the repo.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

scripts/issue-labels/index.js (1)

809-829: ⚠️ Potential issue | 🟡 Minor

skippedIssue counter not incremented when --json is false.

When args.json is false, issues skipped due to missing managed labels are not counted in results.skippedIssue. The warning is logged at Line 803, but the final summary at Line 872 will undercount skipped issues.

🐛 Proposed fix

     if (args.onlyMissing && !hasChange) {
+      if (missingForIssue.length > 0) {
+        results.skippedIssue += 1;
+      }
       if (args.json && missingForIssue.length > 0) {
-        results.skippedIssue += 1;
         results.changes.push({
           issue: {
             number: issue.number,

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/issue-labels/index.js` around lines 809 - 829, The skippedIssue
counter isn't incremented when args.onlyMissing && !hasChange and args.json is
false; update the block handling that condition (the if branch checking
args.onlyMissing && !hasChange) to always increment results.skippedIssue for
skipped issues regardless of args.json, then keep the existing
results.changes.push(...) behavior only when args.json is true and
missingForIssue.length > 0; touch the symbols args.onlyMissing, hasChange,
args.json, missingForIssue, and results.skippedIssue to ensure skipped counts
are correct in both JSON and non-JSON runs.
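
A reduced sketch of the counting behavior the proposed fix aims for, following the diff above (count only when labels are missing, push a record only in JSON mode). The function `recordSkip` is hypothetical, and the `reason` string is a placeholder; the field names `skippedIssue`, `changes`, and `missingLabels` are taken from this PR.

```javascript
// Hypothetical reduction of the skip branch; not the actual code in index.js.
function recordSkip(results, args, issue, missingForIssue) {
  if (missingForIssue.length === 0) return;

  // Count the skip in both JSON and non-JSON runs.
  results.skippedIssue += 1;

  // Only JSON mode carries a per-issue record.
  if (args.json) {
    results.changes.push({
      issue: { number: issue.number },
      skipped: true,
      reason: "missing-labels", // placeholder value, assumed
      missingLabels: missingForIssue,
    });
  }
}

const results = { skippedIssue: 0, changes: [] };
recordSkip(results, { json: false }, { number: 1 }, ["bug"]);
recordSkip(results, { json: true }, { number: 2 }, ["bug"]);
console.log(results.skippedIssue);   // 2
console.log(results.changes.length); // 1
```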

🧹 Nitpick comments (1)

scripts/issue-labels/index.js (1)

797-799: Minor: Redundant conditional wrapping the filter.

The ternary check missingForIssue.length > 0 is unnecessary since the filter expression !missingManagedLabels.has(name) would return toAdd unchanged when there are no missing labels.

♻️ Suggested simplification

-    const effectiveToAdd = missingForIssue.length > 0
-      ? toAdd.filter((name) => !missingManagedLabels.has(name))
-      : toAdd;
+    const effectiveToAdd = toAdd.filter((name) => !missingManagedLabels.has(name));
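
The equivalence claimed in this nitpick can be checked with a small sketch. It assumes `missingForIssue` is empty exactly when no name in `toAdd` is in `missingManagedLabels`; the wrapper functions and label names are hypothetical.

```javascript
// Original form: filter only when something is missing.
function withTernary(toAdd, missingManagedLabels, missingForIssue) {
  return missingForIssue.length > 0
    ? toAdd.filter((name) => !missingManagedLabels.has(name))
    : toAdd;
}

// Suggested form: always filter.
function withoutTernary(toAdd, missingManagedLabels) {
  return toAdd.filter((name) => !missingManagedLabels.has(name));
}

const toAdd = ["bug", "domain/im"]; // made-up label names
const nothingMissing = new Set();
const oneMissing = new Set(["domain/im"]);

console.log(withoutTernary(toAdd, nothingMissing)); // [ 'bug', 'domain/im' ]
console.log(withoutTernary(toAdd, oneMissing));     // [ 'bug' ]

// Same elements either way, but only the ternary form returns the same array object.
console.log(withTernary(toAdd, nothingMissing, []) === toAdd); // true
console.log(withoutTernary(toAdd, nothingMissing) === toAdd);  // false
```

The one observable difference is identity: `filter` always returns a new array, which only matters if later code compares `effectiveToAdd` to `toAdd` by reference.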

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@scripts/issue-labels/index.js` around lines 797 - 799, Replace the redundant
ternary assigning effectiveToAdd: instead of checking missingForIssue.length >
0, always compute effectiveToAdd by filtering toAdd with the predicate
!missingManagedLabels.has(name); this removes the unnecessary conditional and
yields the same result (use the variables effectiveToAdd, toAdd,
missingManagedLabels, and missingForIssue from the current code to locate the
change).

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 626f7826-cc48-4bd5-b383-9535bb25ff87

📥 Commits

Reviewing files that changed from the base of the PR and between d9e179a and 0c3b6b0.

📒 Files selected for processing (1)

scripts/issue-labels/index.js

fengzhangchi-bytedance added 3 commits April 7, 2026 14:49

fix(issue-labels): reduce mislabeling and handle missing labels

b2b2d9a

Make type classification more conservative to avoid incorrect labels, and avoid skipping entire issues when some managed labels are missing.

test(issue-labels): add more real-world issue samples

b5dd568

Add labeled/unlabeled issue examples to cover question/bug/enhancement and domain inference.

test(issue-labels): avoid duplicate issue samples

d9e179a

Keep one sample per source_url to reduce confusion and maintain stable regression coverage.

github-actions bot added the size/M label (Single-domain feat or fix with limited business impact) Apr 7, 2026

fengzhangchi-bytedance changed the title ~~Fix/issue labeler reduce mislabels~~ fix(issue-labels): reduce mislabeling and handle missing labels Apr 7, 2026

greptile-apps bot reviewed Apr 7, 2026

fix(issue-labels): include missing-label-only items in JSON output

0c3b6b0

Keep stderr and JSON output consistent under --only-missing when desired labels are missing from the repo.

coderabbitai bot reviewed Apr 7, 2026

fengzhangchi-bytedance requested a review from liangshuo-1 April 7, 2026 07:38

liangshuo-1 approved these changes Apr 7, 2026

fengzhangchi-bytedance merged commit b064188 into larksuite:main Apr 7, 2026
8 checks passed

fengzhangchi-bytedance merged 4 commits into larksuite:main from fengzhangchi-bytedance:fix/issue-labeler-reduce-mislabels
