
fix(issue-labels): reduce mislabeling and handle missing labels#288

Merged
fengzhangchi-bytedance merged 4 commits into larksuite:main from fengzhangchi-bytedance:fix/issue-labeler-reduce-mislabels
Apr 7, 2026

Conversation

@fengzhangchi-bytedance
Collaborator

@fengzhangchi-bytedance fengzhangchi-bytedance commented Apr 7, 2026

Summary

Make the issue labeler more conservative to reduce mislabeling, and avoid skipping entire issues when some managed labels are missing in the repo.

Changes

  • scripts/issue-labels/index.js:55: Rework type classification into a weighted + conservative scoring strategy (min score + min margin). Prefer returning null over applying an ambiguous type label.
  • scripts/issue-labels/index.js:720: When some managed labels do not exist in the repository, drop only the missing labels and still apply the rest (also record missingLabels in JSON output).
  • scripts/issue-labels/samples.json:1: Add more real-world issue samples for regression coverage (including labeled/unlabeled and previously disputed cases) and remove duplicate samples for the same source_url.
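The "min score + min margin" decision described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the names `chooseTypeFromScores`, `TYPE_MIN_SCORE`, and `TYPE_MIN_MARGIN` appear in the review comments, the values 2 and 1 are taken from the Greptile summary, and the function body is an assumption.

```javascript
// Threshold values as reported in the review summary (2 and 1);
// the logic below is an assumed reconstruction, not the PR's code.
const TYPE_MIN_SCORE = 2;
const TYPE_MIN_MARGIN = 1;

// Returns the winning type, or null when the evidence is weak or ambiguous.
function chooseTypeFromScores(scores) {
  // Rank types from highest to lowest score.
  const ranked = Object.entries(scores).sort((a, b) => b[1] - a[1]);
  if (ranked.length === 0) return null;

  const [topType, topScore] = ranked[0];
  const secondScore = ranked.length > 1 ? ranked[1][1] : 0;

  // Too little evidence overall: prefer no label.
  if (topScore < TYPE_MIN_SCORE) return null;
  // Runner-up is too close: prefer no label over an ambiguous one.
  if (topScore - secondScore < TYPE_MIN_MARGIN) return null;
  return topType;
}
```

Under these assumptions, a lone weak signal or a tie between two types yields `null`, so the issue is left without a type label.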

Test Plan

node scripts/issue-labels/test.js

Summary by CodeRabbit

  • Bug Fixes

    • More conservative and accurate issue-type detection with weighted signals and clearer tie/threshold handling.
    • Labeling now proceeds when some repository labels are missing: missing adds are skipped (logged and recorded) while other add/remove actions still apply.
  • Tests

    • Updated and expanded sample data to validate revised classification behavior across types and languages.
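The missing-label handling summarized above might look roughly like the sketch below. The helper name `splitByAvailability` and the label `domain/example` are hypothetical; only `toAdd`, `effectiveToAdd`, and `missingLabels` echo names used elsewhere in this PR.

```javascript
// Hypothetical helper: splits desired labels into those that exist in the
// repository and those that do not, so the existing ones can still be applied.
function splitByAvailability(toAdd, repoLabelNames) {
  const available = new Set(repoLabelNames);
  const effectiveToAdd = [];
  const missingLabels = [];
  for (const name of toAdd) {
    (available.has(name) ? effectiveToAdd : missingLabels).push(name);
  }
  return { effectiveToAdd, missingLabels };
}

// Example: "domain/example" is an invented label that the repo lacks.
const plan = splitByAvailability(["bug", "domain/example"], ["bug", "question"]);
console.log(plan);
```

In this sketch the issue is still labeled `bug`, while the missing label is only reported rather than causing the whole issue to be skipped.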

Make type classification more conservative to avoid incorrect labels, and avoid skipping entire issues when some managed labels are missing.
Add labeled/unlabeled issue examples to cover question/bug/enhancement and domain inference.
Keep one sample per source_url to reduce confusion and maintain stable regression coverage.
@CLAassistant

CLAassistant commented Apr 7, 2026

CLA assistant check
All committers have signed the CLA.

@github-actions github-actions bot added the size/M label (Single-domain feat or fix with limited business impact) Apr 7, 2026
@coderabbitai

coderabbitai bot commented Apr 7, 2026

📝 Walkthrough

Replaced simple keyword checks with a weighted, regex-rule scoring classifier for issue types and added conservative decision thresholds. Title heuristics adjusted. Label-application now filters missing repository labels instead of skipping the issue. Samples dataset updated and expanded.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Issue Type Classification & Labeling: scripts/issue-labels/index.js | Replaced boolean regex increments with weighted {re, w} scoring across types; added TYPE_MIN_SCORE and TYPE_MIN_MARGIN and changed chooseTypeFromScores() to return null for low/ambiguous scores. Strengthened title heuristics for bug/enhancement/docs. Changed main labeling flow to filter out missing toAdd labels (log and record missing labels) while still applying available adds and all removes; adjusted JSON output and skip/hasChange logic. |
| Test Samples / Expectations: scripts/issue-labels/samples.json | Updated expected_type for multiple existing samples (various null↔bug/enhancement/question/documentation changes), removed one sample body, and added five new sample issue entries for classifier validation. |
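The weighted {re, w} scoring mentioned in the table could be sketched as below. The rule table `TYPE_RULES`, its patterns, and its weights are invented for illustration; only the {re, w} shape and the function name `scoreTypeFromText` come from the review comments, and the body is an assumption.

```javascript
// Hypothetical rules: patterns and weights are illustrative only.
const TYPE_RULES = {
  bug: [
    { re: /\b(crash|exception|stack trace)\b/i, w: 2 },
    { re: /\b(error|fails?|broken)\b/i, w: 1 },
  ],
  enhancement: [
    { re: /\bfeature request\b/i, w: 2 },
    { re: /\b(support|add)\b/i, w: 1 },
  ],
  question: [
    { re: /\bhow (do|can) i\b/i, w: 2 },
    { re: /\?\s*$/, w: 1 },
  ],
};

// Sums the weights of all matching rules per type.
// The regexes carry no "g" flag, so re.test() keeps no state between calls.
function scoreTypeFromText(text) {
  const scores = {};
  for (const [type, rules] of Object.entries(TYPE_RULES)) {
    scores[type] = rules.reduce((sum, { re, w }) => sum + (re.test(text) ? w : 0), 0);
  }
  return scores;
}

console.log(scoreTypeFromText("How do I add a webhook?"));
```

Weights let a strong signal (a stack trace, an explicit "feature request") count for more than a generic word, which is what makes the later threshold and margin checks meaningful.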

Sequence Diagram(s)

sequenceDiagram
  participant Runner as Issue-Labeler Script
  participant Scorer as Scoring Engine
  participant RepoAPI as Repository (labels & issues)
  participant Output as JSON/Logs

  Runner->>Scorer: parse title/body -> compute weighted scores
  Scorer-->>Runner: scores per type
  Runner->>Scorer: chooseTypeFromScores(scores, TYPE_MIN_*)
  Scorer-->>Runner: chosenType or null
  Runner->>RepoAPI: fetch existing labels
  RepoAPI-->>Runner: availableLabels
  Runner->>Runner: compute toAdd/toRemove (filter missing toAdd)
  Runner->>RepoAPI: apply add/remove label operations
  RepoAPI-->>Runner: apply results
  Runner->>Output: log missing labels, record changes (JSON mode)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰 I sniffed the text, then gave it weight,
Rules and thresholds kept confusion straight,
Missing tags I gently skip, not halt the run,
Samples multiplied beneath the sun,
A tiny hop — the classifier's done! 🌿

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly summarizes the main changes: reducing mislabeling via conservative scoring and handling missing labels gracefully. |
| Description check | ✅ Passed | The description covers all required template sections with concrete details about changes and includes a test plan, though it omits the standard checklist format. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


@fengzhangchi-bytedance fengzhangchi-bytedance changed the title from "Fix/issue labeler reduce mislabels" to "fix(issue-labels): reduce mislabeling and handle missing labels" Apr 7, 2026
@github-actions

github-actions bot commented Apr 7, 2026

🚀 PR Preview Install Guide

🧰 CLI update

npm i -g https://pkg.pr.new/larksuite/cli/@larksuite/cli@0c3b6b0d6d195a045cb3ce2fe242fe28dcf29dd2

🧩 Skill update

npx skills add fengzhangchi-bytedance/larksuite-cli-fork#fix/issue-labeler-reduce-mislabels -y -g

@greptile-apps

greptile-apps bot commented Apr 7, 2026

Greptile Summary

This PR makes the issue type classifier more conservative by requiring a minimum score of 2 and a margin of 1 between the top two candidates before applying a type label (preferring null over an ambiguous guess), and changes the missing-label behavior from skipping entire issues to dropping only the unavailable labels and continuing. The samples.json regression suite is also expanded with real-world cases.

  • The tie-breaker code in chooseTypeFromScores (lines 327–332) is dead when TYPE_MIN_MARGIN >= 1 and can be simplified.
  • In --process-all mode, records for issues where all desired labels are missing lack skipped/reason metadata, making them structurally inconsistent with the --only-missing path.
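The first point can be checked with a small sketch. It assumes the margin guard runs before any tie-breaker and has the form shown here; `passesMargin` is a hypothetical helper, not a function from the PR.

```javascript
// Margin value as reported in this review; the helper itself is hypothetical.
const TYPE_MIN_MARGIN = 1;

// True only when the top score beats the runner-up by at least the margin.
function passesMargin(topScore, secondScore) {
  return topScore - secondScore >= TYPE_MIN_MARGIN;
}

// A tie always has margin 0, which is below any margin >= 1, so a tied pair
// is rejected here and a later tie-breaker branch can never be reached.
console.log(passesMargin(3, 3));
console.log(passesMargin(3, 2));
```

If the margin were ever lowered to 0, ties would pass this guard again and a tie-breaker would become reachable, which may be a reason to keep or remove that code deliberately rather than by accident.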

Confidence Score: 5/5

Safe to merge; all remaining findings are non-blocking P2 style issues

The core logic changes are sound — minimum-score and minimum-margin guards reduce mislabeling, and the partial missing-label fallback correctly applies whichever labels are available. Two P2 issues (dead tie-breaker code and missing JSON metadata in --process-all mode) do not affect correctness on the default usage path.

scripts/issue-labels/index.js: dead tie-breaker branches (lines 327–332) and --process-all JSON metadata gap (lines 831–866)

Important Files Changed

| Filename | Overview |
|---|---|
| scripts/issue-labels/index.js | Reworked type scoring (min-score=2, min-margin=1) and graceful missing-label partial-apply; two P2 issues found: dead tie-breaker branches and inconsistent JSON metadata in --process-all mode |
| scripts/issue-labels/samples.json | Expanded with real-world regression samples including labeled/unlabeled and previously disputed cases; duplicates removed; no issues found |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Issue from searchUnlabeledIssues] --> B[classifyIssueText]
    B --> C[scoreTypeFromText]
    C --> D[chooseTypeFromScores]
    D -- score < MIN_SCORE=2 --> E[type = null]
    D -- margin < MIN_MARGIN=1 --> E
    D -- clear winner --> F[type = winner]
    B --> G[collectDomainsFromText]
    F --> H[planIssueLabelChanges]
    G --> H
    E --> H
    H --> I{managed labels\nmissing from repo?}
    I -- yes --> J[effectiveToAdd = toAdd minus missing]
    I -- no --> K[effectiveToAdd = toAdd]
    J --> L{hasChange?}
    K --> L
    L -- no AND onlyMissing\nAND json AND missingForIssue --> M[emit skipped record\nwith reason + missingLabels]
    L -- no AND onlyMissing --> N[continue silently]
    L -- yes --> O[build record, apply labels]
    O --> P{dryRun?}
    P -- no --> Q[addIssueLabels + removeIssueLabel]
    P -- yes --> R[skip API calls]


Reviews (2): Last reviewed commit: "fix(issue-labels): include missing-label..." | Re-trigger Greptile

Keep stderr and JSON output consistent under --only-missing when desired labels are missing from the repo.

@coderabbitai coderabbitai bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scripts/issue-labels/index.js (1)

809-829: ⚠️ Potential issue | 🟡 Minor

skippedIssue counter not incremented when --json is false.

When args.json is false, issues skipped due to missing managed labels are not counted in results.skippedIssue. The warning is logged at Line 803, but the final summary at Line 872 will undercount skipped issues.

🐛 Proposed fix
     if (args.onlyMissing && !hasChange) {
+      if (missingForIssue.length > 0) {
+        results.skippedIssue += 1;
+      }
       if (args.json && missingForIssue.length > 0) {
-        results.skippedIssue += 1;
         results.changes.push({
           issue: {
             number: issue.number,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/issue-labels/index.js` around lines 809 - 829, The skippedIssue
counter isn't incremented when args.onlyMissing && !hasChange and args.json is
false; update the block handling that condition (the if branch checking
args.onlyMissing && !hasChange) to always increment results.skippedIssue for
skipped issues regardless of args.json, then keep the existing
results.changes.push(...) behavior only when args.json is true and
missingForIssue.length > 0; touch the symbols args.onlyMissing, hasChange,
args.json, missingForIssue, and results.skippedIssue to ensure skipped counts
are correct in both JSON and non-JSON runs.
🧹 Nitpick comments (1)
scripts/issue-labels/index.js (1)

797-799: Minor: Redundant conditional wrapping the filter.

The ternary check missingForIssue.length > 0 is unnecessary since the filter expression !missingManagedLabels.has(name) would return toAdd unchanged when there are no missing labels.

♻️ Suggested simplification
-    const effectiveToAdd = missingForIssue.length > 0
-      ? toAdd.filter((name) => !missingManagedLabels.has(name))
-      : toAdd;
+    const effectiveToAdd = toAdd.filter((name) => !missingManagedLabels.has(name));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scripts/issue-labels/index.js` around lines 797 - 799, Replace the redundant
ternary assigning effectiveToAdd: instead of checking missingForIssue.length >
0, always compute effectiveToAdd by filtering toAdd with the predicate
!missingManagedLabels.has(name); this removes the unnecessary conditional and
yields the same result (use the variables effectiveToAdd, toAdd,
missingManagedLabels, and missingForIssue from the current code to locate the
change).
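The equivalence claimed in this nitpick can be checked with a short sketch. It assumes `missingManagedLabels` is a Set of label names absent from the repo and that `missingForIssue` is the subset of `toAdd` found in that Set; both are assumptions about code not shown in this conversation.

```javascript
// Assumed shapes: a Set of missing label names, and the issue's desired labels.
const missingManagedLabels = new Set(); // nothing is missing in this scenario
const toAdd = ["bug", "question"];
const missingForIssue = toAdd.filter((name) => missingManagedLabels.has(name));

const keep = (name) => !missingManagedLabels.has(name);

// Original form: filter only when something is missing.
const withTernary = missingForIssue.length > 0 ? toAdd.filter(keep) : toAdd;
// Suggested form: always filter.
const withFilter = toAdd.filter(keep);

// Same contents either way.
console.log(JSON.stringify(withTernary) === JSON.stringify(withFilter));
```

One small difference is that the ternary form returns the original `toAdd` array when nothing is missing, while `filter` always returns a new array. That only matters if later code mutates the result or compares it by reference.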

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 626f7826-cc48-4bd5-b383-9535bb25ff87

📥 Commits

Reviewing files that changed from the base of the PR and between d9e179a and 0c3b6b0.

📒 Files selected for processing (1)
  • scripts/issue-labels/index.js

@fengzhangchi-bytedance fengzhangchi-bytedance merged commit b064188 into larksuite:main Apr 7, 2026
8 checks passed

Labels

size/M Single-domain feat or fix with limited business impact


3 participants