Skip to content

Conversation

@sanity
Copy link
Collaborator

@sanity sanity commented Oct 27, 2025

Summary

Implements automated issue labeling using OpenAI's GPT-5 mini to address the 39% unlabeled issue problem identified in #1980.

What Changed

GitHub Action Workflow (.github/workflows/auto-label-issues.yml)

  • Triggers on new issue creation
  • Uses GPT-5 mini with JSON response format for reliable parsing
  • Applies labels with ≥75% confidence threshold
  • Posts explanatory comment with confidence scores

Retroactive Labeling Script (~/code/freenet/retroactive-label-issues.sh)

  • Command-line tool to label existing unlabeled issues
  • Processes issues in batches with rate limiting
  • Supports dry-run mode for testing
  • Usage: ./retroactive-label-issues.sh [--dry-run] [--limit N]

Security Measures

Anti-spam:

  • Only processes issues from GitHub accounts >7 days old
  • Rate limited via concurrency group (1 at a time)
  • Only triggers on 'opened' event, not 'edited'

Cost control:

  • Truncates issue bodies >8000 chars
  • Uses GPT-5 mini (~$0.0005/issue, 20x cheaper than Claude)
  • Estimated cost: ~$0.10/month for ongoing issues, ~$0.10 for 197 existing unlabeled issues

Prompt injection protection:

  • User content wrapped in XML delimiters
  • Explicit anti-injection instructions in system prompt
  • JSON response format enforced

Testing Plan

  1. Merge this PR (workflow will activate automatically)
  2. Create test issue to verify auto-labeling works
  3. Run retroactive script with --dry-run --limit 10 to test
  4. Review results, adjust confidence threshold if needed
  5. Run full retroactive labeling on 197 unlabeled issues

Related

Part of label schema simplification effort in #1980


[AI-assisted debugging and comment]

Automatically labels new issues using OpenAI GPT-5 mini API when they lack a T- (Type) label.
Applies labels with >=75% confidence and posts explanatory comments.

Security measures against abuse:
- Only processes issues from accounts >7 days old (anti-spam)
- Rate limited via concurrency group (1 at a time)
- Only triggers on 'opened', not 'edited' (reduces duplicate API calls)
- Truncates issue bodies >8000 chars (cost control)
- Prompt injection protections with XML delimiters
- Uses org-level OPENAI_API_KEY
- JSON response format for reliable parsing

Cost estimate: ~$0.0005 per issue (~20x cheaper than Claude)

Includes retroactive labeling script for existing unlabeled issues.

Related to #1980

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
@sanity sanity added C-proposal A-developer-xp Area: developer experience labels Oct 27, 2025
@sanity sanity added this pull request to the merge queue Oct 27, 2025
@sanity sanity removed the C-proposal label Oct 27, 2025
Merged via the queue into main with commit 84f1f78 Oct 27, 2025
10 checks passed
@sanity sanity deleted the feat/auto-label-issues branch October 27, 2025 01:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-developer-xp Area: developer experience

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants