feat(classifier): add notResource flag so the AI fallback catches stickers/memes, dev URLs, and bare images #21
Merged
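Incident retrospective: the ChatBot listener let sticker/meme images, GIFs, and our own GitHub PR links through. DeepSeek then ran the full classification on them and marked them APPROVED, so they were published (#5/#18/#19). The listener blacklist has been revised over several rounds (sticker aggregator sites, media file extensions, self-org GitHub dev subpaths), but a blacklist can never be exhaustive.

Change: ClassificationResult gains a fifth flag, notResource, and the prompt teaches the model to tell 'content resource vs. non-resource' apart. Stickers/memes, GIFs, bare images, direct video/audio links, login walls, error pages, and dev subpaths (PR/issue/commit) all get notResource=true → routed to FLAGGED for manual review. Normal resources such as repository home pages, articles, papers, and project home pages get false on every flag and pass through.

Compatibility: parseResponse reads notResource with .asBoolean(false), so when an older model or an old cache entry lacks the field, it falls back to false and does not block normal shares. The flags map carries one extra key; the frontend display logic falls through naturally.

Tests: +1 scenario (notResource=true → FLAGGED); all 50 community.** tests pass.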
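The default-to-false behavior described under Compatibility can be sketched as follows. This is only an illustration using a plain `Map` in place of the parsed JSON; `FlagReader` and `readFlag` are hypothetical names, and the actual `parseResponse` reads the field with `.asBoolean(false)` instead.

```java
import java.util.Map;

// Hypothetical helper for illustration; not the PR's parseResponse.
final class FlagReader {
    private FlagReader() {}

    /** Returns the boolean stored under key, or false when the key is absent or not a Boolean. */
    static boolean readFlag(Map<String, Object> response, String key) {
        Object value = response.get(key);
        return value instanceof Boolean && (Boolean) value;
    }

    public static void main(String[] args) {
        // A response from an older model or cache entry: no notResource key at all.
        Map<String, Object> oldResponse = Map.of("nsfw", false, "ad", false);
        // A response from the updated prompt: notResource is present.
        Map<String, Object> newResponse = Map.of("nsfw", false, "notResource", true);

        System.out.println(readFlag(oldResponse, "notResource"));
        System.out.println(readFlag(newResponse, "notResource"));
    }
}
```

Because a missing field reads as `false`, responses produced before the prompt change keep their previous routing.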
Pull request overview
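This PR adds a notResource fallback flag to the link classification result. When the listener blacklist does not cover a link, AI review identifies "non-shareable resources" (stickers/memes, GIFs, bare image pages, login walls, dev notification pages, etc.) and sends them to the manual review queue, so that such links are not published automatically.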
Changes:
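- `ClassificationResult` adds `boolean notResource` and includes it in `anyFlagSet()` and in the return value of the degraded-fallback factory method
- `ClassificationService` updates the prompt rules and JSON template, and defaults a missing field to `false` when parsing, for compatibility with old responses
- `SharedLinkEnrichmentWorkerTests` adds unit-test coverage for `notResource=true → FLAGGED`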
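As a rough sketch of the shape these changes imply, the record below carries five flags, aggregates them in `anyFlagSet()`, and exposes them as a map with the new key. The record name `ClassificationResultSketch`, the field names other than `notResource`, and the `toFlagMap()` method are assumptions for illustration; the real `ClassificationResult` may be structured differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative stand-in for ClassificationResult; field names besides notResource are assumed.
record ClassificationResultSketch(
        boolean nsfw, boolean ad, boolean flame, boolean illegal, boolean notResource) {

    /** True when at least one of the five flags is set, notResource included. */
    boolean anyFlagSet() {
        return nsfw || ad || flame || illegal || notResource;
    }

    /** Flags as an ordered map; notResource is simply one more key (hypothetical method). */
    Map<String, Boolean> toFlagMap() {
        Map<String, Boolean> flags = new LinkedHashMap<>();
        flags.put("nsfw", nsfw);
        flags.put("ad", ad);
        flags.put("flame", flame);
        flags.put("illegal", illegal);
        flags.put("notResource", notResource);
        return flags;
    }

    public static void main(String[] args) {
        // A bare GIF link: no content-policy flag, but not a shareable resource.
        ClassificationResultSketch gif =
                new ClassificationResultSketch(false, false, false, false, true);
        System.out.println(gif.anyFlagSet());
        System.out.println(gif.toFlagMap());
    }
}
```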
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/main/java/com/involutionhell/backend/community/service/ClassificationResult.java | Adds the notResource field and includes it in flag aggregation and the degraded-fallback return value |
| src/main/java/com/involutionhell/backend/community/service/ClassificationService.java | Prompt, template, and parsing logic support notResource while staying compatible with old model output |
| src/test/java/com/involutionhell/backend/community/service/SharedLinkEnrichmentWorkerTests.java | Adds a notResource scenario test verifying routing to FLAGGED and that the flags map contains the new key |
Summary
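The ChatBot listener blacklist has gone through several rounds of changes (sticker hosts / media extensions / self-org GitHub dev subpaths) to block non-resource links such as `#5 mmbiz image` / `#18 klipy GIF` / `#19 our own PR`. But a blacklist can never be exhaustive, so the backend AI fallback layer must be able to catch what the listener misses.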
Approach
`ClassificationResult` gains a fifth flag, `notResource`:
If any flag is hit (including notResource), the Worker routes the link to FLAGGED for manual review, at the same level as nsfw/ad/flame/illegal.
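The routing rule can be sketched as below, assuming the flags arrive as a map of name to boolean. `ReviewStatus` and `FlagRouter` are hypothetical names for illustration; only the APPROVED and FLAGGED outcomes come from this PR, and the real Worker likely does more than this.

```java
import java.util.Map;

// Hypothetical status enum; the PR only names the APPROVED and FLAGGED outcomes.
enum ReviewStatus { APPROVED, FLAGGED }

final class FlagRouter {
    private FlagRouter() {}

    /** Any flag set to true sends the link to manual review; otherwise it is approved. */
    static ReviewStatus route(Map<String, Boolean> flags) {
        boolean anyHit = flags.values().stream().anyMatch(Boolean.TRUE::equals);
        return anyHit ? ReviewStatus.FLAGGED : ReviewStatus.APPROVED;
    }

    public static void main(String[] args) {
        // notResource is treated exactly like the four existing flags.
        Map<String, Boolean> stickerLink = Map.of(
                "nsfw", false, "ad", false, "flame", false, "illegal", false, "notResource", true);
        Map<String, Boolean> articleLink = Map.of(
                "nsfw", false, "ad", false, "flame", false, "illegal", false, "notResource", false);

        System.out.println(route(stickerLink));
        System.out.println(route(articleLink));
    }
}
```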
Changes
Compatibility
Test
Follow-ups (out of scope for this PR)
🤖 Generated with Claude Code