fix: address security review - SSRF protection, credential logging, memory leak, type safety#2
fix: address security review - SSRF protection, credential logging, memory leak, type safety#2
Conversation
…tection, memory leak, and type safety Co-authored-by: AkkiaS7 <68485070+AkkiaS7@users.noreply.github.com>
Co-authored-by: AkkiaS7 <68485070+AkkiaS7@users.noreply.github.com>
…ogging, fix test naming Co-authored-by: AkkiaS7 <68485070+AkkiaS7@users.noreply.github.com>
Co-authored-by: AkkiaS7 <68485070+AkkiaS7@users.noreply.github.com>
…uplication Co-authored-by: AkkiaS7 <68485070+AkkiaS7@users.noreply.github.com>
… messages Co-authored-by: AkkiaS7 <68485070+AkkiaS7@users.noreply.github.com>
|
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details. Warning Firewall rules blocked me from connecting to one or more addresses (expand for details)I tried to connect to the following addresses, but was blocked by firewall rules:
If you need me to access, download, or install something from one of these locations, you can either:
|
There was a problem hiding this comment.
Pull request overview
本 PR 针对 StickerHub 的飞书 webhook 绑定链路,修复安全审计与代码质量问题(SSRF 防护、日志凭据泄露、pending 请求清理、类型收紧),并补充相应测试与文档/配置示例,降低 webhook 绑定的安全与运维风险。
Changes:
- 新增 webhook 域名白名单配置并在 webhook URL 归一化时做校验,降低 SSRF 风险
- 抽取 URL 脱敏工具
mask_url(),避免日志中输出 webhook token - 增加 pending webhook 绑定请求的清理逻辑与测试;收紧
target_mode类型到Literal["bot", "webhook"]
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
src/stickerhub/services/binding.py |
注入 webhook allowed hosts,URL 归一化增加白名单校验,并在失败日志中脱敏输出 |
src/stickerhub/utils/url_masking.py |
新增通用 URL 脱敏工具供日志使用 |
src/stickerhub/config.py |
新增 FEISHU_WEBHOOK_ALLOWED_HOSTS 配置项映射到 Settings |
src/stickerhub/main.py |
将 Settings 中的 allowed hosts 传入 BindingService |
src/stickerhub/core/ports.py |
TargetPlatformSender.send() 的 target_mode 改为 Literal 收紧类型 |
src/stickerhub/adapters/telegram_source.py |
在多个入口调用 pending webhook 请求清理,避免字典无界增长 |
src/stickerhub/adapters/feishu_sender.py |
发送前对 webhook target 做脱敏日志输出 |
tests/test_telegram_source.py |
增加 webhook pending 清理的单测 |
tests/test_binding_sqlite.py |
增加 webhook 域名白名单(SSRF 防护)相关单测 |
README.md / .env.example |
记录新增配置项及示例 |
src/stickerhub/utils/url_masking.py
Outdated
| if parsed.path and len(parsed.path) > PATH_MASK_THRESHOLD: | ||
| masked_path = ( | ||
| f"{parsed.path[:PATH_PREFIX_LENGTH]}...{parsed.path[-PATH_SUFFIX_LENGTH:]}" | ||
| ) | ||
| else: | ||
| masked_path = parsed.path | ||
| return f"{parsed.scheme}://{parsed.netloc}{masked_path}" |
There was a problem hiding this comment.
mask_url() 目前对很多“非标准 URL”输入不会真正脱敏:urlparse() 很少抛异常,像 "token_only" 这类字符串会被格式化成 ://token_only 原样写入日志;另外使用 parsed.netloc 可能把 user:pass@host 这类 userinfo 直接打到日志里。建议在拼接返回值前显式校验 scheme/hostname,并使用 parsed.hostname(且忽略/剔除 userinfo、端口、query/fragment),否则直接返回固定占位符。
| if parsed.path and len(parsed.path) > PATH_MASK_THRESHOLD: | |
| masked_path = ( | |
| f"{parsed.path[:PATH_PREFIX_LENGTH]}...{parsed.path[-PATH_SUFFIX_LENGTH:]}" | |
| ) | |
| else: | |
| masked_path = parsed.path | |
| return f"{parsed.scheme}://{parsed.netloc}{masked_path}" | |
| # 仅在解析结果有明确的协议和主机名时才返回拼接后的 URL | |
| if not parsed.scheme or not parsed.hostname: | |
| return "[url_masked]" | |
| if parsed.path and len(parsed.path) > PATH_MASK_THRESHOLD: | |
| masked_path = ( | |
| f"{parsed.path[:PATH_PREFIX_LENGTH]}...{parsed.path[-PATH_SUFFIX_LENGTH:]}" | |
| ) | |
| else: | |
| masked_path = parsed.path | |
| # 仅使用 hostname,避免将 userinfo、端口等敏感信息写入日志 | |
| return f"{parsed.scheme}://{parsed.hostname}{masked_path}" |
There was a problem hiding this comment.
已修复,commit 7851551 和 fb62662。
现在 mask_url() 会:
- 验证 URL 有明确的 scheme 和 hostname,否则返回
[url_masked] - 仅使用
parsed.hostname,避免 userinfo、端口等敏感信息泄露 - 对非标准 URL(如
"token_only")返回固定占位符而非原样输出
src/stickerhub/config.py
Outdated
| feishu_webhook_allowed_hosts: list[str] = Field( | ||
| default=["open.feishu.cn", "open.larksuite.com"], | ||
| alias="FEISHU_WEBHOOK_ALLOWED_HOSTS", | ||
| ) | ||
| log_level: str = Field(default="INFO", alias="LOG_LEVEL") |
There was a problem hiding this comment.
FEISHU_WEBHOOK_ALLOWED_HOSTS 在 README/.env.example 里说明为“逗号分隔、留空用默认值”,但这里字段类型是 list[str]:从 .env 读到 FEISHU_WEBHOOK_ALLOWED_HOSTS=(空字符串)或 a,b(非 JSON)时,pydantic-settings 通常会解析失败并导致启动报错。建议在 SettingsConfigDict 配置 env_ignore_empty=True 并设置 env_parse_delimiter=","(或将该字段改成 str | None 后自行 split+strip)。
There was a problem hiding this comment.
已修复,commit 7851551。
现在使用 JSON 格式解析 FEISHU_WEBHOOK_ALLOWED_HOSTS:
- 不设置:使用默认白名单
["open.feishu.cn", "open.larksuite.com"] - 设为
[]:禁用白名单校验 - 设为 JSON 数组:如
["open.feishu.cn","custom.domain.com"]
示例配置:
# 使用默认白名单
FEISHU_WEBHOOK_ALLOWED_HOSTS=
# 禁用白名单(谨慎使用)
FEISHU_WEBHOOK_ALLOWED_HOSTS=[]
# 自定义白名单
FEISHU_WEBHOOK_ALLOWED_HOSTS=["open.feishu.cn","custom.domain.com"]
src/stickerhub/services/binding.py
Outdated
| if not parsed.netloc: | ||
| return None | ||
|
|
||
| # 域名白名单校验(SSRF 防护) | ||
| if parsed.netloc.lower() not in [host.lower() for host in allowed_hosts]: | ||
| return None | ||
|
|
There was a problem hiding this comment.
域名白名单校验使用了 parsed.netloc 做精确匹配;netloc 会包含端口(如 open.feishu.cn:443),也可能包含 userinfo,从而导致合法 URL 被误拒或校验语义不清。建议改用 parsed.hostname 做白名单匹配,并根据需要显式允许/禁止端口(例如仅允许空端口或 443)。
| if not parsed.netloc: | |
| return None | |
| # 域名白名单校验(SSRF 防护) | |
| if parsed.netloc.lower() not in [host.lower() for host in allowed_hosts]: | |
| return None | |
| # 必须有合法主机名 | |
| if not parsed.hostname: | |
| return None | |
| # 仅允许默认端口或 443,避免通过非常规端口绕过校验 | |
| if parsed.port not in (None, 443): | |
| return None | |
| # 域名白名单校验(SSRF 防护)——仅基于 hostname,而非包含端口的 netloc | |
| hostname = parsed.hostname.lower() | |
| if hostname not in [host.lower() for host in allowed_hosts]: | |
| return None |
There was a problem hiding this comment.
已修复,commit 7851551 和 fb62662。
现在域名白名单校验:
- 使用
parsed.hostname而非parsed.netloc进行匹配 - 不限制端口(任何端口都可以通过校验)
- 已添加测试验证带自定义端口的 URL(如
https://open.feishu.cn:8443/...)可以正常通过
| normalized_url = _normalize_feishu_webhook_url(webhook_url, self._webhook_allowed_hosts) | ||
| if not normalized_url: | ||
| # 脱敏 URL 用于日志 | ||
| masked_url = mask_url(webhook_url) | ||
| logger.warning( | ||
| "Webhook 绑定失败: 平台=%s user=%s 原因=URL格式不合法", | ||
| "Webhook 绑定失败: 平台=%s user=%s 原因=URL格式不合法或域名不在白名单内 url=%s", | ||
| source_platform, | ||
| source_user_id, | ||
| masked_url, |
There was a problem hiding this comment.
handle_bind_webhook() 虽然在失败日志里对输入 URL 做了脱敏,但后续成功日志仍会输出 details.get("previous_webhook")(旧 webhook URL),这通常包含敏感 token,依然会造成凭据泄露。建议对 previous_webhook 同样使用 mask_url() 后再记录,或只记录是否存在/哈希摘要而不记录 URL 本身。
There was a problem hiding this comment.
已修复,commit 7851551。
现在成功日志中的 previous_webhook 会先使用 mask_url() 脱敏后再记录,防止旧凭据泄露:
previous_webhook_masked = (
mask_url(details["previous_webhook"])
if details.get("previous_webhook")
else None
)
logger.info("... previous_webhook=%s ...", previous_webhook_masked)|
@copilot apply changes based on the comments in this thread
|
…option to disable whitelist, use JSON format for env config Co-authored-by: AkkiaS7 <68485070+AkkiaS7@users.noreply.github.com>
Co-authored-by: AkkiaS7 <68485070+AkkiaS7@users.noreply.github.com>
已完成所有修改,commits 主要变更
安全验证:CodeQL 分析通过,0 个告警 |
Addresses security and code quality issues from PR #1 code review: webhook URL credential leakage in logs, SSRF vulnerability via unrestricted webhook POST, memory leak in pending request cleanup, and loose type safety on target_mode parameter.
Security Fixes
SSRF Protection with Flexible Configuration:
FEISHU_WEBHOOK_ALLOWED_HOSTSconfig using JSON format (e.g.,["open.feishu.cn","open.larksuite.com"])["open.feishu.cn", "open.larksuite.com"][]: Disables whitelist (allows any domain - use with caution)hostname(notnetloc), allowing any port on whitelisted domainsEnhanced Credential Logging Protection:
mask_url()utility that validates URLs have proper scheme and hostnameparsed.hostnameinstead ofparsed.netlocto avoid leaking userinfo, port info[url_masked]for non-standard URLsprevious_webhookin success logs to prevent credential leakageMemory Leak Fix: Add
_cleanup_pending_webhook_requests()calls tohandle_bind(),handle_bind_mode_callback(), andhandle_message()to prevent unbounded dictionary growth from abandoned flows.Code Quality
Type Safety: Change
target_modefromstrtoLiteral["bot", "webhook"]inTargetPlatformSenderprotocol andFeishuSenderimplementation for compile-time validation.Test Coverage:
DRY: Extract URL masking constants and logic to shared
utils/url_masking.pymodule.Configuration: Add
get_webhook_allowed_hosts()method to Settings to handle default/custom/disabled whitelist logic cleanly.Configuration Examples
Security Validation: CodeQL analysis passes with 0 alerts.
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.