fix(embedding): use removesuffix instead of rstrip to prevent over-trimming URLs#7026
Conversation
之前在 AstrBotDevs#6863 中我提交的修复使用了 rstrip() 来移除末尾的 /embeddings, 但 rstrip() 是字符集操作,会误删 URL 末尾属于该字符集的字符。 例如 siliconflow.cn 的末尾 n 会被误删,导致 URL 变成 siliconflow.c 改用 removesuffix() 可以正确处理这种情况,只在字符串以指定后缀结尾时才移除。 closes AstrBotDevs#7025
There was a problem hiding this comment.
Hey - I've left some high level feedback:
- Using
str.removesuffixchanges the minimum required Python version to 3.9; if this project still supports 3.8 or earlier, consider replacing it with an equivalent helper (e.g., a smallremove_suffixutility) to avoid breaking older runtimes.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- Using `str.removesuffix` changes the minimum required Python version to 3.9; if this project still supports 3.8 or earlier, consider replacing it with an equivalent helper (e.g., a small `remove_suffix` utility) to avoid breaking older runtimes.Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Code Review
This pull request updates the URL sanitization logic in the OpenAI embedding source by replacing .rstrip() with .removesuffix() to prevent unintended character stripping. The reviewer suggests using .rstrip("/") instead of .removesuffix("/") to more robustly handle multiple trailing slashes while maintaining the use of .removesuffix("/embeddings") to fix the logic bug.
| .removesuffix("/") | ||
| .removesuffix("/embeddings") |
There was a problem hiding this comment.
感谢您修复这个 bug!使用 removesuffix 确实是正确的做法。
这里有一个小建议:只将有问题的 .rstrip("/embeddings") 改为 .removesuffix("/embeddings"),并保留 .rstrip("/") 可能更好。
原因是 .rstrip("/") 可以移除末尾所有的斜杠(例如 //),比 .removesuffix("/") 只移除一个斜杠更具鲁棒性。rstrip 的问题仅在于当参数是多字符字符串时,它会将其作为字符集处理;对于单个字符 / 则是安全且符合预期的。
这样修改后,代码既能修复 bug,又能更好地处理用户可能输入的多余斜杠。
| .removesuffix("/") | |
| .removesuffix("/embeddings") | |
| .rstrip("/") | |
| .removesuffix("/embeddings") |
|
自己拉的史自己修(1/1) .removesuffix()方法只有与后缀结尾精确匹配时才会移除字符,这个方法是python3.9引入的,在astrbot目前版本完全适用。 |
动机
修复 #7025
之前在 #6863 中提交的修复使用了
rstrip()方法来移除 URL 末尾的/embeddings路径,但rstrip()是字符集操作而非字符串移除操作,会误删 URL 末尾属于该字符集的任意字符。问题表现
当 API Base URL 以
/embeddings字符集中的字符结尾时(如siliconflow.cn的n),该字符会被误删:https://api.siliconflow.cnhttps://api.siliconflow.c/v1(n被删掉)https://api.siliconflow.cn/v1修复方案
将
rstrip()改为removesuffix(),后者只在字符串以指定后缀结尾时才移除,避免误删字符。Modifications / 改动点
将
astrbot/core/provider/sources/openai_embedding_source.py中的.rstrip("/")和.rstrip("/embeddings")改为.removesuffix("/")和.removesuffix("/embeddings")This is NOT a breaking change. / 这不是一个破坏性变更。
Screenshots or Test Results / 运行截图或测试结果
本地测试日志:
URL 中的
n不再被误删。Checklist / 检查清单
😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
/ 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。
👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
/ 我的更改经过了良好的测试,并已在上方提供了"验证步骤"和"运行截图"。
🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in
requirements.txtandpyproject.toml./ 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到
requirements.txt和pyproject.toml文件相应位置。😮 My changes do not introduce malicious code.
/ 我的更改没有引入恶意代码。
Summary by Sourcery
Bug Fixes: