✨ feat(ai): 添加AI相关功能,包括小说分镜生成、图像生成、语音合成等模块#22
Conversation
🔧 chore(ai): 新增.gitignore、go.mod和go.sum文件 ✅ test(ai): 为新功能添加单元测试 ⬆️ deps(ai): 引入多个依赖库,包括json-repair、openai-go等
Code Review SummaryThis PR introduces solid AI integration functionality but requires attention to several critical issues before merging: Critical Issues:
Key Improvements Needed:
See inline comments for specific recommendations. |
|
|
||
| content := resp.Choices[0].Message.Content | ||
|
|
||
| fmt.Printf("SummaryChapter chat completion content: %s\n", content) |
There was a problem hiding this comment.
🔴 CRITICAL: Remove debug print statement
This leaks potentially sensitive LLM responses (user content, novel text) to stdout in production. Also creates performance overhead with large payloads.
Security: CWE-532 (Sensitive Information in Logs)
Remove this line before merging.
| input.NovelTitle, | ||
| input.ChapterTitle, | ||
| func() string { | ||
| voiceStylesJSON, _ := json.MarshalIndent(input.AvailableVoiceStyles, "", " ") |
There was a problem hiding this comment.
🔴 CRITICAL: Handle JSON marshal errors
Silently ignoring errors will cause empty strings in prompts, leading to incorrect AI responses.
| voiceStylesJSON, _ := json.MarshalIndent(input.AvailableVoiceStyles, "", " ") | |
| voiceStylesJSON, err := json.Marshal(input.AvailableVoiceStyles) | |
| if err != nil { | |
| return nil, fmt.Errorf("failed to marshal voice styles: %w", err) | |
| } |
Performance tip: Use Marshal instead of MarshalIndent to reduce token costs (~10-15% savings).
| prompt := fmt.Sprintf(` | ||
| 你是一个擅长从小说生成动漫分镜和配音选择的设计师,后续用户将给你每一章的小说原文,你需要按指定的输出格式进行输出。 | ||
|
|
||
| 当前用户选择的小说标题为:《%s》,章节标题为:《%s》。如果你熟悉该小说的背景设定和角色人设,也可以结合你已有的知识进行参考。 | ||
|
|
||
| 配音选择时,你可以从以下提供的语音风格列表中选择合适的语音风格: | ||
|
|
||
| %s | ||
|
|
||
| 请根据小说内容和情感,合理分割文本片段,并为每个片段选择合适的语音风格和语速比例(1.0为正常语速,>1.0为加快语速,<1.0为放慢语速)。 | ||
|
|
||
| 分镜设计时,请根据小说内容生成每个分镜的图像提示词,确保提示词能够准确描述该分镜的场景和氛围,并且相邻的分镜的场景一般不会突变,请尽可能输出详细的提示词,以保持一致性,这将用于后续提供给文生图大模型生成图像。 | ||
|
|
||
| 请严格按照以下给定的JSONSchema, 仅输出一个合法的 JSON 对象, 不要包含任何前导或后续的说明文字、代码块标记、引号等进行输出结果的编写,确保输出内容**严格符合JSONSchema的要求**且格式正确: | ||
|
|
||
| %s | ||
| `, |
There was a problem hiding this comment.
🔴 CRITICAL: Prompt injection vulnerability
User inputs (NovelTitle, ChapterTitle, Content) are directly interpolated without validation or sanitization. Attackers can manipulate AI behavior through crafted inputs.
Security: OWASP LLM01:2023 (Prompt Injection)
Recommendations:
- Add input validation with length limits
- Sanitize special characters
- Use structured delimiters to separate user content from system instructions
| CharacterFeatures []CharacterFeature `json:"character_features"` | ||
| } | ||
|
|
||
| func (g *GnxAIGC) SummaryChapter(ctx context.Context, input SummaryChapterInput) (*SummaryChapterOutput, error) { |
There was a problem hiding this comment.
🟡 Missing input validation
No validation for empty or oversized inputs. This risks API cost explosion and resource exhaustion.
Recommendation:
func (g *GnxAIGC) SummaryChapter(ctx context.Context, input SummaryChapterInput) (*SummaryChapterOutput, error) {
if input.Content == "" || input.NovelTitle == "" {
return nil, errors.New("content and title are required")
}
if len(input.Content) > 100000 {
return nil, errors.New("content exceeds maximum length")
}
// ... continue
}| jsonSchema := map[string]any{ | ||
| "type": "object", | ||
| "required": []string{"storyboard_items"}, | ||
| "properties": map[string]any{ | ||
| "storyboard_items": map[string]any{ | ||
| "type": "array", | ||
| "items": map[string]any{ | ||
| "type": "object", | ||
| "required": []string{ | ||
| "source_text_segments", | ||
| "image_prompt", | ||
| }, | ||
| "properties": map[string]any{ | ||
| "source_text_segments": map[string]any{ | ||
| "type": "array", | ||
| "description": "该分镜对应的多个语音文本片段及其配音选择。如一句话中可能包含旁白和对话,需要分成多个文本片段分别处理。", | ||
| "items": map[string]any{ | ||
| "type": "object", | ||
| "required": []string{ | ||
| "text", | ||
| "voice_name", | ||
| "voice_type", | ||
| "speed_ratio", | ||
| "is_narration", | ||
| }, | ||
| "properties": map[string]any{ | ||
| "text": map[string]any{ | ||
| "type": "string", | ||
| "description": "分镜对应的语音文本片段", | ||
| }, | ||
| "voice_name": map[string]any{ | ||
| "type": "string", | ||
| "description": "该文本片段的语音风格描述", | ||
| }, | ||
| "voice_type": map[string]any{ | ||
| "type": "string", | ||
| "description": "该文本片段的音色类型", | ||
| }, | ||
| "speed_ratio": map[string]any{ | ||
| "type": "number", | ||
| "description": "语速比例", | ||
| }, | ||
| "is_narration": map[string]any{ | ||
| "type": "boolean", | ||
| "description": "是否为旁白文本片段", | ||
| }, | ||
| }, | ||
| }, | ||
| }, | ||
| "image_prompt": map[string]any{ | ||
| "type": "string", | ||
| "description": "用于生成该分镜图像的提示词,建议使用英文描述,以便更好地兼容主流文生图大模型。", | ||
| }, | ||
| }, | ||
| }, | ||
| }, | ||
| }, | ||
| } |
There was a problem hiding this comment.
⚡ Performance: Move schema to package constant
This large JSON schema map is rebuilt on every call (~20+ allocations). Since it's constant data, move to package level:
var summaryChapterJSONSchema = map[string]any{
// schema definition
}Estimated savings: 2-3KB memory + CPU cycles per call
| } | ||
|
|
||
| func (g *GnxAIGC) GenerateImageByText(ctx context.Context, prompt string) ([]byte, error) { | ||
| resp, err := g.client.Images.Generate(context.TODO(), openai.ImageGenerateParams{ |
There was a problem hiding this comment.
🔴 CRITICAL: Context parameter ignored
Function accepts ctx but uses context.TODO(). This breaks timeout/cancellation propagation, causing resource leaks and inability to cancel long-running operations.
| resp, err := g.client.Images.Generate(context.TODO(), openai.ImageGenerateParams{ | |
| resp, err := g.client.Images.Generate(ctx, openai.ImageGenerateParams{ |
| return nil, errors.New("no image data received") | ||
| } | ||
|
|
||
| bs, err := base64.StdEncoding.DecodeString(resp.Data[0].B64JSON) |
There was a problem hiding this comment.
🟡 Security: Missing size validation before base64 decode
No bounds checking on response size. A malicious API response could cause OOM.
Security: CWE-400 (Resource Exhaustion)
Recommendation:
const MaxImageSize = 50 * 1024 * 1024 // 50MB
if base64.StdEncoding.DecodedLen(len(resp.Data[0].B64JSON)) > MaxImageSize {
return nil, errors.New("image data exceeds maximum size")
}| func (c *Config) validate() { | ||
| c.APIKey = cmp.Or(c.APIKey, os.Getenv("OPENAI_API_KEY")) | ||
| c.BaseURL = cmp.Or(c.BaseURL, os.Getenv("OPENAI_BASE_URL"), "https://openai.qiniu.com/v1") | ||
| c.ImageModel = cmp.Or(c.ImageModel, "gemini-2.5-flash-image") | ||
| c.LanguageModel = cmp.Or(c.LanguageModel, "deepseek/deepseek-v3.1-terminus") |
There was a problem hiding this comment.
🟡 Extract magic values to constants
Hardcoded model names and URLs make updates difficult.
const (
DefaultBaseURL = "https://openai.qiniu.com/v1"
DefaultImageModel = "gemini-2.5-flash-image"
DefaultLanguageModel = "deepseek/deepseek-v3.1-terminus"
)Security note: Consider validating that BaseURL uses HTTPS to prevent credential leakage.
| if err != nil { | ||
| return nil, fmt.Errorf("TextToSpeech failed: %w", err) | ||
| } | ||
| bs, err := base64.StdEncoding.DecodeString(ttsResp.Data) |
There was a problem hiding this comment.
🟡 Security: Missing size validation for audio data
Same issue as image generation - no bounds checking before base64 decode.
Apply similar fix:
const MaxAudioSize = 10 * 1024 * 1024 // 10MB
if base64.StdEncoding.DecodedLen(len(ttsResp.Data)) > MaxAudioSize {
return nil, errors.New("audio data exceeds maximum size")
}| return &ttsResp, nil | ||
| } | ||
|
|
||
| func (g *GnxAIGC) TextToSpeechSimple(ctx context.Context, text, voiceType string, ratio float64) ([]byte, error) { |
There was a problem hiding this comment.
🟡 Missing input validation
Add validation for text and speed ratio:
func (g *GnxAIGC) TextToSpeechSimple(ctx context.Context, text, voiceType string, ratio float64) ([]byte, error) {
if text == "" {
return nil, errors.New("text cannot be empty")
}
if ratio <= 0 || ratio > 3.0 {
return nil, fmt.Errorf("speed ratio must be between 0 and 3.0, got %f", ratio)
}
// ...
}| fmt.Printf("%+v\n", resp) | ||
| } | ||
|
|
||
| func TestSummaryChapter_Test(t *testing.T) { |
There was a problem hiding this comment.
🔵 Test naming convention
Test name TestSummaryChapter_Test is redundant and unclear. Consider renaming to:
func TestChatCompletionBasic(t *testing.T) {Or remove if it's just a debug test.
No description provided.