✨ feat(ai): Add AI-related features, including novel storyboard generation, image generation, and speech synthesis modules #22

Merged
zhangzqs merged 1 commit into cohesion-dev:main from zhangzqs:aigc
Oct 25, 2025

Conversation

@zhangzqs

Collaborator

No description provided.

🔧 chore(ai): Add .gitignore, go.mod, and go.sum files
✅ test(ai): Add unit tests for the new features
⬆️ deps(ai): Introduce several dependencies, including json-repair and openai-go
@zhangzqs zhangzqs merged commit 6f5c9cb into cohesion-dev:main Oct 25, 2025
@zhangzqs zhangzqs deleted the aigc branch October 25, 2025 04:18
@fennoai

fennoai Bot commented Oct 25, 2025

Code Review Summary

This PR introduces solid AI integration functionality but requires attention to several critical issues before merging:

Critical Issues:

  • Debug print statement leaking sensitive data in production
  • Prompt injection vulnerabilities in LLM input handling
  • Context parameter ignored, which breaks timeout and cancellation handling

Key Improvements Needed:

  • Input validation across all public APIs
  • Security hardening for user inputs
  • Documentation for all public APIs

See inline comments for specific recommendations.

Comment thread ai/gnxaigc/chapter.go

content := resp.Choices[0].Message.Content

fmt.Printf("SummaryChapter chat completion content: %s\n", content)

🔴 CRITICAL: Remove debug print statement

This prints potentially sensitive LLM responses (user content, novel text) to stdout in production. It also adds overhead when payloads are large.

Security: CWE-532 (Insertion of Sensitive Information into Log File)

Remove this line before merging.
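
If some trace of the call is still useful while debugging, one option is a leveled logger that records only the size of the response rather than its text. The sketch below is a minimal illustration using the standard log/slog package (Go 1.21+); the helper names logCompletion and captureLog are hypothetical and not part of this PR.

```go
package main

import (
	"bytes"
	"context"
	"log/slog"
)

// logCompletion records that a completion arrived without echoing its content.
// Hypothetical helper: only the byte length is logged, and only at debug level.
func logCompletion(ctx context.Context, logger *slog.Logger, content string) {
	logger.DebugContext(ctx, "SummaryChapter chat completion received",
		slog.Int("content_bytes", len(content)))
}

// captureLog runs logCompletion against an in-memory logger and returns what
// was written, so the effect of the level setting is easy to inspect.
func captureLog(debugEnabled bool, content string) string {
	level := slog.LevelInfo
	if debugEnabled {
		level = slog.LevelDebug
	}
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, &slog.HandlerOptions{Level: level}))
	logCompletion(context.Background(), logger, content)
	return buf.String()
}
```

With the logger at info level, captureLog returns an empty string; at debug level it returns a single line carrying content_bytes but none of the response text.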

Comment thread ai/gnxaigc/chapter.go
input.NovelTitle,
input.ChapterTitle,
func() string {
voiceStylesJSON, _ := json.MarshalIndent(input.AvailableVoiceStyles, "", " ")

🔴 CRITICAL: Handle JSON marshal errors

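If marshaling fails, the discarded error leaves an empty string in the prompt, which leads to incorrect AI responses.

Suggested change
- voiceStylesJSON, _ := json.MarshalIndent(input.AvailableVoiceStyles, "", " ")
+ voiceStylesJSON, err := json.Marshal(input.AvailableVoiceStyles)
+ if err != nil {
+     return nil, fmt.Errorf("failed to marshal voice styles: %w", err)
+ }

Note: the original call sits inside an inline func() string closure, so the marshal has to move above the fmt.Sprintf call for the two-value return to compile.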

Performance tip: Use Marshal instead of MarshalIndent to reduce token costs (~10-15% savings).
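
One way to apply both points is to marshal before the prompt is built, in a small helper that returns the error to the caller. This is a sketch under assumptions: VoiceStyle and marshalVoiceStyles are hypothetical names, and the real element type of input.AvailableVoiceStyles is not shown in this review.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// VoiceStyle is a hypothetical stand-in for the element type of
// input.AvailableVoiceStyles; the real fields are not shown in this review.
type VoiceStyle struct {
	Name      string `json:"name"`
	VoiceType string `json:"voice_type"`
}

// marshalVoiceStyles returns compact JSON for the prompt, or an error the
// caller can return before building the prompt.
func marshalVoiceStyles(styles any) (string, error) {
	b, err := json.Marshal(styles)
	if err != nil {
		return "", fmt.Errorf("failed to marshal voice styles: %w", err)
	}
	return string(b), nil
}
```

SummaryChapter could then call marshalVoiceStyles(input.AvailableVoiceStyles) first, return on error, and pass the resulting string to fmt.Sprintf directly instead of using the inline closure.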

Comment thread ai/gnxaigc/chapter.go
Comment on lines +149 to +165
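prompt := fmt.Sprintf(`
You are a designer skilled at turning novels into anime storyboards and voice-casting choices. The user will give you the original text of each chapter, and you must respond in the specified output format.

The novel currently selected by the user is titled 《%s》, and the chapter is titled 《%s》. If you are familiar with the novel's setting and characters, you may also draw on what you already know.

For voice casting, choose a suitable voice style from the following list of available voice styles:

%s

Based on the content and emotion of the novel, split the text into reasonable segments, and choose a suitable voice style and speed ratio for each segment (1.0 is normal speed, >1.0 is faster, <1.0 is slower).

For storyboard design, generate an image prompt for each storyboard item based on the novel's content. Make sure the prompt accurately describes the scene and atmosphere of that item. Adjacent items usually do not change scene abruptly, so write prompts that are as detailed as possible to maintain consistency; they will later be passed to a text-to-image model to generate images.

Strictly follow the JSON Schema given below and output only a single valid JSON object. Do not include any leading or trailing explanation, code-fence markers, quotation marks, or the like, and make sure the output **strictly conforms to the JSON Schema** and is correctly formatted:

%s
`,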

🔴 CRITICAL: Prompt injection vulnerability

User inputs (NovelTitle, ChapterTitle, Content) are directly interpolated without validation or sanitization. Attackers can manipulate AI behavior through crafted inputs.

Security: OWASP LLM01:2023 (Prompt Injection)

Recommendations:

  1. Add input validation with length limits
  2. Sanitize special characters
  3. Use structured delimiters to separate user content from system instructions
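
The three recommendations could be sketched roughly as follows. Everything here is an assumption for illustration: the helper names, the 200-rune title limit, and the <chapter_text> tag are hypothetical, and the 《 》 characters are stripped only because the prompt uses them as title marks. Delimiters and sanitization reduce the risk of prompt injection but do not eliminate it.

```go
package main

import (
	"errors"
	"strings"
	"unicode"
	"unicode/utf8"
)

// maxTitleRunes is an assumed limit, not a value taken from the PR.
const maxTitleRunes = 200

var errTitleInvalid = errors.New("title is empty or too long")

// sanitizeTitle drops control characters and the 《 》 title marks that the
// prompt wraps around titles, then enforces a length limit in runes.
func sanitizeTitle(title string) (string, error) {
	cleaned := strings.Map(func(r rune) rune {
		if unicode.IsControl(r) || r == '《' || r == '》' {
			return -1 // a negative value removes the rune
		}
		return r
	}, title)
	cleaned = strings.TrimSpace(cleaned)
	if cleaned == "" || utf8.RuneCountInString(cleaned) > maxTitleRunes {
		return "", errTitleInvalid
	}
	return cleaned, nil
}

// wrapChapterText encloses user text in an explicit delimiter. The closing
// tag is removed repeatedly so it cannot be reassembled from nested copies.
func wrapChapterText(content string) string {
	const closeTag = "</chapter_text>"
	for strings.Contains(content, closeTag) {
		content = strings.ReplaceAll(content, closeTag, "")
	}
	return "<chapter_text>\n" + content + "\n" + closeTag
}
```

The system prompt would then state that anything inside the chapter_text block is novel text to be processed, never instructions to follow.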

Comment thread ai/gnxaigc/chapter.go
CharacterFeatures []CharacterFeature `json:"character_features"`
}

func (g *GnxAIGC) SummaryChapter(ctx context.Context, input SummaryChapterInput) (*SummaryChapterOutput, error) {

🟡 Missing input validation

No validation for empty or oversized inputs. This risks API cost explosion and resource exhaustion.

Recommendation:

func (g *GnxAIGC) SummaryChapter(ctx context.Context, input SummaryChapterInput) (*SummaryChapterOutput, error) {
    if input.Content == "" || input.NovelTitle == "" {
        return nil, errors.New("content and novel title are required")
    }
    // len counts bytes, not characters; use utf8.RuneCountInString for a character limit.
    if len(input.Content) > 100000 {
        return nil, errors.New("content exceeds maximum length")
    }
    // ... continue
}

Comment thread ai/gnxaigc/chapter.go
Comment on lines +90 to +147
jsonSchema := map[string]any{
    "type":     "object",
    "required": []string{"storyboard_items"},
    "properties": map[string]any{
        "storyboard_items": map[string]any{
            "type": "array",
            "items": map[string]any{
                "type": "object",
                "required": []string{
                    "source_text_segments",
                    "image_prompt",
                },
                "properties": map[string]any{
                    "source_text_segments": map[string]any{
                        "type":        "array",
                        "description": "The voice text segments for this storyboard item and their voice selections. A single sentence may contain both narration and dialogue, in which case it must be split into separate segments.",
                        "items": map[string]any{
                            "type": "object",
                            "required": []string{
                                "text",
                                "voice_name",
                                "voice_type",
                                "speed_ratio",
                                "is_narration",
                            },
                            "properties": map[string]any{
                                "text": map[string]any{
                                    "type":        "string",
                                    "description": "The voice text segment for the storyboard item",
                                },
                                "voice_name": map[string]any{
                                    "type":        "string",
                                    "description": "Voice style description for this text segment",
                                },
                                "voice_type": map[string]any{
                                    "type":        "string",
                                    "description": "Voice (timbre) type for this text segment",
                                },
                                "speed_ratio": map[string]any{
                                    "type":        "number",
                                    "description": "Speech speed ratio",
                                },
                                "is_narration": map[string]any{
                                    "type":        "boolean",
                                    "description": "Whether this text segment is narration",
                                },
                            },
                        },
                    },
                    "image_prompt": map[string]any{
                        "type":        "string",
                        "description": "Prompt used to generate the image for this storyboard item; English is recommended for better compatibility with mainstream text-to-image models.",
                    },
                },
            },
        },
    },
}

Performance: Move schema to a package-level variable

This large JSON schema map is rebuilt on every call (~20+ allocations). Since it's constant data, move to package level:

var summaryChapterJSONSchema = map[string]any{
    // schema definition
}

Estimated savings: 2-3KB memory + CPU cycles per call
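
Since the schema is interpolated into the prompt as text, the serialized form can be cached as well. The following is a minimal sketch, assuming Go 1.21+ for sync.OnceValue; summaryChapterSchemaText is a hypothetical name and the schema body is abbreviated.

```go
package main

import (
	"encoding/json"
	"sync"
)

// summaryChapterJSONSchema is built once at package initialization.
// The body is abbreviated here; the full schema is the one quoted above.
var summaryChapterJSONSchema = map[string]any{
	"type":     "object",
	"required": []string{"storyboard_items"},
}

// summaryChapterSchemaText serializes the schema on first use and caches the
// result, so later calls reuse the same string.
var summaryChapterSchemaText = sync.OnceValue(func() string {
	b, err := json.Marshal(summaryChapterJSONSchema)
	if err != nil {
		panic(err) // a static schema failing to marshal is a programming error
	}
	return string(b)
})
```

One caveat with a package-level map is that it is shared and mutable, so callers should treat it as read-only.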

Comment thread ai/gnxaigc/image.go
}

func (g *GnxAIGC) GenerateImageByText(ctx context.Context, prompt string) ([]byte, error) {
resp, err := g.client.Images.Generate(context.TODO(), openai.ImageGenerateParams{

🔴 CRITICAL: Context parameter ignored

Function accepts ctx but uses context.TODO(). This breaks timeout/cancellation propagation, causing resource leaks and inability to cancel long-running operations.

Suggested change
- resp, err := g.client.Images.Generate(context.TODO(), openai.ImageGenerateParams{
+ resp, err := g.client.Images.Generate(ctx, openai.ImageGenerateParams{
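
To illustrate why forwarding ctx matters, the sketch below replaces the real client with a hypothetical slowGenerate function that honors cancellation. All names here are assumptions for illustration; only the idea of passing the caller's ctx through comes from the suggestion.

```go
package main

import (
	"context"
	"errors"
	"time"
)

// slowGenerate is a hypothetical stand-in for g.client.Images.Generate: it
// waits for the work to finish or for ctx to be done, whichever comes first.
func slowGenerate(ctx context.Context, work time.Duration) ([]byte, error) {
	select {
	case <-time.After(work):
		return []byte("image-bytes"), nil
	case <-ctx.Done():
		return nil, ctx.Err()
	}
}

// generateImage forwards the caller's ctx instead of context.TODO().
func generateImage(ctx context.Context, work time.Duration) ([]byte, error) {
	return slowGenerate(ctx, work)
}

// timedOut reports whether a call with the given deadline and workload
// (both in milliseconds) is cut short by the deadline.
func timedOut(deadlineMs, workMs int) bool {
	ctx, cancel := context.WithTimeout(context.Background(),
		time.Duration(deadlineMs)*time.Millisecond)
	defer cancel()
	_, err := generateImage(ctx, time.Duration(workMs)*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}
```

If generateImage passed context.TODO() to slowGenerate instead, the deadline would never reach the call and a 10 ms timeout would still wait for the full workload.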

Comment thread ai/gnxaigc/image.go
return nil, errors.New("no image data received")
}

bs, err := base64.StdEncoding.DecodeString(resp.Data[0].B64JSON)

🟡 Security: Missing size validation before base64 decode

No bounds checking on response size. A malicious API response could cause OOM.

Security: CWE-400 (Uncontrolled Resource Consumption)

Recommendation:

const MaxImageSize = 50 * 1024 * 1024 // 50MB
if base64.StdEncoding.DecodedLen(len(resp.Data[0].B64JSON)) > MaxImageSize {
    return nil, errors.New("image data exceeds maximum size")
}

Comment thread ai/gnxaigc/image.go
Comment on lines +23 to +27
func (c *Config) validate() {
c.APIKey = cmp.Or(c.APIKey, os.Getenv("OPENAI_API_KEY"))
c.BaseURL = cmp.Or(c.BaseURL, os.Getenv("OPENAI_BASE_URL"), "https://openai.qiniu.com/v1")
c.ImageModel = cmp.Or(c.ImageModel, "gemini-2.5-flash-image")
c.LanguageModel = cmp.Or(c.LanguageModel, "deepseek/deepseek-v3.1-terminus")

🟡 Extract magic values to constants

Hardcoded model names and URLs make updates difficult.

const (
    DefaultBaseURL       = "https://openai.qiniu.com/v1"
    DefaultImageModel    = "gemini-2.5-flash-image"
    DefaultLanguageModel = "deepseek/deepseek-v3.1-terminus"
)

Security note: Consider validating that BaseURL uses HTTPS to prevent credential leakage.
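
A check along these lines could run at the end of Config.validate. This is a minimal sketch; validateBaseURL is a hypothetical helper, and whether plain HTTP should be allowed for local testing is a decision left to the project.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateBaseURL rejects base URLs that are not absolute https URLs, so the
// API key is never sent over an unencrypted connection.
func validateBaseURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("invalid base URL: %w", err)
	}
	if u.Scheme != "https" || u.Host == "" {
		return fmt.Errorf("base URL must be an absolute https URL, got %q", raw)
	}
	return nil
}
```

Note that validate() currently returns nothing, so surfacing this error would also mean changing its signature or handling the failure where the client is constructed.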

Comment thread ai/gnxaigc/tts.go
if err != nil {
return nil, fmt.Errorf("TextToSpeech failed: %w", err)
}
bs, err := base64.StdEncoding.DecodeString(ttsResp.Data)

🟡 Security: Missing size validation for audio data

Same issue as image generation - no bounds checking before base64 decode.

Apply similar fix:

const MaxAudioSize = 10 * 1024 * 1024 // 10MB
if base64.StdEncoding.DecodedLen(len(ttsResp.Data)) > MaxAudioSize {
    return nil, errors.New("audio data exceeds maximum size")
}
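
Because the image and audio paths need the same check, it could live in one shared helper. The sketch below is an assumption about how that might look; decodeBounded and errPayloadTooLarge are hypothetical names. DecodedLen returns an upper bound on the decoded size, so the check is slightly conservative.

```go
package main

import (
	"encoding/base64"
	"errors"
	"fmt"
)

var errPayloadTooLarge = errors.New("payload exceeds maximum size")

// decodeBounded decodes a standard base64 string only if its decoded size
// cannot exceed maxBytes.
func decodeBounded(b64 string, maxBytes int) ([]byte, error) {
	if base64.StdEncoding.DecodedLen(len(b64)) > maxBytes {
		return nil, errPayloadTooLarge
	}
	bs, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return nil, fmt.Errorf("invalid base64 payload: %w", err)
	}
	return bs, nil
}
```

The image path would call decodeBounded(resp.Data[0].B64JSON, MaxImageSize) and the TTS path decodeBounded(ttsResp.Data, MaxAudioSize). The encoded string is already in memory at this point, so this bounds the extra allocation rather than the response size itself.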

Comment thread ai/gnxaigc/tts.go
return &ttsResp, nil
}

func (g *GnxAIGC) TextToSpeechSimple(ctx context.Context, text, voiceType string, ratio float64) ([]byte, error) {

🟡 Missing input validation

Add validation for text and speed ratio:

func (g *GnxAIGC) TextToSpeechSimple(ctx context.Context, text, voiceType string, ratio float64) ([]byte, error) {
    if text == "" {
        return nil, errors.New("text cannot be empty")
    }
    // NaN fails every comparison, so it must be rejected explicitly (needs the math import).
    if math.IsNaN(ratio) || ratio <= 0 || ratio > 3.0 {
        return nil, fmt.Errorf("speed ratio must be greater than 0 and at most 3.0, got %f", ratio)
    }
    // ...
}

fmt.Printf("%+v\n", resp)
}

func TestSummaryChapter_Test(t *testing.T) {

🔵 Test naming convention

Test name TestSummaryChapter_Test is redundant and unclear. Consider renaming to:

func TestChatCompletionBasic(t *testing.T) {

Or remove if it's just a debug test.
