feat: add Audio/Style Influence Controls for voice-conditioned generation #1572

Merged
ChuxiJ merged 9 commits into main from feat/issue-1095
Apr 28, 2026

Conversation

@ChuxiJ ChuxiJ commented Apr 8, 2026

Summary

  • Add VoiceProfile type and voice influence state to generationStore (CRUD, setters, presets)
  • Implement VoiceProfileSelector dropdown and VoiceInfluenceControls slider component
  • Add 3 built-in presets (audio/style influence): Natural (40/60), AI Enhanced (20/80), Voice Forward (70/30)
  • Wire influence values through pipeline mapping service (voiceInfluenceMapping.ts)
  • Integrate into FullSongForm with edit/regenerate persistence in ClipGenerationParams
  • 38 new tests across 4 test files, 4010 total passing, 0 TypeScript errors
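
For readers without the diff open, the presets and clamping described in the summary can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: only the names VoiceProfile, clampInfluence, defaultAudioInfluence/defaultStyleInfluence, the 40/60 global defaults, and the three preset value pairs come from this thread. The `InfluencePreset` shape, `INFLUENCE_PRESETS`, `findActivePreset`, the `id`/`name` fields, and the NaN handling are assumptions.

```typescript
// Hypothetical shapes: only the type name and the default* fields are
// mentioned in the PR; id and name are assumed for illustration.
interface VoiceProfile {
  id: string;
  name: string;
  defaultAudioInfluence?: number;
  defaultStyleInfluence?: number;
}

// Assumed preset shape; the PR only states the label and the two values.
interface InfluencePreset {
  label: string;
  audioInfluence: number;
  styleInfluence: number;
}

// Global defaults (40/60), as quoted in the review comments.
const DEFAULT_AUDIO_INFLUENCE = 40;
const DEFAULT_STYLE_INFLUENCE = 60;

const INFLUENCE_PRESETS: readonly InfluencePreset[] = [
  { label: "Natural", audioInfluence: 40, styleInfluence: 60 },
  { label: "AI Enhanced", audioInfluence: 20, styleInfluence: 80 },
  { label: "Voice Forward", audioInfluence: 70, styleInfluence: 30 },
];

// Keeps a value inside the 0-100 integer range used by the sliders.
// Mapping non-finite input to 0 is an assumption, not taken from the PR.
function clampInfluence(value: number): number {
  if (!Number.isFinite(value)) return 0;
  return Math.min(100, Math.max(0, Math.round(value)));
}

// Returns the preset whose values match the current sliders, if any,
// which is one way a UI could decide which preset button to highlight.
function findActivePreset(
  audio: number,
  style: number,
): InfluencePreset | undefined {
  return INFLUENCE_PRESETS.find(
    (p) => p.audioInfluence === audio && p.styleInfluence === style,
  );
}
```

Matching presets by value keeps the highlight in sync when the user drags a slider onto a preset's exact values, at the cost of clearing the highlight as soon as either slider moves away.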

Closes #1095

Acceptance Criteria

  • Audio/Style Influence sliders render when voice profile selected
  • Sliders hidden when no voice profile active
  • Values passed to backend in generation request
  • Preset buttons apply predefined slider combinations
  • Double-click resets to default values
  • Per-voice defaults loaded when switching voices
  • Unit tests for influence parameter mapping (38 tests)
  • TypeScript: 0 errors
  • Build: successful

Test plan

  • Verify sliders appear only when a voice profile is selected
  • Verify preset buttons toggle correctly and highlight active preset
  • Verify double-click resets slider to default value
  • Verify influence values persist in ClipGenerationParams for edit/regenerate
  • Run npm test — 38 new tests pass with 0 regressions

https://claude.ai/code/session_01WKLnBJjQqPLrNpB8kprhzL

…tion (#1095)

Add voice influence sliders that control reference voice preservation and AI
style application during voice-conditioned generation. Includes VoiceProfile
type, store CRUD, VoiceProfileSelector dropdown, VoiceInfluenceControls
component with presets (Natural/AI Enhanced/Voice Forward), double-click
reset, per-voice defaults, and pipeline mapping service.

38 new tests (4010 total), 0 TypeScript errors, build passes.

https://claude.ai/code/session_01WKLnBJjQqPLrNpB8kprhzL
Copilot AI review requested due to automatic review settings April 8, 2026 06:09

ChuxiJ commented Apr 8, 2026

Note: PR #1533 also covers #1095 as part of a broader Voice Library implementation. This PR is an independent, focused implementation of the influence controls only. Reviewers may prefer the more comprehensive #1533 — if so, this PR can be closed.


Generated by Claude Code

Copilot AI left a comment

Pull request overview

Adds voice-conditioned generation UI/state for selecting a voice profile and adjusting Audio/Style Influence, plus a mapping helper and unit tests.

Changes:

  • Introduces VoiceProfile + influence defaults/presets/clamping utilities and wires voice-related state into generationStore.
  • Adds VoiceProfileSelector and VoiceInfluenceControls components and integrates them into FullSongForm with persistence in ClipGenerationParams.
  • Adds unit tests for store behavior, UI components, and influence-to-API mapping.

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.

| File | Description |
| --- | --- |
| tests/unit/voiceInfluence.test.ts | Unit tests for clamping, store defaults, CRUD, selection behavior, presets, hydration, reset. |
| src/types/voice.ts | New VoiceProfile/preset types + defaults + clampInfluence. |
| src/types/project.ts | Persists voiceProfileId, audioInfluence, styleInfluence into ClipGenerationParams. |
| src/store/generationStore.ts | Adds voice profile list + selection + influence setters/preset applier. |
| src/services/voiceInfluenceMapping.ts | New mapper from 0–100 UI values to API params. |
| src/services/__tests__/voiceInfluenceMapping.test.ts | Tests for mapping/clamping/null behavior. |
| src/components/generation/VoiceProfileSelector.tsx | Dropdown to select voice profile (hidden if none exist). |
| src/components/generation/VoiceInfluenceControls.tsx | Sliders + preset buttons rendered only when a voice is selected. |
| src/components/generation/FullSongForm.tsx | Renders selector/controls and persists voice params into generationParams. |
| src/components/generation/__tests__/VoiceProfileSelector.test.tsx | UI tests for voice selector rendering and selection/deselection. |
| src/components/generation/__tests__/VoiceInfluenceControls.test.tsx | UI tests for sliders, presets, reset, and voice name display. |

Comment on lines +191 to +199
if (p.voiceProfileId) {
  useGenerationStore.getState().setSelectedVoiceProfile(p.voiceProfileId);
}
if (p.audioInfluence !== undefined) {
  useGenerationStore.getState().setAudioInfluence(p.audioInfluence);
}
if (p.styleInfluence !== undefined) {
  useGenerationStore.getState().setStyleInfluence(p.styleInfluence);
}

Copilot AI Apr 8, 2026

When hydrating edit mode from generationParams, voice-related fields are only applied if present. If the clip has generationParams but no voiceProfileId/influence fields (e.g., older clips or clips generated without voice), the previously-selected voice/influence values in generationStore remain, causing stale UI state and persisting the wrong params on re-generate. Consider explicitly clearing selectedVoiceProfileId (and resetting influences) when p.voiceProfileId is missing, and similarly resetting influences when p.audioInfluence/p.styleInfluence are undefined.
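
One way to read this suggestion is to compute the complete voice state from the saved params every time, so that no field is left over from a previous clip. The sketch below is illustrative only: `SavedVoiceParams`, `VoiceFormState`, and `hydrateVoiceState` are hypothetical stand-ins for the real store and ClipGenerationParams, and the 40/60 fallbacks are the global defaults quoted elsewhere in this review.

```typescript
// Hypothetical, simplified view of the voice fields persisted on a clip.
interface SavedVoiceParams {
  voiceProfileId?: string;
  audioInfluence?: number;
  styleInfluence?: number;
}

// Hypothetical slice of the generation form state.
interface VoiceFormState {
  selectedVoiceProfileId: string | null;
  audioInfluence: number;
  styleInfluence: number;
}

const DEFAULT_AUDIO_INFLUENCE = 40;
const DEFAULT_STYLE_INFLUENCE = 60;

// Always returns every field, so applying the result overwrites whatever
// was previously selected instead of patching only the fields present.
function hydrateVoiceState(p: SavedVoiceParams): VoiceFormState {
  if (!p.voiceProfileId) {
    // Older clips, or clips generated without a voice: clear everything.
    return {
      selectedVoiceProfileId: null,
      audioInfluence: DEFAULT_AUDIO_INFLUENCE,
      styleInfluence: DEFAULT_STYLE_INFLUENCE,
    };
  }
  return {
    selectedVoiceProfileId: p.voiceProfileId,
    audioInfluence: p.audioInfluence ?? DEFAULT_AUDIO_INFLUENCE,
    styleInfluence: p.styleInfluence ?? DEFAULT_STYLE_INFLUENCE,
  };
}
```

Whether a missing influence should fall back to the global default or to the selected profile's own default is a design choice this sketch does not settle.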

Copilot uses AI. Check for mistakes.
Comment on lines +37 to +45
const handleAudioDoubleClick = useCallback(
  () => setAudioInfluence(DEFAULT_AUDIO_INFLUENCE),
  [setAudioInfluence],
);

const handleStyleDoubleClick = useCallback(
  () => setStyleInfluence(DEFAULT_STYLE_INFLUENCE),
  [setStyleInfluence],
);

Copilot AI Apr 8, 2026

Double-click reset currently sets sliders back to global defaults (40/60) rather than the selected voice profile's per-voice defaults (defaultAudioInfluence / defaultStyleInfluence). With per-voice defaults already modeled and loaded on selection, double-click should reset to those values (falling back to global defaults only when the profile has none) to avoid surprising behavior when a voice’s defaults differ.
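
The fallback order described here (per-voice default first, global default second) could be expressed as a small pure helper that both double-click handlers call. This is a sketch under assumptions: `resolveResetValues` and `VoiceProfileDefaults` are hypothetical names, while the two default* fields and the 40/60 globals are taken from this comment.

```typescript
// Only the two default fields matter for the reset; the rest of the
// profile shape is omitted here.
interface VoiceProfileDefaults {
  defaultAudioInfluence?: number;
  defaultStyleInfluence?: number;
}

const DEFAULT_AUDIO_INFLUENCE = 40;
const DEFAULT_STYLE_INFLUENCE = 60;

// Resolves the values a double-click should restore: the selected
// profile's own defaults when present, otherwise the global defaults.
function resolveResetValues(
  profile: VoiceProfileDefaults | undefined,
): { audioInfluence: number; styleInfluence: number } {
  return {
    audioInfluence: profile?.defaultAudioInfluence ?? DEFAULT_AUDIO_INFLUENCE,
    styleInfluence: profile?.defaultStyleInfluence ?? DEFAULT_STYLE_INFLUENCE,
  };
}
```

Using `??` rather than `||` matters here because a per-voice default of 0 is a legitimate value and should not fall through to the global default.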

Comment thread src/store/generationStore.ts Outdated
Comment on lines +924 to +925
audioInfluence: profile?.defaultAudioInfluence ?? DEFAULT_AUDIO_INFLUENCE,
styleInfluence: profile?.defaultStyleInfluence ?? DEFAULT_STYLE_INFLUENCE,

Copilot AI Apr 8, 2026

setSelectedVoiceProfile assigns audioInfluence/styleInfluence directly from the stored VoiceProfile defaults without clamping. Since profiles may come from persisted/user-edited data, consider running clampInfluence on defaultAudioInfluence/defaultStyleInfluence here to keep generationForm invariants (0–100 ints) consistent with the setters and mapping.

Suggested change
- audioInfluence: profile?.defaultAudioInfluence ?? DEFAULT_AUDIO_INFLUENCE,
- styleInfluence: profile?.defaultStyleInfluence ?? DEFAULT_STYLE_INFLUENCE,
+ audioInfluence: clampInfluence(profile?.defaultAudioInfluence ?? DEFAULT_AUDIO_INFLUENCE),
+ styleInfluence: clampInfluence(profile?.defaultStyleInfluence ?? DEFAULT_STYLE_INFLUENCE),

Comment thread src/services/voiceInfluenceMapping.ts Outdated
Comment on lines +15 to +20
export interface VoiceInfluenceApiParams {
  /** Maps to CoverTaskParams.audio_cover_strength. null = no voice active. */
  audio_cover_strength: number | null;
  /** Multiplier for guidance_scale. null = no voice active. */
  guidance_scale_factor: number | null;
}

Copilot AI Apr 8, 2026

voiceInfluenceMapping introduces guidance_scale_factor, but this field does not exist in the typed backend task params in src/types/api.ts (e.g., Text2MusicTaskParams, LegoTaskParams, CoverTaskParams). If this is intended to be sent to the backend, the API types (and the request construction code) should be updated accordingly; otherwise the mapping risks diverging from what is actually transmitted.
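
For context, a mapper of this shape might look like the sketch below. The interface is copied from the quoted diff, but everything else is assumed: the PR's actual scaling is not shown in this thread, so the linear formulas (audio as a 0 to 1 fraction, style as a 0.5 to 1.5 multiplier) and the `mapVoiceInfluence` signature are placeholders chosen only to illustrate the null-when-no-voice and clamping behavior that the tests are described as covering.

```typescript
interface VoiceInfluenceApiParams {
  /** Maps to CoverTaskParams.audio_cover_strength. null = no voice active. */
  audio_cover_strength: number | null;
  /** Multiplier for guidance_scale. null = no voice active. */
  guidance_scale_factor: number | null;
}

// Clamp UI values to the 0-100 integer range before scaling.
function clampInfluence(value: number): number {
  if (!Number.isFinite(value)) return 0;
  return Math.min(100, Math.max(0, Math.round(value)));
}

// ASSUMPTION: both formulas below are illustrative linear mappings,
// not the ones implemented in voiceInfluenceMapping.ts.
function mapVoiceInfluence(
  voiceProfileId: string | null,
  audioInfluence: number,
  styleInfluence: number,
): VoiceInfluenceApiParams {
  if (!voiceProfileId) {
    // No voice selected: emit nulls so callers can skip the fields.
    return { audio_cover_strength: null, guidance_scale_factor: null };
  }
  const audio = clampInfluence(audioInfluence);
  const style = clampInfluence(styleInfluence);
  return {
    audio_cover_strength: audio / 100, // 0..1
    guidance_scale_factor: 0.5 + style / 100, // 0.5..1.5
  };
}
```

Because guidance_scale_factor is not part of the typed backend params, a caller would still have to multiply it into guidance_scale (or drop it) before building the request, which is the gap this comment points at.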

/**
 * Audio/Style Influence sliders for voice-conditioned generation.
 * Rendered only when a voice profile is selected in the generation form.
 * Follows DAW interaction patterns: double-click to reset, real-time percentage tooltip.

Copilot AI Apr 8, 2026

The component docstring claims a “real-time percentage tooltip”, but the UI only renders inline percentage text and no tooltip (e.g., title/popover). Either implement an actual tooltip or update the comment to match the current behavior to avoid misleading docs.

Suggested change
-  * Follows DAW interaction patterns: double-click to reset, real-time percentage tooltip.
+  * Follows DAW interaction patterns: double-click to reset and real-time percentage feedback.


it('renders nothing when no profiles exist', () => {
  const { container } = render(<VoiceProfileSelector />);
  // Should still render the selector even with no profiles (shows "No voice" only)

Copilot AI Apr 8, 2026

This test’s inline comment is contradictory: it says the selector should still render with no profiles, but the assertion expects no <select> and the component intentionally returns null when voiceProfiles.length === 0. Consider updating/removing the comment to match the intended behavior so the test remains self-explanatory.

Suggested change
-   // Should still render the selector even with no profiles (shows "No voice" only)
+   // Should not render the selector when no voice profiles are available.

Comment on lines +250 to +252
voiceProfileId: selectedVoiceProfileId ?? undefined,
audioInfluence: selectedVoiceProfileId ? audioInfluence : undefined,
styleInfluence: selectedVoiceProfileId ? styleInfluence : undefined,

Copilot AI Apr 8, 2026

These new generationParams fields (voiceProfileId / audioInfluence / styleInfluence) are now persisted for edit/regenerate, but the new-generation path (generateText2Music(...)) does not carry any corresponding voice fields into the generation pipeline. That means the sliders won’t affect backend requests for new clips (and potentially not for regenerate either, depending on how regenerateClip reads params). Consider extending the generation request/pipeline to consume these params and apply voiceInfluenceMapping, otherwise the persisted fields are effectively unused.

ChuxiJ added 8 commits April 8, 2026 06:15
- Clamp per-voice defaults in setSelectedVoiceProfile
- Clear voice state when editing clips without voice params (stale state bug)
- Double-click reset uses per-voice defaults instead of global defaults
- Fix docstring: "tooltip" → "feedback" (no tooltip UI exists)
- Fix contradictory test comment in VoiceProfileSelector
- Clarify guidance_scale_factor as local abstraction in mapping docs
- Add TODO for voice param pipeline wiring (pending backend support)

https://claude.ai/code/session_01WKLnBJjQqPLrNpB8kprhzL

ChuxiJ commented Apr 28, 2026

Review/verification update after rebasing onto the merged Voice Library work:

  • Adapted the influence controls to the shared Voice Library store and removed the older standalone selector/mapping implementation.
  • Preserved per-clip voiceProfileId/audioInfluence/styleInfluence for regenerate, while keeping legacy regenerate fallback behavior when a clip has no saved voice profile.
  • Fixed edit-mode stale selection cases: saved voices hydrate correctly, missing saved voices clear the selected voice, clips with no saved voice do not inherit stale UI state, and clip overrides are not written back into global Voice Library defaults.
  • Addressed Copilot's double-click reset feedback by resetting to the selected voice's loaded defaults.
  • Final Codex review: no findings.
  • Local verification:
    • npx tsc --noEmit --pretty false
    • git diff --check
    • PATH=/Users/gongjunmin/.cache/codex-runtimes/codex-primary-runtime/dependencies/node/bin:$PATH npx vitest run tests/unit/generationAdvancedControls.test.tsx src/components/generation/__tests__/VoiceInfluenceControls.test.tsx tests/unit/voiceInfluence.test.ts src/services/__tests__/generationPipeline.test.ts (64 tests)
  • GitHub Actions verification on latest head 666d522:
    • Test workflow: build, build-wasm, rust-test, tauri-rust-test, type-check, unit-test all passed.
    • E2E Tests workflow: e2e-critical and e2e-extended passed.

@ChuxiJ ChuxiJ merged commit 37513d5 into main Apr 28, 2026
8 checks passed
@ChuxiJ ChuxiJ deleted the feat/issue-1095 branch April 28, 2026 18:54

Development

Successfully merging this pull request may close these issues.

feat: Audio/Style Influence Controls for Voice-Conditioned Generation

2 participants