feat: add Audio/Style Influence Controls for voice-conditioned generation#1572
feat: add Audio/Style Influence Controls for voice-conditioned generation#1572
Conversation
…tion (#1095) Add voice influence sliders that control reference voice preservation and AI style application during voice-conditioned generation. Includes VoiceProfile type, store CRUD, VoiceProfileSelector dropdown, VoiceInfluenceControls component with presets (Natural/AI Enhanced/Voice Forward), double-click reset, per-voice defaults, and pipeline mapping service. 38 new tests (4010 total), 0 TypeScript errors, build passes. https://claude.ai/code/session_01WKLnBJjQqPLrNpB8kprhzL
|
Note: PR #1533 also covers #1095 as part of a broader Voice Library implementation. This PR is an independent, focused implementation of the influence controls only. Reviewers may prefer the more comprehensive #1533 — if so, this PR can be closed. Generated by Claude Code |
There was a problem hiding this comment.
Pull request overview
Adds voice-conditioned generation UI/state for selecting a voice profile and adjusting Audio/Style Influence, plus a mapping helper and unit tests.
Changes:
- Introduces
VoiceProfile+ influence defaults/presets/clamping utilities and wires voice-related state intogenerationStore. - Adds
VoiceProfileSelectorandVoiceInfluenceControlscomponents and integrates them intoFullSongFormwith persistence inClipGenerationParams. - Adds unit tests for store behavior, UI components, and influence-to-API mapping.
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/unit/voiceInfluence.test.ts | Unit tests for clamping, store defaults, CRUD, selection behavior, presets, hydration, reset. |
| src/types/voice.ts | New VoiceProfile/preset types + defaults + clampInfluence. |
| src/types/project.ts | Persists voiceProfileId, audioInfluence, styleInfluence into ClipGenerationParams. |
| src/store/generationStore.ts | Adds voice profile list + selection + influence setters/preset applier. |
| src/services/voiceInfluenceMapping.ts | New mapper from 0–100 UI values to API params. |
| src/services/tests/voiceInfluenceMapping.test.ts | Tests for mapping/clamping/null behavior. |
| src/components/generation/VoiceProfileSelector.tsx | Dropdown to select voice profile (hidden if none exist). |
| src/components/generation/VoiceInfluenceControls.tsx | Sliders + preset buttons rendered only when a voice is selected. |
| src/components/generation/FullSongForm.tsx | Renders selector/controls and persists voice params into generationParams. |
| src/components/generation/tests/VoiceProfileSelector.test.tsx | UI tests for voice selector rendering and selection/deselection. |
| src/components/generation/tests/VoiceInfluenceControls.test.tsx | UI tests for sliders, presets, reset, and voice name display. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if (p.voiceProfileId) { | ||
| useGenerationStore.getState().setSelectedVoiceProfile(p.voiceProfileId); | ||
| } | ||
| if (p.audioInfluence !== undefined) { | ||
| useGenerationStore.getState().setAudioInfluence(p.audioInfluence); | ||
| } | ||
| if (p.styleInfluence !== undefined) { | ||
| useGenerationStore.getState().setStyleInfluence(p.styleInfluence); | ||
| } |
There was a problem hiding this comment.
When hydrating edit mode from generationParams, voice-related fields are only applied if present. If the clip has generationParams but no voiceProfileId/influence fields (e.g., older clips or clips generated without voice), the previously-selected voice/influence values in generationStore remain, causing stale UI state and persisting the wrong params on re-generate. Consider explicitly clearing selectedVoiceProfileId (and resetting influences) when p.voiceProfileId is missing, and similarly resetting influences when p.audioInfluence/p.styleInfluence are undefined.
| const handleAudioDoubleClick = useCallback( | ||
| () => setAudioInfluence(DEFAULT_AUDIO_INFLUENCE), | ||
| [setAudioInfluence], | ||
| ); | ||
|
|
||
| const handleStyleDoubleClick = useCallback( | ||
| () => setStyleInfluence(DEFAULT_STYLE_INFLUENCE), | ||
| [setStyleInfluence], | ||
| ); |
There was a problem hiding this comment.
Double-click reset currently sets sliders back to global defaults (40/60) rather than the selected voice profile's per-voice defaults (defaultAudioInfluence / defaultStyleInfluence). With per-voice defaults already modeled and loaded on selection, double-click should reset to those values (falling back to global defaults only when the profile has none) to avoid surprising behavior when a voice’s defaults differ.
| audioInfluence: profile?.defaultAudioInfluence ?? DEFAULT_AUDIO_INFLUENCE, | ||
| styleInfluence: profile?.defaultStyleInfluence ?? DEFAULT_STYLE_INFLUENCE, |
There was a problem hiding this comment.
setSelectedVoiceProfile assigns audioInfluence/styleInfluence directly from the stored VoiceProfile defaults without clamping. Since profiles may come from persisted/user-edited data, consider running clampInfluence on defaultAudioInfluence/defaultStyleInfluence here to keep generationForm invariants (0–100 ints) consistent with the setters and mapping.
| audioInfluence: profile?.defaultAudioInfluence ?? DEFAULT_AUDIO_INFLUENCE, | |
| styleInfluence: profile?.defaultStyleInfluence ?? DEFAULT_STYLE_INFLUENCE, | |
| audioInfluence: clampInfluence(profile?.defaultAudioInfluence ?? DEFAULT_AUDIO_INFLUENCE), | |
| styleInfluence: clampInfluence(profile?.defaultStyleInfluence ?? DEFAULT_STYLE_INFLUENCE), |
| export interface VoiceInfluenceApiParams { | ||
| /** Maps to CoverTaskParams.audio_cover_strength. null = no voice active. */ | ||
| audio_cover_strength: number | null; | ||
| /** Multiplier for guidance_scale. null = no voice active. */ | ||
| guidance_scale_factor: number | null; | ||
| } |
There was a problem hiding this comment.
voiceInfluenceMapping introduces guidance_scale_factor, but this field does not exist in the typed backend task params in src/types/api.ts (e.g., Text2MusicTaskParams, LegoTaskParams, CoverTaskParams). If this is intended to be sent to the backend, the API types (and the request construction code) should be updated accordingly; otherwise the mapping risks diverging from what is actually transmitted.
| /** | ||
| * Audio/Style Influence sliders for voice-conditioned generation. | ||
| * Rendered only when a voice profile is selected in the generation form. | ||
| * Follows DAW interaction patterns: double-click to reset, real-time percentage tooltip. |
There was a problem hiding this comment.
The component docstring claims a “real-time percentage tooltip”, but the UI only renders inline percentage text and no tooltip (e.g., title/popover). Either implement an actual tooltip or update the comment to match the current behavior to avoid misleading docs.
| * Follows DAW interaction patterns: double-click to reset, real-time percentage tooltip. | |
| * Follows DAW interaction patterns: double-click to reset and real-time percentage feedback. |
|
|
||
| it('renders nothing when no profiles exist', () => { | ||
| const { container } = render(<VoiceProfileSelector />); | ||
| // Should still render the selector even with no profiles (shows "No voice" only) |
There was a problem hiding this comment.
This test’s inline comment is contradictory: it says the selector should still render with no profiles, but the assertion expects no <select> and the component intentionally returns null when voiceProfiles.length === 0. Consider updating/removing the comment to match the intended behavior so the test remains self-explanatory.
| // Should still render the selector even with no profiles (shows "No voice" only) | |
| // Should not render the selector when no voice profiles are available. |
| voiceProfileId: selectedVoiceProfileId ?? undefined, | ||
| audioInfluence: selectedVoiceProfileId ? audioInfluence : undefined, | ||
| styleInfluence: selectedVoiceProfileId ? styleInfluence : undefined, |
There was a problem hiding this comment.
These new generationParams fields (voiceProfileId / audioInfluence / styleInfluence) are now persisted for edit/regenerate, but the new-generation path (generateText2Music(...)) does not carry any corresponding voice fields into the generation pipeline. That means the sliders won’t affect backend requests for new clips (and potentially not for regenerate either, depending on how regenerateClip reads params). Consider extending the generation request/pipeline to consume these params and apply voiceInfluenceMapping, otherwise the persisted fields are effectively unused.
- Clamp per-voice defaults in setSelectedVoiceProfile - Clear voice state when editing clips without voice params (stale state bug) - Double-click reset uses per-voice defaults instead of global defaults - Fix docstring: "tooltip" → "feedback" (no tooltip UI exists) - Fix contradictory test comment in VoiceProfileSelector - Clarify guidance_scale_factor as local abstraction in mapping docs - Add TODO for voice param pipeline wiring (pending backend support) https://claude.ai/code/session_01WKLnBJjQqPLrNpB8kprhzL
|
Review/verification update after rebasing onto the merged Voice Library work:
|
Summary
VoiceProfiletype and voice influence state togenerationStore(CRUD, setters, presets)VoiceProfileSelectordropdown andVoiceInfluenceControlsslider componentvoiceInfluenceMapping.ts)FullSongFormwith edit/regenerate persistence inClipGenerationParamsCloses #1095
Acceptance Criteria
Test plan
npm test— 38 new tests pass with 0 regressionshttps://claude.ai/code/session_01WKLnBJjQqPLrNpB8kprhzL