
Fix multi sample #24

Merged
jamiepine merged 5 commits into main from fix-multi-sample
Jan 31, 2026


@jamiepine jamiepine commented Jan 31, 2026

Fix voice cloning with multiple samples falling back to the default voice

Closes #18

Summary

Fixed a critical bug where voice profiles with multiple samples would fall back to the default Qwen voice instead of using the cloned voice. The issue was caused by temporary files being deleted before generation could use them.

Changes

Backend:

  • backend/profiles.py: Changed combined audio storage from temporary files to persistent cache files (data/cache/combined_{profile_id}_{hash}.wav)
  • backend/backends/mlx_backend.py: Added validation to check cached audio files exist before use, with automatic regeneration if missing
  • backend/backends/pytorch_backend.py: Updated cache validation logic for consistency
  • backend/utils/cache.py: Added clear_voice_prompt_cache() function to clear stale cache entries
  • backend/main.py: Added POST /cache/clear endpoint for manual cache clearing

Frontend:

  • app/src/components/VoiceProfiles/SampleList.tsx: Added informational note that single 30-second samples produce optimal quality

Technical Details

When multiple samples were combined, the system created a temp file, generated a voice prompt (storing the file path), then immediately deleted the temp file. This caused the MLX backend to fail during generation since it stores file paths rather than loading audio into memory. The fix ensures combined audio persists in the cache directory.
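The persistence step described above can be sketched roughly as follows. This is a minimal illustration, not the code from backend/profiles.py: the helper names `combined_audio_path` and `save_combined_audio` are hypothetical, and hashing the ordered list of sample paths is an assumption, since the PR does not say what the `{hash}` component is derived from.

```python
import hashlib
from pathlib import Path


def combined_audio_path(cache_dir: Path, profile_id: str, sample_paths: list[str]) -> Path:
    """Build a stable cache path of the form combined_{profile_id}_{hash}.wav.

    Assumption: the hash covers the ordered sample paths, so a changed
    sample set maps to a different file name.
    """
    digest = hashlib.sha256("|".join(sample_paths).encode("utf-8")).hexdigest()[:16]
    return cache_dir / f"combined_{profile_id}_{digest}.wav"


def save_combined_audio(
    cache_dir: Path, profile_id: str, sample_paths: list[str], audio_bytes: bytes
) -> Path:
    """Write the combined audio to the cache directory and return its path.

    Unlike a temp file that is deleted right after prompt creation, this
    file stays on disk, so a backend that only stores the path can still
    open it at generation time.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = combined_audio_path(cache_dir, profile_id, sample_paths)
    if not path.exists():
        path.write_bytes(audio_bytes)
    return path
```

Because the file name is derived from the inputs, calling the helper twice with the same samples reuses the existing file rather than writing a new one.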


Note

Medium Risk
Touches core voice-cloning prompt creation/caching and adds a new cache-clearing API, so regressions could affect generation quality or cache behavior, though changes are localized and guarded with validation/fallbacks.

Overview
Fixes voice cloning for profiles with multiple samples by persisting the combined reference audio in the cache directory (instead of a temp file that could be deleted before generation) and hardening cached-prompt usage.

The MLX backend now verifies cached ref_audio paths exist (both when creating and when generating) and falls back to non-cloned generation if the referenced audio is missing; PyTorch cache handling is clarified for dict-vs-tensor prompt formats. Adds clear_voice_prompt_cache() plus a new POST /cache/clear endpoint to purge stale prompt entries.
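A minimal sketch of the existence check described above, assuming the cached prompt is a dict that holds the reference path under a `ref_audio` key. The function name `resolve_ref_audio` is hypothetical, and the actual prompt structure in the MLX backend may differ.

```python
from pathlib import Path
from typing import Optional


def resolve_ref_audio(cached_prompt: Optional[dict]) -> Optional[str]:
    """Return the cached reference-audio path only if the file still exists.

    A None result signals the caller to regenerate the prompt or fall
    back to non-cloned generation.
    """
    if not isinstance(cached_prompt, dict):
        return None
    ref_audio = cached_prompt.get("ref_audio")  # assumed key name
    if ref_audio and Path(ref_audio).is_file():
        return str(ref_audio)
    return None
```

Running the same check both when a prompt is created and again just before generation covers the case where the file is removed between the two steps.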

On the frontend, HistoryTable gains infinite scrolling with accumulated paging, loading indicators, and end-of-list messaging, and SampleList adds a short guidance note about optimal sample usage.

Written by Cursor Bugbot for commit 953e6ec.

- Updated HistoryTable to implement infinite scrolling for loading history items dynamically.
- Introduced state management for accumulated history and total item count.
- Added Intersection Observer for triggering additional data fetches when scrolling.
- Implemented cache clearing functionality in the backend to manage voice prompt caches effectively.
- Improved loading indicators and user feedback for data fetching states.
- Refactored code for better readability and maintainability.
- Rearranged import statements for consistency and clarity.
- Corrected the spelling of "interchangeable" in the note about sample quality.

@cursor cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 3 potential issues.


        cache_file.unlink()
        deleted_count += 1
    except Exception as e:
        print(f"Failed to delete cache file {cache_file}: {e}")

Cache clearing omits combined audio WAV files

Medium Severity

The new clear_voice_prompt_cache() function only deletes *.prompt files, but this PR introduces combined audio files stored as combined_{profile_id}_{hash}.wav in the same cache directory. These .wav files are never cleaned up - not when clearing the cache, not when samples are deleted, and not when profiles change. Over time, this causes orphaned files to accumulate.
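One way to address this, sketched under assumptions: clear both file patterns in a single pass. The function name `clear_cache_files` is hypothetical and is not the project's `clear_voice_prompt_cache()`. The `*.prompt` and `combined_*.wav` patterns are taken from the description above.

```python
from pathlib import Path


def clear_cache_files(
    cache_dir: Path, patterns: tuple[str, ...] = ("*.prompt", "combined_*.wav")
) -> int:
    """Delete cached prompt files and combined audio files; return the count removed."""
    deleted_count = 0
    for pattern in patterns:
        for cache_file in cache_dir.glob(pattern):
            try:
                cache_file.unlink()
                deleted_count += 1
            except OSError as e:
                # Keep going so one undeletable file does not block the rest.
                print(f"Failed to delete cache file {cache_file}: {e}")
    return deleted_count
```

Files that match neither pattern are left in place, so unrelated data in the cache directory is unaffected.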


    setPage(0);
    setAllHistory([]);
  }
}, [deleteGeneration.isSuccess, importGeneration.isSuccess]);

Deleted items briefly reappear due to race condition

Low Severity

When a deletion or import succeeds, the reset effect sets page to 0 and clears allHistory. However, because page changed, the historyData effect immediately runs with the stale historyData (refetch hasn't completed), setting allHistory back to the old items—including any just-deleted item. This causes deleted items to briefly reappear until the refetch completes.


reference_texts,
)

# Save combined audio temporarily

Profile deletion leaves orphaned combined audio cache files

Low Severity

The new combined audio files are stored in the cache directory as combined_{profile_id}_{hash}.wav, but delete_profile() only deletes the profile directory. When a profile is deleted, its combined audio cache files remain orphaned in the cache directory indefinitely. The profile_id is embedded in the filename, so cleanup is straightforward but currently absent.
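Since the profile_id is part of the file name, a per-profile cleanup could look roughly like the sketch below. The function name `clear_profile_combined_audio` is hypothetical. The sketch also assumes profile IDs never contain an underscore-delimited prefix of another ID (for example UUIDs), because otherwise the pattern for one profile would also match another profile's files.

```python
import glob
from pathlib import Path


def clear_profile_combined_audio(cache_dir: Path, profile_id: str) -> int:
    """Remove combined_{profile_id}_*.wav files for one profile; return the count removed."""
    # Escape the ID so characters such as '[' or '*' are matched literally.
    pattern = f"combined_{glob.escape(profile_id)}_*.wav"
    deleted_count = 0
    for cache_file in cache_dir.glob(pattern):
        try:
            cache_file.unlink()
            deleted_count += 1
        except OSError as e:
            print(f"Failed to delete cache file {cache_file}: {e}")
    return deleted_count
```

Calling a helper like this from the profile deletion path, and whenever samples are added or removed, would keep stale combined audio from outliving the profile state it was built from.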


- Added `clear_profile_cache` function to manage cache files for specific profiles.
- Integrated cache clearing in `add_profile_sample`, `delete_profile`, and `delete_profile_sample` functions to ensure stale audio caches are invalidated after modifications.
- Enhanced `clear_voice_prompt_cache` to also delete combined audio files, improving overall cache management.
…ents

- Implemented delete confirmation dialogs for both HistoryTable and SampleList components to enhance user experience and prevent accidental deletions.
- Added state management for handling the selected item to be deleted and the visibility of the delete dialog.
- Refactored delete handling functions to utilize the new dialog confirmation flow, improving code clarity and maintainability.
- Added a default `type` prop set to 'button' in the CircleButton component to ensure proper button behavior.
- Enhanced the component's flexibility by allowing the type to be overridden through props.
@jamiepine jamiepine merged commit 20851cc into main Jan 31, 2026
Development

Successfully merging this pull request may close these issues.

Voice profiles having multiple samples generate audio using a random voice instead