Voice Cloning #40

howardbaik · 2024-01-03T19:12:27Z

Purpose/implementation Section

What changes are being implemented in this Pull Request?

tts_coqui_vc(), a function that takes as input the text to convert to speech, a WAV audio file of the speaker whose voice is to be cloned, the speaker's language, whether to use the GPU, the version of Python to use, etc.
system_open(), a utility function that uses a system command to open audio files in a private folder. Useful for quickly opening the output of tts().

What was your approach?

tts_coqui_vc() is very similar to tts_coqui(), except that it requires you to specify the version of Python to be used by reticulate, provide a WAV audio file of the speaker, and decide whether to use the GPU. It calls the Python API of Coqui TTS through reticulate, following the sample Python code in #39.

Tell potential reviewers what kind of feedback you are soliciting.

I realize that tts(), which is a wrapper around tts_coqui_vc(), and tts_auth(), a function to check whether the Python API of TTS is properly installed, are not complete. I will loop back to them once I integrate tts_coqui_vc() into loqui-vc.

howardbaik · 2024-01-03T19:36:34Z

For this PR, I changed the base branch from main to dev after remembering these notes: jhudsl/ari#54

cansavvy

Thanks for working on this @howardbaek ! It seems good but I don't necessarily know much about what's happening. My comments are mainly asking for some clarity. Thanks!

cansavvy · 2024-01-05T13:48:36Z

R/tts.R

+
+#' @export
+#' @rdname tts
+tts_coqui_vc <- function(


@howardbaek Can you add some more comments here so I can understand what is going on? Thanks!!

cansavvy · 2024-01-05T13:48:57Z

R/tts.R

@@ -82,22 +103,17 @@ tts = function(
      bind_audio = bind_audio,
      ...)
  }
-  if (service == "microsoft") {
-    res = tts_microsoft(
+  if (service == "google") {


What's this change about?

howardbaik · 2024-01-10

This change reorders the TTS services so that the two free ones, Coqui and Coqui Voice Cloning, come first and are highlighted. I also expect Coqui to be used much more than the paid services, so I moved the two Coqui services to the front.

cansavvy · 2024-01-05T13:49:15Z

R/tts.R

+  }
+  if (service == "coqui-vc") {
+    cli::cli_alert_info("This service does not support MP3 format; will produce a WAV audio output.")
+    # TODO: Specify Python version, just as we specify path to coqui above


May want to make an issue for this as well.

howardbaik · 2024-01-10

Good idea! I created an issue for it: #41

howardbaik · 2024-01-10T20:15:31Z

> Thanks for working on this @howardbaek ! It seems good but I don't necessarily know much about what's happening. My comments are mainly asking for some clarity. Thanks!

Sorry if this was confusing and overwhelming! I have left detailed comments for you, but let me know if you have further questions!

Also, tagging @seankross to keep him in the loop.

howardbaik · 2024-01-10T20:17:49Z

R/aaa_utils.R

@@ -167,3 +167,10 @@ coqui_path_missing <- paste(
  "If you've already downloaded the software, use function",
  "'set_coqui_path(path = \"path/to/coqui/tts\")' to point R to your local coqui tts Executable File"
 )
+
+# Open private audio files
+system_open <- function(path) {


This helper function takes a file path as input and invokes the open system command to pull up the audio file (or video file) at that path.
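
A minimal sketch of what such a helper might look like, using base R only. The body of system_open() is not shown in this diff, so the names open_command() and system_open_sketch() are hypothetical, and the Linux branch (xdg-open) is an assumption; the comment above only confirms that the open command is invoked.

```r
# Hypothetical helper: choose the system command that opens a file with the
# default application. Only "open" (macOS) is confirmed by the PR comment;
# "xdg-open" for Linux is an assumption added for illustration.
open_command <- function(sysname = Sys.info()[["sysname"]]) {
  switch(sysname,
    Darwin = "open",
    Linux  = "xdg-open",
    stop("Unsupported platform: ", sysname)
  )
}

# Hypothetical stand-in for system_open(): open the audio (or video) file
# located at `path` with the platform's default application.
system_open_sketch <- function(path, sysname = Sys.info()[["sysname"]]) {
  stopifnot(is.character(path), length(path) == 1L)
  if (!file.exists(path)) {
    stop("File does not exist: ", path)
  }
  # shQuote() protects paths that contain spaces or shell metacharacters
  system2(open_command(sysname), shQuote(path))
}
```

Calling system_open_sketch() on the WAV path returned by a synthesis call would then open the file in the default audio player.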

howardbaik · 2024-01-10T21:33:17Z

R/tts.R

+    save_local_dest = NULL,
+    ...) {
+  # Specify version of Python to be used by reticulate
+  reticulate::use_python(python_version)


Select the version of Python to be used by reticulate.

howardbaik · 2024-01-10T21:35:28Z

R/tts.R

+  # Specify version of Python to be used by reticulate
+  reticulate::use_python(python_version)
+  # Import TTS
+  TTS_api <- reticulate::import("TTS.api")


Imports the api module of the Python TTS package so that it can be used from R.

howardbaik · 2024-01-10T21:43:34Z

R/tts.R

+  # Import TTS
+  TTS_api <- reticulate::import("TTS.api")
+  # Model name
+  model_name = "tts_models/multilingual/multi-dataset/xtts_v2"


Specify the name of the model. For voice cloning, we use the xtts_v2 model.

howardbaik · 2024-01-10T21:45:42Z

R/tts.R

+  # Model name
+  model_name = "tts_models/multilingual/multi-dataset/xtts_v2"
+  # TTS
+  tts <- TTS_api$TTS(model_name, gpu = gpu)


Create an instance of the TTS class from the TTS.api module (https://github.com/coqui-ai/TTS/blob/dev/TTS/api.py), passing it the model name and the gpu flag.

howardbaik · 2024-01-10T21:47:36Z

R/tts.R

+
+    res = vapply(string_processed, function(tt) {
+      output_path = tts_temp_audio("wav")
+      tts$tts_to_file(text = tt,


In this comment, I'll link the underlying Python code inside the TTS.api module.

For each piece of text, call the tts_to_file method (https://github.com/coqui-ai/TTS/blob/dev/TTS/api.py#L290) of the TTS class (https://github.com/coqui-ai/TTS/blob/dev/TTS/api.py#L15) to write the synthesized speech to output_path.
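
The steps described in the review comments above can be put together as a rough sketch. The wrapper name clone_voice_sketch() and its argument names are hypothetical. The model name, the TTS class, and tts_to_file() come from the diff, while the speaker_wav, language, and file_path arguments are assumed from the Coqui TTS Python API rather than from the lines shown here. The sketch also assumes that reticulate is installed and that the selected Python has the TTS package available.

```r
# Hypothetical wrapper illustrating the call sequence discussed in this PR.
# It is not the package's tts_coqui_vc(); names and defaults are assumptions.
clone_voice_sketch <- function(text, speaker_wav, language = "en",
                               gpu = FALSE, python = NULL,
                               output_path = tempfile(fileext = ".wav")) {
  # Fail early if the reference recording is missing
  if (!file.exists(speaker_wav)) {
    stop("Speaker WAV file not found: ", speaker_wav)
  }
  if (!requireNamespace("reticulate", quietly = TRUE)) {
    stop("The reticulate package is required.")
  }
  # use_python() expects the path to a Python binary
  if (!is.null(python)) {
    reticulate::use_python(python, required = TRUE)
  }
  # Import the api module of the Python TTS package
  TTS_api <- reticulate::import("TTS.api")
  # Voice cloning uses the xtts_v2 model
  model_name <- "tts_models/multilingual/multi-dataset/xtts_v2"
  tts <- TTS_api$TTS(model_name, gpu = gpu)
  # Synthesize `text` in the cloned voice and write it to `output_path`
  tts$tts_to_file(
    text = text,
    speaker_wav = speaker_wav,
    language = language,
    file_path = output_path
  )
  output_path
}
```

Under these assumptions, clone_voice_sketch("Hello there", speaker_wav = "speaker.wav") would return the path of the generated WAV file.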

howardbaik · 2024-01-10T21:49:05Z

R/tts.R

+      # Output file path
+      output_path
+    }, FUN.VALUE = character(1L), USE.NAMES = FALSE)
+    out = lapply(res, tts_audio_read,


The remaining lines are the same as the code inside tts_coqui().

howardbaik · 2024-02-15T21:17:02Z

@cansavvy Can I merge this PR?

cansavvy

Yes! Sorry bout that!

howardbaik added 6 commits November 27, 2023 17:36

Add more meat to tts_coqui_vc()

1fc2161

Finish writing tts_coqui_vc()

7bfafea

gitignore wav audio files

ba710b8

Start incorporating tts_coqui_vc() into tts()

aaf3f74

Add reticulate to Imports

2ea9238

Add function to open audio files

a6986ed

howardbaik requested a review from cansavvy January 3, 2024 19:12

howardbaik linked an issue Jan 3, 2024 that may be closed by this pull request

Voice Cloning #39

Open

WIP: tts() and tts_auth()

a66844e

howardbaik changed the base branch from main to dev January 3, 2024 19:33

howardbaik mentioned this pull request Jan 3, 2024

Voice Cloning jhudsl/ari#58

Open

howardbaik added 2 commits January 4, 2024 15:17

Add tts_coqui_vc() to NAMESPACE

619994f

Document

56ae97a

cansavvy reviewed Jan 5, 2024

howardbaik commented Jan 10, 2024

Improve comments on system_open()

8d540da

howardbaik commented Jan 10, 2024

cansavvy approved these changes Feb 15, 2024

howardbaik merged commit dc2c169 into dev Feb 15, 2024
