First commit
lucataco committed Nov 10, 2023
0 parents commit ba40870
Showing 7 changed files with 76 additions and 0 deletions.
17 changes: 17 additions & 0 deletions .dockerignore
@@ -0,0 +1,17 @@
# The .dockerignore file excludes files from the container build process.
#
# https://docs.docker.com/engine/reference/builder/#dockerignore-file

# Exclude Git files
.git
.github
.gitignore

# Exclude Python cache files
__pycache__
.mypy_cache
.pytest_cache
.ruff_cache

# Exclude Python virtual environment
/venv
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
__pycache__
.cog
TTS
13 changes: 13 additions & 0 deletions README.md
@@ -0,0 +1,13 @@
# coqui/xtts-v2

This is an implementation of [coqui/xtts-v2](https://github.com/coqui-ai/tts) as a Cog model. [Cog packages machine learning models as standard containers.](https://github.com/replicate/cog)

Run predictions:

    cog predict -i text="Hi there, I'm your new voice clone. Try your best to upload quality audio" -i speaker_wav=@female.wav

## Example:

"Hi there, I'm your new voice clone. Try your best to upload quality audio"

![alt text](output.wav)
10 changes: 10 additions & 0 deletions cog.yaml
@@ -0,0 +1,10 @@
# Configuration for Cog ⚙️
build:
  gpu: true
  cuda: "11.8"
  python_version: "3.11"

  python_packages:
    - "git+https://github.com/coqui-ai/TTS.git"

predict: "predict.py:Predictor"
Binary file added female.wav
Binary file not shown.
Binary file added output.wav
Binary file not shown.
33 changes: 33 additions & 0 deletions predict.py
@@ -0,0 +1,33 @@
# Prediction interface for Cog
from cog import BasePredictor, Input, Path
import os
from TTS.api import TTS

class Predictor(BasePredictor):
    def setup(self) -> None:
        """Load the model into memory to make running multiple predictions efficient"""
        os.environ["COQUI_TOS_AGREED"] = "1"
        self.model = TTS("tts_models/multilingual/multi-dataset/xtts_v2").to('cuda')

    def predict(
        self,
        text: str = Input(
            description="Text to synthesize",
            default="Hi there, I'm your new voice clone. Try your best to upload quality audio"
        ),
        speaker_wav: Path = Input(description="Original speaker audio"),
        language: str = Input(
            description="Language",
            choices=["en", "es", "fr", "de", "it", "pt", "pl", "tr", "ru", "nl", "cs", "ar", "zh-cn"],
            default="en"
        ),
    ) -> Path:
        """Run a single prediction on the model"""
        path = self.model.tts_to_file(
            text=text,
            file_path="output.wav",
            speaker_wav=speaker_wav,
            language=language
        )

        return Path(path)
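
Note: the README only shows `cog predict` with the `text` and `speaker_wav` inputs, while predict.py also accepts a `language` input restricted to a fixed list of choices. The sketch below is not part of this commit; it shows one way a caller might assemble the command line for all three inputs and reject an unsupported language before invoking Cog. The helper name `build_cog_predict_command` and the constant `SUPPORTED_LANGUAGES` are hypothetical, and passing `-i language=...` assumes it follows the same `-i name=value` pattern the README uses for the other inputs.

```python
import shlex

# Language codes copied from the `choices` list in predict.py.
SUPPORTED_LANGUAGES = [
    "en", "es", "fr", "de", "it", "pt", "pl",
    "tr", "ru", "nl", "cs", "ar", "zh-cn",
]


def build_cog_predict_command(text, speaker_wav, language="en"):
    """Return the argv list for a `cog predict` call (hypothetical helper)."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return [
        "cog", "predict",
        "-i", f"text={text}",
        # The README prefixes file inputs with "@".
        "-i", f"speaker_wav=@{speaker_wav}",
        "-i", f"language={language}",
    ]


if __name__ == "__main__":
    argv = build_cog_predict_command("Hello", "female.wav", "fr")
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(argv))
```

Returning an argv list rather than a single string keeps the text input safe to pass to `subprocess.run` without shell quoting; `shlex.join` is used only for display.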
