Merge pull request #7 from idiap/dev

v0.23.0
idiap · Apr 18, 2024 · 45abf5a
2 parents 3327b47 + 5527f70
commit 45abf5a
Showing 102 changed files with 801 additions and 540 deletions.
diff --git a/.github/workflows/docker.yaml b/.github/workflows/docker.yaml
@@ -10,15 +10,15 @@ on:
 jobs:
   docker-build:
     name: "Build and push Docker image"
-    runs-on: ubuntu-20.04
+    runs-on: ubuntu-latest
     strategy:
       matrix:
         arch: ["amd64"]
         base:
         - "nvidia/cuda:11.8.0-base-ubuntu22.04" # GPU enabled
         - "python:3.10.8-slim" # CPU only
     steps:
-      - uses: actions/checkout@v2
+      - uses: actions/checkout@v4
       - name: Log in to the Container registry
         uses: docker/login-action@v1
         with:
@@ -29,11 +29,11 @@ jobs:
         id: compute-tag
         run: |
           set -ex
-          base="ghcr.io/coqui-ai/tts"
+          base="ghcr.io/idiap/coqui-tts"
           tags="" # PR build
 
           if [[ ${{ matrix.base }} = "python:3.10.8-slim" ]]; then
-            base="ghcr.io/coqui-ai/tts-cpu"
+            base="ghcr.io/idiap/coqui-tts-cpu"
           fi
 
           if [[ "${{ startsWith(github.ref, 'refs/heads/') }}" = "true" ]]; then

diff --git a/.github/workflows/pypi-release.yml b/.github/workflows/pypi-release.yml
@@ -8,7 +8,7 @@ defaults:
       bash
 jobs:
   build-sdist:
-    runs-on: ubuntu-20.04
+    runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v4
       - name: Verify tag matches version
@@ -33,7 +33,7 @@ jobs:
           name: sdist
           path: dist/*.tar.gz
   build-wheels:
-    runs-on: ubuntu-20.04
+    runs-on: ubuntu-latest
     strategy:
       matrix:
         python-version: ["3.9", "3.10", "3.11"]
@@ -55,7 +55,7 @@ jobs:
           name: wheel-${{ matrix.python-version }}
           path: dist/*-manylinux*.whl
   publish-artifacts:
-    runs-on: ubuntu-20.04
+    runs-on: ubuntu-latest
     needs: [build-sdist, build-wheels]
     environment:
       name: release

diff --git a/.github/workflows/style_check.yml b/.github/workflows/style_check.yml
@@ -15,9 +15,9 @@ jobs:
         python-version: [3.9]
         experimental: [false]
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v4
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v5
         with:
           python-version: ${{ matrix.python-version }}
           architecture: x64

diff --git a/CITATION.cff b/CITATION.cff
@@ -10,8 +10,8 @@ authors:
 version: 1.4
 doi: 10.5281/zenodo.6334862
 license: "MPL-2.0"
-url: "https://github.com/eginhard/coqui-tts"
-repository-code: "https://github.com/eginhard/coqui-tts"
+url: "https://github.com/idiap/coqui-ai-TTS"
+repository-code: "https://github.com/idiap/coqui-ai-TTS"
 keywords:
   - machine learning
   - deep learning

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -2,7 +2,7 @@
 
 Welcome to the 🐸TTS!
 
-This repository is governed by [the Contributor Covenant Code of Conduct](https://github.com/eginhard/coqui-tts/blob/main/CODE_OF_CONDUCT.md).
+This repository is governed by [the Contributor Covenant Code of Conduct](https://github.com/idiap/coqui-ai-TTS/blob/main/CODE_OF_CONDUCT.md).
 
 ## Where to start.
 We welcome everyone who likes to contribute to 🐸TTS.
@@ -15,13 +15,13 @@ If you like to contribute code, squash a bug but if you don't know where to star
 
     You can pick something out of our road map. We keep the progess of the project in this simple issue thread. It has new model proposals or developmental updates etc.
 
-- [Github Issues Tracker](https://github.com/eginhard/coqui-tts/issues)
+- [Github Issues Tracker](https://github.com/idiap/coqui-ai-TTS/issues)
 
     This is a place to find feature requests, bugs.
 
     Issues with the ```good first issue``` tag are good place for beginners to take on.
 
-- ✨**PR**✨ [pages](https://github.com/eginhard/coqui-tts/pulls) with the ```🚀new version``` tag.
+- ✨**PR**✨ [pages](https://github.com/idiap/coqui-ai-TTS/pulls) with the ```🚀new version``` tag.
 
     We list all the target improvements for the next version. You can pick one of them and start contributing.
 
@@ -46,14 +46,14 @@ Let us know if you encounter a problem along the way.
 
 The following steps are tested on an Ubuntu system.
 
-1. Fork 🐸TTS[https://github.com/eginhard/coqui-tts] by clicking the fork button at the top right corner of the project page.
+1. Fork 🐸TTS[https://github.com/idiap/coqui-ai-TTS] by clicking the fork button at the top right corner of the project page.
 
 2. Clone 🐸TTS and add the main repo as a new remote named ```upstream```.
 
     ```bash
-    $ git clone git@github.com:<your Github name>/coqui-tts.git
-    $ cd coqui-tts
-    $ git remote add upstream https://github.com/eginhard/coqui-tts.git
+    $ git clone git@github.com:<your Github name>/coqui-ai-TTS.git
+    $ cd coqui-ai-TTS
+    $ git remote add upstream https://github.com/idiap/coqui-ai-TTS.git
     ```
 
 3. Install 🐸TTS for development.
@@ -124,22 +124,22 @@ The following steps are tested on an Ubuntu system.
 
 13. Let's discuss until it is perfect. 💪
 
-    We might ask you for certain changes that would appear in the ✨**PR**✨'s page under 🐸TTS[https://github.com/eginhard/coqui-tts/pulls].
+    We might ask you for certain changes that would appear in the ✨**PR**✨'s page under 🐸TTS[https://github.com/idiap/coqui-ai-TTS/pulls].
 
 14. Once things look perfect, We merge it to the ```dev``` branch and make it ready for the next version.
 
 ## Development in Docker container
 
 If you prefer working within a Docker container as your development environment, you can do the following:
 
-1. Fork 🐸TTS[https://github.com/eginhard/coqui-tts] by clicking the fork button at the top right corner of the project page.
+1. Fork 🐸TTS[https://github.com/idiap/coqui-ai-TTS] by clicking the fork button at the top right corner of the project page.
 
 2. Clone 🐸TTS and add the main repo as a new remote named ```upsteam```.
 
     ```bash
-    $ git clone git@github.com:<your Github name>/coqui-tts.git
-    $ cd coqui-tts
-    $ git remote add upstream https://github.com/eginhard/coqui-tts.git
+    $ git clone git@github.com:<your Github name>/coqui-ai-TTS.git
+    $ cd coqui-ai-TTS
+    $ git remote add upstream https://github.com/idiap/coqui-ai-TTS.git
     ```
 
 3. Build the Docker Image as your development environment (it installs all of the dependencies for you):

diff --git a/README.md b/README.md
@@ -1,7 +1,8 @@
 
-## 🐸Coqui.ai News
+## 🐸Coqui TTS News
+- 📣 Fork of the [original, unmaintained repository](https://github.com/coqui-ai/TTS). New PyPI package: [coqui-tts](https://pypi.org/project/coqui-tts)
 - 📣 ⓍTTSv2 is here with 16 languages and better performance across the board.
-- 📣 ⓍTTS fine-tuning code is out. Check the [example recipes](https://github.com/eginhard/coqui-tts/tree/dev/recipes/ljspeech).
+- 📣 ⓍTTS fine-tuning code is out. Check the [example recipes](https://github.com/idiap/coqui-ai-TTS/tree/dev/recipes/ljspeech).
 - 📣 ⓍTTS can now stream with <200ms latency.
 - 📣 ⓍTTS, our production TTS model that can speak 13 languages, is released [Blog Post](https://coqui.ai/blog/tts/open_xtts), [Demo](https://huggingface.co/spaces/coqui/xtts), [Docs](https://coqui-tts.readthedocs.io/en/dev/models/xtts.html)
 - 📣 [🐶Bark](https://github.com/suno-ai/bark) is now available for inference with unconstrained voice cloning. [Docs](https://coqui-tts.readthedocs.io/en/dev/models/bark.html)
@@ -11,7 +12,7 @@
 <div align="center">
 <img src="https://static.scarf.sh/a.png?x-pxid=cf317fe7-2188-4721-bc01-124bb5d5dbb2" />
 
-## <img src="https://raw.githubusercontent.com/eginhard/coqui-tts/main/images/coqui-log-green-TTS.png" height="56"/>
+## <img src="https://raw.githubusercontent.com/idiap/coqui-ai-TTS/main/images/coqui-log-green-TTS.png" height="56"/>
 
 
 **🐸TTS is a library for advanced Text-to-Speech generation.**
@@ -25,14 +26,14 @@ ______________________________________________________________________
 
 [![Discord](https://img.shields.io/discord/1037326658807533628?color=%239B59B6&label=chat%20on%20discord)](https://discord.gg/5eXr5seRrv)
 [![License](<https://img.shields.io/badge/License-MPL%202.0-brightgreen.svg>)](https://opensource.org/licenses/MPL-2.0)
-[![PyPI version](https://badge.fury.io/py/TTS.svg)](https://badge.fury.io/py/TTS)
-[![Covenant](https://camo.githubusercontent.com/7d620efaa3eac1c5b060ece5d6aacfcc8b81a74a04d05cd0398689c01c4463bb/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f6e7472696275746f72253230436f76656e616e742d76322e3025323061646f707465642d6666363962342e737667)](https://github.com/eginhard/coqui-tts/blob/main/CODE_OF_CONDUCT.md)
-[![Downloads](https://pepy.tech/badge/tts)](https://pepy.tech/project/tts)
+[![PyPI version](https://badge.fury.io/py/coqui-tts.svg)](https://badge.fury.io/py/coqui-tts)
+[![Covenant](https://camo.githubusercontent.com/7d620efaa3eac1c5b060ece5d6aacfcc8b81a74a04d05cd0398689c01c4463bb/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f436f6e7472696275746f72253230436f76656e616e742d76322e3025323061646f707465642d6666363962342e737667)](https://github.com/idiap/coqui-ai-TTS/blob/main/CODE_OF_CONDUCT.md)
+[![Downloads](https://pepy.tech/badge/coqui-tts)](https://pepy.tech/project/coqui-tts)
 [![DOI](https://zenodo.org/badge/265612440.svg)](https://zenodo.org/badge/latestdoi/265612440)
 
-![GithubActions](https://github.com/eginhard/coqui-tts/actions/workflows/tests.yml/badge.svg)
-![GithubActions](https://github.com/eginhard/coqui-tts/actions/workflows/docker.yaml/badge.svg)
-![GithubActions](https://github.com/eginhard/coqui-tts/actions/workflows/style_check.yml/badge.svg)
+![GithubActions](https://github.com/idiap/coqui-ai-TTS/actions/workflows/tests.yml/badge.svg)
+![GithubActions](https://github.com/idiap/coqui-ai-TTS/actions/workflows/docker.yaml/badge.svg)
+![GithubActions](https://github.com/idiap/coqui-ai-TTS/actions/workflows/style_check.yml/badge.svg)
 [![Docs](<https://readthedocs.org/projects/coqui-tts/badge/?version=latest&style=plastic>)](https://coqui-tts.readthedocs.io/en/latest/)
 
 </div>
@@ -49,8 +50,8 @@ Please use our dedicated channels for questions and discussion. Help is much mor
 | 👩‍💻 **Usage Questions**          | [GitHub Discussions]                    |
 | 🗯 **General Discussion**       | [GitHub Discussions] or [Discord]   |
 
-[github issue tracker]: https://github.com/eginhard/coqui-tts/issues
-[github discussions]: https://github.com/eginhard/coqui-tts/discussions
+[github issue tracker]: https://github.com/idiap/coqui-ai-TTS/issues
+[github discussions]: https://github.com/idiap/coqui-ai-TTS/discussions
 [discord]: https://discord.gg/5eXr5seRrv
 [Tutorials and Examples]: https://github.com/coqui-ai/TTS/wiki/TTS-Notebooks-and-Tutorials
 
@@ -59,10 +60,10 @@ Please use our dedicated channels for questions and discussion. Help is much mor
 | Type                            | Links                               |
 | ------------------------------- | --------------------------------------- |
 | 💼 **Documentation**              | [ReadTheDocs](https://coqui-tts.readthedocs.io/en/latest/)
-| 💾 **Installation**               | [TTS/README.md](https://github.com/eginhard/coqui-tts/tree/dev#installation)|
-| 👩‍💻 **Contributing**               | [CONTRIBUTING.md](https://github.com/eginhard/coqui-tts/blob/main/CONTRIBUTING.md)|
+| 💾 **Installation**               | [TTS/README.md](https://github.com/idiap/coqui-ai-TTS/tree/dev#installation)|
+| 👩‍💻 **Contributing**               | [CONTRIBUTING.md](https://github.com/idiap/coqui-ai-TTS/blob/main/CONTRIBUTING.md)|
 | 📌 **Road Map**                   | [Main Development Plans](https://github.com/coqui-ai/TTS/issues/378)
-| 🚀 **Released Models**            | [Standard models](https://github.com/eginhard/coqui-tts/blob/dev/TTS/.models.json) and [Fairseq models in ~1100 languages](https://github.com/eginhard/coqui-tts#example-text-to-speech-using-fairseq-models-in-1100-languages-)|
+| 🚀 **Released Models**            | [Standard models](https://github.com/idiap/coqui-ai-TTS/blob/dev/TTS/.models.json) and [Fairseq models in ~1100 languages](https://github.com/idiap/coqui-ai-TTS#example-text-to-speech-using-fairseq-models-in-1100-languages-)|
 | 📰 **Papers**                    | [TTS Papers](https://github.com/erogol/TTS-papers)|
 
 ## Features
@@ -130,7 +131,7 @@ Please use our dedicated channels for questions and discussion. Help is much mor
 You can also help us implement more models.
 
 ## Installation
-🐸TTS is tested on Ubuntu 18.04 with **python >= 3.9, < 3.12.**.
+🐸TTS is tested on Ubuntu 22.04 with **python >= 3.9, < 3.12.**.
 
 If you are only interested in [synthesizing speech](https://coqui-tts.readthedocs.io/en/latest/inference.html) with the released 🐸TTS models, installing from PyPI is the easiest option.
 
@@ -141,7 +142,7 @@ pip install coqui-tts
 If you plan to code or train models, clone 🐸TTS and install it locally.
 
 ```bash
-git clone https://github.com/eginhard/coqui-tts
+git clone https://github.com/idiap/coqui-ai-TTS
 pip install -e .[all,dev,notebooks,server]  # Select the relevant extras
 ```
 

diff --git a/TTS/VERSION b/TTS/VERSION
@@ -1 +1 @@
-0.22.1
+0.23.0
diff --git a/TTS/api.py b/TTS/api.py
@@ -1,3 +1,4 @@
+import logging
 import tempfile
 import warnings
 from pathlib import Path
@@ -9,6 +10,8 @@
 from TTS.utils.manage import ModelManager
 from TTS.utils.synthesizer import Synthesizer
 
+logger = logging.getLogger(__name__)
+
 
 class TTS(nn.Module):
     """TODO: Add voice conversion and Capacitron support."""
@@ -59,7 +62,7 @@ def __init__(
             gpu (bool, optional): Enable/disable GPU. Some models might be too slow on CPU. Defaults to False.
         """
         super().__init__()
-        self.manager = ModelManager(models_file=self.get_models_file_path(), progress_bar=progress_bar, verbose=False)
+        self.manager = ModelManager(models_file=self.get_models_file_path(), progress_bar=progress_bar)
         self.config = load_config(config_path) if config_path else None
         self.synthesizer = None
         self.voice_converter = None
@@ -122,7 +125,7 @@ def get_models_file_path():
 
     @staticmethod
     def list_models():
-        return ModelManager(models_file=TTS.get_models_file_path(), progress_bar=False, verbose=False).list_models()
+        return ModelManager(models_file=TTS.get_models_file_path(), progress_bar=False).list_models()
 
     def download_model_by_name(self, model_name: str):
         model_path, config_path, model_item = self.manager.download_model(model_name)
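
Note on the logging change above: the `verbose=False` arguments are dropped in favour of a module-level `logging.getLogger(__name__)`, and the scripts under `TTS/bin` call `setup_logger("TTS", ...)` at their entry point. The internals of `setup_logger` and `ConsoleFormatter` are not shown in this diff, so the following is only a minimal stdlib sketch of the pattern. It assumes a hypothetical helper `configure_package_logging` and a made-up logger name `TTS.demo`; neither is part of the library.

```python
import io
import logging


def configure_package_logging(name, level=logging.INFO, stream=None):
    """Hypothetical stdlib-only stand-in for the setup_logger call in the diff."""
    package_logger = logging.getLogger(name)
    package_logger.setLevel(level)
    handler = logging.StreamHandler(stream)  # stream=None falls back to sys.stderr
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    package_logger.addHandler(handler)
    return package_logger


# Library side: a module-level logger with no handlers of its own, as added
# to TTS/api.py. The name "TTS.demo" is invented for this sketch.
logger = logging.getLogger("TTS.demo")

# Script side: configure the parent "TTS" logger once at the entry point.
buffer = io.StringIO()
configure_package_logging("TTS", level=logging.INFO, stream=buffer)

logger.info("model downloaded")  # emitted: INFO passes the level set on "TTS"
logger.debug("cache lookup")     # dropped: DEBUG is below the inherited INFO level

output = buffer.getvalue()
```

Because child loggers inherit the effective level and propagate records to the handlers of their parent, library modules need no verbosity flag; the calling script decides what is shown.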

diff --git a/TTS/bin/compute_attention_masks.py b/TTS/bin/compute_attention_masks.py
@@ -1,5 +1,6 @@
 import argparse
 import importlib
+import logging
 import os
 from argparse import RawTextHelpFormatter
 
@@ -13,9 +14,12 @@
 from TTS.tts.models import setup_model
 from TTS.tts.utils.text.characters import make_symbols, phonemes, symbols
 from TTS.utils.audio import AudioProcessor
+from TTS.utils.generic_utils import ConsoleFormatter, setup_logger
 from TTS.utils.io import load_checkpoint
 
 if __name__ == "__main__":
+    setup_logger("TTS", level=logging.INFO, screen=True, formatter=ConsoleFormatter())
+
     # pylint: disable=bad-option-value
     parser = argparse.ArgumentParser(
         description="""Extract attention masks from trained Tacotron/Tacotron2 models.

diff --git a/TTS/bin/compute_embeddings.py b/TTS/bin/compute_embeddings.py
@@ -1,4 +1,5 @@
 import argparse
+import logging
 import os
 from argparse import RawTextHelpFormatter
 
@@ -10,6 +11,7 @@
 from TTS.tts.datasets import load_tts_samples
 from TTS.tts.utils.managers import save_file
 from TTS.tts.utils.speakers import SpeakerManager
+from TTS.utils.generic_utils import ConsoleFormatter, setup_logger
 
 
 def compute_embeddings(
@@ -100,6 +102,8 @@ def compute_embeddings(
 
 
 if __name__ == "__main__":
+    setup_logger("TTS", level=logging.INFO, screen=True, formatter=ConsoleFormatter())
+
     parser = argparse.ArgumentParser(
         description="""Compute embedding vectors for each audio file in a dataset and store them keyed by `{dataset_name}#{file_path}` in a .pth file\n\n"""
         """

diff --git a/TTS/bin/compute_statistics.py b/TTS/bin/compute_statistics.py
@@ -3,6 +3,7 @@
 
 import argparse
 import glob
+import logging
 import os
 
 import numpy as np
@@ -12,10 +13,13 @@
 from TTS.config import load_config
 from TTS.tts.datasets import load_tts_samples
 from TTS.utils.audio import AudioProcessor
+from TTS.utils.generic_utils import ConsoleFormatter, setup_logger
 
 
 def main():
     """Run preprocessing process."""
+    setup_logger("TTS", level=logging.INFO, screen=True, formatter=ConsoleFormatter())
+
     parser = argparse.ArgumentParser(description="Compute mean and variance of spectrogtram features.")
     parser.add_argument("config_path", type=str, help="TTS config file path to define audio processin parameters.")
     parser.add_argument("out_path", type=str, help="save path (directory and filename).")

diff --git a/TTS/bin/eval_encoder.py b/TTS/bin/eval_encoder.py
@@ -1,4 +1,5 @@
 import argparse
+import logging
 from argparse import RawTextHelpFormatter
 
 import torch
@@ -7,6 +8,7 @@
 from TTS.config import load_config
 from TTS.tts.datasets import load_tts_samples
 from TTS.tts.utils.speakers import SpeakerManager
+from TTS.utils.generic_utils import ConsoleFormatter, setup_logger
 
 
 def compute_encoder_accuracy(dataset_items, encoder_manager):
@@ -51,6 +53,8 @@ def compute_encoder_accuracy(dataset_items, encoder_manager):
 
 
 if __name__ == "__main__":
+    setup_logger("TTS", level=logging.INFO, screen=True, formatter=ConsoleFormatter())
+
     parser = argparse.ArgumentParser(
         description="""Compute the accuracy of the encoder.\n\n"""
         """