Merge pull request #124 from r9y9/enhance-docs

Add more docs
nnsvs · Jun 26, 2022 · a59452f
2 parents 9b9e4fe + a06dfd7

Showing 30 changed files with 795 additions and 86 deletions.
diff --git a/.gitignore b/.gitignore
@@ -22,12 +22,17 @@ recipes*/*/*/*.wav
 recipes*/*/*/packed_models
 
 # misc
+aup
+samples
+score
 notebooks
 demo
-docs/generated
+docs/modules/generated
 pretrained_models
 mlruns
 multirun
+?.py
+?.sh
 
 # Created by https://www.gitignore.io/api/osx,vim,linux,emacs,python,visualstudiocode
 # Edit at https://www.gitignore.io/?templates=osx,vim,linux,emacs,python,visualstudiocode

diff --git a/docs/changelog.rst b/docs/changelog.rst
@@ -5,26 +5,33 @@ v0.0.3 <2022-xx-xx>
 -------------------
 
 New features
-^^^^^^^^^^^^^
+^^^^^^^^^^^^
 
 - Recipe-level integration of hyperparameter optimization with Optuna `#43`_ :doc:`optuna`
-- Speech parameter trajectory smoothing (:cite:t:`takamichi2015naist`). Disabled by default.
-- Objective metrics (such as mel-cepstrum distortion and RMSE) are now logged to tensorboard. `#41`_
-- Spectrogram, aperiodicity, F0, and generated audio is now logged to tensorboard if ``train_resf0.py`` is used.
-- A heuristic trick is added to prevent serious V/UV prediction errors (hardcoded for Japanese for now). `#95`_
-- GAN-based post-filters (:cite:t:`Kaneko2017Interspeech`, :cite:t:`kaneko2017generative`) `#85`_
-- GV post-filter (:cite:t:`silen2012ways`)
-- Number of training iterations can be now specified by either epochs or steps.
+- GAN-based post-filters (:cite:t:`Kaneko2017Interspeech`, :cite:t:`kaneko2017generative`) `#85`_ and GV post-filter (:cite:t:`silen2012ways`)
 - Mixed precision training `#106`_
 - Added VariancePredictor (:cite:t:`ren2020fastspeech`).
+- Spectrogram, aperiodicity, F0, and generated audio are now logged to tensorboard if ``train_resf0.py`` is used.
+- Objective metrics (such as mel-cepstrum distortion and RMSE) are now logged to tensorboard. `#41`_
+- Added MDNv2 (MDN + dropout) `#118`_
+- Correct V/UV (``correct_vuv``) option is added to feature processing.
 
 Bug fixes
-~~~~~~~~~
+^^^^^^^^^
 
 - Add a heuristic trick to prevent non-negative durations at synthesis time
 
+Improvements
+^^^^^^^^^^^^
+
+- ``nnsvs.model.MDN`` now supports dropout via the ``dropout`` argument. The argument existed before, but it was a no-op for a long time.
+- Number of training iterations can now be specified by either epochs or steps.
+- A heuristic trick is added to prevent serious V/UV prediction errors. `#95`_ `#119`_
+- Speech parameter trajectory smoothing (:cite:t:`takamichi2015naist`). Disabled by default.
+- Added recipe tests on CI `#116`_
+
 Deprecations
-^^^^^^^^^^^^^
+^^^^^^^^^^^^
 
 - ``dropout`` for ``nnsvs.model.MDN`` is deprecated. Please consider removing the parameter as it has no effect.
 - ``dropout`` for ``nnsvs.model.Conv1dResnet`` is deprecated. Please consider removing the parameter as it has no effect.
@@ -48,7 +55,7 @@ Some features that are available but not yet tested or documented
 - WaveNet `#100`_
 - GAN-based acoustic models `#85`_
 
-v0.0.2 (2022-04-29)
+v0.0.2 <2022-04-29>
 -------------------
 
 A version that should work with `ENUNU v0.4.0 <https://github.com/oatsu-gh/ENUNU/releases/tag/v0.4.0>`_
@@ -83,3 +90,6 @@ PyPi release is also available. So you can install the core library by pip insta
 .. _#95: https://github.com/r9y9/nnsvs/issues/95
 .. _#100: https://github.com/r9y9/nnsvs/issues/100
 .. _#106: https://github.com/r9y9/nnsvs/issues/106
+.. _#116: https://github.com/r9y9/nnsvs/pull/116
+.. _#118: https://github.com/r9y9/nnsvs/pull/118
+.. _#119: https://github.com/r9y9/nnsvs/pull/119
diff --git a/docs/custom_models.rst b/docs/custom_models.rst
@@ -0,0 +1,109 @@
+
+Defining your custom model
+==========================
+
+*Your PyTorch models can be used with NNSVS.*
+
+NNSVS lets you define custom models easily. To use your own model with NNSVS, implement it by inheriting from the :class:`nnsvs.base.BaseModel` class.
+
+Write your PyTorch model
+------------------------
+
+.. note::
+
+    If you are not familiar with PyTorch, please check the `PyTorch documentation <https://pytorch.org/>`_ first.
+
+
+The simplest example is shown below.
+
+.. code-block:: python
+
+    from nnsvs.base import BaseModel
+    from torch import nn
+
+    class MyModel(BaseModel):
+        """My awesome neural network
+
+        Args:
+            in_dim (int): input dimension
+            hidden_dim (int): hidden dimension
+            out_dim (int): output dimension
+        """
+
+        def __init__(self, in_dim, hidden_dim, out_dim):
+            super().__init__()
+            self.model = nn.Sequential(
+                nn.Linear(in_dim, hidden_dim),
+                nn.ReLU(),
+                nn.Linear(hidden_dim, out_dim)
+            )
+
+        def forward(self, x, lengths=None, y=None):
+            """Forward pass
+
+            Args:
+                x (torch.Tensor): input tensor
+                lengths (torch.Tensor): input sequence lengths
+                y (torch.Tensor): target tensor (optional)
+
+            Returns:
+                torch.Tensor: output tensor
+            """
+            return self.model(x)
+
+The above is a toy example that defines a simple two-layer feed-forward neural network with a ReLU activation function. ``lengths`` and ``y`` are optional arguments.
+The model name, number of arguments, and model architecture are fully customizable.
+
+If you follow the :class:`nnsvs.base.BaseModel` interface, your model can be used as a time-lag, duration, or acoustic model.
+
+
+Specify your model in model configs
+-----------------------------------
+
+Once your model is implemented, you can use it by changing your model config as follows:
+
+
+.. code-block:: yaml
+
+    netG:
+      _target_: ${path.to.your.model}
+      # the following are arguments passed to your model's __init__ method
+      in_dim: 331
+      hidden_dim: 32
+      out_dim: 1
+
+
+Note that your model must be in the ``PYTHONPATH``. If you edit ``nnsvs/model.py`` directly, you can specify your new model as:
+
+.. code-block:: yaml
+
+    netG:
+      _target_: nnsvs.model.MyModel
+      # the following are arguments passed to your model's __init__ method
+      in_dim: 331
+      hidden_dim: 32
+      out_dim: 1
+
+If you add a new file at ``nnsvs/test.py``, for example, you can refer to your model by:
+
+.. code-block:: yaml
+
+    netG:
+      _target_: nnsvs.test.MyModel
+      # the following are arguments passed to your model's __init__ method
+      in_dim: 331
+      hidden_dim: 32
+      out_dim: 1
+
+That's it.
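
The configs above rely on the ``_target_`` key naming an importable class, which is why the model must be on the ``PYTHONPATH``. The following is a minimal sketch of that lookup, assuming the configs follow the Hydra ``_target_`` convention. The function ``instantiate_from_config`` is a hypothetical stand-in written for illustration; it is not the NNSVS or Hydra implementation, and a standard-library class is used as the target so the sketch runs without NNSVS installed.

```python
import importlib


def instantiate_from_config(config):
    """Hypothetical helper: build an object from a dict with a ``_target_`` key.

    The dotted path is split into a module part and a class part. The module
    must be importable, i.e. discoverable on the PYTHONPATH.
    """
    params = dict(config)  # copy so the caller's config is not modified
    target = params.pop("_target_")
    module_name, _, attr_name = target.rpartition(".")
    module = importlib.import_module(module_name)  # ImportError if not on the path
    cls = getattr(module, attr_name)
    # The remaining keys are passed as keyword arguments to __init__.
    return cls(**params)


# Stand-in target from the standard library; with NNSVS this would be
# something like "nnsvs.model.MyModel" with in_dim, hidden_dim, and out_dim.
config = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}
obj = instantiate_from_config(config)
print(type(obj).__name__, obj)
```

If the module part of ``_target_`` cannot be imported, the lookup fails with an ``ImportError`` before the model's ``__init__`` is ever called, so a wrong module path and a wrong constructor argument produce different errors.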
+
+Available model types
+---------------------
+
+You may want to know which models are already implemented and which are missing. Please check the following docs for the available models:
+
+- Generic models: :doc:`modules/model`
+- Acoustic models: :doc:`modules/acoustic_models`
+- Post-filters: :doc:`modules/postfilters`
+
+If you find your model works well, please feel free to make a pull request to the NNSVS repository.
diff --git a/docs/demo_server.rst b/docs/demo_server.rst
diff --git a/docs/devdocs.rst b/docs/devdocs.rst
@@ -1,36 +1,42 @@
-Notes for developers
-====================
+Development guide
+=================
 
 This page summarizes docs for developers of NNSVS. If you want to contribute to NNSVS itself, please check the document below.
 
 Installation
----------------
+------------
 
-It is recommended to install full requirements with editiable mode  (``-e`` with pip) enabled:
+For development purposes, it is recommended to install full requirements with editable mode (``-e`` with pip) enabled:
 
 .. code::
 
-   pip install -e ".[dev,lint.test,docs]"
+   pip install -e ".[dev,lint,test,docs]"
+
+This makes your local changes available in your Python environment without manually re-installing NNSVS.
 
 
 Repository structure
 ---------------------
 
+Here's the list of important components of the NNSVS repository:
+
 - ``nnsvs``: The core Python library. Neural network implementations for SVS systems can be found here.
 - ``recipes``: Recipes.  The recipes are written mostly in bash and YAML-style configs. Some recipes use small Python scripts.
 - ``docs``: Documentation. It is built with `Sphinx <https://www.sphinx-doc.org/>`_.
 - ``notebooks``: Jupyter notebooks. Notebooks are helpful for interactive debugging and development.
-- ``utils``: Utility scripts
+- ``utils``: Utility scripts that are used by the recipes.
+- ``tests``: Tests
 
 Python docstring style
 ----------------------
 
-NNSVS follows the Google's style: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html
+NNSVS follows the `Google style <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html>`_.
+If you write docstrings for new functionality, please follow the same style.
 
 Formatting and linting
 ----------------------
 
-https://github.com/pfnet/pysen is used for formatting and linting.
+https://github.com/pfnet/pysen is used for formatting and linting. Please run the following commands when you make a PR.
 
 Formatting
 ^^^^^^^^^^^
@@ -49,10 +55,15 @@ Linting
 Tests
 -----
 
+To prevent unintentional bugs, it is best to write as many tests as possible. If you propose new functionality, please consider writing tests for it.
+You can run the tests with the following command:
+
 .. code::
 
     pytest -v -s
 
+Please make sure tests are all passing before making a PR.
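
As an illustration of the kind of test ``pytest`` collects, here is a minimal sketch. The helper ``clamp_durations`` is hypothetical and is not part of NNSVS; it only serves as something small to test. Functions whose names start with ``test_`` are discovered automatically, and plain ``assert`` statements are enough.

```python
def clamp_durations(durations, minimum=1):
    """Hypothetical helper: force every duration to be at least ``minimum`` frames."""
    return [max(int(d), minimum) for d in durations]


def test_clamp_durations_replaces_non_positive_values():
    # Zero and negative durations are raised to the minimum.
    assert clamp_durations([3, 0, -2, 5]) == [3, 1, 1, 5]


def test_clamp_durations_keeps_valid_values():
    # Durations that are already valid are left unchanged.
    assert clamp_durations([1, 2, 3]) == [1, 2, 3]


def test_clamp_durations_respects_custom_minimum():
    assert clamp_durations([1, 4], minimum=2) == [2, 4]
```

Saved under ``tests/`` in a file whose name starts with ``test_``, these would be picked up by the ``pytest -v -s`` command shown above.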
+
 Building docs locally
 ---------------------
 

diff --git a/docs/index.rst b/docs/index.rst
@@ -24,6 +24,11 @@ Audio samples
 
 Samples by r9y9: https://soundcloud.com/r9y9/sets/dnn-based-singing-voice
 
+Online demo
+-----------
+
+https://share.streamlit.io/r9y9/nnsvs/streamlit_demo/app.py
+
 Selected videos
 ---------------
 
@@ -42,38 +47,51 @@ You can find more from the NNSVS/ENUNU community: `YouTube <https://www.youtube.
 
    notebooks/Demos
    notebooks/NNSVS_vs_Sinsy
-   demo_server
 
 .. toctree::
    :maxdepth: 1
-   :caption: Notes
+   :caption: Guides
 
    installation
-   overview
    recipes
+   custom_models
+   devdocs
+   update_guide
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Advanced guides
+
    optuna
+   train_postfilters
+   train_vocoders
+
+
+.. toctree::
+   :maxdepth: 1
+   :caption: Notes
+
+   overview
    tips
-   update_guide
-   devdocs
 
 .. toctree::
    :maxdepth: 1
    :caption: Package reference
 
-   pretrained
-   svs
-   base
-   model
-   acoustic_models
-   postfilters
-   discriminators
-   dsp
-   gen
-   mdn
-   pitch
-   multistream
-   util
-   train_util
+   modules/base
+   modules/model
+   modules/acoustic_models
+   modules/postfilters
+   modules/discriminators
+   modules/pretrained
+   modules/svs
+   modules/dsp
+   modules/gen
+   modules/mdn
+   modules/pitch
+   modules/multistream
+   modules/util
+   modules/train_util
 
 .. toctree::
    :maxdepth: 1

diff --git a/docs/installation.rst b/docs/installation.rst
@@ -15,7 +15,6 @@ It is strongly recommended to use Linux for development purposes.
 C/C++ compiler
 ---------------
 
-
 You must install a C/C++ compiler in advance. You can use `GCC <https://gcc.gnu.org/>`_, `Clang <https://clang.llvm.org/>`_, `Visual Studio <https://visualstudio.microsoft.com/>`_, or `MinGW <https://mingw.org/>`_.
 
 For Linux/Mac OS X users, it is likely that you already have a C/C++ compiler installed. For Windows users, you'd need to install Visual Studio with C++ compiler support.
@@ -36,8 +35,8 @@ Python
 Python 3.7 or later.
 Because NNSVS is written with `PyTorch <https://pytorch.org/>`_, it is recommended to check the PyTorch installation before testing NNSVS.
 
-Installation
-------------
+Installation commands
+---------------------
 
 Once the above setup is done, you can install NNSVS as follows.
 
@@ -47,9 +46,9 @@ For development
 .. code::
 
    git clone https://github.com/r9y9/nnsvs.git && cd nnsvs
-   pip install -e ".[lint.test]"
+   pip install -e ".[dev,lint,test]"
 
-Note: adding ``[lint,test]`` to the end of the command above will install test/lint requirements as well.
+Note: adding ``[dev,lint,test]`` to the end of the command above will install dev/test/lint requirements as well.
 
 For inference only
 ^^^^^^^^^^^^^^^^^^
@@ -58,4 +57,18 @@ For inference only
 
    pip install nnsvs
 
-If you don't need to train your models by yourself (I guess it's unlikely though), this should be enough.
+If you don't need to train your models by yourself (I guess it's unlikely though), this should be enough.
+
+
+Google Colab
+^^^^^^^^^^^^
+
+If you are on Google Colab, you may want to copy the following commands into a cell.
+
+.. code-block::
+
+   %%capture
+   try:
+      import nnsvs
+   except ImportError:
+      ! pip install git+https://github.com/r9y9/nnsvs