Commit

Merge pull request #124 from r9y9/enhance-docs
Add more docs
r9y9 committed Jun 26, 2022
2 parents 9b9e4fe + a06dfd7 commit a59452f
Showing 30 changed files with 795 additions and 86 deletions.
7 changes: 6 additions & 1 deletion .gitignore
@@ -22,12 +22,17 @@ recipes*/*/*/*.wav
recipes*/*/*/packed_models

# misc
aup
samples
score
notebooks
demo
docs/generated
docs/modules/generated
pretrained_models
mlruns
multirun
?.py
?.sh

# Created by https://www.gitignore.io/api/osx,vim,linux,emacs,python,visualstudiocode
# Edit at https://www.gitignore.io/?templates=osx,vim,linux,emacs,python,visualstudiocode
32 changes: 21 additions & 11 deletions docs/changelog.rst
@@ -5,26 +5,33 @@ v0.0.3 <2022-xx-xx>
-------------------

New features
^^^^^^^^^^^^^
^^^^^^^^^^^^

- Recipe-level integration of hyperparameter optimization with Optuna `#43`_ :doc:`optuna`
- Speech parameter trajectory smoothing (:cite:t:`takamichi2015naist`). Disabled by default.
- Objective metrics (such as mel-cepstrum distortion and RMSE) are now logged to tensorboard. `#41`_
- Spectrogram, aperiodicity, F0, and generated audio is now logged to tensorboard if ``train_resf0.py`` is used.
- A heuristic trick is added to prevent serious V/UV prediction errors (hardcoded for Japanese for now). `#95`_
- GAN-based post-filters (:cite:t:`Kaneko2017Interspeech`, :cite:t:`kaneko2017generative`) `#85`_
- GV post-filter (:cite:t:`silen2012ways`)
- Number of training iterations can be now specified by either epochs or steps.
- GAN-based post-filters (:cite:t:`Kaneko2017Interspeech`, :cite:t:`kaneko2017generative`) `#85`_ and GV post-filter (:cite:t:`silen2012ways`)
- Mixed precision training `#106`_
- Added VariancePredictor (:cite:t:`ren2020fastspeech`).
- Spectrogram, aperiodicity, F0, and generated audio are now logged to tensorboard if ``train_resf0.py`` is used.
- Objective metrics (such as mel-cepstrum distortion and RMSE) are now logged to tensorboard. `#41`_
- Added MDNv2 (MDN + dropout) `#118`_
- Correct V/UV (``correct_vuv``) option is added to feature processing.

Bug fixes
~~~~~~~~~
^^^^^^^^^

- Add a heuristic trick to prevent non-positive durations at synthesis time

Improvements
^^^^^^^^^^^^

- ``nnsvs.model.MDN`` now supports dropout via the ``dropout`` argument. The ``dropout`` argument existed before, but it was a no-op for a long time.
- Number of training iterations can now be specified by either epochs or steps.
- A heuristic trick is added to prevent serious V/UV prediction errors. `#95`_ `#119`_
- Speech parameter trajectory smoothing (:cite:t:`takamichi2015naist`). Disabled by default.
- Added recipe tests on CI `#116`_

Deprecations
^^^^^^^^^^^^^
^^^^^^^^^^^^

- ``dropout`` for ``nnsvs.model.MDN`` is deprecated. Please consider removing the parameter as it has no effect.
- ``dropout`` for ``nnsvs.model.Conv1dResnet`` is deprecated. Please consider removing the parameter as it has no effect.
@@ -48,7 +55,7 @@ Some features that are available but not yet tested or documented
- WaveNet `#100`_
- GAN-based acoustic models `#85`_

v0.0.2 (2022-04-29)
v0.0.2 <2022-04-29>
-------------------

A version that should work with `ENUNU v0.4.0 <https://github.com/oatsu-gh/ENUNU/releases/tag/v0.4.0>`_
@@ -83,3 +90,6 @@ PyPi release is also available. So you can install the core library by pip insta
.. _#95: https://github.com/r9y9/nnsvs/issues/95
.. _#100: https://github.com/r9y9/nnsvs/issues/100
.. _#106: https://github.com/r9y9/nnsvs/issues/106
.. _#116: https://github.com/r9y9/nnsvs/pull/116
.. _#118: https://github.com/r9y9/nnsvs/pull/118
.. _#119: https://github.com/r9y9/nnsvs/pull/119
109 changes: 109 additions & 0 deletions docs/custom_models.rst
@@ -0,0 +1,109 @@

Defining your custom model
==========================

*Your PyTorch models can be used with NNSVS.*

NNSVS allows you to define your custom model easily. To use a custom model with NNSVS, implement it by inheriting from the :class:`nnsvs.base.BaseModel` class.

Write your PyTorch model
------------------------

.. note::

    If you are not familiar with PyTorch, please check the `PyTorch documentation <https://pytorch.org/>`_ first.


The simplest example is shown below.

.. code-block:: python

    from nnsvs.base import BaseModel
    from torch import nn


    class MyModel(BaseModel):
        """My awesome neural network

        Args:
            in_dim (int): input dimension
            hidden_dim (int): hidden dimension
            out_dim (int): output dimension
        """

        def __init__(self, in_dim, hidden_dim, out_dim):
            super().__init__()
            self.model = nn.Sequential(
                nn.Linear(in_dim, hidden_dim),
                nn.ReLU(),
                nn.Linear(hidden_dim, out_dim),
            )

        def forward(self, x, lengths=None, y=None):
            """Forward pass

            Args:
                x (torch.Tensor): input tensor
                lengths (torch.Tensor): input sequence lengths
                y (torch.Tensor): target tensor (optional)

            Returns:
                torch.Tensor: output tensor
            """
            return self.model(x)

The above is a toy example that defines a simple two-layer feed-forward network with a ReLU activation. ``lengths`` and ``y`` are optional arguments.
The model name, number of arguments, and model architecture are fully customizable.

If you follow the :class:`nnsvs.base.BaseModel` interface, your model can be used as time-lag/duration/acoustic models.


Specify your model in model configs
-----------------------------------

Once you implement your model, you can use it by changing your model config like this:

.. code-block:: yaml

    netG:
      _target_: ${path.to.your.model}
      # the following are arguments passed to your model's __init__ method
      in_dim: 331
      hidden_dim: 32
      out_dim: 1

Note that your model must be in the ``PYTHONPATH``. If you edit ``nnsvs/model.py`` directly, you can specify your new model as:

.. code-block:: yaml

    netG:
      _target_: nnsvs.model.MyModel
      # the following are arguments passed to your model's __init__ method
      in_dim: 331
      hidden_dim: 32
      out_dim: 1

If you add a new file at ``nnsvs/test.py`` for example, you can refer to your model by:

.. code-block:: yaml

    netG:
      _target_: nnsvs.test.MyModel
      # the following are arguments passed to your model's __init__ method
      in_dim: 331
      hidden_dim: 32
      out_dim: 1

That's it.
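
For intuition, resolving a ``_target_`` entry (as in Hydra-style configs) roughly amounts to importing a module by its dotted path and calling the named class with the remaining keys as keyword arguments. The sketch below mimics that with the standard library only. It is not the actual implementation used by NNSVS or Hydra; ``resolve_target`` and ``build_from_config`` are hypothetical helper names, and ``fractions.Fraction`` is only a stand-in target. It also shows why the module must be importable, i.e. on the ``PYTHONPATH``.

```python
import importlib


def resolve_target(path):
    """Import and return the object named by a dotted path.

    Args:
        path (str): dotted path such as ``collections.OrderedDict``.

    Returns:
        object: the attribute found at the end of the path.
    """
    module_name, _, attr_name = path.rpartition(".")
    if not module_name:
        raise ValueError(f"expected a dotted path, got {path!r}")
    # Raises ModuleNotFoundError if the module cannot be imported,
    # for example because its directory is not on PYTHONPATH.
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


def build_from_config(config):
    """Call the ``_target_`` with the remaining keys as keyword arguments."""
    kwargs = {k: v for k, v in config.items() if k != "_target_"}
    return resolve_target(config["_target_"])(**kwargs)


# Stand-in target from the standard library; with NNSVS the path would be
# something like "nnsvs.model.MyModel" with in_dim/hidden_dim/out_dim.
frac = build_from_config(
    {"_target_": "fractions.Fraction", "numerator": 1, "denominator": 3}
)
print(frac)  # 1/3
```

If the dotted path points at a module that Python cannot import, the lookup fails before your model's ``__init__`` is ever called, so an import error at this stage usually means a path or ``PYTHONPATH`` problem rather than a bug in the model.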

Available model types
---------------------

To see which models are already implemented and which are missing, please check the following docs:

- Generic models: :doc:`modules/model`
- Acoustic models: :doc:`modules/acoustic_models`
- Post-filters: :doc:`modules/postfilters`

If you find your model works well, please feel free to make pull requests to the NNSVS repository.
4 changes: 0 additions & 4 deletions docs/demo_server.rst

This file was deleted.

27 changes: 19 additions & 8 deletions docs/devdocs.rst
@@ -1,36 +1,42 @@
Notes for developers
====================
Development guide
=================

This page summarizes docs for developers of NNSVS. If you want to contribute to NNSVS itself, please check the document below.

Installation
---------------
------------

It is recommended to install full requirements with editiable mode (``-e`` with pip) enabled:
For development purposes, it is recommended to install the full requirements in editable mode (``-e`` with pip):

.. code::
pip install -e ".[dev,lint.test,docs]"
pip install -e ".[dev,lint,test,docs]"
This makes your local changes available to your Python environment without manually re-installing NNSVS.


Repository structure
---------------------

Here's the list of important components of the NNSVS repository:

- ``nnsvs``: The core Python library. Neural network implementations for SVS systems can be found here.
- ``recipes``: Recipes. The recipes are written mostly in bash and YAML-style configs. Some recipes use small Python scripts.
- ``docs``: Documentation. It is built with `Sphinx <https://www.sphinx-doc.org/>`_.
- ``notebooks``: Jupyter notebooks. Notebooks are helpful for interactive debugging and development.
- ``utils``: Utility scripts
- ``utils``: Utility scripts that are used by the recipes.
- ``tests``: Tests

Python docstring style
----------------------

NNSVS follows the Google's style: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html
NNSVS follows the `Google's style <https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html>`_.
If you write docstrings for your new functionality, please follow the same style.
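
As an illustration of that style, here is a small function documented with ``Args``, ``Returns``, and ``Raises`` sections. ``frames_to_seconds`` is a hypothetical helper written for this example and is not part of NNSVS; the 5 ms default frame shift is likewise an assumption.

```python
def frames_to_seconds(num_frames, frame_shift_ms=5.0):
    """Convert a frame count to a duration in seconds.

    Args:
        num_frames (int): number of frames.
        frame_shift_ms (float): frame shift in milliseconds.

    Returns:
        float: duration in seconds.

    Raises:
        ValueError: if ``num_frames`` is negative.
    """
    if num_frames < 0:
        raise ValueError("num_frames must be non-negative")
    return num_frames * frame_shift_ms / 1000.0


print(frames_to_seconds(200))  # 1.0
```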

Formatting and linting
----------------------

https://github.com/pfnet/pysen is used for formatting and linting.
https://github.com/pfnet/pysen is used for formatting and linting. Please run the following commands when you make a PR.

Formatting
^^^^^^^^^^^
@@ -49,10 +55,15 @@ Linting
Tests
-----

To prevent unintentional bugs, it is better to write as many tests as possible. If you propose a new function, please consider writing tests for it.
You can run the tests by the following command:

.. code::

    pytest -v -s

Please make sure tests are all passing before making a PR.
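
A minimal sketch of what such a test might look like is given below. ``clip_durations`` is a hypothetical helper defined here only for illustration (it is not an NNSVS function). pytest collects functions whose names start with ``test_`` from files named ``test_*.py``, so a file like this placed under ``tests`` would be picked up by the command above.

```python
def clip_durations(durations, min_frames=1):
    """Clamp durations so that every entry is at least ``min_frames``.

    Args:
        durations (list): predicted durations in frames.
        min_frames (int): smallest allowed duration.

    Returns:
        list: integer durations, each at least ``min_frames``.
    """
    return [max(int(d), min_frames) for d in durations]


def test_clip_durations_are_positive():
    # Zero and negative predictions are raised to the minimum.
    assert clip_durations([3, 0, -2]) == [3, 1, 1]


def test_clip_durations_keeps_valid_values():
    # Values that are already valid pass through unchanged.
    assert clip_durations([5, 7]) == [5, 7]
```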

Building docs locally
---------------------

56 changes: 37 additions & 19 deletions docs/index.rst
@@ -24,6 +24,11 @@ Audio samples

Samples by r9y9: https://soundcloud.com/r9y9/sets/dnn-based-singing-voice

Online demo
-----------

https://share.streamlit.io/r9y9/nnsvs/streamlit_demo/app.py

Selected videos
---------------

@@ -42,38 +47,51 @@ You can find more from the NNSVS/ENUNU community: `YouTube <https://www.youtube.

notebooks/Demos
notebooks/NNSVS_vs_Sinsy
demo_server

.. toctree::
:maxdepth: 1
:caption: Notes
:caption: Guides

installation
overview
recipes
custom_models
devdocs
update_guide

.. toctree::
:maxdepth: 1
:caption: Advanced guides

optuna
train_postfilters
train_vocoders


.. toctree::
:maxdepth: 1
:caption: Notes

overview
tips
update_guide
devdocs

.. toctree::
:maxdepth: 1
:caption: Package reference

pretrained
svs
base
model
acoustic_models
postfilters
discriminators
dsp
gen
mdn
pitch
multistream
util
train_util
modules/base
modules/model
modules/acoustic_models
modules/postfilters
modules/discriminators
modules/pretrained
modules/svs
modules/dsp
modules/gen
modules/mdn
modules/pitch
modules/multistream
modules/util
modules/train_util

.. toctree::
:maxdepth: 1
25 changes: 19 additions & 6 deletions docs/installation.rst
@@ -15,7 +15,6 @@ It is strongly recommended to use Linux for development purposes.
C/C++ compiler
---------------


You need to install a C/C++ compiler in advance. You can use `GCC <https://gcc.gnu.org/>`_, `Clang <https://clang.llvm.org/>`_, `Visual Studio <https://visualstudio.microsoft.com/>`_, or `MinGW <https://mingw.org/>`_.

For Linux/Mac OS X users, it is likely that you already have a C/C++ compiler installed. For Windows users, you need to install Visual Studio with C++ compiler support.
@@ -36,8 +35,8 @@ Python
Python 3.7 or later.
Because NNSVS is written with `PyTorch <https://pytorch.org/>`_, it is recommended to check the PyTorch installation before testing NNSVS.
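
One possible way to run that check from Python, using only the standard library, is sketched below. ``check_environment`` is a hypothetical helper name; it only tests whether the interpreter is recent enough and whether a package named ``torch`` can be found, not whether PyTorch itself works correctly.

```python
import importlib.util
import sys


def check_environment(min_version=(3, 7), package="torch"):
    """Check the Python version and whether a package can be found.

    Args:
        min_version (tuple): minimum required (major, minor) version.
        package (str): top-level package name to look for.

    Returns:
        tuple: ``(python_ok, package_found)`` as booleans.
    """
    python_ok = sys.version_info >= min_version
    # find_spec returns None when a top-level package is not installed.
    package_found = importlib.util.find_spec(package) is not None
    return python_ok, package_found


python_ok, torch_found = check_environment()
print(f"Python >= 3.7: {python_ok}")
print(f"torch found: {torch_found}")
```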

Installation
------------
Installation commands
---------------------

Once the above setup is done, you can install NNSVS as follows.

@@ -47,9 +46,9 @@ For development
.. code::
git clone https://github.com/r9y9/nnsvs.git && cd nnsvs
pip install -e ".[lint.test]"
pip install -e ".[dev,lint,test]"
Note: adding ``[lint,test]`` to the end of the command above will install test/lint requirements as well.
Note: adding ``[dev,lint,test]`` to the end of the command above will install dev/test/lint requirements as well.

For inference only
^^^^^^^^^^^^^^^^^^
@@ -58,4 +57,18 @@ For inference only
pip install nnsvs
If you don't need to train your models by yourself (I guess it's unlikely though), this should be enough.
If you don't need to train your models by yourself (I guess it's unlikely though), this should be enough.


Google Colab
^^^^^^^^^^^^

If you are on Google Colab, you may want to copy the following command into a cell.

.. code-block::

    %%capture
    try:
        import nnsvs
    except ImportError:
        ! pip install git+https://github.com/r9y9/nnsvs
