Paroli on Orange Pi

Streaming mode implementation of the Piper TTS system in C++ with RK3588/3566 NPU acceleration support.

How to use

These steps assume a clean Ubuntu or Debian install on an Orange Pi with an RK3588 or RK3566 SoC (Orange Pi 5 series or Orange Pi 3B).

First install the RKNPU library. The fastest way is the ezrknn-toolkit2 installer by Pelochus:

git clone https://github.com/Pelochus/ezrknn-toolkit2
cd ezrknn-toolkit2
sudo bash install.sh

Then install git-lfs, which is needed to clone the models because they are large files.

(. /etc/lsb-release && curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo env os=ubuntu dist="${DISTRIB_CODENAME}" bash)
sudo apt-get install git-lfs

Install some dependencies

sudo apt install -y cmake build-essential
sudo apt install -y libxtensor-dev nlohmann-json3-dev libspdlog-dev libopus-dev libfmt-dev libjsoncpp-dev
sudo apt install -y espeak-ng libespeak-ng-dev libogg-dev libsoxr-dev

Install drogon

cd ~
sudo apt install libssl-dev pkg-config libbotan-2-dev libc-ares-dev uuid-dev doxygen
git clone https://github.com/drogonframework/drogon
cd drogon
git submodule update --init --recursive
mkdir build
cd build
cmake ..
make -j4
sudo make install

Install libopusenc

cd ~
wget https://archive.mozilla.org/pub/opus/libopusenc-0.2.1.tar.gz
tar -xvzf libopusenc-0.2.1.tar.gz
cd libopusenc-0.2.1
./configure
make -j4
sudo make install

This step is optional. Sometimes ldconfig does not pick up /usr/local/lib, and you need to add it manually:

sudo nano /etc/ld.so.conf.d/local.conf

Add the line /usr/local/lib, save and exit, then run sudo ldconfig to refresh the linker cache.
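The manual edit can also be scripted. The following is a minimal sketch, not part of Paroli: add_lib_dir is a hypothetical helper name, and it only appends the directory when the file does not already list it, so running it twice is harmless.

```shell
# add_lib_dir is a hypothetical helper, not something shipped with Paroli.
# It appends a library directory to an ld.so.conf.d file unless it is already listed.
add_lib_dir() {
    conf_file=$1
    lib_dir=$2
    # -x matches whole lines only, -F treats the path as a literal string
    if ! grep -qxF "$lib_dir" "$conf_file" 2>/dev/null; then
        printf '%s\n' "$lib_dir" >> "$conf_file"
    fi
}

# Typical use on the board, from a root shell (for example after `sudo -s`):
#   add_lib_dir /etc/ld.so.conf.d/local.conf /usr/local/lib
#   ldconfig
```

Running ldconfig afterwards is what makes the dynamic linker see the new directory.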

Cloning models

cd ~
git clone https://huggingface.co/thanhtantran/piper-paroli-rknn-model

Download onnxruntime

cd ~
wget https://github.com/microsoft/onnxruntime/releases/download/v1.14.1/onnxruntime-linux-aarch64-1.14.1.tgz
tar -xvf onnxruntime-linux-aarch64-1.14.1.tgz

Download piper-phonemize

cd ~
wget https://github.com/rhasspy/piper-phonemize/releases/download/2023.11.14-4/piper-phonemize_linux_aarch64.tar.gz
tar -xvf piper-phonemize_linux_aarch64.tar.gz

Now build the app

git clone https://github.com/thanhtantran/paroli-on-orangepi
cd paroli-on-orangepi
mkdir build && cd build
cmake .. -DUSE_RKNN=ON -DORT_ROOT=/home/orangepi/onnxruntime-linux-aarch64-1.14.1 -DPIPER_PHONEMIZE_ROOT=/home/orangepi/piper_phonemize -DCMAKE_BUILD_TYPE=Release
make -j4

After cmake runs, its output shows whether the librknnrt.so library was loaded; it normally lives in /usr/lib/. If the output says it was not loaded, the library is not installed yet and you cannot run with RKNPU acceleration.
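If you prefer to check for the runtime library before running cmake, a small shell helper can look for it. This is a sketch under assumptions: find_rknnrt is a hypothetical name, and /usr/local/lib is only a guess at a second place the library might have been installed.

```shell
# find_rknnrt is a hypothetical helper, not part of Paroli or the RKNN tools.
# It prints the first librknnrt.so found and returns non-zero if there is none.
find_rknnrt() {
    # Search the directories given as arguments, or two common defaults.
    # /usr/lib is the location named in this README; /usr/local/lib is an assumption.
    [ "$#" -gt 0 ] || set -- /usr/lib /usr/local/lib
    for dir in "$@"; do
        if [ -e "$dir/librknnrt.so" ]; then
            printf '%s\n' "$dir/librknnrt.so"
            return 0
        fi
    done
    return 1
}

# Example:
#   find_rknnrt || echo "librknnrt.so not found; install the RKNN runtime first"
```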

Once the program is compiled, you can run it from the build folder. One more important step is required first: copy espeak-ng-data from piper_phonemize into the build folder:

cp -r ~/piper_phonemize/share/espeak-ng-data/ .

The Command to transform text to wav

./paroli-cli --encoder /home/orangepi/piper-paroli-rknn-model/encoder.onnx --decoder /home/orangepi/piper-paroli-rknn-model/decoder.rknn -c ~/piper-paroli-rknn-model/config.json

After Piper has loaded, paste your text into the shell and it will be converted to WAV.

Replace decoder.rknn with the file for your Orange Pi model: decoder-3588.rknn for the Orange Pi 5 series, or decoder-3566.rknn for the Orange Pi 3B.
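The choice of decoder file can be scripted. The sketch below maps an SoC name to the two file names mentioned in this README; decoder_for_soc is a hypothetical helper, and reading the SoC name from /proc/device-tree/compatible is an assumption about how your board image exposes it.

```shell
# decoder_for_soc is a hypothetical helper, not part of Paroli.
# It maps a string containing the SoC name to the matching decoder file name.
decoder_for_soc() {
    case "$1" in
        *rk3588*) echo "decoder-3588.rknn" ;;  # Orange Pi 5 series
        *rk3566*) echo "decoder-3566.rknn" ;;  # Orange Pi 3B
        *) return 1 ;;                         # no decoder listed for this SoC
    esac
}

# On the board, the SoC name can usually be read from the device tree (assumption):
#   soc=$(tr '\0' ' ' < /proc/device-tree/compatible)
#   decoder=$(decoder_for_soc "$soc") || echo "no decoder listed for: $soc"
```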

The API server

A web API server is also provided so other applications can easily perform text to speech. For details, please refer to the web API document. By default, a demo UI can be accessed at the root of the URL. The API server supports both responding with compressed audio to reduce bandwidth requirements and streaming audio via WebSocket.

To run it:

./paroli-server --encoder /home/orangepi/piper-paroli-rknn-model/encoder.onnx --decoder /home/orangepi/piper-paroli-rknn-model/decoder.rknn -c ~/piper-paroli-rknn-model/config.json --ip 0.0.0.0 --port 8848

As with the CLI, replace decoder.rknn with the file for your Orange Pi model: decoder-3588.rknn for the Orange Pi 5 series, or decoder-3566.rknn for the Orange Pi 3B.

And to invoke TTS:

curl http://your.server.address:8848/api/v1/synthesise -X POST -H 'Content-Type: application/json' -d '{"text": "To be or not to be, that is the question"}' > test.opus
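When the text comes from a variable or contains double quotes, the JSON body has to be escaped before it is handed to curl. The helper below is a minimal sketch: synth_payload is a hypothetical name, and it only handles backslashes and double quotes in single-line text, not newlines or other control characters.

```shell
# synth_payload is a hypothetical helper, not part of Paroli.
# It wraps single-line text in the {"text": "..."} body used by the curl call above.
synth_payload() {
    # Escape backslashes first, then double quotes, so the result is a valid JSON string.
    escaped=$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')
    printf '{"text": "%s"}' "$escaped"
}

# Example, reusing the endpoint from the command above:
#   curl http://your.server.address:8848/api/v1/synthesise -X POST \
#        -H 'Content-Type: application/json' \
#        -d "$(synth_payload 'She said "hello"')" > test.opus
```

For text with newlines or arbitrary Unicode, a proper JSON encoder is the safer choice.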

Demo

Comparison of ONNX model inference and RKNN model inference

Using RKNN for inference on long Vietnamese text

Video coming soon ...

Authentication

To support use cases where the service is exposed, for whatever reason, the API server supports a basic authentication scheme. The --auth flag generates a bearer token that is different on every run; once authentication is enabled, both the WebSocket and HTTP synthesis APIs only work with that token. --auth [YOUR_TOKEN] sets the token to YOUR_TOKEN. Setting the PAROLI_TOKEN environment variable also sets the bearer token, to whatever value the variable holds.

Authentication: Bearer <insert the token>

The Web UI will not work when authentication is enabled

Training models

To obtain the encoder and decoder models, you'll need to either download them or create them from checkpoints. Checkpoints are the raw trained models that piper generates. Please refer to piper's TRAINING.md for details. To convert checkpoints into ONNX file pairs, you'll need the streaming branch of mush42's piper fork. Run:

python3 -m piper_train.export_onnx_streaming /path/to/your/training/lightning_logs/version_0/checkpoints/blablablas.ckpt /path/to/output/directory

Downloading models

Some 100% legal models are provided on HuggingFace. They were built and converted for the RK3588 by the original Paroli project.

Converting model to Rockchip NPU

Converting ONNX to RKNN has to be done on an x64 computer. At the time of writing, you likely want to install the version for Python 3.10, as this is the same version that works with upstream piper. rknn-toolkit2 version 1.6.0 is required.

git clone https://github.com/airockchip/rknn-toolkit2
cd rknn-toolkit2/rknpu2/runtime/Linux/librknn_api
sudo cp aarch64/librknnrt.so /usr/lib/
sudo cp include/* /usr/include/

Then convert your model

# Install rknn-toolkit2
cd rknn-toolkit2/rknn-toolkit2/packages/x86_64
pip install rknn_toolkit2-2.3.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl

# Run the conversion script
python tools/decoder2rknn.py /path/to/model/decoder.onnx /path/to/model/decoder.rknn

If you get the error

ImportError: libGL.so.1: cannot open shared object file: No such file or directory

then install the missing library:

sudo apt install libgl1
