PicoTTS

This component provides an ESP-IDF port of the popular PicoTTS Text-to-Speech engine. While Espressif provides a Chinese-language TTS, to date there has been no support for other languages. PicoTTS fills this gap and provides Text-to-Speech for the following languages:

English (UK)
English (US)
German
French
Italian
Spanish

Requirements

The Text-to-Speech engine is quite resource intensive. While the code size is only around 175KB, language resources occupy another 750-1400KB of flash depending on the language, and the engine uses just over 1.1MB of RAM while initialised. As such, an ESP32-S3 with a sufficient amount of PSRAM and flash is the recommended target.

This component does not provide any board-specific audio support. The TTS engine generates 16-bit samples at 16kHz, and leaves it to the user to direct those to the correct audio device.
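When sizing audio buffers, it can help to convert between sample counts, bytes and playback time. The following is a minimal sketch, assuming a single (mono) channel of 16-bit samples at 16kHz. The helper names are illustrative only and are not part of this component.

```c
#include <stdint.h>

#define TTS_SAMPLE_RATE_HZ 16000u  /* 16kHz, as stated above */

/* Bytes needed to hold `count` 16-bit samples (assumes one channel). */
static unsigned tts_samples_to_bytes(unsigned count)
{
    return count * (unsigned)sizeof(int16_t);
}

/* Playback time of `count` samples in milliseconds, rounded down. */
static unsigned tts_samples_to_ms(unsigned count)
{
    return (unsigned)(((uint64_t)count * 1000u) / TTS_SAMPLE_RATE_HZ);
}
```

For example, under these assumptions a buffer of 4000 samples occupies 8000 bytes and plays for 250ms.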

Getting started

Using the PicoTTS component is straightforward. The steps are:

Initialise the engine
Register a callback function to receive the speech samples
Send text to the engine
Eventually, shut down the engine

In code, this can look like:

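  #include "picotts.h"

  #define TTS_TASK_PRIORITY 5
  #define TTS_CORE 1

  // Receives the generated speech samples from the engine.
  // speaker_codec_dev is assumed to be an already-opened audio output
  // handle set up by your board support code.
  void my_sample_cb(int16_t *buf, unsigned count)
  {
    // count is in samples; each sample is 2 bytes
    esp_codec_dev_write(speaker_codec_dev, buf, count*2);
  }

  // Somewhere in your application code:
  if (picotts_init(TTS_TASK_PRIORITY, my_sample_cb, TTS_CORE))
  {
    static const char msg[] = "Hello, world";
    picotts_add(msg, sizeof(msg)); // Include the \0 to tell TTS to go

    // Do other stuff, or at least wait until the msg has been spoken

    picotts_shutdown();
  }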

API documentation can be found in the picotts.h header file.
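Some audio outputs expect interleaved stereo data, whereas the callback receives a single stream of samples. The sketch below shows one way a callback might duplicate each sample into a left/right pair before writing it out. It assumes the engine output is mono. `stereo_write()` is a hypothetical stand-in for whatever output call your board needs (here it only counts bytes), and `mono_to_stereo()`, `my_stereo_sample_cb()` and the chunk size are illustrative names and values, not part of this component.

```c
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SAMPLES 256  /* illustrative chunk size, in mono samples */

static size_t bytes_written;  /* stand-in state so the sketch runs anywhere */

/* Hypothetical output function; replace with your board's audio write call. */
static void stereo_write(const int16_t *buf, size_t bytes)
{
    (void)buf;
    bytes_written += bytes;
}

/* Copy each mono sample into an interleaved left/right pair.
 * dst must have room for 2 * count samples. */
static void mono_to_stereo(const int16_t *src, int16_t *dst, unsigned count)
{
    for (unsigned i = 0; i < count; ++i) {
        dst[2 * i] = src[i];      /* left */
        dst[2 * i + 1] = src[i];  /* right */
    }
}

/* Same signature as the sample callback in the example above. */
void my_stereo_sample_cb(int16_t *buf, unsigned count)
{
    static int16_t stereo[2 * CHUNK_SAMPLES];

    /* Process the input in fixed-size chunks to bound the scratch buffer. */
    while (count > 0) {
        unsigned n = count < CHUNK_SAMPLES ? count : CHUNK_SAMPLES;
        mono_to_stereo(buf, stereo, n);
        stereo_write(stereo, (size_t)n * 2 * sizeof(int16_t));
        buf += n;
        count -= n;
    }
}
```

Working in fixed-size chunks keeps the scratch buffer small regardless of how many samples arrive per callback, which matters given how much RAM the engine itself already uses.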

Resource handling

The PicoTTS engine relies on two resource blobs, a Text Analysis (TA) resource and a Signal Generator (SG) resource. In upstream PicoTTS, these are loaded into RAM from files on disk. As RAM is a very precious resource on a microcontroller, this component replaces the resource loading routines so that the resources can be accessed directly from memory-mapped flash instead. This reduces the RAM footprint from 2.5MB down to 1.1MB.

There are two options for bundling the resource files into flash. The default, and arguably the easiest, is to embed the resource files directly into the application binary. The one downside to this approach is that the application size grows significantly, which may present an issue with firmware upgrades. You will definitely need a much larger application partition than usual. Alternatively, the resource files can be placed in dedicated flash partitions and accessed from there instead. The advantage of this approach is that the language resources are no longer directly coupled to the application binary. Which approach is best will depend on the specific project circumstances.

Custom partitions for language resources

When this component is configured to load its language resources from partitions rather than having them directly embedded into the application binary itself, you will need to add partition entries to hold the Text Analysis (TA) and Signal Generator (SG) resources. Example entries for partitions.csv:

picotts_ta, data, undefined,   ,        640K,
picotts_sg, data, undefined,   ,        820K,

You are free to use any valid partition type and subtype. This component loads purely by the partition name. The partition names may be changed via Kconfig if so desired.

The partition sizes may be shrunk to better match the language you're using. What's shown here are the maximum partition sizes needed to fit any language bundle.
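When shrinking the partitions, one possible approach is to round each resource file's size up to a whole number of 4KB flash sectors. The sketch below assumes that rounding convention. The helper name is hypothetical and the file size used in the note after it is made up, so check the actual sizes of the resource files for your language.

```c
#include <stdint.h>

#define FLASH_SECTOR_BYTES 4096u  /* assumed rounding granularity */

/* Round a resource size in bytes up to whole 4KB sectors and return
 * the result in KB, ready to be written as e.g. "588K" in partitions.csv. */
static uint32_t partition_size_kb(uint32_t resource_bytes)
{
    uint32_t sectors =
        (resource_bytes + FLASH_SECTOR_BYTES - 1u) / FLASH_SECTOR_BYTES;
    return sectors * (FLASH_SECTOR_BYTES / 1024u);
}
```

For instance, a hypothetical 600000-byte resource file would need 147 sectors, giving a partition size of 588K.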

Examples

The boot_greeting example is written for ESP-BOX and uses this component to issue a greeting upon boot.
