A computational model of speech acquisition using goal-directed exploration

aphilippsen/goalspeech

GoALSpeech: Goal-directed Articulatory Learning for Speech Acquisition

Source code for replicating the results of the following paper: Philippsen, A. (2021). Goal-directed exploration for learning vowels and syllables: a computational model of speech acquisition. KI-Künstliche Intelligenz, 35(1), 53-70.

This is the Python implementation of the original Matlab code used for the Ph.D. thesis "Learning How to Speak. Goal Space Exploration for Articulatory Skill Acquisition": https://pub.uni-bielefeld.de/record/2921296

Installation

The code runs with Python 3.

Python packages

The following packages are required and can be installed, e.g., via pip:

  • dtw (tested with version 1.4.0)
  • matplotlib (you might need to install the package python3-tk)
  • numpy
  • python-speech-features (https://github.com/StevenLOL/python_speech_features)
  • scipy
  • scikit-learn
  • sounddevice (you might need to install the package for the PortAudio library first in some Linux distributions, libportaudio2)
  • torch
  • oct2py (required only if GBFB features are used, see below)
  • tqdm (progress bar for sound production)
  • fastdtw (used for comparing sounds with the syllable weighting scheme)
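
As a quick sanity check, the installation can be verified programmatically. Note that some import names differ from the pip package names; the mapping below (python-speech-features → python_speech_features, scikit-learn → sklearn) is an assumption based on those packages' usual import names:

```python
import importlib.util

def missing_packages(module_names):
    """Return the subset of module names that cannot be imported."""
    return [m for m in module_names if importlib.util.find_spec(m) is None]

# Import names corresponding to the pip packages listed above.
required = ["dtw", "matplotlib", "numpy", "python_speech_features",
            "scipy", "sklearn", "sounddevice", "torch",
            "oct2py", "tqdm", "fastdtw"]

if __name__ == "__main__":
    missing = missing_packages(required)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```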

Articulatory system

Acoustics

How to run

1. Configure the parameters. The parameters for an experiment are defined in a cfg file. Examples can be found in goalspeech/config/. Details about the format can be found in goalspeech/config/info.txt. A config file can also be generated via generateConfig.py: modify the script and execute it; the generated file is written to config/.
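Assuming the cfg files use an INI-style layout, they can be parsed with Python's standard configparser. The section and option names below are purely illustrative; the authoritative format is documented in goalspeech/config/info.txt:

```python
from configparser import ConfigParser

# Hypothetical excerpt of an experiment cfg file; the actual section
# and option names are defined in goalspeech/config/info.txt.
example_cfg = """
[experiment]
runs = 5
task = vowels
"""

config = ConfigParser()
config.read_string(example_cfg)
print(config.getint("experiment", "runs"))
```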

2. Initialize the experiment. In ipython, run one of the initExperiment*.py files [e.g. "%run -i initExperimentVowels.py"]. To use your own config, replace the path to the config file first. This creates an instance of VTLSpeechProduction and loads all required parameters from the config into the ipython workspace. (When switching between *Vowels.py and *Syllables.py, a new ipython instance currently has to be started.)

3. Run the experiment. After the initialization, runExperiment*.py starts the babbling learning process. When the script runs for the first time, ambient speech data is generated and stored in data/ambientSpeech/. In subsequent runs this file is reused; to regenerate it, delete it from that directory. The following steps are performed:

  • Create an articulatory data set (temporary; it is discarded after the acoustics are generated)
  • Create the corresponding acoustic data set and store it in data/ambientSpeech (used as ambient speech, i.e. the speech that the system hears from its environment)
  • Start babbling. Results will be saved into a folder named with the current date in data/results/.
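
The per-run output location can be sketched as follows. The exact date format used for the folder name is an assumption; the README only states that the folder inside data/results/ is named with the current date:

```python
from datetime import datetime
from pathlib import Path

def results_path(base="data/results", when=None):
    """Build the date-named results folder path.

    NOTE: the "%Y-%m-%d" format is an assumption; only the fact that
    the folder is named with the current date comes from the README.
    """
    when = when or datetime.now()
    return Path(base) / when.strftime("%Y-%m-%d")

print(results_path())
```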

4. Inspect results.

  • At the beginning of babbling: gs.png shows the generated goal space. Make sure that it looks meaningful before investing time in continuing the babbling. The config is stored as config.txt.
  • After each babbling run (the number of runs is defined in "runs"), the results of the corresponding run R are stored as "results-R.pickle".
  • The script evaluateResults.py can be used to evaluate and visualize the results from multiple runs of one experiment.
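
For manual inspection, a single run's results can be loaded with the standard pickle module. The file naming follows the README ("results-R.pickle"); the structure of the unpickled object depends on the experiment and is not assumed here:

```python
import pickle
from pathlib import Path

def load_run_results(results_dir, run):
    """Load the pickled results of babbling run `run` from an experiment folder."""
    path = Path(results_dir) / f"results-{run}.pickle"
    with open(path, "rb") as f:
        return pickle.load(f)
```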
