Wrote documentation for the speech module.
Alexandre Chabot-Leclerc committed Aug 11, 2014
2 parents 2e71e62 + c4c7f70 commit 2d90791
Showing 20 changed files with 446 additions and 48 deletions.
12 changes: 7 additions & 5 deletions README.rst
@@ -41,11 +41,13 @@ For running tests, you will need `pytest <http://pytest.org/>`__.
Install
-------

You can simply do:
Right now, `pambox` is only available through GitHub. It should be available
via `pip` soon. To install pambox from source::

::
git clone https://github.com/achabotl/pambox.git
cd pambox
python setup.py install

pip install pambox

Contributing
------------
@@ -58,11 +60,11 @@ You can check out the latest source and install it for development with:
cd pambox
python setup.py develop

To run tests, from the root pambox folder, type:
To run tests (you will need `pytest`), from the root pambox folder, type:

::

py.test
python setup.py test

License
-------
19 changes: 19 additions & 0 deletions docs/audio/index.rst
@@ -0,0 +1,19 @@
Audio
=====

The :mod:`~pambox.audio` module provides a single function,
:py:func:`~pambox.audio.play`. By default, the output is scaled to
prevent clipping.

::

from pambox import audio
import numpy as np
audio.play(np.random.randn(10000))
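
The scaling that prevents clipping can be pictured as a peak normalization. The block below is only a minimal sketch of that idea, assuming the signal is divided by its largest absolute sample; ``peak_normalize`` is a hypothetical helper written for illustration and is not part of ``pambox.audio``.

```python
import numpy as np


def peak_normalize(x):
    """Scale `x` so that its largest absolute sample is 1.

    Hypothetical helper for illustration; not the pambox implementation.
    """
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))
    # Leave an all-zero signal untouched to avoid dividing by zero.
    if peak == 0:
        return x
    return x / peak


# Gaussian noise with a large amplitude would clip if played back as is.
noise = 3.0 * np.random.randn(10000)
scaled = peak_normalize(noise)
```

After this step every sample of ``scaled`` lies between -1 and 1, which is the range a sound card expects for floating-point audio.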


API
---

.. automodule:: pambox.audio
:members:
8 changes: 8 additions & 0 deletions docs/central/index.rst
@@ -0,0 +1,8 @@
Central auditory processing
===========================

API
---

.. automodule:: pambox.central
:members:
13 changes: 13 additions & 0 deletions docs/distort/index.rst
@@ -0,0 +1,13 @@
Signal Distortion and Processing
================================

The :mod:`~pambox.distort` module groups together various distortions and
types of processing that can be applied to signals.

API
---

.. automodule:: pambox.distort
:members:


49 changes: 41 additions & 8 deletions docs/index.rst
@@ -3,29 +3,60 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
PAMBOX
pambox
======

`pambox <https://github.com/achabotl/pambox>`_ is a Python toolbox to
facilitate the development of auditory models, with a focus on speech
intelligibility prediction models.

The Grand Idea is for `pambox` to be a repository of published auditory models,
as well as a simple and powerful tool for developing auditory models.
Components should be reusable and easier to modify.
The current focus is to add speech intelligibility prediction models to
the toolbox, using a standard interface for all models. This should greatly
simplify comparisons across models.

In case Python is not your thing and you prefer Matlab, the `Auditory Modeling
Toolbox <http://amtoolbox.sourceforge.net>`_ is an excellent alternative.

Installing
----------

Install PAMBOX with::
Right now, `pambox` is only available through GitHub. It should be available
via `pip` soon. To install `pambox` from source::

git clone https://github.com/achabotl/pambox.git
cd pambox
python setup.py install

pip install pambox

Structure of the toolbox
------------------------

The structure of the toolbox is inspired by the auditory system. The classes
and functions are split between a "peripheral" and a "central" part. The
"peripheral" part contains, obviously, the "outer", "middle",
and "inner" modules. The "central" part is more general and contains the
modules and functions for central processes, without much order for now.
and functions are split between "peripheral" and "central" parts. The
"peripheral" part is directly accessible through an :mod:`~pambox.inner`,
a :mod:`~pambox.middle`, and an :mod:`~pambox.outer` module.
The :mod:`~pambox.central` part is more general and contains the
modules and functions for central processes, without much extra separation
for now.

The :mod:`~pambox.speech` module contains speech intelligibility models and
various functions and classes to facilitate speech intelligibility prediction
experiments.

The :mod:`~pambox.utils` module contains functions for manipulating
signals, such as setting levels, or padding signals, that are not directly
auditory processes.

The :mod:`~pambox.distort` module contains distortions and processes that
can be applied to signals. Most of them are used in speech intelligibility
experiments.

The :mod:`~pambox.audio` module is a thin wrapper around `pyaudio
<http://people.csail.mit.edu/hubert/pyaudio/>`_ that simplifies the playback of
numpy arrays, which is often useful for debugging.

Contents
--------
@@ -34,7 +65,9 @@ Contents
:maxdepth: 2

audio/index
periph/index
inner/index
middle/index
outer/index
central/index
speech/index
distort/index
8 changes: 8 additions & 0 deletions docs/inner/index.rst
@@ -0,0 +1,8 @@
Inner ear processing
====================

API
---

.. automodule:: pambox.inner
:members:
8 changes: 8 additions & 0 deletions docs/middle/index.rst
@@ -0,0 +1,8 @@
Middle ear processes
====================

API
---

.. automodule:: pambox.middle
:members:
8 changes: 8 additions & 0 deletions docs/outer/index.rst
@@ -0,0 +1,8 @@
Outer ear processes
===================

API
---

.. automodule:: pambox.outer
:members:
132 changes: 132 additions & 0 deletions docs/speech/index.rst
@@ -0,0 +1,132 @@
Speech Intelligibility Models and Experiments
=============================================

Introduction
------------

The :mod:`~pambox.speech` module groups together speech intelligibility models
and various other tools to facilitate the creation of speech intelligibility
prediction "experiments".


Speech Intelligibility Models
-----------------------------

Each model presents a standard ``predict`` function that takes the clean speech
signal, the processed speech (or the mixture of the speech and noise),
and the noise alone. The reference level is that a signal with an RMS value
of 1 corresponds to 0 dB SPL.

::

>>> from pambox.speech import Sepsm
>>> s = Sepsm()
>>> res = s.predict(clean, mix, noise)
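
The reference level convention stated above (an RMS value of 1 corresponds to 0 dB SPL) can be made concrete with a small worked example. This is a sketch under that single assumption; ``rms``, ``level_db_spl`` and ``scale_to_level`` are hypothetical names chosen for illustration, not functions confirmed to exist in pambox.

```python
import numpy as np


def rms(x):
    """Root-mean-square value of a signal."""
    return np.sqrt(np.mean(np.square(x)))


def level_db_spl(x):
    """Level in dB SPL, assuming an RMS of 1 corresponds to 0 dB SPL."""
    return 20.0 * np.log10(rms(x))


def scale_to_level(x, level):
    """Scale `x` so that its level is `level` dB SPL (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    return x / rms(x) * 10 ** (level / 20.0)


x = np.random.randn(22050)
y = scale_to_level(x, 65)
```

With this convention a signal at 65 dB SPL has an RMS value of ``10 ** (65 / 20.0)``, roughly 1778, so signals prepared for the models are far outside the -1 to 1 range used for playback.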


For models that do not take time signals as inputs,
such as the :py:class:`~pambox.speech.Sii`, two other types of interfaces are
defined:

* ``predict_spec`` if the model takes frequency spectra as its inputs. Once
again, the spectra of the clean speech, of the mixture, and of the noise
should be provided::

>>> from pambox.speech import Sii
>>> s = Sii()
>>> res = s.predict_spec(clean_spec, mix_spec, noise_spec)


* ``predict_ir`` if the model takes impulse responses as its inputs. The
function then takes two inputs, the impulse response to the target,
and the concatenated impulse responses to the maskers::

>>> from pambox.speech import IrModel
>>> s = IrModel()
>>> res = s.predict_ir(clean_ir, noise_irs)

Intelligibility models return a dictionary with **at least** the following key:

* ``p`` (for "predictions"): which is a dictionary with the outputs of the
model. The keys are the names of the outputs. This allows models to have
multiple return values. For example, the :py:class:`~pambox.speech.MrSepsm`
returns two prediction values::

>>> from pambox.speech import MrSepsm
>>> s = MrSepsm()
>>> res = s.predict(clean, mix, noise)
>>> res['p']
{'lt_snr_env': 10.5, 'snr_env': 20.5}

It might seem a bit over-complicated, but it allows for an easier storing of
the results of an experiment.
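
To illustrate why the nested ``p`` dictionary makes storing results easier, here is a minimal sketch that flattens one result into table-like rows. The function ``flatten_predictions`` and the condition labels ``model`` and ``snr`` are assumptions made for this example; only the shape of ``res['p']`` comes from the text above.

```python
def flatten_predictions(res, **conditions):
    """Return one row (a dict) per model output found in ``res['p']``.

    `conditions` are free-form labels describing the experimental
    condition. Hypothetical helper, not part of pambox.
    """
    return [
        dict(conditions, output=name, value=value)
        for name, value in sorted(res['p'].items())
    ]


# Same shape as the MrSepsm example above.
res = {'p': {'lt_snr_env': 10.5, 'snr_env': 20.5}}
rows = flatten_predictions(res, model='MrSepsm', snr=-3)
```

Because every model uses the same ``p`` key, the same helper works for a model with one output and for a model with several, and the rows from many conditions can be appended to a single list.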

Additionally, the models can add other keys to the results dictionary. For
example, a model can return some of its internal attributes,
its internal representation, etc.

Speech Materials
----------------

The :py:class:`~pambox.speech.Material` class simplifies the
access to the speech files when doing speech intelligibility prediction
experiments.

When creating the class, you have to define:

* where the sentences can be found
* their sampling frequency
* their reference level, in dB SPL (the reference is that a signal with an
RMS value of 1 corresponds to 0 dB SPL),
* as well as the path to a file where the corresponding speech-shaped noise for
this particular material can be found.

For example, to create a speech material object for IEEE sentences stored in
the `../stimuli/ieee` folder::

>>> from pambox.speech import Material
>>> sm = Material(
...     fs=25000,
...     root_path='../stimuli/ieee',
...     path_to_ssn='ieee_ssn.wav',
...     ref_level=74,
...     name='IEEE'
... )

Each speech file can be loaded using its name::

>>> x = sm.load_file(sm.files[0])

Or files can be loaded as an iterator::

>>> all_files = sm.load_files()
>>> for x in all_files:
...     # do some processing on `x`
...     pass


By default, the list of files is simply all the files found in
the `root_path`. To override this behavior, simply replace the
:py:func:`~pambox.speech.Material.files_list` function::

>>> def new_files_list():
...     return ['file1.wav', 'file2.wav']
>>> sm.files_list = new_files_list
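
The replacement function above takes no ``self`` argument because a function assigned to an instance attribute is not bound to that instance. The sketch below shows this Python behavior with a hypothetical stand-in class, ``StandInMaterial``; it says nothing about how the real ``Material`` class builds its list of files.

```python
class StandInMaterial(object):
    """Hypothetical stand-in used only to show instance-level overriding."""

    def files_list(self):
        # Default behavior: pretend these files were found on disk.
        return ['a.wav', 'b.wav', 'c.wav']


def new_files_list():
    # No `self`: a function stored on an instance is called as is.
    return ['file1.wav', 'file2.wav']


sm = StandInMaterial()
sm.files_list = new_files_list  # affects this object only

other = StandInMaterial()       # still uses the class default
```

Only ``sm`` is affected by the assignment, so different material objects in the same experiment can use different file lists.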

It is common that individual sentences of a speech material are not adjusted
to the exact same level. This is typically done to compensate for differences
in intelligibility between sentences. In order to keep the inter-sentence
level difference, it is recommended to use the
:py:func:`~pambox.speech.Material.set_level` method of the speech material.
The code below sets the level of the first sentence to 65 dB SPL,
with the reference that a signal with an RMS value of 1 has a level of 0 dB SPL::

>>> x = sm.load_file(sm.files[0])
>>> adjusted_x = sm.set_level(x, 65)
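
One way to keep the inter-sentence level differences is to apply the same gain, ``level - ref_level`` in dB, to every sentence rather than normalizing each sentence separately. The sketch below assumes that this is how the level is set; ``set_level_sketch`` and ``at_level`` are hypothetical helpers, and the 2 dB offset between the two sentences is made up for the demonstration.

```python
import numpy as np


def level_db(x):
    """Level in dB SPL, assuming an RMS of 1 corresponds to 0 dB SPL."""
    return 20.0 * np.log10(np.sqrt(np.mean(np.square(x))))


def set_level_sketch(x, level, ref_level=74):
    """Apply the same gain, (level - ref_level) dB, to any sentence.

    Hypothetical stand-in for illustration; not the pambox code.
    """
    return np.asarray(x, dtype=float) * 10 ** ((level - ref_level) / 20.0)


def at_level(x, level):
    """Demo helper: scale `x` to exactly `level` dB SPL."""
    return x / np.sqrt(np.mean(np.square(x))) * 10 ** (level / 20.0)


rng = np.random.RandomState(0)
loud = at_level(rng.randn(1000), 74)  # recorded at the reference level
soft = at_level(rng.randn(1000), 72)  # recorded 2 dB below the reference

loud_65 = set_level_sketch(loud, 65)
soft_65 = set_level_sketch(soft, 65)
```

Under these assumptions the first sentence ends up at 65 dB SPL and the second at 63 dB SPL, so the 2 dB difference chosen when the material was recorded is preserved.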




API
---

.. automodule:: pambox.speech
:members:
10 changes: 10 additions & 0 deletions docs/utils/index.rst
@@ -0,0 +1,10 @@
Utilities
=========



API
---

.. automodule:: pambox.utils
:members:
13 changes: 11 additions & 2 deletions pambox/audio.py
@@ -1,5 +1,5 @@
"""
:mod:`~pambox.audio` provides a imple wrapper around pyaudio to simplify
:mod:`~pambox.audio` provides a simple wrapper around `pyaudio` to simplify
sound playback.
"""
from __future__ import division, print_function
@@ -22,9 +22,18 @@ def play(x, fs=44100, normalize=True):
fs : int (optional)
Sampling frequency. The default is 44100 Hz.
normalize : bool
Normalize the signal such that the maximumal (absolute value) is 1 to
Normalize the signal such that the maximum (absolute value) is 1 to
prevent clipping.
Examples
--------
To play back a numpy array:
>>> from pambox import audio
>>> import numpy as np
>>> audio.play(np.random.randn(10000))
"""
x = np.asarray(x)
if normalize:
7 changes: 3 additions & 4 deletions pambox/speech/__init__.py
@@ -7,12 +7,11 @@
from .sepsm import Sepsm
from .mrsepsm import MrSepsm
from .sii import Sii
from .slidingmrsepsm import SlidingMrSepsm
from .material import Material

__all__ = [
'Sepsm',
'MrSepsm'
'MrSepsm',
'Sii',
'SlidingMrSepsm',
'SpeechMaterial'
'Material'
]