Wrote documentation for the speech module.

achabotl · Aug 11, 2014 · 2d90791
2 parents 2e71e62 + c4c7f70
commit 2d90791
Showing 20 changed files with 446 additions and 48 deletions.
diff --git a/README.rst b/README.rst
@@ -41,11 +41,13 @@ For running tests, you will need `pytest <http://pytest.org/>`__.
 Install
 -------
 
-You can simply do:
+Right now, `pambox` is only available through GitHub. It should be available
+via `pip` soon. To install pambox from source::
 
-::
+    git clone https://github.com/achabotl/pambox.git
+    cd pambox
+    python setup.py install
 
-    pip install pambox
 
 Contributing
 ------------
@@ -58,11 +60,11 @@ You can check out the latest source and install it for development with:
     cd pambox
     python setup.py develop
 
-To run tests, from the root pambox folder, type:
+To run tests (you will need `pytest`), from the root pambox folder, type:
 
 ::
 
-    py.test
+    python setup.py test
 
 License
 -------

diff --git a/docs/audio/index.rst b/docs/audio/index.rst
@@ -0,0 +1,19 @@
+Audio
+=====
+
+The :mod:`~pambox.audio` module provides a single function,
+:py:func:`~pambox.audio.play`. By default, the output is scaled to
+prevent clipping.
+
+::
+
+    from pambox import audio
+    import numpy as np
+    audio.play(np.random.randn(10000))
+
+
+API
+---
+
+.. automodule:: pambox.audio
+   :members:
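
Note on the scaling described in docs/audio/index.rst: the text says the output is scaled by default to prevent clipping. A minimal sketch of that kind of peak normalization follows. `normalize_peak` is a hypothetical helper written for illustration and is not taken from pambox's source, so the actual behavior of `audio.play` may differ in its details.

```python
import numpy as np


def normalize_peak(x):
    """Scale x so its largest absolute sample is 1 (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    peak = np.max(np.abs(x))
    # Leave an all-zero (silent) signal untouched to avoid dividing by zero.
    if peak == 0:
        return x
    return x / peak


# Gaussian noise with samples well outside [-1, 1] before scaling.
rng = np.random.default_rng(0)
noise = 3.0 * rng.standard_normal(10000)
scaled = normalize_peak(noise)
```

After scaling, every sample lies in [-1, 1], which is the range a playback device can typically reproduce without clipping.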
diff --git a/docs/central/index.rst b/docs/central/index.rst
@@ -0,0 +1,8 @@
+Central auditory processing
+===========================
+
+API
+---
+
+.. automodule:: pambox.central
+   :members:
diff --git a/docs/distort/index.rst b/docs/distort/index.rst
@@ -0,0 +1,13 @@
+Signal Distortion and Processing
+================================
+
+The :mod:`~pambox.distort` module groups together various distortions and
+types of processing that can be applied to signals.
+
+API
+---
+
+.. automodule:: pambox.distort
+    :members:
+
+
diff --git a/docs/index.rst b/docs/index.rst
@@ -3,29 +3,60 @@
    You can adapt this file completely to your liking, but it should at least
    contain the root `toctree` directive.
 
-PAMBOX
+pambox
 ======
 
 `pambox <https://github.com/achabotl/pambox>`_ is a Python toolbox to
 facilitate the development of auditory models, with a focus on speech
 intelligibility prediction models.
 
+The Grand Idea is for `pambox` to be a repository of published auditory models,
+as well as a simple and powerful tool for developing auditory models.
+Components should be reusable and easy to modify.
+The current focus is to include speech intelligibility prediction models in
+the toolbox, using a standard interface for all models. This should greatly
+simplify comparisons across models.
+
+In case Python is not your thing and you prefer Matlab, the `Auditory Modeling
+Toolbox <http://amtoolbox.sourceforge.net>`_ is an excellent alternative.
 
 Installing
 ----------
 
-Install PAMBOX with::
+Right now, `pambox` is only available through GitHub. It should be available
+via `pip` soon. To install `pambox` from source::
+
+    git clone https://github.com/achabotl/pambox.git
+    cd pambox
+    python setup.py install
 
-    pip install pambox
 
 Structure of the toolbox
 ------------------------
 
 The structure of the toolbox is inspired by the auditory system. The classes
-and functions are split between a "peripheral" and a "central" part. The
-"peripheral" part contains, obviously, the "outer", "middle",
-and "inner" modules. The "central" part is more general and contains the
-modules and functions for central processes, without much order for now.
+and functions are split between "peripheral" and "central" parts. The
+"peripheral" part is directly accessible through an :mod:`~pambox.inner`,
+a :mod:`~pambox.middle`, and an :mod:`~pambox.outer` module.
+The :mod:`~pambox.central` part is more general and contains the
+modules and functions for central processes, without much extra separation
+for now.
+
+The :mod:`~pambox.speech` module contains speech intelligibility models and
+various functions and classes to facilitate speech intelligibility prediction
+experiments.
+
+The :mod:`~pambox.utils` module contains functions for manipulating
+signals, such as setting levels, or padding signals, that are not directly
+auditory processes.
+
+The :mod:`~pambox.distort` module contains distortions and processes that
+can be applied to signals. Most of them are used in speech intelligibility
+experiments.
+
+The :mod:`~pambox.audio` module is a thin wrapper around `pyaudio
+<http://people.csail.mit.edu/hubert/pyaudio/>`_ that simplifies the playback of
+NumPy arrays, which is often useful for debugging.
 
 Contents
 --------
@@ -34,7 +65,9 @@ Contents
    :maxdepth: 2
 
    audio/index
-   periph/index
+   inner/index
+   middle/index
+   outer/index
    central/index
    speech/index
    distort/index

diff --git a/docs/inner/index.rst b/docs/inner/index.rst
@@ -0,0 +1,8 @@
+Inner ear processing
+====================
+
+API
+---
+
+.. automodule:: pambox.inner
+   :members:
diff --git a/docs/middle/index.rst b/docs/middle/index.rst
@@ -0,0 +1,8 @@
+Middle ear processes
+====================
+
+API
+---
+
+.. automodule:: pambox.middle
+   :members:
diff --git a/docs/outer/index.rst b/docs/outer/index.rst
@@ -0,0 +1,8 @@
+Outer ear processes
+===================
+
+API
+---
+
+.. automodule:: pambox.outer
+   :members:
diff --git a/docs/speech/index.rst b/docs/speech/index.rst
@@ -0,0 +1,132 @@
+Speech Intelligibility Models and Experiments
+=============================================
+
+Introduction
+------------
+
+The :mod:`~pambox.speech` module groups together speech intelligibility models
+and various other tools to facilitate the creation of speech intelligibility
+prediction "experiments".
+
+
+Speech Intelligibility Models
+-----------------------------
+
+Each model presents a standard ``predict`` function that takes the clean speech
+signal, the processed speech (or the mixture of the speech and noise),
+and the noise alone. The reference level is that a signal with an RMS value
+of 1 corresponds to 0 dB SPL.
+
+::
+
+    >>> from pambox.speech import Sepsm
+    >>> s = Sepsm()
+    >>> res = s.predict(clean, mix, noise)
+
+
+For models that do not take time signals as inputs,
+such as the :py:class:`~pambox.speech.Sii`, two other types of interfaces are
+defined:
+
+* ``predict_spec`` if the model takes frequency spectra as its inputs. Once
+  again, the spectra of the clean speech, of the mixture, and of the noise
+  should be provided::
+
+    >>> from pambox.speech import Sii
+    >>> s = Sii()
+    >>> res = s.predict_spec(clean_spec, mix_spec, noise_spec)
+
+
+* ``predict_ir`` if the model takes impulse responses as its inputs. The
+  function then takes two inputs, the impulse response to the target,
+  and the concatenated impulse responses to the maskers::
+
+    >>> from pambox.speech import IrModel
+    >>> s = IrModel()
+    >>> res = s.predict_ir(clean_ir, noise_irs)
+
+Intelligibility models return a dictionary with **at least** the following key:
+
+* ``p`` (for "predictions"): a dictionary with the outputs of the
+  model. The keys are the names of the outputs. This allows models to have
+  multiple return values. For example, the :py:class:`~pambox.speech.MrSepsm`
+  returns two prediction values::
+
+    >>> s = MrSepsm()
+    >>> res = s.predict(clean, mix, noise)
+    >>> res['p']
+    {'lt_snr_env': 10.5, 'snr_env': 20.5}
+
+This might seem overcomplicated, but it makes it easier to store
+the results of an experiment.
+
+Additionally, models can add other keys to the results dictionary. For
+example, a model can return some of its internal attributes,
+its internal representation, etc.
+
+Speech Materials
+----------------
+
+The :py:class:`~pambox.speech.Material` class simplifies the
+access to the speech files when doing speech intelligibility prediction
+experiments.
+
+When creating an instance of the class, you have to define:
+
+* where the sentences can be found,
+* their sampling frequency,
+* their reference level, in dB SPL (the reference is that a signal with an
+  RMS value of 1 corresponds to 0 dB SPL),
+* as well as the path to a file where the corresponding speech-shaped noise for
+  this particular material can be found.
+
+For example, to create a speech material object for IEEE sentences stored in
+the `../stimuli/ieee` folder::
+
+    >>> sm = Material(
+    ...    fs=25000,
+    ...    root_path='../stimuli/ieee',
+    ...    path_to_ssn='ieee_ssn.wav',
+    ...    ref_level=74,
+    ...    name='IEEE'
+    ...    )
+
+Each speech file can be loaded using its name::
+
+    >>> x = sm.load_file(sm.files[0])
+
+Or files can be loaded as an iterator::
+
+    >>> all_files = sm.load_files()
+    >>> for x in all_files:
+    ...    # do some processing on `x`
+    ...    pass
+
+
+By default, the list of files is simply all the files found in
+the `root_path`. To override this behavior, simply replace the
+:py:func:`~pambox.speech.Material.files_list` function::
+
+    >>> def new_files_list():
+    ...     return ['file1.wav', 'file2.wav']
+    >>> sm.files_list = new_files_list
+
+It is common that individual sentences of a speech material are not adjusted
+to the exact same level. This is typically done to compensate for differences
+in intelligibility between sentences. In order to keep the inter-sentence
+level difference, it is recommended to use the
+:py:func:`~pambox.speech.Material.set_level` method of the speech material.
+The code below sets the level of the first sentence to 65 dB SPL,
+with the reference that a signal with an RMS value of 1 has a level of 0 dB SPL::
+
+    >>> x = sm.load_file(sm.files[0])
+    >>> adjusted_x = sm.set_level(x, 65)
+
+
+
+
+API
+---
+
+.. automodule:: pambox.speech
+   :members:
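
Note on the level convention used in docs/speech/index.rst: the text fixes the reference so that an RMS value of 1 corresponds to 0 dB SPL, and recommends `set_level` to keep inter-sentence level differences. The sketch below illustrates one way this could work, under the assumption (not confirmed by the diff) that a single gain is derived from the material's reference level and applied to every sentence. `rms_level_db` and `gain_for_level` are hypothetical names, not pambox functions.

```python
import numpy as np


def rms_level_db(x):
    """Level in dB SPL, assuming an RMS value of 1 corresponds to 0 dB SPL."""
    rms = np.sqrt(np.mean(np.square(x)))
    return 20 * np.log10(rms)


def gain_for_level(ref_level, target_level):
    """Linear gain that moves a material from ref_level to target_level (dB)."""
    return 10 ** ((target_level - ref_level) / 20)


# Two stand-in "sentences" whose levels differ by a factor of two in amplitude.
t = np.arange(0, 1, 1 / 8000)
loud = np.sin(2 * np.pi * 440 * t)
soft = 0.5 * loud

# One gain for the whole material: 74 dB SPL reference, 65 dB SPL target.
gain = gain_for_level(ref_level=74, target_level=65)

diff_before = rms_level_db(loud) - rms_level_db(soft)
diff_after = rms_level_db(gain * loud) - rms_level_db(gain * soft)
```

Because the same gain multiplies both signals, `diff_before` and `diff_after` are equal: the level difference between sentences survives the adjustment. Normalizing each sentence to the target on its own would remove that difference.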
diff --git a/docs/utils/index.rst b/docs/utils/index.rst
@@ -0,0 +1,10 @@
+Utilities
+=========
+
+
+
+API
+---
+
+.. automodule:: pambox.utils
+   :members:
diff --git a/pambox/audio.py b/pambox/audio.py
@@ -1,5 +1,5 @@
 """
-:mod:`~pambox.audio` provides a imple wrapper around pyaudio to simplify
+:mod:`~pambox.audio` provides a simple wrapper around `pyaudio` to simplify
 sound playback.
 """
 from __future__ import division, print_function
@@ -22,9 +22,18 @@ def play(x, fs=44100, normalize=True):
     fs : int (optional
         Sampling frequency. The default is 44100 Hz.
     normalize : bool
-        Normalize the signal such that the maximumal (absolute value) is 1 to
+        Normalize the signal such that the maximum (absolute value) is 1 to
         prevent clipping.
 
+    Examples
+    --------
+
+    To play back a NumPy array:
+
+    >>> from pambox import audio
+    >>> import numpy as np
+    >>> audio.play(np.random.randn(10000))
+
     """
     x = np.asarray(x)
     if normalize:

diff --git a/pambox/speech/__init__.py b/pambox/speech/__init__.py
@@ -7,12 +7,11 @@
 from .sepsm import Sepsm
 from .mrsepsm import MrSepsm
 from .sii import Sii
-from .slidingmrsepsm import SlidingMrSepsm
+from .material import Material
 
 __all__ = [
     'Sepsm',
-    'MrSepsm'
+    'MrSepsm',
     'Sii',
-    'SlidingMrSepsm',
-    'SpeechMaterial'
+    'Material'
 ]
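
Note on the results format documented for the models exported here: the speech docs say each model returns a dictionary with at least a ``p`` key that maps output names to values, and that this makes results easier to store. A rough sketch of why that shape is convenient is given below. `DummyModel`, its ``name`` key, and `collect` are hypothetical stand-ins for illustration only; they are not part of pambox.

```python
class DummyModel:
    """Hypothetical stand-in for an intelligibility model (not a pambox class)."""

    name = "Dummy"

    def predict(self, clean, mix, noise):
        # Mimic the documented shape: a 'p' dictionary of named outputs,
        # plus an extra key the model chooses to add.
        return {"p": {"snr_env": 20.5, "lt_snr_env": 10.5}, "name": self.name}


def collect(results):
    """Flatten result dictionaries into (model, output name, value) rows."""
    rows = []
    for res in results:
        for output_name, value in res["p"].items():
            rows.append((res.get("name"), output_name, value))
    return rows


res = DummyModel().predict(clean=None, mix=None, noise=None)
rows = collect([res])
```

Since every model exposes its outputs under the same key, code that gathers results does not need to know how many values a given model returns.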