Added cew, exception handling, Downloader. Unified aeneas.tools.*
Alberto Pettarin committed Oct 8, 2015
1 parent ccab249 commit 08626de
Showing 194 changed files with 7,806 additions and 3,857 deletions.
9 changes: 6 additions & 3 deletions .gitignore
@@ -1,7 +1,8 @@
*.py[cdo]
*.pyc
*.pyd
*.pyo
*.swp
*.so
*.pyd
MANIFEST
aeneas/build
bak
@@ -10,10 +11,12 @@ dist
docs/build
tmp

# service scripts
zzz_*.py
zzz_*.sh

# Eclipse / PyDev
# Eclipse/PyDev
.project
.pydevproject
.settings

17 changes: 14 additions & 3 deletions README.md
@@ -2,8 +2,8 @@

**aeneas** is a Python library and a set of tools to automagically synchronize audio and text.

* Version: 1.2.0
* Date: 2015-09-27
* Version: 1.2.1
* Date: 2015-10-XX
* Developed by: [ReadBeyond](http://www.readbeyond.it/)
* Lead Developer: [Alberto Pettarin](http://www.albertopettarin.it/)
* License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -50,7 +50,7 @@ or raw CSV/SSV/TSV/TXT/XML for further processing.
2. `ffmpeg` and `ffprobe` executables available in your `$PATH`
3. `espeak` executable available in your `$PATH`
4. Python 2.7.x
5. Python modules `BeautifulSoup`, `lxml`, `numpy`, and `scikits.audiolab`
5. Python modules `BeautifulSoup`, `lxml`, and `numpy`
6. (Optional but strongly suggested) Python C headers to compile the Python C extensions

Depending on the format(s) of audio files you work with,
@@ -275,6 +275,17 @@ The code for computing the MFCCs
is a verbatim copy from the
[CMU Sphinx III project](http://cmusphinx.sourceforge.net/).

The code for reading and writing WAVE files
[`aeneas/wavfile.py`](aeneas/wavfile.py)
is a verbatim copy from the
[scipy project](https://github.com/scipy/scipy/),
included here verbatim to avoid the dependency
on the whole `scipy` package, while replacing `scikits.audiolab`.

The C header [`speak_lib.h`](aeneas/speak_lib.h) for `espeak`
is a verbatim copy from the
[espeak project](http://espeak.sourceforge.net/).

Audio files contained in the unit tests `aeneas/tests/res/` directory
are adapted from recordings produced by
the [LibriVox Project](http://www.librivox.org)
17 changes: 13 additions & 4 deletions README.txt
@@ -4,8 +4,8 @@ aeneas
**aeneas** is a Python library and a set of tools to automagically
synchronize audio and text.

- Version: 1.2.0
- Date: 2015-09-27
- Version: 1.2.1
- Date: 2015-10-XX
- Developed by: `ReadBeyond <http://www.readbeyond.it/>`__
- Lead Developer: `Alberto Pettarin <http://www.albertopettarin.it/>`__
- License: the GNU Affero General Public License Version 3 (AGPL v3)
@@ -55,8 +55,7 @@ System Requirements
2. ``ffmpeg`` and ``ffprobe`` executables available in your ``$PATH``
3. ``espeak`` executable available in your ``$PATH``
4. Python 2.7.x
5. Python modules ``BeautifulSoup``, ``lxml``, ``numpy``, and
``scikits.audiolab``
5. Python modules ``BeautifulSoup``, ``lxml``, and ``numpy``
6. (Optional but strongly suggested) Python C headers to compile the
Python C extensions

@@ -297,6 +296,16 @@ The code for computing the MFCCs ```aeneas/mfcc.py`` <aeneas/mfcc.py>`__
is a verbatim copy from the `CMU Sphinx III
project <http://cmusphinx.sourceforge.net/>`__.

The code for reading and writing WAVE files
```aeneas/wavfile.py`` <aeneas/wavfile.py>`__ is a verbatim copy from
the `scipy project <https://github.com/scipy/scipy/>`__, included here
verbatim to avoid the dependency on the whole ``scipy`` package, while
replacing ``scikits.audiolab``.

The C header ```speak_lib.h`` <aeneas/speak_lib.h>`__ for ``espeak`` is
a verbatim copy from the `espeak
project <http://espeak.sourceforge.net/>`__.

Audio files contained in the unit tests ``aeneas/tests/res/`` directory
are adapted from recordings produced by the `LibriVox
Project <http://www.librivox.org>`__ and they are in the public domain.
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.2.0
1.2.1
38 changes: 28 additions & 10 deletions aeneas/__init__.py
@@ -9,27 +9,45 @@
from aeneas.adjustboundaryalgorithm import AdjustBoundaryAlgorithm
from aeneas.analyzecontainer import AnalyzeContainer
from aeneas.audiofile import AudioFile
from aeneas.container import Container, ContainerFormat
from aeneas.dtw import DTWAlgorithm, DTWAligner
from aeneas.audiofile import AudioFileMonoWAV
from aeneas.audiofile import AudioFileUnsupportedFormatError
from aeneas.container import Container
from aeneas.container import ContainerFormat
from aeneas.downloader import Downloader
from aeneas.dtw import DTWAlgorithm
from aeneas.dtw import DTWAligner
from aeneas.espeakwrapper import ESPEAKWrapper
from aeneas.executejob import ExecuteJob
from aeneas.executetask import ExecuteTask
from aeneas.executetask import ExecuteTaskExecutionError
from aeneas.executetask import ExecuteTaskInputError
from aeneas.ffmpegwrapper import FFMPEGWrapper
from aeneas.ffprobewrapper import FFPROBEParsingError
from aeneas.ffprobewrapper import FFPROBEUnsupportedFormatError
from aeneas.ffprobewrapper import FFPROBEWrapper
import aeneas.globalconstants as gc
import aeneas.globalfunctions as gf
from aeneas.hierarchytype import HierarchyType
from aeneas.idsortingalgorithm import IDSortingAlgorithm
from aeneas.job import Job, JobConfiguration
from aeneas.job import Job
from aeneas.job import JobConfiguration
from aeneas.language import Language
from aeneas.logger import Logger
from aeneas.sd import SD, SDMetric
from aeneas.syncmap import SyncMap, SyncMapFragment, SyncMapFormat, SyncMapHeadTailFormat
from aeneas.sd import SD
from aeneas.sd import SDMetric
from aeneas.syncmap import SyncMap
from aeneas.syncmap import SyncMapFormat
from aeneas.syncmap import SyncMapFragment
from aeneas.syncmap import SyncMapHeadTailFormat
from aeneas.syncmap import SyncMapMissingParameterError
from aeneas.synthesizer import Synthesizer
from aeneas.task import Task, TaskConfiguration
from aeneas.textfile import TextFile, TextFileFormat, TextFragment
from aeneas.task import Task
from aeneas.task import TaskConfiguration
from aeneas.textfile import TextFile
from aeneas.textfile import TextFileFormat
from aeneas.textfile import TextFragment
from aeneas.vad import VAD
from aeneas.validator import Validator
import aeneas.globalconstants as gc
import aeneas.globalfunctions as gf

__author__ = "Alberto Pettarin"
__copyright__ = """
@@ -38,7 +56,7 @@
Copyright 2015, Alberto Pettarin (www.albertopettarin.it)
"""
__license__ = "GNU AGPL v3"
__version__ = "1.2.0"
__version__ = "1.2.1"
__email__ = "aeneas@readbeyond.it"
__status__ = "Production"

82 changes: 53 additions & 29 deletions aeneas/adjustboundaryalgorithm.py
@@ -19,7 +19,7 @@
Copyright 2015, Alberto Pettarin (www.albertopettarin.it)
"""
__license__ = "GNU AGPL v3"
__version__ = "1.2.0"
__version__ = "1.2.1"
__email__ = "aeneas@readbeyond.it"
__status__ = "Production"

@@ -45,6 +45,8 @@ class AdjustBoundaryAlgorithm(object):
:type value: string
:param logger: the logger object
:type logger: :class:`aeneas.logger.Logger`
:raises ValueError: if one of `text_map`, `speech` or `nonspeech` is `None` or `algorithm` value is not allowed
"""

AFTERCURRENT = "aftercurrent"
@@ -91,9 +93,12 @@ class AdjustBoundaryAlgorithm(object):
""" List of all the allowed values """

DEFAULT_MAX_RATE = 21.0
""" Default max rate (used only when RATE or RATEAGGRESSIVE
""" Default max rate (used only when ``RATE`` or ``RATEAGGRESSIVE``
algorithms are used) """

DEFAULT_PERCENT = 50
""" Default percent value (used only when ``PERCENT`` algorithm is used) """

TOLERANCE = 0.001
""" Tolerance when comparing floats """

@@ -108,28 +113,57 @@ def __init__(
value=None,
logger=None
):
if algorithm not in self.ALLOWED_VALUES:
raise ValueError("Algorithm value not allowed")
if text_map is None:
raise ValueError("Text map is None")
if speech is None:
raise ValueError("Speech list is None")
if nonspeech is None:
raise ValueError("Nonspeech list is None")
self.algorithm = algorithm
self.text_map = copy.deepcopy(text_map)
self.speech = speech
self.nonspeech = nonspeech
self.value = value
self.logger = logger
self.max_rate = self.DEFAULT_MAX_RATE
if self.logger is None:
self.logger = Logger()
self._parse_value()

def _log(self, message, severity=Logger.DEBUG):
""" Log """
self.logger.log(message, severity, self.TAG)

def _parse_value(self):
"""
Parse the self.value value
"""
if self.algorithm == self.AUTO:
return
elif self.algorithm == self.PERCENT:
try:
self.value = int(self.value)
except ValueError:
self.value = self.DEFAULT_PERCENT
self.value = max(min(self.value, 100), 0)
else:
try:
self.value = float(self.value)
except ValueError:
self.value = 0.0
if (
(self.value <= 0) and
(self.algorithm in [self.RATE, self.RATEAGGRESSIVE])
):
self.value = self.DEFAULT_MAX_RATE

def adjust(self):
"""
Adjust the boundaries of the text map.
:rtype: list of intervals
"""
if self.text_map is None:
raise AttributeError("Text map is None")
if self.algorithm == self.AUTO:
return self._adjust_auto()
elif self.algorithm == self.AFTERCURRENT:
@@ -153,14 +187,13 @@ def _adjust_auto(self):
def _adjust_offset(self):
self._log("Called _adjust_offset")
try:
value = float(self.value)
for index in range(1, len(self.text_map)):
current = self.text_map[index]
previous = self.text_map[index - 1]
if value >= 0:
offset = min(value, current[1] - current[0])
if self.value >= 0:
offset = min(self.value, current[1] - current[0])
else:
offset = -min(-value, previous[1] - previous[0])
offset = -min(-self.value, previous[1] - previous[0])
previous[1] += offset
current[0] += offset
except:
@@ -170,18 +203,15 @@ def _adjust_offset(self):
def _adjust_percent(self):
def new_time(current_boundary, nsi):
duration = nsi[1] - nsi[0]
try:
percent = max(min(int(self.value), 100), 0) / 100.0
except:
percent = 0.500
percent = self.value / 100.0
return nsi[0] + duration * percent
return self._adjust_on_nsi(new_time)

def _adjust_aftercurrent(self):
def new_time(current_boundary, nsi):
duration = nsi[1] - nsi[0]
try:
delay = max(min(float(self.value), duration), 0)
delay = max(min(self.value, duration), 0)
if delay == 0:
return current_boundary
return nsi[0] + delay
@@ -193,7 +223,7 @@ def _adjust_beforenext(self):
def new_time(current_boundary, nsi):
duration = nsi[1] - nsi[0]
try:
delay = max(min(float(self.value), duration), 0)
delay = max(min(self.value, duration), 0)
if delay == 0:
return current_boundary
return nsi[1] - delay
@@ -241,23 +271,23 @@ def _adjust_on_nsi(self, new_time_function):
def _len(self, string):
"""
Return the length of the given string.
If it is greater than 2 times the max_rate,
If it is greater than 2 times the self.value (= user max rate),
one space will become a newline,
and hence we do not count it
(e.g., max_rate = 21 => max 42 chars per line).
(e.g., value = 21 => max 42 chars per line).
:param string: the string to be counted
:type string: string
:rtype: int
"""
# TODO this should depend on the number of lines
# in the text fragment; current code assumes
# at most 2 lines of at most max_rate characters each
# at most 2 lines of at most value characters each
# (the effect of this finesse is negligible in practice)
if string is None:
return 0
length = len(string)
if length > 2 * self.max_rate:
if length > 2 * self.value:
length -= 1
return length

@@ -336,7 +366,7 @@ def _compute_slack(self, index):
Return the slack of a fragment, that is,
the difference between the current duration
of the fragment and the duration it should have
if its rate was exactly self.max_rate
if its rate was exactly self.value (= max rate)
If the slack is positive, the fragment
can be shrunk; if the slack is negative,
@@ -356,15 +386,9 @@ def _compute_slack(self, index):
end = fragment[1]
length = self._len(fragment[3])
duration = end - start
return duration - (length / self.max_rate)
return duration - (length / self.value)

def _adjust_rate(self, aggressive=False):
try:
self.max_rate = float(self.value)
except:
pass
if self.max_rate <= 0:
self.max_rate = self.DEFAULT_MAX_RATE
faster = []

# TODO numpy-fy this loop?
@@ -373,7 +397,7 @@ def _adjust_rate(self, aggressive=False):
self._log(["Fragment %d", index])
rate = self._compute_rate(index)
self._log([" %.3f %.3f => %.3f", fragment[0], fragment[1], rate])
if rate > self.max_rate:
if rate > self.value:
self._log(" too fast")
faster.append(index)

@@ -382,7 +406,7 @@ def _adjust_rate(self, aggressive=False):
return self.text_map

if len(faster) == 0:
self._log(["No fragment faster than max rate %.3f", self.max_rate])
self._log(["No fragment faster than max rate %.3f", self.value])
return self.text_map

# TODO numpy-fy this loop?
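
The `_parse_value` method added in this commit moves all parsing of the user value into the constructor, so the `_adjust_*` methods can use `self.value` directly. A minimal standalone sketch of that normalization follows. The function name `normalize_value` is hypothetical, and the lowercase algorithm strings other than `"aftercurrent"` are assumed, since the diff does not show them. The sketch also catches `TypeError` so that a `None` value falls back to the default; that is an addition of this sketch, not something the diff does.

```python
# Constants copied from the diff.
DEFAULT_MAX_RATE = 21.0
DEFAULT_PERCENT = 50


def normalize_value(algorithm, value):
    """Return the numeric value to use for the given algorithm name.

    Hypothetical helper mirroring the branches of _parse_value.
    The strings "auto", "percent", "rate" and "rateaggressive"
    are assumptions; only "aftercurrent" appears in the diff.
    """
    if algorithm == "auto":
        # The auto algorithm ignores the value, so leave it untouched.
        return value
    if algorithm == "percent":
        try:
            value = int(value)
        except (TypeError, ValueError):  # TypeError: addition of this sketch
            value = DEFAULT_PERCENT
        # Clamp the percentage to the range [0, 100].
        return max(min(value, 100), 0)
    try:
        value = float(value)
    except (TypeError, ValueError):  # TypeError: addition of this sketch
        value = 0.0
    # A non-positive max rate makes no sense, so use the default.
    if value <= 0 and algorithm in ("rate", "rateaggressive"):
        value = DEFAULT_MAX_RATE
    return value
```

For example, `normalize_value("percent", "150")` is clamped to `100`, and `normalize_value("rate", "-3")` falls back to `DEFAULT_MAX_RATE`.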
Expand Down

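The slack described in the `_compute_slack` docstring can be worked through with plain numbers. The sketch below is a standalone reading of `_len` and `_compute_slack` after this commit, with the max rate passed in explicitly instead of being read from `self.value`. The function names `text_length` and `compute_slack` and their signatures are hypothetical.

```python
def text_length(string, max_rate):
    """Length of a fragment text, as in _len.

    If the text is longer than 2 * max_rate characters, one space
    is assumed to become a newline and is not counted.
    """
    if string is None:
        return 0
    length = len(string)
    if length > 2 * max_rate:
        length -= 1
    return length


def compute_slack(start, end, text, max_rate):
    """Slack of a fragment, as in _compute_slack.

    Difference between the actual duration and the duration the
    fragment would have if it were read at exactly max_rate
    characters per second.
    """
    duration = end - start
    return duration - text_length(text, max_rate) / float(max_rate)
```

With `max_rate = 21.0`, a 21-character fragment lasting 2 seconds has a slack of 1 second, while a 42-character fragment lasting 1 second has a slack of -1 second.
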