Updated README and ACKNOWLEDGEMENTS, might need to update COPYING.

MycroftAI · Feb 4, 2016 · 5b9aa7b
1 parent 66fd64d
Showing 2 changed files with 36 additions and 229 deletions.
diff --git a/ACKNOWLEDGEMENTS b/ACKNOWLEDGEMENTS
@@ -1,12 +1,15 @@
+Mimic is currently being developed and maintained by Mycroft AI, Inc.
+Mycroft AI is the company behind the Mycroft artificial intelligence
+and voice assistant platform, more info here: https://mycroft.ai
 
-The initial development of flite was primarily done by awb while
-travelling, perhaps the name is doubly appropriate as a substantial
-amount of the coding was done over 30,000ft).  During most of that
-time awb was funded by the Language Technonologies Institute at
-Carnegie Mellon University.
+The initial development of Mimic was primarily done by awb through
+the Flite project while he was traveling, perhaps the name is
+doubly appropriate as a substantial amount of the coding was
+done over 30,000ft).  During most of that time awb was funded
+by the Language Technologies Institute at Carnegie Mellon University.
 
 Kevin A. Lenzo was involved in the design, conversion techniques and
-representions for the voice distributed with flite (as well as being
+representations for the voice distributed with Flite (as well as being
 the actual kal voice itself).
 
 Other contributions are:
@@ -15,7 +18,7 @@ Henry Spencer
    For the regex code
 University of Edinburgh
    for releasing Festival for free, making a companion runtime synthesizer
-   a practical project, much of the design of flite relies on the 
+   a practical project, much of the design of flite relies on the
    architecture decisions made in the Festival Speech Synthesis Systems and
    the Edinburgh Speech Tools.
    The duration cart tree and intonation (accent and F0) models were
@@ -34,7 +37,7 @@ Marcela Charfuelan (DFKI)
    For the mixed-excitation techniques.  These originally came from NITECH
    but we understood the technqiues from Marcela's Open Mary Java code and
    implemented them in our optimized version of MLSA.
-David Huggins-Daines (dhd@cepstral.com) 
+David Huggins-Daines (dhd@cepstral.com)
    much of the clunits code, porting to multiple platforms, substantial
    code tidy up and configure/autoconf guidance.
 Cepstral, LLC (http://cepstral.com)
@@ -55,7 +58,7 @@ Mario Lang:
 Eric House (fixin@peak.org)
    who provided examples of how to do 68K Call Backs for system functions
 Greg Parker gparker@sealiesoftware.com
-   peal, the binding glue and shared library foo for getting the arm 
+   peal, the binding glue and shared library foo for getting the arm
    version doing something reasonable under PalmOS
 Lukas Loehrer <loehrerl@gmx.net> Feb 2006
    alsa support (default if available)

diff --git a/README b/README
@@ -1,202 +1,43 @@
 
-         Flite: a small run-time speech synthesis engine
-                      version 2.0.0-release
-          Copyright Carnegie Mellon University 1999-2014
-                      All rights reserved
-                      http://cmuflite.org
-
-
-Flite is a small fast run-time speech synthesis engine.  It is the
-latest addition to the suite of free software synthesis tools
-including University of Edinburgh's Festival Speech Synthesis System
-and Carnegie Mellon University's FestVox project, tools, scripts and
-documentation for building synthetic voices.  However, flite itself
-does not require either of these systems to compile and run.
-
-The core Flite library was developed by Alan W Black <awb@cs.cmu.edu>
-(mostly in his so-called spare time) while employed in the Language
-Technologies Institute at Carnegie Mellon University.  The name
-"flite", originally chosen to mean "festival-lite" is perhaps doubly
-appropriate as a substantial part of design and coding was done over
-30,000ft while awb was travelling, and (usually) isn't in meetings.
-
-The voices, lexicon and language components of flite, both their
-compression techniques and their actual contents were developed by
-Kevin A. Lenzo <lenzo@cs.cmu.edu> and Alan W Black <awb@cs.cmu.edu>.
-
-Flite is the answer to the complaint that Festival is too big, too slow,
-and not portable enough.
-
-o Flite is designed for very small devices, such as PDAs, and also
-  for large server machines which need to serve lots of ports.
-o Flite is not a replacement for Festival but an alternative run time
-  engine for voices developed in the FestVox framework where size and
-  speed is crucial.
-o Flite is all in ANSI C, it contains no C++ or Scheme, thus requires
-  more care in programming, and is harder to customize at run time.
-o It is thread safe
-o Voices, lexicons and language descriptions can be compiled 
-  (mostly automatically for voices and lexicons) into C representations 
-  from their FestVox formats
-o All voices, lexicons and language model data are const and in the
-  text segment (i.e. they may be put in ROM).  As they are linked in
-  at compile time, there is virtually no startup delay.
-o Although the synthesized output is not exactly the same as the same 
-  voice in Festival they are effectively equivalent.  That is, flite 
-  doesn't sound better or worse than the equivalent voice in festival,
-  just faster, smaller and scalable.
-o For standard diphone voices, maximum run time memory
-  requirements are approximately less than twice the memory requirement 
-  for the waveform generated.  For 32bit archtectures
-  this effectively means under 1M.
-o The flite program supports, synthesis of individual strings or files
-  (utterance by utterance) to direct audio devices or to waveform files.
-o The flite library offers simple functions suitable for use in specific
-  applications.
-Flite is distributed with a single 8K diphone voice (derived from the
-cmu_us_kal voice), a pruned lexicon (derived from
-cmulex) and a set of models for US English.  Here are comparisons
-with Festival using basically the same 8KHz diphone voice
-                Flite    Festival
-   core code    60K      2.6M
-   USEnglish    100K     ??
-   lexicon      600K     5M
-   diphone      1.8M     2.1M
-   runtime      <1M      16-20M
-
-On a 500Mhz PIII, a timing test of the first two chapters of
-"Alice in Wonderland" (doc/alice) was done.  This produces about
-1300 seconds of speech.  With flite it takes 19.128 seconds (about
-70.6 times faster than real time) with Festival it takes 97 seconds
-(13.4 times faster than real time).  On the ipaq (with the 16KHz diphones)
-flite synthesizes 9.79 time faster than real time.
-
-Requirements:  
+                  Mimic, the Mycroft TTS Engine
+
+Mimic is a lightweight run-time speech synthesis engine, based on
+Flite (Festival-Lite). The Flite project website can be found
+here: http://www.festvox.org/flite/ - further information can be found
+in the ACKNOWLEDGEMENTS file in the Mimic repo.
+
+Requirements:
 
 o A good C compiler, some of these files are quite large and some C
   compilers might choke on these, gcc is fine.  Sun CC 3.01 has been
   tested too.  Visual C++ 6.0 is known to fail on the large diphone
   database files.  We recommend you use GCC under Cygwin or mingw32
   instead.
 o GNU Make
-o An audio device isn't required as flite can write its output to 
-  a waveform file. 
+o An audio device isn't required as flite can write its output to
+  a waveform file.
 
 Supported platforms:
 
-We have successfully compiled and run on 
+We have successfully compiled and run on
 
-o Various Intel Linux systems (and iPaq Linux), under various versions
-  of GCC (2.7.2 to 4.x)
+o Linux, with both ARM and Intel architectures under it
 o Mac OS X
-o Various Android devices
-o FreeBSD 3.x and 4.x
-o Solaris 5.7, and Solaris 9
-o Windows 2000/XP and later under Cygwin 1.3.5 and later
-o Successfully compiles and runs under 64Bit Linux architectures
-o OSF1 V4.0 (gives an unimportant warning about sizes when compiled cst_val.c)
-
-Previously we supported PalmOS and Windows CE but these seem to be rare
-nowadays so they are no longer actively supported.
-
-Other similar platforms should just work, we have also cross compiled
-on a Linux machine for StrongARM.  However note that new byte order
-architectures may not work directly as there is some careful
-byte order constraints in some structures.  These are portable but may
-require reordering of some fields, contact us if you are moving to
-a new archiecture.
-
-News
-----
-
-New in 2.0.0 (Dec 2014)
-    o Indic language support (Hindi, Tamil and Telugu)
-    o SSML support
-    o CG voices as files accessilble by file:/// and http://
-      (and set of 13 voices to load)
-    o random forest (multimodel support) improves voice quality
-    o Supports diffrent sample rates/mgc order to tune for speed
-    o Kal diphone 500K smaller
-    o Fixed lots of API issues
-    o thread safe (again) after initialization
-    o Generalized tokenstreams (used in Bard Storyteller)
-    o simple-Pulseaudio support
-    o Improved Android support
-    o Removed PalmOS support from distribution
-    o Companion multilingual ebook reader Bard Storyteller 
-       http://festvox.org/bard/
-
-New in 1.4.1 (March 2010)
-    o better ssml support (actually does something)
-    o better clunit support (smaller)
-    o Android support
-
-New in 1.4 (December 2009)
-    o crude multi-voice selection support (may change)
-    o 4 basic voices are included 3 clustergen (awb, rms and slt) plus
-      the kal diphone database
-    o CMULEX now uses maximum onset for syllabification
-    o alsa support
-    o Clustergen support (including mlpg with mixed excitation) 
-      But is still slow on limited processors
-    o Windows support with Visual Studio (specifically for the Olympus 
-        Spoken Dialog System)
-    o WinCE support is redone with cegcc/mingw32ce with example
-        example TTS app: Flowm: Flite on Windows Mobile
-    o Speed-ups in feature interpretation limiting calls to alloc
-    o Speed-ups (and fixes) for converting clunits festvox voices
-
-New in 1.3-release (October 2005)
-    o fixes to lpc residual extraction to give better quality output
-    o An updated lexicon (festlex_CMU from festival-2.0.95) and better
-      compression its about 30% of the previous size, with about 
-      the same accuracy
-    o Fairly substantial code movements to better support PalmOS and 
-      multi-platform cross compilation builds
-    o A PalmOS 5.0 port with an small example talking app ("flop")
-    o runs under ix86_64 linux
-
-New in 1.2-release  (February 2003)
-    o A build process for diphone and clunits/ldom voices
-      FestVox voices can be converted (sometimes) automatically
-    o Various bug fixes
-    o Initial support for Mac OS X (not talking to audio device yet)
-      but compiles and runs
-    o Text files can be synthesize to a single audio file
-    o (optional) shared library support (Linux)
+o Android
 
 Compilation
 -----------
 
 In general
 
-    tar zxvf flite-2.0.0-release.tar.gz
-    cd flite-2.0.0-release
-    ./configure 
-    make
-
-Where tar is gnu tar (gtar), and make is gnu make (gmake).
-
-Configuration should be automatic, but maybe doesn't work in all cases
-especially if you have some new compiler.  You can explicitly set to
-compiler in config/config and add any options you see fit.   Configure
-tries to guess these but it might be able for cross compilation cases
-Interesting options there are
-
--DWORDS_BIGENDIAN=1  for bigendian machines (e.g. Sparc, M68x)
--DNO_UNION_INITIALIZATION=1  For compilers without C 99 union inintialization
--DCST_AUDIO_NONE     if you don't need/want audio support
-
-There are different sets of voices and languages you can select between
-them (and your own sets if you make config/XXX.lv).  For example
-
-   ./configure --with-langvox=transtac
-
-Will use the languages and voices defined in config/transtac.lv
+    #TODO update this to reflect compilation
 
 Usage:
 ------
 
+    #TODO Shorten and update this to reflect current process,
+    update relevant filenames.
+
 The ./bin/flite voices contains all supported voices and you may
 choose between the voices with the -voice flag and list the supported
 voices with the -lv flag.  Note the kal (diphone) voice is a different
@@ -218,7 +59,7 @@ wave format often called .WAV).
 Will play the text file doc/alice.  If the first argument contains
 a space it is treated as text otherwise it is treated as a filename.
 If a second argument is given a waveform file is written to it,
-if no argument is given or "play" is given it will attempt to 
+if no argument is given or "play" is given it will attempt to
 write directly to the audio device (if supported).  if "none"
 is given the audio is simply thrown away (used for benchmarking).
 Explicit options are also available.
@@ -245,7 +86,7 @@ debugging.  Some typical examples are
 ./bin/flite --sets join_type=simple_join doc/intro.txt
      Use simple concatenation of diphones without prosodic modification
 ./bin/flite -pw doc/alice
-     Print sentences as they are said 
+     Print sentences as they are said
 ./bin/flite --setf duration_stretch=1.5 doc/alice
      Make it speak slower
 ./bin/flite --setf int_f0_target_mean=145 doc/alice
@@ -256,7 +97,7 @@ http://festvox.org/ldom it requires a single argument HH:MM
 under Unix you can call it
     ./bin/flite_time `date +%H:%M`
 
-./bin/flite -lv 
+./bin/flite -lv
     List the voices linked in directly in this build
 
 ./bin/flite -voice rms -f doc/alice
@@ -276,47 +117,10 @@ Voice names are identified as loadable files if the name includes a
 want to load voices from the current directory you need to prefix them
 with "./".
 
-Voice quality
+Voices
 -------------
 
-So you've eagerly downloaded flite, compiled it and run it, now you
-are disappointed that it doesn't sound wonderful, sure its fast and
-small but what you really hoped for was the dulcit tones of a deep
-baritone voice that would make you desperately hang on every phrase it
-mellifluously produces.  But instead you get an 8Khz diphone voice that
-sounds like it came from the last millenium.
-
-Well, first, you are right, it is an 8KHz diphone voice from the last
-millenium, and that was actually deliberate.  As we developed flite we
-wanted a voice that was stable and that we could directly compare with
-that very same voice in Festival.  Flite is an *engine*.  We want to
-be able take voices built with the FestVox process and compile them
-for flite, the result should be exactly the same quality (though of
-course trading the size for quality in flite is also an option).  The
-included voice is just a sample voice that was used in the testing
-process.  
-
-We expect that often voices will be loaded from external files, and we
-have now set up a voice repository on 
-   http://festvox.org/flite/voices/LANG/*.flitevox
-If you visit there with a browser you can hear the examples.  You can
-also download the .flitevox files to you machine so you don't need a
-network connect everytime you need to load a voice.
-Feb 2016: above repo no longer exists, voices can be found here
-  http://www.festvox.org/flite/packed/flite-2.0/voices/
-
-We are now actively adding to this list of available voices in English
-and other languages.
-
-Bard Storyteller:  http://festvox.org/bard/
--------------------------------------------
-
-Bard is a companion app that read ebooks, both displaying them and
-actually reading them to you using flite.  Bard supports a wide
-range of fonts, and flite voices, and books in text, html and
-epub format.  Bard is used as a evaluation of flites capabilities
-and an example of a serious application using flite.
-
-
-
+#TODO Explain where to find voices, and how to obtain new ones.
 
+You can find existing Flite voices here:
+  http://www.festvox.org/flite/packed/flite-2.0/voices/