GitHub - ulkursln/sphinx4: Pure Java speech recognition library



*************************************************MAGiC***********************************************


What's new in modified version
-------------------------------------------------------------------

CMUSphinx does not generate segments for the whole audio; in particular,
it skips the parts where the speaker could not be identified.
However, those non-segmented parts might contain information useful for researchers.
Thus, we carried out additional development to automatically generate audio segments from the non-segmented parts of an audio file.
Furthermore, a function was implemented to create an output file listing the time interval of each segment at millisecond resolution.
We also added new functionality to segment audio according to specified intervals.
Lastly, the fatJar plugin was added to create an executable jar file with all dependencies.
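
The idea of covering the non-segmented parts can be illustrated with a minimal sketch. It is not the code used in this repository: the class GapSegments, the Interval type, and the tab-separated output are hypothetical names and choices made for illustration. It assumes the detected segments are sorted by start time and lie within the audio duration, and it returns the millisecond intervals that no detected segment covers.

```java
import java.util.ArrayList;
import java.util.List;

public class GapSegments {

    /** Time interval in milliseconds (hypothetical helper type). */
    static final class Interval {
        final long startMs;
        final long endMs;

        Interval(long startMs, long endMs) {
            this.startMs = startMs;
            this.endMs = endMs;
        }

        @Override
        public String toString() {
            // Assumed output layout: start and end separated by a tab.
            return startMs + "\t" + endMs;
        }
    }

    /**
     * Returns the parts of [0, totalMs) that are not covered by any
     * detected segment. Assumes the input is sorted by start time.
     */
    static List<Interval> gaps(List<Interval> detected, long totalMs) {
        List<Interval> result = new ArrayList<>();
        long cursor = 0; // end of the audio covered so far
        for (Interval seg : detected) {
            if (seg.startMs > cursor) {
                result.add(new Interval(cursor, seg.startMs));
            }
            cursor = Math.max(cursor, seg.endMs);
        }
        if (cursor < totalMs) {
            result.add(new Interval(cursor, totalMs));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Interval> detected = new ArrayList<>();
        detected.add(new Interval(500, 2000));
        detected.add(new Interval(3500, 6000));
        // Print one uncovered interval per line for an 8-second audio.
        for (Interval gap : gaps(detected, 8000)) {
            System.out.println(gap);
        }
    }
}
```

Merging the detected segments with these gaps gives a set of intervals that spans the whole audio, which is the property the modified version aims for.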
  
	
Instructions to run jar file with added features:
-------------------------------------------------------------------
 
   Jar file is named: sphinx4-core-all-1.0.jar
   
   1. Segment whole audio and create the segments-interval file
      $path_JavaExe -cp $path_jarFile $className -i $input -o $segmentsOutputFolder -a $outputTextFilePath
      (Eg: C:javapath\java.exe -cp sphinx4-core-all-1.0.jar edu.cmu.sphinx.tools.endpoint.Segmenter -i D:\1.wav -o D:\1\ -a D:\1\)
 
   2. Segment audio according to a specific segments-interval file
      $path_JavaExe -cp $path_jarFile $className -i $input -o $segmentsOutputFolder -comb $combinedIntervalsFile
      (Eg: C:javapath\java.exe -cp sphinx4-core-all-1.0.jar edu.cmu.sphinx.tools.endpoint.Segmenter -i D:\1.wav -o D:\1\ -comb D:\combined_segments_interval.txt)
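
The two invocations above can also be assembled from Java, for example when batch-processing many files. The following is a rough sketch under stated assumptions: the class SegmenterCommand and its method names are hypothetical, and the paths in main ("java", "1.wav", "out") are placeholders. Only the jar name, the main class, and the -i, -o, -a and -comb flags are taken from the instructions above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SegmenterCommand {

    // Jar name and main class as given in the instructions above.
    static final String JAR = "sphinx4-core-all-1.0.jar";
    static final String MAIN_CLASS = "edu.cmu.sphinx.tools.endpoint.Segmenter";

    /** Mode 1: segment the whole audio and write the segments-interval file. */
    static List<String> segmentWhole(String javaExe, String input,
                                     String outDir, String intervalOut) {
        return build(javaExe, input, outDir, "-a", intervalOut);
    }

    /** Mode 2: segment the audio according to an existing intervals file. */
    static List<String> segmentByIntervals(String javaExe, String input,
                                           String outDir, String intervalsFile) {
        return build(javaExe, input, outDir, "-comb", intervalsFile);
    }

    // Shared argument layout: java -cp <jar> <class> -i <in> -o <out> <flag> <arg>
    private static List<String> build(String javaExe, String input, String outDir,
                                      String modeFlag, String modeArg) {
        return new ArrayList<>(Arrays.asList(
                javaExe, "-cp", JAR, MAIN_CLASS,
                "-i", input, "-o", outDir, modeFlag, modeArg));
    }

    public static void main(String[] args) {
        // Placeholder paths; replace with real locations before launching.
        List<String> cmd = segmentWhole("java", "1.wav", "out", "out");
        System.out.println(String.join(" ", cmd));
        // To actually run the command, one option is:
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Passing the arguments as a list rather than a single string avoids quoting problems with paths that contain spaces or backslashes.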



*************************************************MAGiC***********************************************



Sphinx-4 Speech Recognition System
-------------------------------------------------------------------

Sphinx-4 is a state-of-the-art, speaker-independent, continuous speech
recognition system written entirely in the Java programming language. It
was created via a joint collaboration between the Sphinx group at
Carnegie Mellon University, Sun Microsystems Laboratories, Mitsubishi
Electric Research Labs (MERL), and Hewlett Packard (HP), with
contributions from the University of California at Santa Cruz (UCSC) and
the Massachusetts Institute of Technology (MIT).

The design of Sphinx-4 is based on patterns that have emerged from the
design of past systems as well as new requirements based on areas that
researchers currently want to explore.  To exercise this framework, and
to provide researchers with a "research-ready" system, Sphinx-4 also
includes several implementations of both simple and state-of-the-art
techniques.  The framework and the implementations are all freely
available via open source under a very generous BSD-style license.

Because it is written entirely in the Java programming language, Sphinx-4
can run on a variety of platforms without requiring any special
compilation or changes.  We've tested Sphinx-4 successfully on several
platforms.

To get started with Sphinx-4, visit our wiki:

    http://cmusphinx.sourceforge.net/wiki

Please give Sphinx-4 a try and post your questions, comments, and
feedback to one of the CMU Sphinx Forums:

    http://sourceforge.net/p/cmusphinx/discussion/sphinx4
    
We can also be reached at cmusphinx-devel@lists.sourceforge.net.

Sincerely,

The Sphinx-4 Team:  
(in alph. order)    

Evandro Gouvea, CMU (developer and speech advisor)
Peter Gorniak, MIT (developer)
Philip Kwok, Sun Labs (developer)
Paul Lamere, Sun Labs (design/technical lead)
Beth Logan, HP (speech advisor)
Pedro Moreno, Google (speech advisor)
Bhiksha Raj, MERL (design lead)
Mosur Ravishankar, CMU (speech advisor)
Bent Schmidt-Nielsen, MERL (speech advisor)
Rita Singh, CMU/MIT (design/speech advisor)
JM Van Thong, HP (speech advisor)
Willie Walker, Sun Labs (overall lead)
Manfred Warmuth, UCSC (speech advisor)
Joe Woelfel, MERL (developer and speech advisor)
Peter Wolf, MERL (developer and speech advisor)