New pocketsphinx (#373)
* Updated the pocketsphinx installation

Using APT and PyPI packages instead of building from source.
Switched from cmuclmtk to KenLM.

* Moved KenLM info

Moved the KenLM directory info out of Pocketsphinx and into its
own section since it is quite likely that KenLM will be needed for
other STT engines.

* Remove STT installers

Remove openfst, mitlm, cmuclmtk, Phonetisaurus, sphinxbase,
pocketsphinx, and the pocketsphinx python module because they
are now installed from apt or pip repositories (at least on
Debian-type systems), which is much faster.

* Fixing unit tests

The test_g2p tests were using a mock object to replace a call to
subprocess with a custom object. Since I'm using the phonetisaurus
package directly now, this was replacing the wrong object and
causing errors. I replaced the results of the predict method from
phonetisaurus instead.
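
A minimal sketch of that kind of test, assuming a `predict(words, model_path)` function that yields `(word, phonemes)` pairs. The `fake_phonetisaurus` namespace and the `translate` helper are hypothetical stand-ins so the example runs without the real package; they are not the code in test_g2p:

```python
from types import SimpleNamespace
from unittest import mock


def _real_predict(words, model_path):
    # The real call would need an FST model on disk, so a unit test avoids it.
    raise RuntimeError("would need a real FST model")


# Hypothetical stand-in for the imported phonetisaurus module.
fake_phonetisaurus = SimpleNamespace(predict=_real_predict)


def translate(words, model_path="g2p.fst"):
    """Hypothetical converter: collect the (word, phonemes) pairs from predict()."""
    return {
        word: " ".join(phonemes)
        for word, phonemes in fake_phonetisaurus.predict(words, model_path)
    }


def translate_with_canned_results():
    """Patch predict() itself, not subprocess, and return what translate() sees."""
    canned = [
        ("GOOD", ["G", "UH", "D"]),
        ("MORNING", ["M", "AO", "R", "N", "IH", "NG"]),
    ]
    with mock.patch.object(
        fake_phonetisaurus, "predict", return_value=canned
    ) as patched:
        result = translate(["GOOD", "MORNING"])
    # The patch must sit on the object the code under test actually calls.
    patched.assert_called_once_with(["GOOD", "MORNING"], "g2p.fst")
    return result
```

Patching the function that the code under test looks up at call time is what keeps the mock from "replacing the wrong object".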

* Fix phonetisaurus install

For some reason, `pip install phonetisaurus` sometimes only works
on x86_64 platforms. If you are running on a Raspberry Pi, you
sometimes have to download the binary from GitHub instead.

* Add python3-dev as apt requirement

python3-dev is not installed by default on Armbian.

* Reinit Pocketsphinx vocabularies

I am working on a Libre ROC-RK3328-CC Renegade computer as my main
development board for Naomi. I am suddenly getting this error
after using "expect" (in the joke plugin):

ERROR:naomi.mic:Passive transcription failed!
Traceback (most recent call last):
  File "/home/naomi/Naomi/naomi/mic.py", line 318, in wait_for_keyword
    transcribed = [word.upper() for word in self.passive_stt_engine.transcribe(f)]
  File "/home/naomi/Naomi/plugins/stt/pocketsphinx-stt/sphinxplugin.py", line 417, in transcribe
    self._decoder.process_raw(data, False, True)
  File "_pocketsphinx.pyx", line 960, in _pocketsphinx.Decoder.process_raw
IndexError: Out of bounds on buffer access (axis 0)

I'm not sure what this error means. It always seems to affect the
passive listening engine. Usually I use the pocketsphinx_kws plugin
for passive listening, so this may have been happening for a while
and I just didn't notice. Re-initializing the engine seems to work
fine, so I'm just keeping an eye on it. The biggest problem with
re-initializing is that with our current audio system, Naomi tends
to stutter when the re-initialization happens. This is one reason
for wanting to move to a multi-threaded approach and see if that
helps.
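
The re-initialization workaround can be sketched roughly as follows. `FlakyDecoder`, `Transcriber`, and `make_factory` are hypothetical names used only for illustration; the real plugin wraps a Pocketsphinx decoder, which this sketch does not import:

```python
class FlakyDecoder:
    """Hypothetical stand-in for a decoder that may raise IndexError."""

    def __init__(self, fail=False):
        self.fail = fail

    def process_raw(self, data):
        if self.fail:
            raise IndexError("Out of bounds on buffer access (axis 0)")
        return len(data)


class Transcriber:
    """Rebuild the decoder and retry once when process_raw() fails."""

    def __init__(self, make_decoder):
        self._make_decoder = make_decoder
        self._decoder = make_decoder()
        self.reinit_count = 0

    def transcribe(self, data):
        try:
            return self._decoder.process_raw(data)
        except IndexError:
            # Assumption: a fresh decoder clears whatever state went bad.
            self._decoder = self._make_decoder()
            self.reinit_count += 1
            return self._decoder.process_raw(data)


def make_factory():
    """Return a factory whose first decoder fails and later ones work."""
    states = iter([True])
    return lambda: FlakyDecoder(fail=next(states, False))
```

Retrying only once keeps a persistent failure visible instead of looping forever. The rebuild is also the expensive step, which would match the stutter described above.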

* Remove the fst file setting

The fst file is now generated inside the pocketsphinx hmm dir from
the cmudict.txt file. This means I can eliminate the fst_model
setting in addition to the phonetisaurus_executable setting. This
leaves only the hmm_dir setting.

* Fix G2P word translations

Fixed error in NaomiSTTTrainer.py when transcribing the last
available record.

Wrap the words passed to g2pconverter.translate in a list.
Otherwise, each word is translated letter by letter, which causes
long words to match just about any noise.
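
The underlying Python behavior is easy to reproduce: iterating over a bare string yields its characters. In this sketch, `translate` is a hypothetical stand-in that simply reports the items it iterates over, and `as_word_list` is an illustrative helper, not a function from the plugin:

```python
def translate(words):
    """Hypothetical stand-in: return one entry per item it iterates over."""
    return [str(item) for item in words]


def as_word_list(words):
    """Wrap a single string in a list so it is treated as one word."""
    return [words] if isinstance(words, str) else list(words)


letters = translate("HELLO")               # iterates over characters
whole = translate(as_word_list("HELLO"))   # iterates over one word
```

Here `letters` holds five single-character entries, while `whole` holds the single entry `"HELLO"`.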

* Fix "Too many open files" error

Recently, I have been getting a "Too many open files" error. This
appears to be because a file handle is opened to /dev/null so
STDERR can be redirected to it before performing some operations
in Pocketsphinx, but the handle is never closed. This change wraps
the io.open call in a context manager so the handle is closed
automatically when the block exits.
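
A minimal sketch of the pattern, assuming only Python-level writes to `sys.stderr` need silencing. The plugin's actual redirection may work at the file-descriptor level, which `contextlib.redirect_stderr` does not cover. `run_quietly` is a hypothetical helper name:

```python
import io
import os
from contextlib import redirect_stderr


def run_quietly(operation):
    """Call operation() with sys.stderr pointed at the null device.

    The with-statement closes the handle on exit, even if operation() raises,
    so repeated calls do not leak file descriptors.
    """
    with io.open(os.devnull, "w") as devnull:
        with redirect_stderr(devnull):
            result = operation()
    # Reported here only to show that the handle is closed after the block.
    return result, devnull.closed
```

Calling `run_quietly` in a loop leaves no open handles behind, which is the property the fix is after.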

Also fixed an issue reported by CodeQL where a user-provided path
is being used to serve content.

* 2nd attempt to control wavfile path

CodeQL still didn't like my solution, I think because the path
check is disconnected from the use. 2nd attempt.
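
One common way to keep the check attached to the use is to have a single function both validate and return the path, so callers only ever open what it returns. This is a sketch under that assumption, not the code in this commit; `resolve_wavfile` and its parameters are hypothetical:

```python
import os


def resolve_wavfile(base_dir, requested_name):
    """Return an absolute path inside base_dir, or raise ValueError."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symlinks before the check.
    candidate = os.path.realpath(os.path.join(base, requested_name))
    if os.path.commonpath([base, candidate]) != base:
        raise ValueError("requested path escapes the audio directory")
    return candidate
```

Because the validated value is the one that gets opened, there is no window where an unchecked user-supplied string reaches the file system call.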
aaronchantrill committed Apr 21, 2023
1 parent e2d3707 commit 7272c31
Showing 15 changed files with 481 additions and 794 deletions.
146 changes: 78 additions & 68 deletions NaomiSTTTrainer.py

Large diffs are not rendered by default.

28 changes: 13 additions & 15 deletions apt_requirements.txt
@@ -8,6 +8,7 @@ jq
libatlas3-base
libncurses5-dev
libssl-dev
python3-dev
python3-pip
util-linux

@@ -21,23 +22,20 @@ pulseaudio-utils

## STT
### PocketSphinx is default for wakeword spotting
bison
libpulse-dev
swig
pocketsphinx
sphinxtrain

#### Phonetisaurus is used to build dictionaries
autoconf
autoconf-archive
bison
g++
gcc
gfortran
git
libtool
make
subversion
swig
### KenLM for building ARPA models
build-essential
cmake
libboost-system-dev
libboost-thread-dev
libboost-program-options-dev
libboost-test-dev
libeigen3-dev
zlib1g-dev
libbz2-dev
liblzma-dev

# TTS
## flite
193 changes: 0 additions & 193 deletions installers/script.deb.sh
@@ -485,199 +485,6 @@ setup_wizard() {
find ~/.config/naomi -maxdepth 1 -iname '*.sh' -type f -exec chmod a+x {} \;
find ~/Naomi/installers -maxdepth 1 -iname '*.sh' -type f -exec chmod a+x {} \;

echo
printf "${B_W}=========================================================================${NL}"
printf "${B_W}PLUGIN SETUP${NL}"
printf "${B_W}Now we'll tackle the default plugin options available for Text-to-Speech, Speech-to-Text, and more.${NL}"
echo
sleep 3
echo

# Build Phonetisaurus
# Building and installing openfst
echo
printf "${B_G}Building and installing openfst...${B_W}${NL}"
cd ~/.config/naomi/sources

if [ ! -f "openfst-1.6.9.tar.gz" ]; then
wget http://www.openfst.org/twiki/pub/FST/FstDownload/openfst-1.6.9.tar.gz
fi
tar -zxvf openfst-1.6.9.tar.gz
cd openfst-1.6.9
autoreconf -i
./configure --enable-static --enable-shared --enable-far --enable-lookahead-fsts --enable-const-fsts --enable-pdt --enable-ngram-fsts --enable-linear-fsts --prefix=/usr
make
if [ $REQUIRE_AUTH -eq 1 ]; then
SUDO_COMMAND "sudo make install"
if [ $? -ne 0 ]; then
echo $! >&2
exit 1
fi
else
printf "${B_W}${NL}"
sudo make install
if [ $? -ne 0 ]; then
echo $! >&2
exit 1
fi
fi

if [ -z "$(which fstinfo)" ]; then
printf "${ERROR} ${B_R}Notice:${B_W} openfst not installed${NL}" >&2
exit 1
fi

# Building and installing mitlm-0.4.2
echo
printf "${B_G}Installing & Building mitlm-0.4.2...${B_W}${NL}"
cd ~/.config/naomi/sources
if [ ! -d "mitlm" ]; then
git clone https://github.com/mitlm/mitlm.git
if [ $? -ne 0 ]; then
printf "${ERROR} ${B_R}Notice:${B_W} Error cloning mitlm${NL}"
exit 1
fi
fi
cd mitlm
./autogen.sh
make
printf "${B_G}Installing mitlm${B_W}${NL}"
if [ $REQUIRE_AUTH -eq 1 ]; then
SUDO_COMMAND "sudo make install"
if [ $? -ne 0 ]; then
echo $! >&2
exit 1
fi
else
printf "${B_W}${NL}"
sudo make install
if [ $? -ne 0 ]; then
echo $! >&2
exit 1
fi
fi

# Building and installing CMUCLMTK
echo
printf "${B_G}Installing & Building cmuclmtk...${B_W}${NL}"
cd ~/.config/naomi/sources
svn co https://svn.code.sf.net/p/cmusphinx/code/trunk/cmuclmtk/
if [ $? -ne 0 ]; then
printf "${ERROR} ${B_R}Notice:${B_W} Error cloning cmuclmtk${NL}" >&2
exit 1
fi
cd cmuclmtk
./autogen.sh
make
printf "${B_G}Installing CMUCLMTK${B_W}${NL}"
if [ $REQUIRE_AUTH -eq 1 ]; then
SUDO_COMMAND "sudo make install"
else
printf "${B_W}${NL}"
sudo make install
fi

printf "${B_G}Linking shared libraries${B_W}${NL}"
if [ $REQUIRE_AUTH -eq 1 ]; then
SUDO_COMMAND "sudo ldconfig"
else
printf "${B_W}${NL}"
sudo ldconfig
fi

# Building and installing phonetisaurus
echo
printf "${B_G}Installing & Building phonetisaurus...${B_W}${NL}"
cd ~/.config/naomi/sources
if [ ! -d "Phonetisaurus" ]; then
git clone https://github.com/AdolfVonKleist/Phonetisaurus.git
if [ $? -ne 0 ]; then
printf "${ERROR} ${B_R}Notice:${B_W} Error cloning Phonetisaurus${NL}" >&2
exit 1
fi
fi
cd Phonetisaurus
./configure --enable-python
make
printf "${B_G}Installing Phonetisaurus${B_W}${NL}"
printf "${B_G}Linking shared libraries${B_W}${NL}"
if [ $REQUIRE_AUTH -eq 1 ]; then
SUDO_COMMAND "sudo make install"
else
printf "${B_W}${NL}"
sudo make install
fi

printf "[$(pwd)]\$ ${B_G}cd python${B_W}${NL}"
cd python
echo $(pwd)
cp -v ../.libs/Phonetisaurus.so ./
if [ $REQUIRE_AUTH -eq 1 ]; then
SUDO_COMMAND "sudo python setup.py install"
else
printf "${B_W}${NL}"
sudo python setup.py install
fi

if [ -z "$(which phonetisaurus-g2pfst)" ]; then
printf "${ERROR} ${B_R}Notice:${B_W} phonetisaurus-g2pfst does not exist${NL}" >&2
exit 1
fi

# Installing & Building sphinxbase
echo
printf "${B_G}Building and installing sphinxbase...${B_W}${NL}"
cd ~/.config/naomi/sources
if [ ! -d "pocketsphinx-python" ]; then
git clone --recursive https://github.com/bambocher/pocketsphinx-python.git
if [ $? -ne 0 ]; then
printf "${ERROR} ${B_R}Notice:${B_W} Error cloning pocketsphinx${NL}" >&2
exit 1
fi
fi
cd pocketsphinx-python/deps/sphinxbase
./autogen.sh
make
if [ $REQUIRE_AUTH -eq 1 ]; then
SUDO_COMMAND "sudo make install"
else
printf "${B_W}${NL}"
sudo make install
fi

# Installing & Building pocketsphinx
echo
printf "${B_G}Building and installing pocketsphinx...${B_W}${NL}"
cd ~/.config/naomi/sources/pocketsphinx-python/deps/pocketsphinx
./autogen.sh
make
if [ $REQUIRE_AUTH -eq 1 ]; then
SUDO_COMMAND "sudo make install"
else
printf "${B_W}${NL}"
sudo make install
fi

# Installing PocketSphinx Python module
echo
printf "${B_G}Installing PocketSphinx module...${B_W}${NL}"
cd ~/.config/naomi/sources/pocketsphinx-python
python setup.py install

cd $NAOMI_DIR
if [ -z "$(which text2wfreq)" ]; then
printf "${ERROR} ${B_R}Notice:${B_W} text2wfreq does not exist${NL}" >&2
exit 1
fi
if [ -z "$(which text2idngram)" ]; then
printf "${ERROR} ${B_R}Notice:${B_W} text2idngram does not exist${NL}" >&2
exit 1
fi
if [ -z "$(which idngram2lm)" ]; then
printf "${ERROR} ${B_R}Notice:${B_W} idngram2lm does not exist${NL}" >&2
exit 1
fi

# Compiling Translations
echo
printf "${B_G}Compiling Translations...${B_W}${NL}"
4 changes: 2 additions & 2 deletions naomi/coloredformatting.py
@@ -31,7 +31,7 @@ class naomidefaults:
     lt='\033[1;90m' #Bright Black For lower text
     pq='\033[1;94m' #Bright Blue For prompt question
     pc='\033[1;95m' #Bright Magenta For prompt choices
-    sto='\033[1;97m' #Bright White For standard text output
+    sto='\033[0m' #Default For standard text output
 
 # How to use
 # from coloredformatting import colors
@@ -122,4 +122,4 @@ class logd:
success="\033[1;90m SUCCESS \033[1;32m"
error="\033[1;90m ERROR \033[0;31m"
critical="\033[1;90m CRITICAL \033[1;31m"


23 changes: 7 additions & 16 deletions naomi/data/standard_phrases/en-US.txt
@@ -1,19 +1,10 @@
BE
BEING
BUT
DID
FIRST
HEY
IN
IS
IT
NOW
OF
ON
NO
OKAY
OOPS
PLEASE
RIGHT
SAY
WHAT
WHICH
WITH
WORK
SO
THANK
WHOOPS
YOU
3 changes: 2 additions & 1 deletion naomi/run_command.py
@@ -47,8 +47,9 @@ def run_command(command, capture=0, stdin=None, cwd=None):
     elif (capture == 4):
         completedprocess = subprocess.run(
             command,
-            stdout=subprocess.PIPE,
+            input=stdin.encode() if hasattr(stdin, "encode") else stdin,
+            stdout=subprocess.PIPE,
             stderr=subprocess.PIPE,
             cwd=cwd
         )
     else:
21 changes: 13 additions & 8 deletions naomi/testutils.py
@@ -26,21 +26,26 @@ def test_profile():
     }
     try:
         TEST_PROFILE['pocketsphinx'] = config['pocketsphinx']
-    except(KeyError, NameError):
+    except (KeyError, NameError):
         pass
     try:
         TEST_PROFILE['language'] = config['language']
-    except(KeyError, NameError):
+    except (KeyError, NameError):
         pass
     try:
-        TEST_PROFILE['google'] = {}
-        TEST_PROFILE['google']['credentials_json'] = config['google']['credentials_json']
-    except(KeyError, NameError):
+        TEST_PROFILE['google'] = {
+            'credentials_json': config['google']['credentials_json']
+        }
+    except (KeyError, NameError):
         pass
     try:
         TEST_PROFILE['key'] = config['key']
         TEST_PROFILE['email'] = config['email']
-    except(KeyError, NameError):
+    except (KeyError, NameError):
         pass
+    try:
+        TEST_PROFILE['kenlm'] = {'source_dir': config['kenlm']['source_dir']}
+    except (KeyError, NameError):
+        pass
     return TEST_PROFILE

@@ -69,8 +74,8 @@ def expect(self, prompt, phrases, name='expect', instructions=None):
     # For now, assume the input is "YES" or "NO"
     def confirm(self, prompt):
         (matched, phrase) = self.expect("confirm", prompt, ['YES', 'NO'])
-        if(matched):
-            if(phrase in ['YES']):
+        if matched:
+            if phrase in ['YES']:
                 phrase = 'Y'
             else:
                 phrase = 'N'