Dr.VOT : Measuring Positive and Negative Voice Onset Time in the Wild

Yosi Shrem (joseph.shrem@campus.technion.ac.il)
Joseph Keshet (jkeshet@technion.ac.il)

Dr.VOT is a software package for automatic measurement of voice onset time (VOT). We propose a neural architecture composed of a recurrent neural network (RNN) and a structured prediction model. Dr.VOT handles both positive and negative VOTs and is robust to variations across annotators.

This is a beta version of Dr.VOT. Any reports of bugs, comments on how to improve the software or documentation, or questions are greatly appreciated, and should be sent to the authors at the addresses given above.

The paper can be found at https://arxiv.org/pdf/1910.13255.pdf.
If you find our work useful, please cite:

@article{shrem2019dr,
  title={Dr. VOT: Measuring Positive and Negative Voice Onset Time in the Wild},
  author={Shrem, Yosi and Goldrick, Matthew and Keshet, Joseph},
  journal={Proc. Interspeech 2019},
  pages={629--633},
  year={2019}
}

Installation instructions

Python 3.9

Download the code:

git clone https://github.com/MLSpeech/Dr.VOT.git

Download Praat from: http://www.fon.hum.uva.nl/praat/
Download SoX from: http://sox.sourceforge.net/
Install pipenv (https://github.com/pypa/pipenv):
- macOS:
```
$ brew install pipenv
```
- Linux (Debian/Ubuntu):
```
$ sudo apt install pipenv
```
To verify that everything is installed, run the check_installations.sh script:
```
$ ./check_installations.sh
```
Note: make sure the script has execute permission ($ chmod +x ./check_installations.sh).

If you encounter any problems, please check log.txt.
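
As a rough complement to the check above, the sketch below tests whether the three external tools can be found on your PATH. It is not what check_installations.sh does internally; the executable names `praat`, `sox`, and `pipenv` are assumptions and may differ on your system (for example, Praat installed as a desktop application may not be on PATH at all).

```python
import shutil


def missing_tools(tools=("praat", "sox", "pipenv")):
    """Return the names in `tools` that cannot be found on PATH.

    The default executable names are assumptions, not taken from Dr.VOT.
    """
    return [name for name in tools if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH:", ", ".join(missing))
    else:
        print("All tools found on PATH.")
```

A tool reported as missing here may still be installed somewhere the script cannot see, so treat the result as a hint rather than a verdict.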

How to use:

Place your .wav files in the ./data/raw/ directory. Each file should contain a single word.
Note: You can also place directories that contain the .wav files; there is no need to rearrange your data. For example:

./data/raw
        ├───dir1
        │   │   1.wav
        │   │   2.wav
        │   │   3.wav
        │
        └───dir2
            │   1.wav
            │   2.wav
            │   3.wav
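
Before running the pipeline, it can help to confirm which files will be picked up. The sketch below lists every .wav file under a root directory, including nested ones, as in the layout above. The helper name `find_wavs` is hypothetical and not part of Dr.VOT, and matching only the exact lowercase `.wav` suffix is an assumption about how files are named.

```python
from pathlib import Path


def find_wavs(root):
    """Return sorted paths (relative to `root`) of all .wav files under it."""
    root = Path(root)
    return sorted(
        p.relative_to(root)
        for p in root.rglob("*")
        if p.is_file() and p.suffix == ".wav"
    )


if __name__ == "__main__":
    raw_dir = Path("./data/raw")
    if raw_dir.is_dir():
        for wav in find_wavs(raw_dir):
            print(wav)
    else:
        print(f"{raw_dir} does not exist")
```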

Run the following script (runtime is approximately 1 second per file):
```
$ ./run_script.sh
```
Note: make sure the script has execute permission ($ chmod +x ./run_script.sh).
That's it :)
The predictions can be found in ./data/out_tg/, in the same hierarchy as the original data. For convenience, the predicted VOT type (positive/negative) is also added to each .TextGrid.
A summary.csv file is also created for easy analysis.
For example:
```
./data/out_tg
│   summary.csv
├───dir1
│   │   1_predNEG.TextGrid
│   │   2_predNEG.TextGrid
│   │   3_predNEG.TextGrid
│
└───dir2
    │   1_predPOS.TextGrid
    │   2_predPOS.TextGrid
    │   3_predPOS.TextGrid
```
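
For a quick overview of the results, the sketch below counts positive and negative predictions by reading the `_predPOS` / `_predNEG` suffix of the output file names shown above. The helper name `tally_predictions` is hypothetical and not part of Dr.VOT. It deliberately does not read summary.csv, since the column layout of that file is not described here.

```python
from collections import Counter
from pathlib import Path


def tally_predictions(out_dir):
    """Count output TextGrids by the predicted VOT type in their file name."""
    counts = Counter()
    for path in Path(out_dir).rglob("*"):
        if not path.is_file():
            continue
        if path.name.endswith("_predPOS.TextGrid"):
            counts["positive"] += 1
        elif path.name.endswith("_predNEG.TextGrid"):
            counts["negative"] += 1
    return counts


if __name__ == "__main__":
    out_dir = Path("./data/out_tg")
    if out_dir.is_dir():
        for vot_type, n in sorted(tally_predictions(out_dir).items()):
            print(f"{vot_type}: {n}")
    else:
        print(f"{out_dir} does not exist")
```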
