By using all OR ANY PART of this project you MUST follow our license at https://github.com/DYVAUX/nnsvs-english-support/blob/main/LICENSE.md. This includes using parts of our work in yours. If you have questions about the license, please contact us. We should not have to include this notice in our files, but bad actors have forced our hand.
Cleaned up reference provided by subpum
Phoneme | X-SAMPA | Description |
---|---|---|
br | | Inhale |
exh | | Exhale |
axh | | Voiced Exhale |
pau | | Silence |
q | ? | Glottal Stop |
cl | | Closure (Held Consonants) |
vf | | Vocal Fry |
trash | | Junk Phoneme |

Phoneme | X-SAMPA | Explanation | Example |
---|---|---|---|
a | a | central open unrounded vowel | Ana |
e | e | close-mid front unrounded vowel | Eva |
i | i | close front unrounded vowel | Linda |
o | o | close-mid back rounded vowel | Tope |
u | u | close back rounded vowel | Uno |

(Semi-vowels function as consonants and perform "diphthong-like" tasks.)

Phoneme | X-SAMPA | Explanation | Example |
---|---|---|---|
y | j | palatal approximant | Aire (a y r e), tierra (t y e rr a), sayo (s a y o) |
w | w | labial-velar approximant | Pausa (p a w s a), fuego (f w e gh o) |

Phoneme | X-SAMPA | Explanation | Example |
---|---|---|---|
b | b | voiced bilabial plosive | baño (b a ny o) |
bh | B | voiced bilabial fricative | ábaco (a bh a k o) |
ch | tS | voiceless postalveolar affricate | choripán (ch o r i p a n) |
d | d | voiced alveolar plosive | día (d i a) |
dh | D | voiced dental fricative | nada (n a dh a) |
f | f | voiceless labiodental fricative | farol (f a r o l) |
g | g | voiced velar plosive | gato (g a t o) |
gh | G | voiced velar fricative | agua (a gh w a) |
k | k | voiceless velar plosive | casa (k a s a) |
l | l | lateral alveolar approximant | lame (l a m e) |
ll | L | voiced palatal lateral approximant | pollo (p o ll o) |
m | m | bilabial nasal | madre (m a dh r e) |
n | n | alveolar nasal | nadie (n a dh y e) |
ny | J | palatal nasal | ñoquis (ny o k i s) |
p | p | voiceless bilabial plosive | puerta (p w e r t a) |
r | 4 | alveolar flap | larga (l a r gh a) |
rr | r | alveolar trill | rara (rr a r a) |
s | s | voiceless alveolar fricative | suena (s w e n a) |
sh | S | voiceless postalveolar fricative | show (sh o w) |
t | t | voiceless alveolar plosive | tema (t e m a) |
x | x | voiceless velar fricative | jamón (x a m o n) |
y | j\ | voiced palatal fricative | yace (y a s e) |
z | T | voiceless dental fricative | zeta (z e t a) |

READ THE NOTES below for how to handle these. Most of these will never be used.

Phoneme | X-SAMPA | Explanation | Example | Notes |
---|---|---|---|---|
ts | ts | voiceless alveolar sibilant affricate | chala (ts a l a) | Reference allophone for ch. |
zh | Z | voiced postalveolar fricative | ellos (e zh o s) | Reference allophone for ll. |
sh | S | voiceless postalveolar fricative | chala (sh a l a) | Reference allophone for ll/ch. |
h | h | voiceless glottal fricative | obispo (o bh i h p o) | Reference allophone for s/x. |
v | v | voiced labiodental fricative | afgano (a v gh a n o) | Reference allophone for f (before voiced consonants). |
dj | J\ | voiced palatal plosive | guía (dj i a) | Reference allophone for g (before some vowels). |
ng | N | velar nasal | dingo (d i ng g o) | Reference allophone for n (before velar consonants). |
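
When labeling by hand, it is easy to mistype a phoneme that is not in the tables above. The following is a minimal sketch of a checker, assuming labels are written as space-separated phoneme strings like the examples in the tables. The set names and the `unknown_phonemes` function are hypothetical helpers for illustration and are not part of NNSVS.

```python
# Phoneme inventory copied from the tables above.
SPECIAL = {"br", "exh", "axh", "pau", "q", "cl", "vf", "trash"}
VOWELS = {"a", "e", "i", "o", "u"}
CONSONANTS = {
    "y", "w", "b", "bh", "ch", "d", "dh", "f", "g", "gh", "k", "l", "ll",
    "m", "n", "ny", "p", "r", "rr", "s", "sh", "t", "x", "z",
}
# Reference allophones (rarely used, see the notes).
REFERENCE = {"ts", "zh", "sh", "h", "v", "dj", "ng"}

INVENTORY = SPECIAL | VOWELS | CONSONANTS | REFERENCE


def unknown_phonemes(label: str) -> list[str]:
    """Return the phonemes in a space-separated label that are not in the inventory."""
    return [p for p in label.split() if p not in INVENTORY]


# Usage: an empty list means every phoneme in the label is known.
print(unknown_phonemes("p w e r t a"))
print(unknown_phonemes("o bv i h p o"))
```

The second call reports `bv`, which is not a phoneme in any of the tables.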

Notes:

- `sh` and `zh` are sometimes used as allophones for `ll`. Still label them as `ll` for dictionary compatibility. However, these phonemes are useful for foreign words.
- Some speakers exhibit "yeísmo", the merging of "ll" and "y". These speakers can label "ll" as `y` instead to slightly reduce the amount of training data needed.
- Some speakers merge the hard and soft "b/d/g" sounds. These speakers can label all "b/d/g" as [`b`/`d`/`g`] and ignore the [`bh`/`dh`/`gh`] phonemes. Speakers without this merger can also drop [`bh`/`dh`/`gh`] to simplify the dataset, which will cause most "b/d/g" instances to sound softer (because the soft sound occurs more often in speech). Doing so also creates a slight risk that the model may use a hard "b/d/g" at an incorrect time.
- The `y` phoneme works as both a consonant AND a semivowel. This is thanks to NNSVS being able to learn context.
- `x` covers all "h/soft J/aspirated s" sounds contextually, so only one phoneme is required. For speakers who want to differentiate, the `h` phoneme could be used along with the others. Note that this will increase the amount of data required and won't be supported in the dictionary.
- `f` is used for both devoiced "f" and voiced "v" sounds.
- Upper-class speakers use `ts` instead of `ch`.
- Lower-class speakers use `sh` instead of `ch`.
- In most cases you should still label these as `ch` for dictionary compatibility. Only differentiate between `ts`/`sh` and `ch` if the speaker does (optional).
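
The two optional mergers described in the notes (yeísmo, and merging the hard and soft "b/d/g") amount to a simple relabeling of phoneme strings. The sketch below shows one way to apply them, assuming labels are space-separated phoneme strings as in the table examples. The `relabel` function and its flags are hypothetical names chosen for illustration and are not part of NNSVS.

```python
# Optional mergers taken from the notes. Both are per-speaker choices.
YEISMO = {"ll": "y"}                           # speakers who merge "ll" and "y"
BDG_MERGE = {"bh": "b", "dh": "d", "gh": "g"}  # speakers who merge hard/soft b/d/g


def relabel(label: str, yeismo: bool = False, merge_bdg: bool = False) -> str:
    """Apply the selected mergers to a space-separated phoneme label."""
    mapping = {}
    if yeismo:
        mapping.update(YEISMO)
    if merge_bdg:
        mapping.update(BDG_MERGE)
    # Phonemes without a mapping are passed through unchanged.
    return " ".join(mapping.get(p, p) for p in label.split())


# Usage: "pollo" with yeísmo, "agua" with the b/d/g merger.
print(relabel("p o ll o", yeismo=True))
print(relabel("a gh w a", merge_bdg=True))
```

With both flags left at their defaults, the label is returned unchanged, which matches the default labeling in the tables.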