Pitch accent for compound words #1542
Comments
The issue is that the source data represents it as a single word, and Yomichan doesn't attempt to do lookups of the individual parts of compound words, as there is not a good way to reliably do this. The source data for the term you listed is the following:
I also don't believe that the multiple comma-separated values generally represent the accents of the individual parts of a compound, although the format of this file isn't really documented.
I see, there don't seem to be any great solutions for automatic pitch accent generation of compound words. For now I'll just manually edit the pitch accent data for my cards.
@toasted-nutbread Yomichan doesn't seem to support this anyway, though. The JSON format assumes that there can only be one pitch accent phrase per word. I don't think it would be effective to have Yomichan do lookups for each part, since those lookups could produce erroneous accents. It would be best if I could just add multiple phrases, like:
[
    "一子相伝",
    "pitch",
    {
        "reading": "いっしそうでん",
        "pitches": [
            [{
                "pronunciation": "イッシ",
                "position": 1,
                "nasal": [],
                "devoice": []
            }, {
                "pronunciation": "ソーデン",
                "position": 0,
                "nasal": [],
                "devoice": []
            }]
        ]
    }
]
This would also have the added benefit of allowing a specific pronunciation instead of using the reading (which is currently used to correlate to other dictionary entries), e.g. 通う (カヨウ) vs. 火曜 (カヨー).
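To show how a consumer might read the structure proposed above, here is a minimal Python sketch. It assumes the proposed layout exactly as written in this comment (an outer list of accent variants, each variant being a list of phrases); this is not a format Yomichan currently accepts. The names `PitchPhrase` and `parse_proposed_entry` are hypothetical and exist only for this illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class PitchPhrase:
    # One accent phrase inside a compound word (hypothetical type for this sketch).
    pronunciation: str
    position: int
    nasal: list
    devoice: list


def parse_proposed_entry(raw: str):
    """Parse one entry in the proposed multi-phrase layout (an assumption, not Yomichan's schema)."""
    term, kind, data = json.loads(raw)
    if kind != "pitch":
        raise ValueError(f"expected a 'pitch' entry, got {kind!r}")
    # Outer list: accent variants. Inner list: the phrases that make up one variant.
    variants = [[PitchPhrase(**phrase) for phrase in variant] for variant in data["pitches"]]
    return term, data["reading"], variants


RAW = """
["一子相伝", "pitch", {
    "reading": "いっしそうでん",
    "pitches": [[
        {"pronunciation": "イッシ", "position": 1, "nasal": [], "devoice": []},
        {"pronunciation": "ソーデン", "position": 0, "nasal": [], "devoice": []}
    ]]
}]
"""

term, reading, variants = parse_proposed_entry(RAW)
for variant in variants:
    # Prints: イッシ[1] + ソーデン[0]
    print(" + ".join(f"{p.pronunciation}[{p.position}]" for p in variant))
```

Keeping the reading (`いっしそうでん`) separate from the per-phrase pronunciations means the reading can still be used to match other dictionary entries, while the display uses the actual pronunciation.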
I agree that the Kanjium source data doesn't make use of multiple phrases, but I still think supporting them would be a good idea, so that we can utilise sources that do.
Currently, Yomichan can't display the pitch accent for compound words correctly (or maybe the data from Kanjium is lacking?).
For example with: 一子相伝
Yomichan would display this:
But the word actually consists of two different pitch accents, atamadaka for the first part of the compound word and heiban for the second one.
For reference, this is what is displayed in the NHK pitch accent dictionary:
Maybe what's happening is that the pitch accents Yomichan is displaying above are simply the two parts of a single compound pitch, but Yomichan is incorrectly treating these two pitch accents as if they are simply two accent variants. But this is just a wild guess.
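To make the difference concrete, here is a rough Python sketch of the usual downstep-position convention (0 = heiban, 1 = atamadaka, n = pitch falls after the n-th mora). The helper names `split_morae` and `pitch_pattern` are made up for this illustration, and the model ignores the following particle, so it cannot distinguish heiban from odaka. The phrase split イッシ / ソーデン is taken from the discussion in this thread.

```python
# Small kana that attach to the preceding kana instead of forming their own mora.
# The small tsu (ッ/っ) is deliberately excluded: it counts as a mora of its own.
SMALL_KANA = set("ャュョァィゥェォヮゃゅょぁぃぅぇぉゎ")


def split_morae(kana: str) -> list[str]:
    """Split a kana string into morae (simplified: only handles small-kana digraphs)."""
    morae: list[str] = []
    for ch in kana:
        if ch in SMALL_KANA and morae:
            morae[-1] += ch
        else:
            morae.append(ch)
    return morae


def pitch_pattern(kana: str, position: int) -> str:
    """Return one H/L letter per mora for the given downstep position."""
    count = len(split_morae(kana))
    if not 0 <= position <= count:
        raise ValueError("downstep position is outside the word")
    levels = []
    for i in range(1, count + 1):
        if position == 0:
            high = i > 1               # heiban: low start, then high to the end
        elif position == 1:
            high = i == 1              # atamadaka: high first mora, then low
        else:
            high = 1 < i <= position   # low start, high up to the downstep, then low
        levels.append("H" if high else "L")
    return "".join(levels)


# Two accent phrases, as described above.
print(pitch_pattern("イッシ", 1), pitch_pattern("ソーデン", 0))  # HLL LHHH

# For contrast: one downstep position applied to the whole reading.
print(pitch_pattern("いっしそうでん", 1))  # HLLLLLL
print(pitch_pattern("いっしそうでん", 0))  # LHHHHHH
```

Neither whole-word pattern reproduces the two-phrase contour, which is why treating the two values as alternative accents for the whole word gives a misleading display.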