Incorrect pitch accent #819

Open
torazem opened this issue Jan 17, 2018 · 4 comments

@torazem torazem commented Jan 17, 2018

The accent for 辺 (へん) should be heiban.

[Image: img_20180117_224613]

Sources:

  • OJAD
  • Japanese teacher


@mvysny mvysny self-assigned this Jan 25, 2018
@mvysny mvysny added the bug label Jan 25, 2018

Owner

@mvysny mvysny commented Jan 25, 2018

Thanks! The data is taken from the nhk-pronunciation project (https://github.com/javdejong/nhk-pronunciation). According to its ACCDB_unicode.csv file:

84846,68444,J68444.wav,1,5405150030,ヘン,辺,辺,辺(数),2,,,ヘンオ,0,K68444.wav,ヘン,1,0,20

The pitch matches what Aedict shows (the trailing 20 is the important part).
However, please feel free to submit patches and corrections to the above-mentioned project; Aedict will then pick up the changes automatically on the next dictionary index round.
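For anyone following along, here is a minimal Python sketch of pulling the reading, the kanji, and the trailing accent code out of a row like the one above. The column positions are an assumption read off the quoted sample row, not a documented schema, and `parse_row` is a hypothetical helper, not code from Aedict or nhk-pronunciation. The accent code is kept as an opaque string.

```python
import csv
import io

# The row quoted above from ACCDB_unicode.csv.
SAMPLE = (
    "84846,68444,J68444.wav,1,5405150030,ヘン,辺,辺,辺(数),2,,,"
    "ヘンオ,0,K68444.wav,ヘン,1,0,20\n"
)


def parse_row(fields):
    """Pick out the columns discussed in this thread.

    Assumption: index 5 is the kana reading, index 6 the kanji form,
    and the last column is the accent code.
    """
    return {
        "kana": fields[5],
        "kanji": fields[6],
        "accent_code": fields[-1],  # kept as a string, not decoded
    }


rows = [parse_row(fields) for fields in csv.reader(io.StringIO(SAMPLE))]
print(rows[0])
```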

Author

@torazem torazem commented Feb 2, 2018

Thanks! I'll submit a patch this weekend.

Edit:
It looks like ACCDB_unicode.csv also has a heiban entry for 辺:

84846,68444,J68444.wav,1,5405150030,ヘン,辺,辺,辺(数),2,,,ヘンオ,0,K68444.wav,ヘン,1,0,20
84847,68445,J68445.wav,1,5405160010,ヘン,辺,辺,辺,2,,,ヘンオ,0,K68445.wav,ヘン,1,0,1

The first entry appears to be intended as a counter, whereas the second entry is in the desired heiban form, so the file itself looks correct.

I'm not yet familiar with Aedict's codebase; is it possible for Aedict to pick up both entries, or match based on other criteria?


@k3zi k3zi commented Jun 25, 2018

'Counter' isn't the right word: 数 here means math. That へん literally means the side or edge of a shape, while the other へん refers to a general area. The file does come from NHK's 1998 accent dictionary. Unless you detect the 数 tag and compare it against, say, JMdict's info field to see whether the entry is marked as a math-related term, listing both is a good idea. Unfortunately, that project doesn't output the NHKexpr field, which basically determines when an accent applies in cases where a word's accent changes with meaning and/or placement in a sentence.
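As a rough illustration of the "recognize the 数" idea, the Python sketch below extracts the parenthesised qualifier from the annotated form seen in the quoted rows (for example 辺(数)). The helper names are hypothetical, and accepting both ASCII and full-width parentheses is an assumption about how the CSV might encode them. Matching the result against JMdict would be a separate step that is not shown.

```python
import re

# Matches a qualifier in ASCII or full-width parentheses, e.g. "辺(数)".
QUALIFIER = re.compile(r"[(（]([^)）]+)[)）]")


def qualifier_of(annotated):
    """Return the text inside the parentheses, or None if there is none."""
    match = QUALIFIER.search(annotated)
    return match.group(1) if match else None


def is_math_sense(annotated):
    """True when the annotated form carries the 数 (math) qualifier."""
    return qualifier_of(annotated) == "数"


for form in ("辺(数)", "辺"):
    print(form, qualifier_of(form), is_math_sense(form))
```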

Author

@torazem torazem commented Jul 4, 2018

Ah, thanks @k3zi, that makes more sense! In that case, I like the idea of listing both, even if it's impossible to tell which accent belongs to which context.
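One way "listing both" could look, sketched in Python under the same assumptions about column positions (reading at index 5, kanji at index 6, annotated form at index 8, accent code last). `accents_by_word` is a hypothetical helper and does not reflect how Aedict actually builds its index.

```python
import csv
import io
from collections import defaultdict

# The two rows quoted earlier in this thread.
ROWS = """\
84846,68444,J68444.wav,1,5405150030,ヘン,辺,辺,辺(数),2,,,ヘンオ,0,K68444.wav,ヘン,1,0,20
84847,68445,J68445.wav,1,5405160010,ヘン,辺,辺,辺,2,,,ヘンオ,0,K68445.wav,ヘン,1,0,1
"""


def accents_by_word(csv_text):
    """Group rows by (kana, kanji) and keep every accent variant.

    Each variant is an (annotated form, accent code) pair, so a UI could
    show all of them instead of silently picking the first row.
    """
    variants = defaultdict(list)
    for fields in csv.reader(io.StringIO(csv_text)):
        if not fields:  # skip blank lines
            continue
        key = (fields[5], fields[6])
        variants[key].append((fields[8], fields[-1]))
    return dict(variants)


result = accents_by_word(ROWS)
for (kana, kanji), entries in result.items():
    for annotated, code in entries:
        print(f"{kanji} ({kana}): {annotated} -> accent code {code}")
```

Keeping the annotated form next to each accent code means the 数 qualifier stays visible even when the context cannot be resolved automatically.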
