Add WaniKani data #838

Closed
realmayus opened this issue Nov 20, 2021 · 15 comments · Fixed by #1276

Comments

@realmayus

It would be cool if there was a label next to kanji and words indicating which WaniKani level they are taught in. Jisho already does that, and it would be convenient if one could see that data in the addon as well.

@realmayus
Author

I extracted word and kanji data from wkstats.com. Here are two JSON files, one with kanji categorized by level and one with words categorized by level:
https://gist.github.com/realmayus/1dd170a1da3a82734d67389035f71c0d
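
A level-grouped file like this could be turned into a per-character lookup in a few lines. This is only a sketch: the shape (level number as key, array of entries as value) is an assumption about the gist rather than something verified against it, and `buildLevelLookup` is a hypothetical helper, not part of the add-on. The sample levels are the ones quoted later in this thread.

```typescript
// Assumed shape (not verified against the gist): { "1": ["大"], "8": ["場"] }
type LevelGroupedData = Record<string, string[]>;

// Invert "level -> entries" into "entry -> level" for fast lookups.
function buildLevelLookup(grouped: LevelGroupedData): Map<string, number> {
  const lookup = new Map<string, number>();
  for (const [level, entries] of Object.entries(grouped)) {
    const levelNumber = Number(level);
    if (!Number.isInteger(levelNumber)) continue; // skip malformed keys
    for (const entry of entries) {
      // If an entry is listed twice, keep the lowest level.
      const existing = lookup.get(entry);
      if (existing === undefined || levelNumber < existing) {
        lookup.set(entry, levelNumber);
      }
    }
  }
  return lookup;
}

const sample: LevelGroupedData = { "1": ["大"], "8": ["場"], "16": ["浴"] };
const kanjiLevels = buildLevelLookup(sample);
```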

@birtles
Member

birtles commented Nov 22, 2021

Hi @realmayus!

Thank you for filing this and for also preparing the data. I'm more than happy to add this as an extra reference source in the kanji data. Unfortunately, I'm currently in the middle of rewriting the system that generates those files, so it could be a couple of months before it's fully ready. Is that OK?

As for the word data, if it's simply a matter of finding the highest-level kanji in the word and reporting that as the word's level, then it seems like we could generate that automatically. Is that the case, or is there more to it than that?

@melink14

One thing to be careful of is that WaniKani gets data updates about once a week, so a static JSON file won't stay up to date for long. (Accessing the WaniKani API works well though.)

Word level doesn't directly map to the highest kanji level, though it's a fairly accurate heuristic. As a counterexample:
https://www.wanikani.com/kanji/%E6%B8%A9 (温) is level 12 but 温める is level 13 and 温まる is level 14.
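
To make the gap concrete, here is a rough sketch that compares the "highest kanji level" estimate with the real vocabulary level and lists the words where they disagree. The function names are hypothetical, and the data is just the three levels quoted above, not a real export.

```typescript
// Illustrative data only: the three levels quoted in this comment.
const kanjiLevels = new Map<string, number>([["温", 12]]);
const vocabLevels = new Map<string, number>([
  ["温める", 13],
  ["温まる", 14],
]);

// Matches the main CJK Unified Ideographs block and Extension A only.
const isKanji = (ch: string): boolean => /[\u3400-\u4dbf\u4e00-\u9fff]/.test(ch);

// Heuristic: a word's level is the highest level among its kanji.
// Returns undefined if the word has no kanji or contains an unlisted kanji.
function estimateWordLevel(
  word: string,
  levels: Map<string, number>
): number | undefined {
  let highest: number | undefined;
  for (const ch of Array.from(word)) {
    if (!isKanji(ch)) continue;
    const level = levels.get(ch);
    if (level === undefined) return undefined;
    if (highest === undefined || level > highest) highest = level;
  }
  return highest;
}

interface Mismatch {
  word: string;
  actual: number;
  estimated: number | undefined;
}

// Collect every word whose real level differs from the estimate.
function findMismatches(
  vocab: Map<string, number>,
  kanji: Map<string, number>
): Mismatch[] {
  const result: Mismatch[] = [];
  vocab.forEach((actual, word) => {
    const estimated = estimateWordLevel(word, kanji);
    if (estimated !== actual) result.push({ word, actual, estimated });
  });
  return result;
}

const mismatches = findMismatches(vocabLevels, kanjiLevels);
```

With this sample data both words are reported, since the estimate for each is 12 while the real levels are 13 and 14.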

@birtles
Member

birtles commented Nov 22, 2021

> One thing to be careful of is that WaniKani gets data updates about once a week, so a static JSON file won't stay up to date for long. (Accessing the WaniKani API works well though.)

Oh, that's good to know. Thank you!

> Word level doesn't directly map to the highest kanji level, though it's a fairly accurate heuristic. As a counterexample:
> https://www.wanikani.com/kanji/%E6%B8%A9 (温) is level 12 but 温める is level 13 and 温まる is level 14.

Oh, I didn't realise WaniKani was teaching vocabulary now too. That's good to know. It might be good to focus just on kanji initially.

@Orzelius

Orzelius commented Aug 9, 2023

Heyo, I'd like to bump this issue as the level indicators for kanji would be very useful. I think using a static JSON file of all the kanji levels would be fine as they're updated very rarely.

Thank you for all the work so far, the extension is magnificent and a great help!

@birtles
Member

birtles commented Aug 9, 2023

Thank you so much for bumping this. I really appreciate it as it helps me know what to focus on.

Adding the levels to kanji should be do-able (I am once again in the middle of reworking that data pipeline). Showing them as another reference in the kanji reference table would be easy enough but I wonder if that stands out enough? Then again, that's where we show the 漢字検定 levels so at least it would be consistent with that.

For words, I guess we could add a little badge after the reading?

Does WaniKani use color coding for levels or anything like that? Or would users want to be able to enter their WaniKani level and highlight the badge in a different color for words beyond / below their level?

@Orzelius

Orzelius commented Aug 9, 2023

> Showing them as another reference in the kanji reference table would be easy enough but I wonder if that stands out enough?

That would work just fine for me.

> For words, I guess we could add a little badge after the reading?

I personally don't care for words, but I guess that would work for people who do.

> Does WaniKani use color coding for levels or anything like that?

No, a level is just a number from 1 to 60.

> would users want to be able to enter their WaniKani level and highlight the badge in a different color for words beyond / below their level?

I don't think so. If the user's pace is fast, they will level up every one to two weeks, so it would be tedious to keep changing it.


My personal use case would be to check whether a kanji is on WaniKani and, if not, make a better effort to memorize it. If it is on WK, I'll know that I'll be learning it in a future level. If I find a kanji from a past level that I've forgotten, then I can unfreeze it on WaniKani to go through the SRS process again.

Hope this helps.

@birtles
Member

birtles commented Aug 10, 2023

Yes, that's super helpful. Thank you so much. I'll let you know how I go.

@birtles
Member

birtles commented Aug 17, 2023

I've done a bunch of work on this but most of it is in branches because I've been waiting to hear back from the WaniKani team to see if they're ok with me using the level data here.

Given that jisho.org is using this data I'm going to assume it's ok and start merging the different pieces. If there's a problem there should still be time to back it out before releasing.

@birtles
Member

birtles commented Aug 17, 2023

Current status:

  • Job to download the WaniKani data from the API twice a week
  • Add WaniKani data to word and kanji export
  • Expose WaniKani data in jpdict-idb
  • Add prefs and popup rendering to 10ten reader

Regarding the prefs, there are two things I'm not sure about:

  1. I'm not sure if it should be two prefs covering kanji and vocab separately, or just one. Two prefs is easier to implement and gives the user more control, but it might be less user-friendly.
  2. Should this be off by default? If there's just one pref then it definitely should be off by default—I don't want to bother existing users who don't use WaniKani with extra clutter on the word lookup screen. If there are two prefs, however, it might make sense to include WaniKani levels in the default kanji references.

@birtles
Member

birtles commented Aug 17, 2023

Here's another question. At the moment what I've implemented is a direct mapping from the WaniKani kanji and vocab levels to kanji and words in the dictionary.

However, it doesn't try to determine the equivalent level for words that are not in the dictionary.

For example, suppose the user looks up 大浴場. That's not in the WaniKani vocab database so they won't see any WaniKani annotation next to the word.

However, each of the individual kanji is in the database:

  • 大 = 1
  • 浴 = 16
  • 場 = 8

So we could annotate 大浴場 with 16 in that case. We might display it slightly differently to indicate it's not actually a WaniKani vocab word.

Is that useful? Necessary?
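
As a rough sketch of what that fallback could look like (the names and the `source` tag are hypothetical, not what is actually implemented): look the word up directly first, and only if it is missing synthesize a level from its kanji, tagging the result so the popup could render it differently.

```typescript
// "vocab" = the word itself is in the WaniKani data;
// "kanji-max" = level synthesized from the word's kanji (hypothetical tag).
type LevelAnnotation = { level: number; source: "vocab" | "kanji-max" };

// Matches the main CJK Unified Ideographs block and Extension A only.
const isKanji = (ch: string): boolean => /[\u3400-\u4dbf\u4e00-\u9fff]/.test(ch);

function annotateWord(
  word: string,
  vocabLevels: Map<string, number>,
  kanjiLevels: Map<string, number>
): LevelAnnotation | undefined {
  // 1. Direct match against the vocab data.
  const direct = vocabLevels.get(word);
  if (direct !== undefined) return { level: direct, source: "vocab" };

  // 2. Otherwise take the highest level among the word's kanji.
  let highest: number | undefined;
  for (const ch of Array.from(word)) {
    if (!isKanji(ch)) continue;
    const level = kanjiLevels.get(ch);
    if (level === undefined) return undefined; // a kanji WaniKani doesn't cover
    if (highest === undefined || level > highest) highest = level;
  }
  return highest === undefined
    ? undefined // no kanji at all, nothing to synthesize
    : { level: highest, source: "kanji-max" };
}

// Levels taken from the example above; the vocab map is left empty
// because 大浴場 is not in the vocab data.
const kanjiLevels = new Map<string, number>([["大", 1], ["浴", 16], ["場", 8]]);
const vocabLevels = new Map<string, number>();
const annotation = annotateWord("大浴場", vocabLevels, kanjiLevels);
```

Here `annotation` comes out as level 16 with source `"kanji-max"`, matching the example above.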

birtles added a commit that referenced this issue Aug 17, 2023
@birtles
Member

birtles commented Aug 17, 2023

I made a start on this. Here's what it currently looks like:

[Video attachment: 2023-08-17_17-52-22.mp4]

I think at the very least I want to change the pref from a boolean to an enum value so that even if we don't add the synthesized WaniKani level right away, we can add it later without too much difficulty.
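
A minimal sketch of that boolean-to-enum change, assuming a string-valued setting. The value names and `migrateVocabPref` are made up for illustration and are not 10ten's actual pref names.

```typescript
// Hypothetical setting values. A third option can be added later
// without another storage migration.
type WaniKaniVocabDisplay = "hide" | "show-matches" | "show-synthesized";

// Accept whatever is in storage (old boolean, new string, or nothing)
// and normalize it to the enum. Unknown values fall back to "hide",
// in line with the feature being off by default.
function migrateVocabPref(stored: unknown): WaniKaniVocabDisplay {
  if (stored === true) return "show-matches"; // old boolean "on"
  if (stored === "show-matches") return "show-matches";
  if (stored === "show-synthesized") return "show-synthesized";
  return "hide"; // old boolean "off", unset, or unrecognized
}

const fromOldBoolean = migrateVocabPref(true);
const fromUnset = migrateVocabPref(undefined);
```

Existing users who had the boolean turned on keep seeing direct matches, and nobody is opted in to the synthesized level without choosing it.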

@Orzelius

> I'm not sure if it should be two prefs covering kanji and vocab separately, or just one.

I think two as you have right now is good.

> Should this be off by default?

Yes, I think so. For kanji I'll leave it up to you.

I don't think the automatic level determination like with 大浴場 is useful for me. I'd just look up the individual kanji.

Thank you for getting to this so quickly!

birtles added a commit that referenced this issue Aug 19, 2023
@birtles
Member

birtles commented Aug 19, 2023

Great, thanks for that feedback. That's very helpful. I've made a few tweaks and I'll merge it soon but I have a couple of other fixes I want to get in before the next release.

birtles added a commit that referenced this issue Aug 19, 2023
birtles added a commit that referenced this issue Aug 19, 2023
@birtles
Member

birtles commented Sep 13, 2023

Version 1.15 has now been released which includes this feature.


4 participants