Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vietnamese keyboard: TELEX convention with minor restrictions #9589

Merged
merged 2 commits into from Oct 12, 2022

Conversation

weijiuqiao
Copy link
Contributor

@weijiuqiao weijiuqiao commented Oct 4, 2022

This PR adds a modified version of the Vietnamese TELEX IME.

Based on info found at:

Said restrictions:

  • Letters composed by 2 strokes (đ, ă, â, ô, ê, ư, ơ) must be entered in one go. For example, while a "full" TELEX IME allows dad for entering 'đa', this one only allows dda.
  • No tone overriding. Entering multiple tone marks to have only the last one take effect is not supported.
  • No syllable-tone limitation considered. So you can enter syllables with tones that do not actually exist in the language.
  • Other minor inconsistencies.

Why it's not so bad:

  • Tones can however still be entered at the end of a word.

The current set-up follows what's recommended from the first linked reference above:

...if you have asked if it’s possible to defer the “breve” on top of the "a" till finishing the whole word ... the answer is yes. It’s possible ... However, this type of deferral is not used by many people and is also not recommended for the reason that it may actually slow you down. ... In addition, since ă ... is part of the Vietnamese Alphabet, it makes sense that we type it as one single unit.

(on deferring tone input) ...In fact many Vietnamese use this second way of typing the accent mark after completing the alphabet (word).

As I do not speak the language or have any Vietnamese friend yet, it would be great to get some help in testing it from native or proficient users.


This change is Reviewable

@weijiuqiao weijiuqiao force-pushed the chinese_keyboard branch 3 times, most recently from 0b67d8a to 33cbe3c Compare October 4, 2022 07:00
@poire-z
Copy link
Contributor

poire-z commented Oct 4, 2022

As I do not speak the language or have any Vietnamese friend yet, it would be great to get some help in testing it from native or proficient users.

Pinging @yukari186 (#8202).
Pinging @vietchovui (#8041).

(@weijiuqiao : less lucky granchildren :)

@vietchovui
Copy link

I don't know how to test it :( . Could you instruct me? I use koreader on kobo forma

@weijiuqiao
Copy link
Contributor Author

I suppose you can see the KOReader root folder? If so, download the following zip file and replace files according to the unzipped folder names.
vi_keyboard.zip

You should first backup your old files by renaming them something else. Then move the new files into your device. In case of crash, delete the new files and restore the old files' names.

@TranHHoang
Copy link

Thanks for your hard work! I tested it on the emulator and found an issue: typing awt should produces ăt, but current it is ăwt. Also, IMO there is a minor annoyance when it comes to deleting a letter composed by two strokes with/or a tone mark: you have to delete it multiple times. Normally I can press delete and be done with it.

Comment on lines 38 to 40
-- Wrap all of the navigation and non-single-character-input keys with
-- a callback to clear the tap window, but pass through to the
-- original function.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a dedicated method for the Delete key in #9586 (delNextChar) ;).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to know! Luckily we don't really care what's behind (next) when using zh_ime so no need to hook it :) Also I copied the comment from another keyboard file and it's not exactly accurate here, so I made a little update.

@weijiuqiao
Copy link
Contributor Author

Thanks for testing it @TranHHoang and finding the error! I did let that extra "w" slip through. Have fixed it and double checked the others, seems no problem now.

IMO there is a minor annoyance when it comes to deleting a letter composed by two strokes with/or a tone mark: you have to delete it multiple times. Normally I can press delete and be done with it.

As to this, it'll be a bit of a hassle to implement. So how about we instead view it as a feature ;)
Now you don't have to delete the whole letter and compose from the start but can just delete a part of it! Since I don't know maybe it's more likely that you mistyped one key (say ấ aas for ẫ aax) than a whole letter?

@poire-z
Copy link
Contributor

poire-z commented Oct 6, 2022

Conflicts, as base with your change to utf8proc has already been bumped.
Also, IME: ?! I, me think about your grandchildren (sorry for insisting :)

@NiLuJe
Copy link
Member

NiLuJe commented Oct 6, 2022

Input Method Engine?

(I don't particularly feel offended by that one, given the context, FWIW ;o).

@poire-z
Copy link
Contributor

poire-z commented Oct 6, 2022

(There's no context given, that's my point :) Give some and I'm fine: Chinese IME..., Chinese keyboard: IME... as first commit is about the Chinese keyboard - the 2nd commit is adequately titled.)

@weijiuqiao
Copy link
Contributor Author

It's because the first commit is not necessarily about the Chinese keyboard but more about a general capacity of the engine (auto separate when input key is unique). But I did change one method name resulting in the changes in zh_keyboard. How about I make this clearer in the commit message?

@weijiuqiao weijiuqiao force-pushed the chinese_keyboard branch 2 times, most recently from 1e0ec42 to e460e4e Compare October 7, 2022 04:56
@poire-z
Copy link
Contributor

poire-z commented Oct 7, 2022

How about I make this clearer in the commit message?

I'm all about making things clearer :)

It's because the first commit is not necessarily about the Chinese keyboard but more about a general capacity of the engine

Many things are clear to you, as you're fully into this code. You can title your commits as you work on them as quickly/short/dumb as you wish (see @NiLuJe PRs :)
But when merged into master, they will get a title for eternity, and they are mixed with many others which are about other topics.
IME bump: supports auto seperation when code is unique is all very clear for you who has worked on a Chinese IME. But not for other devs/watchers who didn't follow this with you: they'll wonder and will have to look at the code to get what "IME" means and see it's only about the Chinese keyboard and they can skip being interested in this commit.
Also, all very clear to you as you just done it, and I didn't even notice until you said it: this Chinese IME is used by the Vietnamese keyboard: put this in the commit message of the Vietnamese commit, to make it known (ok, we can see it in the code, but let it be known to people reading commits, may be us later when bissecting, etc...).
Also, all very clear to you as you've just done it, there is only one IME, THE IME, your IME that you named IME. But for me, it's THE Chinese IME :) it is still named zh_ime.lua (and we have a Korean IME that is not named "ime" but just "helper").
(If this zh_ime.lua is really generic enough and planned to be used by many other xy_keyboard, you should rename it to generic_ime.lua - and then, you'll be allowed to name it Generic keyboad IME: supports separation... :)

I don't mind your commits titles when you do a single topic PR, as we can tweak the single commit title when "squashing" them all, and I don't mind doing it (ie. all your Vocabulary builder: PRs :) but when you do a multi topics PR like this one, I can't do it for you :)
(sorry for being insistent, but you're not alone - many times plugin authors forgot to mention the plugin name in their PRs title - they just go with "Fixed this" :)

@weijiuqiao weijiuqiao force-pushed the chinese_keyboard branch 3 times, most recently from 04963f3 to 1912469 Compare October 11, 2022 16:50
@weijiuqiao
Copy link
Contributor Author

If this zh_ime.lua is really generic enough and planned to be used by many other xy_keyboard, you should rename it to generic_ime.lua - and then, you'll be allowed to name it Generic keyboad IME: supports separation... :)

Done as suggested :)

Comment on lines 58 to 59
switch_char = "下一字", -- default
separator = "分隔", -- default
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should these stay in a generic IME ?
Does the vietnamese keyboard show them ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Vietnamese keyboard doesn't show them or use their functionality. I can change them to some other strings but that would mean rebase and force push yet again! Can I open a PR after if this one is merged? (Will stick to single PRs from now on...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that would mean rebase and force push yet again!

Nothing wrong with that, we all do that and many times per PR when we have multiple topic commits.
But as you wish.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, then let me force push again. I just thought the many not-so-meaningful push records might be too distracting.

@poire-z
Copy link
Contributor

poire-z commented Oct 11, 2022

That wasn't really a suggestion :) I was just making a point as to why you should prefix with "Chinese keyboard/IME".
But well, ok :) (Having look at it, it indeed "looks" generic.)

@poire-z poire-z added this to the 2022.10 milestone Oct 11, 2022
Comment on lines +1 to +3
---------------------------------
-- Generic input method engine --
---------------------------------
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(We don't usually have this kind of boxed comments.)

@poire-z poire-z merged commit ae8156f into koreader:master Oct 12, 2022
This was referenced Oct 12, 2022
W = W -- has wildcard function
}

local wrappedAddChars = function(inputbox, char)
ime:wrappedAddChars(inputbox, char)
end

local function wrappedSeparate(inputbox)
ime:wrappedSeparate(inputbox)
local function seperate(inputbox)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit for a followup: separate ;).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants