Foreign language character support #24

Open
Philippe-Collignon opened this issue Jun 22, 2019 · 7 comments

@Philippe-Collignon

Hi,
I am trying to optimize the French AZERTY keyboard with the optimizer.

I defined a new DEFAULT_KEYBOARD_FRENCH with some French characters in it, like "éçàù ..".
I also created new data files (allChars.txt and allDigraphs.txt for French) that contain those characters.

But when I run the optimizer, I get unknown characters in the result map instead of the French characters.
Could you tell me whether the optimizer supports foreign-language characters?

Thanks,

@michaeldickens
Owner

Hi Philippe,

The optimizer uses ASCII, not Unicode, which means it doesn't support those French characters.

There's a trick you might be able to use to get around this. You can replace all the Unicode characters in your data set with ASCII characters as long as they aren't already getting used. For example, replace é with 1, ç with 2, etc. Then when the optimizer generates a keyboard layout, where the layout contains a 1, you know that really means é. I hope that works for your purposes.
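A minimal sketch of this substitution trick in Python, assuming the whole corpus fits in memory as a string. The names build_substitution, encode and decode are hypothetical helpers for illustration, not part of the optimizer. The sketch only picks stand-ins among digits and punctuation that the corpus never uses, and it does not handle shifted or uppercase variants.

```python
import string


def build_substitution(corpus):
    """Map each non-ASCII character in the corpus to an unused ASCII stand-in."""
    used = set(corpus)
    # Candidate stand-ins: ASCII digits and punctuation absent from the corpus.
    free = [c for c in string.digits + string.punctuation if c not in used]
    # Non-ASCII characters, sorted so the mapping is reproducible.
    foreign = sorted({c for c in corpus if ord(c) > 127})
    if len(foreign) > len(free):
        raise ValueError("not enough unused ASCII characters for the mapping")
    return dict(zip(foreign, free))


def encode(text, mapping):
    """Replace non-ASCII characters with their stand-ins before optimizing."""
    return text.translate(str.maketrans(mapping))


def decode(text, mapping):
    """Turn stand-ins in the optimizer's output back into the real characters."""
    reverse = {stand_in: char for char, stand_in in mapping.items()}
    return text.translate(str.maketrans(reverse))


corpus = "été façon à où"
mapping = build_substitution(corpus)
ascii_corpus = encode(corpus, mapping)
restored = decode(ascii_corpus, mapping)
```

If the corpus itself contains digits, they are excluded from the stand-ins automatically, so the mapping falls back to unused punctuation.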

Cheers,
Michael

@Philippe-Collignon
Author

Hi Michael,

Thanks for the trick. I had thought about it too, but since my corpus also contains digits and I plan to use the optimizer more extensively with digits as well, I did not do it.
But it is a good trick for a first board. I'll publish the result and the French character files, based on the corpus used by the bépo team (the alternative French keyboard layout project).

@Philippe-Collignon
Author

Philippe-Collignon commented Jun 22, 2019

Unfortunately, I could not get it fully working. With the K_NO layout there are not enough keys, so I had to remove some letters (w, ..). I get an interesting but incomplete result:

ê P O B Y X V L M Q
A I E U é D S R T N
: Z K è W G C J F H

à p o b y x v l m q
a i e u , d s r t n
; z k . ç g c j f h

With K_STANDARD (which, by the way, does not work if I set it in values.c as the readme says, but does work with setksize standard), the stand-in digits get mixed up with the real digits. So I replaced them with special characters and got the layout below. It is already better than the Bépo layout, but I'll keep optimizing for an English-French-programmer layout. Thanks.

K ! @ ç $ % ^ & * ( ) ~ |
> P O B Y G D L " Q J Z +
A I E U ê M S R T N X
_ ? à < : V C H F W

k 1 2 3 4 5 6 7 8 9 0 `
. p o b y g d l ' q j z =
a i e u é m s r t n x
- / è , ; v c h f w

@phcollignon

I've forked the project with a workaround for French accents: https://github.com/phcollignon/Typing

@HughP
Contributor

HughP commented Dec 4, 2020

@phcollignon Interesting work. If this modification doesn't work to your satisfaction and you're up to the task of making this Unicode-compatible, I'd be most interested in your output. I use Typing to work with decomposed UTF-8 characters. There are also several analysis tools in the BEBE keyboard layout community, which is French-focused. Another tool you might look at is http://509.ch/opt.7z // https://509.ch/opt.htm, which has/had the ability to add up to three states (something like default, shift, and alt). But I have found Typing to be a really nice tool and chose it as the basis of my testing for use in African languages (the issues of states and Unicode combining diacritics aside).
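For readers unfamiliar with decomposed characters, here is a small Python illustration using the standard unicodedata module. The helper names decompose and describe are hypothetical. It only shows what Unicode NFD decomposition does to an accented letter and makes no claim about how Typing or any fork processes its input.

```python
import unicodedata


def decompose(text):
    """Return the NFD form: base letters followed by combining marks."""
    return unicodedata.normalize("NFD", text)


def describe(text):
    """List the Unicode names of the code points in the decomposed text."""
    return [unicodedata.name(c) for c in decompose(text)]


e_acute = decompose("é")   # two code points: "e" + U+0301
names = describe("é")
```

Note that in decomposed text the combining mark follows the base letter, whereas a dead key is pressed before the letter it modifies.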

@phcollignon

phcollignon commented Dec 4, 2020

Thanks for the links, I did not know about that German project. I have no plans to update the code to Unicode for now. The workaround gives an acceptable result. The only drawback is that parentheses, brackets and curly braces are not taken into account, because they are used as stand-ins for the Unicode letters. One problem not yet solved is the dead keys ^ and ¨.

Funnily enough, one of the best results so far for a 50% French/English mix is the ".YOU" layout ;-))

Hands: 51% 48%
Fingers: 9.0% 9.0% 19% 15% 0.00% 0.00% 17% 11% 11% 9.0% 

 ^  @  *  /  &  %   #  $  ?  +  <  Z  ~
    :  Y  O  U  ç   X  L  P  C  B  V  K  \
    A  I  E  N  '   M  R  T  S  D  F
    |  !  =  à  `   J  H  W  G  Q

 ê  1  2  3  4  5   6  7  8  9  0  z  ù
    .  y  o  u  -   x  l  p  c  b  v  k  >
    a  i  e  n  ,   m  r  t  s  d  f
    ;  "  è  é  _   j  h  w  g  q

Fitness:       1255024607
Distance:      971772980
Finger work:   1315297
Inward rolls:  7.64%
Outward rolls: 1.94%
Same hand:     32.87%
Same finger:   1.86%
Row change:    13.87%
Home jump:     0.82%
Ring jump:     0.61%
To center:     1.38%
To outside:    0.69%
	

@phcollignon

I added a French corpus parser (in Go) to build the frequency files from text content, with:

  • accented letters replaced by ASCII characters
  • dead-key-based letters split into two key presses (e.g. ï => ¨i)
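The two rules above can be sketched as follows. This is a Python illustration, not the Go parser from the fork, and both lookup tables are hypothetical: the fork may use different stand-in characters. The sketch counts every adjacent pair of keystrokes, spaces included, and silently drops characters it has no rule for. It returns counters rather than writing allChars.txt or allDigraphs.txt, since the file layout is not described in this thread.

```python
from collections import Counter
import unicodedata

# Hypothetical stand-ins for accented letters that have their own key.
DIRECT = {"é": "(", "è": ")", "à": "[", "ç": "]", "ù": "{", "ê": "}"}
# Hypothetical stand-ins for dead keys, indexed by combining mark.
DEAD_KEYS = {"\u0308": "~", "\u0302": "^"}  # diaeresis, circumflex


def to_keystrokes(text):
    """Convert text to the ASCII keystroke sequence used for counting."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        if ch in DIRECT:
            out.append(DIRECT[ch])
            continue
        parts = unicodedata.normalize("NFD", ch)
        if len(parts) == 2 and parts[1] in DEAD_KEYS:
            # The dead key is pressed before the base letter: ï => ¨ then i.
            out.append(DEAD_KEYS[parts[1]] + parts[0])
        elif ord(ch) < 128:
            out.append(ch)
        # Any other non-ASCII character is dropped in this sketch.
    return "".join(out)


def count_frequencies(text):
    """Return (character counts, digraph counts) over the keystrokes."""
    keys = to_keystrokes(text)
    chars = Counter(keys)
    digraphs = Counter(a + b for a, b in zip(keys, keys[1:]))
    return chars, digraphs


chars, digraphs = count_frequencies("été")
```

Checking DIRECT before the dead-key rule keeps a letter such as ê on its own key, while î or ô fall through to the circumflex dead key.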
