Simplified 6-dot worldwide unified mapping for the hexadecimal value of Unicode characters #689

Open
DrSooom opened this Issue Jan 25, 2019 · 3 comments

DrSooom commented Jan 25, 2019

Introduction:

Yesterday I opened issue #688, where I described how to display the hexadecimal value of Unicode characters with just three or four 8-dot braille characters. After that I couldn't stop thinking about a suitable way to do the same for 6-dot braille, and I think I have now solved the puzzle. The following solution for 6-dot is only slightly more complex than the one for 8-dot, and the syntax of both is still quite easy to learn and to understand.
To clarify it right at the beginning: the following 6-dot solution will not replace or influence any existing 6-dot braille table, unless the prefix braille character ⠿ (U+283F, dots 123456) is already used for another character. Sadly there are hundreds of rules for 6-dot braille tables around the globe, so finding braille characters that work the same way everywhere isn't easy. I have chosen ⠿ for one specific reason: on paper it marks a "deleted" character when the wrongly embossed dots can't be pressed back into the paper.
And please don't forget: my solution is nothing more than an alternative way of displaying hexadecimal values in 6-dot braille. It doesn't redefine any Unicode character; it just converts the character's hexadecimal value into the shortest possible 6-dot form.

Definition:

Prefix and suffix braille characters:

  • The prefix character must stand in front of every single Unicode character. To avoid confusion, grouping two or more Unicode characters behind a single prefix isn't allowed.
  • The very last hexadecimal digit always occupies dots 1245 of the fourth braille character (counting the prefix). Dots 3 and 6 of that same character form the suffix. In other words: the fourth braille character is always a combination of a hexadecimal digit and the suffix.
  • The two dashes ("-") below between the prefix and the suffix are placeholders for three hexadecimal digits. The fourth-last digit occupies dots 1245 of the first braille character after the prefix, the third-last digit is split between the first and second braille characters after the prefix (dots 36 of the first and dots 14 of the second), and the second-last digit occupies dots 2356 of the second one. The last two hexadecimal digits of a Unicode character are the most important ones. That's why neither of them is split between two 6-dot braille characters.
  • To avoid misunderstanding, ⠀ (U+2800, dot 0) isn't allowed as a suffix.
  • ⠿--⠄ = characters between U+0000 and U+FFFF
    U+283F and U+2804, dots 123456 and 3; Defines the first 65536 Unicode characters.
  • ⠿--⠠ = characters between U+10000 and U+1FFFF
    U+283F and U+2820, dots 123456 and 6; Defines the second 65536 Unicode characters.
  • ⠿--⠤⠇ = characters between U+20000 and U+2FFFF
    U+283F, U+2824 and U+2807, dots 123456, 36 and 123; Defines the third 65536 Unicode characters.
    From here on, four braille characters after the prefix (five in total) are needed to define a Unicode character correctly.
    At the moment 60859 characters are defined in the block U+2xxxx and 337 more in the blocks above U+30000.
    For U+2xxxx and above, the leading hexadecimal digit goes into dots 1245 of an additional braille character placed after the fourth one; it selects the Unicode block. As the list below shows, dot 3 is also set in this additional character. For the Unicode characters from U+100000 to U+10FFFF, the braille character ⠥ (U+2825, dots 136) selects the block.
    Here is the full list from U+20000 to U+10FFFF:
    • ⠿--⠤⠇ = characters between U+20000 and U+2FFFF
    • ⠿--⠤⠍ = characters between U+30000 and U+3FFFF
    • ⠿--⠤⠝ = characters between U+40000 and U+4FFFF
    • ⠿--⠤⠕ = characters between U+50000 and U+5FFFF
    • ⠿--⠤⠏ = characters between U+60000 and U+6FFFF
    • ⠿--⠤⠟ = characters between U+70000 and U+7FFFF
    • ⠿--⠤⠗ = characters between U+80000 and U+8FFFF
    • ⠿--⠤⠎ = characters between U+90000 and U+9FFFF
    • ⠿--⠤⠌ = characters between U+A0000 and U+AFFFF
    • ⠿--⠤⠜ = characters between U+B0000 and U+BFFFF
    • ⠿--⠤⠖ = characters between U+C0000 and U+CFFFF
    • ⠿--⠤⠆ = characters between U+D0000 and U+DFFFF
    • ⠿--⠤⠔ = characters between U+E0000 and U+EFFFF
    • ⠿--⠤⠄ = characters between U+F0000 and U+FFFFF
    • ⠿--⠤⠥ = characters between U+100000 and U+10FFFF
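
The suffix rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of any existing braille table or library: the names `DIGIT_BITS` and `suffix_cells` are made up for this sketch, and the digit patterns are the ones from the conversion table in this issue. Bit values follow the Unicode braille block (dot 1 = 0x01, dot 2 = 0x02, dot 3 = 0x04, dot 4 = 0x08, dot 5 = 0x10, dot 6 = 0x20).

```python
# Hypothetical helper for illustration; not part of liblouis or NVDA.
# Dots-1245 bit pattern of each hexadecimal digit 0..F, taken from the
# conversion table in this issue (0 = dots 245, 1 = dot 1, ..., F = no dots).
DIGIT_BITS = [0x1A, 0x01, 0x03, 0x09, 0x19, 0x11, 0x0B, 0x1B,
              0x13, 0x0A, 0x08, 0x18, 0x12, 0x02, 0x10, 0x00]

DOT3, DOT6 = 0x04, 0x20


def suffix_cells(last_digit: int, plane: int) -> str:
    """Return the braille character(s) that close one encoded code point.

    last_digit is the final hexadecimal digit (0..15); plane is the
    code point divided by 0x10000 (0..16).
    """
    if not (0 <= last_digit <= 0xF and 0 <= plane <= 0x10):
        raise ValueError("digit or plane out of range")
    if plane == 0:
        marker = DOT3            # U+0000..U+FFFF
    elif plane == 1:
        marker = DOT6            # U+10000..U+1FFFF
    else:
        marker = DOT3 | DOT6     # U+20000 and above
    cells = [DIGIT_BITS[last_digit] | marker]
    if plane >= 2:
        # Extra character: plane digit in dots 1245 plus dot 3,
        # or dots 136 for U+100000..U+10FFFF.
        cells.append(0x25 if plane == 0x10 else DIGIT_BITS[plane] | DOT3)
    return "".join(chr(0x2800 + c) for c in cells)
```

Because the digit F has no dots, `suffix_cells(0xF, plane)` shows the bare suffix: it gives `⠄` for plane 0, `⠠` for plane 1 and `⠤⠇` for plane 2, matching the rows of the list.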

Converting hexadecimal values into braille:

  • 0 = ⠚, 1 = ⠁, 2 = ⠃, 3 = ⠉, 4 = ⠙, 5 = ⠑, 6 = ⠋, 7 = ⠛
  • 8 = ⠓, 9 = ⠊, A = ⠈, B = ⠘, C = ⠒, D = ⠂, E = ⠐, F = ⠀

Combining hexadecimal values:

  • 0000 = ⠺⠽⠚, 0001 = ⠺⠽⠁, 0010 = ⠺⠋⠚, 0100 = ⠞⠴⠚
  • 1000 = ⠡⠽⠚, FFEF = ⠀⠠⠀, FFFE = ⠀⠀⠐, FFFF = ⠀⠀⠀
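
The digit table and the combinations above can be reproduced with a short Python sketch. It is an illustration only; `DIGIT_BITS` and `pack_hex4` are hypothetical names, and the bit values assume the Unicode braille block layout (dot 1 = 0x01, dot 2 = 0x02, dot 3 = 0x04, dot 4 = 0x08, dot 5 = 0x10, dot 6 = 0x20). The suffix dots are left empty here, as in the combinations listed above.

```python
# Hypothetical helper for illustration; not an existing library API.
# Dots-1245 bit pattern of each hexadecimal digit 0..F
# (0 = dots 245, 1 = dot 1, 2 = dots 12, ..., F = no dots).
DIGIT_BITS = [0x1A, 0x01, 0x03, 0x09, 0x19, 0x11, 0x0B, 0x1B,
              0x13, 0x0A, 0x08, 0x18, 0x12, 0x02, 0x10, 0x00]


def pack_hex4(value: int) -> str:
    """Pack four hexadecimal digits into three 6-dot characters (no suffix)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value must fit in four hexadecimal digits")
    d1, d2, d3, d4 = (DIGIT_BITS[(value >> s) & 0xF] for s in (12, 8, 4, 0))
    top = d2 & 0x09       # dots 1 and 4 of the second digit
    middle = d2 & 0x12    # dots 2 and 5 of the second digit
    # First character: first digit in dots 1245, top row of the
    # second digit moved to dots 3 and 6.
    cell1 = d1 | (top << 2)
    # Second character: middle row of the second digit moved to dots 1 and 4,
    # third digit moved down one row into dots 2356.
    cell2 = (middle >> 1) | (d3 << 1)
    # Third character: last digit in dots 1245; dots 3 and 6 stay free
    # for the suffix.
    cell3 = d4
    return "".join(chr(0x2800 + c) for c in (cell1, cell2, cell3))
```

For example, `pack_hex4(0x0100)` returns `⠞⠴⠚` and `pack_hex4(0x1000)` returns `⠡⠽⠚`, as listed above.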

Examples:

  • Digit Zero = 0 = U+0030 = '\x0030' = ⠿⠺⠛⠞
    But: Two Digit Zero = 00 = U+0030U+0030 = '\x0030''\x0030' = ⠿⠺⠛⠞⠿⠺⠛⠞
    Reducing it to ⠿⠺⠛⠞⠺⠛⠞ isn't allowed.
  • Music Sharp Sign = ♯ = U+266F = '\x266f' = ⠿⠧⠗⠄
  • Braille Pattern Dots-12 = ⠃ = U+2803 = '\x2803' = ⠿⠇⠽⠍
  • Musical Symbol G Clef = 𝄞 = U+1D11E = '\xd834''\xdd1e' = ⠿⠆⠂⠰
  • Grinning Face = 😀 = U+1F600 = '\xd83d''\xde00' = ⠿⠤⠵⠺
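
The examples above can be checked with a small end-to-end sketch in Python. It is a hedged illustration of the scheme as described in this issue, not the technical solution referred to below; the names `encode_codepoint` and `encode_text` are hypothetical. Every character gets its own prefix, so no grouping takes place.

```python
# Hypothetical encoder for illustration; not part of liblouis or NVDA.
# Bit values follow the Unicode braille block: dot 1 = 0x01 ... dot 6 = 0x20.
DIGIT_BITS = [0x1A, 0x01, 0x03, 0x09, 0x19, 0x11, 0x0B, 0x1B,
              0x13, 0x0A, 0x08, 0x18, 0x12, 0x02, 0x10, 0x00]
PREFIX = 0x3F  # dots 123456


def encode_codepoint(cp: int) -> str:
    """Encode one Unicode code point as prefix + 3 or 4 braille characters."""
    if not 0 <= cp <= 0x10FFFF:
        raise ValueError("not a Unicode code point")
    plane = cp >> 16
    d1, d2, d3, d4 = (DIGIT_BITS[(cp >> s) & 0xF] for s in (12, 8, 4, 0))
    cell1 = d1 | ((d2 & 0x09) << 2)         # digit 1 + top row of digit 2
    cell2 = ((d2 & 0x12) >> 1) | (d3 << 1)  # rest of digit 2 + digit 3
    # Suffix: dot 3 for plane 0, dot 6 for plane 1, dots 36 otherwise.
    suffix = 0x04 if plane == 0 else 0x20 if plane == 1 else 0x24
    cells = [PREFIX, cell1, cell2, d4 | suffix]
    if plane >= 2:
        # Plane digit plus dot 3, or dots 136 for U+100000..U+10FFFF.
        cells.append(0x25 if plane == 0x10 else DIGIT_BITS[plane] | 0x04)
    return "".join(chr(0x2800 + c) for c in cells)


def encode_text(text: str) -> str:
    """Encode each character separately; the prefix is repeated every time."""
    return "".join(encode_codepoint(ord(ch)) for ch in text)
```

With this sketch, `encode_text("00")` yields `⠿⠺⠛⠞⠿⠺⠛⠞` and `encode_codepoint(0x1F600)` yields `⠿⠤⠵⠺`, in line with the examples above.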

Technical solution:

Please read the same section in issue #688.
Thoughts and suggestions from the community are highly welcome. Maybe I have overlooked something here too.

Additional sources:

DrSooom commented Jan 28, 2019

I created a test file for 6-dot and one for 8-dot to demonstrate what my idea would look like with emoticons (U+1F600 to U+1F64F, UTF-16 encoding). You can include these text files in another table and test them with NVDA.

DrSooom commented Feb 7, 2019

I'm going to create all 8- and 6-dot tables for UTF-16 and UTF-32. I will add a new comment here once I've finished everything.

DrSooom commented Mar 22, 2019

Currently 25 % of the first HUC6 braille table is finished. I guess I will still need approximately four more weeks to finalize all of the tables.
