smittytone/Unicoder

Unicoder 1.0.1

Unicoder is a utility for converting Unicode code points, such as U+0024 ($) or U+20AC (€), into their UTF-8 byte sequences, presented as hexadecimal strings.

Unicoder's output is in the form of Squirrel string assignments that are ready to be cut and pasted into Squirrel code. For example:

/Users/smitty > ./unicoder.py U+20AC U+24 U+0939 U+0025 U+10348

local unicodeString="\xE2\x82\xAC";
local unicodeString="\x24";
local unicodeString="\xE0\xA4\xB9";
local unicodeString="\x25";
local unicodeString="\xF0\x90\x8D\x88";

As the example above shows, you call the script with one or more code points separated by spaces. Each code point produces one line of output.
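The conversion shown above can be approximated in a few lines of Python. This is a minimal sketch, not the script's actual source: the function names `parse_code_point`, `utf8_hex_bytes` and `squirrel_assignment` are hypothetical, and it assumes each argument is a hexadecimal code point with an optional `U+` prefix.

```python
def parse_code_point(token):
    """Turn a string such as 'U+20AC' into an integer code point."""
    text = token.upper()
    if text.startswith("U+"):
        text = text[2:]
    value = int(text, 16)  # raises ValueError if the text is not valid hex
    if value > 0x10FFFF or 0xD800 <= value <= 0xDFFF:
        # Outside the Unicode range, or a surrogate that UTF-8 cannot encode
        raise ValueError("not an encodable code point: " + token)
    return value


def utf8_hex_bytes(token):
    """Return the UTF-8 bytes of a code point as two-digit hex strings."""
    encoded = chr(parse_code_point(token)).encode("utf-8")
    return ["{:02X}".format(byte) for byte in encoded]


def squirrel_assignment(token):
    """Format the bytes as a Squirrel string assignment (assumed layout)."""
    escaped = "".join("\\x" + pair for pair in utf8_hex_bytes(token))
    return 'local unicodeString="' + escaped + '";'


for code in ["U+20AC", "U+24", "U+0939", "U+0025", "U+10348"]:
    print(squirrel_assignment(code))
```

Python's built-in `str.encode("utf-8")` does the actual encoding here, so the sketch only has to handle parsing and formatting.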

If you don't require Squirrel-oriented output, use the -j/--justhex switch:

/Users/smitty > ./unicoder.py -j U+20AC U+24 U+0939 U+0025 U+10348

E282AC
24
E0A4B9
25
F0908D88
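For readers curious how a switch like this might be wired up, here is a hedged sketch using Python's standard `argparse` module. It is an illustration only: `format_code_point` and `main` are hypothetical names, and the real script may parse its arguments quite differently.

```python
import argparse


def format_code_point(token, just_hex=False):
    """Encode one code point as bare hex or as a Squirrel assignment."""
    text = token.upper()
    if text.startswith("U+"):
        text = text[2:]
    encoded = chr(int(text, 16)).encode("utf-8")
    if just_hex:
        return encoded.hex().upper()
    escaped = "".join("\\x{:02X}".format(byte) for byte in encoded)
    return 'local unicodeString="' + escaped + '";'


def main(argv=None):
    """Parse the command line, print one line per code point, return the lines."""
    parser = argparse.ArgumentParser(
        description="Convert Unicode code points to UTF-8 hex strings")
    parser.add_argument("-j", "--justhex", action="store_true",
                        help="print hex digits only, with no Squirrel wrapper")
    parser.add_argument("codes", nargs="+",
                        help="code points such as U+20AC")
    args = parser.parse_args(argv)
    lines = [format_code_point(code, args.justhex) for code in args.codes]
    print("\n".join(lines))
    return lines
```

Calling `main(["-j", "U+20AC", "U+24"])` exercises the hex-only path. In a standalone script you would call `main()` with no argument so that `argparse` reads `sys.argv`.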

Note: Unicode contains more than 137,000 characters, so at this time Unicoder cannot derive a hex string from a character itself, only from that character's code point.
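If you have the character but not its code point, one possible workaround outside Unicoder is to look the value up with Python's built-in `ord()` and pass the result to the script. The helper name `code_point_label` below is hypothetical and is not part of Unicoder.

```python
def code_point_label(character):
    """Return the 'U+XXXX' label for a single character."""
    if len(character) != 1:
        raise ValueError("expected exactly one character")
    # ord() gives the integer code point; pad to at least four hex digits
    return "U+{:04X}".format(ord(character))


print(code_point_label("€"))
```

This assumes the input is one Unicode code point. Characters built from several code points, such as a letter followed by a combining accent, need each code point handled separately.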

Release Notes

  • 1.0.1 12 June 2019
    • Add -j/--justhex option for hex-only output.
    • Help text cleaned up.
  • 1.0.0 5 June 2019
    • Initial release.

Licence And Copyright

This software is copyright © 2019, Tony Smith (@smittytone).

The UTF-8 character codes and encoding scheme are copyright © The Unicode Consortium.

About

Python 3 script for generating UTF-8 hex byte strings from Unicode code points
