cha(rs) is a silly command-line tool to display information about Unicode characters

Cha(rs)

Use this tool to display names and codes for various ASCII (and Unicode) characters / code points!

It's strongly inspired by ascii(1), but supports Unicode characters; it's also inspired by unicode.py, but it attempts to support whitespace/control characters better.

Cha(rs) is currently probably failing at some other edge case, but I hope not.

Pronunciation

How do you pronounce "chars"? This is a contentious thing.

Installation

Prereqs: I am building chars on Rust 1.10. Rust 1.9 and above should work, but no guarantees.

Plain crate installation without source code

cargo install chars --git https://github.com/antifuchs/chars.git

Source installation

  1. Clone this repo
  2. cargo install

Running

Look up a character by its face value:

chars 'ß'

Screenshot:

LATIN1 df, 223, 0xdf, 0337, bits 11011111
Width: 1 (2 in CJK context), prints as ß
Lower case. Upcases to SS
Quotes as \u{df}
Unicode name: LATIN SMALL LETTER SHARP S
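
The fields in that output are all different views of the same code point. Here is a minimal sketch of how they can be derived with the Rust standard library. The `describe` helper is hypothetical and written for illustration; it is not taken from the chars source.

```rust
/// Render a character's code point in hex, decimal, prefixed hex,
/// octal and binary, roughly in the style of the output above.
/// Hypothetical helper, not part of chars itself.
fn describe(c: char) -> String {
    let n = c as u32;
    format!("{:x}, {}, 0x{:x}, {:04o}, bits {:08b}", n, n, n, n, n)
}

fn main() {
    let c = 'ß';
    println!("{}", describe(c));

    // Full uppercase mapping: one character may expand to several.
    let upper: String = c.to_uppercase().collect();
    println!("Upcases to {}", upper);

    // Rust-style escape sequence for the character.
    let quoted: String = c.escape_unicode().collect();
    println!("Quotes as {}", quoted);
}
```

Note that `to_uppercase` returns an iterator rather than a single `char`, which is why ß can upcase to the two-character string SS.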

Look up a character by its Unicode code point:

chars U+1F63C

Screenshot:

U+0001F63C, 😼 0x0001F63C, \0373074, UTF-8: f0 9f 98 bc, UTF-16BE: d83dde3c
Width: 1, prints as 😼
Quotes as \u{1f63c}
Unicode name: CAT FACE WITH WRY SMILE
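
For characters outside ASCII and Latin-1, the output lists the UTF-8 bytes and the UTF-16BE code units. A small sketch of how those two fields can be computed with the standard library follows. The function names `utf8_hex` and `utf16be_hex` are made up for this example and are not taken from chars.

```rust
/// Hex-encode the UTF-8 bytes of a character, separated by spaces.
/// Hypothetical helper for illustration.
fn utf8_hex(c: char) -> String {
    let mut buf = [0u8; 4];
    c.encode_utf8(&mut buf)
        .bytes()
        .map(|b| format!("{:02x}", b))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Hex-encode the UTF-16 code units of a character, most significant
/// byte first, with no separator. Hypothetical helper for illustration.
fn utf16be_hex(c: char) -> String {
    let mut buf = [0u16; 2];
    c.encode_utf16(&mut buf)
        .iter()
        .map(|unit| format!("{:04x}", unit))
        .collect()
}

fn main() {
    let c = '\u{1F63C}';
    println!("UTF-8: {}, UTF-16BE: {}", utf8_hex(c), utf16be_hex(c));
}
```

Code points above U+FFFF need a surrogate pair in UTF-16, which is why the cat face shows two 16-bit units while U+FE09 further down shows only one.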

Look up a character by ambiguous "char code" handwaving:

chars 10

Screenshot:

U+0001F0EA, 🃪 0x0001F0EA, \0370352, UTF-8: f0 9f 83 aa, UTF-16BE: d83cdcea
Width: 1, prints as 🃪
Quotes as \u{1f0ea}
Unicode name: PLAYING CARD TRUMP-10

U+0001DAA9, 𝪩 0x0001DAA9, \0355251, UTF-8: f0 9d aa a9, UTF-16BE: d836dea9
Width: 0, prints as 𝪩
Quotes as \u{1daa9}
Unicode name: SIGNWRITING ROTATION MODIFIER-10

U+0001D209, 𝈉 0x0001D209, \0351011, UTF-8: f0 9d 88 89, UTF-16BE: d834de09
Width: 1, prints as 𝈉
Quotes as \u{1d209}
Unicode name: GREEK VOCAL NOTATION SYMBOL-10

U+0001D1A4, 𝆤 0x0001D1A4, \0350644, UTF-8: f0 9d 86 a4, UTF-16BE: d834dda4
Width: 1, prints as 𝆤
Quotes as \u{1d1a4}
Unicode name: MUSICAL SYMBOL ORNAMENT STROKE-10

U+FE09, ︉ 0xFE09, \0177011, UTF-8: ef b8 89, UTF-16BE: fe09
Width: 0, prints as ︉
Quotes as \u{fe09}
Unicode name: VARIATION SELECTOR-10

ASCII 1/0,  16, 0x10, 0020, bits 00010000
Control character; quotes as \u{10}, called ^P
Called: DLE
Also known as: Data Link Escape

ASCII 0/a,  10, 0x0a, 0012, bits 00001010
Control character; quotes as \n, called ^J
Called: LF, NL
Also known as: Line Feed, Newline, \n

ASCII 0/8,   8, 0x08, 0010, bits 00001000
Control character; quotes as \u{8}, called ^H
Called: BS
Also known as: Backspace, \b

ASCII 0/2,   2, 0x02, 0002, bits 00000010
Control character; quotes as \u{2}, called ^B
Called: STX
Also known as: Start of Text
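
Judging from the output above, the last four results are "10" read as hexadecimal (0x10), decimal (10), octal (010) and binary (0b10), and the ones before them are characters whose names mention 10. The numeric part can be sketched as below. This is a guess at the behaviour based only on the sample output, not the actual chars implementation; `candidates` and its radix order are assumptions.

```rust
/// Interpret `input` as a number in several radixes and return every
/// result that is a valid Unicode scalar value, without duplicates.
/// The radix order (16, 10, 8, 2) is an assumption read off the sample
/// output above. Hypothetical helper, not part of chars itself.
fn candidates(input: &str) -> Vec<char> {
    let mut found = Vec::new();
    for radix in [16, 10, 8, 2] {
        if let Ok(n) = u32::from_str_radix(input, radix) {
            // from_u32 rejects surrogates and values above U+10FFFF.
            if let Some(c) = char::from_u32(n) {
                if !found.contains(&c) {
                    found.push(c);
                }
            }
        }
    }
    found
}

fn main() {
    for c in candidates("10") {
        println!("U+{:04X} quotes as {}", c as u32, c.escape_default());
    }
}
```

The name matches (PLAYING CARD TRUMP-10 and friends) would need a Unicode name table, which this sketch leaves out.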

Look up a control character:

chars "^C"

Screenshot:

ASCII 0/3,   3, 0x03, 0003, bits 00000011
Control character; quotes as \u{3}, called ^C
Called: ETX
Also known as: End of Text
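
Caret notation conventionally maps a control character to the printable character 0x40 above it, so ^C is 'C' (0x43) with that bit flipped, giving 0x03. A minimal sketch of that convention follows. `parse_caret` and `caret_name` are hypothetical names, and chars may well handle this differently.

```rust
/// Parse caret notation such as "^C" into the control character it names.
/// Returns None unless the input is '^' followed by '?' or by a character
/// in the range '@'..='_' (lower-case letters are accepted too).
/// Hypothetical helper for illustration.
fn parse_caret(s: &str) -> Option<char> {
    let mut chars = s.chars();
    match (chars.next(), chars.next(), chars.next()) {
        (Some('^'), Some('?'), None) => Some('\u{7f}'),
        (Some('^'), Some(c), None) => {
            let upper = c.to_ascii_uppercase();
            if ('@'..='_').contains(&upper) {
                // Flipping bit 0x40 maps '@'..='_' onto 0x00..=0x1f.
                Some((upper as u8 ^ 0x40) as char)
            } else {
                None
            }
        }
        _ => None,
    }
}

/// Inverse direction: the caret name of a control character, if it has one.
fn caret_name(c: char) -> Option<String> {
    match c as u32 {
        n @ 0..=0x1f => Some(format!("^{}", (n as u8 ^ 0x40) as char)),
        0x7f => Some("^?".to_string()),
        _ => None,
    }
}

fn main() {
    let c = parse_caret("^C").expect("valid caret notation");
    println!("{} is code {}", caret_name(c).unwrap(), c as u32);
}
```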