## Characters

* most of the time, you'll be using the primitive char type

In [1]:
char ch = 'a';

// Unicode escape for the uppercase Greek omega character
char uniChar = '\u03A9';

// an array of chars
char[] charArray = { 'a', 'b', 'c', 'd', 'e'};

* there are times when you need to use char as an object
    - Java provides a wrapper class that "wraps" the char in a Character object
    - an object of type Character contains a single field, whose type is char
    - also provides useful static methods
* the Java compiler will also create a Character object for you in some circumstances
    - e.g. if you pass a primitive char into a method that expects an object, the compiler converts the char to a Character object for you
        * this is called autoboxing
    - and if a method expects a primitive char and is given a Character object, the compiler unwraps the object for you
        * this is called unboxing
* the Character class is immutable: once a Character object is created, it cannot be changed
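* Useful static methods (each takes the char to examine as its argument):
    - boolean isLetter(char ch)
    - boolean isDigit(char ch)
    - boolean isWhitespace(char ch)
    - boolean isUpperCase(char ch)
    - boolean isLowerCase(char ch)
    - char toUpperCase(char ch)
    - char toLowerCase(char ch)
    - String toString(char ch): returns a one-character String object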
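
As a rough illustration of the static methods listed above, the sketch below classifies a few chars. The class name `CharacterMethodsDemo` and the helper `describe` are made up for this example; only the `Character` calls come from the standard library.

```java
public class CharacterMethodsDemo {
    // Hypothetical helper: labels a char using the static Character methods.
    static String describe(char c) {
        if (Character.isLetter(c)) {
            if (Character.isUpperCase(c)) {
                return "uppercase letter";
            }
            if (Character.isLowerCase(c)) {
                return "lowercase letter";
            }
            return "letter"; // letters without case, e.g. many non-Latin scripts
        }
        if (Character.isDigit(c)) {
            return "digit";
        }
        if (Character.isWhitespace(c)) {
            return "whitespace";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(describe('a'));              // lowercase letter
        System.out.println(describe('7'));              // digit
        System.out.println(Character.toUpperCase('a')); // A
        // toString returns a String, not a char
        String s = Character.toString('a');
        System.out.println(s.length());                 // 1
    }
}
```

Note that every call passes the char in as an argument; none of these methods is invoked on a Character instance.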

In [None]:
// can create a Character object with the Character constructor
// (deprecated since Java 9; Character.valueOf('a') is the preferred alternative):

Character ch = new Character('a');
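
The autoboxing and unboxing conversions described earlier can be seen in a small sketch. The class `BoxingDemo` and its helpers `codeOf` and `toList` are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class BoxingDemo {
    // Expects a primitive char; passing a Character triggers unboxing.
    static int codeOf(char c) {
        return c; // widening conversion from char to int
    }

    // Collections hold objects, so each add() autoboxes the char.
    static List<Character> toList(String s) {
        List<Character> out = new ArrayList<>();
        for (char c : s.toCharArray()) {
            out.add(c); // autoboxing: char -> Character
        }
        return out;
    }

    public static void main(String[] args) {
        Character boxed = 'a';    // autoboxing on assignment
        char primitive = boxed;   // unboxing on assignment
        System.out.println(primitive);      // a
        System.out.println(codeOf(boxed));  // 97
        System.out.println(toList("abc"));  // [a, b, c]
    }
}
```

In both directions the compiler inserts the conversion, so no explicit constructor call or `charValue()` call appears in the source.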

## Characters and Code Points

* the char data type and the Character class are based on the original Unicode specification:
    - it defined characters as fixed-width 16-bit entities
    - the Unicode Standard has since been extended to allow characters that require more than 16 bits
        * the range of legal code points is now U+0000 to U+10FFFF, known as Unicode scalar values
* a char value is 16 bits wide and can hold numbers from 0x0000 to 0xFFFF
    - the code points in this range, U+0000 to U+FFFF, are referred to as the _Basic Multilingual Plane (BMP)_
    - characters whose code points are greater than U+FFFF are called _supplementary characters_
    - in a String or char array, a supplementary character is stored as a pair of char values called a surrogate pair
* a char value, therefore, can represent only a BMP code point (or one half of a surrogate pair)
    - an int value can represent any Unicode code point, including supplementary code points
* the behavior of supplementary characters and surrogate char values is as follows:
    - the methods that only accept a char value cannot support supplementary characters
        * they treat char values from the surrogate ranges as undefined characters
    - the methods that can accept an int value support all Unicode characters, including supplementary characters
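
To make the char versus int distinction concrete, here is a minimal sketch using U+1D400 (MATHEMATICAL BOLD CAPITAL A) as an arbitrary example of a supplementary character. The class name `CodePointDemo` and the helper `countLetters` are assumptions made for this example.

```java
public class CodePointDemo {
    // Counts letters by walking code points (int values), so supplementary
    // characters are seen whole rather than as two surrogate chars.
    static long countLetters(String s) {
        return s.codePoints().filter(cp -> Character.isLetter(cp)).count();
    }

    public static void main(String[] args) {
        // Build "a" followed by one supplementary character.
        String s = "a" + new String(Character.toChars(0x1D400));

        System.out.println(s.length());                      // 3 char values
        System.out.println(s.codePointCount(0, s.length())); // 2 code points

        // The char overload sees only a lone surrogate, which is not a letter.
        System.out.println(Character.isLetter(s.charAt(1)));      // false
        // The int overload sees the full code point.
        System.out.println(Character.isLetter(s.codePointAt(1))); // true

        System.out.println(Character.isSupplementaryCodePoint(s.codePointAt(1))); // true
        System.out.println(countLetters(s));                 // 2
    }
}
```

The string holds two characters from a reader's point of view but three char values, which is why code that must handle supplementary characters tends to prefer the int-based overloads.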

## Escape Sequence

* a character preceded by a backslash (\) is an escape sequence and has special meaning to the compiler
* common ones:
    - \n: insert newline in text at this point
    - \': insert single quote at this point
    - \": insert double quote at this point
    - \\: insert backslash at this point

In [2]:
System.out.println("She said \"Hello!\" to me.");

She said "Hello!" to me.
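
The other escape sequences listed above work the same way. The following sketch combines `\n`, `\\`, and `\'` in one string; the class `EscapeDemo` and the method `sample` are hypothetical names used only here.

```java
public class EscapeDemo {
    // Returns a string that uses the newline, backslash,
    // and single-quote escape sequences.
    static String sample() {
        return "Line one\nC:\\temp holds \'quoted\' text";
    }

    public static void main(String[] args) {
        System.out.println(sample());
        // Prints:
        // Line one
        // C:\temp holds 'quoted' text
    }
}
```

Inside a double-quoted string the `\'` escape is optional, since a bare single quote is also accepted there; it is required only in a char literal such as `'\''`.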
