## Converting from Codepoint to Char and Back

In [None]:
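# wrong way: a [char] is only 16 bits, so the cast works up to 0xFFFF and no further
[char]103       # 'g'
[char]0x1f468   # error: the value does not fit in a [char]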

In [None]:
'was it too big to fit into one [char]?'
0x1f468 -gt [int][char]::MaxValue   # True: [char]::MaxValue is 0xFFFF

In [None]:
# Codepoint (int32) to char/string
[char]::ConvertFromUtf32( 0x1f468 )

# surrogate pair (two [char]s) back to codepoint
$charList = '👨'.ToCharArray()
[char]::ConvertToUtf32( $charList[0], $charList[1] )

# it outputs in decimal, it's the same number
128104 -eq 0x1f468


Note: The return type of `[char]::ConvertFromUtf32` is actually a `[string]`, not a `[char]`
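
To make the note above concrete, here is a small sketch that inspects what `[char]::ConvertFromUtf32` hands back for a codepoint above `0xFFFF`. The variable name `$s` is only for illustration; every call used is a standard `System.Char` member.

```powershell
$s = [char]::ConvertFromUtf32( 0x1f468 )

$s.GetType().FullName             # System.String
$s.Length                         # 2 (two UTF-16 code units)
[char]::IsHighSurrogate( $s[0] )  # True
[char]::IsLowSurrogate( $s[1] )   # True
```

Each half of the pair is a valid `[char]`, but neither one is a character on its own.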

In [None]:
function inspectString {
    # Summarize a string: UTF-16 length, codepoint count, and per-rune details
    param(
        [string]$InputText
    )

    $info = [ordered]@{
        'Input'           = $InputText
        'String.Length'   = $InputText.Length   # UTF-16 code units, not codepoints
        'TotalCodepoints' = $InputText.EnumerateRunes().Value.Count
        'Details'         = $InputText.EnumerateRunes()
    }

    return [pscustomobject]$info
}

$str = '👨👨‍👩‍👧‍👦'
inspectString $str | ft -auto
$strMan = '👨'
inspectString $strMan | ft -auto

$str.EnumerateRunes()
    | measure -Property Utf8SequenceLength, Utf16SequenceLength -Sum
    | ft Property, Sum, Count

In [None]:
(inspectString $str ).Details | Ft -auto

In [None]:
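# other ways to measure the same string
$str.Length                                 # UTF-16 code units ([char]s)
[Text.Encoding]::UTF8.GetByteCount( $str )  # bytes when encoded as UTF-8

# same byte count without allocating: [Text.Encoding]::Unicode.GetByteCount( $str )
$bStr_16 = [Text.Encoding]::Unicode.GetBytes( $str ) # string to UTF-16LE bytes
$bStr_16.Length                                      # bytes when encoded as UTF-16LE
[Text.Encoding]::Unicode.GetCharCount( $bStr_16 )    # bytes back to a [char] count (equals $str.Length)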

Converting a codepoint to a `[char]` actually gives you a `[string]`, because a codepoint above `0xFFFF` needs two `[char]`s (a surrogate pair).


Assume that .NET strings and chars use the encoding `UTF-16LE`, unless the docs explicitly say otherwise.
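
As a quick illustration of what "LE" means here, the sketch below compares the bytes produced by `[Text.Encoding]::Unicode` (UTF-16LE) and `[Text.Encoding]::BigEndianUnicode` (UTF-16BE). The variable names are only for illustration.

```powershell
# 'A' is U+0041: little-endian puts the low byte first
$le = [Text.Encoding]::Unicode.GetBytes( 'A' )           # UTF-16LE
$be = [Text.Encoding]::BigEndianUnicode.GetBytes( 'A' )  # UTF-16BE
$le -join ' '   # 65 0
$be -join ' '   # 0 65

# U+1F468 becomes the surrogate pair D83D DC68, stored low byte first
$man = [char]::ConvertFromUtf32( 0x1f468 )
$manHex = ( [Text.Encoding]::Unicode.GetBytes( $man ) |
    ForEach-Object { '{0:X2}' -f $_ } ) -join ' '
$manHex         # 3D D8 68 DC
```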

| Type | Description |
| - | -  |
| `[char]` | represents a single UTF-16LE code unit<br>size is **2 bytes**, so the max value is `0xFFFF` |
| `[String]` | a sequence of UTF-16LE code units |
| `[Rune]` | represents a single Unicode scalar value<br>in the range `[0..0x10ffff]`<br>except the surrogate range `[0xd800..0xdfff]` |
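
The `[Rune]` row can be checked directly. This is a minimal sketch that assumes PowerShell 7 or later, since `[Text.Rune]` is not part of .NET Framework; the variable name `$rune` is only for illustration.

```powershell
# Requires PowerShell 7+ ([Text.Rune] does not exist on .NET Framework)
$rune = [Text.Rune]::new( 0x1f468 )
$rune.Value                 # 128104
$rune.IsBmp                 # False: it lies above 0xFFFF
$rune.Utf16SequenceLength   # 2 code units (a surrogate pair)
$rune.Utf8SequenceLength    # 4 bytes
$rune.ToString().Length     # 2

# the valid range is [0..0x10ffff] minus the surrogates
[Text.Rune]::IsValid( 0xd800 )     # False
[Text.Rune]::IsValid( 0x10ffff )   # True
[Text.Rune]::IsValid( 0x110000 )   # False
```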

docs:

- [Introduction to Character Encoding in `.NET`](https://docs.microsoft.com/en-us/dotnet/standard/base-types/character-encoding-introduction)
- [`[Char]`](https://docs.microsoft.com/en-us/dotnet/api/system.char?view=net-6.0)
- [`[String]`](https://docs.microsoft.com/en-us/dotnet/api/system.string?view=net-6.0)
- [`[Text.Rune]`](https://docs.microsoft.com/en-us/dotnet/api/system.text.rune?view=net-6.0)
- [`[Globalization.StringInfo]`](https://docs.microsoft.com/en-us/dotnet/api/system.globalization.stringinfo?view=net-6.0)

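Note: **Windows PowerShell** (5.1) runs on .NET Framework, which has no `[Rune]` type, so `EnumerateRunes()` and the other `[Rune]` functions are only available in PowerShell 7+.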
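
Where `[Rune]` is unavailable, codepoints can still be listed with `System.Char` methods alone. The function below is a rough sketch of one way to do that; `Get-CodepointList` is a hypothetical name, not a built-in command, and emitting the raw code unit for a lone surrogate is an assumption of this sketch rather than standard behavior.

```powershell
function Get-CodepointList {
    # Hypothetical helper: emits one [int] codepoint per character of the input
    param(
        [string]$InputText
    )

    for ($i = 0; $i -lt $InputText.Length; $i++) {
        if ([char]::IsSurrogatePair( $InputText, $i )) {
            # two code units form one codepoint; skip the low surrogate
            [char]::ConvertToUtf32( $InputText, $i )
            $i++
        }
        else {
            # BMP character, or a lone surrogate passed through as-is (assumption)
            [int]$InputText[$i]
        }
    }
}

$sample = 'a' + [char]::ConvertFromUtf32( 0x1f468 ) + 'b'
$codepoints = @( Get-CodepointList $sample )
$codepoints -join ' '   # 97 128104 98
$codepoints.Count       # 3, while $sample.Length is 4
```

This counts codepoints, not what a reader sees as one character; for that, see `[Globalization.StringInfo]`.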

See more:
- [`System.String` and `char`](https://docs.microsoft.com/en-us/dotnet/api/system.string?view=net-6.0#char-objects-and-unicode-characters)