# PostgreSQL & Python Character Sets

In this notebook, we explore character encoding and mapping in PostgreSQL and Python:
- PostgreSQL character-code functions: `ASCII()`, `CHR()`
- Unicode support in PostgreSQL (`UTF-8`)
- Working with emojis and accented characters
- Python equivalents: `ord()`, `chr()`
- Converting between bytes and strings: `.encode()` / `.decode()`

We'll use both a dedicated table `char_examples` and Python examples for demonstration.

In [1]:
%load_ext sql

In [2]:
%sql postgresql://fahad:secret@localhost:5432/people

---
## 1. Create a Table for Character Examples

We'll create `char_examples` to hold sample strings including accented characters and emojis.

In [3]:
%%sql
DROP TABLE IF EXISTS char_examples;
CREATE TABLE char_examples (
    id SERIAL PRIMARY KEY,
    sample_text VARCHAR(100)
);

 * postgresql://fahad:***@localhost:5432/people
Done.
Done.


[]

In [4]:
%%sql
INSERT INTO char_examples (sample_text) VALUES
('A'),
('é'),
('😊'),
('ñ');

 * postgresql://fahad:***@localhost:5432/people
4 rows affected.


[]

---
## 2. PostgreSQL ASCII / CHR Functions

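- `ASCII(text)` → returns the numeric code of the first character; in a UTF-8 database this is the Unicode code point, so values above 127 (as for `é` and `😊`) are not ASCII codes
- `CHR(int)` → returns the character for the given code; in a UTF-8 database the argument is interpreted as a Unicode code point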

In [5]:
%%sql
SELECT
    sample_text,
    ASCII(sample_text) AS ascii_code,
    CHR(65) AS chr_example
FROM char_examples;

 * postgresql://fahad:***@localhost:5432/people
4 rows affected.


sample_text,ascii_code,chr_example
A,65,A
é,233,A
😊,128522,A
ñ,241,A
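
The codes returned above can be cross-checked in Python. The sketch below is illustrative only: `describe_code_point` is a hypothetical helper, not part of the notebook, and the sample list simply mirrors the rows of `char_examples`. It reports each character's code point and whether that value fits in 7-bit ASCII.

```python
# Sample characters mirroring the rows inserted into char_examples.
samples = ['A', 'é', '😊', 'ñ']

def describe_code_point(ch):
    """Return (code point, U+ notation, True if the code fits in 7-bit ASCII).

    Hypothetical helper for illustration only.
    """
    code = ord(ch)
    return code, f"U+{code:04X}", code < 128

for ch in samples:
    code, notation, is_ascii = describe_code_point(ch)
    print(f"{ch}: {code} ({notation}), ASCII: {is_ascii}")
```

Only `A` falls inside the ASCII range; the other three values are Unicode code points.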


---
## 3. Unicode / UTF-8 Support

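- PostgreSQL supports `UTF-8` as a database encoding, allowing emojis and accented characters.
- You can store, query, and manipulate these characters just like any other string. `LENGTH()` counts characters, not bytes, so each sample below has length 1.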

In [6]:
%%sql
SELECT sample_text, LENGTH(sample_text) AS length_utf8
FROM char_examples;

 * postgresql://fahad:***@localhost:5432/people
4 rows affected.


sample_text,length_utf8
A,1
é,1
😊,1
ñ,1
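
Every row has length 1 because `LENGTH()` counts characters rather than bytes. A minimal Python sketch of the same distinction follows; `char_and_byte_length` is a hypothetical helper introduced here for illustration. On the PostgreSQL side, the byte count would come from `OCTET_LENGTH()`.

```python
# Sample characters mirroring the rows of char_examples.
samples = ['A', 'é', '😊', 'ñ']

def char_and_byte_length(text):
    """Return (number of code points, number of UTF-8 bytes) for text.

    Hypothetical helper for illustration only.
    """
    return len(text), len(text.encode('utf-8'))

for s in samples:
    chars, nbytes = char_and_byte_length(s)
    print(f"{s}: {chars} character(s), {nbytes} byte(s) in UTF-8")
```

The character count stays at 1 for every sample, while the UTF-8 byte count ranges from 1 (`A`) to 4 (`😊`).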


---
## 4. Python Character Functions

- `ord(char)` → get Unicode code point of a character
- `chr(code)` → get character from Unicode code point
- `.encode('utf-8')` → convert string to bytes
- `.decode('utf-8')` → convert bytes back to string

In [7]:
# Python examples
chars = ['A', 'é', '😊', 'ñ']
for c in chars:
    code = ord(c)
    char_again = chr(code)
    bytes_val = c.encode('utf-8')
    str_val = bytes_val.decode('utf-8')
    print(f"Char: {c}, ord: {code}, chr: {char_again}, bytes: {bytes_val}, decoded: {str_val}")

Char: A, ord: 65, chr: A, bytes: b'A', decoded: A
Char: é, ord: 233, chr: é, bytes: b'\xc3\xa9', decoded: é
Char: 😊, ord: 128522, chr: 😊, bytes: b'\xf0\x9f\x98\x8a', decoded: 😊
Char: ñ, ord: 241, chr: ñ, bytes: b'\xc3\xb1', decoded: ñ
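
The round trip above works because the same codec is used in both directions. As a rough sketch of what can go wrong otherwise (the variable names are illustrative assumptions), the example below decodes UTF-8 bytes with the wrong codec and then decodes a truncated byte sequence.

```python
# UTF-8 bytes for 'é' (two bytes: 0xC3 0xA9).
raw = 'é'.encode('utf-8')

# Decoding with the wrong codec does not raise; it silently yields mojibake.
wrong = raw.decode('latin-1')
print(repr(wrong))

# A truncated multi-byte sequence is invalid UTF-8.
truncated = raw[:1]
try:
    truncated.decode('utf-8')  # strict mode is the default
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True
print("strict decode failed:", strict_failed)

# errors='replace' substitutes U+FFFD (the replacement character) instead of raising.
replaced = truncated.decode('utf-8', errors='replace')
print(repr(replaced))
```

Strict decoding is usually the safer default, since `errors='replace'` hides data loss rather than preventing it.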


---
## Notes

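- In a UTF-8 database, PostgreSQL `ASCII()` and `CHR()` behave like Python `ord()` and `chr()`: both pairs work with Unicode code points. One difference is that `ASCII()` looks only at the first character of its argument, while `ord()` requires a single-character string.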
- UTF-8 support ensures you can store emojis, accented characters, and non-Latin scripts.
- `.encode()` / `.decode()` are essential when working with bytes in Python.
- Always check that your database encoding is `UTF-8` (for example with `SHOW server_encoding;`) to prevent character corruption.