# Regular Expressions: Quick Guide & Cheat Sheet

This notebook is a quick reference for the most common regular expression syntax in PostgreSQL, using the `~` match operator and the regex-aware functions `SUBSTRING` and `REGEXP_REPLACE`.

In [1]:
%load_ext sql

In [2]:
%sql postgresql://fahad:secret@localhost:5432/people

---
## Core Syntax Cheat Sheet

| Symbol | Name | Description & Example |
| :--- | :--- | :--- |
| `.` | Wildcard | Matches any single character. `h.t` matches 'hot', 'hat'. |
| `^` | Start Anchor | Asserts position at the start of the string. `^A` matches 'Apple'. |
| `$` | End Anchor | Asserts position at the end of the string. `\.$` matches a string ending in a period. |
| `*` | Quantifier | Matches the preceding element zero or more times. `ab*c` matches 'ac', 'abc', 'abbc'. |
| `+` | Quantifier | Matches the preceding element one or more times. `ab+c` matches 'abc', 'abbc'. |
| `?` | Quantifier | Matches the preceding element zero or one time. `colou?r` matches 'color', 'colour'. |
| `{n,m}` | Quantifier | Matches the preceding element between n and m times. `[0-9]{2,4}` matches 2 to 4 digits. |
| `[]` | Character Class | Matches any single character within the brackets. `[aeiou]` matches any vowel. |
| `[^]` | Negated Class | Matches any single character NOT in the brackets. `[^0-9]` matches any non-digit. |
| `()` | Group | Groups a subexpression so a quantifier applies to the whole group, and captures what it matched. `(abc)+` matches 'abc', 'abcabc'. |
| `\|` | Alternation (OR) | Matches either the expression before or after. `cat\|dog` matches 'cat' or 'dog'. |
| `\` | Escape | Escapes a special character. `\.` matches a literal period. |
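
The examples in the table can be checked directly with the `~` operator, which returns a boolean. The query below is a minimal sketch, assuming it is run as plain SQL against a PostgreSQL server with default string settings; the column aliases are illustrative and not part of the original notebook. Patterns are anchored with `^` and `$` where a whole-string match is intended, because `~` otherwise succeeds on any matching substring.

```sql
-- Each expression evaluates to true unless noted otherwise.
SELECT
    'hot'    ~ 'h.t'         AS wildcard,
    'Apple'  ~ '^A'          AS start_anchor,
    'Done.'  ~ '\.$'         AS end_anchor,
    'ac'     ~ '^ab*c$'      AS star,          -- zero 'b' characters is allowed
    'ac'     ~ '^ab+c$'      AS plus,          -- false: at least one 'b' is required
    'colour' ~ '^colou?r$'   AS optional,
    'abcabc' ~ '^(abc)+$'    AS grouped,
    'dog'    ~ '^(cat|dog)$' AS alternation,
    'x7'     ~ '[^0-9]'      AS negated_class; -- true: 'x' is a non-digit
```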

---
## Common Ready-to-Use Patterns

In [3]:
%%sql
DROP TABLE IF EXISTS regex_patterns CASCADE;
CREATE TABLE regex_patterns (id SERIAL, test_string TEXT);
INSERT INTO regex_patterns (test_string) VALUES
('My email is test@example.com'),
('Date: 2025-09-21'),
('Another date format is 21/09/2025'),
('Not an email: test@localhost');

 * postgresql://fahad:***@localhost:5432/people
Done.
Done.
4 rows affected.


[]

In [4]:
%%sql
-- Match a simple email pattern
SELECT test_string FROM regex_patterns WHERE test_string ~ '\w+@\w+\.\w+';

 * postgresql://fahad:***@localhost:5432/people
1 rows affected.


test_string
My email is test@example.com
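
The `~` operator only tests whether a row matches. To return the matched text itself, the same pattern can be passed to `SUBSTRING`. This is a sketch that assumes the `regex_patterns` table created earlier; the `email` alias is illustrative.

```sql
-- Return only the matched e-mail text rather than the whole row.
SELECT SUBSTRING(test_string FROM '\w+@\w+\.\w+') AS email
FROM regex_patterns
WHERE test_string ~ '\w+@\w+\.\w+';
-- With the four rows inserted above, this should yield: test@example.com
```

The pattern is deliberately simple: it requires a dot after the `@`, which is why 'test@localhost' is not matched, and it makes no attempt to validate real-world addresses.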


In [5]:
%%sql
-- Match a YYYY-MM-DD date format
SELECT test_string FROM regex_patterns WHERE test_string ~ '[0-9]{4}-[0-9]{2}-[0-9]{2}';

 * postgresql://fahad:***@localhost:5432/people
0 rows affected.


test_string
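
Given the inserted rows, this pattern would be expected to match 'Date: 2025-09-21', so the empty result above is surprising. One possible cause, an assumption rather than something the notebook confirms, is that some versions of the `%%sql` magic expand `{...}` as Python expressions before sending the query, so `{4}` would reach PostgreSQL as a literal `4`. A brace-free sketch of the same pattern sidesteps that possibility:

```sql
-- Brace-free equivalent of [0-9]{4}-[0-9]{2}-[0-9]{2}
SELECT test_string
FROM regex_patterns
WHERE test_string ~ '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]';
-- With the rows inserted above, this should return: Date: 2025-09-21
```

The row 'Another date format is 21/09/2025' is not returned because it uses slashes and a different field order.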


---
## Regex in SQL Functions

In [8]:
%%sql
-- Extract the domain name from an email using SUBSTRING
-- (when the pattern contains parentheses, the text matched by the first group is returned)
SELECT SUBSTRING('Contact me at user@example.com' FROM '@([^ ]+)');

 * postgresql://fahad:***@localhost:5432/people
1 rows affected.


substring
example.com
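
`SUBSTRING` returns only the first parenthesized group. When several groups are needed, `regexp_match` returns all of them as a text array. The sketch below assumes PostgreSQL 10 or later, where `regexp_match` is available; the aliases are illustrative.

```sql
-- regexp_match returns a text[] with one element per capture group.
SELECT
    (regexp_match('Contact me at user@example.com', '(\w+)@([^ ]+)'))[1] AS local_part,
    (regexp_match('Contact me at user@example.com', '(\w+)@([^ ]+)'))[2] AS domain;
-- local_part = 'user', domain = 'example.com'
```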


In [9]:
%%sql
-- Censor every digit in a string using REGEXP_REPLACE (the 'g' flag replaces all matches, not just the first)
SELECT REGEXP_REPLACE('Order 123 for item 4567', '[0-9]', 'X', 'g');

 * postgresql://fahad:***@localhost:5432/people
1 rows affected.


regexp_replace
Order XXX for item XXXX
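
Two further details of `REGEXP_REPLACE` are worth a short sketch: without the `'g'` flag only the first match is replaced, and the replacement string can refer to capture groups as `\1`, `\2`, and so on. The aliases below are illustrative.

```sql
SELECT
    -- No flag: only the first run of digits is replaced.
    REGEXP_REPLACE('Order 123 for item 4567', '[0-9]+', '#')      AS first_only, -- 'Order # for item 4567'
    -- 'g' flag: every run of digits is replaced.
    REGEXP_REPLACE('Order 123 for item 4567', '[0-9]+', '#', 'g') AS all_runs,   -- 'Order # for item #'
    -- Backreferences reorder the captured year, month, and day.
    REGEXP_REPLACE('2025-09-21', '([0-9]+)-([0-9]+)-([0-9]+)', '\3/\2/\1') AS reordered; -- '21/09/2025'
```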
