# Regular Expressions in PostgreSQL

This notebook is a tutorial on pattern matching with regular expressions (regex) in PostgreSQL. It covers the core concepts in order: basic matching with `~`, anchors, character classes, quantifiers, grouping and alternation, and case-insensitive matching with `~*`.

In [1]:
%load_ext sql

In [2]:
%sql postgresql://fahad:secret@localhost:5432/people

---
## 1. Setup: Create a Sample Table

We need some text data to practice our regex patterns on. Let's create a `user_comments` table.

In [3]:
%%sql
DROP TABLE IF EXISTS user_comments CASCADE;
CREATE TABLE user_comments (
    id SERIAL PRIMARY KEY,
    comment_text TEXT
);
INSERT INTO user_comments (comment_text) VALUES
('This is a great product! I love it.'),
('The price is 100 dollars.'),
('Contact me at user@example.com for more info.'),
('Another email is test.user@domain.co.uk.'),
('The cat sat on the mat.'),
('What a cool feature.'),
('Error code: 404 not found.');

 * postgresql://fahad:***@localhost:5432/people
Done.
Done.
7 rows affected.


[]

---
## 2. Basic Matching with `~`

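The `~` operator returns true when a string contains a match for a POSIX regular expression. The match is case-sensitive and unanchored, so the pattern may occur anywhere in the string.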

In [12]:
%%sql 
SELECT comment_text FROM user_comments WHERE comment_text ~ 'cat';

 * postgresql://fahad:***@localhost:5432/people
1 rows affected.


comment_text
The cat sat on the mat.
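
Because `~` is an ordinary boolean operator, it can also be evaluated directly on string literals, which is a quick way to try a pattern before using it in a `WHERE` clause. The following is a minimal sketch; the column aliases are illustrative and not part of the original notebook.

```sql
-- Evaluate ~ on literals; no table is needed.
SELECT
    'The cat sat on the mat.' ~ 'cat' AS contains_cat,        -- true: 'cat' occurs inside the string
    'The cat sat on the mat.' ~ 'Cat' AS contains_upper_cat;  -- false: ~ is case-sensitive
```

It can be run in a `%%sql` cell or in `psql`.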


---
## 3. Anchors: `^` and `$`

- `^`: Matches the beginning of the string.
- `$`: Matches the end of the string.

In [13]:
%%sql
-- Find comments that start with 'The'
SELECT comment_text FROM user_comments WHERE comment_text ~ '^The';

 * postgresql://fahad:***@localhost:5432/people
2 rows affected.


comment_text
The price is 100 dollars.
The cat sat on the mat.


In [14]:
%%sql
-- Find comments that end with a period; '\.' escapes the dot so it is matched literally
SELECT comment_text FROM user_comments WHERE comment_text ~ '\.$';

 * postgresql://fahad:***@localhost:5432/people
7 rows affected.


comment_text
This is a great product! I love it.
The price is 100 dollars.
Contact me at user@example.com for more info.
Another email is test.user@domain.co.uk.
The cat sat on the mat.
What a cool feature.
Error code: 404 not found.
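
Using both anchors together forces the pattern to cover the entire string rather than a part of it. A small sketch on string literals (the aliases are illustrative):

```sql
-- '^...$' must match from the first character to the last.
SELECT
    'The cat'     ~ '^The cat$' AS exact_match,  -- true
    'The cat sat' ~ '^The cat$' AS with_suffix,  -- false: extra text before the end
    'Oh, The cat' ~ '^The cat$' AS with_prefix;  -- false: extra text after the start
```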


---
## 4. Character Classes `[]` and Wildcard `.`

- `.` : Matches any single character.
- `[abc]`: Matches any one of 'a', 'b', or 'c'.
- `[a-z]`: Matches any single lowercase letter from 'a' to 'z'.
- `[^abc]`: Matches any single character that is NOT 'a', 'b', or 'c'.

In [15]:
%%sql
-- Find comments containing 'c', then any single character, then 't' (cat, cot, etc.)
SELECT comment_text FROM user_comments WHERE comment_text ~ 'c.t';

 * postgresql://fahad:***@localhost:5432/people
1 rows affected.


comment_text
The cat sat on the mat.


In [16]:
%%sql
-- Find comments that contain at least one digit
SELECT comment_text FROM user_comments WHERE comment_text ~ '[0-9]';

 * postgresql://fahad:***@localhost:5432/people
2 rows affected.


comment_text
The price is 100 dollars.
Error code: 404 not found.
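
The negated class `[^abc]` and the difference between a wildcard dot and an escaped dot are not exercised by the queries above. The sketch below tries them on literals. It assumes the default `standard_conforming_strings = on`, so the backslash reaches the regex engine unchanged; the aliases are illustrative.

```sql
SELECT
    'abc'  ~ '[^abc]' AS only_abc,      -- false: every character is a, b, or c
    'abcd' ~ '[^abc]' AS has_other,     -- true: 'd' is outside the set
    '3x14' ~ '3.14'   AS dot_wildcard,  -- true: '.' matches the 'x'
    '3x14' ~ '3\.14'  AS dot_literal;   -- false: '\.' matches only a literal period
```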


---
## 5. Quantifiers: `*`, `+`, `?`, `{}`

- `*`: Zero or more times.
- `+`: One or more times.
- `?`: Zero or one time.
- `{n}`: Exactly n times.
- `{n,}`: At least n times.
- `{n,m}`: Between n and m times (inclusive).

In [17]:
%%sql
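-- Find comments with two or more 'o's in a row.
-- Note: 'What a cool feature.' contains 'oo', so one row is expected here. The empty
-- result recorded below is probably a notebook artifact: the %%sql magic can expand
-- brace expressions as Python before sending the query. Doubling the braces
-- ('o{{2,}}') is the usual workaround.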
SELECT comment_text FROM user_comments WHERE comment_text ~ 'o{2,}';

 * postgresql://fahad:***@localhost:5432/people
0 rows affected.


comment_text
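
The `*`, `+`, and `?` quantifiers can be compared side by side on literals. This is a minimal sketch; the patterns are anchored so that each result depends only on the quantifier, and the aliases are illustrative.

```sql
SELECT
    'color'   ~ '^colou?r$' AS zero_u,     -- true: '?' allows the 'u' to be absent
    'colour'  ~ '^colou?r$' AS one_u,      -- true: one 'u' is allowed
    'colouur' ~ '^colou?r$' AS two_u,      -- false: '?' allows at most one 'u'
    'ct'      ~ '^ca*t$'    AS star_zero,  -- true: '*' accepts zero 'a's
    'ct'      ~ '^ca+t$'    AS plus_zero,  -- false: '+' needs at least one 'a'
    'caaat'   ~ '^ca+t$'    AS plus_many;  -- true: '+' accepts repeated 'a's
```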


---
## 6. Grouping `()` and Alternation `|`

- `()`: Groups expressions together.
- `|`: Acts as an OR operator.

In [18]:
%%sql
-- Find comments that contain 'cat' or 'dog' (no dogs in our data)
SELECT comment_text FROM user_comments WHERE comment_text ~ '(cat|dog)';

 * postgresql://fahad:***@localhost:5432/people
1 rows affected.


comment_text
The cat sat on the mat.
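
Grouping also controls what a quantifier or an anchor applies to. Alternation has the lowest precedence, so `^The|What` would mean "starts with The, or contains What anywhere"; the parentheses in `^(The|What)` apply the anchor to both alternatives. A short sketch, with illustrative aliases:

```sql
-- A quantifier after a group repeats the whole group.
SELECT
    'abab' ~ '^(ab)+$' AS grouped,    -- true: the group 'ab' repeats
    'abab' ~ '^ab+$'   AS ungrouped;  -- false: only the 'b' is repeated

-- Alternation inside a group, anchored to the start of the comment.
SELECT comment_text FROM user_comments WHERE comment_text ~ '^(The|What) ';
```

Against the sample data, the second query should return the two comments that start with 'The' and the one that starts with 'What'.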


---
## 7. Case-Insensitive Matching with `~*`

The `~*` operator works like `~` but ignores letter case when matching.

In [19]:
%%sql
-- Find comments starting with 'the' in any letter case ('The', 'THE', ...)
SELECT comment_text FROM user_comments WHERE comment_text ~* '^the';

 * postgresql://fahad:***@localhost:5432/people
2 rows affected.


comment_text
The price is 100 dollars.
The cat sat on the mat.
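
To see the difference between the two operators in isolation, the same pattern can be applied to literals with `~` and `~*`. A minimal sketch, with illustrative aliases:

```sql
SELECT
    'The cat' ~  '^the' AS case_sensitive,    -- false: 'T' does not equal 't'
    'The cat' ~* '^the' AS case_insensitive,  -- true: case is ignored
    'THE CAT' ~* '^the' AS upper_input;       -- true: case is ignored
```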
