Name	Name	Last commit message	Last commit date
parent directory ..
01_basic_regular_expressions	01_basic_regular_expressions
basic_regular_expressions.md	basic_regular_expressions.md
readme.md	readme.md

🌟 Understanding Regular Expressions

Regular Expressions (often abbreviated as regex or regexp) are like magic wands for text! 🪄 They are special patterns that help programmers and applications perform powerful text manipulations. You can use regex to:

✅ Validate Input: Check if an input follows a specific pattern (e.g., confirming an email format).
🔍 Find and Extract: Search for text matching a pattern in a larger body of text.
✂️ Replace Text: Substitute text that matches a pattern with something new.
📑 Split Text: Break down a block of text into smaller pieces based on a pattern.

🤕 Beware: Regex can be tricky! If not used carefully, they can make your code confusing and difficult to debug!

📜 History of "Regular Expressions"

The term "regular expression" comes from the fields of mathematics and computer science. It originally referred to expressions with a specific property called regularity. Here’s a brief history:

🏛️ Old-School Regex: The earliest versions of regex were implemented using something called a Deterministic Finite Automaton (DFA). Think of DFA as a path with no backtracking—once you move forward, there's no going back! 🔄
⚡ Modern-Day Regex: Today's popular regex, especially Perl-style regex, don't follow the old mathematical definition strictly. Instead, they use a Nondeterministic Finite Automaton (NFA), which involves backtracking (going back to recheck previous paths). This allows for more flexibility and power, but also more complexity!

📝 Note: Some computer scientists get a bit upset when modern regex is called "regular" since it technically isn't by their standards. But for us practical programmers, what matters is knowing how to wield this tool effectively! 💪

💡 Why Use Regular Expressions?

When used wisely, regular expressions can make your life as a programmer a lot easier! Here’s why:

🚀 Efficiency: With regex, tasks that would normally take hundreds of lines of code can be accomplished in just a few! For example, extracting all email addresses from a document might need a lot of complex procedural code—but with regex, it’s a breeze! 🌬️
🧩 Simplicity: Regex can simplify many text-processing tasks that would otherwise be tedious and hard to maintain.

However, remember the saying:

“Some people, when confronted with a problem, think ‘I know, I’ll use regular expressions.’ Now they have two problems.” 😅

This joke highlights a truth: regex can sometimes complicate things if overused or applied in the wrong context. That’s why you need to read this section carefully to master regex and avoid these pitfalls! 📖

🎨 The Many Flavors of Regular Expressions

So, what exactly are regular expressions? 🤔 Surprise! There isn't a single answer. Unlike some concepts with strict definitions, regex lacks a universal standard, leading to various flavors.

🍦 Perl-Style Regex: The most common and popular flavor is inspired by the Perl programming language. These "Perl-style" regular expressions are the basis for most modern regex patterns. While they are mostly similar and compatible, small differences can exist between implementations.
✍️ Shortcut Terms: Instead of always saying "regular expression," developers often use the shorthand regex or regexp for singular and regexes for plural.

🛠️ How to Use the `re` Module

Creating a Pattern 🧩:
A regular expression (RE) is formed using a pattern string. Patterns define what kind of text you're looking for. For example, a pattern could look for any sequence of digits or specific characters in a text.
The compile() Function ⚙️:
The re module provides a compile() function to create a regex object from a pattern. This object allows you to perform different regex operations efficiently.
Regex Object Methods 🔍:
Once you have a regex object, you can use its methods to interact with text:
- search(): Finds the first occurrence of the pattern in the text.
- match(): Checks if the pattern matches from the start of the text.
- findall(): Returns all occurrences of the pattern in the text.
- sub(): Replaces the matched pattern with a new string.
Shortcut Functions ⚡:
The re module also has shortcut functions like re.search(), re.match(), re.findall(), and re.sub() that can be used without creating a regex object.

📋 Table Regular Expression Pattern Syntax

🔤 Element	📖 Meaning
`.`	Matches any single character except a newline (`\n`). This is like a wildcard that can be any letter, number, or symbol.
`^`	Matches the start of a string. If `MULTILINE` mode is enabled, it also matches the position right after a newline (`\n`).
`$`	Matches the end of a string. In `MULTILINE` mode, it also matches the position right before a newline (`\n`).
*``**	Matches zero or more occurrences of the preceding pattern; it's greedy (tries to match as much as possible).
`+`	Matches one or more occurrences of the preceding pattern; greedy.
`?`	Matches zero or one occurrence of the preceding pattern; greedy. It is useful for optional characters or patterns.
*`?`, `+?`, `??`**	These are nongreedy (or lazy) versions of ``, `+`, and `?` respectively. They match as few characters as possible*.
`{m,n}`	Matches between `m` and `n` occurrences of the preceding pattern; greedy. You can use it to specify exact repetition.
`{m,n}?`	Matches between `m` and `n` occurrences of the preceding pattern; nongreedy.
`[...]`	Matches any one character from a set inside the brackets. For example, `[abc]` matches `a`, `b`, or `c`.
`[^...]`	Matches any one character not in the set inside the brackets. For example, `[^abc]` matches any character except `a`, `b`, or `c`.
**`	`**
`(...)`	Matches the pattern inside the parentheses and creates a capturing group. Use it to group patterns for extraction or backreferences.
`(?:...)`	Matches the pattern inside the parentheses but does not create a capturing group. Useful for grouping without capturing.
`(?P<id>...)`	Matches the pattern and names the group as `id`. Named groups make patterns more readable and easier to reference.
`(?P=id)`	Matches the same text as previously matched by the named group `id`. Use it to refer back to the named group.
`(?#...)`	A comment inside the regex pattern; helps improve readability without affecting the match.
`(?=...)`	Positive lookahead: Matches if the pattern matches what comes next, but does not consume any characters. Used to assert that something follows.
`(?!...)`	Negative lookahead: Matches if the pattern does not match what comes next, without consuming any characters. Useful for excluding patterns.
`(?<=...)`	Positive lookbehind: Matches if the pattern precedes the current position; must match a fixed length. Ensures something comes before the match.
`(?<!...)`	Negative lookbehind: Matches if the pattern does not precede the current position; must match a fixed length.
`\number`	Matches the text previously matched by the group numbered `number`. Useful for reusing matched patterns.
`\A`	Matches the start of the entire string. Unlike `^`, it is not affected by `MULTILINE`.
`\b`	Matches a word boundary (the start or end of a word). Helps match whole words only.
`\B`	Matches a position that is not a word boundary. Useful for ensuring a match occurs within a word.
`\d`	Matches a digit (equivalent to `[0-9]`). In Unicode mode, it matches any Unicode digit.
`\D`	Matches a non-digit (equivalent to `[^0-9]`).
`\s`	Matches a whitespace character (spaces, tabs, newlines, etc.).
`\S`	Matches a non-whitespace character.
`\w`	Matches an alphanumeric character (letters, digits, underscore). Equivalent to `[a-zA-Z0-9_]`.
`\W`	Matches a non-alphanumeric character.
`\Z`	Matches the end of the entire string. Unlike `$`, it is not affected by `MULTILINE`.
`\\`	Matches a literal backslash (`\`) character. Useful for escaping special characters.
`(?iLmsux)`	Alternate way to set optional flags within a pattern. Flags control the regex behavior (case insensitivity, multiline mode, etc.).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

20_regular_expressions

20_regular_expressions

readme.md

🌟 Understanding Regular Expressions

📜 History of "Regular Expressions"

💡 Why Use Regular Expressions?

🎨 The Many Flavors of Regular Expressions

🛠️ How to Use the `re` Module

📋 Table Regular Expression Pattern Syntax

Files

20_regular_expressions

Directory actions

More options

Directory actions

More options

Latest commit

History

20_regular_expressions

Folders and files

parent directory

readme.md

🌟 Understanding Regular Expressions

📜 History of "Regular Expressions"

💡 Why Use Regular Expressions?

🎨 The Many Flavors of Regular Expressions

🛠️ How to Use the re Module

📋 Table Regular Expression Pattern Syntax

🛠️ How to Use the `re` Module