# Basics of Regular Expressions (Regex)

Regular expressions (regex) are patterns used to match character combinations in strings. They are powerful tools for searching, editing, and manipulating text.

## What is a Regular Expression?

A regular expression is a sequence of characters that defines a search pattern. It can be used for string matching, searching, and replacing.

### Basic Syntax

- **Literal Characters**: Match themselves. For example, `a` matches the character 'a'.
- **Dot (`.`)**: Matches any single character except newline.
- **Caret (`^`)**: Matches the start of a string (or the start of each line in multiline mode).
- **Dollar (`$`)**: Matches the end of a string (or the end of each line in multiline mode).
- **Asterisk (`*`)**: Matches 0 or more repetitions of the preceding element.
- **Plus (`+`)**: Matches 1 or more repetitions of the preceding element.
- **Question Mark (`?`)**: Matches 0 or 1 repetition of the preceding element.
- **Braces (`{}`)**: Matches a specific number of repetitions of the preceding element. For example, `a{3}` matches 'aaa'.
- **Square Brackets (`[]`)**: Matches any one of the enclosed characters. For example, `[abc]` matches 'a', 'b', or 'c'.
- **Parentheses (`()`)**: Groups multiple tokens together and captures the matched text so it can be referenced later.
- **Pipe (`|`)**: Matches either the pattern before or the pattern after the `|`.
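
The syntax elements above can be tried out with a short Python sketch. The sample strings are made up for illustration, and `re.findall()` is used only as a convenient way to list what each pattern matches.

```python
import re

# Each pair is (pattern, sample text); the sample strings are made up for illustration.
examples = [
    (r"c.t", "cat cut c\nt"),        # dot: any one character except newline
    (r"^hello", "hello world"),      # caret: match only at the start
    (r"world$", "hello world"),      # dollar: match only at the end
    (r"ab*", "a ab abbb"),           # asterisk: 'a' followed by zero or more 'b'
    (r"ab+", "a ab abbb"),           # plus: 'a' followed by one or more 'b'
    (r"colou?r", "color colour"),    # question mark: the 'u' is optional
    (r"a{3}", "aa aaa"),             # braces: exactly three 'a'
    (r"[abc]", "xaybzc"),            # square brackets: any one of a, b, c
    (r"cat|dog", "cat dog bird"),    # pipe: either alternative
]

# Map each pattern to the list of matches found in its sample text.
results = {pattern: re.findall(pattern, text) for pattern, text in examples}
for pattern, found in results.items():
    print(pattern, "->", found)

# Parentheses capture parts of the match for later use.
group_match = re.search(r"(\w+)@(\w+)", "user@example")
print(group_match.groups())
```

Comparing the output for `ab*` and `ab+` on the same text shows the difference between "zero or more" and "one or more".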

### Special Characters

- **Backslash (`\`)**: Escapes a special character. For example, `\\` matches a single backslash.
- **Word Character (`\w`)**: Matches any word character (alphanumeric + underscore).
- **Digit (`\d`)**: Matches any digit.
- **Whitespace (`\s`)**: Matches any whitespace character (spaces, tabs, line breaks).
- **Non-Word Character (`\W`)**: Matches any character that is not a word character.
- **Non-Digit (`\D`)**: Matches any character that is not a digit.
- **Non-Whitespace (`\S`)**: Matches any character that is not a whitespace character.
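
As a rough illustration of these character classes, the sketch below runs each one against a made-up sample string. The string and variable names are hypothetical and chosen only for this example.

```python
import re

# Hypothetical sample text containing digits, an underscore, a colon, spaces, and a tab.
text = "Order 66 shipped_to: Bay 7\tdock"

digits = re.findall(r"\d+", text)        # runs of digits
words = re.findall(r"\w+", text)         # runs of word characters (letters, digits, underscore)
whitespace = re.findall(r"\s", text)     # individual whitespace characters
non_word = re.findall(r"\W", text)       # anything that is not a word character
digits_only = re.sub(r"\D", "", text)    # delete every non-digit
chunks = re.findall(r"\S+", text)        # runs of non-whitespace characters

# An escaped backslash matches a literal backslash in the text.
backslashes = re.findall(r"\\", r"C:\temp\new")

print(digits, words, whitespace, non_word, digits_only, chunks, backslashes, sep="\n")
```

Writing patterns as raw strings (`r"..."`) keeps Python from interpreting the backslashes before the `re` module sees them.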

# Quick Recap

Visit <https://regexone.com/> and complete as many of the lessons as you can.

# Testing

Visit <https://regex101.com/> when you want to test your regex against test data.

## Basic Understanding of the `re` Module in Python

The `re` module in Python's standard library provides functions for working with regular expressions, including searching, matching, and replacing patterns in text.

### Importing the `re` Module

Before you can use the `re` module, you need to import it:

<code>
import re
</code>

### Basic Functions in the `re` Module

1. **`re.search()`**
   - Searches for the first location where the regular expression pattern matches in the string.
   - Returns a match object if a match is found, or `None` if no match is found.

   <code>
   import re
   pattern = r'\bhello\b'
   text = "hello world"
   match = re.search(pattern, text)
   if match:
       print("Match found:", match.group())
   </code>

2. **`re.match()`**
   - Checks for a match only at the beginning of the string.
   - Returns a match object if the pattern matches at the start of the string, or `None` if no match is found.

   <code>
   import re
   pattern = r'hello'
   text = "hello world"
   match = re.match(pattern, text)
   if match:
       print("Match found:", match.group())
   </code>

3. **`re.findall()`**
   - Finds all non-overlapping matches of the pattern in the string and returns them as a list of strings (or a list of captured groups if the pattern contains groups).
   - Returns an empty list if no matches are found.

   <code>
   import re
   pattern = r'\b\w+\b'
   text = "hello world"
   matches = re.findall(pattern, text)
   print("Matches found:", matches)
   </code>

4. **`re.sub()`**
   - Replaces the matches of the pattern in the string with a specified replacement string.
   - Returns the modified string.

   <code>
   import re
   pattern = r'world'
   replacement = 'Python'
   text = "hello world"
   result = re.sub(pattern, replacement, text)
   print("Modified string:", result)
   </code>

### Summary

The `re` module in Python provides powerful functions for working with regular expressions, allowing you to search, match, find all occurrences, and replace patterns in strings. Here are some of the most commonly used functions:
- `re.search()`: Searches for the first match of the pattern in the string.
- `re.match()`: Checks for a match only at the beginning of the string.
- `re.findall()`: Finds all non-overlapping matches of the pattern in the string.
- `re.sub()`: Replaces the matches of the pattern in the string with a specified replacement.

These functions can be used to perform various text processing tasks efficiently and effectively.
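
To make the differences between these functions concrete, here is a small sketch with made-up input strings. It contrasts `re.match()` with `re.search()` on the same text, and shows how capture groups affect `re.findall()` and `re.sub()`.

```python
import re

text = "say hello world"  # made-up sample text

# re.match() only looks at the start of the string, so it finds nothing here.
at_start = re.match(r"hello", text)

# re.search() scans the whole string and finds the first occurrence.
anywhere = re.search(r"hello", text)
print(at_start, anywhere.span())

# With capture groups, re.findall() returns one tuple of groups per match.
pairs = re.findall(r"(\w+)=(\d+)", "a=1 b=22")
print(pairs)

# In re.sub(), \1 and \2 in the replacement refer to the captured groups.
swapped = re.sub(r"(\w+) (\w+)", r"\2 \1", "hello world")
print(swapped)
```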


In [None]:
import re

In [None]:
# Read email text from the file
with open('email.txt', 'r') as file:
    text = file.read()

# Regular expression to find email addresses
email_pattern = r'[\w.]+@[\w.]+\.[a-zA-Z]{2,}'

# Find all email addresses in the text
email_addresses = re.findall(email_pattern, text)

# Print number of email addresses found
print(len(email_addresses))

# Print the found email addresses
for email in email_addresses:
    print(email)


In [None]:
# Read IP text from the file
with open('ip.txt', 'r') as file:
    text = file.read()

# Regular expression to find IP addresses
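# One possible pattern: four dot-separated groups of 1 to 3 digits
# (IPv4 only; it does not check that each group is at most 255)
ip_pattern = r'\b(?:\d{1,3}\.){3}\d{1,3}\b'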

# Find all IP addresses in the text
ip_addresses = re.findall(ip_pattern, text)

# Print number of IP addresses found
print(len(ip_addresses))

# Print the found IP addresses
for ip in ip_addresses:
    print(ip)

# This is a timed Challenge

In the file `flags.txt`, find the flag. It has the form `FLAG{...}`, with the flag text inside the braces.

For example, the flag could look like `FLAG{You found the flag}`.

Use the cell below to write your code.

In [None]:
# WRITE YOUR CODE HERE