# From a Website

## Import JSON Data from an API

**Steps:**
1. Import the `requests` library
2. Define the API URL
3. Send a GET request with timeout
4. Parse the JSON response
5. Display the data

**Code Example:**
```python
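import requests

url = "your_url"  # placeholder: replace with the API endpoint
resp = requests.get(url, timeout=15)
resp.raise_for_status()  # raise an exception for 4xx/5xx responses
data = resp.json()  # parse the JSON body into Python objects
print(data)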
```

In [1]:
# Example
import requests
url = "https://jsonplaceholder.typicode.com/todos"
resp = requests.get(url, timeout=15)
data = resp.json()
print(data)

[{'userId': 1, 'id': 1, 'title': 'delectus aut autem', 'completed': False}, {'userId': 1, 'id': 2, 'title': 'quis ut nam facilis et officia qui', 'completed': False}, {'userId': 1, 'id': 3, 'title': 'fugiat veniam minus', 'completed': False}, {'userId': 1, 'id': 4, 'title': 'et porro tempora', 'completed': True}, {'userId': 1, 'id': 5, 'title': 'laboriosam mollitia et enim quasi adipisci quia provident illum', 'completed': False}, {'userId': 1, 'id': 6, 'title': 'qui ullam ratione quibusdam voluptatem quia omnis', 'completed': False}, {'userId': 1, 'id': 7, 'title': 'illo expedita consequatur quia in', 'completed': False}, {'userId': 1, 'id': 8, 'title': 'quo adipisci enim quam ut ab', 'completed': True}, {'userId': 1, 'id': 9, 'title': 'molestiae perspiciatis ipsa', 'completed': False}, {'userId': 1, 'id': 10, 'title': 'illo est ratione doloremque quia maiores aut', 'completed': True}, {'userId': 1, 'id': 11, 'title': 'vero rerum temporibus dolor', 'completed': True}, {'userId': 1, 'i
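Because `resp.json()` returns ordinary Python objects (here, a list of dictionaries), the result can be filtered with standard tools. The following is a minimal sketch that uses the first four records from the output above as a stand-in for a live request; the helper name `completed_titles` is hypothetical and not part of the original notebook.

```python
# Sample copied from the first four records printed above
# (stands in for the value returned by resp.json()).
data = [
    {"userId": 1, "id": 1, "title": "delectus aut autem", "completed": False},
    {"userId": 1, "id": 2, "title": "quis ut nam facilis et officia qui", "completed": False},
    {"userId": 1, "id": 3, "title": "fugiat veniam minus", "completed": False},
    {"userId": 1, "id": 4, "title": "et porro tempora", "completed": True},
]


def completed_titles(todos):
    """Return the titles of all to-do items marked as completed."""
    return [item["title"] for item in todos if item.get("completed")]


print(completed_titles(data))
```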

# Import Text File from Local Storage

## Reading a Local Text File

**Steps:**
1. Use `open()` function with the file path
2. Specify read mode `'r'`
3. Read the file contents using `read()`
4. Use `with` statement for automatic file closing

**Code Example:**
```python
with open("your text file", "r") as f:
    content = f.read()
    print(content)
```
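Without an `encoding` argument, `open()` falls back to a platform-dependent default, which can garble non-ASCII characters. The sketch below passes the encoding explicitly and reads the file line by line. It assumes the file is UTF-8 (the notebook does not state the encoding), and `read_lines` is a hypothetical helper. It runs on a throwaway file so that it does not depend on `Email.txt`.

```python
import os
import tempfile


def read_lines(path, encoding="utf-8"):
    """Return the file's lines without trailing newlines.

    The UTF-8 default is an assumption; pass the real encoding if it differs.
    """
    with open(path, "r", encoding=encoding) as f:
        # Iterating over the file object yields one line at a time.
        return [line.rstrip("\n") for line in f]


# Demo on a temporary file so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.txt")
    with open(demo_path, "w", encoding="utf-8") as f:
        f.write("Dave Martin\n615-555-7164\n")
    demo_lines = read_lines(demo_path)

print(demo_lines)
```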

In [4]:
with open("Email.txt", "r") as f:
    content = f.read()
    print(content)

Dave Martin
615-555-7164
173 Main St., Springfield RI 55924
davemartin@bogusemail.com

Charles Harris
800-555-5669
969 High St., Atlantis VA 34075
charlesharris@bogusemail.com

Eric Williams
560-555-5153
806 1st St., Faketown AK 86847
laurawilliams@bogusemail.com

Corey Jefferson
900-555-9340
826 Elm St., Epicburg NE 10671
coreyjefferson@bogusemail.com

Jennifer Martin-White
714-555-7405
212 Cedar St., Sunnydale CT 74983
jenniferwhite@bogusemail.com

Erick Davis
800-555-6771
519 Washington St., Olympus TN 32425
tomdavis@bogusemail.com

Neil Patterson
783-555-4799
625 Oak St., Dawnstar IL 61914
neilpatterson@bogusemail.com

Laura Jefferson
516-555-4615
890 Main St., Pythonville LA 29947
laurajefferson@bogusemail.com

Maria Johnson
127-555-1867
884 High St., Braavos ME 43597
mariajohnson@bogusemail.com

Michael Arnold
608-555-4938
249 Elm St., Quahog OR 90938
michaelarnold@bogusemail.com

Michael Smith
568-555-6051
619 Park St., Winterfell VA 99000
michaelsmith@bogusemail.com

Erik St

# Filter / Data Storage
## Sorting

- **1. Print only the Gmail addresses.**
- **2. Search for a name; if it is found, print the name, phone number, address, and email.**
- **3. Replace names: if the word "..." exists, replace it with "...".**
- **4. If a Gmail address is at least 15 characters long, copy it to another file.**
- **5. Delete the Gmail addresses that contain the letter "y".**
- **6. Insert new data.**
- **7. Check for duplicates; delete any that are found, otherwise continue.**
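Several of these tasks become simpler once the file is parsed into records. The sketch below assumes the layout shown in the `Email.txt` output earlier: four lines per contact (name, phone, address, email), with a blank line between contacts. The function name `parse_records` and the dictionary keys are hypothetical.

```python
sample = """Dave Martin
615-555-7164
173 Main St., Springfield RI 55924
davemartin@bogusemail.com

Charles Harris
800-555-5669
969 High St., Atlantis VA 34075
charlesharris@bogusemail.com
"""


def parse_records(text):
    """Split contact text into dicts with name, phone, address, and email keys."""
    records = []
    # Records are assumed to be separated by a blank line.
    for block in text.strip().split("\n\n"):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        if len(lines) == 4:  # skip incomplete blocks rather than guess
            name, phone, address, email = lines
            records.append(
                {"name": name, "phone": phone, "address": address, "email": email}
            )
    return records


records = parse_records(sample)
print(records[1]["email"])
```

With records in this form, a name search becomes a loop over `records` comparing `record["name"]`, instead of counting line offsets.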


In [None]:
# 1. Filter and save only Gmail addresses to new file
with open("Email.txt", "r") as f:
    lines = f.read().split('\n')

# Keep lines containing "gmail" (case-insensitive); such lines are never blank
gmail_lines = [line.strip() for line in lines if 'gmail' in line.lower()]

# Save to new file
with open("Gmail_Only.txt", "w") as output_file:
    for line in gmail_lines:
        output_file.write(line + "\n")

print(f"Saved {len(gmail_lines)} Gmail addresses to Gmail_Only.txt")

In [12]:
# 2. Search for a name and print the lines below it
search_name = "Jane Stuart"

with open("Email.txt", "r") as f:
    lines = f.readlines()
    
    for i, line in enumerate(lines):
        if search_name in line:
            print(f"Found '{search_name}' at line {i+1}")
            print("Details:")
            # Print the next 3 lines
            for j in range(i+1, min(i+4, len(lines))):
                print(lines[j].strip())
            break
    else:
        print(f"Name '{search_name}' not found.")

Found 'Jane Stuart' at line 476
Details:
623-555-3006
983 Oak St., Old-town RI 15445
janestuart@bogusemail.com


In [None]:
# 3. Replace names and save to new file
with open("Email.txt", "r") as f:
    content = f.read()

# Replace the name
updated_content = content.replace("John", "Jonathan")

# Write to new file
with open("Email_Names_Updated.txt", "w") as f:
    f.write(updated_content)

print("Replaced 'John' with 'Jonathan' and saved to Email_Names_Updated.txt")

In [None]:
# 4. Gmail - if length >= 15, duplicate it in another file
with open("Email.txt", "r") as f:
    lines = f.readlines()

long_gmails = []
for line in lines:
    if "@gmail.com" in line and len(line.strip()) >= 15:
        long_gmails.append(line.strip())

# Write to new file
with open("Long_Gmails.txt", "w") as f:
    for gmail in long_gmails:
        f.write(gmail + "\n")

print(f"Found {len(long_gmails)} long Gmail addresses and saved to Long_Gmails.txt")

In [None]:
# 5. Delete Gmail with "y" and save to new file
with open("Email.txt", "r") as f:
    lines = f.readlines()

# Filter out lines with Gmail containing "y"
filtered_lines = []
removed_count = 0
for line in lines:
    if "@gmail.com" in line and "y" in line.lower():
        print(f"Removing: {line.strip()}")
        removed_count += 1
    else:
        filtered_lines.append(line)

# Write to new file
with open("Email_No_Y_Gmail.txt", "w") as f:
    f.writelines(filtered_lines)

print(f"Removed {removed_count} Gmail addresses containing 'y' and saved to Email_No_Y_Gmail.txt")

In [None]:
# 6. Insert new data and save to new file
with open("Email.txt", "r") as f:
    original_content = f.read()

new_data = """
Name: Alice Brown
ID: E004
Department: HR
Salary: 65000
Email: alice.brown@company.com
"""

# Combine original and new data
updated_content = original_content + new_data

# Write to new file
with open("Email_With_New_Data.txt", "w") as f:
    f.write(updated_content)

print("Added new employee data and saved to Email_With_New_Data.txt")

Added new employee data and saved to Email_With_New_Data.txt


In [18]:
# 7. Remove duplicate emails and save to new file (ignore spaces)
with open("Email.txt", "r") as f:
    lines = f.readlines()

# Extract and deduplicate only email addresses
seen_emails = set()
unique_lines = []
duplicates_found = 0

for line in lines:
    # Check if line contains an email
    if "@" in line:
        # Extract email and normalize (remove spaces, lowercase)
        email_part = line.strip().replace(" ", "").lower()
        
        if email_part not in seen_emails:
            seen_emails.add(email_part)
            unique_lines.append(line)
        else:
            duplicates_found += 1
            print(f"Duplicate email found: {line.strip()}")
    else:
        # Keep non-email lines as they are
        unique_lines.append(line)

# Write to new file
with open("Email_No_Duplicate_Emails.txt", "w") as f:
    f.writelines(unique_lines)

print(f"Removed {duplicates_found} duplicate emails and saved to Email_No_Duplicate_Emails.txt")

Duplicate email found: laurajefferson@bogusemail.com
Duplicate email found: elizabetharnold@bogusemail.com
Duplicate email found: barbarawilliams@bogusemail.com
Duplicate email found: travisjackson@bogusemail.com
Duplicate email found: jenniferdavis@bogusemail.com
Duplicate email found: jamestaylor@bogusemail.com
Removed 6 duplicate emails and saved to Email_No_Duplicate_Emails.txt
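Note that the cell above drops only the repeated email line, so the name, phone, and address lines of the duplicate contact remain in the output file. If the intent is to remove the whole duplicated contact (an assumption about the goal), a sketch like the following works on blank-line-separated records instead of single lines. The function name `drop_duplicate_records` is hypothetical, and the email is assumed to be the last line of each record.

```python
sample = """Laura Jefferson
516-555-4615
890 Main St., Pythonville LA 29947
laurajefferson@bogusemail.com

Maria Johnson
127-555-1867
884 High St., Braavos ME 43597
mariajohnson@bogusemail.com

Laura Jefferson
516-555-4615
890 Main St., Pythonville LA 29947
laurajefferson@bogusemail.com
"""


def drop_duplicate_records(text):
    """Keep the first record for each email address and drop later repeats."""
    seen = set()
    kept = []
    for block in text.strip().split("\n\n"):
        lines = block.strip().splitlines()
        if not lines:
            continue
        # Normalize the way the cell above does: remove spaces, lowercase.
        key = lines[-1].replace(" ", "").lower()
        if "@" in key and key in seen:
            continue  # duplicate email: skip the whole record
        seen.add(key)
        kept.append("\n".join(lines))
    return "\n\n".join(kept) + "\n"


deduped = drop_duplicate_records(sample)
print(deduped)
```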


# Regular Expressions (Regex) in Python

Regular expressions (regex) are patterns used to match sequences of characters in strings. They are powerful tools for searching, extracting, and manipulating text data.

## Use Cases

- **Validation:** Check if a string matches a pattern (e.g., email, phone number).
- **Extraction:** Find and extract specific data (e.g., emails from a document).
- **Replacement:** Substitute matched patterns with new text.
- **Splitting:** Divide text based on patterns.
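As a rough illustration of the validation and extraction use cases, the sketch below checks a phone number with `re.fullmatch` and pulls email addresses out of a sentence with `findall`. Both patterns are simplified assumptions based on the sample data in `Email.txt`; the email pattern in particular does not cover every valid address.

```python
import re

text = "Contact davemartin@bogusemail.com or call 615-555-7164."

# Simplified email pattern for illustration only.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
# Phone format assumed from the sample data: 3 digits, 3 digits, 4 digits.
PHONE_RE = re.compile(r"\d{3}-\d{3}-\d{4}")

# Extraction: find every email address in the text.
emails = EMAIL_RE.findall(text)

# Validation: the whole string must match the pattern.
is_phone = PHONE_RE.fullmatch("615-555-7164") is not None

print(emails)
print(is_phone)
```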

## Examples

1. Find Words
2. Split Words
3. Replace Words

In [22]:
import re
text = "The quick brown fox"
match = re.search(r"quick", text)
if match:
    print("Match found:", match.group())
else:
    print("No match.")

Match found: quick


In [24]:
import re
text = "one,two,three"
parts = re.split(r",", text)
print(parts) # Output: ['one', 'two', 'three']

['one', 'two', 'three']
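For a single fixed delimiter such as a comma, `str.split(",")` gives the same result. `re.split` becomes useful when the delimiters vary. A small sketch with a made-up input string that mixes commas, semicolons, and spaces:

```python
import re

text = "one, two;three   four"
# Split on a comma or semicolon with optional surrounding spaces,
# or on any run of whitespace.
parts = re.split(r"\s*[,;]\s*|\s+", text)
print(parts)
```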


In [25]:
import re
text = "This is a test string."
new_text = re.sub(r"test", "sample", text)
print(new_text) # Output: This is a sample string.

This is a sample string.
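`re.sub` can also reuse parts of the match through capture groups. The sketch below masks the last four digits of phone numbers written in the format used in `Email.txt`; the input sentence and the masking rule are illustrative assumptions.

```python
import re

text = "Call 615-555-7164 or 800-555-5669."
# \1 and \2 refer to the first and second captured groups;
# the final four digits are replaced with a fixed mask.
masked = re.sub(r"(\d{3})-(\d{3})-\d{4}", r"\1-\2-XXXX", text)
print(masked)
```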
