# 2. Working with Text: String Functions and Methods

As explorers of the digital world, we frequently encounter information as text (strings, `str`). Python provides powerful tools – functions and methods – to process and shape this textual data, much like refining raw data from sensors or preparing messages for transmission.

- **Functions:** General tools applicable to various data types. They act *on* the data you provide: `print(message)`, `len(log_entry)`. Think of these as standard-issue gear available for many tasks across your toolkit.
- **Methods:** Actions specific to a particular data type, like strings. They are *called on* the object using dot notation: `signal.strip()`, `report.title()`. These are like specialized attachments designed for your string 'equipment', allowing specific manipulations.
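
To make the distinction concrete, here is a minimal sketch. The variable `signal` and its contents are made up for illustration; only `len()`, `.strip()`, and `.upper()` come from the standard toolkit described above.

```python
# Hypothetical sample data for illustration.
signal = "  status ok  "

# Functions take the object as an argument and work across many types.
print(len(signal))       # length of a string
print(len([1, 2, 3]))    # length of a list

# Methods are called on the object itself with dot notation and can be chained.
shout = signal.strip().upper()
print(shout)

# Strings are immutable: the methods returned a new string, `signal` is unchanged.
print(repr(signal))
```

Because string methods never modify the original, remember to store the result in a variable if you need it later.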

This lesson covers key tools for string manipulation:
- Getting string length with `len()`.
- Changing text case (`.lower()`, `.upper()`, `.capitalize()`, `.title()`).
- Cleaning up whitespace (`.strip()`) and formatting spacing (`.ljust()`, `.rjust()`, `.center()`).
- Locating substrings (`.find()`).
- Checking string properties (`.startswith()`, `.endswith()`, `.isalnum()`, etc.).
- Modifying content (`.replace()`) and counting occurrences (`.count()`).
- Combining strings with `+`.
- Structuring data with `.split()` and `.join()`.
- Defining strings with different quotes, including multi-line strings.

# Example: A log entry from our virtual exploration mission
log_entry = "Day 05: System check complete."


# Case Modification Methods
print(log_entry.lower()) # -> "day 05: system check complete."
print(log_entry.upper()) # -> "DAY 05: SYSTEM CHECK COMPLETE."
print(log_entry.capitalize()) # -> "Day 05: system check complete." (Capitalizes the first character and lowercases the rest)
print(log_entry.title()) # -> "Day 05: System Check Complete." (Capitalizes the start of each word)
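
A common use of the case methods is comparing text regardless of how it was typed. The sketch below uses made-up callsigns (`expected_callsign`, `incoming_callsign`) and also shows one quirk of `.title()` worth knowing: it treats an apostrophe as a word boundary.

```python
# Hypothetical callsigns for illustration.
expected_callsign = "phoenix"
incoming_callsign = "PHOENIX"

# Normalizing both sides to lowercase makes the comparison case-insensitive.
is_match = incoming_callsign.lower() == expected_callsign.lower()
print(is_match)

# .title() capitalizes the letter after an apostrophe as well.
titled = "probe's status".title()
print(titled)
```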


# Length and Whitespace Handling
received_callsign = "  Alpha One  "
print(f"Received length: {len(received_callsign)}") # -> Received length: 13 (includes the surrounding spaces)

# .strip() cleans unwanted whitespace from the beginning and end
cleaned_callsign = received_callsign.strip()
print(f"Cleaned callsign: '{cleaned_callsign}'") # -> Cleaned callsign: 'Alpha One'
print(f"Cleaned length: {len(cleaned_callsign)}") # -> Cleaned length: 9 (after removing surrounding spaces)
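
`.strip()` has two one-sided relatives, `.lstrip()` and `.rstrip()`, and all three accept an optional set of characters to remove instead of whitespace. A short sketch, with the `framed` string invented for illustration:

```python
received = "  Alpha One  "

left_trimmed = received.lstrip()    # removes leading whitespace only
right_trimmed = received.rstrip()   # removes trailing whitespace only
print(repr(left_trimmed))
print(repr(right_trimmed))

# Passing characters strips those instead of whitespace (hypothetical framed message).
framed = "***Alpha One***"
unframed = framed.strip("*")
print(unframed)

# Whitespace inside the string is never touched.
inner = received.strip()
print(repr(inner))
```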

# Padding methods adjust string length by adding characters, useful for aligning text in reports
status = "OK"
print(status.ljust(10, ".")) # Left-justify: 'OK........' (padded with '.' to total length 10)
print(status.rjust(10)) # Right-justify: '        OK' (padded with spaces)
print(status.center(10, "-")) # Center: '----OK----' (padded with '-')
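
Padding is most useful when several lines must line up. The sketch below builds a small aligned report; the `readings` list and its values are invented, and it relies on lists and a `for` loop, which this section does not cover.

```python
# Hypothetical sensor readings: (name, value) pairs.
readings = [("Temp", "25.5"), ("Pressure", "101.3"), ("O2", "20.9")]

rows = []
for name, value in readings:
    # Name column: 10 wide, dot-filled. Value column: 7 wide, right-aligned.
    row = name.ljust(10, ".") + value.rjust(7)
    rows.append(row)
    print(row)

# Padding never truncates: a string longer than the width is returned unchanged.
too_long = "Overpressure".ljust(5)
print(too_long)
```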


# Searching in Strings
# .find(substring) locates the first occurrence of 'substring', returning its starting index.
report = "System Status: All systems nominal. Grid reference: 4B."
print(report.find("Status:")) # -> 7 (index where 'Status:' begins)
print(report.find("Grid")) # -> 36 (index where 'Grid' begins)
print(report.find("Alert")) # -> -1 (returned when the substring is not found)
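
The `-1` result deserves care, because `-1` is also a valid index (the last character). A hedged sketch of two safer patterns follows: use `in` when a yes/no answer is enough, and check for `-1` before using the index. The `marker` variable is introduced here for illustration, and the slicing syntax is not covered in this section.

```python
report = "System Status: All systems nominal. Grid reference: 4B."

# When only presence matters, `in` is clearer than comparing .find() to -1.
has_alert = "Alert" in report
print(has_alert)

# Pitfall: using a failed .find() result as an index silently picks the last character.
position = report.find("Alert")
wrong_char = report[position]
print(position, repr(wrong_char))

# Safer: check the result before using it.
marker = "Grid reference: "
start = report.find(marker)
if start != -1:
    grid_ref = report[start + len(marker):].rstrip(".")
else:
    grid_ref = None
print(grid_ref)
```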


# Boolean-Returning Methods
# They return True or False. Essential for validation checks.
filename = "Report_Day05.txt"
data_packet = "ID:A1B2"
print(filename.startswith("Report_")) # Does it start with "Report_" ? -> True
print(filename.endswith(".log")) # Does it end with ".log" ? -> False
print(filename.endswith(".txt")) # Does it end with ".txt" ? -> True

# Check character types:
sensor_id = "SensorA1"
reading = "1024"
message = "Status OK"
print(sensor_id.isalnum()) # Is it alphanumeric (letters or numbers only)? -> True
print(reading.isdecimal()) # Is it composed of decimal digits (0-9) only? -> True
print(message.isalpha()) # Is it alphabetic (letters only)? -> False (contains space)
print("Alert!".isalnum()) # Contains '!', which is not alphanumeric -> False
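
A typical validation use is checking text before converting it to a number, since `int()` raises an error on invalid input. This is only a sketch: the `raw_readings` values are made up, and `.isdecimal()` deliberately rejects signs and decimal points, so it suits whole, non-negative numbers only.

```python
# Hypothetical raw inputs, some of them invalid.
raw_readings = ["1024", "-5", "25.5", ""]

accepted = []
rejected = []
for text in raw_readings:
    if text.isdecimal():
        accepted.append(int(text))   # safe: only decimal digits present
    else:
        rejected.append(text)        # sign, decimal point, or empty string

print(accepted)
print(rejected)
```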


# Replacing and Counting
# .replace(old, new) returns a *new* string with all occurrences of 'old' replaced by 'new'.
alert_message = "WARNING: System pressure high. Repeat: pressure high."
normal_message = alert_message.replace("high", "nominal")
print(normal_message) # -> "WARNING: System pressure nominal. Repeat: pressure nominal."
print(alert_message) # Original is unaffected

# .count(substring) tallies non-overlapping occurrences.
log_data = "entry;data;entry;status;entry;error;entry_data"
print(log_data.count("entry")) # -> 4 (the "entry" inside "entry_data" counts too: .count() matches substrings, not whole fields)
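
Because `.count()` works on substrings, it cannot tell a whole field from part of one. If whole fields are what you want, one possible approach is to split first and count items in the resulting list; `.split()` is covered later in this lesson, and `list.count()` is a list method, not a string method.

```python
log_data = "entry;data;entry;status;entry;error;entry_data"

# Substring count: also matches the "entry" inside "entry_data".
substring_hits = log_data.count("entry")

# Field count: split on ';' and count list items equal to "entry".
field_hits = log_data.split(";").count("entry")
print(substring_hits, field_hits)

# "Non-overlapping" means matches cannot share characters.
pair_hits = "aaaa".count("aa")
print(pair_hits)
```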


# String Concatenation - Combining strings
agent_id_prefix = "Agent"
call_sign = "Phoenix"

# Using + creates a new combined string
full_agent_id = agent_id_prefix + "_" + call_sign
print(full_agent_id) # -> "Agent_Phoenix"

# Using print() with commas automatically inserts spaces between arguments
print("Agent ID:", agent_id_prefix, "Callsign:", call_sign) # -> Agent ID: Agent Callsign: Phoenix
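
One thing `+` will not do is mix strings with numbers. The sketch below shows the error and two ways around it; `day_number` is an invented variable, and the `:02d` format specifier (zero-pad to two digits) goes slightly beyond the f-strings used earlier.

```python
day_number = 5

# Concatenating a str and an int raises TypeError.
try:
    label = "Day " + day_number
except TypeError as error:
    label = None
    print("Cannot concatenate:", error)

# Option 1: convert explicitly with str().
label_plus = "Day " + str(day_number)

# Option 2: let an f-string do the conversion (and the zero-padding).
label_fstring = f"Day {day_number:02d}"

print(label_plus)
print(label_fstring)
```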


# Splitting and Joining
# .split(separator) breaks a string into a list of substrings. Imagine parsing a coded transmission.
sensor_reading = "ID:SNSR007,Temp:25.5C,Unit:C,Timestamp:1501Z"
reading_parts = sensor_reading.split(',') # Split the string at each comma
print(reading_parts)
# Output: ['ID:SNSR007', 'Temp:25.5C', 'Unit:C', 'Timestamp:1501Z']

# Access individual parts using list index (starting from 0)
print(f"First field: {reading_parts[0]}") # -> First field: ID:SNSR007

# separator.join(list_of_strings) assembles a single string from a list,
# inserting the 'separator' string between each element. Useful for formatting output.
report_elements = ['Mission Report', 'Day 05', 'Status: All Clear']
report_line = " | ".join(report_elements) # Join list elements with " | "
print(report_line)
# Output: Mission Report | Day 05 | Status: All Clear
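
Splitting can be applied twice to pull labels and values apart. The following is a sketch under two assumptions: every field contains a `:`, and a dictionary (not covered in this section) is an acceptable place to store the result. The second argument to `.split()` limits the number of splits, so a value that itself contains `:` would stay intact.

```python
sensor_reading = "ID:SNSR007,Temp:25.5C,Unit:C,Timestamp:1501Z"

fields = {}
for part in sensor_reading.split(","):
    # Split each "key:value" pair at the first ':' only.
    key, value = part.split(":", 1)
    fields[key] = value

print(fields["ID"])
print(fields["Temp"])

# .join() reverses the process and rebuilds the original line.
rebuilt = ",".join(f"{key}:{value}" for key, value in fields.items())
print(rebuilt == sensor_reading)
```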


# Quotes and Multi-line Strings
# Use triple quotes (""" or ''') for text that spans multiple lines.
mission_briefing = """
Mission Objectives:
1. Deploy probe in Sector Gamma.
2. Monitor energy readings.
3. Transmit results by EOD.
"""
print(mission_briefing)
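
Note that a triple-quoted string keeps every line break, including the one right after the opening quotes. A small sketch of how the methods from this lesson can tidy that up; `.splitlines()` is a string method not introduced above.

```python
mission_briefing = """
Mission Objectives:
1. Deploy probe in Sector Gamma.
2. Monitor energy readings.
3. Transmit results by EOD.
"""

# The string starts and ends with a newline character.
starts_with_newline = mission_briefing.startswith("\n")
print(starts_with_newline)

# .strip() removes the surrounding newlines; .splitlines() gives one item per line.
lines = mission_briefing.strip().splitlines()
print(len(lines))
print(lines[0])
```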

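# A bare triple-quoted string is often used as a multi-line comment.
# Strictly speaking, it is a string literal that Python evaluates and discards, not a true comment: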
"""
This block explains the purpose
of the following code section,
which handles data validation.
"""

## Practice
## Scenario: Processing incoming status emails for mission control
```python
email_address = "  Agent_Blue@Command.base  " # Example email with extra spaces and mixed case
```
1.  **Data Standardization:**
    - Get the original length of the `email_address` string.
    - Remove leading/trailing whitespace using `.strip()`. Store the result in `cleaned_email`.
    - Get the length of the `cleaned_email`.
    - Convert `cleaned_email` to lowercase. Store the result in `standard_email`.
    - Print the original length, cleaned length, and the final `standard_email`.

---

2.  **Format Verification:**
    - Using the `standard_email` (cleaned, lowercase) from step 1:
    - Verify it contains the `"@"` symbol (use the `in` operator for this check).
    - Verify it ends with the `.base` domain suffix using `.endswith()`.
    - Print the boolean results (`True`/`False`) of these two checks.

---

3.  **Content Processing & Extraction:**
    - Using the `standard_email`:
    - Replace the user part `"agent_blue"` with `"field_op_red"` using `.replace()`. Store the result in `updated_email`.
    - Count the number of occurrences of the letter `"e"` in the `updated_email` using `.count()`.
    - Verify if the `updated_email` starts with `"field_op_red"` using `.startswith()`.
    - Split the `updated_email` into user part and domain part based on the `"@"` symbol using `.split()`. Store the resulting list in `email_components`.
    - Print the `updated_email`, the count of `"e"`, the boolean result of the `startswith` check, and the `email_components` list.


---
#### © Jiří Svoboda (George Freedom)
- Web: https://GeorgeFreedom.com
- LinkedIn: https://www.linkedin.com/in/georgefreedom/
- Book me: https://cal.com/george-freedom-tech-mentor