<a href="https://colab.research.google.com/github/zmuhls/ccny-data-science/blob/main/learningStrings.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Learning String Methods



## Basic String Methods

In this section, we'll explore some fundamental string methods in Python. These methods are essential tools for manipulating and analyzing text data.



#### **a. Calculating String Length**

- **`len()`**: Returns the number of characters in a string.

  **Example:**

In [1]:
text = "Hello, world!"
length = len(text)
print(length)  # Output: 13

13

**Explanation:**

  - The `len()` function calculates the total number of characters in the string `text`, including letters, spaces, and punctuation.
  - This is useful for determining the size of the string or validating input length.

 **Use Cases:**

  - Validating user input (e.g., passwords, usernames)
  - Controlling loops when iterating over strings
  - Checking if a string meets certain length criteria
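
As a minimal sketch of the input-validation use case, the snippet below checks a password against a minimum length. The function name `is_valid_length` and the limit of 8 characters are illustrative assumptions, not part of the original notebook.

```python
# Hypothetical helper: the name and the default limit are assumptions.
def is_valid_length(password, min_length=8):
    """Return True if the password has at least min_length characters."""
    return len(password) >= min_length

print(is_valid_length("secret"))         # 6 characters: too short
print(is_valid_length("correct horse"))  # 13 characters: long enough
```

Because `len()` counts every character, spaces and punctuation contribute to the total as well.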

#### **b. Changing Case**

- **`.lower()` / `.upper()`**: Converts all characters in a string to lowercase or uppercase.

  **Example:**


In [None]:
text = "Hello, World!"
lower_text = text.lower()
print(lower_text)  # Output: 'hello, world!'

In [None]:
upper_text = text.upper()
print(upper_text)  # Output: 'HELLO, WORLD!'

**Explanation:**

  - These methods do not change the original string; they return a new string with the case changed.
  - Useful for case-insensitive comparisons or standardizing text.

**Use Cases:**

  - Comparing user input regardless of case
  - Formatting text output
  - Preparing text data for analysis
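
The case-insensitive comparison mentioned above could look like the following sketch. The variables `user_input` and `expected` are made-up stand-ins for real input.

```python
# Hypothetical values standing in for user input and an expected answer.
user_input = "YES"
expected = "yes"

# A direct comparison is case-sensitive, so these two strings differ.
direct_match = user_input == expected
print(direct_match)

# Lowercasing both sides makes the comparison case-insensitive.
matches = user_input.lower() == expected.lower()
print(matches)
```

Normalizing both sides, rather than only one, keeps the check correct even if the expected value is later written with capitals.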


#### **c. Removing Whitespace**

- **`.strip()`**: Removes any leading (beginning) and trailing (end) whitespace or specified characters from a string.

  **Example:**


In [None]:
text = "   Hello   "
stripped_text = text.strip()
print(stripped_text)  # Output: 'Hello'

**Explanation:**

  - By default, `.strip()` removes spaces, tabs (`\t`), and newlines (`\n`) from both ends of the string.
  - You can specify characters to remove by passing them as an argument.

  **Example with Specified Characters:**

In [None]:
text = "---Hello---"
stripped_text = text.strip('-')
print(stripped_text)  # Output: 'Hello'

  **Use Cases:**

  - Cleaning up user input.
  - Processing data from files.
  - Preparing strings for parsing or storage.
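
One detail worth illustrating: the argument to `.strip()` is treated as a set of characters, not as a prefix or suffix. The sketch below also shows `.lstrip()` and `.rstrip()`, which trim only one side. The sample strings are made up for illustration.

```python
text = "xyxHelloyx"

# Any 'x' or 'y' at either end is removed, in whatever order they appear.
both_ends = text.strip("xy")
print(both_ends)

padded = "   Hello   "
left_only = padded.lstrip()   # removes leading whitespace only
right_only = padded.rstrip()  # removes trailing whitespace only
print(repr(left_only))
print(repr(right_only))
```

Using `repr()` when printing makes the remaining spaces visible.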

#### **d. Replacing Substrings**

- **`.replace(old, new)`**: Replaces all occurrences of a specified substring (`old`) with another substring (`new`).

  **Example:**


In [None]:
text = "Hello, world!"
replaced_text = text.replace("world", "goodbye")
print(replaced_text)  # Output: 'Hello, goodbye!'

**Explanation:**

  - This method scans the string for the specified `old` substring and replaces it with `new`.
  - It returns a new string; the original string remains unchanged.

**Use Cases:**

  - Modifying text content dynamically.
  - Censoring or sanitizing input.
  - Updating file paths or URLs in text data.
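
`.replace()` also accepts an optional third argument that limits how many occurrences are replaced. A short sketch with a made-up sample string:

```python
text = "one-two-three"

all_replaced = text.replace("-", " ")   # every hyphen is replaced
first_only = text.replace("-", " ", 1)  # only the first hyphen is replaced
no_match = text.replace("_", " ")       # nothing matches, so the text is unchanged

print(all_replaced)
print(first_only)
print(no_match)
```

When `old` does not occur in the string, no error is raised; the result simply equals the original text.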


#### **e. Splitting Strings**

- **`.split(separator)`**: Splits the string into a list of substrings based on a specified `separator`.

  **Example:**

In [None]:
text = "Hello, world!"
parts = text.split(",")
print(parts)  # Output: ['Hello', ' world!']

  **Explanation:**

  - The method looks for the separator (in this case, a comma) and splits the string at each occurrence.
  - If no separator is specified, it defaults to splitting on whitespace.

  **Example with Default Separator:**

In [None]:
text = "Hello world!"
words = text.split()
print(words)  # Output: ['Hello', 'world!']


  **Use Cases:**

  - Parsing CSV (Comma-Separated Values) files.
  - Tokenizing sentences into words for text analysis.
  - Processing user input that contains multiple values.
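
The sketch below illustrates two details of `.split()`: an explicit separator leaves surrounding whitespace in place, and the default (no argument) behaves differently from `split(" ")` when spaces repeat. The sample line is invented for illustration.

```python
line = "alice, 30, New York"

# Splitting on "," keeps the space that follows each comma.
raw_fields = line.split(",")
print(raw_fields)

# Stripping each piece gives clean values.
fields = [field.strip() for field in raw_fields]
print(fields)

spaced = "a  b"  # two spaces between the letters
default_split = spaced.split()      # runs of whitespace count as one separator
explicit_split = spaced.split(" ")  # each single space is a separator
print(default_split)
print(explicit_split)
```

For real CSV files with quoted fields, Python's built-in `csv` module is usually a safer choice than splitting by hand.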

#### **f. Joining Strings**

- **`.join(iterable)`**: Joins elements of an iterable (like a list) into a single string, with the string acting as a separator.

  **Example:**

In [None]:
words = ['Hello', 'world']
sentence = " ".join(words)
print(sentence)  # Output: 'Hello world'

**Explanation:**

  - The string `" "` (space) is used as a separator between the elements.
  - All elements in the iterable must be strings.

  **Example with Different Separator:**

In [None]:
words = ['Hello', 'world']
hyphenated = "-".join(words)
print(hyphenated)  # Output: 'Hello-world'

  **Use Cases:**

  - Reassembling strings after processing.
  - Creating delimited strings from lists.
  - Generating readable output from data structures.
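
Since every element must be a string, joining a list of numbers fails unless the items are converted first. A minimal sketch, using a made-up list:

```python
numbers = [1, 2, 3]

# Joining non-string items raises a TypeError.
try:
    ", ".join(numbers)
except TypeError as error:
    print("join failed:", error)

# Converting each item with str() first makes the join work.
joined = ", ".join(str(n) for n in numbers)
print(joined)

# split() and join() can undo each other when the separator matches.
round_trip = "-".join("a-b-c".split("-"))
print(round_trip)
```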

#### **g. Accessing Characters and Slices**

- **Indexing Characters:**

  - Access individual characters in a string using square brackets `[]` with the index number.

  **Example:**

In [None]:
text = "Python"
first_char = text[0]
print(first_char)  # Output: 'P'

**Explanation:**

  - The index `0` refers to the first character.
  - **Note:** Since Python indexing starts at 0, the last character is at index `len(text) - 1`.


- **Negative Indexing:**

  - Use negative numbers to index from the end of the string.

  **Example:**

In [None]:
text = "Python"
last_char = text[-1]
print(last_char)  # Output: 'n'

**Explanation:**

  - `-1` refers to the last character, `-2` to the second last, and so on.

- **Slicing Strings:**

  - Extract a substring by specifying a range of indices: `[start:stop]`.

  **Example:**

In [None]:
text = "Hello, world!"
substring = text[7:12]
print(substring)  # Output: 'world'

  **Explanation:**

  - The slice `[7:12]` includes characters from index 7 up to, but not including, index 12.
  - **Note:** The character at the `stop` index is not included.

- **Slicing with Steps:**

  - Use an optional third parameter to specify the step size: `[start:stop:step]`.

  **Example:**


In [None]:
text = "abcdef"
every_other = text[::2]
print(every_other)  # Output: 'ace'

**Explanation:**

  - The slice `[::2]` starts at the beginning and selects every second character.

**Use Cases:**

  - Extracting substrings.
  - Reversing strings (e.g., `text[::-1]`).
  - Selecting specific patterns within a string.
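
A few of these slicing patterns side by side, as a hedged sketch. The helper `is_palindrome` is a hypothetical name introduced only for this example.

```python
text = "Python"

reversed_text = text[::-1]  # a step of -1 walks the string backwards
first_three = text[:3]      # omitting start means "from the beginning"
last_three = text[-3:]      # omitting stop means "to the end"

print(reversed_text)
print(first_three)
print(last_three)

# Hypothetical helper: a word is a palindrome if it equals its own reverse.
def is_palindrome(word):
    normalized = word.lower()
    return normalized == normalized[::-1]

print(is_palindrome("Level"))
print(is_palindrome("Python"))
```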


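**String Immutability:**

  - Strings are immutable: methods such as `.lower()` return a new string and leave the original unchanged.

In [None]:
text = "Hello"
text.lower()  # The result is discarded
print(text)  # Output: 'Hello'

  - To change the string, you need to assign the result back to a variable: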

In [None]:
text = text.lower()
print(text)
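
The same rule applies to individual characters: a string cannot be edited in place through indexing. The sketch below shows the error and one way around it, building a new string from slices. The variable `new_text` is an illustrative name.

```python
text = "Hello"

# Assigning to an index raises a TypeError because strings are immutable.
try:
    text[0] = "J"
except TypeError as error:
    print("Cannot modify in place:", error)

# Build a new string instead, reusing the rest of the original via a slice.
new_text = "J" + text[1:]
print(new_text)
print(text)  # the original is unchanged
```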

**Zero-Based Indexing:**

  - Remember that indexing starts at 0.
  - This is crucial when accessing specific positions in a string.

**Example:**


In [None]:
text = "Python"
print(text[0])

# Cipher Exercise: Applying String Methods

**Objective:** Utilize the string methods we've just learned to encrypt text from a randomly selected Project Gutenberg book.

---

## **Exercise Steps**

### **1. Random Selection of Text Files (2 minutes)**

- **Select a Book:**

  - Each student will draw a slip from a hat containing different Project Gutenberg book IDs or titles, ensuring a random selection.

- **Download the Text File:**

  - Visit [Project Gutenberg](https://www.gutenberg.org/).
  - Use the book ID or title to find your assigned book.
  - Download the plain text UTF-8 version of the book.


### **2. Reading the Text File into Python (5 minutes)**

**Read the File:**

In [None]:
# Replace 'your_book.txt' with the actual filename
with open('your_book.txt', 'r', encoding='utf-8') as file:
    text = file.read()
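
Once the file is loaded, the string tools from the first part of this notebook can be used to inspect it. The sketch below is self-contained: it writes a tiny made-up sample to a temporary file in place of a real downloaded book, then reads it back the same way. The sample content and variable names are assumptions for illustration.

```python
import os
import tempfile

# Stand-in for a downloaded book (invented content, not a real text).
sample = "Sample Book Title\n\nChapter 1\nIt was a quiet morning.\n"
with tempfile.NamedTemporaryFile(
    "w", suffix=".txt", delete=False, encoding="utf-8"
) as tmp:
    tmp.write(sample)
    path = tmp.name

# Read the file back exactly as in the cell above.
with open(path, "r", encoding="utf-8") as file:
    text = file.read()
os.remove(path)  # clean up the temporary file

char_count = len(text)          # total number of characters
preview = text[:17]             # a slice gives a quick preview
word_count = len(text.split())  # rough word count via whitespace splitting

print(char_count)
print(preview)
print(word_count)
```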

### **3. Creating a Simple Cipher (10 minutes)**

#### **a. Define a Cipher Dictionary**

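- **Create a Simple Substitution Cipher:**

  We'll create a cipher by shifting each letter in the alphabet by a fixed number of positions (a Caesar cipher). Letters shifted past the end of the alphabet wrap around to the beginning. For this example, we'll use a shift of 2 positions, so `a` becomes `c` and `z` becomes `b`.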

In [None]:
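import string

shift = 2  # You can change the shift value
letters = string.ascii_lowercase  # 'abcdefghijklmnopqrstuvwxyz'
# Rotate the alphabet left by `shift`, wrapping the first letters to the end
shifted_letters = letters[shift:] + letters[:shift]
# Map each lowercase letter to its shifted counterpart
cipher_dict = dict(zip(letters, shifted_letters))
# Add the same mapping for uppercase letters
cipher_dict.update({k.upper(): v.upper() for k, v in cipher_dict.items()})
# Build a table that str.translate() can apply to a whole string
translation_table = str.maketrans(cipher_dict)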

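**Creating the Cipher Dictionary:**

  - The `zip()` function pairs each original letter with its shifted counterpart.
  - `dict()` converts the zipped pairs into a dictionary for mapping.
  - `.update()` adds the matching uppercase pairs, so capital letters are shifted too.
  - `str.maketrans()` turns the dictionary into a translation table for use with `.translate()`.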

**Use Cases:**

  - Simple encryption by substituting each character.
  - Understanding how character mapping works.
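
A minimal sketch of how the translation table might be applied, assuming the same shift of 2. The cipher setup is repeated so the example runs on its own; `message`, `reverse_table`, `encrypted`, and `decrypted` are illustrative names that do not appear in the original notebook.

```python
import string

# Same setup as the cipher cell above.
shift = 2
letters = string.ascii_lowercase
shifted_letters = letters[shift:] + letters[:shift]
cipher_dict = dict(zip(letters, shifted_letters))
cipher_dict.update({k.upper(): v.upper() for k, v in cipher_dict.items()})
translation_table = str.maketrans(cipher_dict)

# Swapping keys and values gives a table that undoes the shift
# (an addition for illustration).
reverse_table = str.maketrans({v: k for k, v in cipher_dict.items()})

message = "Hello, world!"
encrypted = message.translate(translation_table)
decrypted = encrypted.translate(reverse_table)

print(encrypted)  # Output: 'Jgnnq, yqtnf!'
print(decrypted)  # Output: 'Hello, world!'
```

Characters that are not in the table, such as the comma, space, and exclamation mark, pass through `.translate()` unchanged.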