# Lecture 3 - Python Strings

## Purpose

- Create strings
- Access individual characters in strings
- Access portions of strings, AKA string slicing
- Learn about string methods
- Use select string modification methods
- Use select string methods for counting, finding, and replacing parts of strings

## Some Creative Commons Reference Sources for This Material

- *Think Python 2nd Edition*, Allen Downey, chapter 8
- *The Coder's Apprentice*, Pieter Spronck, chapter 10
- *A Practical Introduction to Python Programming*, Brian Heinold, chapter 6
- *Algorithmic Problem Solving with Python*, John Schneider, Shira Broschat, and Jess Dahmen, chapter 9

## Creating Strings

- Objects that contain text characters are called **strings**
- Strings can be as short as a single character, or even empty (`''`)
- Strings do not have to contain just letters
- They can contain nearly any character (including Unicode characters such as emoji)
- Strings are defined by surrounding the text with a set of single or double quotes
- For example, `'parrot'` and `"Lumberjack"`
- Including apostrophes and/or a set of quotes within strings
  - Use an *escape sequence*
    - `\'` to include an apostrophe in a single quoted string, e.g. `'isn\'t'`
    - `\"` to include double quotes in a double quoted string, e.g. `"\"hello\""`
    - `\\` to include an actual backslash character
  - Use double quotes for strings with apostrophes in them, e.g. `"can't"`
  - Use single quotes for strings with double quotes in them, e.g. `'She said "hello."'`
  - Enclose the string in triple quotes (three single or three double quote characters), e.g. `""""It's only a flesh wound," he declared"""`
  - Three sets of single or double quotes can also be used to wrap multi-line quotes

    ```python
    """Hello there. This assignment
    is all about strings."""
    ```
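
The quoting options above can be compared side by side. This is a minimal sketch; the variable names and the `"C:\\temp"` value are made up for illustration.

```python
# Several ways to write strings that contain quote characters.
escaped_single = 'isn\'t'            # \' escapes the apostrophe
escaped_double = "\"hello\""         # \" escapes the double quotes
mixed_quotes = 'She said "hello."'   # single quotes around double quotes
triple_quoted = """"It's only a flesh wound," he declared"""
backslash = "C:\\temp"               # \\ produces one backslash

print(escaped_single)   # isn't
print(escaped_double)   # "hello"
print(mixed_quotes)     # She said "hello."
print(triple_quoted)    # "It's only a flesh wound," he declared
print(backslash)        # C:\temp
print(len(backslash))   # 7, the escape sequence counts as one character
```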

## String Length
- The `len()` function returns the number of characters in a string
- `len("Python")` will tell you that the string is 6 characters long
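
A short sketch of `len()` in use; the variable name `word` is only illustrative. Spaces and punctuation count as characters too.

```python
word = "Python"
print(len(word))     # 6
print(len("a b"))    # 3, the space counts as a character
print(len(""))       # 0, an empty string has no characters
```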

## Accessing Individual String Characters

- Strings are sequences of characters grouped together and treated as one object
- Access specific characters in strings based on their position in the string
- Indexing in *Python* starts with zero, not one
- `"H"` in `"Hello"` is in the zeroth position (also called the zeroth index)
- Access the zeroth character using `"Hello"[0]`
- Access the first `l` using `"Hello"[2]`
- This process is referred to as **string indexing**
- The last item in `my_string` can be accessed using `my_string[len(my_string)-1]`
- You can also index strings from the right end instead of the left
- First character from the left is accessed with the index `[0]`
- The last character can be indexed directly using `[-1]`
- The negative sign says to count from the right end of the string
- Indexing from the right starts with `-1` not `-0`, because...
  - `-0` is the same as `0` mathematically 
  - Using `-0` could cause confusion


  ```
  | 0 | 1 | 2 | 3 | 4 | <-- Indexing from the left
  |   |   |   |   |   |
  | H | e | l | l | o | <-- String characters "Hello"
  |   |   |   |   |   |
  |-5 |-4 |-3 |-2 |-1 | <-- Indexing from the right
  ```
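
The indexing rules in the diagram can be tried directly. This is a small sketch with illustrative variable names; it also shows that an index past the end of the string raises an `IndexError`.

```python
greeting = "Hello"

first = greeting[0]                        # 'H', the zeroth character
first_l = greeting[2]                      # 'l', the first of the two l's
last_by_len = greeting[len(greeting) - 1]  # 'o', last character the long way
last = greeting[-1]                        # 'o', last character from the right
first_from_right = greeting[-5]            # 'H', same character as greeting[0]

print(first, first_l, last_by_len, last, first_from_right)  # H l o o H

# Valid indexes run from 0 to len(greeting) - 1, so index 5 is out of range.
try:
    greeting[5]
except IndexError as error:
    out_of_range_message = str(error)
    print("IndexError:", out_of_range_message)
```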

## Accessing Portions of Strings, AKA Slicing

- Can access part of a string (a sub-string) in addition to single characters
- Called **slicing**
- Place square brackets after a string or variable with start, stop, and step values separated by colons
- Slice includes the character at the start index
- Slice ends one character before the stop index
- All three of the slice arguments are optional
  - If no step argument, it is assumed to be `1`
  - If no start argument, it is assumed to be...
    - `0` for positive steps
    - Last character for negative steps
  - If no stop argument, the last character included is assumed to be...
    - Last character of the string for positive steps
    - First character of the string for negative steps
- The diagram below describes how slice numbering works
- Slices happen between characters
- Examples
  - `my_string[0:3]` and `my_string[:3]` and `my_string[0:3:1]` are the same
  - `my_string[0:]` and `my_string[0::1]` and `my_string[:]` are also the same
  - `"Hello"[0:3]` would slice out `"Hel"`
  - `"Hello"[2:]` would slice out `"llo"`
  - A slice argument of `[:]` will return a copy of the entire string

  ```
  0   1   2   3   4   5 <-- Slicing from the left; my_string[1:4] => 'ell'
  |   |   |   |   |   |
  | H | e | l | l | o | <-- String characters "Hello"
  |   |   |   |   |   |
  -5  -4  -3  -2  -1   | <-- Slicing from the right with a positive step; my_string[-4:-1] => 'ell'
  |   |   |   |   |   |
  -6  -5  -4  -3  -2  -1 <-- Slicing from the right with a negative step; my_string[-1:-5:-1] => 'olle'
  ```
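
The slices from the examples and the diagram can be checked with a few lines of code. This sketch assumes `my_string` holds `"Hello"`; the last two slices, which use a step of `2` and `-1`, are extra illustrations not taken from the diagram.

```python
my_string = "Hello"

print(my_string[0:3])       # Hel
print(my_string[:3])        # Hel, start defaults to 0
print(my_string[2:])        # llo, stop defaults to the end of the string
print(my_string[1:4])       # ell
print(my_string[-4:-1])     # ell, negative indexes with a positive step
print(my_string[-1:-5:-1])  # olle, negative step walks right to left
print(my_string[:])         # Hello, the whole string
print(my_string[::2])       # Hlo, every second character
print(my_string[::-1])      # olleH, the string reversed
```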

## String Methods

- *Python* has a number of **methods** that can act on or with strings. 
- Use `x.my_method()` for a method called `my_method()` working with `x`
- Execute the following code cell to see a list of string methods, functions, and operations
- The methods are the items without leading and trailing underscores
- Can also use `print(dir(str))` to list the methods

```python
my_string = "Hello, World!"
print(dir(my_string))
```

```python
print(dir(str))
```

Even better, use the following for a cleaner looking list of methods. This technique uses a list comprehension, which we have not covered yet, so don't worry about how the command is constructed.

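```python
# Keep only names that do not start with an underscore.
# Testing for "_" anywhere in the name would also drop format_map.
print([x for x in dir(str) if not x.startswith("_")])
```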

Don't worry, we won't dig into all of these methods, just some of them. Specifically, we'll investigate methods related to modifying strings and methods used for counting, searching, and replacing parts of strings.

### Methods for Modifying Strings

#### Immutability of Strings
- *Python* strings are **immutable**
- Cannot be changed (mutated or modified) after they are created
- To "modify" a string, you need to create a copy that has your desired changes
- This is what happens with the methods that "modify" strings
- Original string is unchanged and a new string is created with the modifications
- "Replace" the original by assigning the new string to the original variable name
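
A minimal sketch of what immutability means in practice, using illustrative variable names. Assigning to an index fails with a `TypeError`, so the "modified" string is built as a new object and then assigned back to the original name.

```python
my_string = "parrot"

# Trying to change a character in place raises a TypeError.
try:
    my_string[0] = "c"
except TypeError as error:
    print("Cannot modify in place:", error)

# Build a new string instead; the original is untouched.
new_string = "c" + my_string[1:]
print(my_string)    # parrot
print(new_string)   # carrot

# "Replace" the original by reusing its variable name.
my_string = new_string
print(my_string)    # carrot
```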

#### List of String Modifying Methods
- First five of the six methods are associated with changing the case
- Their names essentially describe what they do
  - `.lower()`
  - `.upper()`
  - `.title()`
  - `.swapcase()`
  - `.capitalize()`
- Only one method requires an argument
  - `.center(x)`
  - Argument is the length of the new string in which the original string will be centered
  - An optional second argument sets the fill character (a space by default)
- Get help on any string method using `help(str.method_name)`, e.g. `help(str.center)`
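
The six methods listed above can be compared on one sample string. This is a sketch; `phrase` and the sample text are made up for illustration. Each call returns a new string and leaves `phrase` unchanged.

```python
phrase = "hello, WORLD"

print(phrase.lower())        # hello, world
print(phrase.upper())        # HELLO, WORLD
print(phrase.title())        # Hello, World
print(phrase.swapcase())     # HELLO, world
print(phrase.capitalize())   # Hello, world

# center() pads both sides; repr() makes the added spaces visible.
centered = "spam".center(10)
print(repr(centered))        # '   spam   '

print(phrase)                # hello, WORLD (the original is unchanged)
```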




### Methods for Counting, Searching, and Replacing

- `.count()`
  - Count the number of times that a specific character or sub-string appears in a larger string
  - Requires at least one argument, the sub-string that you want counted (with enclosing quotes)
  - Can also include two optional arguments
    - Where to start the count 
    - Where to end the count
  - Examples
    - `my_string.count("a")` returns the number of times `"a"` occurs in `my_string`
    - `my_string.count("a", 2, 7)` counts starting at index 2 and ends just before index 7
    - `my_string.count("a", 4)` counts from index 4 to the end
    - You cannot specify only an ending index; pass a start of `0` as well, e.g. `my_string.count("a", 0, 7)`
- `.find()` 
  - Look for a search string within a target string; `target_string.find(search_string)`
  - If found, it returns the index in the target string where the search string starts
  - If not found, it returns `-1`
  - If the search string occurs more than once, only the position of the first occurrence will be returned
  - Accepts optional starting and ending arguments
  - Examples
    - `my_string.find("the")` returns the index of the first occurrence of `"the"`
- `.replace()` 
  - Replace all occurrences of a search string with a new string within a target string
  - Two required and one optional argument
    - Search string 
    - New string
    - Optional argument is the number of occurrences to replace
    - Examples
      - `my_string.replace('I', 'you', 2)` replaces the first 2 occurrences of `'I'` with `'you'`
- The original string is not changed using any of these methods
  - For example, use `my_string = my_string.replace('I', 'you', 2)` to replace the original string
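
The three methods can be tried together on one sentence. This is a sketch with an invented sample sentence and illustrative variable names; note that all three methods are case-sensitive, so `"I"` does not match the `"i"` in `"think"`.

```python
my_string = "I think that I can, I know that I can"

# count(): whole string, a start/stop range, and a start only
print(my_string.count("I"))          # 4
print(my_string.count("a", 2, 17))   # 2, indexes 2 through 16 are searched
print(my_string.count("a", 20))      # 2, from index 20 to the end

# find(): first match, a match after a start index, and no match
print(my_string.find("that"))        # 8
print(my_string.find("that", 9))     # 27
print(my_string.find("parrot"))      # -1

# replace(): limited to the first 2 occurrences, then all occurrences
first_two = my_string.replace("I", "you", 2)
print(first_two)                     # you think that you can, I know that I can
print(my_string.replace("can", "will"))  # I think that I will, I know that I will

# The original is unchanged until the result is assigned back.
print(my_string)                     # I think that I can, I know that I can
```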

### The `is` String Methods

- These methods ask a question of the string or portion of a string on which they act
- For instance, `my_string.isalpha()`
  - Asks if the entire string is composed of alphabetic characters only
  - If so, the method returns `True`
  - If not, the method returns `False`
- Portions of a string can be tested by slicing first
  - `my_string[:3].islower()` asks if the first three characters are all lower case
- Partial list of the `is` string methods
  - `.islower()` checks if all characters are lowercase
  - `.isupper()` checks if all characters are uppercase
  - `.isspace()` checks if all characters are whitespace (spaces, tabs, newlines)
  - `.istitle()` checks if each word starts with an uppercase letter and the rest are lowercase
  - `.isalpha()` checks if all characters are alphabetic
  - `.isalnum()` checks if all characters are either alphabetic or numeric
  - `.isnumeric()` checks if all characters are numeric (the digits 0-9, plus other Unicode numeric characters such as `½`)
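
A few of the `is` methods in action, as a sketch using made-up sample strings. Each call returns `True` or `False`. An empty string returns `False` because there are no characters to test.

```python
print("Hello".isalpha())         # True
print("Hello World".isalpha())   # False, the space is not alphabetic
print("hello"[:3].islower())     # True, 'hel' is all lowercase
print("Hello"[:3].islower())     # False, 'Hel' contains an uppercase H
print("SPAM".isupper())          # True
print(" \t\n".isspace())         # True, space, tab, and newline
print("Monty Python".istitle())  # True
print("abc123".isalnum())        # True
print("2024".isnumeric())        # True
print("3.14".isnumeric())        # False, the period is not numeric
print("".isalpha())              # False, empty string
```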