# Lecture 3 - Python Strings
___

In [None]:
name = "Your name here"
print("Name:", name.upper())

## Purpose

- Create strings

- Access individual characters in strings

- Access portions of strings, AKA string slicing

- Learn about string methods

- Use select string modification methods

- Use select string methods for counting, finding, and replacing parts of strings

## Instructions

1. Replace "Your name here" in the cell below the assignment title with your first and last names and then execute the cell using "Shift-Enter"
2. Execute the time stamp cell 
3. Follow along with the instructor in class as we use *Python* to work with strings (text)
4. Execute the date stamp cell at the end of the document and submit your saved `.ipynb` file to *Canvas* for credit

## Some Creative Commons Reference Sources for This Material

- *Think Python 2nd Edition*, Allen Downey, chapter 8
- *The Coder's Apprentice*, Pieter Spronck, chapter 10
- *A Practical Introduction to Python Programming*, Brian Heinold, chapter 6
- *Algorithmic Problem Solving with Python*, John Schneider, Shira Broschat, and Jess Dahmen, chapter 9

In [None]:
from datetime import datetime
from pytz import timezone
print(datetime.now(timezone('US/Eastern')))

## Creating Strings

In *Python* (like many other programming languages) objects that contain text characters are called **strings**. Unlike some languages, *Python* allows strings to be as short as a single character. Strings do not have to contain just letters either; they can contain nearly any character you can think of (including Unicode characters like emoticons). Strings are typically defined by surrounding the text with either a set of single or double quotes, but not a mix of both. For example, `'parrot'` and `"Lumberjack"` are proper string definitions.

Sometimes you need to include an apostrophe and/or a set of quotes within a string. This can be done a few different ways. First, you can use what are called **escape sequences**. The backslash (`\`) character is used as an escape sequence prefix. It is used just before specific characters in order to get specific results, usually in strings. For example, `\n` is a newline escape sequence that forces a line return and `\t` adds a tab to a string. If you need a backslash in a string, you need to use `\\` to get one. Using `\'` or `\"` forces a single or double quote within a string. Therefore, creating a string using single quotes with an apostrophe can be accomplished like so: `'isn\'t'`. Creating a string using double quotes that has quotation marks is done like this: `"\"hello\""`.
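The escape sequences described above can be tried out with a short sketch like the one below. The variable names and sample text are only illustrative.

```python
# Each escape sequence below becomes a single character inside the string.
two_lines = "first\nsecond"   # \n forces a line return
tabbed = "left\tright"        # \t adds a tab
backslash = "C:\\temp"        # \\ produces one backslash
contraction = 'isn\'t'        # \' puts an apostrophe inside single quotes
quoted = "\"hello\""          # \" puts double quotes inside double quotes

print(two_lines)
print(tabbed)
print(backslash)
print(contraction)
print(quoted)
```

Printing a string shows the result of each escape sequence rather than the backslash codes themselves.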

There is an easier way, however, to add apostrophes or quotes to strings than using escape characters. If you need an apostrophe, simply use double quotes to define the string and use a regular (not escaped) apostrophe, like this: `"can't"`. If a double quote is needed in a string, simply use single quotes to define the string: `'She said "hello."'`.

There is a third method that can be used if either or both an apostrophe or double quotation mark is needed in a string. You can enclose the entire string between three single or three double quotes, such as `""""It's only a flesh wound," he declared"""`.
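As a quick check, the sketch below builds the same sentence three ways (escaped inside double quotes, escaped inside single quotes, and triple quoted) and compares the results. The variable names are made up for this illustration.

```python
# Three ways to write one string that needs both an apostrophe and double quotes.
with_double = "\"It's only a flesh wound,\" he declared"
with_single = '"It\'s only a flesh wound," he declared'
with_triple = """"It's only a flesh wound," he declared"""

print(with_triple)
# All three definitions produce exactly the same string.
print(with_double == with_single == with_triple)
```

The quoting style only affects how the string is typed, not what the string contains.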

Three single or three double quotes can also be used to wrap multiline quotes, even if they don't have apostrophes or quotes. The example below illustrates this.

```
"""Hello there. This assignment
is all about strings."""
```

You can find out how long a string is by using the `len()` function. For example, `len("Python")` will tell you that the string is 6 characters long.
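A small sketch of `len()` on a few sample strings (the names below are illustrative). Note that an escape sequence such as `\n` counts as one character.

```python
word = "Python"
empty = ""
two_lines = "spam\neggs"

print(len(word))       # 6
print(len(empty))      # 0
print(len(two_lines))  # 9, because \n counts as a single character
```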

>**Practice it**
>
>Create and assign each of the described strings in the provided code cells. When the cell is executed the string will be displayed and/or printed.
>
>- Your first and last name using double quotes

In [None]:
name = 
name

>- "Mechanical Engineering Technology" using single quotes

In [None]:
MET = 
MET

>- A string with just numeric values

In [None]:
just_numbers = 
just_numbers

>- A string with letters, numbers, and other special characters

In [None]:
letters_nos_chars = 
letters_nos_chars

>- A word with an apostrophe

In [None]:
apostrophe = 
apostrophe

>- A short phrase with double quotes

In [None]:
short_phrase_double = 
short_phrase_double

>- A sentence that contains an apostrophe and double quotes

In [None]:
sentence_apost_double = 
print(sentence_apost_double)
sentence_apost_double

>- A string that spans three lines

In [None]:
three_liner = 
print(three_liner)
three_liner

>- A two word string that has a line return between the words

In [None]:
two_words_newline = 
print(two_words_newline)
two_words_newline

>- A three word string that has two tabs between each word

In [None]:
three_words_2_tabs_between = 
print(three_words_2_tabs_between)

>**Practice it**
>
>Define the string `"Nobody expects the Spanish Inquisition!"` and assign it to the name `funny_quote` then find out how many characters it contains. Do this using two expressions in the same code cell.

>**Practice it**
>
>Create a string that contains the Unicode character for a smiling emoticon `'\U0001F600'` and assign it to the variable name `smiley` then print the variable.

## Accessing Individual String Characters

Strings are essentially sequences of characters grouped together and treated as one object. *Python* lets us access specific characters within strings based on their position in the string. A very, very important concept to remember when working with *Python* is that index counting starts with zero, not one. For instance, the letter `"H"` in the string `"Hello"` is in the zeroth position (also called the zeroth index). To access the zeroth character of this string we use the expression `"Hello"[0]`. In fact, to access any specific character from a string, place a set of square brackets after the string (or string name) with the desired position (index) within the brackets. This process is referred to as **string indexing**.

Previously we learned that the length (number of characters) of a string can be found with the `len()` function. Since counting starts with zero, the index for the last character in a string of length 10 is actually 9; the string length minus 1. Therefore, the last item in a string named `my_string` can be accessed using `my_string[len(my_string)-1]`.
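The indexing rules above can be sketched with the string `"Hello"`. The variable names are illustrative, and the `try`/`except` is only there to show what happens when an index runs past the end.

```python
greeting = "Hello"

first = greeting[0]                  # 'H', the zeroth index
third = greeting[2]                  # 'l'
last = greeting[len(greeting) - 1]   # 'o', the length minus 1
print(first, third, last)

# The length itself is not a valid index because counting starts at zero.
out_of_range = False
try:
    greeting[len(greeting)]
except IndexError as error:
    out_of_range = True
    print("IndexError:", error)
```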

>**Practice it**
>
>Use string indexing to access each of the following characters from `funny_quote`. Print the string using the provided expression first.
>
>1. The capital **"S"** from the word **"Spanish"**
>1. The exclamation mark at the end using the `len()` function
>1. The letter **"x"**
>1. The letter **"q"**

In [None]:
print(funny_quote)

>The following function, `string_ruler(message)`, was created to check the index positions of any letter or character in a string 100 characters long or less. Try it out by calling it with `funny_quote`. Don't worry how it works at this point in time. Just use it if you need it.

In [None]:
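def string_ruler(message):
    """Print message with a ruler of left-to-right index positions beneath it.

    Intended for strings of 100 characters or fewer (two-digit indexes).
    """
    # Every row starts with '|' and gains one '|'-terminated cell per character
    new_message = spacer = '|'
    left_index_top = '|'       # tens digit of each index (blank below 10)
    left_index_bottom = '|'    # ones digit of each index
    for i, letter in enumerate(message):
        new_message += letter + '|'
        spacer += ' |'
        if i > 9:
            left_index_bottom += str(i%10) + '|'
            left_index_top += str(i//10) + '|'
        else:
            left_index_bottom += str(i) + '|'
            left_index_top += ' |'
    print('string: {}'.format(new_message))
    print('        {}'.format(spacer))
    print('  left  {}'.format(left_index_top))
    print(' index: {}'.format(left_index_bottom))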


*Python* also allows us to index a string from the right end instead of the left. Whereas the first character from the left of a string is accessed with the index `[0]`, the last character can be indexed directly by counting backwards. The index for the last character is `[-1]`. The negative sign tells *Python* that we want to count from the end (right) of the string. Indexing from the right starts with `-1` not `-0`, because `-0` is the same as `0` mathematically and using `-0` could cause confusion. The diagram below should help you remember.


```
| 0 | 1 | 2 | 3 | 4 | <-- Indexing from the left
|   |   |   |   |   |
| H | e | l | l | o | <-- String characters "Hello"
|   |   |   |   |   |
|-5 |-4 |-3 |-2 |-1 | <-- Indexing from the right
```
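The right-hand indexes in the diagram can be checked with a short sketch like this one (the variable names are illustrative).

```python
greeting = "Hello"

last = greeting[-1]        # 'o', first character counting from the right
next_to_last = greeting[-2]  # 'l'
leftmost = greeting[-5]    # 'H', same character as greeting[0]
print(last, next_to_last, leftmost)

# A left index and its matching right index differ by the string length.
same_character = greeting[1] == greeting[1 - len(greeting)]
print(same_character)
```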

>**Practice it**
>
>Use indexing from the **right** to access each of the following characters from `funny_quote`. Print the string again using the provided expression.
>
>1. The capital **"S"** from the word **"Spanish"**
>1. The exclamation mark at the end
>1. The letter **"x"**
>1. The letter **"q"**
>1. The first letter on the left using the `len()` function (think about this for a second)

In [None]:
print(funny_quote)

## Accessing Portions of Strings, AKA Slicing

*Python* allows for accessing part of a string (a sub-string) in addition to single characters. The act of doing this is referred to as **slicing**. In order to slice a string you place square brackets after the string or variable name with start, stop, and step values separated by colons inside the brackets. However, all three of the slice arguments are optional. If the step value is not used, it is assumed to be `1`. If the start argument is not used, the slice begins at index `0` for positive steps and at the last character for negative steps. If the stop argument is not used, the slice runs through the last character for positive steps and through the first character for negative steps.

For example, you can use `my_string[0:3]` or `my_string[:3]` or `my_string[0:3:1]` to slice the string starting with the 0th index and ending just before the 3rd index. The diagram below describes how slice numbering works. Slices happen between characters, so the slice argument `[0:3]` would slice out `"Hel"` from the string `"Hello"`. A slice argument of `[:]` will return a copy of the entire string (think about it for a second). Slicing with `[2:]` would return `"llo"` from `"Hello"`.

```
 0   1   2   3   4   5 <-- Slicing from the left; my_string[1:4] => 'ell'
 |   |   |   |   |   |
 | H | e | l | l | o | <-- String characters "Hello"
 |   |   |   |   |   |
-5  -4  -3  -2  -1   | <-- Slicing from the right with a positive step; my_string[-4:-1] => 'ell'
 |   |   |   |   |   |
-6  -5  -4  -3  -2  -1 <-- Slicing from the right with a negative step; my_string[-1:-5:-1] => 'olle'
```
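The slices named in the text and in the diagram above can be verified with a sketch like the following. The variable name `word` is just for illustration.

```python
word = "Hello"

print(word[0:3])       # 'Hel'
print(word[:3])        # 'Hel', start defaults to 0
print(word[0:3:1])     # 'Hel', step written out explicitly
print(word[2:])        # 'llo', stop defaults to the end
print(word[1:4])       # 'ell'
print(word[-4:-1])     # 'ell', the same slice using right-hand indexes
print(word[-1:-5:-1])  # 'olle', a negative step walks right to left
print(word[:] == word) # True, [:] gives the whole string
```

Unlike indexing a single character, a slice that runs past the end of the string does not raise an error; for example, `word[2:100]` simply returns `'llo'`.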

>**Practice it**
>
>Slice the following sub-strings from our string `funny_quote`. Print the string first to make it easier.
>
>1. The first 6 characters
>2. The last 12 characters
>3. Every other character from the left to the right
>4. The entire string written backwards (think for a second)
>5. From 19 to 26

In [None]:
print(funny_quote)

## String Methods

*Python* has a number of **methods** that can act on or with strings. Methods are similar to functions except that they are placed after an object or object name. For example, if there was a method called `my_method()` that works on an object named `x` we would type `x.my_method()` to use the method. *Python* methods act on or with the object located directly before the method and may include arguments within the parentheses after the method name. Execute the following code cell to see a list of string methods, functions, and operations. The methods are the items without leading and trailing underscores. Note: you can also use `print(dir(str))` to list the methods.

In [None]:
my_string = "Hello, World!"
print(dir(my_string))

In [None]:
print(dir(str))

Don't worry, we won't dig into all of these, just some of them. Specifically, we'll investigate methods related to modifying strings and methods used for counting, searching, and replacing parts of strings.

### Methods for Modifying Strings

Strings in *Python* are considered to be **immutable**. This means that they cannot be changed (mutated or modified) after they are created. If you want to "modify" a string, you need to create a copy that has your desired changes. This is essentially what happens with the methods that "modify" strings. The original string is left unchanged, but a new string is created with the modifications. This new string can "replace" the original by assigning the new string to the original variable name.
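A minimal sketch of what immutability looks like in practice, using an illustrative variable named `word`. Trying to change a character in place raises a `TypeError`, while a method call returns a new string.

```python
word = "parrot"

# Assigning to an index fails because strings cannot be changed in place.
assignment_failed = False
try:
    word[0] = "c"
except TypeError as error:
    assignment_failed = True
    print("TypeError:", error)

# A string method returns a new string and leaves the original alone.
shouted = word.upper()
print(word)     # parrot
print(shouted)  # PARROT

# Assigning the result to the original name "replaces" the original.
word = word.upper()
print(word)     # PARROT
```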

The first five of the six methods we will look at are associated with changing the case of specific letters in a string. Following is the list of all six methods. Their names essentially describe what they do. Of these, only the `.center()` method requires an argument: the length of the new string in which the original string will be centered. Get help on any string method using the expression `help(str.method_name)`; for example, `help(str.center)` will return help for the `.center()` method.

- `.lower()`
- `.upper()`
- `.title()`
- `.swapcase()`
- `.capitalize()`
- `.center(x)`
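Here is a short sketch of the six methods applied to a sample string. The string and the width of 20 are arbitrary choices for illustration, and `repr()` is used on the centered result so the padding spaces are visible.

```python
sample = "the LARCH tree"

print(sample.lower())             # the larch tree
print(sample.upper())             # THE LARCH TREE
print(sample.title())             # The Larch Tree
print(sample.swapcase())          # THE larch TREE
print(sample.capitalize())        # The larch tree
print(repr(sample.center(20)))    # '   the LARCH tree   '

# None of the calls above changed the original string.
print(sample)                     # the LARCH tree
```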

>**Practice it**
>
>Use each of the above string modification methods on `funny_quote` in the following code cells. For the `.center()` method, request a new length that is the original length + 12 (use the `len()` function to do this). After using all of the methods, print the string name to see what it looks like.

In [None]:
funny_quote         # All lowercase letters

In [None]:
funny_quote         # All uppercase letters

In [None]:
funny_quote         # First letter of first word only is capitalized

In [None]:
funny_quote         # Swap upper and lowercase in string

In [None]:
funny_quote         # Uppercase first letter in each word

In [None]:
funny_quote         # Center string, use string length + 12 characters

In [None]:
# use this cell to re-print the original string name

### Methods for Counting, Searching, and Replacing

You can count the number of times that a specific character or sub-string appears in a larger string by using the `.count()` method. This method requires at least one argument, the sub-string that you want counted (don't forget the enclosing quotes). It can also include two optional arguments that indicate where to start and end the count in the original string. For instance, `my_string.count("a")` will return the number of times `"a"` occurs in `my_string`. You can search starting from index position 2 and end just before index position 7 using `my_string.count("a", 2, 7)`. Searching from the 4th index to the end is done with `my_string.count("a", 4)`. You cannot specify only an ending index; to stop early while counting from the beginning, supply a start of `0` as well, as in `my_string.count("a", 0, 7)`.
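The `.count()` calls described above can be sketched on a sample phrase. The phrase and variable names are illustrative only.

```python
phrase = "a banana and a pear"

total_a = phrase.count("a")          # every "a" in the phrase
middle_a = phrase.count("a", 2, 7)   # index 2 up to, not including, index 7
late_a = phrase.count("a", 4)        # index 4 through the end
early_a = phrase.count("a", 0, 7)    # a start of 0 is needed to give only an end
pairs = phrase.count("an")           # sub-strings longer than one character work too

print(total_a, middle_a, late_a, early_a, pairs)  # 7 2 5 3 3
```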

The `.find()` method will look for a search string within a target string using the syntax `target_string.find(search_string)`. If the search string is found, the method will return the index from the target string that matches the search string's starting position. If the search string is not found in the target string, then a `-1` will be returned. If the search string occurs more than once, only the position of the first occurrence will be returned. This method also accepts optional starting and ending arguments. The command `my_string.find("the")` will return the index of the first occurrence of `"the"` in `my_string` (if found). The return value corresponds with the index position of the `"t"` in `"the"`.
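A brief sketch of `.find()` on a sample sentence (the sentence and names are made up for illustration), including the optional start and end arguments and the `-1` result for a failed search.

```python
text = "see the cat and the hat"

first_the = text.find("the")         # index of the "t" in the first "the"
second_the = text.find("the", 5)     # start looking at index 5
missing = text.find("dog")           # not present
limited = text.find("the", 5, 10)    # no "the" between index 5 and 10

print(first_the, second_the, missing, limited)  # 4 16 -1 -1
```

Because `-1` is also a valid index (the last character), it is worth checking the result before using it to index the string.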

The `.replace()` method will replace all occurrences of a search string with a new string within a target string. The method requires two arguments: the search string and the new string. It can also take a third optional argument for the number of occurrences to replace instead of all occurrences. For example, `my_string.replace('I', 'you', 2)` will replace only the first 2 occurrences of `'I'` with `'you'`. Keep in mind that the original string is not changed. A copy of the original with the replacement(s) will be created and can be assigned to the original variable name or a new variable name, like so: `my_string = my_string.replace('I', 'you', 2)`.
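The behavior of `.replace()` can be sketched as follows, using an illustrative sentence. Note that the search is case sensitive, so the lowercase `i` in `think` is left alone.

```python
line = "I think I know what I want"

all_replaced = line.replace("I", "you")
first_two = line.replace("I", "you", 2)

print(all_replaced)  # you think you know what you want
print(first_two)     # you think you know what I want
print(line)          # I think I know what I want (original is unchanged)

# Assign the result back to the same name to keep the change.
line = line.replace("I", "you", 2)
print(line)          # you think you know what I want
```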

>**Practice it**
>
>Using `funny_quote`, perform the following tasks.
>
>1. Find out how many times `"e"` appears in the entire string
>1. Find the number of times `"is"` appears in the entire string
>1. Find the number of times `"o"` appears starting with position 15
>1. Find the starting location of the sub-string `"the"`
>1. Find the position of `"x"` starting at index position 10
>1. Replace `"Spanish"` with `"Ferris"`
>1. Replace all occurrences of `"i"` with `"I"`

In [None]:
funny_quote       # How many 'e's?

In [None]:
funny_quote       # How many 'is' are there?

In [None]:
funny_quote       # How many 'o's starting with 15

In [None]:
funny_quote       # Location of the first 'the'

In [None]:
funny_quote       # Position of 'x' starting with 10

In [None]:
funny_quote       # Replace 'Spanish' with 'Ferris'

In [None]:
funny_quote       # Replace all 'i' with 'I'

### Other String Methods

If you look at the list of string methods that were found using `dir()` previously, you will see a number that start with `is`. All of these methods ask a question of the string or portion of a string on which they act. For instance, `my_string.isalpha()` asks if the entire string is composed of alphabetic characters only. If it is, then the method returns `True`. If not, `False` is returned. A portion of a string can be tested by slicing it. For example, `my_string[:3].islower()` asks if the first three characters of the string are all lower case. Below is a partial list of the `is` string methods.

- `.islower()` checks if all characters are lowercase
- `.isupper()` checks if all characters are uppercase
- `.isspace()` checks if all characters are whitespace (spaces, tabs, or line returns)
- `.istitle()` checks if each word starts with an uppercase letter and the rest are lowercase
- `.isalpha()` checks if all characters are alphabetic
- `.isalnum()` checks if all characters are either alphabetic or numeric
- `.isnumeric()` checks if all characters are numeric (the digits 0-9, plus some other Unicode numerals)
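A few of the `is` methods are sketched below on short sample strings chosen only for illustration. Each call returns either `True` or `False`.

```python
print("hello".islower())          # True
print("Hello".islower())          # False, the "H" is uppercase
print("HELLO".isupper())          # True
print(" \t\n".isspace())          # True, tabs and line returns are whitespace
print("Hello World".istitle())    # True
print("Hello world".istitle())    # False, "world" is not capitalized
print("abc".isalpha())            # True
print("abc1".isalpha())           # False, "1" is not alphabetic
print("abc1".isalnum())           # True
print("abc 1".isalnum())          # False, a space is neither letter nor number
print("2024".isnumeric())         # True
print("-7".isnumeric())           # False, the minus sign is not numeric

# Slicing lets a method test only part of a string.
sample = "Monty Python 1969"
print(sample[:5].isalpha())       # True, tests "Monty"
print(sample[-4:].isnumeric())    # True, tests "1969"
```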

>**Practice it**
>
>Test items 1-7 using `is` methods on portions of `funny_quote` and items 8-9 on strings containing numeric values
>
>1. Is the slice from `[0:6]` upper case?
>1. Is the slice from `[0:6]` title case?
>1. Is the slice from `[7:14]` lower case?
>1. Is index position `6` a space?
>1. Is index position `10` a space?
>1. Is the slice from `[0:6]` alphabetic?
>1. Is the slice from `[7:14]` alpha-numeric?
>1. Is the string `"42"` numeric?
>1. Is the string `"42.0"` numeric?

In [None]:
funny_quote     # Is 0 to 6 uppercase?

In [None]:
funny_quote     # Is 0 to 6 title case?

In [None]:
funny_quote     # Is 7 to 14 lowercase?

In [None]:
funny_quote     # Is index 6 a space?

In [None]:
funny_quote     # Is index 10 a space?

In [None]:
funny_quote     # Is 0 to 6 alphabetic?

In [None]:
funny_quote     # Is 7 to 14 alpha-numeric?

In [None]:
'42'            # Is the string '42' numeric?

In [None]:
'42.0'          # Is the string '42.0' numeric?

>**Wrap it up**
>
>Assign the string `"My name is Brian"` to a variable called `name` in the cell below this one and execute the cell. In the next cell, use the appropriate method to replace `"Brian"` with your first and last name and assign the new string back to `name`. In the last cell print `name`. Then execute the time and date stamp cell.
>
>Click on the **Save** button and then the **Close and halt** button when you are done. **This is an instructor-led assignment that must be completed before the end of the lab session in order to receive credit.**

In [None]:
from datetime import datetime
from pytz import timezone
print(datetime.now(timezone('US/Eastern')))