# Think Python

## Chapter 8 - Strings

The HTML version can be found [here](http://greenteapress.com/thinkpython2/html/thinkpython2009.html "Chpt 8").

### 8.1 A string is a sequence

*No notes.*

### 8.2 `len`

*Other languages would use syntax like this to find the last letter in a string:*



In [1]:
fruit = "banana"
last = fruit[len(fruit) - 1]
last

'a'

*But in Python, we can also use negative indexing:*

In [2]:
fruit[-1]

'a'
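
*A small sketch of how the two forms line up: for any `k` from 1 to `len(fruit)`, `fruit[-k]` and `fruit[len(fruit) - k]` name the same character. The names `pairs` and `all_match` are only for illustration.*

```python
fruit = "banana"

# Pair each negative index with its positive equivalent.
pairs = [(fruit[-k], fruit[len(fruit) - k]) for k in range(1, len(fruit) + 1)]

# Every pair should hold the same character.
all_match = all(a == b for a, b in pairs)
print(all_match)
```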

### 8.3 Traversal with a `for` loop

*As an exercise, write a function that takes a string as an argument and displays the letters backward, one per line.*

In [3]:
def backwards(string):
    """
    Prints string backwards, one letter per line.
    """
    backwards_length = -len(string)
    index = -1
    while index >= backwards_length:
        print(string[index])
        index -= 1

In [4]:
backwards("Snugglebunnies!")

!
s
e
i
n
n
u
b
e
l
g
g
u
n
S
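
*Since this section is about `for` loops, here is a possible alternative sketch that uses the built-in `reversed`, which the chapter has not introduced at this point. The name `backward_letters` is hypothetical; it returns the letters as well as printing them so the result is easy to check.*

```python
def backward_letters(string):
    """Print the letters of string backward, one per line,
    and return them as a list.
    """
    letters = []
    for letter in reversed(string):  # walks from the last character to the first
        print(letter)
        letters.append(letter)
    return letters


result = backward_letters("abc")
```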


*The following example shows how to use concatenation (string addition) and a for loop to generate an abecedarian series (that is, in alphabetical order). In Robert McCloskey’s book __Make Way for Ducklings__, the names of the ducklings are Jack, Kack, Lack, Mack, Nack, Ouack, Pack, and Quack. As an exercise, use a for loop to print their names correctly:*

In [5]:
prefixes = "JKLMNOPQ"
suffix = "ack"

for letter in prefixes:
    if letter == "O" or letter == "Q":
        print(letter + "u" + suffix)
    else:
        print(letter + suffix)
    

Jack
Kack
Lack
Mack
Nack
Ouack
Pack
Quack


### 8.4 String slices

*As in many other languages, the last number in the slice is excluded.*
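
*A short illustration of the excluded end index, using the same `fruit` string as earlier. The variable names are only for illustration.*

```python
fruit = "banana"

first_three = fruit[0:3]   # indices 0, 1, 2; index 3 is excluded
rest = fruit[3:]           # from index 3 to the end
print(first_three, rest)

# Because the end index is excluded, the two pieces rejoin cleanly,
# and the length of a slice is simply end - start.
rejoined = first_three + rest
middle_length = len(fruit[1:4])
```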

### 8.5 Strings are immutable

*No notes.*

### 8.6 Searching

*As an exercise, modify `find` so that it has a third parameter, the index in word where it should start looking.*



In [6]:
def find(word, letter, index):
    """Searches word for the first instance
    of letter, starting at index.
    """
    while index < len(word):
        if word[index] == letter:
            return index
        index += 1
    return -1

In [7]:
find("Snugglebunnies", "u", 0)

2

In [8]:
find("Snugglebunnies", "u", 6)

8

In [9]:
find("Snugglebunnies", "u", 10)

-1

### 8.7 Looping and counting

*As an exercise, encapsulate this code in a function named `count`, and generalize it so that it accepts the string and the letter as arguments.*


In [10]:
def count(string, letter):
    """
    Counts occurrences of letter in string.
    """
    total = 0  # named total so it does not shadow the function name
    for char in string:
        if char == letter:
            total += 1
    print(total)

In [11]:
count("Snugglebunnies", "u")

2


*Then rewrite the function so that instead of traversing the string, it uses the three-parameter version of `find` from the previous section.*

In [12]:
def find(word, letter, index):
    """Searches word for the first instance
    of letter, starting at index.
    """
    while index < len(word):
        if word[index] == letter:
            return index
        index += 1
    return -1

def count(string, letter):
    """
    Counts occurrences of letter in string.
    """
    total = 0  # named total so it does not shadow the function name
    index = find(string, letter, 0)
    while index != -1:
        total += 1
        index = find(string, letter, index + 1)
    print(total)

In [13]:
count("bananarama", "a")

5
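
*The same loop can be sketched with the built-in string method `find`, which also accepts a starting index. This variant returns the total instead of printing it, which makes it easier to test. The name `count_with_find` is hypothetical.*

```python
def count_with_find(string, letter):
    """Return how many times letter occurs in string."""
    total = 0
    index = string.find(letter)  # str.find returns -1 when nothing is found
    while index != -1:
        total += 1
        # Resume the search just past the previous match.
        index = string.find(letter, index + 1)
    return total


print(count_with_find("bananarama", "a"))
```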


### 8.8 String methods

*No notes.*

### 8.9 The `in` operator

*No notes.*

### 8.10 String comparison

*No notes.*

### 8.11 Debugging

*Starting with this diagram, run the program on paper, changing the values of `i` and `j` during each iteration. Find and fix the second error in this function.*





In [14]:
def is_reverse(word1, word2):
    if len(word1) != len(word2):
        return False
    
    i = 0
    j = len(word2) - 1
    
    while j > 0:
        if word1[i] != word2[j]:
            return False
        i += 1
        j -= 1
        
    return True

In [15]:
is_reverse("stop", "pots")

True

In [16]:
is_reverse("stop", "aots")

True

In [17]:
def is_reverse(word1, word2):
    if len(word1) != len(word2):
        return False
    
    i = 0
    j = len(word2) - 1
    
    # in the old version, the program stopped before
    # checking the last values.  I fixed that by 
    # changing `j > 0` to `j >= 0`
    
    while j >= 0:
        if word1[i] != word2[j]:
            return False
        i += 1
        j -= 1
        
    return True
            

In [18]:
is_reverse("stop", "pots")

True

In [19]:
is_reverse("stop", "aots")

False
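
*One way to gain some confidence in the fix is to compare it against an independent check. The sketch below uses the slice-reversal idiom from Exercise 3 further down; `is_reverse_slice` is a hypothetical name, not the book's function.*

```python
def is_reverse_slice(word1, word2):
    """Cross-check: True if word2 is word1 spelled backward."""
    return word1 == word2[::-1]


cases = [("stop", "pots"), ("stop", "aots"), ("stop", "pot"), ("", "")]
results = [is_reverse_slice(a, b) for a, b in cases]
print(results)  # [True, False, False, True]
```

*The second case is the one the original loop got wrong, because it stopped before comparing `word1[3]` with `word2[0]`.*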

### 8.12 Glossary

*No notes.*

#### Exercise 1  

*Read the documentation of the string methods at http://docs.python.org/3/library/stdtypes.html#string-methods. You might want to experiment with some of them to make sure you understand how they work. `strip` and `replace` are particularly useful.*

*The documentation uses a syntax that might be confusing. For example, in `find(sub[, start[, end]])`, the brackets indicate optional arguments. So `sub` is required, but `start` is optional, and if you include `start`, then `end` is optional.*
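
*To make the bracket notation concrete, here are the three ways `str.find` can be called, using the same word as in Section 8.6. The variable names are only for illustration.*

```python
word = "Snugglebunnies"

whole = word.find("u")          # only sub: search the whole string
from_six = word.find("u", 6)    # sub and start: begin looking at index 6
window = word.find("u", 3, 8)   # sub, start and end: search indices 3 to 7

print(whole, from_six, window)  # 2 8 -1
```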

#### Exercise 2  

*There is a string method called `count` that is similar to the function in Section 8.7. Read the documentation of this method and write an invocation that counts the number of `a`’s in `'banana'`.*

In [20]:
fruit = "banana"
fruit.count("a")

3

#### Exercise 3  

*A string slice can take a third index that specifies the “step size”; that is, the number of spaces between successive characters. A step size of 2 means every other character; 3 means every third, etc.*

```{Python}
>>> fruit = 'banana'
>>> fruit[0:5:2]
'bnn'
```

*A step size of -1 goes through the word backwards, so the slice [::-1] generates a reversed string.*

*Use this idiom to write a one-line version of `is_palindrome` from Exercise 3 of Chapter 6.*

In [21]:
def is_palindrome(word):
    """
    Returns True if word is palindromic.
    """
    return word == word[::-1]

In [22]:
is_palindrome("madam")

True

In [23]:
is_palindrome("racecars")

False

#### Exercise 4  

*The following functions are all intended to check whether a string contains any lowercase letters, but at least some of them are wrong. For each function, describe what the function actually does (assuming that the parameter is a string).*

In [24]:
def any_lowercase1(s):
    for c in s:
        if c.islower():
            return True
        else:
            return False

In [25]:
any_lowercase1("Snafu")

False

In [26]:
any_lowercase1("sNAFU")

True

*The first function only looks at the first character in the string: both branches of the `if` statement return, so the loop never reaches a second iteration.*

In [27]:
def any_lowercase2(s):
    for c in s:
        if 'c'.islower():
            return 'True'
        else:
            return 'False'

In [28]:
any_lowercase2("Snafu")

'True'

In [29]:
any_lowercase2("SNAFU")

'True'

*This function doesn't test the characters in the string `s`, but rather the literal string `'c'`. Since `'c'` is lowercase, the test always succeeds. Further, the function returns the string `'True'`, and not the boolean value `True`.*

In [30]:
def any_lowercase3(s):
    for c in s:
        flag = c.islower()
    return flag

In [31]:
any_lowercase3("SNAFu")

True

In [32]:
any_lowercase3("snafU")

False

*This function overwrites `flag` on every iteration and returns only its final value, so in effect it's only looking at the last character in the string.*

In [33]:
def any_lowercase4(s):
    flag = False
    for c in s:
        flag = flag or c.islower()
    return flag

In [34]:
any_lowercase4("Snafu")

True

In [35]:
any_lowercase4("SNaFU")

True

In [36]:
any_lowercase4("SNAFU")

False

*This function iterates, updating the value of `flag` each time; but since it uses a boolean `or`, if even one character in the string is lowercase, the function will return `True`.  I.e., this function actually works.*

In [37]:
def any_lowercase5(s):
    for c in s:
        if not c.islower():
            return False
    return True

In [38]:
any_lowercase5("Snafu")

False

In [39]:
any_lowercase5("sNAFU")

False

In [40]:
any_lowercase5("snafU")

False

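*This function iterates through the string and returns `False` as soon as it finds a character that is not a lowercase letter (an uppercase letter, a digit, punctuation, and so on). In effect it checks whether **all** characters are lowercase, not whether any of them are.*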
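
*For comparison, the two behaviours seen above ("any character is lowercase" and "every character is lowercase") can be sketched with the built-ins `any` and `all`. The function names are hypothetical, and this is just one possible spelling, not the book's answer.*

```python
def any_lowercase(s):
    """True if at least one character of s is a lowercase letter."""
    return any(c.islower() for c in s)


def all_lowercase(s):
    """True if every character of s is a lowercase letter.

    Mirrors any_lowercase5 above, including returning True for "".
    """
    return all(c.islower() for c in s)


print(any_lowercase("SNaFU"), all_lowercase("SNaFU"))  # True False
```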

#### Exercise 5  

*A Caesar cypher is a weak form of encryption that involves “rotating” each letter by a fixed number of places. To rotate a letter means to shift it through the alphabet, wrapping around to the beginning if necessary, so ’A’ rotated by 3 is ’D’ and ’Z’ rotated by 1 is ’A’.*

*To rotate a word, rotate each letter by the same amount. For example, “cheer” rotated by 7 is “jolly” and “melon” rotated by -10 is “cubed”. In the movie 2001: A Space Odyssey, the ship computer is called HAL, which is IBM rotated by -1.*

*Write a function called `rotate_word` that takes a string and an integer as parameters, and returns a new string that contains the letters from the original string rotated by the given amount.*

*You might want to use the built-in function `ord`, which converts a character to a numeric code, and `chr`, which converts numeric codes to characters. Letters of the alphabet are encoded in alphabetical order, so for example:*

```{Python}
>>> ord('c') - ord('a')
2
```

*Because 'c' is the two-eth letter of the alphabet. But beware: the numeric codes for upper case letters are different.*

*Potentially offensive jokes on the Internet are sometimes encoded in ROT13, which is a Caesar cypher with rotation 13. If you are not easily offended, find and decode some of them. Solution: http://thinkpython2.com/code/rotate.py.*
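
*Before writing the full function, it can help to try `ord` and `chr` on single letters. The sketch below shows the different codes for upper and lower case, and how the modulus operator handles wrapping past the end of the alphabet. The variable names are only for illustration.*

```python
lower_a, upper_a = ord("a"), ord("A")
print(lower_a, upper_a)  # 97 65

# Position of 'c' within the lowercase alphabet (0-based).
offset = ord("c") - ord("a")

# Rotate 'c' by 3 places: shift the offset, then convert back.
rotated = chr(ord("a") + (offset + 3) % 26)

# Rotate 'Z' by 1 place: the % 26 wraps position 26 back to 0.
wrapped = chr(ord("A") + (ord("Z") - ord("A") + 1) % 26)

print(rotated, wrapped)  # f A
```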

In [41]:
def wraparound(c, lower_limit, upper_limit, i):
    """Rotates a letter by i places.
    Maintains case of letter.
    c: char
    lower_limit: ASCII start of lower/uppercase letters
    upper_limit: ASCII end of lower/uppercase letters
    i: spaces to be rotated
    
    Returns: rotated character
    """
    # i % 26 is always in 0..25, so rotations larger than one
    # alphabet (e.g. 30 or -42) need at most one wraparound.
    new_c = ord(c) + i % 26
    if new_c > upper_limit:
        new_c -= 26
    return chr(new_c)

def rotate_word(s, i):
    """Rotates an alphabetic string by i places
    
    s: word
    i: int
    
    Returns: rotated string
    """
    new_s = ""
    for c in s:
        if 65 <= ord(c) <= 90:
            new_c = wraparound(c, 65, 90, i)
        elif 97 <= ord(c) <= 122:
            new_c = wraparound(c, 97, 122, i)
        else:
            print("String contains non-alphabetic character.")
            return
        new_s += new_c
    return new_s

In [42]:
rotate_word("PIZZA", -16)

'ZSJJK'

*The author's solution is more elegant, but perhaps not as self-evident (in my solution the ASCII values are more explicit):*

In [43]:
def rotate_letter(letter, n):
    """Rotates a letter by n places.  Does not change other chars.
    
    letter: single-letter string
    n: int
    
    Returns: single-letter string
    """
    if letter.isupper():
        start = ord('A')
    elif letter.islower():
        start = ord('a')
    else:
        return letter
    
    c = ord(letter) - start
    i = (c + n) % 26 + start
    return chr(i)


def rotate_word(word, n):
    """Rotates a word by n places.
    
    word: string
    n: integer
    
    Returns: string
    """
    
    res = ""
    for letter in word:
        res += rotate_letter(letter, n)
    return res

In [44]:
rotate_word("PIZZA", 10)

'ZSJJK'
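
*As a final check, the examples from the exercise text can be verified with a compact rotation function, and rotation by 13 can be compared against the `rot_13` codec in the standard library's `codecs` module. The function `rot` below is a hypothetical condensed version written for this check, not the author's solution; it assumes only ASCII letters should be rotated.*

```python
import codecs


def rot(text, n):
    """Rotate each ASCII letter in text by n places; leave other chars alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            start = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - start + n) % 26 + start))
        else:
            out.append(ch)
    return "".join(out)


print(rot("cheer", 7), rot("melon", -10), rot("IBM", -1))  # jolly cubed HAL

# ROT13 is its own inverse: applying it twice restores the text.
message = "Hello, World!"
round_trip = rot(rot(message, 13), 13)
matches_codec = rot(message, 13) == codecs.encode(message, "rot_13")
```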