# Strings 

Materials adapted from *[How to Think Like a Computer Scientist](https://runestone.academy/runestone/static/thinkcspy/index.html)* 

This colab notebook is paired with the page on Canvas: **6-Strings**

So far we have seen built-in types like: ``int``, ``float``, ``bool``, ``str`` and we've seen lists.  ``int``, ``float``, and ``bool`` are considered to be simple or primitive data types because their values are not composed of any smaller parts.  They cannot be broken down. On the other hand, strings and lists are different from the others because they are made up of smaller pieces.  In the case of strings, they are made up of smaller strings each containing one **character**.  

Types that are comprised of smaller pieces are called **collection data types**. Depending on what we are doing, we may want to treat a collection data type as a single entity (the whole), or we may want to access its parts. This ambiguity is useful.

Strings can be defined as sequential collections of characters.  This means that the individual characters that make up the string are assumed to be in a particular order from left to right.

A string that contains no characters, often referred to as the **empty string**, is still considered to be a string.  It is simply a sequence of zero characters and is represented by '' or "" (two single or two double quotes with nothing in between).

## Operations on Strings 

In general, you cannot perform mathematical operations on strings, even if the strings look like numbers. The following are illegal (assuming that ``message`` has type string):

```{python}
message - 1
"Hello" / 123
message * "Hello"
"15" + 2
``` 
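If you meant to do arithmetic, one option is to convert the string to a number first; if you meant to build text, convert the number to a string instead. A minimal sketch (the names `digits`, `total`, and `label` are chosen just for this illustration):

```python
# Illustrative names; they are not used elsewhere in this notebook.
digits = "15"

# Convert the string to an int to do arithmetic.
total = int(digits) + 2
print(total)

# Convert the number to a string to join the two as text.
label = digits + str(2)
print(label)
```

The first result is a number and the second is a string, so the choice of conversion decides what the operator means.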

Interestingly, the ``+`` operator does work with strings, but for strings, the ``+`` operator represents **concatenation**, not addition.  Concatenation means joining the two operands by linking them end-to-end. For example:

In [None]:
fruit = "banana"
bakedGood = " nut bread"
print(fruit + bakedGood)

The output of this program is `"banana nut bread"`. The space before the word `"nut"` is part of the string and is necessary to produce the space between the concatenated strings. Take out the space and run it again.

The `*` operator also works on strings. It performs repetition. For example, `'Fun'*3` is `'FunFunFun'`. One of the operands has to be a string and the other has to be an integer.

In [None]:
print("Go" * 6)

name = "Packers"
print(name * 3)

print(name + "Go" * 3)

print((name + "Go") * 3)

This interpretation of ``+`` and ``*`` makes sense by analogy with addition and multiplication. Just as ``4*3`` is equivalent to ``4+4+4``, we expect ``"Go"*3`` to be the same as ``"Go"+"Go"+"Go"``, and it is.  Note also in the last example that the order of operations for ``*`` and ``+`` is the same as it was for arithmetic. The repetition is done before the concatenation.  If you want to cause the concatenation to be done first, you will need to use parentheses.
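The analogy with arithmetic can be checked directly with ``==``. In the short sketch below, the names `team`, `without_parens`, and `with_parens` are chosen only for this illustration.

```python
team = "Packers"

# Repetition behaves like repeated concatenation.
print("Go" * 3 == "Go" + "Go" + "Go")

# * is applied before +, so these two expressions give different strings.
without_parens = team + "Go" * 3
with_parens = (team + "Go") * 3
print(without_parens)
print(with_parens)
```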

### <a name="exer1"></a>Exercise 1

What is printed by the following statements?

```{python}
s = "python"
t = "rocks"
print(s + t)
```

* A. python rocks 
* B. python 
* C. pythonrocks
* D. Error, you can not add two strings together. 

[exercise 1 answers](#ans1)

### <a name="exer2"></a>Exercise 2 

What is printed by the following statements?

```python
s = "python"
excl = "!"
print(s+excl*3)
```

* A. python!!! 
* B. python!python!python! 
* C. pythonpythonpython! 
* D. Error, you cannot perform concatenation and repetition at the same time.

[exercise 2 answers](#ans2)

## Index Operator: Working with the Characters of a String 

The **indexing operator** (Python uses square brackets to enclose the index)  selects a single character from a string.  The characters are accessed by their position or  index value.  For example, in the string shown below, the 14 characters are indexed left to right from position 0 to position 13.  

<img src="https://pages.mtu.edu/~lebrown/CADeT/Intro2Python/indexvalues.png">

It is also the case that the positions are named from right to left using negative numbers where -1 is the rightmost index and so on. Note that the character at index 6 (or -8) is the blank character.


In [None]:
school = "Luther College"
m = school[2]
print(m)

lastchar = school[-1]
print(lastchar)



The expression ``school[2]`` selects the character at index 2 from ``school``, and creates a new string containing just this one character. The variable ``m`` refers to the result. 

Remember that computer scientists often start counting from zero. The letter at index zero of ``"Luther College"`` is ``L``.  So at position ``[2]`` we have the letter ``t``.

If you want the zero-eth letter of a string, you just put 0, or any expression with the value 0, in the brackets.  Give it a try.

The expression in brackets is called an **index**. An index specifies a member of an ordered collection.  In this case the collection of characters in the string. The index *indicates* which character you want. It can be any integer expression so long as it evaluates to a valid index value.

Note that indexing returns a *string* --- Python has no special type for a single character. It is just a string of length 1.
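One way to convince yourself of this is to ask Python for the type and length of an indexed character. This is only a quick check, and the name `letter` is chosen for the illustration.

```python
school = "Luther College"
letter = school[2]

print(type(letter))       # the result is a str, not a separate character type
print(len(letter))        # its length is 1
print(school[6] == " ")   # index 6 holds the blank character
```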


### <a name="exer3"></a>Exercise 3 

 What is printed by the following statements?

 * A. t 
 * B. h 
 * C. c 
 * D. Error, you can not use the [] operator with a string. 

[exercise 3 answers](#ans3)

### <a name="exer4"></a> Exercise 4 

What is printed by the following statements?

```{python}
s = "python rocks"
print(s[2] + s[-5])
```

* A. tr 
* B. ps 
* C. nn 
* D. Error, you can not use the [] operator with the + operator. 

[exercise 4 answers](#ans4)

## String Methods 

We previously saw that each turtle instance has its own attributes and a number of methods that can be applied to the instance.  For example, we wrote ``tess.right(90)`` when we wanted the turtle object ``tess`` to perform the ``right`` method to turn to the right 90 degrees.  The "dot notation" is the way we connect the name of an object to the name of a method it can perform.

Strings are also objects.  Each string instance has its own attributes and methods.  The most important attribute of the string is the collection of characters.  There are a wide variety of methods.  Try the following program.


In [None]:
ss = "Hello, World"
print(ss.upper())

tt = ss.lower()
print(tt)

In this example, ``upper`` is a method that can be invoked on any string object to create a new string in which all the characters are in uppercase.  ``lower`` works in a similar fashion changing all characters in the string to lowercase.  (The original string ``ss`` remains unchanged.  A new string ``tt`` is created.)

In addition to ``upper`` and ``lower``, the following table provides a summary of some other useful string methods.

| Method     | Parameters         | Description | 
|------------|--------------------|-------------|
| upper      | none               | Returns a string in all uppercase |
| lower      | none               | Returns a string in all lowercase |
| capitalize | none               | Returns a string with first character capitalized, the rest lower |
| strip      | none               | Returns a string with the leading and trailing whitespace removed |
| lstrip     | none               | Returns a string with the leading whitespace removed |
| rstrip     | none               | Returns a string with the trailing whitespace removed |
| count      | item               | Returns the number of occurrences of item |
| replace    | old, new           | Returns a string with all occurrences of old substring replaced by new |
| center     | width              | Returns a string centered in a field of width spaces |
| ljust      | width              | Returns a string left justified in a field of width spaces |
| rjust      | width              | Returns a string right justified in a field of width spaces |
| find       | item               | Returns the leftmost index where the substring item is found, or -1 if not found |
| rfind      | item               | Returns the rightmost index where the substring item is found, or -1 if not found |
| index      | item               | Like find except causes a runtime error if item is not found |
| rindex     | item               | Like rfind except causes a runtime error if item is not found |
| format     | substitutions      | Involved! See [documentation](https://docs.python.org/3/library/stdtypes.html#string-methods) |

You should experiment with these methods so that you understand what they do.  Note once again that the methods that return strings do not change the original.  You can also consult the [Python documentation for strings](https://docs.python.org/3/library/stdtypes.html#string-methods).

In [None]:
ss = "    Hello, World    "

els = ss.count("l")
print(els)

print("***" + ss.strip() + "***")
print("***" + ss.lstrip() + "***")
print("***" + ss.rstrip() + "***")

news = ss.replace("o", "***")
print(news)

In [None]:
food = "banana bread"
print(food.capitalize())

print("*" + food.center(25) + "*")
print("*" + food.ljust(25) + "*")     # stars added to show bounds
print("*" + food.rjust(25) + "*")

print(food.find("e"))
print(food.find("na"))
print(food.find("b"))

print(food.rfind("e"))
print(food.rfind("na"))
print(food.rfind("b"))
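The table describes ``index`` as being like ``find`` except for what happens when the substring is missing. The sketch below shows that difference. It assumes you are comfortable with ``try``/``except``; if not, the point is simply that ``index`` stops with a ``ValueError`` where ``find`` returns ``-1``. The names `missing` and `raised` are illustrative.

```python
food = "banana bread"

# find signals "not found" by returning -1.
missing = food.find("z")
print(missing)

# index signals "not found" by raising ValueError.
try:
    food.index("z")
    raised = False
except ValueError:
    raised = True
print(raised)

# When the substring is present, the two methods agree.
print(food.find("na") == food.index("na"))
```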

### <a name="exer5"></a>Exercise 5 

What is printed by the following statements?

* A. 0 
* B. 2 
* C. 3 

[exercise 5 answers](#ans5)

### <a name="exer6"></a> Exercise 6 

What is printed by the following statements?

* A. yyyyy 
* B. 55555
* C. n 
* D. Error, you can not combine all those things together. 

[exercise 6 answers](#ans6)

### String Format Method 

In grade school quizzes a common convention is to use fill-in-the blanks. For instance,

>Hello _____!


and you can fill in the name of the person greeted, and combine given text with a chosen insertion. *We use this as an analogy:* Python has a similar
construction, better called fill-in-the-braces. The string method ``format`` makes substitutions into places in a string enclosed in braces. Run this code:


In [None]:
person = input('Your name: ')
greeting = 'Hello {}!'.format(person)
print(greeting)

There are several new ideas here!

The string for the ``format`` method has a special form, with braces embedded. Such a string is called a *format string*.  Places where braces are embedded are replaced by the value of an expression taken from the parameter list for the ``format`` method. There are many variations on the syntax between the braces. In this case we use the syntax where the first (and only) location in the string with braces has a substitution made from the first (and only) parameter.

In the code above, this new string is assigned to the identifier ``greeting``, and then the string is printed.

The identifier ``greeting`` was introduced to break the operations into a clearer sequence of steps. However, since the value of ``greeting`` is only referenced once, it can be eliminated with the more concise version:

In [None]:
person = input('Enter your name: ')
print('Hello {}!'.format(person))


There can be multiple substitutions, with data of any type.
Next we use floats.  Try original price $2.50  with a 7% discount:

In [None]:
origPrice = float(input('Enter the original price: $'))
discount = float(input('Enter discount percentage: '))
newPrice = (1 - discount/100)*origPrice
calculation = '${} discounted by {}% is ${}.'.format(origPrice, discount, newPrice)
print(calculation)

The parameters are inserted into the braces in order.

If you used the data suggested, this result is not satisfying. Prices should appear with exactly two places beyond the decimal point, but that is not the default way to display floats.

Format strings can give further information inside the braces showing how to specially format data. In particular floats can be shown with a specific number of decimal places. For two decimal places, put ``:.2f`` inside the braces for the monetary values:

In [None]:
origPrice = float(input('Enter the original price: $'))
discount = float(input('Enter discount percentage: '))
newPrice = (1 - discount/100)*origPrice
calculation = '${:.2f} discounted by {}% is ${:.2f}.'.format(origPrice, discount, newPrice)
print(calculation)

The 2 in the format modifier can be replaced by another integer to round to that specified number of digits.
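For instance, here is the same float shown with different numbers of decimal places. The value and the name `three_places` are chosen only for this illustration.

```python
value = 3.14159

print('{:.0f}'.format(value))   # no digits after the decimal point
print('{:.1f}'.format(value))   # one digit
print('{:.3f}'.format(value))   # three digits

# The result of format is a string, so it can be stored like any other string.
three_places = '{:.3f}'.format(value)
print(type(three_places))
```

Note that formatting only changes how the number is displayed; the float stored in ``value`` is untouched.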

This kind of format string depends directly on the order of the parameters to the format method. There are other approaches that we will skip here, such as explicitly numbering substitutions and taking substitutions from a dictionary.






### <a name="exer7"></a>Exercise 7 

What is printed by the following statements?

```python 
x = 2
y = 6
print('sum of {} and {} is {}; product: {}.'.format( x, y, x+y, x*y))
```

* A.  Nothing - it causes an error
* B. sum of {} and {} is {}; product: {}. 2 6 8 12 
* C. sum of 2 and 6 is 8; product: 12. 
* D. sum of {2} and {6} is {8}; product: {12}. 

[exercise 7 answers](#ans7)

## Length 

The ``len`` function, when applied to a string, returns the number of characters in a string.


In [None]:
fruit = "Banana"
print(len(fruit))



To get the last letter of a string, you might be tempted to try something like this:

In [None]:
fruit = "Banana"
sz = len(fruit)
last = fruit[sz]       # ERROR!
print(last)

That won't work. It causes the runtime error ``IndexError: string index out of range``. The reason is that there is no letter at index position 6 in ``"Banana"``.  Since we started counting at zero, the six indexes are numbered 0 to 5. To get the last character, we have to subtract 1 from the length.  Give it a try in the example above.

In [None]:
fruit = "Banana"
sz = len(fruit)
lastch = fruit[sz-1]
print(lastch)

Alternatively in Python, we can use **negative indices**, which count backward from the end of the string. The expression ``fruit[-1]`` yields the last letter, ``fruit[-2]`` yields the second to last, and so on.  Try it!   Most other languages do *not* allow the negative indices, but they are a handy feature of Python!
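The two ways of reaching the end of a string can be compared side by side. A small sketch, with `first` as an illustrative name:

```python
fruit = "Banana"

# For a valid negative index -k, fruit[-k] matches fruit[len(fruit) - k].
print(fruit[-1] == fruit[len(fruit) - 1])
print(fruit[-2] == fruit[len(fruit) - 2])

# Negative indices have a limit too: -len(fruit) is the first character.
first = fruit[-len(fruit)]
print(first)
```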

### <a name="exer8"></a> Exercise 8 

What is printed by the following statements?

```python 
s = "python rocks"
print(s[len(s)-5])
```

* A. o 
* B. r 
* C. s 
* D. Error, len(s) is 12 and there is no index 12. 

[exercise 8 answers](#ans8)

## The Slice Operator 

A substring of a string is called a **slice**. Selecting a slice is similar to selecting a character:



In [None]:
singers = "Peter, Paul, and Mary"
print(singers[0:5])
print(singers[7:11])
print(singers[17:21])

The `slice` operator ``[n:m]`` returns the part of the string from the n'th character to the m'th character, including the first but excluding the last. In other words,  start with the character at index n and go up to but do not include the character at index m. This behavior may seem counter-intuitive but if you recall the ``range`` function, it did not include its end point either.

If you omit the first index (before the colon), the slice starts at the beginning of the string. If you omit the second index, the slice goes to the end of the string.
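Two consequences of these rules are worth checking for yourself. The sketch below uses illustrative names `n`, `m`, and `piece`.

```python
singers = "Peter, Paul, and Mary"
n = 7
m = 11

piece = singers[n:m]
print(piece)

# Because index m is excluded, the slice has m - n characters.
print(len(piece) == m - n)

# Splitting at an index and joining the two halves gives back the original.
print(singers[:n] + singers[n:] == singers)
```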

There is no Index Out Of Range exception for a slice.  A slice is forgiving and shifts any offending index to something legal. 



In [None]:
fruit = "banana"
print(fruit[:3])
print(fruit[3:])
print(fruit[3:-10])
print(fruit[3:99])

What do you think `fruit[:]` means?

## String Comparison 

The comparison operators also work on strings. To see if two strings are equal you simply write a boolean expression using the equality operator.

In [None]:
word = "banana"
if word == "banana":
    print("Yes, we have bananas!")
else:
    print("Yes, we have NO bananas!")

Other comparison operations are useful for putting words in
[lexicographical order](http://en.wikipedia.org/wiki/Lexicographic_order).
This is similar to the alphabetical order you would use with a dictionary,
except that all the uppercase letters come before all the lowercase letters.

In [None]:
word = "zebra"

if word < "banana":
    print("Your word, " + word + ", comes before banana.")
elif word > "banana":
    print("Your word, " + word + ", comes after banana.")
else:
    print("Yes, we have no bananas!")

It is probably clear to you that the word `apple` would be less than (come before) the word ``banana``. After all, `a` is before `b` in the alphabet.  But what if we consider the words ``apple`` and ``Apple``? Are they the same? 

In [None]:
print("apple" < "banana")

print("apple" == "Apple")
print("apple" < "Apple")

It turns out, as you recall from our discussion of variable names, that uppercase and lowercase letters are considered to be different from one another.  The way the computer knows they are different is that each character is assigned a unique integer value.  "A" is 65, "B" is 66, and "5" is 53.  The way you can find out the so-called **ordinal value** for a given character is to use a character function called ``ord``.

In [None]:
print(ord("A"))
print(ord("B"))
print(ord("5"))

print(ord("a"))
print("apple" > "Apple")

When you compare characters or strings to one another, Python converts the characters into their equivalent ordinal values and compares the integers from left to right.  As you can see from the example above, "a" is greater than "A" so "apple" is greater than "Apple".
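The same rule explains some comparisons that look surprising at first. A brief sketch (the example words are our own):

```python
# Ordinal values decide the order: "Z" has a smaller value than "a".
print(ord("Z"), ord("a"))
print("Zebra" < "apple")

# When the leading characters match, the comparison moves on to the
# first position where the two strings differ ("e" versus "y" here).
print("apple" < "apply")
```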

Humans commonly ignore capitalization when comparing two words.  However, computers do not.  A common way to address this issue is to convert strings to a standard format, such as all lowercase, before performing the comparison. 
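Here is a minimal sketch of that idea using the ``lower`` method from earlier in this notebook. The names `first` and `second` are illustrative.

```python
first = "apple"
second = "Apple"

print(first == second)                   # case matters here
print(first.lower() == second.lower())   # case is ignored here

# The same idea works for ordering.
print("Zebra".lower() < "apple".lower())
```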

There is also a similar function called ``chr`` that converts integers into their character equivalent.

In [None]:
print(chr(65))
print(chr(66))

print(chr(49))
print(chr(53))

print("The character for 32 is", chr(32), "!!!")
print(ord(" "))

## Strings are Immutable 

One final thing that makes strings different from some other Python collection types is that you are not allowed to modify the individual characters in the collection.  It is tempting to use the ``[]`` operator on the left side of an assignment, with the intention of changing a character in a string.  For example, in the following code, we would like to change the first letter of ``greeting``.


In [None]:
greeting = "Hello, world!"
greeting[0] = 'J'            # ERROR!
print(greeting)

Instead of producing the output ``Jello, world!``, this code produces the runtime error ``TypeError: 'str' object does not support item assignment``.

Strings are **immutable**, which means you cannot change an existing string. The best you can do is create a new string that is a variation on the original.




In [None]:
greeting = "Hello, world!"
newGreeting = 'J' + greeting[1:]
print(newGreeting)
print(greeting)   

The solution here is to concatenate a new first letter onto a slice of ``greeting``. This operation has no effect on the original string.
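String methods follow the same rule: they hand back a new string and leave the original alone. A short sketch, with `shouted` and `jello` as illustrative names:

```python
greeting = "Hello, world!"

shouted = greeting.upper()
jello = greeting.replace("H", "J")

print(shouted)
print(jello)
print(greeting)   # still the original text
```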

## Traversal and the `for` Loop: By Item 

A lot of computations involve processing a collection one item at a time.  For strings this means that we would like to process one character at a time. Often we start at the beginning, select each character in turn, do something to it, and continue until the end. This pattern of processing is called a **traversal**.

We have previously seen that the ``for`` statement can iterate over the items of a sequence (a list of names in the case below).


In [None]:
for aname in ["Joe", "Amy", "Brad", "Angelina", "Zuki", "Thandi", "Paris"]:
    invitation = "Hi " + aname + ".  Please come to my party on Saturday!"
    print(invitation)

Recall that the loop variable takes on each value in the sequence of names.  The body is performed once for each name.  The same was true for the sequence of integers created by the ``range`` function.

In [None]:
for avalue in range(10):
    print(avalue)

Since a string is simply a sequence of characters, the ``for`` loop iterates over each character automatically.


In [None]:
for achar in "Go Spot Go":
    print(achar)

The loop variable ``achar`` is automatically reassigned each character in the string "Go Spot Go". We will refer to this type of sequence iteration as **iteration by item**.   Note that it is only possible to process the characters one at a time from left to right.

### <a name="exer9"></a> Exercise 9 

How many times is the word HELLO printed by the following statements?

```python
s = "python rocks"
for ch in s:
    print("HELLO")
``` 

* A. 10 
* B. 11 
* C. 12 
* D. Error, the for statement needs to use the range function. 

[exercise 9 answer](#ans9)

## Traversal and the `for` Loop: By Index 

It is also possible to use the ``range`` function to systematically generate the indices of the characters.  The ``for`` loop can then be used to iterate over these positions.  These positions can be used together with the indexing operator to access the individual characters in the string.

In [None]:
fruit = "apple"
for idx in range(5):
    currentChar = fruit[idx]
    print(currentChar)

The index positions in "apple" are 0,1,2,3 and 4.  This is exactly the same sequence of integers returned by ``range(5)``.  The first time through the for loop, ``idx`` will be 0 and the "a" will be printed.  Then, ``idx`` will be reassigned to 1 and "p" will be displayed.  This will repeat for all the range values up to but not including 5.  Since "e" has index 4, this will be exactly right to show all  of the characters.

In order to make the iteration more general, we can use the ``len`` function to provide the bound for ``range``.  This is a very common pattern for traversing any sequence by position.    Make sure you understand why the range function behaves correctly when using ``len`` of the string as its parameter value.

In [None]:
fruit = "apple"
for idx in range(len(fruit)):
    print(fruit[idx])

You may also note that iteration by position allows the programmer to control the direction of the traversal by changing the sequence of index values.  Recall that we can create ranges that count down as  well as up so the following code will print the characters from right to left.

In [None]:
fruit = "apple"
for idx in range(len(fruit)-1, -1, -1):
    print(fruit[idx])

## Traversal and the `while` Loop 

The ``while`` loop can also control the generation of the index values.  Remember that the programmer is responsible for setting up the initial condition, making sure that the condition is correct, and making sure that something changes inside the body to guarantee that the condition will eventually fail.

In [None]:
fruit = "apple"

position = 0
while position < len(fruit):
    print(fruit[position])
    position = position + 1

The loop condition is ``position < len(fruit)``, so when ``position`` is equal to the length of the string, the condition is false, and the body of the loop is not executed. The last character accessed is the one with the index ``len(fruit)-1``, which is the last character in the string.


## The `in` and `not in` operators 

The ``in`` operator tests if one string is a substring of another:



In [None]:
print('p' in 'apple')
print('i' in 'apple')
print('ap' in 'apple')
print('pa' in 'apple')

Note that a string is a substring of itself, and the empty string is a  substring of any other string. (Also note that computer scientists  like to think about these edge cases quite carefully!) 


In [None]:
print('a' in 'a')
print('apple' in 'apple')
print('' in 'a')
print('' in 'apple')

The ``not in`` operator returns the logical opposite result of ``in``.

In [None]:
print('x' not in 'apple')

## The Accumulator Pattern with Strings

Combining the ``in`` operator with string concatenation using ``+`` and the accumulator pattern, we can write a function that removes all the vowels from a string.  The idea is to start with a string and iterate over each character, checking to see if the character is a vowel.  As we process the characters, we will build up a new string consisting of only the nonvowel characters.  To do this, we use the accumulator pattern.

Remember that the accumulator pattern allows us to keep a "running total".  With strings, we are not accumulating a numeric total.  Instead we are accumulating characters onto a string.

In [None]:
def removeVowels(s):
    vowels = "aeiouAEIOU"
    sWithoutVowels = ""
    for eachChar in s:
        if eachChar not in vowels:
            sWithoutVowels = sWithoutVowels + eachChar
    return sWithoutVowels

print(removeVowels("compsci"))
print(removeVowels("aAbEefIijOopUus"))

Line 5 uses the ``not in`` operator to check whether the current character is not in the string ``vowels``. The alternative to using this operator would be to write a very large ``if`` statement that checks each of the individual vowel characters.  Note we would need to use logical ``and`` to be sure that the character is not any of the vowels.

```python 
if (eachChar != 'a' and eachChar != 'e' and eachChar != 'i' and
        eachChar != 'o' and eachChar != 'u' and eachChar != 'A' and
        eachChar != 'E' and eachChar != 'I' and eachChar != 'O' and
        eachChar != 'U'):
    # the parentheses let the condition span several lines
    sWithoutVowels = sWithoutVowels + eachChar
```

Look carefully at line 6 in the above program (``sWithoutVowels = sWithoutVowels + eachChar``).  We will do this for every character that is not a vowel.  This should look very familiar.  As we were describing earlier, it is an example of the accumulator pattern, this time using a string to "accumulate" the final result. In words it says that the new value of ``sWithoutVowels`` will be the old value of ``sWithoutVowels`` concatenated with the value of ``eachChar``.  We are building the result string character by character. 

Take a close look also at the initialization of ``sWithoutVowels``.  We start with an empty string and then begin adding new characters to the end.



### <a name="exer10"></a> Exercise 10 

What is printed by the following statements: 

```python 
s = "ball"
r = ""
for item in s:
    r = item.upper() + r
print(r)
```

* A. Ball 
* B. BALL 
* C. LLAB 

[exercise 10 answers](#ans10)

## Looping and Counting 

We will finish this section  with a few more examples that show variations on the theme of iteration through the characters of a string.  We will implement a few of the methods that we described earlier to show how they can be done.


The following program counts the number of times a particular letter, ``aChar``, appears in a string.  It is another example of the accumulator pattern that we have seen in previous chapters.

In [None]:
def count(text, aChar):
    lettercount = 0
    for c in text:
        if c == aChar:
            lettercount = lettercount + 1
    return lettercount

print(count("banana","a"))

The function ``count`` takes two parameters: the string to search, ``text``, and the character to look for, ``aChar``.  The ``for`` statement iterates through each character in the string and checks to see if the character is equal to the value of ``aChar``.  If so, the counting variable, ``lettercount``, is incremented by one. When all characters have been processed, the ``lettercount`` is returned.
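One way to gain confidence in a hand-written function like this is to compare it with the built-in ``count`` string method from the table earlier. The sketch below repeats the logic of the function above under the illustrative name `count_char`, so that it runs on its own.

```python
def count_char(text, a_char):
    """Same logic as the count function above, under an illustrative name."""
    lettercount = 0
    for c in text:
        if c == a_char:
            lettercount = lettercount + 1
    return lettercount

# Print our result next to the built-in method's result.
for word in ["banana", "Mississippi", ""]:
    print(word, count_char(word, "s"), word.count("s"))
```

For a single character the two should always agree, including for the empty string.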

## A `find` function 

Here is an implementation for a restricted ``find`` method, where the target is a single character.

In [None]:
def find(astring, achar):
    """
    Find and return the index of achar in astring.
    Return -1 if achar does not occur in astring.
    """
    ix = 0
    found = False
    while ix < len(astring) and not found:
        if astring[ix] == achar:
            found = True
        else:
            ix = ix + 1
    if found:
        return ix
    else:
        return -1

print(find("Compsci", "p"))
print(find("Compsci", "C"))
print(find("Compsci", "i"))
print(find("Compsci", "x"))

In a sense, ``find`` is the opposite of the indexing operator. Instead of taking an index and extracting the corresponding character, it takes a character and finds the index where that character appears for the first time. If the character is not found, the function returns ``-1``.

The ``while`` loop in this example uses a slightly more complex condition than we have seen in previous programs.  Here there are two parts to the condition.  We want to keep going if there are more characters to look through and we want to keep going if we have not found what we are  looking for.  The variable ``found`` is a boolean variable that keeps track of whether we have found the character we are searching for.  It is initialized to *False*.  If we find the character, we reassign ``found`` to *True*.

The other part of the condition is the same as we used previously to traverse the characters of the string.  Since we have now combined these two parts with a logical ``and``, it is necessary for them both to be *True* to continue iterating.  If one part fails, the condition fails and the iteration stops.

When the iteration stops, we must ask a question to find out the individual condition that caused the termination, and then return the proper value.  This is a pattern for dealing with while loops with compound conditions.

*Note:* This pattern of computation is sometimes called a eureka traversal because as soon as we find what we are looking for, we can cry Eureka! and stop looking.  The way we stop looking is by setting ``found`` to True, which causes the condition to fail.
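The same eureka idea can also be written with a ``for`` loop that returns as soon as it finds a match. This is an alternative sketch, not a replacement for the ``while`` version above, and the name `find_first` is our own.

```python
def find_first(astring, achar):
    """Return the index of the first achar in astring, or -1 if absent."""
    for ix in range(len(astring)):
        if astring[ix] == achar:
            return ix      # Eureka: leave the function immediately
    return -1              # the loop finished without a match

print(find_first("Compsci", "p"))
print(find_first("Compsci", "x"))
```

Here the early ``return`` plays the role of the ``found`` flag, so no compound condition is needed.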

## Optional parameters 

To find the locations of the second or third occurrence of a character in a string, we can modify the ``find`` function, adding a third parameter for the starting position in the search string:



In [None]:
def find2(astring, achar, start):
    """
    Find and return the index of achar in astring.
    Return -1 if achar does not occur in astring.
    """
    ix = start
    found = False
    while ix < len(astring) and not found:
        if astring[ix] == achar:
            found = True
        else:
            ix = ix + 1
    if found:
        return ix
    else:
        return -1

print(find2('banana', 'a', 2))

The call ``find2('banana', 'a', 2)`` now returns ``3``, the index of the first occurrence of 'a' in 'banana' at or after index 2. What does ``find2('banana', 'n', 3)`` return? If you said 4, there is a good chance you understand how ``find2`` works.  Try it.

Better still, we can combine ``find`` and ``find2`` using an **optional parameter**.

In [None]:
def find3(astring, achar, start=0):
    """
    Find and return the index of achar in astring.
    Return -1 if achar does not occur in astring.
    """
    ix = start
    found = False
    while ix < len(astring) and not found:
        if astring[ix] == achar:
            found = True
        else:
            ix = ix + 1
    if found:
        return ix
    else:
        return -1

print(find3('banana', 'a', 2))



The call ``find3('banana', 'a', 2)`` to this version of ``find`` behaves just like ``find2``, while in the call ``find3('banana', 'a')``, ``start`` will be set to the **default value** of ``0``.
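The sketch below shows the three ways a caller can treat an optional parameter. To keep it runnable on its own, it uses a compact stand-in named `find_from` (an illustrative name) rather than ``find3`` itself.

```python
def find_from(astring, achar, start=0):
    """Illustrative stand-in for find3: search astring for achar from start."""
    for ix in range(start, len(astring)):
        if astring[ix] == achar:
            return ix
    return -1

print(find_from('banana', 'a'))            # start defaults to 0
print(find_from('banana', 'a', 2))         # start passed by position
print(find_from('banana', 'a', start=4))   # start passed by name
```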

Adding another optional parameter to ``find`` makes it search from a starting position, up to but not including the end position.


In [None]:
def find4(astring, achar, start=0, end=None):
    """
    Find and return the index of achar in astring.
    Return -1 if achar does not occur in astring.
    """
    ix = start
    if end is None:
        end = len(astring)

    found = False
    while ix < end and not found:
        if astring[ix] == achar:
            found = True
        else:
            ix = ix + 1
    if found:
        return ix
    else:
        return -1

ss = "Python strings have some interesting methods."

print(find4(ss, 's'))
print(find4(ss, 's', 7))
print(find4(ss, 's', 8))
print(find4(ss, 's', 8, 13))
print(find4(ss, '.'))



The optional value for ``end`` is interesting.  We give it a default value ``None`` if the caller does not supply any argument.  In the body of the function we test what ``end`` is and if the caller did not supply any argument, we reassign ``end`` to be the length of the string. If the caller has supplied an argument for ``end``, however, the caller's value will be used in the loop.

The semantics of ``start`` and ``end`` in this function are precisely the same as they are in the ``range`` function.
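The built-in ``find`` string method also accepts optional start and end positions with the same meaning, so it can be used to check the ideas above. A short sketch using the same example string:

```python
ss = "Python strings have some interesting methods."

# start is included and end is excluded, just as in range(8, 13).
print(list(range(8, 13)))
print(ss[8:13])

# The built-in find method takes the same optional start and end.
print(ss.find('s'))
print(ss.find('s', 8))
print(ss.find('s', 8, 13))
```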


### <a name="exer11"></a> Exercise 11

Write a function that mirrors its string argument, generating a string containing the original string and the string backwards.



In [None]:
import unittest 

def reverse(mystr):
    reversed = ''
    # Fill in the rest 

    return reversed

def mirror(mystr):
    return mystr + reverse(mystr)

class TestNotebook(unittest.TestCase):
    
    def test_mirror(self):
        self.assertEqual(mirror('good'), 'gooddoog')

    def test_mirror2(self):
        self.assertEqual(mirror('Python'), 'PythonnohtyP')
    
    def test_mirror3(self):
        self.assertEqual(mirror('a'), 'aa')
    
    def test_mirror4(self):
        self.assertEqual(mirror(''), '')

unittest.main(argv=[''], verbosity=2, exit=False)

[exercise 11 answers](#ans11)

---

## Answers to Exercises

### <a name="ans1"></a>Exercise 1

C. pythonrocks 

Yes, the two strings are glued end to end.

[Back to Exercises](#exer1)

### <a name="ans2"></a>Exercise 2 

A. python!!! 

Yes, repetition has precedence over concatenation.

[Back to Exercises](#exer2)

### <a name="ans3"></a>Exercise 3 

B. h 

Yes, index locations start with 0. 

[Back to Exercises](#exer3)

### <a name="ans4"></a> Exercise 4

A. tr 

Yes, indexing operator has precedence over concatenation. 

[Back to Exercises](#exer4)

### <a name="ans5"></a> Exercise 5 

C. 3 

Yes, add the number of o characters and the number of p characters.

[Back to Exercises](#exer5)

### <a name="ans6"></a> Exercise 6 

A. yyyyy

Yes, s[1] is y and the index of n is 5, so 5 y characters. It is important to realize that the index method has precedence over the repetition operator. Repetition is done last.

[Back to Exercises](#exer6)

### <a name="ans7"></a>Exercise 7 

C. sum of 2 and 6 is 8; product: 12.

Yes, correct substitutions! 

[Back to Exercises](#exer7)

### <a name="ans8"></a>Exercise 8 

B. r 

Yes, len(s) is 12 and 12-5 is 7. Use 7 as index and remember to start counting with 0.

[Back to Exercises](#exer8)

### <a name="ans9"></a> Exercise 9 

C. 12 

Yes, there are 12 characters, including the blank.

[Back to Exercises](#exer9)

### <a name="ans10"></a> Exercise 10 

C. LLAB 

Yes, the order is reversed due to the order of the concatenation.

[Back to Exercises](#exer10)

### <a name="ans11"></a> Exercise 11 




In [None]:
 

def reverse(mystr):
    reversed = ''
    for char in mystr:
        reversed = char + reversed
    return reversed

def mirror(mystr):
    return mystr + reverse(mystr)


unittest.main(argv=[''], verbosity=2, exit=False)

[Back to Exercises](#exer11)