# Chapter 8: Strings

## A string is a sequence

A string is a **sequence of characters**, which means it is an ordered collection of values. You can access the characters one at a time with the bracket operator:

In [1]:
fruit = 'banana'
fruit[5]

'a'

The expression in brackets is called the **index**, which indicates which character in the sequence you want.

**Note:** The first character of the sequence is at index 0, so fruit[5] is the sixth and last character of 'banana'.

## len

len is a built-in function that returns the number of characters in a string:

In [3]:
len(fruit)

6

To get the last letter of a string, subtract 1 from its length, because the indices start at 0:

In [4]:
fruit[len(fruit)-1]

'a'
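
As a small sketch of the same idea, Python also accepts negative indices, which count backward from the end of the string. The variable names last, second_last and message are only illustrative:

```python
fruit = 'banana'

# Negative indices count backward from the end of the string.
last = fruit[-1]          # same character as fruit[len(fruit) - 1]
second_last = fruit[-2]
print(last, second_last)  # a n

# len(fruit) itself is not a valid index, because indices stop at len(fruit) - 1.
try:
    fruit[len(fruit)]
except IndexError as err:
    message = str(err)
    print('IndexError:', message)
```

The try block shows why fruit[len(fruit)] fails: a six-letter string has no character at index 6.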

## Traversal with a for loop

A lot of computations involve processing a string one character at a time. Often they start at the beginning, select each character in turn, do something to it, and continue to the end. This pattern is often called a **traversal**.

One way to write a traversal is with a while loop:

In [5]:
index = 0
while index < len(fruit):
    letter = fruit[index]
    print(letter)
    index = index + 1

b
a
n
a
n
a


Another way to write this traversal would be with a for loop:

In [6]:
for letter in fruit:
    print(letter)

b
a
n
a
n
a
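
When a traversal needs both the index and the character, one possible sketch uses the built-in enumerate function, which avoids updating the index by hand as the while loop does. The list pairs is a hypothetical name used only to collect the results:

```python
fruit = 'banana'

pairs = []
# enumerate yields (index, character) tuples, starting at index 0.
for index, letter in enumerate(fruit):
    pairs.append((index, letter))
    print(index, letter)
```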


## String slices

A segment of a string is called a **slice**. Selecting a slice is similar to selecting a character:

In [8]:
s = "Monty Python"
s[0:5]

'Monty'

In [9]:
s[6:12]

'Python'

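**Note:** The operator [n:m] returns the part of the string from the character at index n up to, but not including, the character at index m.

If you omit the first index, the slice starts at the beginning of the string. If you omit the second, the slice goes to the end of the string. If you omit both, the slice is a copy of the whole string: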

In [11]:
fruit[:3]

'ban'

In [12]:
fruit[3:]

'ana'

In [13]:
fruit[:]

'banana'
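
A slice [n:m] includes the character at index n and excludes the one at index m, so it contains m - n characters when both indices are within the string. The following sketch checks a few consequences of that rule; the variable names are illustrative only:

```python
fruit = 'banana'

middle = fruit[1:4]
print(middle)            # ana
print(len(middle))       # 3, which is 4 - 1

# If the first index is greater than or equal to the second,
# the result is an empty string rather than an error.
empty = fruit[3:3]
also_empty = fruit[4:2]
print(repr(empty), repr(also_empty))  # '' ''

# The two halves of a split at the same index rebuild the original string.
rebuilt = fruit[:3] + fruit[3:]
print(rebuilt == fruit)  # True
```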

## Strings are immutable

It is tempting to use the [] operator on the left side of an assignment, with the intention of changing a character in a string:

In [14]:
greeting = 'Hello, World!'
greeting[0] = 'J'

TypeError: 'str' object does not support item assignment

The reason for the error is that strings are **immutable**, which means you can't change an existing string. The best you can do is create a new string that is a variation of the original:

In [15]:
new_greeting = 'J' + greeting[1:]
new_greeting

'Jello, World!'
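
The same slicing idea extends to any position in the string. The helper replace_char below is a hypothetical function, not part of Python, and the sketch assumes index is a valid non-negative index:

```python
def replace_char(text, index, new_char):
    """Return a new string with the character at index replaced by new_char.

    Assumes 0 <= index < len(text); negative indices are not handled.
    """
    return text[:index] + new_char + text[index + 1:]


greeting = 'Hello, World!'
new_greeting = replace_char(greeting, 0, 'J')
lowered = replace_char(greeting, 7, 'w')

print(new_greeting)  # Jello, World!
print(lowered)       # Hello, world!
print(greeting)      # Hello, World!  (the original string is unchanged)
```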

## Searching

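Traversing a sequence and returning when we find what we are looking for is called a **search**. The function below returns the index of the first character that matches letter, or -1 if the letter does not appear in the word.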

In [16]:
def find(word, letter):
    index = 0
    while index < len(word):
        if word[index] == letter:
            return index
        index = index + 1
    return -1

In [22]:
find(s, 'o')

1

A third parameter lets the caller choose the index where the search starts:

In [9]:
def find(word, letter, start):
    # Begin the search at the index supplied by the caller.
    index = start
    while index < len(word):
        if word[index] == letter:
            return index
        index = index + 1
    return -1

In [20]:
find(s, 'o', 4)

10
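
One use for a starting index is to resume a search after each match. The sketch below repeats a three-parameter find so that it runs on its own, and adds a hypothetical helper, find_all, that collects every index where the letter appears:

```python
def find(word, letter, start):
    """Return the first index >= start where letter occurs, or -1."""
    index = start
    while index < len(word):
        if word[index] == letter:
            return index
        index = index + 1
    return -1


def find_all(word, letter):
    """Return a list of every index where letter occurs in word."""
    positions = []
    index = find(word, letter, 0)
    while index != -1:
        positions.append(index)
        # Resume the search just past the previous match.
        index = find(word, letter, index + 1)
    return positions


s = "Monty Python"
print(find_all(s, 'o'))         # [1, 10]
print(find_all('banana', 'a'))  # [1, 3, 5]
print(find_all(s, 'z'))         # []
```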

## Looping and counting

The following program counts the number of times the letter a appears in a string. This program demonstrates another pattern of computation called a **counter**.

In [11]:
count = 0
for letter in fruit:
    if letter == 'a':
        count = count + 1
print(count)

3

The same counter can be encapsulated in a function that accepts the string and the letter as arguments:


In [12]:
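def count(word, letter):
    # Use a separate name for the running total so it does not shadow the function.
    total = 0
    for char in word:
        # Compare against the letter argument rather than a hard-coded 'a'.
        if char == letter:
            total = total + 1
    return total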

In [13]:
count(fruit,'a')

3
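
A counter function is easy to check against the built-in string method count, which returns the number of occurrences of a substring. In this sketch, count_letter is a hypothetical name chosen to avoid clashing with the method:

```python
def count_letter(word, letter):
    """Return how many times letter appears in word."""
    total = 0
    for char in word:
        if char == letter:
            total = total + 1
    return total


fruit = 'banana'
print(count_letter(fruit, 'a'))  # 3
print(count_letter(fruit, 'n'))  # 2
print(count_letter(fruit, 'z'))  # 0

# The hand-written counter agrees with the built-in method.
print(count_letter(fruit, 'n') == fruit.count('n'))  # True
```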

## String methods

Strings provide methods that perform a variety of useful operations. A method is similar to a function (it takes arguments and returns a value), but the syntax is different. For example, the method upper takes a string and returns a new string with all uppercase letters:

In [14]:
new_word = fruit.upper()
new_word

'BANANA'

The dot notation specifies the name of the method, upper, and the name of the string to apply the method to, fruit. The empty parentheses indicate that this method takes no arguments. A method call is called an **invocation**; in this case, we would say we are invoking upper on fruit.

As it turns out, there is a string method named find that is remarkably similar to the function we wrote.

The find method also accepts two **optional arguments**:
* The second argument determines the index where the search will start
* The third argument determines the index where the search will stop; the character at that index is not included in the search

In [20]:
fruit.find('a')

1

In [21]:
fruit.find('a',3)

3

In [22]:
fruit.find('n',3,5)

4
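
Two further behaviors of the find method are worth seeing: it can search for substrings, not just single characters, and it returns -1 when nothing is found. A short sketch, with illustrative variable names:

```python
fruit = 'banana'

# find accepts a substring and returns the index where it starts.
first_na = fruit.find('na')
print(first_na)   # 2

# With a start index, the search skips earlier matches.
second_na = fruit.find('na', 3)
print(second_na)  # 4

# The stop index is excluded: only index 3 ('a') is searched here.
not_found = fruit.find('n', 3, 4)
print(not_found)  # -1

# A character that never appears also yields -1.
print(fruit.find('z'))  # -1
```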

## The in operator

The word in is a boolean operator that takes two strings and returns True if the first appears as a substring in the second. With well-chosen variable names, these statements can look like English.

In [23]:
'a' in 'banana'

True

In [25]:
def in_both(word1,word2):
    for letter in word1:
        if letter in word2:
            print(letter)

In [26]:
in_both('apples', 'oranges')

a
e
s
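
The in operator works with substrings of any length, and it can also drive a function that returns its results instead of printing them. The function common_letters below is a hypothetical variation, sketched under the assumption that each shared letter should be reported only once:

```python
# in tests for a substring, not only for a single character.
print('nan' in 'banana')   # True
print('seed' in 'banana')  # False


def common_letters(word1, word2):
    """Return the letters of word1 that also appear in word2, without repeats."""
    result = []
    for letter in word1:
        if letter in word2 and letter not in result:
            result.append(letter)
    return result


print(common_letters('apples', 'oranges'))  # ['a', 'e', 's']
print(common_letters('banana', 'ant'))      # ['a', 'n']
```

Returning a list makes the result usable by other code, whereas printed output can only be read.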


## String comparison

The relational operators also work on strings:
* == for equality
* \> for appearing later in the alphabet
* < for appearing earlier in the alphabet

In [27]:
word = 'apples'
if word < 'banana':
    print("The word, " + word +', comes before banana')
else:
    print("The word, " + word +', comes after banana')

The word, apples, comes before banana
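
One caveat: Python compares strings character by character using their code points, so all uppercase letters come before all lowercase letters. A common way to handle this, sketched below, is to convert both strings to a standard case such as lowercase before comparing. The variable names are illustrative:

```python
word = 'Pineapple'

# 'P' sorts before 'b' because uppercase letters precede lowercase ones.
raw_result = word < 'banana'
print(raw_result)      # True

# Converting to lowercase first gives the expected alphabetical order.
lowered_result = word.lower() < 'banana'
print(lowered_result)  # False
```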
