# Strings

## A String is a Sequence

A string is a __sequence__ of characters. You can access the characters one at a time with the bracket operator:


In [6]:
fruit = "pinapple"
letter = fruit[1]

The second statement selects character number 1 from ```fruit``` and assigns it to ```letter```.

The expression in brackets is called an index. The index indicates which character in the sequence you want (hence the name).

In [7]:
print(letter)

i


For most people, the first letter of ```'pinapple'``` is p, not i. But for computer scientists, the index is an offset from the beginning of the string, and the offset of the first letter is zero.

In [8]:
letter = fruit[0]
print(letter)

p


So p is the 0th letter (“zero-eth”) of ```'pinapple'```, i is the 1th letter (“one-eth”), and n is the 2th letter (“two-eth”).

You can use any expression, including variables and operators, as an index, but the value of the index has to be an integer. Otherwise you get:

In [9]:
letter = fruit[1.5]

TypeError: string indices must be integers
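As a small sketch of the rule above, any expression that evaluates to an integer can serve as an index. The names ```i```, ```by_variable```, and ```by_expression``` are chosen only for illustration, and the exact wording of the error message can differ between Python versions.

```python
fruit = "pinapple"
i = 1

# A variable that holds an integer is a valid index.
by_variable = fruit[i]
print(by_variable)      # i

# So is an arithmetic expression whose value is an integer.
by_expression = fruit[i + 1]
print(by_expression)    # n

# A float is rejected, even when its value is a whole number.
try:
    fruit[2.0]
except TypeError as err:
    print("TypeError:", err)
```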

## The ```len``` function
```len``` is a built-in function that returns the number of characters in a string:

In [1]:
fruit = 'banana'
len(fruit)

6

To get the last letter of ```fruit```, you might be tempted to use its length as the index. But because indexing starts at 0, the last letter is at index ```len(fruit)-1```; using ```len(fruit)``` itself raises an ```IndexError```:

In [3]:
length = len(fruit)
fruit[length]

IndexError: string index out of range

In [4]:
fruit[length-1]

'a'

Alternatively, you can use negative indices, which count backward from the end of the string. The expression ```fruit[-1]``` yields the last letter, ```fruit[-2]``` yields the second to last, and so on.
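The following sketch illustrates negative indices on the same string; the variable names ```last``` and ```second_to_last``` are only illustrative.

```python
fruit = 'banana'

last = fruit[-1]             # same character as fruit[len(fruit) - 1]
second_to_last = fruit[-2]   # same character as fruit[len(fruit) - 2]

print(last)                  # a
print(second_to_last)        # n

# Counting back by the full length reaches the first character.
print(fruit[-len(fruit)])    # b
```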

## Traversal with a ```for``` loop
Many computations process a string one character at a time, starting at the first character and continuing to the last. This pattern of processing is called a **traversal**. One way to write a traversal is with a ```while``` loop, although it takes more code than necessary:

In [6]:
fruit = 'pinapple'
index = 0
while index < len(fruit):
    letter = fruit[index]
    print(letter)
    index = index + 1

p
i
n
a
p
p
l
e


This loop traverses the string and displays each letter on a line by itself. The loop condition is ```index < len(fruit)```, so when ```index``` is equal to the length of the string, the condition is false, and the body of the loop is not executed. The last character accessed is the one with the index ```len(fruit)-1```, which is the last character in the string.

### Try Yourself!
__Exercise:__ *Write a function that takes a string as an argument and displays the letters backward, one per line.*

Another, more Pythonic way to write a traversal is with a ```for``` loop:

In [7]:
for char in fruit:
    print(char)

p
i
n
a
p
p
l
e


Each time through the loop, the next character in the string is assigned to the variable ```char```. The loop continues until no characters are left.

The following example shows how to use concatenation (string addition) and a ```for``` loop to generate an abecedarian series (that is, one in alphabetical order). In Robert McCloskey’s book *Make Way for Ducklings*, the names of the ducklings are Jack, Kack, Lack, Mack, Nack, Ouack, Pack, and Quack. This loop outputs these names in order:

In [8]:
prefixes = 'JKLMNOPQ'
suffix = 'ack'
for letter in prefixes:
    print(letter + suffix)

Jack
Kack
Lack
Mack
Nack
Oack
Pack
Qack


Of course, that’s not quite right because “Ouack” and “Quack” are misspelled.

### Try Yourself!
__Exercise__: *Modify the program to fix this error.*

## String Slices
A segment of a string is called a **slice**. Selecting a slice is similar to selecting a character:


In [10]:
s = 'Monty Python'
print(s[0:5])
print(s[6:12])

Monty
Python


The operator ```[n:m]``` returns the part of the string from the “n-eth” character to the “m-eth” character, including the first but excluding the last. This behavior is counterintuitive, but it might help to imagine the indices pointing between the characters.
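One way to check the "indices between the characters" picture is to compare slice lengths and boundaries. This is a minimal sketch, assuming both indices lie within the string; the names ```n```, ```m```, and ```piece``` are illustrative.

```python
s = 'Monty Python'
n, m = 6, 12

piece = s[n:m]
print(piece)                     # Python

# With both indices in range, the slice holds exactly m - n characters.
print(len(piece) == m - n)       # True

# Two slices that share a boundary index fit together without overlap:
# index 5 ends the first slice and starts the second.
print(s[0:5] + s[5:12] == s)     # True
```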

If you omit the first index (before the colon), the slice starts at the beginning of the string. If you omit the second index, the slice goes to the end of the string:

In [11]:
fruit = 'banana'
fruit[:3]

'ban'

In [12]:
fruit[3:]

'ana'

If the first index is greater than or equal to the second, the result is an **empty string**, represented by two quotation marks:

In [13]:
fruit = 'banana'
fruit[3:3]

''

An empty string contains no characters and has length 0, but other than that, it is the same as any other string.
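A short sketch of these properties, with the illustrative names ```empty``` and ```also_empty```:

```python
fruit = 'banana'

empty = fruit[3:3]         # first index equals the second
also_empty = fruit[4:2]    # first index is greater than the second

print(len(empty))               # 0
print(empty == also_empty)      # True

# Like any other string, an empty string can be concatenated.
print(fruit + empty == fruit)   # True
```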

### Try Yourself!
__Exercise:__ *Given that ```fruit``` is a string, what does ```fruit[:]``` mean?*

## Strings are immutable
