# CIS 1051 - Temple Rome Spring 2023

## Intro to Problem Solving and Programming in Python

![LOGO](img/temple-logo.png)

### Strings

Prof. Andrea Gallegati

( [tuj81353@temple.edu](tuj81353@temple.edu) )

## A String is a Sequence

Strings are not like integers, floats, and booleans.

Strings are sequences: an ordered collection of other values.

We can access the characters one at a time with the bracket operator:

In [2]:
fruit = 'banana'
letter = fruit[1]

The expression in brackets is called an **index**: it indicates which **character** in the sequence you want (hence the name).

But we might not get what we expect!

In [3]:
letter

'a'

- For most people, the first letter of `'banana'` is `b`, not `a`. 
- For computer scientists, the index is an **offset** from the beginning of the string.

<p style="text-align: center;"><strong>... and the offset of the first letter is zero!</strong></p>

In [4]:
letter = fruit[0]
letter

'b'

- `b` is the *0th* letter (*“zero-eth”*) of `'banana'`
- `a` is the *1th* letter (*“one-eth”*) of `'banana'`
- `n` is the *2th* letter (*“two-eth”*) of `'banana'`

We can use an expression with variables and operators for the **index**.

In [6]:
i = 1
fruit[i]

'a'

In [7]:
fruit[i+1]

'n'

But it has to be an integer, otherwise ...

In [8]:
letter = fruit[1.5]

TypeError: string indices must be integers

## len

`len` is a built-in function that returns the number of characters in a string.

In [9]:
fruit = 'banana'
len(fruit)

6

To get the **last letter**, we might be tempted by:

In [10]:
length = len(fruit)
last = fruit[length]

IndexError: string index out of range

But there is no letter in `'banana'` with index `6`. 

If we start counting at **zero**, the six letters are numbered 0 to 5. To get the **last character**, we have to subtract 1 from `length`:

In [11]:
last = fruit[length-1]
last

'a'

We can use **negative indices**, which count backward from the end of the string:
- `fruit[-1]` yields the last letter
- `fruit[-2]` yields the second to last
- and so on...
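
A short sketch of negative indexing on the same `fruit` string. The comments note the equivalent non-negative index for each case:

```python
fruit = 'banana'

last = fruit[-1]            # same character as fruit[len(fruit) - 1]
second_to_last = fruit[-2]  # same character as fruit[len(fruit) - 2]

print(last)            # a
print(second_to_last)  # n
```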

## Traversal with a for Loop

Some algorithms (e.g. in cryptography) involve processing a string one character at a time:
- they start at the beginning
- select each character in turn
- do something to it
- continue until the end

This **pattern** of processing is called a **traversal**. 

One way to write a **traversal** is with a **while loop**:

In [12]:
index = 0
while index < len(fruit):
    letter = fruit[index]
    print(letter)
    index = index + 1

b
a
n
a
n
a


- The loop **condition** is `index < len(fruit)`, so when `index` is equal to the length of the string, the condition is `False` and the loop ends.
- The last character accessed is the one with index `len(fruit) - 1`, which is the last character of the string.

Another way to write a **traversal** is with a **for loop**:

In [13]:
for letter in fruit:
    print(letter)

b
a
n
a
n
a


Each **iteration**, the next character is assigned to the variable `letter`, until no characters are left.

We can use **concatenation** (string addition) and a **for loop** to generate an *abecedarian series*:

In [14]:
prefixes = 'JKLMNOPQ'
suffix = 'ack'

for letter in prefixes:
    print(letter + suffix)

Jack
Kack
Lack
Mack
Nack
Oack
Pack
Qack


The output lists the names of the ducklings in alphabetical order.

## String Slices

A **slice** is a segment of a string. Selecting a slice is similar to selecting a character:

In [15]:
s = 'Monty Python'
s[0:5]

'Monty'

In [16]:
s[6:12]

'Python'

The operator `[n:m]` returns the **part** of the string:
- from the *“n-eth”* character, included
- to the *“m-eth”* character, excluded

This behavior is counterintuitive. 

It might help to imagine indices **in between** one character and the other.

<p align="center"><img src="img/string.png" style="margin:auto" width="600"></p>
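
One way to check this mental picture is with a small sketch (the variable names `n`, `m` and `k` are only illustrative). When `0 <= n <= m <= len(s)`, the slice `s[n:m]` contains `m - n` characters, and cutting a string at any position `k` gives two pieces that join back into the original:

```python
s = 'Monty Python'
n, m = 6, 12

piece = s[n:m]
print(piece)                # Python
print(len(piece) == m - n)  # True

# Cutting at position k: the two pieces share no character and miss none.
k = 5
print(s[:k] + s[k:] == s)   # True
```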

Omitting the first index (before the colon `:`), the slice **starts at the beginning** of the string:

In [17]:
fruit = 'banana'
fruit[:3]

'ban'

Omitting the second index (after the colon `:`), the slice **goes to the end** of the string:

In [18]:
fruit[3:]

'ana'

If the first index is **greater than or equal** to the second, the result is an **empty** string:

In [19]:
fruit[3:3]

''

An empty string contains no characters and has length 0, but other than that, it is the same as any other string!
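
The cell above only shows the "equal" case. As a small sketch, the "greater than" case behaves the same way, and the resulting empty string still works with the usual string operations (the variable name `empty` is just for illustration):

```python
fruit = 'banana'

empty = fruit[4:2]          # first index greater than the second
print(repr(empty))          # ''
print(len(empty))           # 0
print(empty == fruit[3:3])  # True

# Concatenating with the empty string leaves the other string unchanged.
print(empty + 'ban')        # ban
```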

Omitting both indices (on either side of the colon `:`) results in the **whole** string:

In [20]:
fruit[:]

'banana'

## Strings Are Immutable

We might be tempted to use the `[]` operator on the left side of an assignment, to change a character in a string:

In [21]:
greeting = 'Hello, world!'
greeting[0] = 'J'

TypeError: 'str' object does not support item assignment

- The **object** here is the string.
- The **item** is the character we tried to assign. 

For now, an **object** is the same thing as a value, but we will refine that definition later!

The reason for this **error** is that **strings are immutable**: we can’t change an existing string. 

We can create a new string that is a variation on the original, with no effect on the original one:

In [22]:
greeting = 'Hello, world!'
new_greeting = 'J' + greeting[1:]
new_greeting

'Jello, world!'
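
As a minimal sketch of the same idea, the hypothetical helper `replace_at` below (not a built-in) builds a new string with one character swapped. It assumes `0 <= index < len(text)`. The two prints confirm that the original string is left untouched:

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced.

    Assumes 0 <= index < len(text); negative indices are not handled.
    """
    # Everything before index + the new character + everything after index.
    return text[:index] + new_char + text[index + 1:]


greeting = 'Hello, world!'
new_greeting = replace_at(greeting, 0, 'J')

print(new_greeting)  # Jello, world!
print(greeting)      # Hello, world!
```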

## Searching

In [23]:
def find(word, letter):
    index = 0
    while index < len(word):
        if word[index] == letter:
            return index
        index = index + 1
    return -1

In some sense, the `find` function above is the inverse of the `[]` operator.

Instead of taking an index and extracting the corresponding character, it takes a character and finds the **index** where that character appears.

If the character is not found, it returns `-1`.

This is the first example we have seen of a `return` statement inside a loop. If `word[index] == letter`, the function breaks out of the loop and returns immediately.

If the character doesn’t appear in the string, the program exits the loop normally and returns `-1`.

This pattern of computation (traversing a sequence and returning when we find what we are looking for) is called a **search**.

As an exercise, modify `find` so that it has a third parameter: the index in `word` where it should start looking.
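
One possible solution sketch, useful for checking your own attempt. The function name `find_from` and the parameter name `start` are assumptions, not part of the original exercise, and `start` is assumed to be non-negative:

```python
def find_from(word, letter, start):
    """Return the first index >= start where letter appears in word, or -1.

    Assumes start >= 0 (a hypothetical restriction to keep the sketch simple).
    """
    index = start
    while index < len(word):
        if word[index] == letter:
            return index
        index = index + 1
    return -1


print(find_from('banana', 'a', 0))  # 1
print(find_from('banana', 'a', 2))  # 3
print(find_from('banana', 'b', 1))  # -1
```

The only change from `find` is that the traversal starts at `start` instead of `0`. Python's built-in string method `find` accepts an optional start index in the same spirit: `'banana'.find('a', 2)` also evaluates to `3`.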