# Strings

This material is from Chapter 8 of [*Think Python*, 3rd edition](https://greenteapress.com/wp/think-python-3rd-edition), by Allen B. Downey. I have adapted it for this class.

Strings are not like integers or floats. A string is a **sequence**, which means it contains multiple values in a particular order.
In this chapter we'll see how to access the values that make up a string, and we'll use functions that process strings.

## A string is a sequence

A string is a sequence of characters. A **character** can be a letter (in almost any alphabet), a digit, a punctuation mark, or white space.

You can select a character from a string with the bracket operator.
This example statement selects character number 1 from `fruit` and
assigns it to `letter`:

In [2]:
fruit = 'banana'
letter = fruit[1]
letter

'a'

The expression in brackets is an **index**, so called because it *indicates* which character in the sequence to select.
But the result might not be what you expect.

In [3]:
letter

'a'

In [4]:
fruit[len(fruit) - 2]

'n'

The letter with index `1` is actually the second letter of the string.
An index is an offset from the beginning of the string, so the offset of the first letter is `0`.

In [5]:
fruit[0]

'b'

You can think of `'b'` as the 0th letter of `'banana'` -- pronounced "zero-eth".

The index in brackets can be a variable.

In [6]:
i = 1
fruit[i]

'a'

Or an expression that contains variables and operators.

In [7]:
fruit[i+1]

'n'

But the value of the index has to be an integer -- otherwise you get a `TypeError`.

In [8]:
fruit[1.5]

TypeError: string indices must be integers, not 'float'
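A common way to end up with a float index by accident is ordinary division, which always returns a float. The sketch below is only an illustration (the names `half_float`, `half_int`, and `message` are made up for this example). It shows that floor division (`//`) produces an integer that is safe to use as an index.

```python
fruit = 'banana'

half_float = len(fruit) / 2   # 3.0, a float even though it is a whole number
half_int = len(fruit) // 2    # 3, an int

# Indexing with the float fails, even though its value is a whole number.
try:
    fruit[half_float]
except TypeError as err:
    message = str(err)
    print('TypeError:', message)

# Indexing with the int works.
middle_letter = fruit[half_int]
print(middle_letter)  # a
```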

As we saw in Chapter 1, we can use the built-in function `len` to get the length of a string.

In [9]:
n = len(fruit)
n

6

To get the last letter of a string, you might be tempted to write this:

In [10]:
fruit[n]

IndexError: string index out of range

But that causes an `IndexError` because there is no letter in `'banana'` with the index 6. Because we started counting at `0`, the six letters are numbered `0` to `5`. To get the last character, you have to subtract `1` from `n`:

In [11]:
fruit[n-1]

'a'

But there's an easier way.
To get the last letter in a string, you can use a negative index, which counts backward from the end.

In [20]:
fruit[-1]  # same as fruit[len(fruit) - 1]

'a'

The index `-1` selects the last letter, `-2` selects the second to last, and so on.
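As a quick check of how negative indices line up with positive ones, the sketch below compares the two forms for `'banana'`. The variable names are chosen only for this illustration.

```python
fruit = 'banana'
n = len(fruit)

# A negative index -k selects the same character as the index n - k.
last = fruit[-1]
same_last = fruit[n - 1]
second_to_last = fruit[-2]
same_second = fruit[n - 2]

print(last, same_last)              # a a
print(second_to_last, same_second)  # n n

# The most negative legal index is -n, which selects the first character.
first = fruit[-n]
print(first)  # b
```

Going one step further, `fruit[-n - 1]` raises an `IndexError`, just as `fruit[n]` does.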

### Exercise

Get a string from the user with `input()` and extract the middle three characters.

**Hint**: use `len()`, floor division (`//`), indexing (`[]`), and concatenation (`+`).

In [39]:
# extract three middle characters from user input

user_input = input()

mid_point = len(user_input) // 2
mid_point
middle = user_input[mid_point]
middle

left = user_input[mid_point - 1]
right = user_input[mid_point + 1]

print(left, middle, right, sep = ' ')

subscript
s c r


#### Solution

Unhide this section to check your answer.

In [None]:
# get input
user_input = input()

# extract middle three characters

# find the midpoint index; floor division (//) gives an integer for even and odd lengths
mid_point = len(user_input) // 2
middle = user_input[mid_point]

# get the adjacent characters
left = user_input[mid_point - 1]
right = user_input[mid_point + 1]

# concatenate result
result = left + middle + right

# print the result
print(result)


Instead of concatenating the characters, you can pass them to `print` separately and set its `sep` argument to an empty string.

In [None]:
print(left, middle, right, sep='')

To get more info on this and other options, use `help`.

In [None]:
help(print)

## String slices

A segment of a string is called a **slice**.
Selecting a slice is similar to selecting a character.

In [40]:
text = 'Slicing is easy!'
text[2:7]

'icing'

The operator `[n:m]` returns the part of the string from the `n`th
character to the `m`th character, including the first but excluding the second.
If this behavior seems counterintuitive, it may help to imagine the indices pointing *between* the characters, as in this figure:

![Slicing](https://github.com/olearydj/INSY3010/blob/main/images/slicing.png?raw=1)
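Two consequences of the "indices point between characters" picture can be checked directly. This is a small sketch using the same `text` string; the names `piece` and `k` are chosen only for the illustration.

```python
text = 'Slicing is easy!'

# A slice [n:m] contains m - n characters (when both bounds are in range).
piece = text[2:7]
print(piece, len(piece))  # icing 5

# Cutting at the same position k splits the string with no overlap and no gap.
k = 7
print(text[:k])                     # Slicing
print(text[k:])                     #  is easy!
print(text[:k] + text[k:] == text)  # True
```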


Note that the slice `[11:16]` selects the characters `easy!`, which means that `16` is legal as part of a slice, but not legal as an index:

In [48]:
text[11:16]  # this is ok...

'easy!'

In [49]:
text[16]  # but this is not!

IndexError: string index out of range
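Slices are more forgiving than indices in general: a slice bound that falls past the end of the string is treated as the end. The following sketch contrasts the two behaviors. The variable names are illustrative only.

```python
text = 'Slicing is easy!'
n = len(text)  # 16

# Slice bounds past the end are clamped to the end of the string.
to_end = text[11:n]
past_end = text[11:99]
print(to_end == past_end)  # True

# A slice that starts beyond the end is an empty string, not an error.
nothing = text[50:60]
print(len(nothing))  # 0

# An index gets no such leniency: n is one past the last legal index.
try:
    text[n]
    index_failed = False
except IndexError:
    index_failed = True
print(index_failed)  # True
```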

If you omit the first index, the slice starts at the beginning of the string.

In [50]:
text[:7]

'Slicing'

If you omit the second index, the slice goes to the end of the string:

In [51]:
text[8:]

'is easy!'

If the first index is greater than or equal to the second, the result is an **empty string**, represented by two quotation marks:

In [52]:
text[3:3]

''

An empty string contains no characters and has length 0.
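A short sketch of both cases described above, equal bounds and reversed bounds, using the same `text` string (the variable names are only for illustration):

```python
text = 'Slicing is easy!'

equal_bounds = text[3:3]     # first index equals the second
reversed_bounds = text[7:2]  # first index is greater than the second

print(len(equal_bounds), len(reversed_bounds))  # 0 0
print(equal_bounds == '')                       # True

# Concatenating an empty string leaves the other string unchanged.
print('' + text == text)  # True
```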

Continuing this example, what do you think `text[:]` means? Try it and
see.

In [53]:
text[:]

'Slicing is easy!'

### Exercise

Use slicing instead of indexing to repeat the previous exercise, extracting the three middle characters of a user input string.

In [55]:
user_input = input()

mid_point = len(user_input) // 2

result = user_input[(mid_point - 1):(mid_point + 2)]

print(result)

four
our


#### Solution

Unhide this section to check your answer.

In [None]:
# get input
user_input = input()

# extract middle three characters

# find the midpoint index; floor division (//) gives an integer for even and odd lengths
mid_point = len(user_input) // 2

result = user_input[mid_point - 1:mid_point + 2]

# print the result
print(result)

## Strings are immutable

It is tempting to use the `[]` operator on the left side of an
assignment, with the intention of changing a character in a string, like this:

In [2]:
greeting = 'Hello, world!'
greeting[0] = 'J'

TypeError: 'str' object does not support item assignment

The result is a `TypeError`.
In the error message, the "object" is the string and the "item" is the character
we tried to assign.
For now, an **object** is the same thing as a value, but we will refine that definition later.

The reason for this error is that strings are **immutable**, which means you can't change an existing string.
The best you can do is create a new string that is a variation of the original.

In [3]:
new_greeting = 'J' + greeting[1:]
new_greeting

'Jello, world!'

This example concatenates a new first letter onto a slice of `greeting`.
It has no effect on the original string.

In [None]:
greeting
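The same slice-and-concatenate idea works at any position, not just the first. Here is a minimal sketch; `replace_char` is a hypothetical helper written for this example, not a built-in, and it assumes `index` is a non-negative index within the string.

```python
def replace_char(s, index, new_char):
    """Return a new string like s, with the character at index replaced.

    Assumes 0 <= index < len(s); s itself is never modified.
    """
    return s[:index] + new_char + s[index + 1:]


greeting = 'Hello, world!'
new_greeting = replace_char(greeting, 0, 'J')

print(new_greeting)  # Jello, world!
print(greeting)      # Hello, world!
```

Because the function builds its result from slices of `s`, the original string is left untouched, which is all that immutability allows.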

## String methods

Strings provide methods that perform a variety of useful operations.
A method is similar to a function -- it takes arguments and returns a value -- but the syntax is different.
For example, the method `upper` takes a string and returns a new string with all uppercase letters.

Instead of the function syntax `upper(word)`, it uses the method syntax `word.upper()`.

In [17]:
word = 'banana'
new_word = word.upper()
new_word

'BANANA'

This use of the dot operator specifies the name of the method, `upper`, and the name of the string to apply the method to, `word`. The empty parentheses indicate that this method takes no arguments.

Some of the most commonly used string methods in Python are tabulated below. The `str.method()` notation indicates that any string object (`str`) can call these methods using dot notation, as with `word.upper()` above. For reasons beyond the scope of this course, the format `str.upper(word)` will also work, but is discouraged.

| **Category**           | **Method**                          | **Description** |
|------------------------|------------------------------------|----------------|
| **Case Manipulation**  | `str.lower()`                      | Converts all characters to lowercase. |
|                        | `str.upper()`                      | Converts all characters to uppercase. |
|                        | `str.title()`                      | Capitalizes the first letter of each word. |
|                        | `str.capitalize()`                 | Capitalizes the first character and converts the rest to lowercase. |
| **Whitespace & Formatting** | `str.strip()`                 | Removes leading and trailing whitespace. |
| **Finding & Replacing**| `str.find(sub, start, end)`        | Returns index of first occurrence of `sub`, or `-1` if not found. |
|                        | `str.replace(old, new, count)`     | Returns a copy with occurrences of `old` replaced by `new` (at most `count` of them, if given). |
|                        | `str.count(sub)`                   | Counts occurrences of `sub`. |
| **Checking Content**   | `str.isalpha()`                    | Returns `True` if all characters are letters. |
|                        | `str.isdigit()`                    | Returns `True` if all characters are digits. |
|                        | `str.isalnum()`                    | Returns `True` if all characters are alphanumeric. |
|                        | `str.isspace()`                    | Returns `True` if all characters are whitespace. |
| **Splitting & Joining**| `str.split(sep, maxsplit)`         | Splits the string into a list based on `sep`. |
|                        | `str.join(iterable)`               | Joins elements of `iterable` into a string. |

Note that the values returned by string methods vary significantly. Many return strings, but others return integers, Booleans, or lists. Check what a method returns before you use its result, either with `help` or in [the Python documentation](https://docs.python.org/3/library/stdtypes.html#string-methods).

In [None]:
# this way works, but is discouraged
str.upper(word)

# this way is encouraged
word.upper()
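To make the different return types concrete, the sketch below calls a few methods from the table above on one example string. The string and the variable names are made up for this illustration.

```python
line = '  the quick brown fox  '

stripped = line.strip()             # str:  'the quick brown fox'
position = stripped.find('quick')   # int:  4
missing = stripped.find('cat')      # int:  -1, because 'cat' is not found
o_count = stripped.count('o')       # int:  2
all_letters = stripped.isalpha()    # bool: False, spaces are not letters
words = stripped.split()            # list: ['the', 'quick', 'brown', 'fox']
joined = '-'.join(words)            # str:  'the-quick-brown-fox'

print(stripped, position, missing, o_count, all_letters)
print(words)
print(joined)

# Strings are immutable, so none of these calls changed the original.
print(len(line))  # 23
```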

## Glossary

**sequence:**
An ordered collection of values where each value is identified by an integer index.

**character:**
An element of a string, including letters, numbers, and symbols.

**index:**
An integer value used to select an item in a sequence, such as a character in a string. In Python indices start from `0`.

**slice:**
A part of a string specified by a range of indices.

**empty string:**
A string that contains no characters and has length `0`.

**object:**
Something a variable can refer to. An object has a type and a value.

**immutable:**
If the elements of an object cannot be changed, the object is immutable.

**invocation:**
An expression -- or part of an expression -- that calls a method.

---

Auburn University / Industrial and Systems Engineering  
INSY 3010 / Programming and Databases for ISE  
© Copyright 2025, Danny J. O'Leary.  
For licensing, attribution, and information: [GitHub INSY3010](https://github.com/olearydj/INSY3010)
