# Strings

In this notebook you will learn to:

- Get user input with the `input()` function
- Recognize strings as a sequence type
- Access individual characters using indexing
- Extract substrings using slicing
- Understand why strings are immutable
- Use common string methods to transform text

## Getting input from users

In the previous notebook, we introduced the **Input-Process-Output** model. We've used `print()` for output and expressions/assignments for processing. Now let's complete the picture with **input**.

The `input()` function lets your program get text from the user. When Python encounters `input()`, it pauses and waits for the user to type something and press Enter.

In [None]:
name = input("What is your name? ")

In [None]:
print("Hello,", name)

The string inside `input()` is called the **prompt** — it tells the user what to type. The prompt is optional but usually helpful.

### input() always returns a string

An important detail: `input()` **always returns a string**, even if the user types a number.

In [None]:
age = input("Enter your age: ")
type(age)

If you need to do math with the input, you must convert it using `int()` or `float()`.

In [None]:
age = int(input("Enter your age: "))
type(age)

In [None]:
age + 10  # now math works

This pattern — wrapping `input()` inside `int()` or `float()` — is very common in Python programs.
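
To see why the conversion matters, here is a minimal sketch that uses a hard-coded string in place of `input()`, so it runs without any typing. The names `typed_age`, `as_text`, and `as_number` are illustrative only. With two strings, `+` joins them; after `int()`, it adds.

```python
# Stand-in for what input() would return if the user typed 25
typed_age = "25"

# With two strings, + joins them end to end
as_text = typed_age + "10"
print(as_text)        # 2510

# After converting to int, + performs arithmetic
as_number = int(typed_age) + 10
print(as_number)      # 35

# float() handles text that contains a decimal point
print(float("98.6"))  # 98.6
```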

### Exercise: Temperature Converter

Ask the user for a temperature in Fahrenheit, convert it to Celsius, and print the result.

**Hint**: The formula is `C = (F - 32) * 5/9`. Remember that `input()` returns a string.

In [None]:
# your code here

#### Solution

In [None]:
fahrenheit = float(input("Enter temperature in Fahrenheit: "))
celsius = (fahrenheit - 32) * 5 / 9
print(celsius, "degrees Celsius")

## A string is a sequence

Strings are not like integers or floats. A string is a **sequence**, which means it contains multiple values in a particular order.

A string is a sequence of characters. A **character** can be a letter (in almost any alphabet), a digit, a punctuation mark, or whitespace.

You can select a character from a string with the bracket operator.
This example statement selects character number 1 from `fruit` and assigns it to `letter`:

In [None]:
fruit = 'banana'
letter = fruit[1]

The expression in brackets is an **index**, so called because it *indicates* which character in the sequence to select.
But the result might not be what you expect.

In [None]:
letter

The letter with index `1` is actually the second letter of the string.
An index is an offset from the beginning of the string, so the offset of the first letter is `0`.

In [None]:
fruit[0]

You can think of `'b'` as the 0th letter of `'banana'` — pronounced "zero-eth".

The index in brackets can be a variable.

In [None]:
i = 1
fruit[i]

Or an expression that contains variables and operators.

In [None]:
fruit[i+1]

But the value of the index has to be an integer — otherwise you get a `TypeError`.

In [None]:
fruit[1.5]

As we saw earlier, we can use the built-in function `len` to get the length of a string.

In [None]:
n = len(fruit)
n

To get the last letter of a string, you might be tempted to write this:

In [None]:
fruit[n]

But that causes an `IndexError` because there is no letter in `'banana'` with the index 6. Because we started counting at `0`, the six letters are numbered `0` to `5`. To get the last character, you have to subtract `1` from `n`:

In [None]:
fruit[n-1]

But there's an easier way.
To get the last letter in a string, you can use a negative index, which counts backward from the end.

In [None]:
fruit[-1]  # last character

In [None]:
fruit[-2]  # second to last

The index `-1` selects the last letter, `-2` selects the second to last, and so on.
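
One way to think about negative indices: for a string of length `n`, the index `-k` refers to the same character as `n - k`. The short sketch below checks that relationship with the same `'banana'` example; the variable names `last`, `also_last`, and `first` are just for illustration.

```python
fruit = 'banana'
n = len(fruit)

# A negative index -k refers to the same character as n - k
last = fruit[-1]
also_last = fruit[n - 1]
print(last, also_last)            # a a

# -n reaches all the way back to index 0
first = fruit[-n]
print(first)                      # b

print(fruit[-2] == fruit[n - 2])  # True
```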

### Exercise: Middle Characters

Get a string from the user with `input()` and extract the middle three characters.

**Hint**: use `len()`, floor division (`//`), indexing (`[]`), and concatenation (`+`).

In [None]:
# extract three middle characters from user input

#### Solution

In [None]:
# get input
user_input = input("Enter a string: ")

# find the midpoint index; floor division (//) gives a whole number for even and odd lengths
mid_point = len(user_input) // 2

# get the middle character and its neighbors
left = user_input[mid_point - 1]
middle = user_input[mid_point]
right = user_input[mid_point + 1]

# concatenate and print result
result = left + middle + right
print(result)

Instead of concatenating the result, you can use the `sep` argument of `print`. By default, `print` puts a space between multiple values. Setting `sep=''` removes that separator.

In [None]:
print(left, middle, right, sep='')  # no spaces between values

## String slices

> **Check your understanding:** If `s = 'Python'`, what is `s[0]`? What is `s[-1]`?

A segment of a string is called a **slice**.
Selecting a slice is similar to selecting a character.

In [None]:
text = 'Slicing is easy!'
text[2:7]

The operator `[n:m]` returns the part of the string from the `n`th character to the `m`th character, including the first but excluding the second.
If this behavior seems counterintuitive, it may help to imagine the indices pointing *between* the characters, as in this figure:

![Slicing](https://raw.githubusercontent.com/olearydj/INSY3010/main/notebooks/images/slicing.png)

Note that the slice `[11:16]` selects the characters `easy!`, which means that `16` is legal as part of a slice, but not legal as an index:

In [None]:
text[11:16]  # this is ok...

In [None]:
text[16]  # but this is not!

If you omit the first index, the slice starts at the beginning of the string.

In [None]:
text[:7]

If you omit the second index, the slice goes to the end of the string:

In [None]:
text[8:]
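
Thinking of indices as pointing between characters suggests two handy properties, sketched below with the same `text` string: a slice `[n:m]` within the string contains `m - n` characters, and splitting at any position `k` with `[:k]` and `[k:]` gives two pieces that rejoin into the original. The names `piece`, `k`, `head`, and `tail` are illustrative.

```python
text = 'Slicing is easy!'

# The slice [2:7] covers indices 2, 3, 4, 5, 6: that is 7 - 2 = 5 characters
piece = text[2:7]
print(piece, len(piece))    # icing 5

# Splitting at the same position with [:k] and [k:] loses nothing
k = 7
head = text[:k]
tail = text[k:]
print(head)                 # Slicing
print(head + tail == text)  # True
```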

If the first index is greater than or equal to the second, the result is an **empty string**, represented by two quotation marks:

In [None]:
text[3:3]

An empty string contains no characters and has length 0.

What do you think `text[:]` means? Try it and see.

In [None]:
text[:]

The general form of the slice syntax is `[start:stop:step]`, where `stop` is exclusive ("up to but not including `stop`"). For a string `s`, omitting all three values is equivalent to `s[0:len(s):1]`. Negative values for `start` or `stop` count from the end of the string, so `-1` refers to the last character. A negative `step` walks through the string backward, from right to left; in that case an omitted `start` defaults to the end of the string and an omitted `stop` to the beginning. These can be combined in various ways. We will revisit slicing when we discuss lists.

Here are some common slicing patterns:

In [None]:
s = 'Python'

# First n characters
s[:3]    # 'Pyt' — first 3

In [None]:
# Last n characters
s[-3:]   # 'hon' — last 3

In [None]:
# Every other character
s[::2]   # 'Pto' — start to end, step by 2

In [None]:
# Reverse a string
s[::-1]  # 'nohtyP' — step backwards through entire string
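
The patterns above each omit at least one value. As a rough illustration of how `start`, `stop`, and `step` combine when they are all spelled out, here are a few more slices of the same string. The variable names are only for this example.

```python
s = 'Python'

# start=1, stop=5, step=2: indices 1 and 3
middle_skip = s[1:5:2]
print(middle_skip)     # yh

# Negative start and stop: from fourth-from-last up to (not including) the last
inner = s[-4:-1]
print(inner)           # tho

# Negative step with explicit bounds: indices 4, 3, 2 (stop index 1 is excluded)
backward = s[4:1:-1]
print(backward)        # oht
```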

### Exercise: Middle Characters with Slicing

Use slicing instead of indexing to repeat the previous exercise — extract the three middle characters of a user input string.

In [None]:
# your code here

#### Solution

In [None]:
# get input
user_input = input("Enter a string: ")

# find the midpoint index
mid_point = len(user_input) // 2

# use slicing to get all three at once
result = user_input[mid_point - 1:mid_point + 2]

print(result)

## Strings are immutable

It is tempting to use the `[]` operator on the left side of an assignment, with the intention of changing a character in a string, like this:

In [None]:
greeting = 'Hello, world!'
greeting[0] = 'J'

The result is a `TypeError`.
In the error message, the "object" is the string and the "item" is the character we tried to assign.
For now, an **object** is the same thing as a value, but we will refine that definition later.

The reason for this error is that strings are **immutable**, which means you can't change an existing string.
The best you can do is create a new string that is a variation of the original.

In [None]:
greeting = 'Hello, world!'
new_greeting = 'J' + greeting[1:]
new_greeting

This example concatenates a new first letter onto a slice of `greeting`.
It has no effect on the original string.

In [None]:
greeting

### Exercise: String Surgery

You can't modify a string directly, but you can create a new one. Given `word = "Python"`:

1. Create a new string that replaces the 'P' with 'J' (result: `"Jython"`)
2. Create a new string that replaces 'on' with 'onic' (result: `"Pythonic"`)
3. Create a new string with the characters reversed (result: `"nohtyP"`) — hint: slicing can take a third value called "step"

In [None]:
word = "Python"
# your code here

#### Solution

In [None]:
word = "Python"

# 1. Replace 'P' with 'J'
jython = 'J' + word[1:]
print(jython)

# 2. Replace 'on' with 'onic'
pythonic = word[:4] + 'onic'
print(pythonic)

# 3. Reverse the string using slice with step -1
reversed_word = word[::-1]
print(reversed_word)

## The `in` operator

The `in` operator checks whether a substring appears within a string. It returns `True` or `False`.

In [None]:
'p' in 'apple'

In [None]:
'app' in 'apple'

In [None]:
'z' in 'apple'

This is useful for checking if a string contains a particular character or pattern before processing it.
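
One detail worth knowing is that `in` is case-sensitive. The sketch below shows a check that fails because of capitalization, then a case-insensitive version that lowercases the string first with `lower()`, one of the string methods covered later in this notebook. It also shows `not in`, which gives the opposite answer. The names `found` and `missing` are illustrative.

```python
fruit = 'Apple'

# 'a' does not match the uppercase 'A'
print('a' in fruit)            # False

# Lowercasing first makes the check case-insensitive
found = 'a' in fruit.lower()
print(found)                   # True

# not in gives the opposite answer
missing = 'z' not in fruit
print(missing)                 # True
```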

## Formatted strings (f-strings)

When you want to include variable values inside a string, **f-strings** provide a clean syntax. Put `f` before the opening quote, then use curly braces `{}` to embed expressions:

In [None]:
name = "Alice"
age = 25
print(f"Hello, {name}! You are {age} years old.")

Any expression can go inside the braces:

In [None]:
x = 10
print(f"Double x is {x * 2}")

F-strings also support formatting. For example, you can control decimal places:

In [None]:
import math
print(f"Pi to two decimal places is {math.pi:.2f}")

The `:` inside the braces introduces a format specifier. Here `.2f` means "2 decimal places, floating-point format."
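
A few other format specifiers come up often. The sketch below is not a complete list, just three that may be useful: a comma for thousands separators, `%` for percentages, and `>` with a width for right-aligning. The variables `price`, `share`, and `count` are made up for the example.

```python
price = 1234567.891
share = 0.19
count = 42

with_commas = f"{price:,.2f}"   # thousands separators, 2 decimal places
as_percent = f"{share:.0%}"     # multiply by 100, 0 decimal places, add %
padded = f"{count:>6}"          # right-align in a field 6 characters wide

print(with_commas)    # 1,234,567.89
print(as_percent)     # 19%
print(repr(padded))   # '    42'
```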

### Why f-strings?

The few minutes it takes to learn f-strings pay off quickly. Consider this output: `The total is $22.55, which is 19% of the budget.`

In [None]:
amount = 22.55
percent = 19

# With print() and commas: the default separator would put spaces around $ and %, so we set sep=""
print("The total is $", amount, ", which is ", percent, "% of the budget.", sep="")

Using `sep=""` removes *all* separators, so we have to add spaces manually.

In [None]:
# With concatenation — requires str() conversion
print("The total is $" + str(amount) + ", which is " + str(percent) + "% of the budget.")

This works but is verbose. Every number needs `str()`.

In [None]:
# With f-strings — clean and readable
print(f"The total is ${amount}, which is {percent}% of the budget.")

The f-string handles everything naturally: no unwanted spaces, no manual conversion. We'll use f-strings throughout the rest of the course.

## String methods

Strings provide methods that perform a variety of useful operations.
A method is similar to a function — it takes arguments and returns a value — but the syntax is different.
For example, the method `upper` takes a string and returns a new string with all uppercase letters.

Instead of the function syntax `upper(word)`, it uses the method syntax `word.upper()`.

In [None]:
word = 'banana'
new_word = word.upper()
new_word

This use of the dot operator specifies the name of the method, `upper`, and the name of the string to apply the method to, `word`. The empty parentheses indicate that this method takes no arguments.

Some of the most commonly used string methods in Python are tabulated below. The `str.method()` notation indicates that any string object (`str`) can call these methods using dot notation, as with `word.upper()` above.

| **Category**           | **Method**                          | **Description** |
|------------------------|------------------------------------|----------------|
| **Case Manipulation**  | `str.lower()`                      | Converts all characters to lowercase. |
|                        | `str.upper()`                      | Converts all characters to uppercase. |
|                        | `str.title()`                      | Capitalizes the first letter of each word and lowercases the rest. |
|                        | `str.capitalize()`                 | Capitalizes the first character of the string and lowercases the rest. |
| **Whitespace & Formatting** | `str.strip()`                 | Removes leading and trailing whitespace. |
| **Finding & Replacing**| `str.find(sub)`                    | Returns index of first occurrence of `sub`, or `-1` if not found. |
|                        | `str.replace(old, new)`            | Replaces occurrences of `old` with `new`. |
|                        | `str.count(sub)`                   | Counts occurrences of `sub`. |
| **Checking Content**   | `str.isalpha()`                    | Returns `True` if the string is non-empty and all characters are letters. |
|                        | `str.isdigit()`                    | Returns `True` if the string is non-empty and all characters are digits. |
|                        | `str.startswith(prefix)`           | Returns `True` if string starts with `prefix`. |
|                        | `str.endswith(suffix)`             | Returns `True` if string ends with `suffix`. |
| **Splitting & Joining**| `str.split(sep)`                   | Splits the string into a list based on `sep`. |
|                        | `str.join(iterable)`               | Joins elements of `iterable` into a string. |

Note that the values returned by string methods vary — some return strings, others return integers or other types. Use `help()` or [the Python documentation](https://docs.python.org/3/library/stdtypes.html#string-methods) for details. We'll cover `split()` and `join()` after introducing lists in a later notebook.

In [None]:
# Example: chaining methods
messy = "  HELLO  "
clean = messy.strip().lower()
print(clean)
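
To make a few of the table entries concrete, here is a short sketch using the `'banana'` example. It shows that some methods return integers (`find`, `count`), some return new strings (`replace`), and some return `True` or `False`. The variable names are illustrative.

```python
word = 'banana'

position = word.find('na')      # index where 'na' first appears
not_found = word.find('x')      # -1 signals "not found"
a_count = word.count('a')
swapped = word.replace('a', 'o')

print(position)    # 2
print(not_found)   # -1
print(a_count)     # 3
print(swapped)     # bonono

# Content checks return True or False
print('2024'.isdigit())          # True
print(word.startswith('ban'))    # True
print(word.endswith('nas'))      # False
```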

### Exercise: Method Practice

Given `messy = "  HeLLo, WoRLd!  "`:

1. Remove the leading/trailing whitespace
2. Convert to all lowercase
3. Count how many times 'l' appears (case-insensitive — think about the order of operations)
4. Replace 'world' with 'Python' (case-insensitive — again, order matters)
5. Check if the cleaned, lowercased string starts with 'hello'

In [None]:
messy = "  HeLLo, WoRLd!  "
# your code here

#### Solution

In [None]:
messy = "  HeLLo, WoRLd!  "

# 1. Remove whitespace
stripped = messy.strip()
print("Stripped:", stripped)

# 2. Convert to lowercase
lowered = stripped.lower()
print("Lowered:", lowered)

# 3. Count 'l' (case-insensitive: lowercase first, then count)
l_count = lowered.count('l')
print("Count of 'l':", l_count)

# 4. Replace 'world' with 'Python' (search the lowercased string so 'world' matches)
replaced = lowered.replace('world', 'Python')
print("Replaced:", replaced)

# 5. Check if it starts with 'hello'
starts_hello = lowered.startswith('hello')
print("Starts with 'hello':", starts_hello)

## Discussion: Working with Strings

**Single vs. double quotes:** Python treats `'hello'` and `"hello"` identically. Choose based on content — use double quotes when your string contains an apostrophe (`"it's"`), single quotes when it contains a double quote (`'She said "hi"'`). Be consistent within your code.

**Why immutability matters:** Strings being immutable might seem inconvenient, but it has important benefits. It makes strings "safe" to pass around — you can share a string without worrying that some other part of your code will change it unexpectedly. It also allows Python to optimize memory by reusing identical strings. When you need to build a string piece by piece, collect the pieces and join them at the end rather than repeatedly concatenating.

**Methods return new strings:** Because strings are immutable, methods like `upper()` and `replace()` return *new* strings — they don't modify the original. A common mistake is writing `s.upper()` and expecting `s` to change. You need `s = s.upper()` to update the variable.
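
The mistake described above, and two ways to avoid it, can be sketched as follows. The name `shouted` is just an example.

```python
s = 'hello'

s.upper()            # builds 'HELLO', but the result is discarded
print(s)             # hello

shouted = s.upper()  # keep the result under a new name...
print(shouted)       # HELLO

s = s.upper()        # ...or rebind the original name
print(s)             # HELLO
```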

## Common Gotchas

**Off-by-one with length**

`len(s)` returns the count of characters, but the last valid index is `len(s) - 1`. Using `s[len(s)]` causes an `IndexError`.

In [None]:
s = "hello"
print(len(s))      # 5
print(s[4])        # 'o' - last character
# s[5] would cause IndexError

**Slice end is exclusive**

`s[0:3]` gives characters at indices 0, 1, 2 — not 3. The end index is "up to but not including."

In [None]:
s = "hello"
print(s[0:3])  # 'hel', not 'hell'
print(s[1:4])  # 'ell'

**Strings are immutable**

`s[0] = 'X'` doesn't work. You must create a new string instead.

In [None]:
s = "hello"
# s[0] = 'H'  # This would cause TypeError
s = 'H' + s[1:]  # Create new string instead
print(s)

**Negative index gotcha**

`s[-1]` is the last character, `s[-2]` is second-to-last. But `s[-0]` is the same as `s[0]`, not the last character — because `-0` equals `0`.

In [None]:
s = "hello"
print(s[-1])   # 'o' - last
print(s[-0])   # 'h' - same as s[0], NOT last!

**Empty slices don't error**

`s[5:2]` returns `''` (empty string), not an error. This can hide bugs where you accidentally swap indices.

In [None]:
s = "hello"
print("result: [" + s[5:2] + "]")  # empty string - no error!
print("result: [" + s[2:5] + "]")  # 'llo' - correct order

## Glossary

**prompt:**
A string displayed by `input()` that tells the user what to type.

**sequence:**
An ordered collection of values where each value is identified by an integer index.

**character:**
A single element of a string, such as a letter, digit, punctuation mark, or whitespace character.

**index:**
An integer value used to select an item in a sequence, such as a character in a string. In Python indices start from `0`.

**slice:**
A part of a string specified by a range of indices.

**empty string:**
A string that contains no characters and has length `0`.

**object:**
Something a variable can refer to. An object has a type and a value.

**immutable:**
Describes an object whose elements cannot be changed after it is created. Strings are immutable.

**method:**
A function that is associated with an object and called using dot notation.

## Problems

### String Challenges

**★ 1. Palindrome Check**

A palindrome reads the same forwards and backwards (e.g., "radar", "level"). Use slicing to reverse a string, then compare the original to the reversed version. Try it with a few different words and observe the boolean result.

In [None]:
word = "radar"
# Reverse the word and compare it to the original
# Try with "radar", "hello", and "level"

**★★★ 2. Initial Extractor**

Given a full name with exactly three parts like `"Ada Byron Lovelace"`, extract just the initials (`"ABL"`).

Hint: The first initial is at index 0. Use `find(' ')` to locate the first space, then the second initial is one position after that. Use `find(' ', start)` to find the second space starting from a position after the first.

In [None]:
full_name = "Ada Byron Lovelace"
# your code here

**★★ 3. Email Parser**

Given an email like `"student@auburn.edu"`, extract the username (before @) and domain (after @). Hint: use `find()` to locate the @ symbol, then use slicing.

In [None]:
email = "student@auburn.edu"
# your code here - print both username and domain

**★★ 4. Simple Censor**

Replace every occurrence of the word "spam" with "****" in a string. Make it case-insensitive (SPAM, Spam, spam should all be replaced).

In [None]:
text = "I love SPAM! Spam is great. Do you like spam?"
# your code here

### Fix This Code

**★★ 5.** The following code is supposed to extract a username from an email address and capitalize it, but it has several errors. Run the cell to see what happens, then find and fix the errors in the empty cell below.

In [None]:
email = "john.doe@auburn.edu"

# Find the @ symbol
at_position = email.find[@]

# Get everything before the @
username = email[0, at_position]

# Capitalize the username
username.upper()

print("Welcome, " + username)

In [None]:
# Write your corrected version here:

---

Auburn University / Industrial and Systems Engineering  
INSY 3010 / Programming and Databases for ISE  
© Copyright Danny J. O'Leary.

This material is adapted from [*Think Python*, 3rd edition](https://greenteapress.com/wp/think-python-3rd-edition), by Allen B. Downey. For licensing, attribution, and information: [GitHub INSY3010](https://github.com/olearydj/INSY3010)