# Class 5 - Strings

## Introduction

In this notebook, we will explore Python's string data type, `str`.

Strings are sequences of characters used to represent text.

Python makes working with text particularly easy and intuitive compared to many other programming languages.

## The `str` Data Type

In Python, text is represented using the `str` (string) data type. Strings are immutable sequences of characters, which means:
- They contain ordered collections of characters
- Once created, they cannot be changed ("immutable"; though you can create new strings based on them)

### String Literals

To create a string in Python, you enclose text in either single quotes (`'`) or double quotes (`"`). Both work exactly the same way, allowing you flexibility in your code.

In [None]:
# Creating a string with single quotes
'Hello world'

In [None]:
# Creating a string with double quotes
"Hello world"

**Why have two different quote styles?**

Having both single and double quotes is useful when your string itself contains quote characters:
- Use double quotes when your string contains single quotes
- Use single quotes when your string contains double quotes

In [None]:
# Using double quotes when the string contains a single quote (apostrophe)
print("I'm learning Python strings")

# Using single quotes when the string contains double quotes
print('She said, "Python is easy!"')

## Multi-line Strings and Docstrings

Sometimes we need to work with text that spans multiple lines. Python provides several ways to handle this.

In [None]:
# A long string on a single line - can be hard to read in code
print('Some variables might be very very very long. You can write them down as a continuous sentence. However this might not be very readable in certain development environments.')

### Line Continuation with Backslash

You can use the backslash character (`\`) to indicate that a string continues on the next line. This is useful for breaking up long strings in your code while still treating them as a single line of text.

In [None]:
# Using backslash for line continuation
print('Some variables might be very very \
very long. You can write them down as a \
continuous sentence. However this might \
not be very readable in certain \
development environments.')

### Triple Quotes for Multi-line Strings

A more common approach is to use triple quotes (`'''` or `"""`) to create multi-line strings. These preserve the line breaks in your text and are especially useful for documentation.

In [None]:
# Using triple quotes for a multi-line string
print('''Some variables might be very very
very long. You can write them down as a


continuous sentence. However this might
not be very readable in certain
development environments.''')
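To see the difference between the two approaches, the sketch below inspects a short, made-up two-line text with `repr()`, which shows line breaks explicitly as `\n`. The variable names `poem` and `joined` are only for illustration.

```python
# A triple-quoted string keeps its line breaks as newline characters
poem = '''Roses are red,
Violets are blue.'''

print(repr(poem))   # repr() shows the stored newline as \n
print(len(poem))    # the newline counts as one character

# With backslash continuation, no newline is stored in the string
joined = 'Roses are red, \
Violets are blue.'
print(repr(joined))
```

The triple-quoted version contains a newline character, while the backslash version is a single line of text.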

### Docstrings

Triple-quoted strings are commonly used for documentation in Python. When placed at the beginning of a function, class, or module, they become "docstrings" that describe what the code does.

You can access these docstrings using the `help()` function.

In [None]:
# Use the help() function to view the docstring of the max() function
help(max)
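The same mechanism works for code we write ourselves. As a minimal sketch, the hypothetical function `greet` below (it is not part of Python or of this notebook) has a docstring as its first statement, and Python stores that text in the function's `__doc__` attribute.

```python
def greet(name):
    """Return a short greeting for the given name."""
    return 'Hello, ' + name

# The docstring is stored on the function's __doc__ attribute
print(greet.__doc__)

# Calling help(greet) would display the same text together with the signature
print(greet('Dana'))
```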

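## Characters Are Ordered Alphanumerically

Python orders characters by their Unicode code points: digits come before uppercase letters, which come before lowercase letters.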

In [None]:
print('a' > 'b')
print('b' > 'a')
print('z' > 'b')

In [None]:
print('A' < 'Z')

In [None]:
print('9' > '7')
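The comparisons above can be explained with the built-in `ord()` function, which returns the Unicode code point that Python uses when ordering characters. The following is a small sketch, and the words in the last line are arbitrary examples.

```python
# ord() returns the Unicode code point of a single character
print(ord('0'), ord('9'))   # digit characters
print(ord('A'), ord('Z'))   # uppercase letters
print(ord('a'), ord('z'))   # lowercase letters

# Every uppercase letter has a smaller code point than every lowercase letter
print('Z' < 'a')

# Longer strings are compared character by character, left to right
print('apple' < 'apricot')  # decided at the third character: 'p' < 'r'
```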


### Exercise
What do you think the following code will print?

In [None]:
print('9' > '77')

In [None]:
print(9 > 77)

## Characteristics of Strings

Strings in Python have three key characteristics:

1. **Sequence of Characters**: A string is made up of individual characters.
2. **Ordered Collection**: The order of characters matters and is preserved.
3. **Fixed Length**: Each string has a specific length (number of characters).

These properties allow us to access individual characters, extract portions of strings, and perform various operations on them.

In [None]:
# Creating a string and checking its type
s = 'shalom'
type(s)

### String Length

We can find the length of a string (the number of characters it contains) using the `len()` function.

In [None]:
# Finding the length of a string
len(s)

In [None]:
# Another example with spaces
s = 'Hello World'
print(s)
print(len(s))  # Note: spaces count as characters!

### String Indexing

Since strings are ordered sequences, we can access individual characters using their position or "index" in the string. In Python, indexing starts at 0, not 1.

Think of a string as a row of mailboxes, numbered starting from 0:

```
H  e  l  l  o     W  o  r  l  d
0  1  2  3  4  5  6  7  8  9  10
```

In [None]:
# Accessing characters by index
print(s[0])  # First character
print(s[1])  # Second character
print(s[-1]) # Last character (negative indices count from the end)

### Negative Indexing

Python allows negative indices, which count from the end of the string. This is often more convenient than calculating positions from the beginning:


In [None]:
s = 'Indexing in Python'
s[-2]  # Second-to-last character

### Finding the Last Character

There are two common ways to access the last character of a string:

In [None]:
s = 'abcd'
len(s)

How do we index the last character in a string if the length changes?

The **generic** way:

In [None]:
# Method 1: Calculate the last index
s = input()
last_index = len(s) - 1
print(s[last_index], last_index)

The **Pythonic** way:

In [None]:
# Method 2: Use negative indexing
print(s[-1])  # Last character
print(s[-2])  # Second-to-last character
print(s[-3])  # Third-to-last character
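Both methods point at the same characters. The sketch below uses a fixed example string instead of `input()` so that it can run on its own. It checks that `s[-k]` and `s[len(s) - k]` agree, and shows what happens when an index falls outside the string.

```python
s = 'Indexing in Python'

# s[-k] refers to the same character as s[len(s) - k]
for k in (1, 2, 3):
    print(k, s[-k], s[len(s) - k])

# The largest valid index is len(s) - 1; anything beyond raises IndexError
try:
    s[len(s)]
except IndexError as error:
    print('IndexError:', error)
```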

### Quick Exercise

Given a string containing your full name, write code to print your initials (first letter of each name).

In [None]:
# Your solution here
full_name = "John Adam Smith"
# Expected output: JAS

## String Slicing

Slicing allows us to extract a portion (substring) of a string. The syntax is:

```python
string[start:end:step]
```

Where:
- `start` is the index where the slice begins (inclusive)
- `end` is the index where the slice ends (exclusive)
- `step` is the stride between characters (***optional***; the default is 1)

Think of it as specifying "from position X up to (but not including) position Y, taking every Z-th character."

In [None]:
# Basic substring extraction
x = 'keep it simple'
x[5:9]  # Characters at indices 5, 6, 7, 8
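Because `end` is exclusive, a slice with step 1 contains `end - start` characters. The following sketch checks this for the slice above and also shows that slicing, unlike indexing, does not raise an error when the bounds go past the end of the string.

```python
x = 'keep it simple'

part = x[5:9]
print(repr(part))          # repr() makes any spaces visible
print(len(part) == 9 - 5)  # with step 1, the length is end - start

# Out-of-range bounds are clipped to the string instead of raising an error
print(x[5:100])
print(repr(x[50:60]))      # an empty string
```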

### Omitting Slice Parameters

You can omit slice parameters to use their default values:
- Omitting `start` defaults to the beginning of the string (index 0)
- Omitting `end` defaults to the end of the string

In [None]:
# From index 5 to the end
print(x[5:len(x)])
print(x[5:])  # Equivalent, more concise

In [None]:
# From beginning to index 5 (not including 5)
print(x[:5])
print(x[0:5])  # Equivalent

In [None]:
# The entire string
x[:]

### Using Step in Slicing

The `step` parameter lets you take every nth character in the specified range.

In [None]:
# For reference, here's how range works with a step
list(range(2, 20, 3))  # From 2 to 20 (exclusive), step 3

In [None]:
# Skipping every other character (step 2)
x[1:10:2]

In [None]:
# Taking every third character from the entire string
x[::3]
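The connection to `range` can be made explicit. As a rough sketch, the loop below collects the characters of `x` at the indices produced by `range(1, 10, 2)` and compares the result with the slice `x[1:10:2]`. The variable name `picked` is only for illustration.

```python
x = 'keep it simple'

# x[1:10:2] picks the characters at the indices produced by range(1, 10, 2)
picked = ''
for i in range(1, 10, 2):
    picked = picked + x[i]

print(repr(picked))
print(picked == x[1:10:2])
```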

### Reversing a String

One of the most useful applications of the step parameter is reversing a string by using a negative step.

In [None]:
# Original string
s = 'abcd'
s

In [None]:
# Reversing a string with a negative step
s[::-1]

The `[::-1]` slice is a common Python idiom for reversing any sequence. It means "take all characters from beginning to end, in reverse order."
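Since the idiom applies to any sequence, here is a short sketch with a list and a tuple, followed by one typical use on strings: checking whether a word reads the same in both directions. The example values are arbitrary.

```python
# The same slice reverses other sequence types too
numbers = [1, 2, 3, 4]
print(numbers[::-1])

point = ('x', 'y', 'z')
print(point[::-1])

# A word is a palindrome exactly when it equals its own reverse
word = 'level'
print(word == word[::-1])
```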

In [None]:
# Original string remains unchanged (strings are immutable)
s

In [None]:
# Negative values can be used as an end index or as a step
s = "abcdefghi"

print(s[:-2])  # All characters except the last two
print(s[::-2])  # Every other character, in reverse

### Exercise
Ask the user to write their name ***backwards*** and then reverse and print their name in the "correct" order.

In [None]:
# ENTER YOUR CODE HERE

## Combining Strings

Python provides several ways to combine or concatenate strings.

### String Concatenation with the `+` Operator

The most basic way to combine strings is using the `+` operator.

In [None]:
# Basic concatenation
s = 'Hello'
name = 'Danni'
x = s + ' ' + name
print(x)
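One thing to watch for: `+` joins strings only to other strings. The sketch below, using a made-up `age` value, shows the error raised when a number is mixed in and how converting it with `str()` avoids the problem.

```python
age = 25

# '+' only joins strings to strings; mixing in a number raises TypeError
try:
    message = 'Age: ' + age
except TypeError as error:
    print('TypeError:', error)

# Converting the number with str() first makes the concatenation work
message = 'Age: ' + str(age)
print(message)
```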

### String Repetition with the `*` Operator

You can repeat a string multiple times using the `*` operator.

In [None]:
# Repeating a string
(s + ' ') * 10

### String Formatting

For more complex string combinations, especially when including variables or expressions, Python offers dedicated formatting tools. Two common ones are:

1. f-strings
2. The `format()` method

In [None]:
name = "Alice"
age = 25

# f-string (recommended for most cases)
print(f"Hello, {name}! You are {age} years old. When you finish your degree you will be {age + 3} years old.")

# format() method
print("Hello, {}! You are {} years old.".format(name, age))
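The two approaches produce the same text, as the sketch below checks. It also shows two optional features that may be useful later: named placeholders in `format()` and a format specification after `:` inside the braces. The `average` value is invented for the example.

```python
name = "Alice"
age = 25

with_fstring = f"{name} is {age} years old."
with_format = "{} is {} years old.".format(name, age)
print(with_fstring == with_format)

# format() also accepts named placeholders
print("{who} is {years} years old.".format(who=name, years=age))

# A format specification after ':' controls presentation, here two decimals
average = 91.456
print(f"Average grade: {average:.2f}")
```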


## Strings are Immutable

One of the most important characteristics of strings in Python is that they are **immutable**, meaning they cannot be changed after creation. Any operation that appears to modify a string actually creates a new string.

In [None]:
# Create a string
s = 'Hello'

In [None]:
# We can access individual characters
print(s)
s[0]

In [None]:
# But we cannot modify individual characters: this line raises a TypeError
s[0] = 'h'

### Working with Immutable Strings

To "modify" a string, you need to create a new string with the desired changes.

In [None]:
# You can reassign the variable to a new string
s = input()
s

In [None]:
# To "change" a character, create a new string with the desired modification
s2 = s[0] + '#' + s[2:]
s2
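The same pattern can be wrapped in a small helper. `replace_at` below is a hypothetical function written for this illustration, not a built-in, and it assumes that `index` is a valid non-negative position in the string.

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced by new_char."""
    # Assumes 0 <= index < len(text)
    return text[:index] + new_char + text[index + 1:]

s = 'Hello'
s2 = replace_at(s, 1, '#')
print(s2)
print(s)   # the original string is untouched
```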

## String Methods

Python strings come with many built-in methods that allow you to manipulate and analyze text. A method is a function that "belongs to" an object - in this case, a string.

In [None]:
s = 'hello world'
s

### Case Conversion Methods

Python provides several methods to change the case of strings:

In [None]:
# Uppercasing only part of the string by combining slicing with upper()
print(s[0:5] + s[5:].upper())

In [None]:
# Using the upper() method
print(s.upper())

### Important: Methods Don't Modify the Original String

Because strings are immutable, string methods never change the string they are called on. They return a new value instead (usually a new string), and the original string remains unchanged.

In [None]:
# The original string is unchanged
s

In [None]:
# To keep the changes, assign the result back to a variable
s = s.upper()   # <--- PAY ATTENTION TO THE "=" (ASSIGNMENT) OPERATOR
print(s)

### Common String Methods

Here are some useful string methods:

In [None]:
s = 'Keep it simple'
print(1, s)

# split() - Splits a string into a list of substrings
words = s.split()
print(2, words)
print(3, type(words))
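`split()` behaves slightly differently depending on whether a separator is passed. The sketch below uses a string with extra spaces and a made-up date to show both cases.

```python
s = 'Keep   it simple'

# With no argument, split() treats any run of whitespace as one separator
print(s.split())

# With an explicit separator, every occurrence splits, so empty strings can appear
print(s.split(' '))

# A different separator, here for a date written as year-month-day
year, month, day = '2024-01-31'.split('-')
print(year, month, day)
```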

In [None]:
# Case conversion methods
s = s.upper()
print(s)
print(s.lower())

In [None]:
# capitalize() - Capitalizes the first character and lowercases the rest
'keep it SIMPLE'.capitalize()

In [None]:
# count() - Counts occurrences of a substring
s = 'ABBACTTGCCABCAB'
s.count('AB')  # Count all occurrences of 'AB'

In [None]:
# count() with start and end parameters
s.count('AB', 2, len(s)-3)  # Count occurrences between positions 2 and len(s)-3
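Two details of `count()` are worth checking: the optional `start` and `end` arguments restrict the search to the corresponding slice, and matches are counted without overlapping. A short sketch:

```python
s = 'ABBACTTGCCABCAB'

# count(sub, start, end) counts within s[start:end]
print(s.count('AB', 2, len(s) - 3) == s[2:len(s) - 3].count('AB'))

# Matches are counted without overlapping
print('AAAA'.count('AA'))
```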

In [None]:
# replace() - Replaces occurrences of a substring
s = 'keep it simple'
s.replace('e', 'E')  # Replace all 'e' with 'E'

In [None]:
# replace() with count parameter
s = 'keep it simple'
s.replace('e', 'E', 2)  # Replace only the first 2 occurrences of 'e'

In [None]:
# find() - Returns the lowest index of a substring, or -1 if it is not found
s.find('p')  # First occurrence of 'p'

In [None]:
# rfind() - Returns the highest index of a substring (searching from the right)
s.rfind('p')  # Last occurrence of 'p'
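When the substring does not occur at all, `find()` and `rfind()` return `-1` rather than raising an error. The sketch below shows this, and also how the returned index can be combined with slicing. The name `first_space` is only for illustration.

```python
s = 'keep it simple'

print(s.find('p'), s.rfind('p'))  # first and last position of 'p'
print(s.find('z'))                # -1 signals that 'z' does not occur

# The index from find() can feed a slice: everything after the first space
first_space = s.find(' ')
print(s[first_space + 1:])
```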

In [None]:
# strip() - Removes whitespace from beginning and end
"   hello world   ".strip()

In [None]:
# startswith() and endswith() - Check if string starts or ends with a substring
filename = "document.pdf"
print(f"Starts with 'doc': {filename.startswith('doc')}")
print(f"Ends with '.txt': {filename.endswith('.txt')}")
print(f"Ends with '.pdf': {filename.endswith('.pdf')}")

In [None]:
# join() - Combines a list of strings, using the string it is called on as the separator
words = ["Python", "is", "awesome"]
" ".join(words)

In [None]:
# Different separator
"-".join(words)

You can find the complete documentation online at:
- [Python String Methods](https://docs.python.org/3/library/stdtypes.html#string-methods)

For a quick video explanation on methods: https://youtu.be/dbU91k-C5aY

## Practice Exercises

Here are a few exercises to practice what you've learned. Try to solve them on your own before looking at the solutions.

### Exercise 1: Name Formatter

Write a function that takes a full name (first and last name) and returns it in the format "Last, First". For example, "John Smith" should become "Smith, John".

In [None]:
# Your solution here


### Exercise 2: Email Validator

Write a function that checks if a string looks like a valid email address. For simplicity, consider an email valid if it contains an @ symbol with text before and after it, and ends with ".com", ".org", or ".edu".

In [None]:
# Your solution here


### Exercise 3: Word Counter

Write a function that counts the number of words in a sentence. For simplicity, assume words are separated by spaces.

In [None]:
# Your solution here
