# Session 03: String operations

In this session we will learn how to manipulate strings in Python. We will learn how to:

* Extract the properties of strings
* Use string methods to manipulate strings
* Use the `in` operator to check if a string is a substring of another string
* Slice and index strings
* Use string formatting
* Concatenate strings


## Strings recap

Strings are sequences of characters. They are immutable, which means that once they are created, they cannot be changed. Strings can be created using single quotes, double quotes, or triple quotes. Here are some examples:

```python
string1 = 'Hello, World!'

string2 = "Hello, World!"

string3 = '''Hello, World!'''
```
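
Triple quotes are mostly useful when a string spans several lines, and choosing between single and double quotes can avoid escaping. A small sketch of both ideas (the names `poem` and `quote` are only illustrative):

```python
# Triple quotes let a string span several lines
poem = '''first line
second line'''

# Double quotes make it easy to include an apostrophe
quote = "It's a string"

print(poem)
print(quote)
```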

We can check the type of a string using the `type()` function:

```python
print(type(string1))
```

> `<class 'str'>`

### Strings are immutable

Because strings are immutable, trying to assign a new character to a position in an existing string raises a `TypeError`. Here is an example:

In [1]:
my_str = 'abcdef'

try:
    my_str[0] = 'T'  # try to change the first character
except TypeError as e:
    print(f"Error: {e}")  # print the error message if it can't be changed

Error: 'str' object does not support item assignment
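
Since a string cannot be edited in place, a common workaround is to build a new string from pieces of the old one. The sketch below uses slicing and concatenation, both covered later in this session; the name `new_str` is only illustrative:

```python
my_str = 'abcdef'

# Build a new string: 'T' followed by everything from index 1 onwards
new_str = 'T' + my_str[1:]

print(new_str)  # Tbcdef
print(my_str)   # abcdef (the original is untouched)
```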


## String properties

A property is a characteristic of an object. In Python, strings are objects and have properties.

We are going to focus on one property in this session: the length of a string. The length of a string is the number of characters in the string. We can get the length of a string using the `len()` function:

```python
string = 'Hello, World!'

print(len(string))
```

> `13`

This means that the string `'Hello, World!'` has 13 characters, including spaces and punctuation marks like commas and exclamation marks.

In [2]:
my_str = 'this is python stuff'

print(len(my_str))

20


The previous example tells us that the string `this is python stuff` has 20 characters.
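
Every character counts towards the length, including ones that are hard to see. A short sketch with a few edge cases (the variable names are only illustrative):

```python
empty = ''
spaces = '   '
two_lines = 'a\nb'  # \n is a single newline character

print(len(empty))      # 0
print(len(spaces))     # 3
print(len(two_lines))  # 3
```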

## String methods

A method is a function that belongs to an object and is called on it using dot notation, as in `my_str.upper()`. Strings are objects, so they have methods.

There are many string methods in Python. Some of the most common string methods are:


Change the case of a string:
* `upper()`: Converts all characters in a string to uppercase.
* `lower()`: Converts all characters in a string to lowercase.
* `capitalize()`: Converts the first character of a string to uppercase and the rest to lowercase.
* `title()`: Converts the first character of each word in a string to uppercase and the rest of each word to lowercase.
* `swapcase()`: Converts uppercase characters to lowercase and lowercase characters to uppercase.

Remove characters from the beginning of a string, the end, or both (whitespace by default, or the characters passed as an argument):
* `strip()`: Removes leading and trailing whitespace from a string.
* `lstrip()`: Removes leading whitespace from a string.
* `rstrip()`: Removes trailing whitespace from a string.

Check if a string starts or ends with a specified substring:
* `startswith()`: Returns `True` if a string starts with a specified substring.
* `endswith()`: Returns `True` if a string ends with a specified substring.

Replacing characters in a string:
* `replace()`: Replaces a specified substring with another substring.

Count the number of occurrences of a substring in a string:
* `count()`: Returns the number of occurrences of a specified substring in a string.

Break a string into substrings:
* `split()`: Splits a string into a list of substrings.

Join a list of substrings into a single string:
* `join()`: Joins a list of substrings into a single string.

Find the index of a substring in a string:
* `find()`: Returns the index of the first occurrence of a specified substring, or `-1` if the substring is not found.

### These methods won't change the original string

It is important to note that string methods won't change the original string. Instead, they will return a new string with the desired changes. Here is an example:

In [3]:
original_str = 'this is python'

print(original_str.upper())

THIS IS PYTHON


The method returns a new, uppercase string, which is what `print()` displays, but the original string remains unchanged in memory. We can print the original string to confirm this:

In [4]:
print(original_str)

this is python
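
To keep the result of a method, we have to assign it to a variable. A minimal sketch, assuming we want to keep both versions (the name `shouted_str` is only illustrative):

```python
original_str = 'this is python'

# upper() returns a new string; store it under its own name to keep it
shouted_str = original_str.upper()

print(shouted_str)   # THIS IS PYTHON
print(original_str)  # this is python
```

Assigning the result back to the same name (`original_str = original_str.upper()`) is also possible when the original value is no longer needed.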


### `upper()`, `lower()`, `capitalize()`, and `title()`

In [5]:
test_string = 'a string'

print(test_string.upper())

A STRING


In [12]:
test_string = 'ANOTHER STRING'

print(test_string.lower())

another string


In [7]:
test_string = 'ha ha ha'

print(test_string.capitalize())

Ha ha ha


In [8]:
test_string = 'ha ha ha'

print(test_string.title())

Ha Ha Ha


In [9]:
test_string = 'Ha Ha Ha'

print(test_string.swapcase())

hA hA hA


### `strip()`, `lstrip()`, and `rstrip()`

In [13]:
weird_string = '     is this a string?   '

weird_string

'     is this a string?   '

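In [19]:
weird_string = '     is this a string?   '
weird_string_2 = 'xxxis this a string?xxx'

weird_string.strip()       # removes the surrounding whitespace
weird_string_2.strip('x')  # removes the surrounding 'x' characters; only this last result is displayed

'is this a string?'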

In [15]:
l_weird_string = '     is this a string?'

l_weird_string.lstrip()

'is this a string?'

In [16]:
r_weird_string = 'is this a string?   '

r_weird_string.rstrip()

'is this a string?'
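
When an argument is passed to `strip()`, `lstrip()`, or `rstrip()`, it is treated as a set of characters to remove, not as a prefix or suffix. A small sketch with a made-up string (`padded` is only illustrative):

```python
padded = '--==is this a string?==--'

# Every leading and trailing '-' or '=' is removed, in any order
both = padded.strip('-=')
left = padded.lstrip('-=')
right = padded.rstrip('-=')

print(both)   # is this a string?
print(left)   # is this a string?==--
print(right)  # --==is this a string?
```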

### `startswith()` and `endswith()`

In [26]:
string = 'this is a string'

string.startswith('t')

True

In [28]:
# it works with substrings of more than one character
string.startswith('this')

True

In [29]:
# it is case sensitive
string.startswith('Th')

False

In [31]:
string = 'this is a string'

string.endswith('g')

True
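
Both methods also accept a tuple of candidates, and a case-insensitive check can be done by normalising the case first. A short sketch of these two variations (the result names are only illustrative):

```python
string = 'this is a string'

# A tuple of candidates: True if the string starts with any of them
starts_any = string.startswith(('This', 'this'))

# Case-insensitive check: lowercase the string before testing
starts_ignoring_case = string.lower().startswith('th')

# True only if the string ends with one of these punctuation marks
ends_with_punctuation = string.endswith(('.', '!', '?'))

print(starts_any)             # True
print(starts_ignoring_case)   # True
print(ends_with_punctuation)  # False
```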

### `replace()`

In [28]:
long_string = 'the quick brown fox jumps over the lazy dog'
long_string_2 = 'the quick brown fox jumps over the lazy dog quickly'

print(long_string.replace('quick', 'slow')) # case sensitive
print(long_string_2.replace('quick', 'slow')) # will also change parts of words (quickly / slowly)

the slow brown fox jumps over the lazy dog
the slow brown fox jumps over the lazy dog slowly
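
By default `replace()` changes every occurrence. It also accepts an optional third argument that limits how many occurrences are replaced. A minimal sketch, reusing the second sentence from above (`first_only` is an illustrative name):

```python
long_string_2 = 'the quick brown fox jumps over the lazy dog quickly'

# Replace only the first occurrence of 'quick'
first_only = long_string_2.replace('quick', 'slow', 1)

print(first_only)
# the slow brown fox jumps over the lazy dog quickly
```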


### `count()`

In [23]:
santa = 'Ho ho ho'

santa.count('ho') # case sensitive

2

In [24]:
santa.count('Ho') # case sensitive

1
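
To count occurrences regardless of case, one option is to lowercase the string first. Note also that `count()` only counts non-overlapping occurrences. A short sketch (the names `total` and `overlapping` are only illustrative):

```python
santa = 'Ho ho ho'

# Lowercase first so that 'Ho' and 'ho' are counted together
total = santa.lower().count('ho')
print(total)  # 3

# Occurrences cannot overlap: 'aaa' contains 'aa' only once
overlapping = 'aaa'.count('aa')
print(overlapping)  # 1
```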

### `split()`

In [29]:
phrase = 'this is a phrase, and it is a good phrase'

phrase.split()

['this', 'is', 'a', 'phrase,', 'and', 'it', 'is', 'a', 'good', 'phrase']

By default, the `split()` method splits a string at whitespace characters.

We can specify a different separator using the `sep` parameter:

In [30]:
phrase.split('is')

['th', ' ', ' a phrase, and it ', ' a good phrase']
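
Two details are worth seeing in action: an optional second argument limits the number of splits, and splitting on an explicit space behaves differently from the default when there are repeated spaces. A sketch with made-up data (`record` and `spaced` are hypothetical examples):

```python
record = 'Dani,Garcia,99'  # hypothetical comma-separated record

all_fields = record.split(',')      # split at every comma
first_split = record.split(',', 1)  # split at the first comma only

print(all_fields)   # ['Dani', 'Garcia', '99']
print(first_split)  # ['Dani', 'Garcia,99']

spaced = 'a  b'  # two spaces between the letters

print(spaced.split())     # ['a', 'b']
print(spaced.split(' '))  # ['a', '', 'b']
```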

### `join()`

In [31]:
phrase = 'this is a phrase'

words_list = phrase.split()

words_list

['this', 'is', 'a', 'phrase']

In [32]:
'XX'.join(words_list)

'thisXXisXXaXXphrase'

We choose the separator that we want to use to join the substrings, and we call the `join()` method on the separator.
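
`join()` only works when every element of the list is a string, so other values have to be converted first. A minimal sketch (the list `numbers` is a hypothetical example, and the list comprehension is just one way to do the conversion):

```python
words_list = ['this', 'is', 'a', 'phrase']

# Joining with a space rebuilds the original phrase
sentence = ' '.join(words_list)
print(sentence)  # this is a phrase

numbers = [1, 2, 3]

# Convert each number to a string before joining
numbers_text = ', '.join([str(n) for n in numbers])
print(numbers_text)  # 1, 2, 3
```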

### `find()`

In [43]:
phrase = 'this is a phrase'

phrase.find('is')

2
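
`find()` returns `-1` when the substring is not present, and an optional second argument sets the index where the search starts. A short sketch using the same phrase (the result names are only illustrative):

```python
phrase = 'this is a phrase'

first = phrase.find('is')        # first occurrence, inside 'this'
second = phrase.find('is', 3)    # start searching at index 3
missing = phrase.find('python')  # substring is not present

print(first)    # 2
print(second)   # 5
print(missing)  # -1
```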

## The `in` operator

The `in` operator is used to check if a string is a substring of another string. The `in` operator returns `True` if the substring is found in the string, and `False` otherwise. Here is an example:

In [56]:
'c' in 'abcdef'

True

In [57]:
'bc' in 'abcdef'

True

In [59]:
'ae' in 'abcdef' # it looks for the whole substring

False

In [60]:
'A' in 'abcdef' # it is case sensitive

False
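
As with the string methods, a case-insensitive check can be done by lowercasing both sides, and `not in` gives the opposite answer. A minimal sketch (`text` and `query` are illustrative names):

```python
text = 'abcdef'
query = 'BC'

# Lowercase both sides to ignore case
found_ignoring_case = query.lower() in text.lower()
print(found_ignoring_case)  # True

# `not in` is True when the substring is absent
z_is_absent = 'z' not in text
print(z_is_absent)  # True
```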

## Indexing and slicing strings

As we have seen in the `find()` method, the result of 

```python
phrase.find('is')
```

is `2`. This means that the substring `'is'` starts at index `2` in the string `'this is a phrase'`.

Python uses zero-based indexing: when we count elements in a sequence, we start from `0`. This means that the first element in a sequence has index `0`, the second element has index `1`, and so on.

In [64]:
# extract the second letter of the string

my_string = 'abcdef'

my_string[1]

'b'

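We can extract the last element in a sequence knowing the length of the sequence. It is tempting to use the length itself as the index:

In [65]:
length = len(my_string)

try:
    my_string[length]  # index 6 does not exist
except IndexError as e:
    print(f"Error: {e}")

Error: string index out of range

The previous example returns an error because the number of characters in the string is 6, but the maximum index is 5.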

We should use `len(my_string) - 1` to get the last character in the string.

In [46]:
my_string[length - 1]

'f'

Or we can use negative indexing to get the last character in the string, and count from the end of the string. 

When counting from the end of the string, the last character has index `-1`, the second-to-last character has index `-2`, and so on.

In [47]:
my_string[-1]

'f'

In [48]:
my_string[-3]

'd'

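In Python, we extract elements from a sequence using square brackets `[]`. A single index, as in the examples above, returns one character; this is called indexing. A range of indexes returns a substring; this is called slicing. We can slice strings using the following syntax:

```python
string[start:stop:step]
```

* `start`: The index of the first element to include in the slice (inclusive). The default value is `0`.
* `stop`: The index where the slice ends (exclusive): the element at this index is not included. The default value is the length of the sequence, so the slice runs to the end.
* `step`: The step size to use when slicing the string. It is optional, together with the second colon, and its default value is `1`. With a negative `step` the slice runs backwards, and the defaults for `start` and `stop` are reversed accordingly.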
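
The effect of each part of the slice, and of leaving parts out, can be sketched on a short string:

```python
my_string = 'abcdef'

print(my_string[1:4])    # bcd     (indexes 1, 2 and 3; index 4 is excluded)
print(my_string[:3])     # abc     (start defaults to the beginning)
print(my_string[3:])     # def     (stop defaults to the end)
print(my_string[:])      # abcdef  (a full copy)
print(my_string[::2])    # ace     (every second character)
print(my_string[2:100])  # cdef    (an out-of-range stop is clipped, no error)
```

Unlike indexing, slicing does not raise an error when the indexes fall outside the string.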

### Practice: slicing strings

#### Example 1

Extract the first three letters from the alphabet:

In [42]:
alphabet = 'abcdefghijklmnopqrstuvwxyz'

first_3 = alphabet[0:3]
print(first_3)

abc


#### Example 2

Extract the last three letters from the alphabet, using the length of the alphabet.

In [43]:
alphabet_length = len(alphabet)  # 26

print(alphabet[alphabet_length - 3:alphabet_length])

xyz


#### Example 3

Extract the last three letters from the alphabet, using negative indexing.

In [56]:
print(alphabet[-3:])

xyz


#### Example 4

Extract all the letters in the alphabet, except the first and last characters.

In [52]:
print(alphabet[1:25])

bcdefghijklmnopqrstuvwxy


#### Example 5

Extract all the letters in the alphabet in reverse order.

In [55]:
print(alphabet[::-1])

zyxwvutsrqponmlkjihgfedcba


#### Example 6

Extract every other character in the alphabet, starting from the first letter.

In [62]:
print(alphabet[0:26:2])

acegikmoqsuwy
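
Changing `start` shifts which letters are picked. A sketch of the variant that begins at the second letter (index `1`) and leaves `stop` at its default; the name `from_second` is only illustrative:

```python
alphabet = 'abcdefghijklmnopqrstuvwxyz'

# Start at index 1 and take every second character until the end
from_second = alphabet[1::2]

print(from_second)  # bdfhjlnprtvxz
```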


## String formatting

String formatting lets us include the values of existing variables in a new string.

One way to do this is with f-strings: we prefix the string with `f` and write the variable names inside curly braces `{}`:

```python
age = 99 # it will stay true for the rest of my teaching career

f'Dani is not {age} years old'
```

> `'Dani is not 99 years old'`

In [51]:
name = 'Dani'

age = 99

f"My name is {name} and I am {age} years old"

'My name is Dani and I am 99 years old'

In [52]:
f"My name in uppercase is {name.upper()}" # we can apply methods to the variables inside the curly braces

'My name in uppercase is DANI'

In [53]:
pi = 3.14159265359
radius = 10

f"The area of a circle with radius {radius} is {pi * radius ** 2}"

'The area of a circle with radius 10 is 314.159265359'

In [54]:
f"The area of a circle with radius {radius} is {pi * radius ** 2:.3f}" # we can format the output

'The area of a circle with radius 10 is 314.159'
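
The part after the colon is a format specification, and `.3f` is only one of many options. A few other common ones, sketched with made-up values (`price`, `ratio`, `label`, and `number` are hypothetical):

```python
price = 1234.5
ratio = 0.256
label = 'Dani'
number = 7

with_commas = f"{price:,.2f}"  # thousands separator and 2 decimals
as_percent = f"{ratio:.1%}"    # multiply by 100 and add a % sign
right_aligned = f"{label:>10}" # pad on the left to a width of 10
zero_padded = f"{number:03d}"  # pad with zeros to a width of 3

print(with_commas)    # 1,234.50
print(as_percent)     # 25.6%
print(right_aligned)  # Dani, preceded by six spaces
print(zero_padded)    # 007
```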

## Concatenating strings

Concatenating strings is the process of joining two or more strings together. We can concatenate strings using the `+` operator.

In [63]:
name = 'Dani'

last_name = 'Garcia'

full_name = name + ' ' + last_name

full_name

'Dani Garcia'

We could do the same with f-strings:

In [64]:
name = 'Dani'

last_name = 'Garcia'

full_name = f"{name} {last_name}"

full_name

'Dani Garcia'

We can repeat a string multiple times using the `*` operator:

In [67]:
santa_short = 'Ho'

full_santa = santa_short * 3

full_santa

'HoHoHo'
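
The `+` operator only joins strings with other strings. Trying to concatenate a string and a number raises a `TypeError`, so the number has to be converted with `str()` first, or placed in an f-string. A minimal sketch (the variable `message` is only illustrative):

```python
name = 'Dani'
age = 99

try:
    message = name + ' is ' + age  # str + int is not allowed
except TypeError as e:
    print(f"Error: {e}")

# Convert the number to a string before concatenating
message = name + ' is ' + str(age)

print(message)  # Dani is 99
```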

## Practice

### Exercise 1

Create a string that introduces yourself. The string should include your name, age, and favorite color.

Use string formatting to include the values of the variables in the string.

### Exercise 2

Create a string that includes all the letters in the alphabet with even indexes, using a variable that contains the alphabet, and using f-strings.

### Interlude: the `input()` function

The `input()` function is a built-in function in Python that allows us to get input from the user. The `input()` function takes a string as an argument, which is the prompt that will be displayed to the user.

We can save the input in a variable, and use that information in our program.

In [55]:
name = input('What is your name? ')

print(f'Your name is {name}')

Your name is Dani
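
Note that `input()` always returns a string, even when the user types digits, so the value has to be converted before doing arithmetic with it. A minimal sketch; `raw_age` is a hypothetical stand-in for what `input()` would return, so that the example runs without user interaction:

```python
# Stand-in for: raw_age = input('How old are you? ')
raw_age = '30'

print(type(raw_age))  # <class 'str'>

# Convert the text to an integer before doing arithmetic
age = int(raw_age)

print(f'Next year you will be {age + 1}')  # Next year you will be 31
```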


### Exercise 3

Using the `input()` function, ask the user for their name, age, and favorite color.

Create a string that introduces the user, using the values entered by the user.

### Exercise 4

How can I check if a string and its reverse are both included in a palindrome?

Use this palindrome: `Go hang a salami, I’m a lasagna hog`, and look for the word `salami` and its reverse.