# Manipulating Strings

(https://automatetheboringstuff.com/chapter6/)

(https://youtu.be/0gybjpkN-UY)

Text is one of the most common forms of data your programs will handle. You already know how to concatenate two string values together with the + operator, but you can do much more than that. You can extract partial strings from string values, add or remove spacing, convert letters to lowercase or uppercase, and check that strings are formatted correctly. You can even write Python code to access the clipboard for copying and pasting text.

In this chapter, you'll learn all this and more. Then you'll work through two different programming projects: a simple password manager and a program to automate the boring chore of formatting pieces of text.

# Working With Strings

Let's look at some of the ways Python lets you write, print, and access strings in your code.

# String Literals

Typing string values in Python code is fairly straightforward: They begin and end with a single quote. But then how can you use a quote inside a string? Typing 'That is Alice's cat.' won't work, because Python thinks the string ends after Alice, and the rest (s cat.') is invalid Python code. Fortunately, there are multiple ways to type strings.

# Double Quotes

Strings can begin and end with double quotes, just as they do with single quotes. One benefit of using double quotes is that the string can have a single quote character in it. Enter the following into the interactive shell:

In [2]:
spam = "That is Alice's cat."

Since the string begins with a double quote, Python knows that the single quote is part of the string and not marking the end of the string. However, if you need to use both single quotes and double quotes in the string, you'll need to use escape characters.

# Escape Characters

An escape character lets you use characters that are otherwise impossible to put into a string. An escape character consists of a backslash (\) followed by the character you want to add to the string. (Despite consisting of two characters, it is commonly referred to as a singular escape character.) For example, the escape character for a single quote is \'. You can use this inside a string that begins and ends with single quotes. To see how escape characters work, enter the following into the interactive shell:

In [4]:
spam = 'Say hi to Bob\'s mother.'

In [5]:
spam

"Say hi to Bob's mother."

Python knows that since the single quote in Bob\'s has a backslash, it is not a single quote meant to end the string value. The escape characters \' and \" let you put single quotes and double quotes inside your strings, respectively.

Table 6-1 lists the escape characters you can use.

Table 6-1. Escape Characters

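| Escape character | Prints as |
| --- | --- |
| `\'` | Single quote |
| `\"` | Double quote |
| `\t` | Tab |
| `\n` | Newline (line break) |
| `\\` | Backslash |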

Enter the following into the interactive shell:

In [6]:
print("Hello there!\nHow are you?\nI\'m doing fine.")

Hello there!
How are you?
I'm doing fine.
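
As a small sketch of the point that an escape character is typed as two characters but counts as one, the following checks the length of each escape character and builds a string that mixes both kinds of quotes. The variable names are made up for this illustration.

```python
# Each escape character is typed with a backslash plus one more character,
# but Python stores it as a single character.
newline = '\n'
tab = '\t'
backslash = '\\'
single_quote = '\''

lengths = [len(s) for s in (newline, tab, backslash, single_quote)]
print(lengths)  # [1, 1, 1, 1]

# A string that needs both single and double quotes must escape one kind.
quote = 'She said, "That is Alice\'s cat."'
print(quote)  # She said, "That is Alice's cat."
```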


# Raw Strings

You can place an r before the beginning quotation mark of a string to make it a raw string. A raw string completely ignores all escape characters and prints any backslash that appears in the string. For example, type the following into the interactive shell:

In [7]:
print(r'That is Carol\'s cat.')

That is Carol\'s cat.


Because this is a raw string, Python considers the backslash as part of the string and not as the start of an escape character. Raw strings are helpful if you are typing string values that contain many backslashes, such as the strings used for regular expressions described in the next chapter.
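
To make the difference concrete, here is a minimal sketch that writes the same text twice, once with escaped backslashes and once as a raw string. The Windows-style path is a hypothetical example, not a file the chapter refers to.

```python
# The same text written with escaped backslashes and as a raw string.
escaped = 'C:\\new_folder\\table.txt'
raw = r'C:\new_folder\table.txt'

print(raw)             # C:\new_folder\table.txt
print(escaped == raw)  # True

# In a normal string \n is one newline character;
# in a raw string it stays a backslash followed by the letter n.
print(len('\n'))   # 1
print(len(r'\n'))  # 2
```

Without the r prefix or the doubled backslashes, the \n and \t in this path would be read as a newline and a tab.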

# Multiline Strings With Triple Quotes

While you can use the \n escape character to put a newline into a string, it is often easier to use multiline strings. A multiline string in Python begins and ends with either three single quotes or three double quotes. Any quotes, tabs, or newlines in between the "triple quotes" are considered part of the string. Python's indentation rules for blocks do not apply to lines inside a multiline string.

Open the file editor and write the following:

In [1]:
print('''Dear Alice,

Eve's cat has been arrested for catnapping, cat burglary, and extortion.

Sincerely,
Bob''')

Dear Alice,

Eve's cat has been arrested for catnapping, cat burglary, and extortion.

Sincerely,
Bob


Notice that the single quote character in Eve's does not need to be escaped. Escaping single and double quotes is optional in multiline strings. The following print() call would print identical text but doesn't use a multiline string:

In [2]:
print('Dear Alice,\n\nEve\'s cat has been arrested for catnapping, cat burglary, and extortion.\n\nSincerely,\nBob')

Dear Alice,

Eve's cat has been arrested for catnapping, cat burglary, and extortion.

Sincerely,
Bob
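
One way to confirm that the two forms produce the same string is to store each in a variable and compare them. This is only a sketch; the variable names multiline and single_line are invented for the comparison.

```python
# The letter written as a multiline string...
multiline = '''Dear Alice,

Eve's cat has been arrested for catnapping, cat burglary, and extortion.

Sincerely,
Bob'''

# ...and as an ordinary string with \n escape characters.
single_line = 'Dear Alice,\n\nEve\'s cat has been arrested for catnapping, cat burglary, and extortion.\n\nSincerely,\nBob'

print(multiline == single_line)  # True
print(multiline.count('\n'))     # 5
```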


# Multiline Comments

While the hash character (#) marks the beginning of a comment for the rest of the line, a multiline string is often used for comments that span multiple lines. The following is perfectly valid in Python code:

In [6]:
"""This is a test Python program.
Written by Al Sweigart al@inventwithpython.com

This program was designed for Python 3, not Python 2.
"""

def spam():
    """This is a multiline comment to help
    explain what the spam() function does."""
    print('Hello!')

spam()

Hello!
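
A multiline "comment" is still an ordinary string value. As a rough sketch of what Python does with it: a string written as a statement on its own is evaluated and discarded, and when the string is the first statement in a function, Python also stores it in the function's __doc__ attribute. The chapter does not discuss __doc__; it is shown here only to illustrate that the text is a real string.

```python
def spam():
    """This is a multiline comment to help
    explain what the spam() function does."""
    print('Hello!')

# A string used as a standalone statement is evaluated and then discarded.
"""This string does nothing when the program runs."""

# The string at the top of the function is kept in the __doc__ attribute.
print(spam.__doc__)
spam()
```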


# Indexing And Slicing Strings

Strings use indexes and slices the same way lists do. You can think of the string 'Hello world!' as a list and each character in the string as an item with a corresponding index.

![image.png](attachment:image.png)

The space and exclamation point are included in the character count, so 'Hello world!' is 12 characters long, from H at index 0 to ! at index 11.
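
The index of every character can be listed with a short loop. This sketch uses the built-in enumerate() function and also prints the equivalent negative index for each position.

```python
spam = 'Hello world!'

# Print each character with its index and its equivalent negative index.
for index, character in enumerate(spam):
    print(index, index - len(spam), repr(character))

print(len(spam))  # 12
```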

Enter the following into the interactive shell:

In [7]:
spam = 'Hello world!'

In [8]:
spam[0]

'H'

In [9]:
spam[4]

'o'

In [10]:
spam[-1]

'!'

In [11]:
spam[0:5]

'Hello'

In [12]:
spam[:5]

'Hello'

In [13]:
spam[6:]

'world!'

If you specify an index, you'll get the character at that position in the string. If you specify a range from one index to another, the starting index is included and the ending index is not. That's why, if spam is 'Hello world!', spam[0:5] is 'Hello'. The substring you get from spam[0:5] will include everything from spam[0] to spam[4], leaving out the space at index 5.
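
The include-the-start, exclude-the-end rule has a few handy consequences, sketched below. The names start, end, and piece are chosen only for this example.

```python
spam = 'Hello world!'
start, end = 0, 5

piece = spam[start:end]
print(piece)                       # Hello
print(len(piece) == end - start)   # True

# Because the ending index is excluded, two slices that share an index
# split the string with nothing lost and nothing repeated.
print(spam[:5] + spam[5:] == spam)  # True

# A slice that runs past the end of the string simply stops at the end,
# whereas a single index past the end (such as spam[100]) raises an IndexError.
print(spam[6:100])  # world!
```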

Note that slicing a string does not modify the original string. You can capture a slice from one variable in a separate variable. Try typing the following into the interactive shell:

In [14]:
spam = 'Hello world!'
fizz = spam[0:5]
fizz

'Hello'

By slicing and storing the resulting substring in another variable, you can have both the whole string and the substring handy for quick, easy access.

# The in And not in Operators With Strings

The in and not in operators can be used with strings just like with list values. An expression with two strings joined using in or not in will evaluate to a Boolean True or False. Enter the following into the interactive shell:

In [15]:
'Hello' in 'Hello World'

True

In [16]:
'Hello' in 'Hello'

True

In [17]:
'HELLO' in 'Hello World'

False

In [18]:
'' in 'spam'

True

In [19]:
'cats' not in 'cats and dogs'

False

These expressions test whether the first string (the exact string, case sensitive) can be found within the second string.
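
A further sketch shows what "the exact string" means in practice: the characters must match in case and must appear next to each other in the second string. The variable sentence is just an example value.

```python
sentence = 'cats and dogs'

print('cats' in sentence)       # True
print('Cats' in sentence)       # False, because the case differs
print('cat dog' in sentence)    # False, because the characters are not contiguous
print('birds' not in sentence)  # True
```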

# Useful String Methods

(https://youtu.be/rODBsj5DfQ0)

Several string methods analyze strings or create transformed string values. This section describes the methods you'll be using most often.

# The upper(), lower(), isupper(), And islower() String Methods

The upper() and lower() string methods return a new string where all the letters in the original string have been converted to uppercase and lowercase, respectively. Nonletter characters in the string remain unchanged. Enter the following into the interactive shell:

In [20]:
spam = 'Hello world!'
spam = spam.upper()
spam

'HELLO WORLD!'

In [21]:
spam = spam.lower()
spam

'hello world!'

Note that these methods do not change the string itself but return new string values. If you want to change the original string, you have to call upper() or lower() on the string and then assign the new string to the variable where the original was stored. This is why you must use spam = spam.upper() to change the string in spam instead of simply spam.upper(). (This is just like if a variable eggs contains the value 10. Writing eggs + 3 does not change the value of eggs, but eggs = eggs + 3 does.)
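
The following sketch shows the difference between calling upper() and discarding the result versus storing it. The variable shouted is a name invented for this example.

```python
spam = 'Hello world!'

# Calling upper() without assigning the result leaves spam untouched.
spam.upper()
print(spam)     # Hello world!

# Storing the returned string keeps both versions available.
shouted = spam.upper()
print(shouted)  # HELLO WORLD!
print(spam)     # Hello world!

# Nonletter characters pass through unchanged.
print('Room 101!'.upper())  # ROOM 101!
```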

The upper() and lower() methods are helpful if you need to make a case-insensitive comparison. The strings 'great' and 'GREat' are not equal to each other. But in the following small program, it does not matter whether the user types Great, GREAT, or grEAT, because the string is first converted to lowercase.

In [22]:
print('How are you?')
feeling = input()
if feeling.lower() == 'great':
    print('I feel great too.')
else:
    print('I hope the rest of your day is good.')

How are you?
GREAT
I feel great too.


When you run this program, the question is displayed, and entering a variation on great, such as GREat, will still give the output 'I feel great too.' Adding code to your program to handle variations or mistakes in user input, such as inconsistent capitalization, will make your programs easier to use and less likely to fail.
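
The same comparison can be tried without typing anything by moving it into a function and calling it with several spellings. This is a sketch only; is_great() is a hypothetical helper, not part of the program above.

```python
def is_great(feeling):
    """Return True if feeling spells 'great' in any mix of upper and lowercase."""
    return feeling.lower() == 'great'

# Try several capitalizations plus one answer that should not match.
for answer in ['Great', 'GREAT', 'grEAT', 'fine']:
    print(answer, is_great(answer))
```

Only the last answer, fine, prints False.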

The isupper() and islower() methods will return a Boolean True value if the string has at least one letter and all the letters are uppercase or lowercase, respectively. Otherwise, the methods return False. Enter the following into the interactive shell, and notice what each method call returns:

In [23]:
spam = 'Hello world!'

In [24]:
spam.islower()

False

In [25]:
spam.isupper()

False

In [26]:
'HELLO'.isupper()

True

In [27]:
'abc12345'.islower()

True

In [28]:
'12345'.islower()

False

In [29]:
'12345'.isupper()

False

Since the upper() and lower() string methods themselves return strings, you can call string methods on those returned string values as well. Expressions that do this will look like a chain of method calls. Enter the following into the interactive shell:

In [32]:
'Hello'.upper()

'HELLO'

In [33]:
'Hello'.upper().lower()

'hello'

In [34]:
'Hello'.upper().lower().upper()

'HELLO'

In [35]:
'HELLO'.lower()

'hello'

In [36]:
'HELLO'.lower().islower()

True

# The isX String Methods

Along with islower() and isupper(), there are several string methods that have names beginning with the word 'is'. These methods return a Boolean value that describes the nature of the string. Here are some common isX string methods:

- isalpha() returns True if the string consists only of letters and is not blank.
- isalnum() returns True if the string consists only of letters and numbers and is not blank.
- isdecimal() returns True if the string consists only of numeric characters and is not blank.
- isspace() returns True if the string consists only of spaces, tabs, and newlines and is not blank.
- istitle() returns True if the string consists only of words that begin with an uppercase letter followed by only lowercase letters.
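
As a quick side-by-side sketch of the methods listed above, the following loop calls all five on a handful of sample strings. The sample values and the name samples are chosen only for this illustration.

```python
samples = ['hello', 'hello123', '123', '   ', 'This Is Title Case', '']

# Print the result of each isX method for every sample string.
for s in samples:
    print(repr(s), s.isalpha(), s.isalnum(), s.isdecimal(), s.isspace(), s.istitle())
```

The blank string '' returns False for every method, which is what "and is not blank" means in the descriptions above.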

Enter the following into the interactive shell: