# String Methods

Python comes with built-in string methods that let you perform complicated tasks on strings quickly and efficiently. These methods let you change the case of a string, split a string into smaller strings, join small strings into a larger one, and neatly combine changing variables with string output.

A string method is called by appending a dot and the method name to a string (or to a variable that holds one), and each method takes its own specific arguments.

Below are some quick examples of all the string methods we will be learning:

In [3]:
'Hello World'.upper()

'HELLO WORLD'

In [4]:
'Hello World'.lower()

'hello world'

In [5]:
'Hello World'.title()

'Hello World'

In [7]:
' beautiful '.join(['Hello', 'world'])

'Hello beautiful world'

In [8]:
'Hello world'.replace('H', 'J')

'Jello world'

In [9]:
'     Hello world   '.strip()

'Hello world'

In [10]:
"{} {}".format("Hello", "world")

'Hello world'

# Formatting Methods

There are three string methods that can change the casing of a string. 
These are .lower(), .upper(), and .title().

It’s important to remember that string methods can only create new strings; they do not change the original string.

In [11]:
poem_title = "spring storm"
poem_author = "William Carlos Williams"
poem_title_fixed = poem_title.title()
print(poem_title)
print(poem_title_fixed)
poem_author_fixed = poem_author.upper()
print(poem_author)
print(poem_author_fixed)

spring storm
Spring Storm
William Carlos Williams
WILLIAM CARLOS WILLIAMS
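Because the original string is never modified, calling a method without storing its result has no lasting effect. A minimal sketch, reusing the `poem_title` variable from the cell above (the name `shouted_title` is made up for illustration):

```python
poem_title = "spring storm"

# Calling the method on its own builds a new string, but nothing keeps it.
poem_title.upper()
print(poem_title)  # still the original lowercase text

# Store the result under a new name (or reassign the old one) to keep it.
shouted_title = poem_title.upper()
print(shouted_title)
```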


# Splitting Strings

Let’s take a look at a string method that returns a different object entirely!

.split() is called on a string, takes an optional argument, and returns a list of the substrings found between occurrences of that argument (which, in the case of .split(), is known as the <i>delimiter</i>). The syntax is:

string_name.split(delimiter)

If you do not provide an argument, .split() splits on whitespace: any run of spaces, tabs, or newlines counts as a single separator.

In [13]:
line_one = "The sky has given over"
line_one_words = line_one.split()
print(line_one_words)

['The', 'sky', 'has', 'given', 'over']
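Calling .split() with no argument is not quite the same as calling it with a single space. The sketch below uses a made-up variant of the line above with repeated spaces to show the difference (`spaced_line`, `default_words`, and `space_words` are illustrative names):

```python
spaced_line = "The  sky   has given over"  # note the repeated spaces

# No argument: any run of whitespace counts as a single separator.
default_words = spaced_line.split()

# Explicit ' ': every single space is a separator, so empty strings appear.
space_words = spaced_line.split(' ')

print(default_words)
print(space_words)
```

The first list contains only the five words, while the second also contains an empty string for each extra space.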


If we provide an argument for .split() we can dictate the character we want our string to be split on. This argument should be provided as a string itself.

Using .split() and the provided string, create a list called author_names containing each individual author name as its own string.
Then create another list called author_last_names that only contains the last names of the poets in the provided string.

In [14]:
authors = "Audre Lorde,Gabriela Mistral,Jean Toomer,An Qi,Walt Whitman,Shel Silverstein,Carmen Boullosa,Kamala Suraiyya,Langston Hughes,Adrienne Rich,Nikki Giovanni"
author_names = authors.split(',')
print(author_names)
author_last_names = []

for name in author_names:
  author_last_names.append(name.split()[-1])
  
print(author_last_names)

['Audre Lorde', 'Gabriela Mistral', 'Jean Toomer', 'An Qi', 'Walt Whitman', 'Shel Silverstein', 'Carmen Boullosa', 'Kamala Suraiyya', 'Langston Hughes', 'Adrienne Rich', 'Nikki Giovanni']
['Lorde', 'Mistral', 'Toomer', 'Qi', 'Whitman', 'Silverstein', 'Boullosa', 'Suraiyya', 'Hughes', 'Rich', 'Giovanni']


# Escape Sequences

We can also split strings using escape sequences. An escape sequence is a backslash followed by a character, and it stands for a character that is hard to type directly inside a string, such as a line break or a tab. The two escape sequences we will cover here are:

\n Newline
\t Horizontal Tab

Newline or \n will allow us to split a multi-line string by line breaks and \t will allow us to split a string by tabs. 
\t is particularly useful when dealing with certain datasets because it is not uncommon for data points to be separated by tabs.

Example:

In [15]:
smooth_chorus = \
"""And if you said, "This life ain't good enough."
I would give my world to lift you up
I could change my life to better suit your mood
Because you're so smooth"""

chorus_lines = smooth_chorus.split('\n')

print(chorus_lines)

['And if you said, "This life ain\'t good enough."', 'I would give my world to lift you up', 'I could change my life to better suit your mood', "Because you're so smooth"]


This code splits the multi-line string at the newlines (\n) that separate its lines and saves the result to a new list called chorus_lines. Then it prints chorus_lines.

The new list contains each line of the original string as its own smaller string. Also, notice that when Python displays the list, it escapes the ' character in the first item (\'), because that item contains both single and double quote marks.

### Remember to pass the escape sequence in quotes as the argument to .split()
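Splitting on tabs works the same way as splitting on newlines. A small sketch, assuming a made-up tab-separated row whose values are borrowed from the poem data used elsewhere in this notebook (`poem_row` and `poem_fields` are illustrative names):

```python
# One row of tab-separated data: title, poet, year.
poem_row = "Georgia Dusk\tJean Toomer\t1923"

# The escape sequence is passed to .split() as an ordinary quoted string.
poem_fields = poem_row.split('\t')
print(poem_fields)
```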

# Joining Strings

Now that you’ve learned to break strings apart using .split(), let’s learn to put them back together using .join(). .join() is essentially the opposite of .split(): it joins a list of strings together with a given delimiter.
The syntax of .join() is:

'delimiter'.join(list_you_want_to_join)

Now this may seem a little weird, because with .split() the argument was the delimiter, but now the argument is the list. This is because join is still a string method, which means it has to act on a string. The string .join() acts on is the delimiter you want to join with, therefore the list you want to join has to be the argument.

In [16]:
reapers_line_one_words = ["Black", "reapers", "with", "the", "sound", "of", "steel", "on", "stones"]

reapers_line_one = " ".join(reapers_line_one_words)
print(reapers_line_one)

Black reapers with the sound of steel on stones


In the last bit of code, we joined together a list of words using a space as the delimiter to create a sentence. In fact, you can use any string as a delimiter to join together a list of strings. For example, if we have the list:

santana_songs = ['Oye Como Va', 'Smooth', 'Black Magic Woman', 'Samba Pa Ti', 'Maria Maria']

We could join this list together with ANY string. One often-used string is a comma (,), because it lets us create a string of comma-separated values, or CSV.

In [17]:
santana_songs = ['Oye Como Va', 'Smooth', 'Black Magic Woman', 'Samba Pa Ti', 'Maria Maria']
santana_songs_csv = ','.join(santana_songs)
print(santana_songs_csv)

Oye Como Va,Smooth,Black Magic Woman,Samba Pa Ti,Maria Maria
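One thing to watch for: .join() only accepts strings, so a list of numbers has to be converted first. A hedged sketch with an illustrative list of years (`publication_years`, `year_strings`, and `years_csv` are made-up names):

```python
publication_years = [1915, 1923, 1974]  # integers, not strings

# ','.join(publication_years) would raise a TypeError here,
# so each number is converted with str() before joining.
year_strings = []
for year in publication_years:
    year_strings.append(str(year))

years_csv = ','.join(year_strings)
print(years_csv)
```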


You can also join using escape sequences as the delimiter. Consider the following example:

In [18]:
smooth_fifth_verse_lines = ['Well I\'m from the barrio', 'You hear my rhythm on your radio', 'You feel the turning of the world so soft and slow', 'Turning you \'round and \'round']

smooth_fifth_verse = '\n'.join(smooth_fifth_verse_lines)

print(smooth_fifth_verse)

Well I'm from the barrio
You hear my rhythm on your radio
You feel the turning of the world so soft and slow
Turning you 'round and 'round


This code is taking the list of strings and joining them using a newline \n as the delimiter. Then it prints the result and produces the output.
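Since .split() and .join() mirror each other, splitting on a delimiter and then joining with the same delimiter should give back the original string. A small sketch of that round trip, using two lines from the verse above (the variable names are illustrative):

```python
verse = "Well I'm from the barrio\nYou hear my rhythm on your radio"

# Break the verse into lines, then glue the lines back together.
verse_lines = verse.split('\n')
rebuilt_verse = '\n'.join(verse_lines)

print(verse_lines)
print(rebuilt_verse == verse)
```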

# .strip()

When working with strings that come from real data, you will often find that the strings aren’t super clean. You’ll find lots of extra whitespace, unnecessary linebreaks, and rogue tabs.

Python provides a great method for cleaning strings: .strip(). Stripping a string removes all whitespace characters from the beginning and end.

You can also use .strip() with a character argument, which will strip that character from both ends of the string.

In [19]:
love_maybe_lines = ['Always    ', '     in the middle of our bloodiest battles  ', 'you lay down your arms', '           like flowering mines    ','\n' ,'   to conquer me home.    ']
love_maybe_lines_stripped = []
for line in love_maybe_lines:
  love_maybe_lines_stripped.append(line.strip())
#print(love_maybe_lines_stripped)
love_maybe_full = "\n".join(love_maybe_lines_stripped)
print(love_maybe_full)

Always
in the middle of our bloodiest battles
you lay down your arms
like flowering mines

to conquer me home.


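To strip more than one kind of character, chain the calls; this works because each call returns a new string.

For example, suppose you wanted to clean up string = ":::::yo::::" so that only "yo" remains. The first call removes the colons, and the second removes any whitespace left at the ends:

string.strip(':').strip()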
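A short sketch of stripping specific characters, written under the assumption that the goal is to reduce a noisy string to "yo". It also shows .lstrip() and .rstrip(), the one-sided relatives of .strip() (all variable names are illustrative):

```python
noisy = ":::::yo::::"

# Strip one specific character from both ends.
both_ends = noisy.strip(':')

# The argument is treated as a set of characters, so ': ' removes any mix
# of colons and spaces from both ends in a single call.
mixed = "  ::yo::  ".strip(': ')

# .lstrip() and .rstrip() clean only the left or the right end.
left_only = noisy.lstrip(':')
right_only = noisy.rstrip(':')

print(both_ends, mixed, left_only, right_only)
```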

# Replace

The next string method we will cover is .replace(). Replace takes two arguments and replaces all instances of the first argument in a string with the second argument. The syntax is as follows:

string_name.replace(character_being_replaced, new_character)

In [20]:
toomer_bio = \
"""
Nathan Pinchback Tomer, who adopted the name Jean Tomer early in his literary career, was born in Washington, D.C. in 1894. Jean is the son of Nathan Tomer was a mixed-race freedman, born into slavery in 1839 in Chatham County, North Carolina. Jean Tomer is most well known for his first book Cane, which vividly portrays the life of African-Americans in southern farmlands.
"""

toomer_bio_fixed = toomer_bio.replace("Tomer", "Toomer")
print(toomer_bio_fixed)


Nathan Pinchback Toomer, who adopted the name Jean Toomer early in his literary career, was born in Washington, D.C. in 1894. Jean is the son of Nathan Toomer was a mixed-race freedman, born into slavery in 1839 in Chatham County, North Carolina. Jean Toomer is most well known for his first book Cane, which vividly portrays the life of African-Americans in southern farmlands.
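Two details of .replace() are worth a quick sketch: matching is case-sensitive, and an optional third argument limits how many matches are replaced. The sample string below is made up for illustration:

```python
sample = "Tomer, Tomer, and tomer"

# Every exact match is replaced; the lowercase "tomer" is left alone.
all_fixed = sample.replace("Tomer", "Toomer")

# A third argument caps the number of replacements, counting from the left.
first_fixed = sample.replace("Tomer", "Toomer", 1)

print(all_fixed)
print(first_fixed)
```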



# .find()

.find() takes a string as an argument and searches the string it was called on for that argument. It returns the index of the first place the argument occurs, or -1 if it does not occur at all.

In [21]:
'smooth'.find('t')

4

We searched the string 'smooth' for the string 't' and found it at index 4 (the fifth character, because indexing starts at 0), so .find() returned 4.

You can also search for larger strings, and .find() will return the index value of the first character of that string.

In [22]:
"smooth".find('oo')

2

In [23]:
god_wills_it_line_one = "The very earth will disown you"
disown_placement = god_wills_it_line_one.find("disown")
print(disown_placement)

20
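When the search string is missing, .find() returns -1 rather than raising an error, so it is worth checking the result before using it as an index. A minimal sketch using the same line (the names `missing` and `position` are illustrative):

```python
line = "The very earth will disown you"

missing = line.find("sky")
print(missing)            # -1: "sky" does not appear in the line
print("disown" in line)   # True: `in` answers a plain yes/no question

# Guard against -1 before slicing with the result.
position = line.find("earth")
if position != -1:
    print(line[position:])
```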


# .format()

Python also provides a handy string method for including variables in strings. This method is .format(). .format() takes values as arguments and inserts them into the string it is called on. You include {} marks as placeholders for where those values will be inserted.

In [24]:
def favorite_song_statement(song, artist):
  return "My favorite song is {} by {}.".format(song, artist)
favorite_song_statement("Smooth", "Santana")

'My favorite song is Smooth by Santana.'

Now you may be asking yourself: I could have written this function using string concatenation instead of .format(), so why is this method better? The answer is legibility and reusability. It is much easier to picture the end result of .format() than the end result of string concatenation, and legibility is everything. You can also reuse the same base string with different variables, which cuts down on unnecessary, hard-to-interpret code.
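To make the comparison concrete, here is a sketch that builds the same sentence both ways and then reuses the template with different values (`template`, `concatenated`, and `formatted` are illustrative names):

```python
song = "Smooth"
artist = "Santana"

# Concatenation: the sentence is chopped into fragments around the variables.
concatenated = "My favorite song is " + song + " by " + artist + "."

# .format(): the whole sentence is readable as a single template.
template = "My favorite song is {} by {}."
formatted = template.format(song, artist)
print(concatenated == formatted)  # both approaches build the same string

# The template can be reused with different values.
print(template.format("Maria Maria", "Santana"))
```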

In [25]:
def poem_title_card(poet, title):
  return "The poem \"{}\" is written by {}.".format(title, poet)
poem_title_card("Walt Whitman", "I Hear America Singing")

'The poem "I Hear America Singing" is written by Walt Whitman.'

.format() can be made even more legible for other people reading your code by including <b>keywords</b>. With the bare {} placeholders used so far, you have to make sure the arguments appear in the same order in which you want them to appear in the string, which adds an unnecessary complication when writing code.

By including keywords in the string and in the arguments, you can remove that ambiguity. Let’s look at an example:

In [None]:
def favorite_song_statement(song, artist):
    return "My favorite song is {song} by {artist}.".format(song=song, artist=artist)

Now it is clear to anyone reading the string what it is supposed to return; they don’t even need to look at the arguments of .format() to understand what is supposed to happen. You can even reverse the order of artist and song in the arguments to .format() and it will work the same way. This makes both writing and reading the code much easier.
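As a quick check of that claim, the sketch below lists the keyword arguments in the opposite order from the placeholders; the returned sentence should be unchanged (`statement` is an illustrative name):

```python
def favorite_song_statement(song, artist):
    # The keyword arguments are deliberately given in reverse order.
    return "My favorite song is {song} by {artist}.".format(artist=artist, song=song)

statement = favorite_song_statement("Smooth", "Santana")
print(statement)
```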

In [26]:
def poem_description(publishing_date, author, title, original_work):
  poem_desc = "The poem {title} by {author} was originally published in {original_work} in {publishing_date}.".format(publishing_date=publishing_date, author=author, title=title, original_work=original_work)
  return poem_desc


author = "Shel Silverstein"
title = "My Beard"
original_work = "Where the Sidewalk Ends"
publishing_date = "1974"

my_beard_description = poem_description(publishing_date, author, title, original_work)
print(my_beard_description)

The poem My Beard by Shel Silverstein was originally published in Where the Sidewalk Ends in 1974.


## Quick Review Of String Methods

.upper(), .title(), and .lower() adjust the casing of your string.

.split() takes a string and creates a list of substrings.

.join() takes a list of strings and creates a string.

.strip() cleans off whitespace or other noise from the beginning and end of a string.

.replace() replaces all instances of a character/string in a string with another character/string.

.find() searches a string for a character/string and returns the index at which that character/string is first found.

.format() and f-strings allow you to interpolate a string with variables.
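The examples above only use .format(). For comparison, here is a minimal f-string sketch, assuming Python 3.6 or later; the f prefix lets the variable names sit directly inside the braces (`statement` is an illustrative name):

```python
song = "Smooth"
artist = "Santana"

# The expressions inside {} are evaluated in place when the string is built.
statement = f"My favorite song is {song} by {artist}."
print(statement)
```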

Let's say we were given this clunky data set, and wanted to print out a clean version:

"Afterimages:Audre Lorde:1997,  The Shadow:William Carlos Williams:1915, Ecstasy:Gabriela Mistral:1925,   Georgia Dusk:Jean Toomer:1923,   Parting Before Daybreak:An Qi:2014, The Untold Want:Walt Whitman:1871, Mr. Grumpledump's Song:Shel Silverstein:2004, Angel Sound Mexico City:Carmen Boullosa:2013, In Love:Kamala Suraiyya:1965, Dream Variations:Langston Hughes:1994, Dreamwood:Adrienne Rich:1987"

In [27]:
highlighted_poems = "Afterimages:Audre Lorde:1997,  The Shadow:William Carlos Williams:1915, Ecstasy:Gabriela Mistral:1925,   Georgia Dusk:Jean Toomer:1923,   Parting Before Daybreak:An Qi:2014, The Untold Want:Walt Whitman:1871, Mr. Grumpledump's Song:Shel Silverstein:2004, Angel Sound Mexico City:Carmen Boullosa:2013, In Love:Kamala Suraiyya:1965, Dream Variations:Langston Hughes:1994, Dreamwood:Adrienne Rich:1987"
# Split the data into one "title:poet:date" entry per poem.
highlighted_poems_list = highlighted_poems.split(",")
# Strip the stray whitespace around each entry.
highlighted_poems_stripped = []
for poem in highlighted_poems_list:
  highlighted_poems_stripped.append(poem.strip())
# Split each entry into a [title, poet, date] list.
highlighted_poems_details = []
for poem in highlighted_poems_stripped:
  highlighted_poems_details.append(poem.split(":"))
# Collect the titles, poets, and dates into their own lists.
titles = []
poets = []
dates = []
for details in highlighted_poems_details:
  titles.append(details[0])
  poets.append(details[1])
  dates.append(details[2])
# Print one formatted sentence per poem.
for i in range(len(titles)):
  title, poet, date = titles[i], poets[i], dates[i]
  full = "The poem {title} was published by {poet} in {date}".format(title=title, poet=poet, date=date)
  print(full)

The poem Afterimages was published by Audre Lorde in 1997
The poem The Shadow was published by William Carlos Williams in 1915
The poem Ecstasy was published by Gabriela Mistral in 1925
The poem Georgia Dusk was published by Jean Toomer in 1923
The poem Parting Before Daybreak was published by An Qi in 2014
The poem The Untold Want was published by Walt Whitman in 1871
The poem Mr. Grumpledump's Song was published by Shel Silverstein in 2004
The poem Angel Sound Mexico City was published by Carmen Boullosa in 2013
The poem In Love was published by Kamala Suraiyya in 1965
The poem Dream Variations was published by Langston Hughes in 1994
The poem Dreamwood was published by Adrienne Rich in 1987
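The same cleanup can also be written more compactly by stripping, splitting, and unpacking each entry inside a single loop. This is one possible sketch, not the only way to do it; `describe_poems` is a hypothetical helper name, and `sample` is a shortened copy of the data above:

```python
def describe_poems(raw_data):
    """Return one sentence per 'title:poet:date' entry in raw_data."""
    descriptions = []
    for entry in raw_data.split(","):
        # Clean the entry, then unpack its three colon-separated parts.
        title, poet, date = entry.strip().split(":")
        descriptions.append(
            "The poem {title} was published by {poet} in {date}".format(
                title=title, poet=poet, date=date
            )
        )
    return descriptions

sample = "Afterimages:Audre Lorde:1997,  The Shadow:William Carlos Williams:1915"
for description in describe_poems(sample):
    print(description)
```

Unpacking assumes every entry has exactly three parts; an entry with a missing or extra colon would raise a ValueError.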
