# Strings

## 1. What is a String?

A string is a data type in Python that represents text.

In [1]:
name = 'Brian'
color = 'Baby Blue'

We can also combine strings to build new ones with the `+` operator. This is called _concatenation_.

We can use the `len` function to see how many characters a string contains. Much of the data we work with is text, so strings are worth learning well.
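
As a minimal sketch of both ideas (the variable names `first` and `second` are made up for illustration):

```python
first = 'Baby'
second = 'Blue'

# The + operator builds a new string from its operands
color = first + ' ' + second
print(color)       # Baby Blue

# len counts every character, including the space
print(len(color))  # 9
```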

In [7]:
def double_word(word):
    # If the string is empty
    if not word:
        return 0
    # Duplicate word
    second_word = word
    # Get length of two words combined
    len_two_words = str(len(word + second_word))
    return word + second_word + len_two_words

print(double_word("hello")) # Should return hellohello10
print(double_word("abc"))   # Should return abcabc6
print(double_word(""))      # Should return 0

hellohello10
abcabc6
0


There are tons of things we could do with strings in our scripts. For example, we can check if files are named a certain way by looking at the filename and seeing if they match our criteria, or we can create a list of emails by checking out the users of our system and concatenating our domain.
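
The email idea could be sketched roughly like this. The usernames and the domain are hypothetical, and a real script would read them from the system instead of a hard-coded list.

```python
# Hypothetical usernames and domain, used only for illustration
usernames = ['alice', 'bob']
domain = 'example.com'

emails = []
for user in usernames:
    # Concatenate the username, an @ sign, and the domain
    emails.append(user + '@' + domain)

print(emails)  # ['alice@example.com', 'bob@example.com']
```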

## 2. Parts of a String

We can access individual characters of a string through __string indexing__. Indexes start at 0.

In [8]:
name = 'Brian'
print(name[0])

B


Indexes can also be negative. This lets us count from the end of the string, so -1 is the last character.

In [9]:
text = 'Random string with a lot of characters'
print(text[-1])
print(text[-2])

s
r
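
One way to read a negative index is as an offset from `len(text)`. The sketch below checks that equivalence and shows what happens when an index falls outside the string.

```python
text = 'Random string with a lot of characters'

# text[-1] is the same character as text[len(text) - 1]
print(text[-1] == text[len(text) - 1])  # True
print(text[-2] == text[len(text) - 2])  # True

# An index past the end of the string raises IndexError
try:
    text[len(text)]
except IndexError as error:
    print('IndexError:', error)
```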


In [11]:
'''
Modify the first_and_last function so that it returns True if the first letter of the string is the same as the last 
letter of the string, False if they’re different. Remember that you can access characters using message[0] or message[-1]. 
Be careful how you handle the empty string, which should return True since nothing is equal to nothing.
'''

def first_and_last(message):
    # If string is empty
    if not message:
        return True
    # Get message length
    message_length = len(message)
    # Compare first index with last index
    if message[0] == message[message_length - 1]:
        return True
    return False

print(first_and_last("else"))
print(first_and_last("tree"))
print(first_and_last(""))

True
False
True


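We can also access slices of a string.

A __slice__ is a portion of a string that can contain more than one character; it is also sometimes called a __substring__. We write a slice as a start index and an end index separated by a colon. The character at the start index is included and the character at the end index is excluded.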

In [12]:
color = 'Orange'
color[1:4]

'ran'

In [14]:
fruit = 'Pineapple'
print(fruit[:4])
print(fruit[4:])

Pine
apple
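
A few more slice forms, as a short sketch of how the start and end indices behave:

```python
fruit = 'Pineapple'

# The start index is included, the end index is excluded
print(fruit[0:4])    # Pine

# Omitting both indices copies the whole string
print(fruit[:])      # Pineapple

# Negative indices work in slices too
print(fruit[-5:])    # apple

# Unlike indexing, slicing past the end does not raise an error
print(fruit[4:100])  # apple
```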


## 3. Creating New Strings

We cannot modify strings in Python because they are _immutable_. What we can do is create a new string based on the old one.

In [16]:
message = 'A kong string with a silly typo'
new_message = message[0:2] + 'l' + message[3:]
print(new_message)

A long string with a silly typo
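
To see why a new string is needed, this sketch tries to change a character in place. Python rejects the assignment and the original string stays as it was.

```python
message = 'A kong string with a silly typo'

try:
    # Strings do not support item assignment
    message[2] = 'l'
except TypeError as error:
    print('TypeError:', error)

# The original string is unchanged
print(message)  # A kong string with a silly typo
```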


So, we figured out how to create a new message from the old one. But how are we supposed to know which character to change? Let's try something different.

In [19]:
pets =  'Cats & Dogs'

# If there is more than one instance of a character,
# the method returns the index of the first instance
print(pets.index('&'))
print(pets.index('s'))
print(pets.index('Dog'))

# If the substring is not present, index() raises a ValueError

5
3
7


In this case, we are using a string method called `index()` to get the index of a character or substring.

Try using the index method yourself now! Using the index method, find out the position of "x" in "supercalifragilisticexpialidocious".

In [21]:
word = "supercalifragilisticexpialidocious"
print(word.index('x'))

21


We said that if the substring isn't there, we get an error. So how can we check whether a string contains a substring and avoid the error?

We use the `in` keyword.

In [23]:
'Dragons' in pets

False

In [24]:
'Cats' in pets

True
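
Combining `in` with `index()` gives a lookup that never raises. The helper name `safe_index` below is hypothetical; the built-in `find()` method offers similar behavior by returning -1 when nothing matches.

```python
pets = 'Cats & Dogs'

def safe_index(text, substring):
    # Hypothetical helper: returns None instead of raising ValueError
    if substring in text:
        return text.index(substring)
    return None

print(safe_index(pets, 'Dogs'))     # 7
print(safe_index(pets, 'Dragons'))  # None

# find() returns -1 when the substring is missing
print(pets.find('Dragons'))         # -1
```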

Let's put all the stuff together to solve a real-world problem. Imagine that your company has recently moved to using a new domain, but a lot of the company email addresses are still using the old one. You want to write a program that replaces this old domain with the new one in any outdated email addresses. The function to replace the domain would look like this.

In [25]:
def replace_domain(email, old_domain, new_domain):
    if '@' + old_domain in email:
        index = email.index('@' + old_domain)
        new_email = email[:index] + "@" + new_domain
        return new_email
    return email
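
To try the function out, the sketch below repeats the definition so the block runs on its own. The addresses and domains are made up for illustration.

```python
def replace_domain(email, old_domain, new_domain):
    # Same logic as the function above, repeated so this block is self-contained
    if '@' + old_domain in email:
        index = email.index('@' + old_domain)
        new_email = email[:index] + '@' + new_domain
        return new_email
    return email

# Hypothetical addresses and domains
print(replace_domain('jane@old.example', 'old.example', 'new.example'))
# jane@new.example
print(replace_domain('joe@other.example', 'old.example', 'new.example'))
# joe@other.example
```

Note that the `in` check matches the old domain anywhere after the `@`, so an address such as `jane@old.example.org` would also be rewritten. A stricter check on the end of the string may be preferable in practice.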

## 4. More String Methods

The string class provides many methods for manipulating strings.

The `upper()` and `lower()` methods return a copy of the string with all characters in uppercase or lowercase.

In [27]:
'Brian'.upper()

'BRIAN'

In [29]:
'I AM VERY EXCITED'.lower()

'i am very excited'

The `strip()` method removes leading and trailing whitespace.

In [28]:
" MASTER CHIEF IS ALIVE!!!!!!!!!!!!               ".strip()

'MASTER CHIEF IS ALIVE!!!!!!!!!!!!'

There are two more versions of the `strip()` method:
- `lstrip()`
- `rstrip()`

In [30]:
'    yes'.lstrip()

'yes'

In [31]:
'yes           '.rstrip()

'yes'
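
All three methods also accept an optional argument naming the characters to remove instead of whitespace. A small sketch, using a made-up string:

```python
shout = '!!!MASTER CHIEF IS ALIVE!!!'

# With an argument, strip() removes those characters instead of whitespace
print(shout.strip('!'))   # MASTER CHIEF IS ALIVE
print(shout.lstrip('!'))  # MASTER CHIEF IS ALIVE!!!
print(shout.rstrip('!'))  # !!!MASTER CHIEF IS ALIVE

# Whitespace inside the string is never touched
print('  a b  '.strip())  # a b
```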

The `count()` method returns how many times a substring occurs in a string

In [32]:
'The number of times e occurs in this string is 4'.count('e')

4

The `endswith()` method returns whether a given string ends with a certain substring

In [33]:
'Brian'.endswith('n')

True

The `isnumeric()` method returns whether every character in the string is numeric. It returns False for an empty string.

In [34]:
'8973290381245932'.isnumeric()

True
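
Because the check is per character, strings that look like numbers to a human can still fail it. A few cases worth knowing:

```python
print('8973290381245932'.isnumeric())  # True
print('12.5'.isnumeric())  # False: '.' is not a numeric character
print('-3'.isnumeric())    # False: '-' is not a numeric character
print(''.isnumeric())      # False: there are no characters to check
```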

The `join()` method also concatenates strings. It joins a list of strings, placing the string it is called on between each pair of elements.

In [36]:
" ".join(['This', 'is', 'a', 'string', 'joined', 'by', 'spaces'])

'This is a string joined by spaces'

The `split()` method splits a string into a list of strings. By default it splits on whitespace; we can pass a separator to split on a specific substring instead.

In [37]:
'This is an example'.split()

['This', 'is', 'an', 'example']
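
Passing a separator might look like the sketch below. The comma-separated `record` is a made-up example.

```python
# Hypothetical comma-separated record
record = 'brian,baby blue,42'

# Split on a comma instead of whitespace
fields = record.split(',')
print(fields)              # ['brian', 'baby blue', '42']

# join() reverses the operation with the same separator
print(','.join(fields))    # brian,baby blue,42

# With no argument, a run of whitespace counts as one separator
print('a   b'.split())     # ['a', 'b']
# With an explicit separator, every occurrence splits
print('a   b'.split(' '))  # ['a', '', '', 'b']
```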

Fill in the gaps in the initials function so that it returns the initials of the words contained in the phrase received, in upper case. For example: "Universal Serial Bus" should return "USB"; "local area network" should return "LAN".

In [38]:
def initials(phrase):
    # Take all words and split into list
    words = phrase.split()
    acronym = ""
    for word in words:
        # Retrieve the first letter of the word
        # and capitalize it
        first_letter = word[0].upper()
        # Append letter to the acronym
        acronym += first_letter
    return acronym
    

print(initials("Universal Serial Bus")) # Should be: USB
print(initials("local area network")) # Should be: LAN
print(initials("Operating system")) # Should be: OS

USB
LAN
OS


Don't worry about memorizing all the methods. They'll come naturally

## 5. Formatting Strings

Up to now we've been making strings by using the plus sign to concatenate the parts we wanted. We've also used the `str` function to convert numbers into strings so that we can concatenate them, too. This works, but it's not ideal, especially when the operations you want to do with the string are on the tricky side. There's a better way to do this using the `format()` method.

In [40]:
name = 'Brian'
number = len(name) * 1193

# Notice how the format method automatically
# converts the 'number' variable into a string
print('Hello {}, your lucky number is {}'.format(name, number))

Hello Brian, your lucky number is 5965


In [41]:
# With named fields, the order of the arguments doesn't matter
print('Your lucky number is {number}, {name}'.format(name=name, number=len(name)*1193))

Your lucky number is 5965, Brian


Modify the student_grade function using the format method, so that it returns the phrase "X received Y% on the exam". For example, student_grade("Reed", 80) should return "Reed received 80% on the exam".

In [42]:
def student_grade(name, grade):
	return "{name} received {grade}% on the exam".format(name=name, grade=grade)

print(student_grade("Reed", 80))
print(student_grade("Paige", 92))
print(student_grade("Jesse", 85))

Reed received 80% on the exam
Paige received 92% on the exam
Jesse received 85% on the exam


Let's say you want to output the price of an item with and without tax. Depending on the tax rate, the result might be a long number with a bunch of decimals. So if something costs 7.5 dollars without tax and the tax rate is 9 percent, the price with tax would be $8.175.

In [46]:
price = 7.5
with_tax = price * 1.09
print(price, with_tax)

7.5 8.175


We can make the `format()` method print only two decimals.

In [47]:
price = 7.5
with_tax = price * 1.09
print('Base price: ${:.2f}. With tax: ${:.2f}'.format(price, with_tax))

Base price: $7.50. With tax: $8.18


These expressions are needed when we want Python to format our values differently from the default. The expression starts with a colon to separate it from the field name that we saw before. After the colon, we write `.2f`. This means we're formatting a float and that there should be two digits after the decimal point. So no matter what the price is, our function always prints two decimals.
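
A few more format expressions, as a sketch of how the part after the colon can vary. The square brackets are only there to make the padding visible, and the field name `cost` is made up for illustration.

```python
price = 7.5

# .3f: three digits after the decimal point
print('{:.3f}'.format(price))           # 7.500

# >8: right-align in a field 8 characters wide
print('[{:>8.2f}]'.format(price))       # [    7.50]

# <8: left-align in the same width
print('[{:<8.2f}]'.format(price))       # [7.50    ]

# A field name goes before the colon
print('{cost:.2f}'.format(cost=price))  # 7.50
```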

## 6. Practice Quiz

1. The is_palindrome function checks if a string is a palindrome. A palindrome is a string that can be equally read from left to right or right to left, omitting blank spaces, and ignoring capitalization. Examples of palindromes are words like kayak and radar, and phrases like "Never Odd or Even". Fill in the blanks in this function to return True if the passed string is a palindrome, False if not.

In [60]:
def is_palindrome(input_string):

    # Initialize the reversed string
    reverse_string = ""

    # Remove all spaces in the string and lowercase all characters
    new_string = input_string.replace(" ", "").lower()

    # Loop backwards through the string to collect its characters
    for i in range(len(new_string) - 1, -1, -1):
        reverse_string += new_string[i]

    # Compare strings
    if new_string == reverse_string:
        return True
    return False

print(is_palindrome("Never Odd or Even")) # Should be True
print(is_palindrome("abc")) # Should be False
print(is_palindrome("kayak")) # Should be True

True
False
True
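
As an alternative sketch, a slice with a step of -1 reverses a string without an explicit loop. The function name `is_palindrome_sliced` is hypothetical and is not part of the quiz.

```python
def is_palindrome_sliced(input_string):
    # Hypothetical variant of is_palindrome using a reversed slice
    cleaned = input_string.replace(' ', '').lower()
    # A step of -1 walks the string from the last character to the first
    return cleaned == cleaned[::-1]

print(is_palindrome_sliced('Never Odd or Even'))  # True
print(is_palindrome_sliced('abc'))                # False
print(is_palindrome_sliced('kayak'))              # True
```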


2. Using the format method, fill in the gaps in the convert_distance function so that it returns the phrase "X miles equals Y km", with Y having only 1 decimal place. For example, convert_distance(12) should return "12 miles equals 19.2 km".

In [64]:
def convert_distance(miles):
	km = miles * 1.6 
	result = "{} miles equals {:.1f} km".format(miles, km)
	return result

print(convert_distance(12)) # Should be: 12 miles equals 19.2 km
print(convert_distance(5.5)) # Should be: 5.5 miles equals 8.8 km
print(convert_distance(11)) # Should be: 11 miles equals 17.6 km

12 miles equals 19.2 km
5.5 miles equals 8.8 km
11 miles equals 17.6 km


3. If we have a string variable named weather = "Rainfall", how can we print the substring of all characters before the "f"?

In [68]:
weather = 'Rainfall'
f_location = weather.index('f')
print(f_location)
print(weather[:f_location])

4
Rain


4. Fill in the gaps in the nametag function so that it uses the format method to return first_name and the first initial of last_name followed by a period. For example, nametag("Jane", "Smith") should return "Jane S."

In [71]:
def nametag(first_name, last_name):
	return("{first_name} {last_name}.".format(first_name=first_name, last_name=last_name[0]))

print(nametag("Jane", "Smith")) 
# Should display "Jane S." 
print(nametag("Francesco", "Rinaldi")) 
# Should display "Francesco R." 
print(nametag("Jean-Luc", "Grand-Pierre")) 
# Should display "Jean-Luc G." 

Jane S.
Francesco R.
Jean-Luc G.


5. The replace_ending function replaces the old string in a sentence with the new string, but only if the sentence ends with the old string. If there is more than one occurrence of the old string in the sentence, only the one at the end is replaced, not all of them. For example, replace_ending("abcabc", "abc", "xyz") should return abcxyz, not xyzxyz or xyzabc. The string comparison is case-sensitive, so replace_ending("abcabc", "ABC", "xyz") should return abcabc (no changes made).

In [72]:
def replace_ending(sentence, old, new):
	# Check if the old string is at the end of the sentence 
	if sentence.endswith(old):
		# Using i as the slicing index, combine the part
		# of the sentence up to the matched string at the 
		# end with the new string
		i = len(sentence) - len(old)
		new_sentence = sentence[:i] + new
		return new_sentence

	# Return the original sentence if there is no match 
	return sentence
	
print(replace_ending("It's raining cats and cats", "cats", "dogs")) 
# Should display "It's raining cats and dogs"
print(replace_ending("She sells seashells by the seashore", "seashells", "donuts")) 
# Should display "She sells seashells by the seashore"
print(replace_ending("The weather is nice in May", "may", "april")) 
# Should display "The weather is nice in May"
print(replace_ending("The weather is nice in May", "May", "April")) 
# Should display "The weather is nice in April"

It's raining cats and dogs
She sells seashells by the seashore
The weather is nice in May
The weather is nice in April