# Lesson 02 - Strings

### The following topics are discussed in this notebook:
* An introduction to the string data type.
* Escape characters.
* Operations on strings.
* String functions.

## Introduction to Strings

A string object is a piece of text, or in other words, a sequence of characters. When defining a string, we must put the characters that compose it inside either single or double quotes. These quotes allow Python to distinguish between a string and a command.

In the cell below, we define a string variable with the value `"Hello world!"` and then print it.

In [1]:
my_string = "Hello world!"
print(my_string)

Hello world!


The official name of the Python string type is `str`. We confirm this by calling `type()` on `my_string`.

In [2]:
type(my_string)

str

As mentioned above, we can also use single quotes when defining a string. The following definition of `my_string` is equivalent to the previous definition. 

In [3]:
my_string = 'Hello world!'
print(my_string)

Hello world!


One benefit of being able to use either single quotes or double quotes when creating strings is that it allows us a convenient way to create strings that themselves contain quotes as characters. For instance, assume that we want to create a variable containing the following string:

    He yelled, "I have had enough!" before storming out of the room.
    
The following attempt at creating this string will give us an error.

In [4]:
sentence = "He yelled, "I have had enough!" before storming out of the room."

SyntaxError: invalid syntax (<ipython-input-4-4965ff137a3f>, line 1)

In the example above, Python was confused by the quotation marks in the middle of the string. When it encountered the second quotation mark, it treated it as the end of the string, even though we intended it to be part of the string.

There are a few ways to fix this. The simplest is to use single quotes to define the string. When Python encounters the first single quote, it knows that a string is being defined. It won't stop reading characters into the string until it hits another single quote. Any double quotes that it encounters along the way will be treated as inert characters.

In [5]:
sentence = 'He yelled, "I have had enough!" before storming out of the room.'
print(sentence)

He yelled, "I have had enough!" before storming out of the room.


## Escape Sequences
An **escape sequence** is a sequence of characters, beginning with a backslash, that Python gives a special meaning when it appears in a string. Several common escape sequences are listed below.

| Escape Sequence  | Result |
|:---:|-----|
| **\\'**  | Inserts a single quote. |
| **\\"**  | Inserts a double quote. |
| **\\n**  | Inserts a newline. |
| **\\t**  | Inserts a tab. |
| **\\\\**  | Inserts a backslash. |

As an example of when escape sequences are useful, assume that we want to define a variable containing the following string of characters:

    He yelled, "I've had enough!" before storming out of the room.
    
As before, the presence of double quotes within the string prevents us from using double quotes to define the string. Furthermore, since the character for the apostrophe is the same as the character for a single quote, we can no longer use single quotes to define our string either. One solution is to use an escape sequence for the apostrophe so that Python knows to interpret it as text.

In [6]:
sentence = 'He yelled, "I\'ve had enough!" before storming out of the room.'
print(sentence)

He yelled, "I've had enough!" before storming out of the room.


Had we wished, we could have escaped the double quotes within the string as well as the apostrophe. In this case, we could have used either single or double quotes to define our string. 
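As a minimal sketch of that alternative, the cell below defines the same sentence twice, once with single quotes and once with double quotes, escaping whichever quote character would otherwise end the string early. The variable names `single_quoted` and `double_quoted` are illustrative only.

```python
# Escape the apostrophe so the string can be delimited with single quotes.
single_quoted = 'He yelled, "I\'ve had enough!" before storming out of the room.'

# Escape the double quotes so the string can be delimited with double quotes.
double_quoted = "He yelled, \"I've had enough!\" before storming out of the room."

# Both definitions produce exactly the same string.
print(single_quoted == double_quoted)
print(double_quoted)
```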

We can use the `\n` escape sequence to insert a newline inside a string.

In [7]:
tale2cities = "It was the best of times.\nIt was the worst of times."
print(tale2cities)

It was the best of times.
It was the worst of times.


We can use `\t` to insert tabs in our string. This can be used for indenting lines, or for aligning output. 

In [8]:
print("Regular.")
print("\tIndented.")
print("\t\tDouble indented.")

Regular.
	Indented.
		Double indented.


The tab escape character can be used to align portions of multi-line output. The following example shows how we might use tabs to align columns in the printout of an employee database. 

In [9]:
print('ID\tEmployee Name\tSalary')
print('-------------------------------')
print('107\tJane Doe\t$54,000')
print('139\tJohn Smith\t$48,300')
print('162\tPat Jones\t$52,500')

ID	Employee Name	Salary
-------------------------------
107	Jane Doe	$54,000
139	John Smith	$48,300
162	Pat Jones	$52,500
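
The table of escape sequences also lists `\\`, which has not appeared in an example so far. The short sketch below uses a made-up Windows-style path (the path is hypothetical and chosen only for illustration) to show why it is needed: a single backslash followed by `t` or `n` would otherwise be read as a tab or a newline.

```python
# Hypothetical path used only for illustration.
# Each "\\" in the source code produces one backslash in the string.
path = "C:\\temp\\new_folder"
print(path)   # C:\temp\new_folder
```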


## Operations Involving Strings

When appearing between numbers, the symbols `+`, `-`, `*`, `/`, and `**` perform the relevant arithmetic operations. However, some of these symbols can also be used to combine instances of other data types. We will see examples of this as we introduce new data types. The only one of these symbols that can be used between two strings is `+`.

When `+` is used between two strings, it combines, or **concatenates**, the strings. The string that appears on the left side of `+` will come first, and the string on the right side will be appended to the end.

In [10]:
a = 'star'
b = 'wars'
c = a + b
print(c)

starwars
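
Because the order of the operands determines the order of the result, swapping them generally gives a different string. A quick sketch using the same values of `a` and `b` (the names `ab` and `ba` are illustrative):

```python
a = 'star'
b = 'wars'

# The left operand always comes first in the result.
ab = a + b
ba = b + a

print(ab)   # starwars
print(ba)   # warsstar
```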


We can use `+` to combine several strings at once. It is not necessary for all of the strings to be stored in variables. We see this in the next example, which places a space between the two words.

In [11]:
d = a + ' ' + b
print(d)

star wars


## Operations Involving Strings and Numbers

If we try to combine a string and a number with `+`, we will get an error.

In [12]:
print("one" + 2)

TypeError: can only concatenate str (not "int") to str

Note that numbers enclosed in quotes are also considered strings. Python does not recognize them as numbers.

In [13]:
print("1" + 2)

TypeError: can only concatenate str (not "int") to str

Although we are not able to "add" strings to numbers, we are able to "multiply" a string by an integer. The result will be a string consisting of the original string repeated the specified number of times.

In [14]:
print("blah " * 5)

blah blah blah blah blah 


Since the product of a string and an integer produces another string, expressions of this type can be concatenated together.

In [15]:
print("la " * 4 + "doo " * 3)

la la la la doo doo doo 
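
One practical use of string repetition is building divider lines like the row of dashes in the employee table earlier, without typing each dash by hand. The sketch below assumes a width of 31 characters, which is an arbitrary choice, and also shows that the integer may appear on either side of `*`.

```python
# Build a divider by repeating a single character.
# The width of 31 is an arbitrary, illustrative choice.
divider = '-' * 31
print('ID\tEmployee Name\tSalary')
print(divider)

# The integer can also be written on the left of the string.
repeated = 3 * 'ab'
print(repeated)   # ababab
```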


## Type Coercion with Strings

We will now explore the situations in which we are able to convert between `str` objects and `int` or `float` objects.

We can convert a `str` object to an `int` or a `float` if the value contained within the string makes sense as the new datatype. 

In [16]:
a_str = '61'
a_int = int(a_str)
a_float = float(a_str)
print(a_int)
print(a_float)

61
61.0


In [17]:
b_str = '7.93'
b_float = float(b_str)
print(b_float)

7.93


Since the value of `b_str` is not interpretable as an integer, we will get an error if we attempt to coerce it to an integer.

In [18]:
b_int = int(b_str)

ValueError: invalid literal for int() with base 10: '7.93'

If we are very insistent about coercing `b_str` to an integer, we can first coerce it to a float, and then to an integer.

In [19]:
b_int = int(float(b_str))
print(b_int)

7
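
The result is 7 rather than 8 because `int()` discards the fractional part of a float instead of rounding it. If the nearest integer is what we want, the built-in `round()` function is one option. The sketch below compares the two; the names `truncated` and `rounded` are illustrative.

```python
b_str = '7.93'

# int() drops the fractional part; it does not round.
truncated = int(float(b_str))

# round() returns the nearest integer instead.
rounded = round(float(b_str))

print(truncated)   # 7
print(rounded)     # 8
```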


We can always convert an `int` or a `float` object to a `str` using the `str()` function.

In [20]:
x_float = 4.5
x_str = str(x_float)
print(x_str)

4.5


In [21]:
y_int = 8675409
y_str = str(y_int)
print(y_str)

8675409


Converting numerical values to strings can be very useful if we want to output a message that contains a mixture of predetermined text and numeric values that are stored in variables. Converting the numeric portions of the message to strings allows them to be concatenated with the rest of the text.

Consider the following example.

In [22]:
z = 3.56
z2 = z**2

print('The square of ' + str(z) + ' is ' + str(z2) + '.')

The square of 3.56 is 12.6736.
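
Type coercion also gives us two ways to resolve the earlier `"1" + 2` error, depending on which result we intend. The following sketch shows both; the names `total` and `joined` are illustrative.

```python
# Convert the string to a number to get arithmetic addition.
total = int("1") + 2
print(total)    # 3

# Convert the number to a string to get concatenation.
joined = "1" + str(2)
print(joined)   # 12
```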


## The `len()` Function

Python provides several built-in functions for working with strings. The first such function we will discuss is `len()`. The `len()` function allows you to determine the length of a string.

In [23]:
x = "There are 39 characters in this string."
print(len(x))

39
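
A few further cases are worth checking when counting characters. The sketch below, with illustrative variable names, shows that spaces count toward the length, that an escape sequence counts as a single character, and that an empty string has length zero.

```python
# Spaces and punctuation count as characters.
with_space = len("star wars")
print(with_space)    # 9

# An escape sequence such as \t is a single character.
with_tab = len("a\tb")
print(with_tab)      # 3

# The empty string has length 0.
empty = len("")
print(empty)         # 0
```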


## Methods

The majority of the functions we will encounter when working with strings are **methods**. The difference between a method and other types of functions we will encounter is subtle, and will be discussed in greater detail later in the course. For now, we simply note the following points regarding methods:
1. A method is a function that belongs to a specific object (such as an `int`, `float`, or `str`). 
2. To use a method on an object, you write the name of the object, followed by a dot, followed by the name of the method, followed by a set of parentheses.

In the following example, we consider three string methods:

* `upper()` converts the string to uppercase. 
* `lower()` converts the string to lowercase. 
* `title()` capitalizes the first letter of each word in the string. 

Note that none of these methods actually change the contents of the string. They instead provide a new string as their output. 

In [24]:
myString = "There's a method in the madness."
print(myString.upper())
print(myString.lower())
print(myString.title())

THERE'S A METHOD IN THE MADNESS.
there's a method in the madness.
There'S A Method In The Madness.
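
Two details of the output above can be checked with a short sketch. First, calling a method such as `upper()` leaves the original string as it was. Second, `title()` capitalizes the letter that follows an apostrophe, which explains the `There'S` in the last line. The variable names here are illustrative.

```python
greeting = "Hello world!"

# upper() returns a new string; greeting itself is unchanged.
shouted = greeting.upper()
print(shouted)    # HELLO WORLD!
print(greeting)   # Hello world!

# title() treats the letter after an apostrophe as the start of a word.
titled = "it's here".title()
print(titled)     # It'S Here
```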


Some methods accept inputs (also called arguments). One example is the `count()` method. This method searches the string to see how many times the supplied input (also a string) appears within the original string. This is demonstrated below. 

In [25]:
print(myString.count("m"))
print(myString.count("e"))

2
5
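
The argument to `count()` may be longer than one character, and the search is case-sensitive. The sketch below redefines `myString` with the same value so that it runs on its own; the names `exact` and `any_case` are illustrative.

```python
myString = "There's a method in the madness."

# count() is case-sensitive: the "The" in "There's" does not match "the".
exact = myString.count("the")
print(exact)      # 1

# Lowercasing first makes the search ignore case.
any_case = myString.lower().count("the")
print(any_case)   # 2
```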


The `replace()` method accepts two arguments. This method scans the source string and replaces all occurrences of the first argument with the second argument. Again, it does not actually change the contents of the original string. It instead returns a new string as output.

In [26]:
a = "a "
b = "no "
print(myString.replace(a, b))

There's no method in the madness.
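
The example above contains only one match, so a short sketch with a made-up string may make the behavior clearer: every occurrence is replaced, the original string is untouched, and an optional third argument limits how many replacements are made. The names `lyric`, `changed`, and `first_only` are illustrative.

```python
lyric = "la la la"

# Every occurrence of the first argument is replaced.
changed = lyric.replace("la", "da")
print(changed)      # da da da

# The original string is left as it was.
print(lyric)        # la la la

# An optional third argument limits the number of replacements.
first_only = lyric.replace("la", "da", 1)
print(first_only)   # da la la
```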
