## Strings
*Amanda R. Kube Jotte*

A **string** is a data type made up of alphanumeric and punctuation characters that are **concatenated** (i.e., linked or chained) together. Python recognizes strings by their single (' '), double (" "), or triple (''' ''') quotation marks.

For example, I can create a string by placing the text that I want between single, double, or triple quotes, as shown below. All three styles produce the same kind of value: a string.

In [1]:
'This is a sentence.'

'This is a sentence.'

In [3]:
"This is also a sentence."

'This is also a sentence.'

In [4]:
'''This is a third sentence.'''

'This is a third sentence.'

The code cells above show output printed below them because they contain an *expression*. Recall from the [previous section](../3/Assignments.ipynb) that expressions are code statements that are not assignment statements (that is, they do not use an `=`). 

```{note}
If an expression is the only line in a code cell, or if it is the last line in a cell, its result will automatically display below the cell. When this happens, we say that the code is *printing to the console*. You can think of this like your calculator showing you the result after you type something in.
```

## Creating and Displaying Strings

Usually, we wouldn't just write a string in a code cell like this. If we want to print something to the console, we can use the `print()` function. You've seen this function before, but we have not properly introduced it. The `print()` function converts its input(s) to a string, and prints it. It does not need to be the only line or the last line to print to the console. It can be anywhere in your code.

In [None]:
var = 3 + 2
print('Today is a sunny day in Chicago.')
var

Today is a sunny day in Chicago.


5

The cell above creates a variable named `var` and prints the string `'Today is a sunny day in Chicago.'`. Then the expression on the last line, `var`, displays its value (in this case 5) in the console. So we see two printed outputs above.

You will see later in this book that printing things to the console can be very useful. For now, it is useful to start learning how the `print()` function works and how strings can be used in Python.

When creating strings in Python, inside or outside of `print()`, it is recommended to use double quotes instead of single quotes, as they allow single quotation marks (such as apostrophes) inside the string. In the examples below, the double-quoted string prints correctly, but we get an error message when we try to use an apostrophe inside single quotes.

In [None]:
print("Today's the day!")

Today's the day!


In [None]:
print('Today's the day!')

SyntaxError: unterminated string literal (detected at line 1) (3546504085.py, line 1)

While the above error can be fixed by wrapping the string in double quotes in place of the single quotes, it can also be fixed with an **escape sequence**. An escape sequence is a backslash followed by a character; it lets a string include characters that Python would otherwise misinterpret. Because strings are created by the use of quotes, the escape sequences `\'` and `\"` allow for the use of quotes as part of a string:

In [None]:
print('Today\'s the day!')

Today's the day!
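As a small sketch of the two fixes described above (the variable names `escaped`, `double`, and `quote` are made up for illustration), both approaches should produce the same string. Single quotes are also convenient when the text itself contains double quotes:

```python
# Two ways to write the same text containing an apostrophe
escaped = 'Today\'s the day!'   # single quotes with an escaped apostrophe
double = "Today's the day!"     # double quotes, no escape needed

print(escaped == double)        # the two strings are identical

# Single quotes work well when the text contains double quotes
quote = 'She said "hello" to me.'
print(quote)
```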


Other useful escape sequences include `\n` and `\t`. These allow for a new line and tab spacing to be added to a string, respectively.

In [None]:
print("This is the first sentence. \nThis sentence is on a new line! \tThis sentence comes after a tab.")

This is the first sentence. 
This sentence is on a new line! 	This sentence comes after a tab.
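One way to see that an escape sequence stands for a single character is to count characters with the built-in `len()` function, which has not been introduced in this section. The sketch below also uses `\\`, the escape sequence for a literal backslash; the variable names and the path are hypothetical.

```python
# Each escape sequence is stored as one character
two_lines = "first\nsecond"
print(len("\n"))        # the newline counts as a single character
print(len(two_lines))   # 5 letters + 1 newline + 6 letters

# A doubled backslash produces one literal backslash
path = "C:\\temp"       # hypothetical path, for illustration only
print(path)
```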


Triple quotes are useful if you want to create a string that spans multiple lines.

In [None]:
my_par = '''This is line 1.
This is line 2.
This is line 3.
'''
print(my_par) # display the text using the print() function
my_par # display the text without the print function

This is line 1.
This is line 2.
This is line 3.



'This is line 1.\nThis is line 2.\nThis is line 3.\n'

Above, I created a variable named `my_par` that stores three sentences, each separated by a newline. I then used the `print()` function to display the text, and also included `my_par` on the last line of the cell to show how Python displays it directly.

```{note}
The two outputs are formatted differently.  
- Without `print()`, Python shows the text *literally*—including the special newline character (`\n`) instead of starting a new line, and with quotation marks around the text.  
- With `print()`, Python formats the text in a more natural way: it starts new lines where the `\n` appears, and it removes the quotation marks.  
```
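The literal form that appears without `print()` can also be produced on demand with the built-in `repr()` function. This is a minimal sketch, assuming the notebook displays strings using this same representation; `my_text` and `shown` are illustrative names.

```python
my_text = "Line 1.\nLine 2."

shown = repr(my_text)   # the literal form, with quotes and a visible \n
print(shown)            # prints the literal form on one line
print(my_text)          # prints two separate lines, no quotes
```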

Formatting text nicely isn't the only benefit of using the `print()` function. Recall, the `print()` function converts its input(s) (also called **arguments**, see [the next section](../5/IntroFunctions.ipynb) for more information) to a string, and prints it. So far, we've given the function only one input -- meaning we have put one string inside the parentheses. We could give it multiple strings:

In [9]:
print("This is the first input.", "This is another input.") # We separate inputs by commas

# The function separates inputs with a space and makes them one large string
print("There can", "also be", "three or more inputs.") 

This is the first input. This is another input.
There can also be three or more inputs.


We can even give it inputs that are not strings. Recall, we did this in [Section 3.2](../2/Booleans.ipynb).

In [10]:
# The second input here is a Boolean. Python makes it a string and combines the strings
print("1.5 == 0.8:", 1.5 == 0.8)

1.5 == 0.8: False
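The single space that `print()` places between inputs is only a default. As a brief sketch, the function's optional `sep` and `end` arguments (not covered in the text above) change what goes between inputs and what is printed after the last one:

```python
# sep controls what goes between inputs (the default is a single space)
print("2024", "01", "15", sep="-")

# end controls what is printed after the last input (the default is a newline)
print("Loading", end="...")
print("done")
```

The first call should print `2024-01-15`, and the last two calls together should print `Loading...done` on one line.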


## String Operations
Joining strings together is also called **concatenation**. The `print()` function is not the only way to join strings together. We can **concatenate** strings using the arithmetic operators `+` and `*`.

In [11]:
str1 = "Moses supposes"
str1

'Moses supposes'

In [12]:
str2 = "his toeses are roses."
str2

'his toeses are roses.'

In [13]:
str1 + str2

'Moses supposeshis toeses are roses.'

When we concatenate strings using `+`, it is done *very literally*. As there is no space at the end of `str1` or at the beginning of `str2`, when we join them together, there is no space between `supposes` and `his`. We can fix this by either changing the strings or concatenating a space in the middle.

In [14]:
str1 + " " + str2

'Moses supposes his toeses are roses.'

You can think of `+` as *squishing* together the strings on either side of the operator. 

````{note}
The `+` operator can be used with numeric data or with strings but NOT with a combination of both.

```python
3 + 2.75 # This will work
```

```python
"Hello " + "world!" # This will also work
```

```python
9 + " lives" # This will produce a TypeError
```

If you want to concatenate numbers and text, you have to first turn the numeric data into a string. You can do this by placing it in quotations like below:

```python
"9" + " lives"
```

Or you can use the `str()` function to create a string.

```python
str(9) + " lives"
```

Keep in mind that when a numerical value is converted to a string, it can no longer be used to perform certain mathematical calculations, such as division, subtraction, or exponentiation.

```python
"2" ** 2 # this will produce a TypeError
```

```python
# This is concatenation not addition
"2" + "2" # The only time 2 + 2 = 22
```

````
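The same rule applies when the number is stored in a variable. The sketch below uses Python's `try`/`except` error handling, which has not been introduced in this section, only so that the cell keeps running after the `TypeError`; the names `lives` and `message` are illustrative.

```python
lives = 9  # a number stored in a variable (hypothetical example value)

try:
    message = lives + " lives"       # int + str is not allowed
except TypeError as err:
    print("TypeError:", err)
    message = str(lives) + " lives"  # convert first, then concatenate

print(message)
```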

We can also use the `*` operator on strings. This will concatenate the string with copies of itself. For example, the code below concatenates two copies of the string.

In [16]:
"waka" * 2

'wakawaka'

We can "multiply" strings by any integer.

In [19]:
"waka" * 7

'wakawakawakawakawakawakawaka'

But we cannot use floats:

In [None]:
"waka" * 1.5

TypeError: can't multiply sequence by non-int of type 'float'
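String repetition is handy for building simple text dividers. As a small sketch (it uses the built-in `len()` function, and the variable names are made up for illustration), the code below underlines a title and also shows that multiplying by zero or a negative integer gives an empty string:

```python
title = "Report"
divider = "-" * len(title)   # one dash per character in the title
print(title)
print(divider)

# Multiplying by zero or a negative integer gives an empty string
empty = "waka" * 0
print(empty == "")
print("waka" * -3 == "")
```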

## Checking and Changing Data Types

If you need to confirm the data types of the values you are using, you can use the `type()` function that we learned in [Section 3.1](../1/NumericData.ipynb). The code below prints the data types of `"waka"`, `1.5`, and `7`, which are string (`str`), float, and integer (`int`), respectively.

In [20]:
print(type("waka"))
print(type(1.5))
print(type(7))

<class 'str'>
<class 'float'>
<class 'int'>


Often, this is more useful if we've created variables to store these values and have forgotten what type of data we stored.

In [23]:
my_var = 4 < 7
my_other_var = "It's raining cats and dogs."

In [None]:
type(my_var) # we don't have to print the result if it's the last/only line

bool

In [25]:
type(my_other_var)

str

We've seen the `int()`, `float()`, and `str()` functions which can change the data type of their input. We can also use these to convert numerical values inside strings to integers and floats.

In [None]:
print(int('45'))
print(float('45'))

45
45.0


Remember, the `int()` and `float()` functions can only convert recognized numerical values. A string of letters cannot be converted to a float or integer.

In [None]:
int('Sorry')

ValueError: invalid literal for int() with base 10: 'Sorry'
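A string can look numeric and still be rejected by `int()`. In the sketch below, the hypothetical value `"45.7"` converts with `float()` but not with `int()`, because `int()` does not accept a decimal point inside a string. The `try`/`except` lines are only there so the example runs to the end.

```python
price = "45.7"   # hypothetical value: a decimal number stored as a string

as_float = float(price)   # works, because float() accepts a decimal point
as_int = int(as_float)    # converting the float drops the fractional part

try:
    int(price)            # int() cannot parse a decimal point directly
except ValueError as err:
    print("ValueError:", err)

print(as_float, as_int)
```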

## String Comparisons

In [Section 3.2](../2/Booleans.ipynb) we showed examples of comparisons of integers and floats, but strings can also be used with comparison operators.

In [None]:
a = 'Dan'
b = 'Mike'

print("a == b:", a == b)
print("a != b:", a != b)
print("a < b:", a < b)
print("a <= b:", a <= b)
print("a > b:", a > b)
print("a >= b:", a >= b)

a == b: False
a != b: True
a < b: True
a <= b: True
a > b: False
a >= b: False

As you can see, Python compared the two strings and found them different, and it can also compare them with inequality operators. The order is determined [lexicographically](https://en.wikipedia.org/wiki/Lexicographic_order) using the Unicode code points of the characters, which match the ASCII values for basic English letters, digits, and punctuation.
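To see the numbers behind this ordering, the built-in `ord()` function returns the code of a single character. The sketch below is illustrative only; `first_differs` and `prefix_first` are made-up variable names.

```python
a = 'Dan'
b = 'Mike'

# ord() returns the numeric code of a single character
print(ord('D'), ord('M'))   # 'D' has the smaller code, so it sorts first

first_differs = a < b              # decided by the first characters, 'D' vs 'M'
prefix_first = 'Dan' < 'Daniel'    # a string sorts before a longer string that starts with it
print(first_differs, prefix_first)
```

Python compares the strings one character at a time and stops at the first position where they differ.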

Letter case is important for comparisons:

In [None]:
'Dan' == 'dan'