## Strings
*Amanda R. Kube Jotte, Evelyn Campbell, and Dan L. Nicolae*

A **string** is a data type consisting of a sequence of characters, such as letters, digits, punctuation, and spaces, that are **concatenated** (i.e., linked or chained) together.

Python recognizes strings by the single (' '), double (" "), or triple (''' ''' or """ """) quotation marks that surround them.

In [None]:
print('This is a sentence.')

This is a sentence.
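
The cell above uses single quotes. As a minimal sketch of the triple-quote form mentioned earlier, the example below shows that a triple-quoted string may span several lines and may contain both kinds of quotation mark. The variable names `poem` and `quote` are illustrative and do not come from the text.

```python
# A triple-quoted string may span several lines; the line breaks are kept.
poem = """Roses are red,
Violets are blue."""
print(poem)

# Single and double quotes can both appear inside without any special handling.
quote = '''She said, "This isn't easy."'''
print(quote)
```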


Double quotes are a good first choice because they allow single quotes (apostrophes) inside the string. In the examples below, the double-quoted string prints correctly, but placing an apostrophe inside single quotes produces an error message.

In [None]:
print("This isn't easy.")

This isn't easy.


In [None]:
print('This isn't easy.')

SyntaxError: unterminated string literal (detected at line 1) (3546504085.py, line 1)

While the above error can be fixed by wrapping the string in double quotes in place of the single quotes, it can also be fixed by an **escape sequence**. Escape sequences are string modifiers that allow for the use of certain characters that would otherwise be misinterpreted by Python. Because strings are created by the use of quotes, the escape sequences `\'` and `\"` allow for the use of quotes as part of a string:

In [None]:
print('This isn\'t easy.')

This isn't easy.
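
The same idea works for `\"` inside a double-quoted string. The sketch below also shows `\\`, which is not covered in the text above but is the standard way to write a literal backslash. The variable names and the file path are hypothetical.

```python
# \" lets a double quote appear inside a double-quoted string.
reply = "She said, \"This isn't easy.\""
print(reply)

# \\ produces one literal backslash (the path is a made-up example).
path = "C:\\data\\notes.txt"
print(path)
```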


Other useful escape sequences include `\n` and `\t`. These allow for a new line and tab spacing to be added to a string, respectively.

In [None]:
print('''This is the first sentence. \nThis is the second sentence! \tThis is the third sentence?''')

This is the first sentence. 
This is the second sentence! 	This is the third sentence?
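
Although an escape sequence is typed as two characters, Python stores it as one. A small sketch, assuming the built-in functions `len()` and `repr()` (neither is introduced in this section), makes this visible:

```python
line = 'a\nb'  # illustrative variable name

print(len(line))   # 3: 'a', the newline character, and 'b'
print(repr(line))  # repr() displays the escape sequence instead of applying it
```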


We can **concatenate**, or join together, strings using the `+` operator. The `*` operator repeats a string a given number of times, which amounts to concatenating it with itself.

In [None]:
print('This is a sentence.' + 'This is a sentence.')

print('This is a sentence.' * 2)


This is a sentence.This is a sentence.
This is a sentence.This is a sentence.


In the above example, Python prints the sentence twice, but the two copies run into each other (i.e., there is no space in between). We have to tell Python explicitly to add this space. One way is to concatenate a string containing a single space (" ") between the two strings. Another is to pass the strings as separate arguments to the `print()` function, separated by commas; `print()` then places a space between its arguments by default.

In [None]:
print('This is a sentence.' + " " + 'This is a sentence.')
print('This is a sentence.', 'This is a sentence.')

This is a sentence. This is a sentence.
This is a sentence. This is a sentence.
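
The space that `print()` places between its arguments can be changed with the `sep` keyword argument. The sketch below is a minimal illustration; the variable names `first` and `second` are assumptions made for the example.

```python
first = 'This is a sentence.'
second = 'This is a sentence.'

# Default: print() separates its arguments with a single space.
print(first, second)

# The sep keyword argument replaces that space with any other string.
print(first, second, sep='')
print(first, second, sep=' | ')
```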


Note that `+` joins string expressions together but cannot combine a string with a numerical expression, since they are not the same data type.

In [None]:
2 + 'This is a sentence.'

TypeError: unsupported operand type(s) for +: 'int' and 'str'
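
One way around this error is to convert the number to a string first with the `str()` function, which is discussed later in this section. The sketch below uses an illustrative variable, `count`, and a made-up message.

```python
count = 2  # illustrative variable name

# str() turns the integer into the string '2', so + can concatenate.
message = str(count) + ' sentences were printed.'
print(message)

# Passing separate arguments to print() avoids the conversion altogether.
print(count, 'sentences were printed.')
```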

Escape sequences can also be used in the `print()` function, either as a separate argument or through concatenation:

In [None]:
# Escape sequence used as an argument in the print function
print('This is a sentence.', '\t', 'This isn\'t easy.') 
# Escape sequence used to print blank lines ('\n' plus print's own line ending gives two)
print('\n')
# Escape sequence concatenated to strings in the print function
print('This isn\'t easy.' + '\t' + 'This is a sentence.')

This is a sentence. 	 This isn't easy.


This isn't easy.	This is a sentence.


Numeric and Boolean values can also be turned into strings, either by writing them within quotation marks or by passing them as an argument to the `str()` function.

In [None]:
print("2")
print(str(True))

2
True


We can confirm that these are indeed strings by calling the `type()` function on them. `type()` can be used on any value or variable to check its data type.

In [None]:
print(type("2"))
print(type(str(True)))

<class 'str'>
<class 'str'>


Keep in mind that when a numerical value is converted to a string, it can no longer be used to perform certain mathematical calculations, such as division, subtraction, or exponentiation.

In [None]:
"2" ** 2

TypeError: unsupported operand type(s) for ** or pow(): 'str' and 'int'

A string can still be used with `+` and `*`, but in a "stringy" way rather than a "mathy" way: the result is concatenation or repetition, not arithmetic.

In [None]:
"2" + "2"

'22'

This is the only time when 2 + 2 equals 22. 🙃
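
Multiplication behaves the same way. As a short sketch (the variable name `text_two` is illustrative), `*` repeats the string, while converting back with `int()` restores ordinary arithmetic:

```python
text_two = "2"

print(text_two * 3)                   # repetition: the string '222'
print(int(text_two) + int(text_two))  # arithmetic: the integer 4
```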

We can also convert strings containing numerical values to integers and floats:

In [None]:
print(int('45'))
print(float('45'))

45
45.0


Remember, the `int()` and `float()` functions can only convert recognized numerical values. A string of letters cannot be converted to a float or integer.

In [None]:
int('Sorry')

ValueError: invalid literal for int() with base 10: 'Sorry'
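
When it is not known in advance whether a string holds a number, the error can be caught instead of stopping the program. The sketch below is one possible approach, not a prescribed one; it relies on a function definition and on `try`/`except`, neither of which has been introduced in this section, and the helper name `to_number` is hypothetical.

```python
def to_number(text):
    """Return text as an int or a float, or None if it is not numeric."""
    try:
        return int(text)
    except ValueError:
        pass  # not an integer; try a float next
    try:
        return float(text)
    except ValueError:
        return None  # neither conversion worked


print(to_number('45'))     # 45
print(to_number('45.5'))   # 45.5
print(to_number('Sorry'))  # None
```

Note that `int('45.5')` also raises a `ValueError`, which is why the sketch falls back to `float()`.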

By understanding data types, we can begin to use them in other analyses and functionalities in Python. Next, we will learn how to use data types in comparisons, which can help further down the line in functions ([Chapter 3.5](../5/IntroFunctions.ipynb)), for loops ([Chapter 5.3](../../05/3/Control_Statements_Iteration.ipynb)), and subsetting data from DataFrames ([Chapter 6.6](../../06/6/Select_Condition.ipynb)).

We showed examples of comparisons of integers and floats, but strings can also be used with comparison operators.

In [None]:
a = 'Dan'
b = 'Mike'

print("a == b:", a == b)
print("a != b:", a != b)
print("a < b:", a < b)
print("a <= b:", a <= b)
print("a > b:", a > b)
print("a >= b:", a >= b)

a == b: False
a != b: True
a < b: True
a <= b: True
a > b: False
a >= b: False

As you can see, Python finds the two strings to be different, and it can also compare them with inequality operators. The order is determined [lexicographically](https://en.wikipedia.org/wiki/Lexicographic_order) using the Unicode code points of the characters, which match the ASCII values for unaccented English letters, digits, and common punctuation.
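
The numeric value (code point) behind each character can be inspected with the built-in `ord()` function, which is not used elsewhere in this section. The sketch below shows why `'Dan'` sorts before `'Mike'`; the names `'Zoe'` and `'adam'` are made up for illustration.

```python
a = 'Dan'
b = 'Mike'

# ord() returns the code point of a single character.
print(ord('D'), ord('M'))   # 68 77
print(ord('D') < ord('M'))  # True, which is why a < b

# Uppercase letters have smaller code points than lowercase ones.
print(ord('Z'), ord('a'))   # 90 97
print('Zoe' < 'adam')       # True
```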

Letter case is important for comparisons:

In [None]:
'Dan' == 'dan'

False
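
Because 'D' and 'd' are different characters, the two strings above are not equal. If case should be ignored, one common approach is to convert both strings to the same case before comparing. The sketch below uses the string method `lower()`, which is not introduced in this section, and illustrative variable names.

```python
name_a = 'Dan'
name_b = 'dan'

print(name_a == name_b)                  # False: case differs
print(name_a.lower() == name_b.lower())  # True: both become 'dan'
```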