In [None]:
%reload_ext postcell
%postcell register

# Strings

Python strings are sequences of characters enclosed in single quotes `'` or double quotes `"`.

In [1]:
"Hello World"

'Hello World'

In [2]:
'Hello World'

'Hello World'

Note that many other languages (C and Java, for example) require double quotes for strings. In Python, both forms produce the same string.

Python additionally provides multi-line strings, enclosed in triple quotes `"""`:

In [3]:
"""Hello
world
!"""

'Hello\nworld\n!'

Note that `\n` is a special character representing a new line. If you were to `print()` the string above, you would actually see multiple lines:

In [4]:
print("""Hello
world
!""")

Hello
world
!
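As a small illustration (the variable names here are only for this sketch), the snippet below checks that a triple-quoted string is the same as one written with explicit `\n` escapes, and that each `\n` counts as a single character.

```python
text = """Hello
world
!"""

# The triple-quoted string is identical to one written with explicit \n escapes.
same = text == "Hello\nworld\n!"

# Each \n counts as one character: 5 + 1 + 5 + 1 + 1 = 13.
length = len(text)

# splitlines() breaks the string apart at the newline characters.
lines = text.splitlines()

print(same, length, lines)
```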


### Operators

| Operator | Description | Example
| ---      | ---         | ---
| +        | Combine two strings | "hello " + "world"
| *        | Repeat the string   | "hello" * 5
| in       | Does a substring exist in the string? | "orld" in "Hello world"

In [8]:
"hello " + "world" # notice the extra space after hello

'hello world'

In [7]:
"hello " * 5

'hello hello hello hello hello '

In [10]:
"orld" in "Hello world"

True
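A few more things worth trying with these operators, shown as a rough sketch (the variable names are made up for illustration): `in` is case-sensitive, `not in` is its negated form, and `+` and `*` can be combined.

```python
greeting = "Hello world"

# `in` is case-sensitive: "World" (capital W) does not occur in greeting.
has_capital_world = "World" in greeting   # False
has_lower_world = "world" in greeting     # True

# `not in` is the negated form of `in`.
missing = "xyz" not in greeting           # True

# Repetition is handy for simple separators; + and * can be combined.
rule = "-" * 10
banner = rule + " " + greeting + " " + rule

print(has_capital_world, has_lower_world, missing)
print(banner)
```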

**Exercise** Where else have you seen the `in` operator?

**Exercise** Does "Hell" exist in the text "Hello World"? (show using code)

In [None]:
%%postcell exercise_025_140_a

#type your answer here

### Common string functions
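The list below shows the many methods available on strings. You are expected to check the documentation to understand how to use them. String methods are documented at https://docs.python.org/3/library/stdtypes.html#string-methods, and the `string` module page at https://docs.python.org/3/library/string.html covers the format syntax used by `format`. A few important methods are covered below.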

In [12]:
[f for f in dir("hello") if not f.startswith("__")]

['capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

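**.split** The split method has been used many times already; it splits a string into a list of tokens. It accepts an optional separator; without one, the string is split on runs of whitespace. Note that when splitting on `','`, the space after each comma stays in the resulting tokens.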

In [15]:
'1, 2, 3'.split(',')

['1', ' 2', ' 3']
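The sketch below compares splitting with and without a separator, and shows one possible way to clean up the leftover spaces. The sample strings and variable names are only illustrative.

```python
raw = "1, 2, 3"

# Splitting on "," keeps the space that followed each comma.
tokens = raw.split(",")                        # ['1', ' 2', ' 3']

# One way to clean up: strip each token after splitting.
clean = [token.strip() for token in tokens]    # ['1', '2', '3']

# With no separator, split() breaks on any run of whitespace
# and ignores leading and trailing whitespace.
words = "  the   quick\tbrown\nfox ".split()   # ['the', 'quick', 'brown', 'fox']

# Tokens are still strings; convert them explicitly if you need numbers.
numbers = [int(token) for token in clean]      # [1, 2, 3]

print(tokens, clean, words, numbers)
```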

**.strip** The strip method removes whitespace (spaces, tabs and newlines) from the beginning and end of a string. This is often useful when reading data files.

In [18]:
"    hello world      \n"

'    hello world      \n'

In [19]:
"    hello world      \n".strip()

'hello world'
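`strip` has two one-sided relatives, `lstrip` and `rstrip`, and it can also be given the characters to remove. A minimal sketch, with variable names chosen just for this example:

```python
line = "    hello world      \n"

left = line.lstrip()    # removes leading whitespace only
right = line.rstrip()   # removes trailing whitespace only (including \n)
both = line.strip()     # removes both

# strip() can also take the characters to remove from both ends.
trimmed = "--hello--".strip("-")

# Whitespace inside the string is never touched.
inner = "  a   b  ".strip()

print(repr(left), repr(right), repr(both), repr(trimmed), repr(inner))
```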

**.join** The join method is called on a delimiter string (such as a comma or semicolon) and accepts a list (or any iterable) of strings as its argument. It inserts the delimiter between the items and returns the result as a single string. In a sense, this method is the opposite of the **.split** method.

In [24]:
";".join(['homer', 'marge', 'bart'])

'homer;marge;bart'
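To see why `join` is described as the opposite of `split`, the sketch below does a round trip. It also shows that `join` only accepts strings, so numbers have to be converted first. The `ages` list is made-up data for illustration.

```python
names = ["homer", "marge", "bart"]

joined = ";".join(names)          # 'homer;marge;bart'
round_trip = joined.split(";")    # back to the original list

# join() only accepts strings; a list of numbers raises a TypeError.
ages = [38, 36, 10]               # hypothetical values
try:
    ",".join(ages)
    error = None
except TypeError as exc:
    error = exc

# Converting each item with str() first makes it work.
ages_text = ",".join(str(a) for a in ages)

print(joined, round_trip, ages_text)
```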

**Exercise** Turn list `['arya', 'jon', 'white walker']` into string `'arya, jon, white walker'` using the `join` function

In [None]:
%%postcell exercise_025_140_b

#type your answer here

**.find** The find method searches for a substring in a string and returns the index where the first match begins. For example, in the string "hello world", the word "world" begins at index 6 (the seventh character, because indexes start at 0). If the substring is not found, `-1` is returned. Note that if you just need to check whether a substring exists within a string, you can also use the generic `in` operator.

In [28]:
"hello world".find("world")

6

In [29]:
"hello world".find("jupiter")

-1

In [30]:
"world" in "hello world"

True
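Some related calls, sketched here with illustrative variable names: `find` accepts an optional start position, `rfind` searches from the right, and `index` works like `find` but raises a `ValueError` instead of returning `-1`.

```python
text = "hello world"

first_o = text.find("o")                 # index of the first "o"
second_o = text.find("o", first_o + 1)   # search again, starting after the first match
last_o = text.rfind("o")                 # search from the right

# A match at the very start returns 0, which is falsy, so compare
# the result against -1 instead of using it directly in an `if`.
starts_at = text.find("hello")

# index() behaves like find() but raises ValueError when nothing matches.
try:
    text.index("jupiter")
    found = True
except ValueError:
    found = False

print(first_o, second_o, last_o, starts_at, found)
```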

**.replace** The replace method replaces every occurrence of one substring with another and returns the result as a new string.

In [32]:
"hello world; welcome to programming".replace(';',',')

'hello world, welcome to programming'
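Two details that are easy to miss, shown as a short sketch (the sample text is extended slightly for illustration): `replace` takes an optional third argument limiting the number of replacements, and it never modifies the original string.

```python
original = "hello world; welcome to programming; enjoy"

# By default every occurrence is replaced.
all_replaced = original.replace(";", ",")

# An optional count limits how many occurrences are replaced.
first_only = original.replace(";", ",", 1)

# Replacing with an empty string deletes the substring.
removed = original.replace(";", "")

# Strings are immutable, so `original` is unchanged.
print(original)
print(all_replaced)
print(first_only)
print(removed)
```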

### Formatters (string interpolation)
One of the most common operations related to a string is to create a new string which contains values in variables. This is also called _string interpolation_. There are several ways of doing this. If you are using the `print` function, you can just separate values by commas:

In [33]:
name = "Homer"
age  = 38

print("His name is", name, "and his age is", age)

His name is Homer and his age is 38


Notice that the `print` function inserts a single space between the values separated by commas.
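The separator that `print` uses can be changed with its `sep` parameter. The sketch below also writes to an in-memory buffer (via the `file` parameter) purely to show exactly which characters `print` produces; the buffer is not something you would normally need.

```python
import io

name = "Homer"
age = 38

# Default: arguments are separated by a single space.
print("Name:", name, "Age:", age)

# sep changes the separator placed between arguments.
print("Name", name, "Age", age, sep=" | ")

# Capture the output in a buffer to inspect exactly what print() wrote.
buffer = io.StringIO()
print(name, age, sep="", file=buffer)
captured = buffer.getvalue()   # includes the trailing newline added by print()

print(repr(captured))
```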

Another, even more obvious, method is to _concatenate_ strings by adding them:

In [35]:
"His name is " + str(name) + " and his age is " + str(age)

'His name is Homer and his age is 38'

Notice that the spaces around each value must be included in the string literals themselves. Numeric variables also need to be converted to strings using the `str` function.
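To see why the `str` conversion matters, the sketch below tries to add a string and an integer directly, which raises a `TypeError`. The variable names `message` and `fixed` are just for illustration.

```python
name = "Homer"
age = 38

# Adding a str and an int directly is not allowed.
try:
    message = "His age is " + age
except TypeError:
    message = None

# Converting the number with str() first makes the concatenation work.
fixed = "His age is " + str(age)

print(message, fixed)
```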

The following method is not recommended, but you may see it in other people's code, particularly if you are reading Python 2.7 code:

In [36]:
"His name is %s and his age is %s" % ("Homer", "38")

'His name is Homer and his age is 38'

A more modern method of string interpolation is this:

In [38]:
"His name is {} and his age is {}".format("Homer", "38")

'His name is Homer and his age is 38'

If you wish to repeat a variable, you may number the brackets:

In [40]:
"His name is {0} and {0}'s age is {1}".format("Homer", "38")

"His name is Homer and Homer's age is 38"

You can even name the brackets:

In [41]:
"His name is {name} and {name}'s age is {age}".format(name="Homer", age="38")

"His name is Homer and Homer's age is 38"

#### 'f-strings'

Finally, starting with Python 3.6 (the version you are currently using), you can insert variables directly in a string, as long as each variable is surrounded by curly brackets and the string is preceded by the letter `f`:

In [43]:
name = "Homer"
age  = 38

f"His name is {name} and his age is {age}"

'His name is Homer and his age is 38'
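The curly brackets in an f-string are not limited to plain variable names. As a rough sketch (the `price` value is made up for illustration), they can hold expressions, method calls, and an optional format specification after a colon:

```python
name = "Homer"
age = 38
price = 3.14159   # hypothetical value, only for this example

# Any expression can go inside the brackets.
next_year = f"{name} will be {age + 1} next year"
shout = f"{name.upper()}!"

# A format specification after the colon controls the presentation.
rounded = f"{price:.2f}"            # two decimal places
padded = f"{age:>5}|{name:<8}|"     # right-align in 5 columns, left-align in 8

# !r inserts the repr() of the value, which includes the quotes.
quoted = f"{name!r}"

print(next_year, shout, rounded, padded, quoted)
```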