## Python Strings

### Working with Strings

A *string* variable in Python is created by enclosing text in either single or double quotes and assigning it with the *=* operator. Either quote style works, but the opening and closing quotes must match!

In [None]:
cityname = "Dublin"
address = 'University College Dublin, Belfield'

Strings that span multiple lines can be defined in several different ways. One way is to include a backslash character \ as the last character on the line:

In [None]:
s1 = "This is all \
part of the same string"
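
As a quick check of the backslash form, the sketch below reuses `s1` from above and confirms that the backslash and the line break after it are both dropped, so the result is an ordinary single-line string.

```python
s1 = "This is all \
part of the same string"

print(s1)          # This is all part of the same string
print("\n" in s1)  # False: no line break is stored in the string
```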

Alternatively, we can use triple quotes (""" or ''') to enclose a multi-line string. Unlike the backslash form, the line breaks inside the quotes become part of the string:

In [None]:
s2 = """
This also has multiple lines
defined in a different way
"""

Strings can be treated as sequences of individual characters. Many standard list operations and functions also work on strings, such as accessing individual characters by index or slicing.

In [None]:
cityname[0]

In [None]:
address[0:3]
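
The list analogy has one caveat: strings are immutable, so a character cannot be reassigned in place. The sketch below is illustrative only. It reuses `cityname` from above, adds negative indexing (which counts from the end), and shows that building a new string is the way to "change" one.

```python
cityname = "Dublin"

first = cityname[0]     # 'D'   (indexes start at 0)
last = cityname[-1]     # 'n'   (negative indexes count from the end)
middle = cityname[1:4]  # 'ubl' (start index included, end index excluded)

try:
    cityname[0] = "d"   # not allowed: strings cannot be modified in place
except TypeError as err:
    print("Cannot modify a string:", err)

# Build a new string instead of changing the old one
lowered = "d" + cityname[1:]
print(lowered)          # dublin
```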

We can use the built-in *len()* function to check the length of a string (i.e. how many characters it contains):

In [None]:
len(cityname)

Strings can be concatenated using the + operator:

In [None]:
address + ", " + cityname

**Special characters**: A backslash introduces a special character. The backslash together with the character that follows it is called an *escape sequence*. For instance, '\t' represents a tab character and '\n' represents a newline character.

In [None]:
s = "\tIreland\n\tGermany"

In [None]:
print(s)
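
An escape sequence is typed as two characters but stored as one. The short sketch below checks this, and also shows the doubled backslash needed for a literal backslash. The `path` value is a made-up example, not something from this notebook.

```python
s = "\tIreland\n\tGermany"

print(len("\t"))      # 1: the escape sequence is a single character
print(s.count("\n"))  # 1: the string contains one newline

# A literal backslash is itself written as an escape sequence: \\
path = "C:\\temp\\notes"  # hypothetical path, for illustration only
print(path)               # C:\temp\notes
```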

### String Functions

Strings have a range of associated functions to perform basic operations on them. Note: these functions return a modified copy of the string; they do not change the original!

For example, you can change the case of the characters in a string:

In [None]:
country = "Ireland"
country.upper()  # make a copy of country, but in upper case characters

In [None]:
country.lower() # make a copy of country, but in lower case characters

We can remove leading and trailing whitespace characters (spaces, tabs and line breaks) from a string with the *strip()* function:

In [None]:
s = "Hello World   "
s.strip()
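
To trim only one side, the related *lstrip()* and *rstrip()* functions can be used. A small sketch, using a separate variable `padded` (a name chosen here for illustration) so that `s` is left untouched:

```python
padded = "   Hello World   "

both = padded.strip()    # 'Hello World'       (both sides trimmed)
left = padded.lstrip()   # 'Hello World   '    (leading whitespace only)
right = padded.rstrip()  # '   Hello World'    (trailing whitespace only)

print(repr(both), repr(left), repr(right))
print(repr(padded))      # the original string is unchanged
```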

Strings have simple functions for finding and replacing characters and substrings (i.e. strings contained within other strings). We can search for the index (position) of a substring within a longer string using the *find()* function:

In [None]:
s.find("World")  # what index does that substring 'World' start at?

If we try to find a substring that does not exist within another string, we get an index of -1.

In [None]:
s.find("bye")  
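
Because -1 is a valid return value rather than an error, code that uses *find()* normally has to test for it. The sketch below shows that check, along with two related options: the `in` operator for a simple yes/no test, and the optional second argument of *find()*, which sets the index where the search starts. The variable `text` is introduced here for illustration.

```python
text = "Hello World"

# Yes/no membership test
print("World" in text)              # True

# Find the first 'o', then search again starting just after it
first_o = text.find("o")            # 4
second_o = text.find("o", first_o + 1)  # 7

# Always check for -1 before using the result as an index
pos = text.find("bye")
if pos == -1:
    print("substring not found")
else:
    print("found at index", pos)
```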

We can count the number of times a character or substring occurs in a longer string using the *count()* function:

In [None]:
s.count("o")

We can also replace characters or substrings in a string. Note: the *replace()* function returns a copy of the string; it does not change the original string.

In [None]:
rep = s.replace("l","z")
print( s )    # original string
print( rep )  # the new copy, where replacements were made

We can separate a single string into a list of one or more strings based on a *delimiter* (a separator character or string) using the *split()* function:

In [None]:
names = "lisa;john;alex;alice"
names.split(";")  # split this string based on the ; character

In [None]:
words = "the python programming language"
words.split(" ")  # split the string based on the space character
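
Splitting on a single space can give surprising results when the text contains repeated spaces, because each space counts as its own delimiter. Calling *split()* with no argument splits on any run of whitespace instead. A brief sketch, with `messy` as an illustrative variable:

```python
messy = "the  python programming"  # two spaces after 'the'

by_space = messy.split(" ")   # every single space is a delimiter
by_whitespace = messy.split() # any run of whitespace is one delimiter

print(by_space)       # ['the', '', 'python', 'programming']
print(by_whitespace)  # ['the', 'python', 'programming']
```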

In reverse, we can merge a list containing multiple strings into a single string using the *join()* function, where the values from the list are separated by a specified character or string.

In [None]:
cities = ["dublin","cork","galway"]
" & ".join( cities )   # note, the function is called on the separator string!

In [None]:
initials = ["J", "R", "R"]
".".join( initials )

### Converting Between Types

Recall that mixing incompatible types is not permitted in Python, so trying to concatenate a string and a number raises an error:

In [None]:
"my age is " + 30

Instead, we use conversion functions to change a value between basic types in Python. Use the built-in *str()* function to convert any variable to a string:

In [None]:
str(30)  # change an integer to a string
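
Applied to the failing concatenation above, the fix is to convert the number first. A minimal sketch, with `age` and `message` as illustrative names:

```python
age = 30

# Convert the integer to a string before concatenating
message = "my age is " + str(age)
print(message)  # my age is 30
```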

We can convert string values to other types using various built-in functions - most commonly *int()* to convert to an integer and *float()* to convert to a floating point (real) value:

In [None]:
int("3500")

In [None]:
float("5.04")

Not every string can be converted to a number; an unsuitable string raises an error:

In [None]:
int("UCD")
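
When the input might not be numeric, one common approach is to catch the `ValueError` that *int()* raises. The helper below, `to_int`, is a hypothetical function written for this example (it is not a built-in), and returning a default value is just one possible design choice.

```python
def to_int(text, default=None):
    """Convert text to an integer, or return default if that is not possible."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for strings that are not whole numbers
        return default

print(to_int("3500"))     # 3500
print(to_int("UCD"))      # None
print(to_int("UCD", 0))   # 0
print(to_int("5.04"))     # None: int() does not accept a decimal string
```

A decimal string such as "5.04" can still be turned into an integer in two steps, for example `int(float("5.04"))`, which discards the fractional part.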

### Formatting Strings

There are several ways to format output in Python. The simplest is to pass a comma-separated list of values to the *print()* function, which prints them separated by spaces:

In [None]:
teamx = 3
teamy = 2
mins = 35
print( "The score was", teamx, "-", teamy, "after", mins, "minutes" )

A more flexible way is to build a single string from values of different types using the *%* operator. The *format string* is the recipe for the result and contains zero or more *placeholders*, which are replaced by the values that you provide after the % symbol.

Basic usage is: a format string, followed by the '%' character, then one or more values in parentheses. The number of placeholders in the format string must equal the number of values!

In [None]:
"%s and %s" % ( "Alex", "Alice" )   # 2 placeholders, 2 values

Special placeholder codes are used when building a format string. Each placeholder should correspond to the type of the value that will replace it.
- %d: integer
- %f: floating point, with default precision
- %.Nf: floating point to *N* decimal places
- %s: a string (or any value)
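
Both rules, matching counts and matching types, are enforced when the string is built. The sketch below triggers each failure deliberately and catches the resulting `TypeError`. The variable names are illustrative, and the exact wording of the error messages may differ between Python versions.

```python
# Too few values for the placeholders
try:
    "%s and %s" % ("Alex",)
except TypeError as err:
    too_few = str(err)

# Too many values for the placeholders
try:
    "%s and %s" % ("Alex", "Alice", "Lisa")
except TypeError as err:
    too_many = str(err)

# Wrong type: %d expects a number, not a string
try:
    "%d" % "2009"
except TypeError as err:
    wrong_type = str(err)

print(too_few)
print(too_many)
print(wrong_type)

# %s accepts any value, so it is the safe choice when the type is unknown
print("%s" % 2009)  # 2009
```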

In [None]:
director_name = "Zack Snyder"
movie_name = "Watchmen"
year = 2009
rating = 7.6

In [None]:
s = "%s directed %s in %d, which has IMDB rating %f" % ( director_name, movie_name, year, rating )

In [None]:
print(s)
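
With the default *%f* placeholder, the rating is printed with six decimal places. As a sketch of how the *%.Nf* form tidies this up, the example below builds the same sentence twice, using the names `default` and `tidy` (chosen here for illustration) so that `s` is not overwritten.

```python
director_name = "Zack Snyder"
movie_name = "Watchmen"
year = 2009
rating = 7.6

values = (director_name, movie_name, year, rating)

default = "%s directed %s in %d, which has IMDB rating %f" % values
tidy = "%s directed %s in %d, which has IMDB rating %.1f" % values

print(default)  # Zack Snyder directed Watchmen in 2009, which has IMDB rating 7.600000
print(tidy)     # Zack Snyder directed Watchmen in 2009, which has IMDB rating 7.6
```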

Often string formatting can be used to round or "tidy" floating point values:

In [None]:
x = 1.28353

In [None]:
"%f => %0.f or %.1f or %.2f or %.3f" % ( x, x, x, x, x )