## Python Strings
### Working with Strings
A string variable in Python can be created with the assignment operator = and by enclosing the text in either single or double quotes. Either is fine, but the start and end quotes must be the same!

In [1]:
cityname = "Dublin"
address = 'University College Dublin, Belfield'

Strings that span multiple lines can be defined in several different ways. One way is to include a backslash character \ as the last character on the line:

In [2]:
s1 = "This is all \
part of the same string"
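As a quick check, the sketch below confirms that the line break after the backslash does not end up in the string. It also shows a related alternative, not used elsewhere in this notebook, where adjacent string literals inside parentheses are joined together; the name `s1_alt` is just an illustrative choice.

```python
# Backslash continuation: the line break is not stored in the string
s1 = "This is all \
part of the same string"
print(s1)            # This is all part of the same string
print("\n" in s1)    # False

# Assumed alternative: adjacent literals inside parentheses are joined
s1_alt = ("This is all "
          "part of the same string")
print(s1 == s1_alt)  # True
```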

Alternatively, we can use triple quotes (""" or ''') to enclose a multi-line string:

In [4]:
s2 = """
This also has multiple lines
defined in a different way
"""

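Strings can be viewed as sequences of individual characters, similar to lists. We can apply many standard list operations and functions to strings, such as accessing individual characters by index or slicing. Unlike lists, strings are immutable, so their characters cannot be changed in place.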

In [5]:
cityname[0]

'D'

In [6]:
address[0:3]

'Uni'
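A few more indexing and slicing forms may be useful here. The following is a minimal sketch that reuses the `cityname` value from above; it also illustrates one way strings differ from lists, namely that assigning to an index is not allowed.

```python
cityname = "Dublin"

print(cityname[-1])   # negative indexes count from the end: n
print(cityname[2:])   # from index 2 to the end: blin
print(cityname[:3])   # from the start up to (not including) index 3: Dub

# Strings are immutable, so assigning to an index raises a TypeError
try:
    cityname[0] = "d"
except TypeError as err:
    print("Cannot modify a string in place:", err)
```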

We can use the built-in len() function to check the length of a string (i.e. how many characters it contains):

In [7]:
len(cityname)

6

Strings can be concatenated together using the + operator:

In [8]:
address + ", " + cityname

'University College Dublin, Belfield, Dublin'

#### Special Characters
Backslashes are used to introduce special characters. A backslash followed by a character code is referred to as an escape sequence. For instance '\t' represents a tab character and '\n' represents a newline character.

In [9]:
s = "\tIreland\n\tGermany"

In [10]:
print(s)

	Ireland
	Germany
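Because the backslash has a special meaning, a literal backslash or quote character needs some care. The sketch below uses a hypothetical file path (not from this notebook) to show two common options: doubling the backslash, or using a raw string with the `r` prefix.

```python
# Hypothetical Windows-style path, used only to show literal backslashes
escaped = "C:\\temp\\notes.txt"   # each \\ yields a single backslash
raw = r"C:\temp\notes.txt"        # raw string: no escape processing
print(escaped)           # C:\temp\notes.txt
print(escaped == raw)    # True

# Escaping a quote so it can appear inside a string with the same delimiter
quoted = "She said \"hello\""
print(quoted)            # She said "hello"
```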


#### String Functions
Strings have a range of associated functions (methods) to perform basic operations on them. Note: These functions return a new copy of the string; they do not change the original!

For example, you can change the case of the characters in a string:

In [11]:
country = "Ireland"
country.upper()  # make a copy of country, but in upper case characters

'IRELAND'

In [12]:
country.lower() # make a copy of country, but in lower case characters

'ireland'
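To make the "copy" behaviour concrete, here is a small sketch. The variable name `shouted` is an illustrative choice; the point is that `country` keeps its value until we explicitly reassign it.

```python
country = "Ireland"
shouted = country.upper()
print(shouted)   # IRELAND
print(country)   # Ireland (the original is unchanged)

# To keep the result under the same name, reassign it explicitly
country = country.lower()
print(country)   # ireland
```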

We can remove leading and trailing whitespace characters (spaces, tabs and line breaks) from strings with the strip() function:

In [13]:
s = "Hello World   "
s.strip()

'Hello World'
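If only one side of the string should be cleaned up, the related lstrip() and rstrip() functions can be used. A minimal sketch, with an illustrative variable `padded` that has spaces on both sides:

```python
padded = "   Hello World   "

# repr() is used so the remaining spaces are visible in the output
print(repr(padded.strip()))    # 'Hello World'
print(repr(padded.lstrip()))   # 'Hello World   '
print(repr(padded.rstrip()))   # '   Hello World'
```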

Strings have simple functions for finding and replacing characters or substrings (i.e. strings contained within other strings). We can search for the index (position) of a substring within a longer string using the find() function:

In [14]:
s.find("World")  # what index does that substring 'World' start at?

6

If we try to find a substring that does not exist within another string, we get an index of -1.

In [16]:
s.find("bye")

-1
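When we only need to know whether a substring is present, and not where, the `in` operator is often simpler. The sketch below shows both approaches. Note that -1 is also a valid index (the last character), so the result of find() should be compared against -1 before it is used as a position.

```python
s = "Hello World"

print("World" in s)   # True
print("bye" in s)     # False

position = s.find("bye")
if position == -1:
    print("substring not found")
else:
    print("found at index", position)
```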

We can count the number of times a character or substring occurs in a longer string using the count() function:

In [18]:
s.count("o")

2

We can also replace characters or substrings in another string. Note: The replace() function makes a copy of the string; it does not change the original string.

In [19]:
rep = s.replace("l","z")
print( s )    # original string
print( rep )  # the new copy, where replacements were made

Hello World   
Hezzo Worzd   
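The replace() function also accepts an optional third argument that limits how many occurrences are replaced. A short sketch, using a version of the string without the trailing spaces:

```python
s = "Hello World"

first_only = s.replace("l", "z", 1)   # replace only the first "l"
print(first_only)                     # Hezlo World

whole_word = s.replace("World", "Dublin")   # substrings work too
print(whole_word)                           # Hello Dublin
```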


We can separate a single string into a list of one or more strings based on a delimiter (a separator character or string) using the split() function:

In [20]:
names = "lisa;john;alex;alice"
names.split(";")  # split this string based on the ; character

['lisa', 'john', 'alex', 'alice']

In [21]:
words = "the python programming language"
words.split(" ")  # split the string based on the space character

['the', 'python', 'programming', 'language']
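One detail worth knowing: splitting on a single space produces empty strings when the input contains repeated spaces, whereas calling split() with no argument splits on any run of whitespace. The string `messy` below is an illustrative example, not taken from the cells above.

```python
messy = "the  python   programming language"

by_space = messy.split(" ")   # every single space is a delimiter
print(by_space)
# ['the', '', 'python', '', '', 'programming', 'language']

by_whitespace = messy.split()  # runs of whitespace count as one delimiter
print(by_whitespace)
# ['the', 'python', 'programming', 'language']
```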

In reverse, we can merge a list containing multiple strings into a single string using the join() function, where the values from the list are separated by a specified character or string.

In [22]:
l = ["dublin","cork","galway"]
" & ".join( l )   # note, the function is called on the separator string!

'dublin & cork & galway'

In [23]:
initials = ["J", "R", "R"]
".".join( initials )

'J.R.R'
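The join() function expects every item in the list to be a string. If the list holds numbers, one possible approach is to convert each item with str() first, as in this sketch (the `scores` list is an illustrative example):

```python
scores = [3, 2, 5]

# ", ".join(scores) would raise a TypeError because the items are integers
joined = ", ".join(str(n) for n in scores)
print(joined)   # 3, 2, 5

# split() and join() with the same separator undo each other
names = "lisa;john;alex;alice"
print(";".join(names.split(";")) == names)   # True
```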

### Converting Between Types
Recall that mixing incompatible types is not permitted in Python, so trying to concatenate a string and a number will give an error message.

In [24]:
"my age is " + 30

TypeError: must be str, not int

Instead, we use conversion functions to change a value between basic types in Python. Use the built-in str() function to convert any variable to a string:

In [25]:
str(30)  # change an integer to a string

'30'
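Applying this to the failed concatenation above, converting the number first makes the operation valid. A minimal sketch, with `age` and `message` as illustrative variable names:

```python
age = 30
message = "my age is " + str(age)   # both operands are now strings
print(message)   # my age is 30
```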

We can convert string values to other types using various built-in functions - most commonly int() to convert to an integer and float() to convert to a floating point (real) value:

In [26]:
int("3500")

3500

In [27]:
float("5.")

5.0

Not all strings are suitable for conversion to a number. Trying to convert an unsuitable string will result in an error message:

In [28]:
int("UCD")

ValueError: invalid literal for int() with base 10: 'UCD'
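When the input might not be a valid number, one common pattern is to catch the ValueError and fall back to a default value. The helper `to_int` below is a hypothetical function written for illustration; it is not a built-in.

```python
def to_int(text, default=None):
    """Return text converted to an integer, or default if it is not valid."""
    try:
        return int(text)
    except ValueError:
        return default

print(to_int("3500"))     # 3500
print(to_int("UCD"))      # None
print(to_int("UCD", 0))   # 0
print(to_int("5."))       # None, since int() does not accept a decimal point
```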

### Formatting Strings
There are many different ways of nicely formatting output in Python. The simplest way is to use the print() function with a comma-separated list of values to print out:

In [29]:
teamx = 3
teamy = 2
mins = 35
print( "The score was", teamx, "-", teamy, "after", mins, "minutes" )

The score was 3 - 2 after 35 minutes


A more flexible way is to combine multiple values of different types into a single string using the % operator. The format string provides the recipe to build the string and contains zero or more placeholders. The placeholders are replaced by the values that you provide after the % symbol.

Basic usage is: a format string, followed by the % character, then one or more values in parentheses. The number of placeholders in the format string must equal the number of values!

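In [30]:
"%s and %s" % ( "Alex", "Alice" )   # 2 placeholders, 2 values
print("My university name is {} , {}".format("UCD", 55))  # str.format() fills the {} placeholders in order
help(format)  # documentation for the built-in format() function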

My university name is UCD , 55
Help on built-in function format in module builtins:

format(value, format_spec='', /)
    Return value.__format__(format_spec)
    
    format_spec defaults to the empty string.
    See the Format Specification Mini-Language section of help('FORMATTING') for
    details.



Special placeholder codes are used when building a format string. Each placeholder should correspond to the type of the value that will replace it.

%d: integer

%f: floating point, with default precision

%.Nf: floating point to N decimal places

%s: a string (or any value)
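The placeholder codes listed above can be combined in a single format string. The following sketch uses made-up values (`name`, `age`, `height`) purely for illustration:

```python
name = "Alice"
age = 30
height = 1.6789

# %s for the string, %d for the integer, %.2f for two decimal places
summary = "%s is %d years old and %.2f m tall" % (name, age, height)
print(summary)          # Alice is 30 years old and 1.68 m tall

print("%f" % height)    # default precision is six decimals: 1.678900
print("%s" % age)       # %s accepts any value: 30
```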

In [44]:
director_name = "Zack Snyder"
movie_name = "Watchmen"
year = 2009
rating = 7.6

In [46]:
s = "{0} directed {1} in {2}, which has IMDB rating {3}".format(director_name, movie_name, year, rating)

In [47]:
print(s)

Zack Snyder directed Watchmen in 2009, which has IMDB rating 7.6


In [48]:
x = 1.28353

In [49]:
"%f => %.0f or %.1f or %.2f or %.3f" % ( x, x, x, x, x )  # same value at different precisions

'1.283530 => 1 or 1.3 or 1.28 or 1.284'
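The same precision control is available with the format() function used earlier, by placing a format specification after a colon inside the braces. This is a sketch of the equivalent call rather than a replacement for the % version above; the names `formatted` and `reused` are illustrative.

```python
x = 1.28353

formatted = "{:f} => {:.0f} or {:.1f} or {:.2f} or {:.3f}".format(x, x, x, x, x)
print(formatted)   # 1.283530 => 1 or 1.3 or 1.28 or 1.284

# A positional index lets one value be reused with different precisions
reused = "{0:.1f} and {0:.3f}".format(x)
print(reused)      # 1.3 and 1.284
```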