<small><small><i>
All of these python notebooks are available at [https://gitlab.erc.monash.edu.au/andrease/Python4Maths.git]
</i></small></small>

# Working with strings

## The Print Statement

As seen previously, the **print()** function prints all of its arguments as strings, separated by spaces and followed by a line break:

    - print("spam eggs")
    - print("spam",'eggs')
    - print("spam", <Variable Containing the String>)

Note that **print** is different in old versions of Python (2.7), where it was a statement and did not need parentheses around its arguments.

In [3]:
'spam eggs'  # single quotes

'spam eggs'

In [4]:
'doesn\'t'  # use \' to escape the single quote...

"doesn't"

In [6]:
"doesn't"  # ...or use double quotes instead

"doesn't"

In [7]:
'"Yes," they said.'

'"Yes," they said.'

In [8]:
"\"Yes,\" they said."

'"Yes," they said.'

In [10]:
'"Isn\'t," they said.'

'"Isn\'t," they said.'

In [12]:
"Isn\'t, they said."

"Isn't, they said."

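The **print()** function has optional keyword arguments that control where and how it prints: `sep`, the separator placed between arguments (default: a single space); `end`, the string appended after the last argument (default: a newline); and `file`, the file-like object to write to (default: standard output).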

In [13]:
print("spam", "eggs", sep='...', end='!!')

spam...eggs!!
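
The `file` argument can be illustrated with a short sketch. To avoid leaving a file on disk, this example assumes an in-memory `io.StringIO` buffer as a stand-in for a real file; the names `buffer` and `written` are chosen only for illustration.

```python
import io

# An in-memory text buffer stands in for a real file (an assumption
# made so that the example writes nothing to disk).
buffer = io.StringIO()

# Same call as before, but the output goes to the buffer, not the screen.
print("spam", "eggs", sep=", ", end="\n", file=buffer)

written = buffer.getvalue()
print(repr(written))  # 'spam, eggs\n'
```

Any object with a `write()` method that accepts strings, such as a file opened with `open(..., 'w')` or `sys.stderr`, can be passed in the same way.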

If you don’t want characters prefaced by \ to be interpreted as special characters, you can use raw strings by adding an r before the first quote:

In [14]:
print('C:\some\name')  # here \n means newline!

C:\some
ame


In [15]:
print(r'C:\some\name')  # note the r before the quote

C:\some\name
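
As a small check of what the `r` prefix does, the sketch below compares a raw string with the same text written using doubled backslashes. The variable names are illustrative only.

```python
raw_path = r'C:\some\name'        # backslashes are kept literally
escaped_path = 'C:\\some\\name'   # each \\ is one literal backslash

print(raw_path == escaped_path)   # True: both spell the same 12 characters
print(len('\n'), len(r'\n'))      # 1 2: a newline versus backslash + 'n'
```

Both spellings produce the same string object contents; the raw form is usually easier to read when there are many backslashes.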


## String Indexing

Strings can be indexed (subscripted), with the first character having index 0. There is no separate character type; a character is simply a string of size one:

In [17]:
word = 'Python'
word[0]  # character in position 0

'P'

In [18]:
word[5]  # character in position 5

'n'

In [19]:
word[-1]  # last character

'n'

In [20]:
word[-2]  # second-last character

'o'

In [22]:
word[2:5]  # characters from position 2 (included) to 5 (excluded)

'tho'

In [23]:
word[:2] + word[2:]

'Python'

In [26]:
word[42]  # the word only has 6 characters

IndexError: string index out of range

In [27]:
word[4:42]

'on'

Python strings are immutable: they cannot be changed. Therefore, assigning to an indexed position in the string results in an error:

In [28]:
word[0] = 'J'

TypeError: 'str' object does not support item assignment

If you need a different string, you should create a new one:

In [29]:
word[:2] + 'py'

'Pypy'
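
Applied to the failed assignment above, one possible way to get a string that starts with `'J'` is sketched here. It builds new strings by slicing or with `str.replace` and leaves `word` untouched; the names `jython` and `replaced` are illustrative.

```python
word = 'Python'

# Build new strings; the original is never modified.
jython = 'J' + word[1:]             # new first letter + rest of the word
replaced = word.replace('P', 'J')   # replace() also returns a new string

print(jython, replaced, word)       # Jython Jython Python
```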

## String Formatting

There are lots of methods for formatting and manipulating strings built into Python. Some of these are illustrated here.

String concatenation is the "addition" of two strings. Note that no space is inserted between the concatenated strings.

In [16]:
string1='eggs'
string2='!'
print('spam' + string1 + string2)

spameggs!


The newer style of string formatting uses the string method "format". There are multiple ways of using this method.

Accessing arguments by position:

In [30]:
print('{0}, {1}, {2}'.format('a', 'b', 'c'))
print('{}, {}, {}'.format('a', 'b', 'c'))  # 3.1+ only
print('{2}, {1}, {0}'.format('a', 'b', 'c'))
print('{2}, {1}, {0}'.format(*'abc'))      # unpacking argument sequence
print('{0}{1}{0}'.format('abra', 'cad'))   # arguments' indices can be repeated

a, b, c
a, b, c
c, b, a
c, b, a
abracadabra


Accessing arguments by name:

In [31]:
print('Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W'))
coord = {'latitude': '37.24N', 'longitude': '-115.81W'}
print('Coordinates: {latitude}, {longitude}'.format(**coord))

Coordinates: 37.24N, -115.81W
Coordinates: 37.24N, -115.81W


We can also specify the width of the field and the number of decimal places to be used. For example:
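
A minimal sketch of width and precision, using the format specification that follows the colon inside the braces. The values and variable names are made up for illustration.

```python
value = 3.14159

right = '{:>10.2f}'.format(value)   # width 10, right-aligned, 2 decimals
left = '{:<8}'.format('spam')       # width 8, left-aligned
centred = '{:^8}'.format('eggs')    # width 8, centred
padded = '{:05d}'.format(42)        # width 5, padded with leading zeros

# Brackets make the padding visible.
for field in (right, left, centred, padded):
    print('[' + field + ']')

# [      3.14]
# [spam    ]
# [  eggs  ]
# [00042]
```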

## Other String Methods

Multiplying a string by an integer simply repeats it:

In [35]:
print("spam eggs! "*5)

spam eggs! spam eggs! spam eggs! spam eggs! spam eggs! 


Strings can be transformed by a variety of methods:

In [34]:
s="Spam EggS"
print(s.capitalize())
print(s.upper())
print(s.lower())
print( "     lots of space             ".strip()) # remove leading and trailing whitespace
print(s.replace("Spam", "Bacon"))

Spam eggs
SPAM EGGS
spam eggs
lots of space
Bacon EggS


There are also lots of ways to inspect or check strings. Examples of a few of these are given here:
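
The selection below is one possible sample of such checks, reusing the string `s` from the previous cell. All of them are standard `str` methods or built-ins; the result names are illustrative.

```python
s = "Spam EggS"

starts = s.startswith("Spam")    # True
ends = s.endswith("eggs")        # False: the check is case-sensitive
position = s.find("Egg")         # 5, the index where "Egg" begins
missing = s.find("bacon")        # -1 means "not found"
count_g = s.count("g")           # 2 occurrences of lowercase "g"
length = len(s)                  # 9 characters, including the space
all_letters = s.isalpha()        # False: the space is not a letter
all_digits = "2024".isdigit()    # True

print(starts, ends, position, missing, count_g, length, all_letters, all_digits)
```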

## String Comparison Operations

Strings can be compared in lexicographical order with the usual comparisons. In addition the `in` operator checks for substrings:

In [29]:
'abc' < 'bbc' <= 'bbc'

True

In [30]:
"ABC" in "This is the ABC of Python"

True
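
One consequence worth seeing in code: the comparisons work on character code points, so every uppercase ASCII letter sorts before every lowercase one, and `in` is case-sensitive. The sketch below uses made-up example words.

```python
# 'Z' has code point 90 and 'a' has 97, so 'Zebra' sorts first.
upper_first = 'Zebra' < 'apple'                    # True
same_case = 'Zebra'.lower() < 'apple'.lower()      # False once case is ignored

# A string sorts before any longer string that it is a prefix of.
prefix_first = 'spam' < 'spams'                    # True

# Substring checks are case-sensitive too.
found = 'abc' in 'This is the ABC of Python'       # False

print(upper_first, same_case, prefix_first, found)
```

Converting both sides with `lower()` before comparing is a simple way to get case-insensitive ordering for plain ASCII text.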