## Strings

Strings can also be treated as collections, similar to lists and tuples. For example:

In [15]:
S = 'Hello World!!'

print([x for x in S if x.islower()]) # list of lower case characters

words=S.split() # list of words

print("Words are:",words)

print("--".join(words)) # hyphenated 
" ".join(w.capitalize() for w in words) # capitalise words

['e', 'l', 'l', 'o', 'o', 'r', 'l', 'd']
Words are: ['Hello', 'World!!']
Hello--World!!


'Hello World!!'

## String Formatting

There are lots of methods for formatting and manipulating strings built into Python. Some of these are illustrated here.

String concatenation is the "addition" of two strings. Note that no space is inserted between the strings being concatenated.

In [16]:
string1='World'
string2='!'
print('Hello' + string1 + string2)

HelloWorld!
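As a small sketch of the point about spaces: the separator has to be supplied explicitly, either as its own string or through `join`. The names `greeting` and `name` are made up for this illustration.

```python
greeting = 'Hello'
name = 'World'

# '+' glues the strings together exactly as given, so the space is added by hand
with_plus = greeting + ' ' + name + '!'
print(with_plus)

# join() places the separator between the items of a sequence
with_join = ' '.join([greeting, name]) + '!'
print(with_join)
```

Both lines print `Hello World!`.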


The **%** operator formats a string by inserting the value that follows it. The string must contain a format specifier that marks where the value is inserted and how it is converted. The most common format specifiers are:

    - %s -> string
    - %d -> integer
    - %f -> float
    - %o -> octal
    - %x -> hexadecimal
    - %e -> exponential notation

In [17]:
print("Hello %s" % string1)

print("Actual Number = %d" %18)

print("Float of the number = %f" %18)

print("Octal equivalent of the number = %o" %18)

print("Hexadecimal equivalent of the number = %x" %18)

print("Exponential equivalent of the number = %e" %18)

Hello World
Actual Number = 18
Float of the number = 18.000000
Octal equivalent of the number = 22
Hexadecimal equivalent of the number = 12
Exponential equivalent of the number = 1.800000e+01
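The examples above insert a single value each. As a brief sketch, several values can be supplied as a tuple, and a literal percent sign is written as `%%`. The variable names here are chosen only for illustration.

```python
name = 'World'
count = 18

# several values are passed as a tuple and matched to the specifiers left to right
line = "Hello %s, %d in hex is %x" % (name, count, count)
print(line)

# a literal percent sign inside a format string is written as %%
percent = "%d%% done" % 50
print(percent)
```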


We can also specify the width of the field and the number of decimal places to be used. For example:

In [18]:
print('Print width 10: |%10s|'%'x')

print('Print width 10: |%-10s|'%'x') # left justified

print("The number pi = %.2f to 2 decimal places"%3.1415)

print("More space pi = %10.2f"%3.1415)

print("Pad pi with 0 = %010.2f"%3.1415) # pad with zeros

Print width 10: |         x|
Print width 10: |x         |
The number pi = 3.14 to 2 decimal places
More space pi =       3.14
Pad pi with 0 = 0000003.14
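One place where field widths are handy is lining up columns of text. The following is a minimal sketch using made-up data (`rows` is not part of the original examples): a left-justified 12-character name column next to a right-justified 8-character price column with two decimals.

```python
# hypothetical (item, price) data, used only to illustrate alignment
rows = [('apple', 0.5), ('watermelon', 3.25), ('fig', 12.0)]

# %-12s pads the name on the right, %8.2f pads the number on the left
lines = ['|%-12s|%8.2f|' % (item, price) for item, price in rows]
for line in lines:
    print(line)
```

Every line has the same length because each field is padded to its fixed width.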


## Other String Methods

In [19]:
print("Hello World! "*5)

Hello World! Hello World! Hello World! Hello World! Hello World! 


Strings can be transformed by a variety of methods:

In [20]:
s="hello wOrld"
print(s.capitalize())
print(s.upper())
print(s.lower())
print('|%s|' % "Hello World".center(30)) # center in 30 characters
print('|%s|'% "     lots of space             ".strip()) # remove leading and trailing whitespace
print("Hello World".replace("World","Class"))

Hello world
HELLO WORLD
hello world
|         Hello World          |
|lots of space|
Hello Class


There are also lots of ways to inspect or check strings. Examples of a few of these are given here:

In [21]:
s="Hello World"
print("The length of '%s' is"%s,len(s),"characters") # len() gives length
s.startswith("Hello") and s.endswith("World") # check start/end (True here, but not displayed as it is not the last line of the cell)
# count occurrences of a substring
print("There are %d 'l's but only %d World in %s" % (s.count('l'),s.count('World'),s))
print('"el" is at index',s.find('el'),"in",s) # index counted from 0, or -1 if not found

The length of 'Hello World' is 11 characters
There are 3 'l's but only 1 World in Hello World
"el" is at index 1 in Hello World

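The `-1` result of `find` is easy to overlook, so here is a short sketch of how a missing substring can be handled. The search strings are chosen only for illustration.

```python
s = "Hello World"

pos = s.find('World')        # index of the first match
missing = s.find('Python')   # -1, because 'Python' does not occur in s

if missing == -1:
    print("'Python' not found in", s)
print("'World' starts at index", pos)
```

The related `index` method behaves like `find` but raises a `ValueError` instead of returning -1.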

## String comparison operations
Strings can be compared in lexicographical order with the usual comparison operators. In addition, the `in` operator checks for substrings:

In [22]:
'abc' < 'bbc' <= 'bbc'

True

In [23]:
"ABC" in "This is the ABC of Python"

True
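A few further comparisons, as a sketch of what lexicographical order means in practice: characters are compared by their code points, so upper case letters sort before lower case ones, and both comparisons and `in` are case sensitive. The example strings are made up for illustration.

```python
prefix_first = 'abc' < 'abcd'        # a prefix sorts before the longer string
upper_first = 'Zebra' < 'apple'      # 'Z' (code point 90) comes before 'a' (97)
ignore_case = 'Zebra'.lower() < 'apple'.lower()   # compare without regard to case
case_sensitive = 'abc' in 'This is the ABC of Python'

print(prefix_first, upper_first, ignore_case, case_sensitive)
```

This prints `True True False False`.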

## Accessing parts of strings

Strings can be indexed with square brackets. Indexing starts from zero in Python. 

In [24]:
s = '123456789'
print('First character of',s,'is',s[0])
print('Last character of',s,'is',s[len(s)-1])

First character of 123456789 is 1
Last character of 123456789 is 9


Negative indices can be used to count from the back:

In [25]:
print('First character of',s,'is',s[-len(s)])
print('Last character of',s,'is',s[-1])

First character of 123456789 is 1
Last character of 123456789 is 9


Finally, a substring (range of characters) can be specified using $a:b$ to select the characters at index $a,a+1,\ldots,b-1$. Note that the character at index $b$ is *not* included.

In [26]:
print("First three characters",s[0:3])
print("Next three characters",s[3:6])

First three characters 123
Next three characters 456


In [27]:
print("First three characters", s[:3])
print("Last three characters", s[-3:])

First three characters 123
Last three characters 789
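To make the half-open range concrete, here is a small sketch that rebuilds a slice character by character. The values of `a` and `b` are arbitrary.

```python
s = '123456789'
a, b = 2, 5

piece = s[a:b]                                  # characters at index 2, 3 and 4
rebuilt = ''.join(s[i] for i in range(a, b))    # the same characters, one at a time
print(piece, rebuilt)

# because index b is excluded, the slice has exactly b - a characters ...
print(len(piece) == b - a)
# ... and splitting a string at any index loses nothing
print(s[:a] + s[a:] == s)
```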


## Strings are immutable

It is important to remember that strings are constant, immutable values in Python. While new strings can easily be created, it is not possible to modify an existing string:

In [28]:
s='012345'
sX=s[:2]+'X'+s[3:] # this creates a new string with 2 replaced by X
print("creating new string",sX,"OK")
sX=s.replace('2','X') # the same thing
print(sX,"still OK")
s[2] = 'X' # an error!!!

creating new string 01X345 OK
01X345 still OK


TypeError: 'str' object does not support item assignment
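The slicing approach can be wrapped in a small helper. This is only a sketch: `replace_at` is a hypothetical name, not a built-in, and it assumes a valid non-negative index.

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced."""
    return text[:index] + new_char + text[index + 1:]

s = '012345'
t = replace_at(s, 2, 'X')
print(s, t)   # the original string is left unchanged

# assigning to an index still fails; the error can be caught like any other
try:
    s[2] = 'X'
except TypeError as err:
    print("Not allowed:", err)
```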

## Dictionaries

Dictionaries are mappings between keys and the items stored against them. Alternatively, one can think of a dictionary as a set in which something is stored against every element of the set.

To define an empty dictionary, assign { } or dict() to a variable:

In [29]:
d = dict() # or equivalently d={}
print(type(d))
d['abc'] = 3
d[4] = "A string"
print(d)

<class 'dict'>
{'abc': 3, 4: 'A string'}


As can be guessed from the output above, dictionaries can also be defined using the `{ key : value }` syntax. The following dictionary has three elements:

In [30]:
d = { 1: 'One', 2 : 'Two', 100 : 'Hundred'}
len(d)

3

Now 'One' can be accessed through its key, 1:

In [31]:
print(d[1])

One


There are a number of alternative ways of specifying a dictionary, including as a list of `(key,value)` tuples.
To illustrate this we start with two related lists and form a list of tuples from them using the **zip()** function.
These tuples can then be merged to form a dictionary.

In [32]:
names = ['One', 'Two', 'Three', 'Four', 'Five']
numbers = [1, 2, 3, 4, 5]
[ (name,number) for name,number in zip(names,numbers)] # create (name,number) pairs

[('One', 1), ('Two', 2), ('Three', 3), ('Four', 4), ('Five', 5)]

Now we can create a dictionary that maps the name to the number as follows.

In [33]:
a1 = dict((name,number) for name,number in zip(names,numbers))
print(a1)

{'One': 1, 'Two': 2, 'Three': 3, 'Four': 4, 'Five': 5}
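Since `zip` already yields `(name, number)` pairs, the same mapping can be sketched more briefly by passing it straight to `dict`. Swapping the pair in a comprehension gives the reverse mapping. The names `by_name` and `by_number` are introduced here only for illustration.

```python
names = ['One', 'Two', 'Three', 'Four', 'Five']
numbers = [1, 2, 3, 4, 5]

# dict() accepts any iterable of (key, value) pairs, including zip() itself
by_name = dict(zip(names, numbers))
print(by_name)

# swapping key and value gives the mapping in the other direction
by_number = {number: name for name, number in zip(names, numbers)}
print(by_number[3])
```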


Note that from Python 3.7 onwards dictionaries preserve insertion order, which is why `a1` prints in the order in which the pairs were added. Older versions of Python used their own ordering (based on hash index ordering), so code that must run on them should never assume an ordering when iterating over the elements of a dictionary.

By using tuples as keys we can make a dictionary behave like a sparse matrix:

In [34]:
matrix={ (0,1): 3.5, (2,17): 0.1}
matrix[2,2] = matrix[0,1] + matrix[2,17]
print(matrix)

{(0, 1): 3.5, (2, 17): 0.1, (2, 2): 3.6}
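For a sparse matrix, entries that were never stored should usually read as zero rather than raise a `KeyError`. A minimal sketch of that idea, where `entry` is a hypothetical helper built on the dictionary's `get` method:

```python
matrix = {(0, 1): 3.5, (2, 17): 0.1}

def entry(m, row, col):
    """Return the stored value at (row, col), or 0.0 if nothing is stored there."""
    return m.get((row, col), 0.0)

stored = entry(matrix, 0, 1)
empty = entry(matrix, 5, 5)
print(stored, empty)

# only the stored entries are visited, e.g. when summing row 2
row_sum = sum(value for (row, col), value in matrix.items() if row == 2)
print(row_sum)
```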


Dictionaries can also be built using the loop-style (comprehension) definition:

In [35]:
a2 = { name : len(name) for name in names}
print(a2)

{'One': 3, 'Two': 3, 'Three': 5, 'Four': 4, 'Five': 4}


### Built-in Functions

The **len()** function and **in** operator have the obvious meaning:

In [36]:
print("a1 has",len(a1),"elements")
print("One is in a1",'One' in a1,"but not Zero", 'Zero' in a1)

a1 has 5 elements
One is in a1 True but not Zero False
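One detail worth spelling out: `in` tests the keys of a dictionary, not its values. A short sketch, using a smaller dictionary than `a1` for brevity:

```python
small = {'One': 1, 'Two': 2, 'Three': 3}

has_key = 'One' in small          # True: 'One' is a key
value_as_key = 1 in small         # False: 1 is a value, not a key
has_value = 1 in small.values()   # True: search the values explicitly

print(has_key, value_as_key, has_value)
```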


The **clear( )** method erases all elements.

In [37]:
a2.clear()
print(a2)

{}


The **values( )** method returns all the values stored in the dictionary. (Actually not quite a list, but something that we can iterate over just like a list to construct a list, tuple or any other collection):

In [38]:
[ v for v in a1.values() ]

[1, 2, 3, 4, 5]

The **keys( )** method returns all the keys (indices) under which the values are stored.

In [39]:
{ k for k in a1.keys() }

{'Five', 'Four', 'One', 'Three', 'Two'}

The **items( )** method returns both keys and values, with each element of the dictionary wrapped in a `(key, value)` tuple. This is the same as the result obtained when the zip function was used.

In [40]:
",  ".join( "%s = %d" % (name,val) for name,val in a1.items())

'One = 1,  Two = 2,  Three = 3,  Four = 4,  Five = 5'
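The three views fit together in the obvious way. As a sketch, looping over `items()` unpacks each pair directly, and pairing up `keys()` with `values()` gives the same tuples. A smaller dictionary named `small` is used here for brevity.

```python
small = {'One': 1, 'Two': 2, 'Three': 3}

# each item is a (key, value) tuple, so it can be unpacked in the loop header
for name, val in small.items():
    print(name, '->', val)

pairs = list(small.items())
same_pairs = list(zip(small.keys(), small.values()))
print(pairs == same_pairs)
```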

The **pop( )** method removes the element with the given key and returns its value, which can be assigned to a new variable. Remember that only the value is returned and not the key, because the key is just an index.

In [41]:
val = a1.pop('Four')
print(a1)
print("Removed",val)

{'One': 1, 'Two': 2, 'Three': 3, 'Five': 5}
Removed 4
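Popping a key that is not present raises a `KeyError` unless a default is supplied as the second argument. A brief sketch with a made-up dictionary:

```python
small = {'One': 1, 'Two': 2, 'Three': 3}

val = small.pop('Two')             # removes 'Two' and returns 2
fallback = small.pop('Ten', None)  # key absent: the default is returned instead
print(val, fallback, small)

# without a default, a missing key is an error
try:
    small.pop('Ten')
except KeyError as err:
    print("No such key:", err)
```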
