# Strings Part II

In addition to looking at individual elements of a string, we can also examine and manipulate entire chunks of a string at a time. The colon operator `:` lets us specify a contiguous chunk (a "slice") of a sequence. The two numbers on either side of the colon indicate (1) "start here" and (2) "up to but not including here".

In [1]:
s = 'Monty Python'
print(s[0:4]) # start at 0 and end at (but not including) 4

Mont


In [2]:
print(s[6:7])

P
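
One way to keep the "up to but not including" rule straight is to notice that a slice taken inside the string has length `stop - start`. The sketch below is illustrative only; the variable names `start`, `stop`, and `chunk` are not part of the original notebook.

```python
s = 'Monty Python'

start, stop = 2, 7
chunk = s[start:stop]  # characters at positions 2, 3, 4, 5 and 6

print(chunk)           # nty P
print(len(chunk))      # 5, which equals stop - start
```

This is also why `s[6:7]` above returns a single character: 7 - 6 = 1.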


If the second argument to the colon operator is beyond the end of the sequence, then Python will just stop at the end of the sequence. 

In [3]:
s = 'Monty Python'
print(s[6:20]) 

Python


If the first argument to the colon operator is missing, then Python starts at the beginning of the sequence. If the second argument is missing, then Python finishes at the end.

In [4]:
s = 'Monty Python'
print(s[:2]) # start at the beginning

Mo


In [5]:
print(s[8:]) # go until the end

thon


In [6]:
print(s[:]) # whole thing

Monty Python
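
A small sketch of why the missing-argument forms are handy: cutting a string at any position `n` with `s[:n]` and `s[n:]` gives two pieces that fit back together exactly, with no character lost or repeated. The names `n`, `head`, and `tail` are chosen here for illustration.

```python
s = 'Monty Python'
n = 5

head = s[:n]  # everything before position 5
tail = s[n:]  # everything from position 5 onward

print(head)               # Monty
print(head + tail == s)   # True
```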


We have already seen string concatenation using the `+` operator. Note that string concatenation does not automatically add a space. 

In [7]:
a = 'Hello'
b = a + 'There'
print(b)

HelloThere


In [9]:
c = a + ' ' + 'There' # add a space manually
print(c)

Hello There


We have already seen `in` used as part of the structure of our `for` loops. With strings, we often use `in` as a logical operator as well, just like `<` or `==`.

The `in` keyword checks to see if one string is "in" another string. If it is, then the expression is True. If not, then False. 

In [10]:
fruit = 'banana'
'n' in fruit

True

In [11]:
'm' in fruit

False

In [12]:
if 'a' in fruit:
    print('Found it!')

Found it!
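
The examples above search for a single character, but `in` works the same way for longer substrings. It is also case-sensitive. The sketch below shows both points, along with `not in`; the variable names are illustrative and not from the original notebook.

```python
fruit = 'banana'

has_nan = 'nan' in fruit                    # multi-character substring
has_capital_b = 'B' in fruit                # case matters: 'B' is not 'b'
has_b_any_case = 'b' in fruit.lower()       # lowercase first to ignore case
no_x = 'x' not in fruit                     # "not in" is the opposite test

print(has_nan, has_capital_b, has_b_any_case, no_x)  # True False True True
```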


When you assign a string to a variable, Python stores the string as a **string object**. String objects (and other types of objects) come with built-in functions, called methods, that we call by appending a dot and the function name to the end of the object.

These functions do not modify the original object, but instead return a new object that has been altered. In the example below, we assign a string to the variable `greet`, which means `greet` refers to a string object. Notice that we can append `.lower()` to the end of the object. The `lower()` function returns a copy of the string with all characters set to lowercase.

In [2]:
greet = 'Hello Gus'
greet_lwr = greet.lower() # apply the lower() function to greet
print(greet_lwr)

hello gus


In [3]:
print(greet) # greet is unchanged

Hello Gus


In [4]:
print('Hi There'.lower())

hi there
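
Because these functions return a new string rather than changing the original, calling one without saving the result has no lasting effect. If we want the variable itself to hold the altered string, we can assign the result back to the same name. A minimal sketch (the name `original` is added here only to keep a copy for comparison):

```python
greet = 'Hello Gus'
original = greet

greet.lower()          # result is thrown away; greet is unchanged
print(greet)           # Hello Gus

greet = greet.lower()  # assign the new string back to the same name
print(greet)           # hello gus
print(original)        # Hello Gus
```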


To see all the different methods, or things we can do to strings, we can use the `dir()` function.

In [5]:
string = 'Hello world'
type(string) # what type of variable is it?
dir(string) # what built-in functions (methods) can we use on it?

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

Let's go through several particularly useful string functions. 

The `find()` function tells us the position of a substring within another string. If the substring is not found, then `find()` returns -1.

In [20]:
fruit = 'banana'
pos = fruit.find('na')
print(pos)

2


In [21]:
aa = fruit.find('z')
print(aa)

-1
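
`find()` also accepts an optional second argument giving the position at which to start searching. As a sketch of how that can be used, the loop below collects every position of `'a'` in `'banana'` by restarting the search just past the previous match. The list `positions` is an illustrative name, not part of the original notebook.

```python
fruit = 'banana'

positions = []
pos = fruit.find('a')               # first match, or -1 if there is none
while pos != -1:
    positions.append(pos)
    pos = fruit.find('a', pos + 1)  # search again, starting after the last match

print(positions)  # [1, 3, 5]
```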


`upper()` and `lower()` convert all characters to uppercase or lowercase, respectively.

In [6]:
greet = 'Hello Gus'
nnn = greet.upper()
print(nnn)

HELLO GUS


In [7]:
www = greet.lower()
print(www)

hello gus


`replace()` finds all occurrences of the search string and replaces each one with the replacement string.

In [8]:
greet = 'Hello Gus'
nstr = greet.replace('Gus', 'Jon') # replace "Gus" with "Jon"
print(nstr)

Hello Jon


In [9]:
nstr = greet.replace('l', 'X') # replace every "l" with "X"
print(nstr)

HeXXo Gus
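
If we only want to replace some of the occurrences, `replace()` takes an optional third argument that limits how many replacements are made, counting from the left. A short sketch (the names `first_only` and `all_of_them` are illustrative):

```python
greet = 'Hello Gus'

first_only = greet.replace('l', 'X', 1)  # replace at most one "l"
all_of_them = greet.replace('l', 'X')    # no limit: replace every "l"

print(first_only)   # HeXlo Gus
print(all_of_them)  # HeXXo Gus
```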


`lstrip()` removes whitespace from the left side of a string. `rstrip()` removes whitespace from the right side of a string. `strip()` removes both leading and trailing whitespace. Whitespace includes spaces, tabs, newlines, and any other character that would come out of the printer as white.

In [10]:
greet = ' Hello Gus '
print(greet.lstrip())

Hello Gus 


In [11]:
print(greet.rstrip())

 Hello Gus


In [12]:
print(greet.strip())

Hello Gus
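
Stripped whitespace is hard to see in printed output, so it can help to wrap the result in `repr()`, which shows the surrounding quotes. The sketch below, using a made-up string containing a tab and a newline, also shows that `strip()` only trims the ends and leaves whitespace in the middle of the string alone.

```python
greet = '\t Hello   Gus \n'  # tab and space in front, space and newline behind

stripped = greet.strip()

print(repr(stripped))        # 'Hello   Gus'
print(len(greet), len(stripped))  # 15 11
```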


`startswith()` tells us if the string starts with a particular substring.

In [30]:
line = 'Please have a nice day'
line.startswith('Please') # does the string start with Please?

True

In [31]:
line.startswith('p') # does the string start with p?

False
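
The second result is `False` because `startswith()` is case-sensitive. One common way around this, sketched below, is to lowercase the string before testing it. The companion function `endswith()` (also in the `dir()` listing above) checks the other end of the string. The variable names are illustrative.

```python
line = 'Please have a nice day'

starts_with_p = line.lower().startswith('p')  # ignore case by lowercasing first
ends_with_day = line.endswith('day')          # check the end of the string

print(starts_with_p)  # True
print(ends_with_day)  # True
```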

Finally, let's use the `find()` function to complete a real-world task. Given the first line of an email, how can we extract the host name of the email address?

In [32]:
data = 'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'
atpos = data.find('@') # find location of @
print(atpos)

21


In [33]:
sppos = data.find(' ', atpos) # find location of space following @
print(sppos)

31


In [34]:
host = data[atpos+1 : sppos] # return all characters between atpos and sppos
print(host)

uct.ac.za
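
The steps above assume the line contains an `@` followed later by a space. As a hedged sketch of how the same idea might be packaged for reuse, the hypothetical function `extract_host()` below (not part of the original notebook) checks for the -1 that `find()` returns when nothing is found. Without that check, a `sppos` of -1 would be treated as "one before the end" and the slice would silently drop the last character.

```python
def extract_host(line):
    """Return the text after '@' up to the next space, or None if there is no '@'."""
    atpos = line.find('@')
    if atpos == -1:
        return None                # no email address on this line
    sppos = line.find(' ', atpos)
    if sppos == -1:
        sppos = len(line)          # no space after the address: go to the end
    return line[atpos + 1:sppos]


data = 'From stephen.marquard@uct.ac.za Sat Jan  5 09:14:16 2008'
print(extract_host(data))              # uct.ac.za
print(extract_host('From a@b.org'))    # b.org
print(extract_host('no address here')) # None
```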
