# Section 2.1.6 Strings

## 1. What is a string?

A **string** is simply a sequence of characters. Humans may look at a **string** and see a word or a sentence, but all the computer sees is a list of alpha-numeric (plus other) characters.

We've already done a lot of things with **strings**. We've displayed them on the screen with the *print()* function. We've changed a **string** of numbers to an integer or a floating point number using type conversion functions. And we've asked a user to give us a **string** to use with the *input()* function.

So far we've used **strings** like they were words or sentences. Now we're going to think like the computer and look at **strings** like they're a sequence of characters.

### String Indices

If there's one thing about **strings** that you take away from today's lesson, it should be that when you reference a character in a **string** using its *index*, you must start counting at 0. The first character in a **string** is at index 0, **NOT** index 1.

In [2]:
# First, we need to assign a string variable a value.

this_is_a_string = "Jasmine"

# Then we can look at the individual characters.

first_letter = this_is_a_string[0]
second_letter = this_is_a_string[1]

print(first_letter)
print(second_letter)

J
a


**Pop Quiz:** *What would be displayed if we asked for the 5th character in a string that was this sentence?*

In [4]:
sentence = "What would be displayed if we asked for the 5th character in a string that was this sentence?"
fifth_character = sentence[4]     # Index 4 is the 5th character, because counting starts at 0.
print(fifth_character)

 


## 2. LEN Function

If you want to analyze a **string**, or put it in a loop, it is useful to know how long the **string** is. That way you know how many character indices you need to reference.

Keep in mind, however, that the **len()** function returns the number of characters, which means it counts as if the first character was a 1. The last valid index is therefore always one less than the length.

In [5]:
fun_stuff = "holiday"
len(fun_stuff)

7

The **string** "holiday" has 7 characters in it: 'h' is the 1st character, 'o' is the 2nd, and so on. When counting the length, 'h' is NOT character 0, even though its index is 0.

You can now get creative knowing how to get the length of a **string** and specific letters within a string.

In [7]:
fun_stuff = "mountain passes"
length = len(fun_stuff)
print(length)

last_character = fun_stuff[length-1]
print(last_character)

15
s


Because the **len()** function returns an integer, we can add or subtract from its assigned variable to find specific characters.
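
Here's a small sketch of that idea, reusing the "mountain passes" **string** from above. The variable names `second_to_last`, `middle_character` and `out_of_range` are just ones I made up for this illustration. It also shows what happens if we forget to subtract 1 and use the length itself as an index.

```python
fun_stuff = "mountain passes"
length = len(fun_stuff)                        # 15 characters, indices 0 to 14

last_character = fun_stuff[length - 1]         # index 14
second_to_last = fun_stuff[length - 2]         # index 13
middle_character = fun_stuff[length // 2]      # index 7 (integer division)

print(last_character)
print(second_to_last)
print(middle_character)

# The index equal to the length is one past the end of the string,
# so Python raises an IndexError instead of returning a character.
try:
    fun_stuff[length]
    out_of_range = False
except IndexError:
    out_of_range = True

print(out_of_range)
```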

## 3. Traversing a String

As mentioned earlier, because a **string** is a sequence of characters, there may come a time when you want to loop through a **string**, advancing one character at a time. We can do this with a **while** loop.

In [10]:
my_string = "Vancouver Island"
string_length = len(my_string)
index = 0

while index < string_length:
    letter = my_string[index]
    print(letter)
    index = index + 1
    
    


V
a
n
c
o
u
v
e
r
 
I
s
l
a
n
d


**Pop Quiz:** *Why wouldn't we ask the while loop to iterate until the index was less than or equal to the length of the string?*

Here's another useful way to use a **while** loop to traverse a **string**.

In [14]:
my_string = "Salt Spring Island"
string_length = len(my_string)
index = 0
count = 0

while index < string_length:
    letter = my_string[index]
    if letter == 'S' or letter == 's':
        print(letter)
        count = count + 1
    index = index + 1
    
print("The string contains", count, "upper and lowercase s's.")
        

S
S
s
The string contains 3 upper and lowercase s's.
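
If we wanted to reuse this counting loop for other letters and other **strings**, one option is to wrap it in a function. This is only a sketch: the function name `count_letter` and its parameters are my own invention, and it uses the `lower()` *method* (string *methods* are covered in a later section) so that upper and lowercase letters are treated the same.

```python
def count_letter(text, target):
    """Count how many times the letter `target` appears in `text`, ignoring case."""
    index = 0
    count = 0
    while index < len(text):
        # Compare lowercase versions so 'S' and 's' both match.
        if text[index].lower() == target.lower():
            count = count + 1
        index = index + 1
    return count


print(count_letter("Salt Spring Island", "s"))
print(count_letter("Vancouver Island", "a"))
```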


## 4. String Slices

In addition to selecting one character within a **string**, using that character's index, we can also select a series of characters in a **string**. This series of characters from a **string** is called a **slice**.



In [17]:
my_new_string = "University of Alberta"

first_slice = my_new_string[0:10]
second_slice = my_new_string[14:21]

print(first_slice)
print(second_slice)

University
Alberta
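
One detail worth checking for yourself: a **slice** starts at the first index and stops just *before* the second index. The short sketch below, using the same **string**, shows that the character at index 10 is not part of `my_new_string[0:10]`. The name `rebuilt` is just something I picked for this example.

```python
my_new_string = "University of Alberta"

first_slice = my_new_string[0:10]
print(first_slice)
print(len(first_slice))             # end index minus start index: 10 - 0

# Index 10 is the space after "University". It is NOT included in the slice.
print(my_new_string[10] == " ")

# Because the end index is excluded, two slices that share a boundary
# fit together without repeating or skipping a character.
rebuilt = my_new_string[0:10] + my_new_string[10:21]
print(rebuilt == my_new_string)
```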


We can also use this syntax with only one number.

In [19]:
slice1 = my_new_string[10:]          # Starts at index 10 and continues to the end of the string.
slice2 = my_new_string[:10]          # Starts at the beginning of the string and stops just before index 10.

print(slice1)
print(slice2)

 of Alberta
University


**Pop Quiz:** *What happens when you exclude both indices, but still include the colon?*

In [20]:
slice3 = my_new_string[:]
print(slice3)

University of Alberta


## 5. Strings are Immutable

When something is immutable, it means it cannot be changed.

For example, I can assign a **string** to a variable. But I cannot use an index reference to change a character in a **string**.

In [21]:
immutable_string = "I cannot be changed."
letter = immutable_string[0]
print(letter)

immutable_string[0] = "We"

I


TypeError: 'str' object does not support item assignment

In the above situation, if we really wanted to change 'I' to 'We', we'd have to get kind of creative.

In [23]:
print("We", immutable_string[2:])

We cannot be changed.


In this example, we're not changing the **string**, we're manipulating it to get what we want. If we wanted to keep this new **string**, we could assign it to a new variable.
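
As a rough sketch of keeping the result, we can build the new **string** with the plus sign (+) and store it in a new variable. The name `new_string` is just one I chose here. Notice the slice starts at index 1 this time, so the space after 'I' is kept, because + does not add a space the way *print()* does between its arguments.

```python
immutable_string = "I cannot be changed."

# Build a brand new string out of "We" and everything after the 'I'.
new_string = "We" + immutable_string[1:]

print(new_string)
print(immutable_string)     # The original string is still exactly the same.
```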

## 6. The IN Operator

In an above example, we used a **while** loop to find all the 's' characters in a **string**. This is useful if we need to know how many s's there are, but what if we just want to know if 's' exists within the **string** and we don't care how many times it appears? We can use the **in** operator.

**in** is an operator because it works like a plus sign (+) or a minus sign (-). It is not a function. And, the **in** operator only returns True or False.

In [24]:
another_string = "I live in Edmonton."
"I" in another_string

True

In [25]:
"b" in another_string

False

In [26]:
"Edmonton" in another_string

True
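
Since **in** gives back True or False, it fits nicely into an **if** statement. The sketch below also shows that **in** is case-sensitive. The variable names are ones I made up for the example, and the `lower()` *method* used at the end is one of the string *methods* introduced in the next section.

```python
another_string = "I live in Edmonton."

has_city = "Edmonton" in another_string
has_lowercase_city = "edmonton" in another_string            # Case matters, so this is False.
has_city_any_case = "edmonton" in another_string.lower()     # Compare in lowercase to ignore case.

if has_city:
    message = "Found Edmonton."
else:
    message = "No Edmonton here."

print(has_city, has_lowercase_city, has_city_any_case)
print(message)
```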

## 7. String Methods

In Python, a **string** is considered an *object*. *Objects* contain both data and *methods* that can be used to manipulate the data. *Methods* are basically built-in functions for the *object*.

Python has a **type()** function that returns the type of the *object* provided as the argument. Python also has a **dir()** function that lists all the attributes and *methods* associated with the *object*.



In [27]:
type(another_string)

str

In [28]:
dir(another_string)

['__add__',
 '__class__',
 '__contains__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__getitem__',
 '__getnewargs__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__iter__',
 '__le__',
 '__len__',
 '__lt__',
 '__mod__',
 '__mul__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__rmod__',
 '__rmul__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'capitalize',
 'casefold',
 'center',
 'count',
 'encode',
 'endswith',
 'expandtabs',
 'find',
 'format',
 'format_map',
 'index',
 'isalnum',
 'isalpha',
 'isdecimal',
 'isdigit',
 'isidentifier',
 'islower',
 'isnumeric',
 'isprintable',
 'isspace',
 'istitle',
 'isupper',
 'join',
 'ljust',
 'lower',
 'lstrip',
 'maketrans',
 'partition',
 'replace',
 'rfind',
 'rindex',
 'rjust',
 'rpartition',
 'rsplit',
 'rstrip',
 'split',
 'splitlines',
 'startswith',
 'strip',
 'swapcase',
 'title',
 'translate',
 'upper',
 'zfill']

Even better, Python has a **help()** function that will tell you a little about a specific function or *method* and how to use it.

Of course, the Python website and Python books will have more detailed explanations and examples, but this is a great starting point.

In [29]:
help(str.strip)

Help on method_descriptor:

strip(...)
    S.strip([chars]) -> str
    
    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.



In [30]:
help(another_string.strip)

Help on built-in function strip:

strip(...) method of builtins.str instance
    S.strip([chars]) -> str
    
    Return a copy of the string S with leading and trailing
    whitespace removed.
    If chars is given and not None, remove characters in chars instead.



In the example above, I used the **help()** function two different ways. The first time, I called **help()** with an argument of 'str.strip'. This is similar to referring to a function that is part of an imported module or library. In this case, the 'str' part of the argument is the type of the *object* and the 'strip' part of the argument is the *method*.

The second **help()** example does the exact same thing, but instead of using the generic type 'str', I used a known **string** variable, 'another_string'.

When calling *methods*, you must always start the call with the name of the *object* you're referring to, then a dot (.), then the name of the *method*.

    object.method()

Sometimes the *method* will take an argument within the parentheses, sometimes it won't. For example, the *method* 'strip' can take a 'chars' argument, if you want to give it one. But it will also work without an argument.

In [31]:
another_string.strip()

'I live in Edmonton.'

In [41]:
another_string.strip('I ')

'live in Edmonton.'

In [40]:
another_string.strip('in')

'I live in Edmonton.'
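
Here are a few more *methods* from the **dir()** list above, shown as a quick sketch rather than a complete tour. The variable names and the replacement city "Calgary" are only for illustration. Every one of these *methods* returns a new value and leaves the original **string** alone.

```python
another_string = "I live in Edmonton."

shouted = another_string.upper()                          # All uppercase.
whispered = another_string.lower()                        # All lowercase.
moved = another_string.replace("Edmonton", "Calgary")     # Swap one piece of text for another.
i_count = another_string.count("i")                       # Counts lowercase 'i' only.
starts_with_i = another_string.startswith("I")            # True or False.
words = another_string.split()                            # Breaks the string apart at the spaces.
trimmed = "   holiday   ".strip()                         # Removes the spaces at both ends.

print(shouted)
print(whispered)
print(moved)
print(i_count)
print(starts_with_i)
print(words)
print(trimmed)
print(another_string)       # Still the original string.
```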

## 8. Parsing Strings

Sometimes, like when doing text analysis, we might want to look at a large **string** and extract only a specific part of that **string**. We can do this by combining **string** slicing and **string** *methods*.



In [45]:
email_header = "From: bobby@gmail.com Sun Jun 3 2017 12:30:00"

# Let's assume we only want to find the domain name of the email address, and nothing else.

# Finds the index of the @ symbol.
at_sign_location = email_header.find('@')
print(at_sign_location)

# Finds the first blank space after the @ symbol.
blank_after_email = email_header.find(' ', at_sign_location)
print(blank_after_email)

# Slices from one character past the @ symbol up to (but not including) the first blank.
domain_name = email_header[at_sign_location + 1:blank_after_email]
print(domain_name)
print(domain_name)

11
21
gmail.com
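
If we needed to do this for many headers, we could wrap the same steps in a function. The sketch below is one possible way to do it, not the only one. The function name `extract_domain` is hypothetical, and I'm assuming we want `None` back when there is no @ symbol. The extra checks are there because **find()** returns -1 when it can't find what it's looking for, and -1 used as a slice index would quietly cut off the last character.

```python
def extract_domain(header):
    """Return the text between the first '@' and the next blank space, or None if there is no '@'."""
    at_sign_location = header.find('@')
    if at_sign_location == -1:
        return None                          # No @ symbol, so no domain to extract.

    blank_after_email = header.find(' ', at_sign_location)
    if blank_after_email == -1:
        blank_after_email = len(header)      # No blank after the address: go to the end of the string.

    return header[at_sign_location + 1:blank_after_email]


print(extract_domain("From: bobby@gmail.com Sun Jun 3 2017 12:30:00"))
print(extract_domain("bobby@gmail.com"))
print(extract_domain("no email address here"))
```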
