# Introduction
In this module, we will expand on the earlier module on variables and introduce the string variable type. Though more accurately described as a data structure, the string data type has many features that are not found in numeric values. To understand these features, we begin by defining a string as a data structure, and from there discuss how the data in strings are structured. We then discuss the properties and methods that are associated with string values.

# Strings
Though data elements are often thought of as numbers or strings, the appropriate distinction is numbers and characters. So, strings are not a data element, but are instead a data structure. Specifically, a string is a data structure that contains an ordered collection of characters and has properties and methods that are unique to the string object. In this chapter, we will look at the structure of a string, the content of a string, and the methods and properties that are accessible through the string object.

## A String is a Collection of Characters
It is best to think of a string as an *ordered collection of characters* rather than as a single value. Thinking of strings as a collection of characters helps us think more clearly about how strings are structured and how to interact with them in our applications.

In [None]:
myString = "Collection of Characters"

In [None]:
len(myString)

When a string is created, we view it as a single entity, but Python treats it as an ordered collection of values. Each character is stored at a numbered position (an index), and the numbering starts at 0. So, the above string assignment is treated in the following way:

|0|1|2|3|4|5|6|7|8|9|10|11|12|13|14|15|16|17|18|19|20|21|22|23|
|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|-|
|C|o|l|l|e|c|t|i|o|n| |o|f| |C|h|a|r|a|c|t|e|r|s|

Because strings are stored in this way, we have access to any single character or subset of the string through indices. For example, the following statements can be used to retrieve a letter or series of letters from our string. The first statement prints the letter at index 4 (which is the 5th letter in the sequence, because counting starts at 0).

In [None]:
myString[4]

The second statement prints a subset of the string which includes all letters from index 1 up to, but not including, index 4.

In [None]:
myString[1:4]
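
The slice rule used here (start index included, end index excluded) can be checked with a short sketch. It reuses the `myString` value defined earlier; the variable name `middle` is only an illustration.

```python
myString = "Collection of Characters"

# The slice start is included; the slice end is excluded.
middle = myString[1:4]
print(middle)        # oll
print(len(middle))   # 3, which is 4 - 1

# Splitting at the same index gives two pieces that rejoin into the original.
print(myString[:4] + myString[4:] == myString)   # True
```

One practical benefit of excluding the end index is that the length of a slice is simply the end index minus the start index.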

The third statement makes use of the len() function (which counts the number of items in a collection of elements) to subset all characters from index 0 through the last index (note: because 0 is the first position and len() provides a count, the last character sits at index len() - 1; a slice stops just before its end index, so using len() as the end index includes that last character).

In [None]:
myString[:len(myString)]

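The next two statements subset the string by extracting all letters up to (but not including) index 15 and all letters from index 15 to the end of the string. The two statements after those use negative indices, which count backward from the end of the string.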

In [None]:
myString[:15]

In [None]:
myString[15:]

In [None]:
myString[:-15]

In [None]:
myString[-15:]
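
As a rough illustration of how negative indices relate to positive ones, the sketch below pairs each negative form with its len()-based equivalent. The variable names `last`, `head`, and `tail` are assumptions made for this example.

```python
myString = "Collection of Characters"

last = myString[-1]     # same as myString[len(myString) - 1]
head = myString[:-15]   # same as myString[:len(myString) - 15]
tail = myString[-15:]   # same as myString[len(myString) - 15:]

print(last)   # s
print(head)   # Collectio
print(tail)   # n of Characters

# The two slices split the string at the same point, so they rejoin cleanly.
print(head + tail == myString)   # True
```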

Another consequence of the *ordered collection of characters* paradigm is that each character is an element that can be traversed or searched. So, loops can be used to iterate through all elements in the ordered collection of characters. Likewise, the in operator can be used to search for elements in the collection.

In [None]:
i = 0 
while i < len(myString): 
    letter = myString[i] 
    print("Letter " + str(i) + ": " + letter) 
    i = i + 1 

Remember, for loops can be used when you know exactly how many times you want to loop your code. In the case of strings, Python knows how to step through each item in the collection of characters. **Note:** When using the for loop to iterate through a string in this way, the letter variable is updated each time through the loop to the *next* item (in this case, character) in the collection. This is why we are able to access the individual letters without tracking their positions in the string ourselves.

In [None]:
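# enumerate() supplies each character's position along with the character.
# (myString.index(letter) would report only the first match for repeated letters.)
for i, letter in enumerate(myString):
    print("Letter " + str(i) + ": " + letter)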

In [None]:
aCount = 0 
for letter in myString:
    if letter == 'a': 
        aCount += 1 

print("There are " + str(aCount) + " a's in the string " + myString) 
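
The in operator can also be used on its own, outside of a loop, to test whether a character or substring appears anywhere in a string. The following is a minimal sketch; the variable names are illustrative only.

```python
myString = "Collection of Characters"

has_of = "of" in myString   # substring search
has_z = "z" in myString     # single-character search

print(has_of)   # True
print(has_z)    # False

# The count() string method gives the same total as the counting loop.
a_count = myString.count("a")
print(a_count)  # 2
```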


A final consequence of strings being an ordered collection of characters is that strings are compared one character at a time, so when comparing strings, length is largely irrelevant and capitalization matters. Length is largely irrelevant for the same reason it is irrelevant when we alphabetize words: the first character that differs decides the result (which is why ‘zero’ comes after ‘one’ and ‘eighty’ comes before ‘seventy’). Also, capitalization matters because, from the computer’s perspective, an ‘a’ and an ‘A’ are different letters.

In [None]:
print('zero' < 'one')

In [None]:
print('zero' == 'ZERO')

In [None]:
print('one' == '1')

In [None]:
print('one' == 'one')
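
To see why these comparisons behave as they do, it can help to look at the numeric code behind each character using the built-in ord() function. This is only a sketch; lowercasing both sides is one common way to compare while ignoring case, not the only one.

```python
# Each character has a numeric code, and comparisons use these codes.
codes = [ord("A"), ord("Z"), ord("a"), ord("z")]
print(codes)            # [65, 90, 97, 122]

print("Zero" < "one")   # True: 'Z' (90) sorts before 'o' (111)
print("zero" < "one")   # False: 'z' (122) sorts after 'o' (111)
print("one" < "ones")   # True: a prefix sorts before the longer string

# Converting both sides to lowercase compares them without regard to case.
same_ignoring_case = "zero".lower() == "ZERO".lower()
print(same_ignoring_case)   # True
```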

## A String is an Object
We aren’t covering object-oriented concepts in this class, but you do need to understand that objects are data structures that have properties and methods in addition to whatever values we might assign. So, our string variable above has the value of ‘Collection of Characters’, but by nature of being a string object, our variable has access to properties and methods that are built in to string objects. A property is some static value that is unique to the object it is meant to describe.  

A method is a function that is built in to the string object that performs some operation on the string value. An example of a string method is the upper() method, which returns an uppercased version of the string. The following lines provide a brief selection of the string methods available. Refer to page 72 of your textbook for a comprehensive list. Also refer to https://docs.python.org/3/library/stdtypes.html#string-methods

In [None]:
myString.upper()

In [None]:
myString.lower()

In [None]:
myString.split("a")

In [None]:
stringParts = myString.split()

In [None]:
stringParts

In [None]:
len(stringParts)

In [None]:
for stringPart in stringParts:
    print(stringPart + " is " + str(len(stringPart)) + " characters long.")

In [None]:
myString.capitalize()

In [None]:
myString.strip('s')

In [None]:
myString.center(50, "-")

In [None]:
myString.count(' ')

In [None]:
myString.endswith('ing')

In [None]:
myString.find('of')

In [None]:
myString.isupper()

In [None]:
myString.islower()
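
One detail worth noticing in the cells above is that string methods return a result and leave the original string untouched. The sketch below shows this with a few of the same methods; the variable names are assumptions for illustration.

```python
myString = "Collection of Characters"

upper_version = myString.upper()
stripped = myString.strip("s")   # removes 's' only from the two ends
position = myString.find("of")   # index of the first match
missing = myString.find("xyz")   # find() returns -1 when there is no match

print(upper_version)   # COLLECTION OF CHARACTERS
print(stripped)        # Collection of Character
print(position)        # 11
print(missing)         # -1
print(myString)        # Collection of Characters (unchanged)
```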

There are different types of objects in Python, and strings are classified as immutable objects. An immutable object is an object that cannot be changed. This means that once a string object is instantiated, it cannot be changed, and any changes you wish to save must be saved as a new string object. This concept is counter-intuitive, but it will make sense when you look at the ways strings are handled in Python. So, the first example does not work (it raises a TypeError) because you cannot change the value of any element in the ordered collection of characters.

In [None]:
i = 0 
while i < len(myString): 
    myString[i] = myString[i].upper() 
    print(myString)
    i += 1 
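
If you would like to observe the failure without stopping the notebook, one option is to wrap the assignment in try/except. This is a sketch only; the `assignment_failed` flag is a hypothetical name used for the demonstration.

```python
myString = "Collection of Characters"
assignment_failed = False

try:
    # Strings are immutable, so assigning to an index raises a TypeError.
    myString[0] = "c"
except TypeError as error:
    assignment_failed = True
    print("Assignment failed:", error)

print(assignment_failed)   # True
print(myString)            # Collection of Characters (unchanged)
```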

However, the second example does work because the code creates a new string object and assigns it to myUpperString each time through the loop.

In [None]:
i = 0 
myUpperString = ""
while i < len(myString): 
    myUpperString = myUpperString + myString[i].upper() 
    print(myUpperString)
    i += 1 
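
The loop above is useful for seeing how new strings are built one step at a time. In practice, the same result can usually be reached more directly. The sketch below shows two alternatives; the variable names `upperDirect` and `upperJoined` are assumptions for this example.

```python
myString = "Collection of Characters"

# Option 1: the built-in method returns a new uppercased string.
upperDirect = myString.upper()

# Option 2: uppercase each character, then join the pieces into one new string.
upperJoined = "".join(letter.upper() for letter in myString)

print(upperDirect)                  # COLLECTION OF CHARACTERS
print(upperDirect == upperJoined)   # True
print(myString)                     # Collection of Characters (unchanged)
```

In both cases the original string is left as it was; to keep the change, assign the result to a variable.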


## String Presentation
Formatting strings can be one of the most time-consuming aspects of programming. To create a dynamic and informative application interface, you will often find yourself needing to parse strings and to pipe variable values into an output string. Python offers many solutions to this problem (oftentimes referred to as *string interpolation*). The following lines of code illustrate these options.  

In [None]:
firstName = "Jake"
lastName = "London"
myAge = 41

In [None]:
print("My name is", firstName, lastName, ", and I am", myAge, "years old.") 

In [None]:
print("My name is " + firstName + " " + lastName + ", and I am " + str(myAge) + " years old.") 

In [None]:
print("My name is %s %s, and I am %d years old." % (firstName, lastName, myAge)) 

In [None]:
print("My name is {} {}, and I am {} years old.".format(firstName, lastName, myAge)) 

In [None]:
print("My name is {1} {0}, and I am {2} years old.".format(firstName, lastName, myAge)) 
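
The format() method also accepts named placeholders and optional format specifications, which can make longer templates easier to read. A minimal sketch follows; the placeholder names `first`, `last`, and `age` are chosen only for illustration.

```python
firstName = "Jake"
lastName = "London"
myAge = 41

# Named placeholders are filled from keyword arguments.
message = "My name is {first} {last}, and I am {age} years old.".format(
    first=firstName, last=lastName, age=myAge
)
print(message)   # My name is Jake London, and I am 41 years old.

# A format specification after the colon controls presentation (one decimal place here).
ageText = "{age:.1f}".format(age=myAge)
print(ageText)   # 41.0
```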

**Note:** The following examples use f-strings, which require Python 3.6 or later. They will not work in Azure Notebooks, which does not support Python 3.6+. This is unfortunate, because f-strings are my favorite method for presenting strings (they are the most readable). They will work on your own installation of Python if it is version 3.6 or later.

In [None]:
print(f"My name is {firstName} {lastName}, and I am {myAge} years old.") 

In [None]:
myMessage = f'''
My name is {firstName} {lastName}, 
and I am {myAge} years old.
'''

print(myMessage) 

# Exercise
Write code to identify university email addresses. Prompt the user for their email address and tell them whether or not it is a university account (that is, whether it ends in .edu).

In [None]:
# Step 1...

# Step 2...