# Exercise:

Take the following comma-delimited strings:

    str1 = 'John, Doe, 123 Main Street, New York City, NY, 01234'
    str2 = 'Jane, Smith, 456 East Street, Philadelphia, PA, 67890'
    str3 = 'Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680'

Use the tools we have just learned to decompose each string into distinct fields corresponding to the text between the commas. Once you have this decomposition, create a sorted list containing all of the state abbreviations in these records. The state is always the fifth field, which is index 4 when counting from zero (here: NY, PA, and CA).

## The plan

In this notebook we walk through a simple Python exercise, step by step. The point is not to suggest that this is the most efficient way to build your code, or that the resulting code is especially beautiful. Rather, the point is to demonstrate one way to methodically decompose a problem into very simple chunks.

## Create string objects in Python as a starting point

In [1]:
str1 = 'John, Doe, 123 Main Street, New York City, NY, 01234'
str2 = 'Jane, Smith, 456 East Street, Philadelphia, PA, 67890'
str3 = 'Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680'

## Let's figure out how to parse the string, starting by splitting on the ',' delimiter

In [2]:
str1 = 'John, Doe, 123 Main Street, New York City, NY, 01234'
str2 = 'Jane, Smith, 456 East Street, Philadelphia, PA, 67890'
str3 = 'Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680'

str1.split(',')

['John', ' Doe', ' 123 Main Street', ' New York City', ' NY', ' 01234']

## From that point, we can easily see that the `State` field can be retrieved using list indexing

In [3]:
str1 = 'John, Doe, 123 Main Street, New York City, NY, 01234'
str2 = 'Jane, Smith, 456 East Street, Philadelphia, PA, 67890'
str3 = 'Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680'

str1.split(',')[4]

' NY'
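
Python lists are indexed from zero, which is why the fifth field sits at index 4. One quick way to check this, sketched below, is to print every field next to its index with `enumerate`. The variable name `state_from_end` is ours for illustration, and the negative index assumes the state is always the second-to-last field.

```python
str1 = 'John, Doe, 123 Main Street, New York City, NY, 01234'

# Show each field next to its zero-based index.
for index, field in enumerate(str1.split(',')):
    print('%d: %r' % (index, field))

# Assumption: the state is always second from the end, so -2 also works.
state_from_end = str1.split(',')[-2]
```

Both `[4]` and `[-2]` pick the same field here; which one is safer depends on whether records might gain extra fields at the front or at the back.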

## Unfortunately, the split left some leading whitespace, so let's strip that off

In [4]:
str1 = 'John, Doe, 123 Main Street, New York City, NY, 01234'
str2 = 'Jane, Smith, 456 East Street, Philadelphia, PA, 67890'
str3 = 'Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680'

str1.split(',')[4].strip()

'NY'
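
If we wanted every field cleaned up, not just the state, one possible approach is to strip each piece as we split. This is only a sketch of an alternative and is not needed for the rest of the exercise; the name `fields` is ours for illustration.

```python
str1 = 'John, Doe, 123 Main Street, New York City, NY, 01234'

# Split on the comma, then strip surrounding whitespace from each piece.
fields = [piece.strip() for piece in str1.split(',')]

print(fields)
```

With the whole list cleaned, `fields[4]` gives the state without a separate call to `strip()`.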

## Now that we can pull out the field we want, let's simplify our lives by making it a function that we can reuse

In [5]:
def find_state(s):
    return s.split(',')[4].strip()

str1 = 'John, Doe, 123 Main Street, New York City, NY, 01234'
str2 = 'Jane, Smith, 456 East Street, Philadelphia, PA, 67890'
str3 = 'Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680'

find_state(str1)

'NY'
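
The same idea could be generalized so that the field position is a parameter rather than hard-coded. The helper below is hypothetical (the name `get_field` and its `index` and `delimiter` parameters are our own, not part of the exercise) and is meant only as a sketch of how the function might grow.

```python
def get_field(record, index, delimiter=','):
    """Return the field at a zero-based index, with surrounding whitespace removed."""
    return record.split(delimiter)[index].strip()

str1 = 'John, Doe, 123 Main Street, New York City, NY, 01234'

state = get_field(str1, 4)  # same result as find_state(str1)
city = get_field(str1, 3)   # the field just before the state
```

For this exercise the simpler `find_state` is enough, so we keep using it below.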

## Okay, so next step. We don't want to work with each of the strings individually, so let's throw them into a list, understanding that we'll eventually iterate over this list.

In [6]:
def find_state(s):
    return s.split(',')[4].strip()

# Named `records` rather than `str` so we don't shadow the built-in str type
records = list()
records.append('John, Doe, 123 Main Street, New York City, NY, 01234')
records.append('Jane, Smith, 456 East Street, Philadelphia, PA, 67890')
records.append('Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680')

records

['John, Doe, 123 Main Street, New York City, NY, 01234',
 'Jane, Smith, 456 East Street, Philadelphia, PA, 67890',
 'Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680']
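
Building the list with repeated `append` calls makes each step explicit. When all of the values are known up front, a list literal is a common shorthand for the same thing. The sketch below uses the name `records`, which is our choice for illustration.

```python
# A list literal builds the same three-element list in one expression.
records = [
    'John, Doe, 123 Main Street, New York City, NY, 01234',
    'Jane, Smith, 456 East Street, Philadelphia, PA, 67890',
    'Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680',
]

print(len(records))
```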

## ... and here's the iteration

In [7]:
def find_state(s):
    return s.split(',')[4].strip()

# Named `records` rather than `str` so we don't shadow the built-in str type
records = list()
records.append('John, Doe, 123 Main Street, New York City, NY, 01234')
records.append('Jane, Smith, 456 East Street, Philadelphia, PA, 67890')
records.append('Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680')

for s in records:
    print(find_state(s))

NY
PA
CA


## Not quite there yet, because we need to keep track of all of the results as we iterate. Let's use a list to do so.

In [8]:
def find_state(s):
    return s.split(',')[4].strip()

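# Named `records` rather than `str` so we don't shadow the built-in str type
records = list()
records.append('John, Doe, 123 Main Street, New York City, NY, 01234')
records.append('Jane, Smith, 456 East Street, Philadelphia, PA, 67890')
records.append('Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680')

# Collect one state per record as we iterate
results = list()
for s in records:
    results.append(find_state(s))

results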

['NY', 'PA', 'CA']
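
The loop-and-append pattern above is often written more compactly as a list comprehension. The sketch below is an equivalent formulation rather than a required step; it reuses `find_state` from the notebook, and the name `records` is ours for illustration.

```python
def find_state(s):
    return s.split(',')[4].strip()

records = [
    'John, Doe, 123 Main Street, New York City, NY, 01234',
    'Jane, Smith, 456 East Street, Philadelphia, PA, 67890',
    'Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680',
]

# Build the list of states in one expression instead of a loop with append.
results = [find_state(s) for s in records]

print(results)
```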

## And finally, we just need to sort this list of results

As it turns out, there are multiple ways to sort a list in Python. The built-in list method `sort()` sorts the list in place (i.e., it reorders the original list rather than creating a new one) and returns `None`. The upshot is that to see the results of the sort, we need to display the `results` object after sorting.

In [9]:
def find_state(s):
    return s.split(',')[4].strip()

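# Named `records` rather than `str` so we don't shadow the built-in str type
records = list()
records.append('John, Doe, 123 Main Street, New York City, NY, 01234')
records.append('Jane, Smith, 456 East Street, Philadelphia, PA, 67890')
records.append('Mike, Lee, 789 West Avenue, Los Angeles, CA, 24680')

results = list()
for s in records:
    results.append(find_state(s))

# sort() reorders `results` in place and returns None
results.sort()

results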

['CA', 'NY', 'PA']
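
Another of the ways to sort is the built-in `sorted()` function, which returns a new sorted list and leaves the original untouched. The sketch below contrasts it with the in-place approach; the name `ordered` is ours for illustration.

```python
results = ['NY', 'PA', 'CA']

# sorted() returns a new list; `results` keeps its original order.
ordered = sorted(results)

print(ordered)
print(results)
```

Which one to use is mostly a matter of whether the original order still matters: `sort()` avoids making a copy, while `sorted()` keeps the unsorted list available.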