<h1 style="text-align:center;">Quick Review: Python Collections</h1>

<h2 style="text-align:left;">
List: `[1, "two", 3]`
<span style="float:right;margin-right:2em">"Examples of"</span>
</h2>

<h2 style="text-align:left;">
Tuple: (1, "two", 3)
<span style="float:right;margin-right:2em">"Related Information"</span>
</h2>

<h2 style="text-align:left;">
Dictionary: {"one": 1, "two": 2}
<span style="float:right;margin-right:2em">"Save for later"</span>
</h2>

<h2 style="text-align:left;">
Set: {"one", 2, "three"}
<span style="float:right;margin-right:2em">"Check for membership"</span>
</h2>

<h1 style="text-align:center;">We can iterate over any of these</h1>

In [None]:
collection = {2, 1, 3}
for item in collection:
    print(item)
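A minimal sketch of the same loop applied to each collection type from the review above. The names `collections`, `seen`, and `items` are made up for this illustration.

```python
# The same for-loop works on every built-in collection.
collections = [
    [1, "two", 3],           # list
    (1, "two", 3),           # tuple
    {"one": 1, "two": 2},    # dict: iterating yields the keys
    {"one", 2, "three"},     # set: iteration order is not guaranteed
]

seen = []
for collection in collections:
    items = [item for item in collection]
    seen.append(items)
    print(type(collection).__name__, items)
```

Note that iterating over a dictionary gives its keys, and a set may come back in any order.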

<h1 style="text-align:center;">Python is really good at reusing concepts.</h1>

<h3>The same tools we use to interact with these collections will come up again and again in other parts of the language.</h3>


<h1 style="text-align:center;">Strings!</h1>

In [None]:
# single quote strings
# double quote strings
# string literals
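One possible way to fill in the cell above. The variable names are illustrative only.

```python
single = 'single quote strings'
double = "double quote strings"

# Single and double quotes produce the same kind of object.
same = 'abc' == "abc"

# Triple quotes let a literal span several lines.
multi = """a triple-quoted literal
can span several lines"""

print(same)
print(multi)
```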

<h1 style="text-align:center;">A string acts like a tuple of characters</h1>

In [None]:
my_string = 'abc'
print(my_string[1])

In [None]:
my_string[1] = '!'
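The assignment above fails because strings, like tuples, are immutable. A small sketch that catches the error and builds a new string instead (the names `message` and `changed` are illustrative):

```python
my_string = 'abc'

try:
    my_string[1] = '!'          # strings do not support item assignment
except TypeError as error:
    message = str(error)
    print('TypeError:', message)

# Build a new string from the pieces around index 1 instead.
changed = my_string[:1] + '!' + my_string[2:]
print(changed)
```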

<h1 style="text-align:center;">What happens when we use inequalities on strings?</h1>

<h1 style="text-align:center;">Does the same thing happen with collections?</h1>

In [None]:
print('abc' < 'def')
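A quick sketch for experimenting with both questions. Strings compare character by character, and lists and tuples compare element by element in the same way. Sets are different: `<` on sets tests for a proper subset.

```python
results = {
    "'abc' < 'def'": 'abc' < 'def',
    "'abc' < 'abd'": 'abc' < 'abd',                  # first difference decides
    "'Z' < 'a'": 'Z' < 'a',                          # uppercase code points come first
    "[1, 2, 3] < [1, 2, 4]": [1, 2, 3] < [1, 2, 4],
    "(1, 2) < (1, 2, 0)": (1, 2) < (1, 2, 0),        # a prefix sorts before the longer sequence
    "{1} < {1, 2}": {1} < {1, 2},                    # proper subset, not ordering
}

for expression, value in results.items():
    print(expression, '->', value)
```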

<h1 style="text-align:center;">If strings are immutable, what happens when you add them?</h1>

In [None]:
print('abc' + 'def')
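Adding strings builds a new string and leaves the originals untouched. A short sketch (the names `first` and `alias` are illustrative):

```python
first = 'abc'
alias = first            # a second name for the same string object

first = first + 'def'    # creates a new string and rebinds the name `first`

print(first)
print(alias)             # the original string is unchanged
```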

# Indices are zero indexed

In [None]:
my_list = ['one', 'two', 'three']
my_string = 'abc'
# get first and second characters
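One way to complete the cell above: the first item lives at index `0` and the second at index `1`, for lists and strings alike.

```python
my_list = ['one', 'two', 'three']
my_string = 'abc'

first_word, second_word = my_list[0], my_list[1]
first_char, second_char = my_string[0], my_string[1]

print(first_word, second_word)
print(first_char, second_char)
```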

#  Trick question!

### What is `my_list[len(my_list)]`?
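A sketch of what happens: the last valid index is `len(my_list) - 1`, so indexing with `len(my_list)` raises an `IndexError`. The `raised` flag is only there for illustration.

```python
my_list = ['one', 'two', 'three']

raised = False
try:
    my_list[len(my_list)]            # index 3 is one past the end
except IndexError as error:
    raised = True
    print('IndexError:', error)

last = my_list[len(my_list) - 1]     # the last valid index
print(last)
```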

Negative indices count from the end of the list
------------------------------------------------------------------

`numbers[-i]` is equivalent to `numbers[len(numbers) - i]`

In [None]:
numbers = ['one', 'two', 'three', 'four', 'five']
# get last and second to last words
# show -1 and len - 1 are equal
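A possible completion of the cell above:

```python
numbers = ['one', 'two', 'three', 'four', 'five']

last = numbers[-1]
second_to_last = numbers[-2]
print(last, second_to_last)

# -1 and len(numbers) - 1 refer to the same position.
print(numbers[-1] == numbers[len(numbers) - 1])
```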

Indices support assignment
----------------------------------------

In [None]:
numbers = [0, 1, 2, 3, 4, 5]
# replace the fifth element with 'five'
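A sketch of index assignment. It assumes the comment above means the value `5`, which sits at index `5`; if the fifth position (index `4`) is intended, change the index accordingly.

```python
numbers = [0, 1, 2, 3, 4, 5]

# Assumption: replace the value 5, stored at index 5.
numbers[5] = 'five'
print(numbers)
```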

Repeated indices retrieve data from nested collections
------------------------------------------------------------------------------

In [None]:
table = [
    [(0, 0), (0, 1), (0, 2)],
    [(1, 0), (1, 1), (1, 2)],
    [(2, 0), (2, 1), (2, 2)]
]

print(table[0][1])  # walk through this step by step

What will `print(table[1][2][0])` show? 

How about `print(table[0][0][-1])`?
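Each index peels off one layer of nesting. A step-by-step sketch using the same `table` (the intermediate names `row` and `cell` are illustrative):

```python
table = [
    [(0, 0), (0, 1), (0, 2)],
    [(1, 0), (1, 1), (1, 2)],
    [(2, 0), (2, 1), (2, 2)],
]

row = table[0]         # first row: a list of tuples
cell = row[1]          # second tuple in that row
print(cell)            # same as table[0][1]

a = table[1][2][0]     # row 1 -> tuple (1, 2) -> its first item
b = table[0][0][-1]    # row 0 -> tuple (0, 0) -> its last item
print(a, b)
```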

# Who has had trouble with zero indexing?

Slices!
----------

`list[start:stop]` means 
 * from index `start`
 * up to, but not including, index `stop`

In [None]:
numbers = ['one', 'two', 'three', 'four', 'five']
print(numbers[2:4])

### What slice will return the first 3 elements?
### If len(numbers) is the number of items in the list, what slice will return the first half of the list?

#### Slicing means from index *start* up to but not including index *stop*

What will `print(numbers[2:2])` show?

How about `print(numbers[5:2])`?
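A sketch for checking the questions above. Integer division (`//`) is used for the halfway point, so an odd-length list gives the shorter half.

```python
numbers = ['one', 'two', 'three', 'four', 'five']

first_three = numbers[0:3]
first_half = numbers[0:len(numbers) // 2]
print(first_three)
print(first_half)

# When start is not before stop, the slice is empty rather than an error.
empty_same = numbers[2:2]
empty_backwards = numbers[5:2]
print(empty_same, empty_backwards)
```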

### In the slice `a:b`, both `a` and `b` are optional.
- `numbers[:b]` is equivalent to `numbers[0:b]`.
- `numbers[a:]` is equivalent to `numbers[a:len(numbers)]`
- what does `numbers[:]` do?
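With both ends omitted, the slice covers the whole list and returns a new (shallow) copy. A small sketch, with `duplicate` as an illustrative name:

```python
numbers = ['one', 'two', 'three', 'four', 'five']

duplicate = numbers[:]           # a new list with the same items
duplicate.append('six')          # changing the copy...

print(numbers)                   # ...leaves the original alone
print(duplicate)
```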

# In the slice `a:b`, `a` and `b` can also be negative.

In [None]:
numbers = ['one', 'two', 'three', 'four', 'five', 'six']
print(numbers[-3])

### The way to get indices correct is to imagine each number coming just before its element.
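A sketch of that mental picture, with the positions drawn as comments above and below the elements:

```python
numbers = ['one', 'two', 'three', 'four', 'five', 'six']

#   0     1     2       3      4      5
#   one   two   three   four   five   six
#  -6    -5    -4      -3     -2     -1

third_from_end = numbers[-3]
last_three = numbers[-3:]        # from index -3 to the end
middle = numbers[1:-1]           # drop the first and last items

print(third_from_end)
print(last_three)
print(middle)
```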

Step size!
--------------

Slices have a third optional parameter that controls the stride

`list[start:stop:step]` means 
 * from index `start`
 * return every element `step` apart
 * up to but not including `stop`

In [None]:
print(my_list)
my_list[::-1]

### Tricky question
The slice `[start:stop:step]` starts at index *start* and takes every *step*-th element, up to but not including index *stop*.

Which one of these does what you expect it to?
* `numbers[2:5:-1]`
* `numbers[5:2:-1]`
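A sketch for trying both slices, assuming the five-word `numbers` list used elsewhere in these notes. With a negative step the slice walks backwards, so `start` has to be the larger index.

```python
numbers = ['one', 'two', 'three', 'four', 'five']

forwards_bounds = numbers[2:5:-1]    # cannot walk backwards from 2 to 5
backwards_bounds = numbers[5:2:-1]   # start past the end is clamped to the last item

print(forwards_bounds)
print(backwards_bounds)
```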

In [None]:
# What elements will the following slices return?
numbers = ['one', 'two', 'three', 'four', 'five']
print(numbers[:-2])
print(numbers[2:-2])
print(numbers[-2:2])
print(numbers[-2:])
print(numbers[-1::-1])

# Given a string `my_string`, rotate it `i` steps to the left.

## `012345` rotated 2 steps to the left becomes `234501`
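One possible solution, sketched with two slices. The function name `rotate_left` is made up for this example; taking `i` modulo the length is an added assumption so that large values of `i` wrap around.

```python
def rotate_left(my_string, i):
    """Return my_string rotated i steps to the left."""
    if not my_string:
        return my_string
    i = i % len(my_string)                 # wrap around for large i
    return my_string[i:] + my_string[:i]   # tail first, then the head


print(rotate_left('012345', 2))
```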

---------------------------------------------

# Challenge: Given a string, print all of its substrings

### For example, given string `abc`, print `abc`, `ab`, `bc`, `a`, `b`, `c`
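One possible solution, sketched with nested loops over slice lengths and start positions. The function name `substrings` is illustrative, and the longest-first order is chosen to match the example above.

```python
def substrings(text):
    """Return every non-empty substring of text, longest first."""
    found = []
    for length in range(len(text), 0, -1):           # len(text) down to 1
        for start in range(len(text) - length + 1):  # every valid start position
            found.append(text[start:start + length])
    return found


for piece in substrings('abc'):
    print(piece)
```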

# We can assign into slices

In [None]:
numbers[2:5] = 2, 3, 4
print(numbers)

# Example of assigning with unequal lengths
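A sketch of what this could look like. With a plain `start:stop` slice, the replacement does not need to match the length of the slice, so the list can shrink or grow.

```python
numbers = [0, 1, 2, 3, 4, 5]

numbers[1:4] = ['a']             # three items replaced by one: the list shrinks
shrunk = list(numbers)
print(shrunk)

numbers[1:1] = [10, 20]          # an empty slice: the items are inserted
grown = list(numbers)
print(grown)
```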

# Bonus Question:

### Use slices to copy every odd-indexed element of a list into the even-indexed element immediately preceding it.

### `[0, 1, 2, 3] -> [1, 1, 3, 3]`
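One possible answer, using a step of 2 on both sides. This sketch assumes the list has an even number of elements; with a step other than 1, both slices must have the same length or Python raises a `ValueError`.

```python
values = [0, 1, 2, 3]

# values[1::2] is the odd positions, values[::2] is the even positions.
values[::2] = values[1::2]
print(values)
```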

# Format strings!

### Python keeps reinventing string formatting. The latest approach is format strings (f-strings), available in Python 3.6 and later.

In [1]:
name = "Jared"
place = "Noisebridge"

# example of a format string
print(f"Hello {name}")
print(f"Welcome to {place}")


Hello Jared
Welcome to Noisebridge



# Formatting options for everyone!

### Check out https://pyformat.info/ for details.

In [None]:
import math

name = 'Noisebridge'
my_name = 'Jared'

print(f'I love {math.pi} -> I love {math.pi * 10:.4}')
print(f'I love fixed width {name:^11}')
print(f'I love fixed width {my_name:^11}')

Before f-strings, the preferred way to format strings was the `.format` method.

In [None]:
name = 'Noisebridge'
print('Hello {0}, {1}!'.format(name, "nice to meet you"))

### Use dir to get a list of all methods associated with a string

In [None]:
dir(str)

### There are lots of methods to work with strings:
- split
- join
- strip
- lower/upper
- replace
- startswith/endswith

### split and join are the hardest to understand, but very useful

In [None]:
my_str = "1 2 3 4"
# split

# rejoin on space

# rejoin on <>

# Convert the string `'1, 2, 3, 4'` into the list of integers `[1, 2, 3, 4]`
>
> Remember that you can convert a number represented as a string into an integer with `int('5')`
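A possible walk-through of the split and join exercises above. The variable names are illustrative.

```python
my_str = "1 2 3 4"

parts = my_str.split()           # split on whitespace into a list of strings
spaced = ' '.join(parts)         # rejoin on a space
angled = '<>'.join(parts)        # rejoin on <>
print(parts, spaced, angled)

# Split on ', ' and convert each piece with int().
integers = [int(piece) for piece in '1, 2, 3, 4'.split(', ')]
print(integers)
```

Note that `join` is called on the separator string, not on the list.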

# Challenge: Use replace, split, strip and slicing to return the last five words of the first sentence.

In [4]:
science = """
…In that Empire, the Art of Cartography attained such Perfection that the map of a
single Province occupied the entirety of a City, and the map of the Empire, the entirety
of a Province. In time, those Unconscionable Maps no longer satisfied, and the
Cartographers Guilds struck a Map of the Empire whose size was that of the Empire, and
which coincided point for point with it. The following Generations, who were not so
fond of the Study of Cartography as their Forebears had been, saw that that vast Map
was Useless, and not without some Pitilessness was it, that they delivered it up to the
Inclemencies of Sun and Winters. In the Deserts of the West, still today, there are
Tattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is
no other Relic of the Disciplines of Geography.
—Suarez Miranda,Viajes devarones prudentes, Libro IV,Cap. XLV, Lerida, 1658
"""
science

'\n…In that Empire, the Art of Cartography attained such Perfection that the map of a\nsingle Province occupied the entirety of a City, and the map of the Empire, the entirety\nof a Province. In time, those Unconscionable Maps no longer satisfied, and the\nCartographers Guilds struck a Map of the Empire whose size was that of the Empire, and\nwhich coincided point for point with it. The following Generations, who were not so\nfond of the Study of Cartography as their Forebears had been, saw that that vast Map\nwas Useless, and not without some Pitilessness was it, that they delivered it up to the\nInclemencies of Sun and Winters. In the Deserts of the West, still today, there are\nTattered Ruins of that Map, inhabited by Animals and Beggars; in all the Land there is\nno other Relic of the Disciplines of Geography.\n—Suarez Miranda,Viajes devarones prudentes, Libro IV,Cap. XLV, Lerida, 1658\n'

In [12]:
science.replace("\n", " ").split('.')[0].replace(',', '').split()[-5:]

['the', 'entirety', 'of', 'a', 'Province']

### Regular Expressions!

A regular expression is a pattern that matches some set of strings.
* the regular expression `abc` matches exactly one string: "abc"
* the regular expression `\d` matches any single digit, 0-9
* the regular expression `.` matches any single character (except a newline, by default)
* `*` matches any number of repetitions of the previous character. `a*` matches "", "a", "aa", "aaa"...
* `+` matches one or more repetitions
* `?` matches zero or one repetitions
* `()` is a group that can be operated on collectively. `(ABC)?` matches "" or "ABC"
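The rules above can be checked with `re.fullmatch`, which succeeds only when the whole string fits the pattern. The helper `matches` is made up for this sketch.

```python
import re


def matches(pattern, text):
    """Return True if the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


print(matches(r'abc', 'abc'))       # literal text
print(matches(r'\d', '7'))          # a single digit
print(matches(r'\d', 'a'))          # not a digit
print(matches(r'.', 'x'))           # any single character
print(matches(r'a*', ''))           # zero repetitions is fine for *
print(matches(r'a+', ''))           # + needs at least one
print(matches(r'(ABC)?', 'ABC'))    # the whole group is optional
```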


In [None]:
import re
re_digit = re.compile(r'\d')  # raw string so the backslash reaches the regex engine
match = re_digit.match('1')
if match is not None:
    print(match.group())

In [None]:
pattern, string = r'\d+', 'call 555 1234'  # example values; any pattern and text work here
for match in re.finditer(pattern, string):
    print(match)

Special characters in regular expressions:
   - \d any digit
   - \ escape
   - . any single character
   - \* between 0 and infinite repetitions of the previous character
   - \+ between 1 and infinite repetitions of the previous character
   - ? between 0 and 1 repetitions of the previous character
   - {i,j} between i and j repetitions of the previous character
   - () group that can be operated on, or referenced later with \1 ... \9
   - lots more ...
    
Let's make a regular expression that matches a phone number!
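A minimal sketch, assuming a 10-digit number written as `415-555-0123` with optional dashes. Real phone formats vary widely, so treat the pattern as a starting point; `{3}` means exactly three repetitions.

```python
import re

# Assumed format: 3 digits, optional dash, 3 digits, optional dash, 4 digits.
re_phone = re.compile(r'\d{3}-?\d{3}-?\d{4}')

text = 'Call 415-555-0123 or 4155550199 after noon.'
found = re_phone.findall(text)
print(found)

print(re_phone.fullmatch('555-0123') is None)   # too short to match
```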

Check out:
=========
* string methods for more useful string manipulation
* numpy for fancier indexing
* the itertools module for yet more collection manipulation
* https://regexr.com to learn regular expressions