# Types and Collections

* A type determines how data is represented and its rules for use.
* Collections let us assemble data into facts about the world.

# List
* Ordered
* Mutable
* 'Examples of'

In [None]:
# examples of creation, indexing and assignment
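
A minimal sketch of what this cell could contain. The `colors` data is made up for illustration.

```python
# hypothetical sample data
colors = ['red', 'green', 'blue']    # creation

first = colors[0]                    # indexing
colors[1] = 'yellow'                 # assignment works because lists are mutable
colors.append('purple')              # lists can also grow in place

print(first)
print(colors)
```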

# Tuple
* Ordered
* Immutable
* 'Related information'

In [None]:
# examples of creation, indexing and assignment
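
A minimal sketch of what this cell could contain. The `point` pair is made up for illustration; the assignment is expected to fail because tuples are immutable.

```python
# hypothetical sample data: an (x, y) pair of related values
point = (3, 4)           # creation

first = point[0]         # indexing
x, y = point             # unpacking into separate names

assignment_error = None
try:
    point[0] = 10        # assignment is not allowed on a tuple
except TypeError as error:
    assignment_error = error
    print('tuples are immutable:', error)

print(first, x, y)
```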

# Dictionary
* Access values based on keys
* Keys must be hashable, which in practice means immutable (strings, numbers, tuples of immutable values)
* 'I want this information on demand'

In [None]:
# examples of creation, retrieval and assignment
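
A minimal sketch of what this cell could contain. The names and ages are made up for illustration.

```python
# hypothetical sample data: names mapped to ages
ages = {'ada': 36, 'grace': 45}    # creation

print(ages['ada'])                 # retrieval by key
ages['alan'] = 41                  # assignment adds a new key
ages['ada'] = 37                   # assigning to an existing key replaces its value
print(ages)

# a tuple works as a key; a list does not
grid = {(0, 0): 'origin'}
key_error = None
try:
    grid[[1, 1]] = 'nope'
except TypeError as error:
    key_error = error
    print('lists cannot be keys:', error)
```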

# Set
* unordered
* unique
* 'member of' or 'property of'

In [None]:
# examples of creation, and membership test
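
A minimal sketch of what this cell could contain. The city names are made up for illustration.

```python
# hypothetical sample data; the duplicate 'paris' is stored only once
visited = {'paris', 'tokyo', 'paris'}    # creation

unique_count = len(visited)
has_tokyo = 'tokyo' in visited           # membership test
has_lima = 'lima' in visited

visited.add('lima')                      # sets are mutable

print(unique_count, has_tokyo, has_lima)
print(sorted(visited))                   # sorted, because a set has no order of its own
```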

A string acts like a tuple of characters
===============================
* ordered
* immutable

Indices!
-----------

# Indices are zero indexed
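
A short illustration, using a made-up `letters` list: the first element lives at index `0`, so the last one lives at `len(letters) - 1`.

```python
letters = ['a', 'b', 'c']    # hypothetical sample data

first = letters[0]                   # the first element is at index 0
last = letters[len(letters) - 1]     # the last element is at index len - 1
print(first, last)

try:
    letters[len(letters)]            # one past the end
except IndexError as error:
    print('out of range:', error)
```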

Negative indices count from the end of the list
------------------------------------------------------------------

`numbers[-i]` is equivalent to `numbers[len(numbers) - i]`

In [4]:
numbers = [0, 1, 2, 3, 4, 5]
print(numbers[-1], '==', numbers[len(numbers) - 1])
print(numbers[-4], '==', numbers[len(numbers) - 4])

5 == 5
2 == 2


Indices support assignment
----------------------------------------

In [5]:
numbers = [0, 1, 2, 3, 4, 5]
numbers[5] = 'five'

print(numbers)

[0, 1, 2, 3, 4, 'five']


Repeated indices retrieve data from nested collections
------------------------------------------------------------------------------

In [6]:
table = [
    [(0, 0), (0, 1), (0, 2)],
    [(1, 0), (1, 1), (1, 2)],
    [(2, 0), (2, 1), (2, 2)]
]

print(table[0][1])

(0, 1)


What will `print(table[1][2][0])` show? 

How about `print(table[0][0][-1])`?

Slices!
----------

`list[start:stop]` means 
 * from index `start`
 * up to, but not including, index `stop`

In [None]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(numbers[2:5])

In [None]:
numbers[2:5] = 'two', 'three', 'four'
print(numbers)

### Slicing means from index `start` up to but not including index `stop`

What will `print(numbers[2:2])` show?

How about `print(numbers[5:2])`?

### In the slice `a:b`, both `a` and `b` are optional.
- `numbers[:b]` is equivalent to `numbers[0:b]`.
- `numbers[a:]` is equivalent to `numbers[a:len(numbers)]`.
- What does `numbers[:]` do?
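
The first two equivalences can be checked directly. This is a small sketch; the values of `a` and `b` are arbitrary.

```python
numbers = [0, 1, 2, 3, 4, 5]
a, b = 2, 4    # arbitrary bounds chosen for illustration

head = numbers[:b]
tail = numbers[a:]

print(head, head == numbers[0:b])
print(tail, tail == numbers[a:len(numbers)])
```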

### In the slice `a:b`, `a` and `b` can also be negative.

In [None]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(numbers[-4:])

### The way to get slice indices right is to imagine each index pointing at the gap just before its element.
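
One way to draw that picture, using a made-up four-letter list. The comment lines are only a diagram; the two slices below pick out the same elements.

```python
#  elements:     a    b    c    d
#  indices:    0    1    2    3    4
#  negative:  -4   -3   -2   -1
letters = ['a', 'b', 'c', 'd']    # hypothetical sample data

positive = letters[1:3]      # from the gap before 'b' to the gap before 'd'
negative = letters[-3:-1]    # the same two gaps, counted from the end

print(positive)
print(negative)
```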

Step size!
--------------

Slices have a third optional parameter, the step, that controls the stride.

`list[start:stop:step]` means 
 * from index `start`
 * return every element `step` apart
 * up to but not including `stop`
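
A small sketch of positive steps, which leaves the negative-step question below open.

```python
numbers = list(range(10))    # [0, 1, ..., 9]

every_third = numbers[1:8:3]    # start at index 1, step 3, stop before index 8
evens = numbers[::2]            # start and stop omitted, step 2

print(every_third)
print(evens)
```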

Which one of these does what you expect it to?
* `numbers[2:5:-1]`
* `numbers[5:2:-1]`

In [None]:
# What elements will the following slices return?
numbers = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(numbers[:-2])
print(numbers[2:-2])
print(numbers[-2:2])
print(numbers[-2:])
print(numbers[-5::-1])

Copying!
-------------

Making a slice performs a shallow copy.

In [None]:
numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

copy = numbers[:5]  # the slice is a new list
copy[:] = ['zero', 'one', 'two', 'three', 'four']  # change the copy in place
print(copy)
print(numbers[:5])

In [None]:
deep_data = [[0], [1], [2]]

copy = deep_data[:2]
copy[0][0] = 'zero'
print(copy)
print(deep_data[:2])

NumPy slices are views onto the original array; NumPy copies data only when you explicitly ask, for example with `.copy()`.

In [None]:
import numpy

numbers = numpy.array([0, 1, 2, 3, 4, 5])
view = numbers[-3:]
view[:] = [0, 0, 0]

print(view)
print(numbers)

Concatenation
--------------------

Use `+` to concatenate lists.

In [None]:
print([0, 1, 2] + ['three', 'four', 'five'])

How would you rotate a list `i` steps to the left?

`012345` rotated 2 steps to the left becomes `234501`

How would you overwrite every odd-indexed element in a list with the even-indexed element immediately preceding it?

In [None]:
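# redefine numbers as a plain list (an earlier cell rebound it to a numpy array)
numbers = [0, 1, 2, 3, 4, 5]
i = 2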
print(numbers[i:] + numbers[:i])

In [None]:
numbers = [0, 1, 2, 3, 4, 5]
numbers[1::2] = numbers[::2]
print(numbers)

### Strings!

If strings act like tuples of characters, what will the following do?
* `"cat" + "goes meow"`
* `"cat goes meow"[:3]`
* `"cat goes meow"[::-1]`
* `"cat goes meow"[:2] = 'p', 'a'`
* `for c in "my string": print(c)`

### Strings can be manipulated with string methods

In [None]:
s = "my string"
print(s.upper())

If strings are immutable, what is happening here?
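
One way to investigate the question: compare the original string with the value the method returns. The name `shouted` is just for illustration.

```python
s = "my string"
shouted = s.upper()    # the method returns a new string

print(s)               # the original is unchanged
print(shouted)
print(shouted is s)    # they are two different objects
```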

### Use dir to get a list of all methods associated with a string

In [None]:
dir("hello world")

### Methods operate on objects
* Python objects:
  * `'hello noisebridge'`
  * `5`
  * `['one', 'two', 'three']`
* We say they are instances of the types string (`str`), integer (`int`) and list (`list`).
* Every value in Python is an object, and methods are called on all of them the same way: `value.method()`.

### There are lots of excellent methods to work with strings:
- split
- join
- strip
- lower/upper
- replace
- startswith

In [None]:
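comma_separated_values = '1, 2, 3, 4'
# split on commas; the space after each comma stays attached to its piece
list_values = comma_separated_values.split(',')

print(list_values)
# join is the inverse of split: glue the pieces back together with commas
original_values = ','.join(list_values)
print(original_values)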

Challenge question: convert the string '1, 2, 3, 4' into the list of integers `[1, 2, 3, 4]`
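
One possible solution, not the only one: split on the commas, strip the leftover spaces, and convert each piece with `int`.

```python
text = '1, 2, 3, 4'

# split into pieces, drop surrounding whitespace, convert each piece to an integer
integers = [int(piece.strip()) for piece in text.split(',')]

print(integers)
```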

### Triple-quoted string literals help you write text with newlines in it.

In [None]:
poem = '''
        hickory dickory dock
        the mouse ran up the clock
'''

### Format Strings!

Python keeps reinventing string formatting; the latest approach is format strings, also called f-strings (Python 3.6 and later).

In [None]:
# example of format strings
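
A minimal sketch of what this cell could contain. The `count` variable and the message text are made up for illustration.

```python
name = 'Noisebridge'
count = 3    # hypothetical value

# prefix the string with f; anything inside {} is evaluated as an expression
greeting = f'Hello {name}, you have {count + 1} messages'
print(greeting)
```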

Formatting options for everyone!

Check out https://pyformat.info/ for details.

In [None]:
import math

name = 'Noisebridge'
my_name = 'Jared'

print(f'I love {math.pi} -> I love {math.pi * 10:.3}')
print(f'I love fixed width {name:^11}')
print(f'I love fixed width {my_name:^11}')

The previously preferred way of formatting strings was the `.format` method.

In [None]:
name = 'Noisebridge'
print('Hello {0}, {1}!'.format(name, "nice to meet you"))

You can get pretty silly with these

In [None]:
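import sys
# any expression can go inside the braces, even sys.exit(), which stops the program before anything is printed
print(f'Hello {sys.exit()}')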

### Regular Expressions!

A regular expression is a pattern that matches some set of strings.
* the regular expression `abc` matches exactly one string: "abc"
* the regular expression `\d` matches any single digit, 0-9
* the regular expression `.` matches any single character (except a newline, by default)
* `*` matches zero or more repetitions of the previous character. `a*` matches "", "a", "aa", "aaa"...
* `+` matches one or more repetitions
* `?` matches zero or one repetitions
* `()` is a group that can be operated on collectively. `(ABC)?` matches "" or "ABC"
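
The rules above can be checked with `re.fullmatch`, which succeeds only when the whole string matches the pattern. This is a small sketch; the test strings are made up.

```python
import re

# (pattern, candidate string) pairs; the candidates are illustrative
checks = [
    (r'abc', 'abc'),
    (r'\d', '7'),
    (r'.', 'x'),
    (r'a*', ''),        # zero repetitions is fine for *
    (r'a+', ''),        # + needs at least one
    (r'(ABC)?', 'ABC'),
    (r'(ABC)?', 'AB'),  # the group matches as a whole or not at all
]

results = [re.fullmatch(pattern, text) is not None for pattern, text in checks]

for (pattern, text), matched in zip(checks, results):
    print(repr(pattern), repr(text), matched)
```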


In [None]:
import re
re_digit = re.compile(r'\d')  # raw string keeps the backslash intact
match = re_digit.match('1')
if match is not None:
    print(match.group())

In [None]:
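pattern = r'\d+'  # illustrative pattern: one or more digits
string = 'room 12, floor 3'  # illustrative text to search
for match in re.finditer(pattern, string):
    print(match)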

Special characters in regular expressions:
   - `\d` any digit
   - `\` escapes the next character
   - `.` any single character
   - `*` between 0 and infinite repetitions of the previous character or group
   - `+` between 1 and infinite repetitions of the previous character or group
   - `?` between 0 and 1 repetitions of the previous character or group
   - `{i,j}` between i and j repetitions of the previous character or group
   - `()` group that can be operated on, or referenced later with `\1` ... `\9`
   - lots more ...
    
Let's make a regular expression that matches a phone number!
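
One possible sketch, assuming numbers written as `555-0123` with an optional three-digit area code in front, as in `415-555-0123`. Real phone numbers come in many more formats; the pattern and the sample text are illustrative only.

```python
import re

# optional area code group, then three digits, a dash, and four digits
phone = re.compile(r'(\d{3}-)?\d{3}-\d{4}')

# hypothetical text to search
text = 'Call 415-555-0199 or 555-0123.'
found = [match.group() for match in phone.finditer(text)]
print(found)

# fullmatch checks that a whole string is a phone number
print(phone.fullmatch('415-555-0123') is not None)
print(phone.fullmatch('5550123') is not None)
```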

Check out:
=========
* string methods for more useful string manipulation
* numpy for fancier indexing
* the itertools module for yet more collection manipulation
* https://regexr.com to learn regular expressions