### Big idea:

K-means is an unsupervised learning tool for identifying clusters within datasets.

- Algorithm: Pick arbitrary points as guesses for the center of each group. Assign each data point to the group whose center is closest. Within each group, average the points to get a new guess for the center of the group. Repeat multiple times: assign data and average the points.

- Functions needed: 
    * mean(data)
    * dist(point, point)
    * assign_data(centroids, points)
    * compute_centroids(groups)
    * k_means(points)
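The algorithm and function list above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `k` and `iterations` parameters, the use of `random.sample` for the initial guesses, and the fixed number of rounds are all assumptions added here.

```python
from collections import defaultdict
from math import fsum, sqrt
from random import sample

def mean(data):
    'Accurate arithmetic mean of an iterable of numbers'
    data = list(data)
    return fsum(data) / len(data)

def dist(p, q):
    'Euclidean distance between two points of equal dimension'
    return sqrt(fsum((a - b) ** 2 for a, b in zip(p, q)))

def assign_data(centroids, points):
    'Group each point under the centroid it is closest to'
    groups = defaultdict(list)
    for point in points:
        closest = min(centroids, key=lambda c: dist(point, c))
        groups[closest].append(point)
    return dict(groups)

def compute_centroids(groups):
    'Average each group, coordinate by coordinate, to get new centers'
    return [tuple(map(mean, zip(*group))) for group in groups]

def k_means(points, k=2, iterations=10):
    'Return up to k centroids after a fixed number of assign/average rounds'
    # Assumption: the initial guesses are k randomly chosen data points.
    centroids = [tuple(p) for p in sample(points, k)]
    for _ in range(iterations):
        # A centroid that attracts no points drops out of the result.
        groups = assign_data(centroids, points)
        centroids = compute_centroids(groups.values())
    return centroids

points = [(1, 1), (1, 2), (2, 1), (9, 9), (9, 10), (10, 9)]
print(k_means(points, k=2))
```

Centroids are kept as tuples so they can serve as dictionary keys in `assign_data`. The starting points are random, so the order of the returned centroids can differ between runs.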

- Sequences are a very common type of iterable. Lists, tuples, and strings are all sequences. 
- Sequences are iterables that have a specific set of features. They can be indexed starting from 0 and ending at one less than the length of the sequence, they have a length, and they can be sliced. Lists, tuples, strings, and all other sequences work this way.
- Lots of things in Python are iterables, but not all iterables are sequences. Sets, dictionaries, files, and generators are all iterables but none of these things are sequences.
- So anything that can be looped over with a for loop is an iterable, and sequences are one type of iterable, but Python has many other kinds of iterables as well.
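One way to check the distinction above is with the abstract base classes in `collections.abc`. The helper name `classify` and the sample objects are made up for this sketch.

```python
from collections.abc import Iterable, Sequence

def classify(obj):
    'Return (is_iterable, is_sequence) for an object'
    return isinstance(obj, Iterable), isinstance(obj, Sequence)

samples = {
    'list': [1, 2, 3],
    'tuple': (1, 2, 3),
    'str': 'abc',
    'set': {1, 2, 3},
    'dict': {'a': 1},
    'generator': (n for n in range(3)),
}

for name, obj in samples.items():
    is_iterable, is_sequence = classify(obj)
    print(f'{name:10} iterable={is_iterable} sequence={is_sequence}')
```

Every sample reports `iterable=True`, but only the list, tuple, and string report `sequence=True`.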

#### Some messy tips

- For big computations, localize global lookups, e.g. bind frequently called global functions to local names.
   - Disassemble the function to see which lookups it performs:
    ```
    from dis import dis
    dis(func_name)
    ```
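As a rough illustration of the tip above, the two hypothetical functions below differ only in where `sqrt` is looked up. The function names are invented for this sketch, and the exact opcode names in the `dis` output vary between Python versions.

```python
from dis import dis
from math import sqrt

def total_global(values):
    'Sum of square roots; sqrt is looked up as a global on every call'
    total = 0.0
    for v in values:
        total += sqrt(v)
    return total

def total_local(values, sqrt=sqrt):
    'Same computation; sqrt is bound to a local name at definition time'
    total = 0.0
    for v in values:
        total += sqrt(v)
    return total

# The first listing shows a LOAD_GLOBAL for sqrt inside the loop;
# the second loads sqrt as a local variable instead.
dis(total_global)
dis(total_local)
```

Whether the change is worth it depends on the workload, so measure (for example with `timeit`) before and after.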

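- partial: partial function evaluation. For example, `pow(base, exp)` needs two arguments to be fully evaluated, i.e. it is used here as a function of arity two. `partial` freezes some of the arguments and returns a function of lower arity.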

In [3]:
from functools import partial
pow(2,5)

32

In [4]:
# freeze the first argument
twopow = partial(pow, 2)
twopow(5)

32
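The same idea can build a key function on the fly, for instance to find the centroid closest to a point. In this sketch `math.dist` (Python 3.8+) stands in for the `dist(point, point)` helper listed earlier, and the sample data is made up.

```python
from functools import partial
from math import dist  # stand-in for the notes' own dist(point, point)

centroids = [(0, 0), (10, 10), (0, 10)]
point = (9, 8)

# partial(dist, point) is a one-argument function: the distance from point
# to whatever centroid it is called with.
closest = min(centroids, key=partial(dist, point))
print(closest)  # (10, 10)
```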

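- map() takes a function and one or more iterables. It applies the function to the items, taking one argument from each iterable in parallel, and returns a lazy iterator (hence the list() calls below):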


In [17]:
def fahrenheit(T):
    return ((float(9)/5)*T + 32)
def celsius(T):
    return (float(5)/9)*(T-32)
temp = (36.5, 37, 37.5, 39)

F = list(map(fahrenheit, temp))
C = list(map(celsius, F))
print(F)
print(C)

[97.7, 98.60000000000001, 99.5, 102.2]


In [18]:
a = [1,2,3,4]
b = [17,12,11,10]
c = [-1,-4,5,9]
print(list(map(lambda x,y:x+y, a,b)))
print(list(map(lambda x,y,z:x+y+z, a,b,c)))
print(list(map(lambda x,y,z:x+y-z, a,b,c)))

[18, 14, 14, 14]
[17, 10, 19, 23]
[19, 18, 9, 5]


- If you think one line is too opaque, pull the opaque part out and give it a meaningful function name.

For example:
```
[tuple(map(mean, zip(*group))) for group in groups]
```
can be changed to:
```
def transpose(data):
    'Swap the rows and columns in a 2D array of data'
    return list(zip(*data))

[tuple(map(mean, transpose(group))) for group in groups]
```
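To see the refactored line run end to end, here is a small self-contained version. The sample `groups` data is invented, and `statistics.mean` is assumed as a stand-in for the notes' own `mean`.

```python
from statistics import mean  # stand-in for the notes' mean(data)

def transpose(data):
    'Swap the rows and columns in a 2D array of data'
    return list(zip(*data))

# Two hypothetical groups of 2D points
groups = [
    [(1.0, 2.0), (3.0, 4.0)],
    [(10.0, 20.0), (30.0, 40.0), (50.0, 60.0)],
]

print(transpose(groups[0]))  # [(1.0, 3.0), (2.0, 4.0)]

# One centroid per group: average the x values, then the y values
centroids = [tuple(map(mean, transpose(group))) for group in groups]
print(centroids)  # [(2.0, 3.0), (30.0, 40.0)]
```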

#### Good code:

- Good function names
- Good docstrings
- Nice type annotations
- Straightforward code