# Improving your skills with Python lists

A Python list is a flexible container that holds other objects.  There are four main built-in collection data types in the Python programming language:

* **Tuple** - items are ordered and unchangeable. Allows duplicate members.
* **Set** - items are unordered and unindexed. No duplicate members.
* **List** - items are ordered and changeable. Allows duplicate members.
* **Dictionary** - items are changeable and indexed using keys; insertion order is preserved as of Python 3.7. No duplicate keys.
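
The properties listed above can be checked directly. The following is a minimal sketch; the variable names are illustrative only and do not appear elsewhere in this primer.

```python
# Illustrative literals for each collection type (names are made up for this sketch).
a_tuple = (1, 2, 2)            # ordered, immutable, duplicates allowed
a_set = {1, 2, 2}              # duplicates collapse to a single member
a_list = [1, 2, 2]             # ordered, mutable, duplicates allowed
a_dict = {'a': 1, 'b': 2}      # values looked up by unique keys

a_list.append(3)               # lists can be changed in place
a_dict['a'] = 10               # assigning to an existing key overwrites its value

print(a_tuple, a_set, a_list, a_dict)

try:
    a_tuple[0] = 99            # tuples do not support item assignment
except TypeError as err:
    print('TypeError:', err)
```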

This primer specifically focuses on [Python lists](https://docs.python.org/3/tutorial/datastructures.html).  Many programmers starting off in Python know how to use lists and this resource is aimed at those individuals who are ready to learn a little more.

In [27]:
import re
import numpy as np

## Lambda, zip, map and list comprehensions

First of all, if the items in your list are numeric then you should probably be using NumPy to perform operations.  We use some numeric examples here for variety, but keep this in mind when you write code outside of this assignment.  These are four tools that are commonly used with lists.


   * [lambda](https://docs.python.org/3/reference/expressions.html#lambda) - shorthand to create an anonymous function
   * [zip](https://docs.python.org/3/library/functions.html#zip) - take iterables and zip them into tuples
   * [map](https://docs.python.org/3/library/functions.html#map) - applies a function over an iterable
   * [filter](https://docs.python.org/3/library/functions.html#filter) - return items of iterable for which function returns true.

First let's take a look at what *lambda* does.

In [1]:
x = [1,2,3]
lambda x: max(x)

<function __main__.<lambda>>

The colon separates the lambda's parameters (here ``x``) from the single expression that it evaluates and returns.  The ``max()`` function is being wrapped by lambda in this example.  Notice that the cell returns a function rather than a value, because the lambda is defined but never called. Lambdas are most often used to wrap custom logic rather than builtin functions.  Here we see a lambda being applied with ``map()``.

>It is more pythonic to use list comprehensions as opposed to map

In [2]:
a = [1,2,3,4]
b = list(map(lambda x: x**2, a))
b

[1, 4, 9, 16]
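
To make the point that a lambda is an ordinary function object more concrete, here is a small sketch. The names `square` and `square_def` are illustrative; binding a lambda to a name is done here only for demonstration, since a `def` is usually preferred for named functions.

```python
# `square` and `square_def` are illustrative names, not part of the examples above.
square = lambda x: x**2        # parameters before the colon, expression after it

def square_def(x):
    """Equivalent named function."""
    return x**2

print(square(4), square_def(4))

# A typical use: a short throwaway function passed as an argument.
by_length = sorted(['fum', 'fee', 'fi', 'foo'], key=lambda s: len(s))
print(by_length)
```

Because `sorted` is stable, words of equal length keep their original relative order.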

## Simple list comprehensions

Many list comprehensions can be written using map and lambda.  Map applies the function over an iterable and lambda is the anonymous function.  Here are some examples.

In [3]:
a = range(-5,5)

## with builtins
b = list(map(abs,a))
c = [abs(x) for x in a]
print(b==c,b)

## with your own function
b = [x**2 for x in a]
c = list(map(lambda x: x**2, a))
print(b==c,b)

True [5, 4, 3, 2, 1, 0, 1, 2, 3, 4]
True [25, 16, 9, 4, 1, 0, 1, 4, 9, 16]


So there are several ways to construct our lists.  Which one is faster?  Note that in Python 3 the map function returns a lazy iterator (a map object) rather than a list, so to compare like with like we wrap it in ``list()``.

In [25]:
a = range(-1000,1000)
a_np = np.array(a)

## with builtins
%timeit -n 1000  list(map(abs,a))
%timeit -n 1000 [abs(x) for x in a]
%timeit -n 1000 np.abs(a_np)

67.4 µs ± 4.71 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
110 µs ± 5.75 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.77 µs ± 108 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
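
The reason `list()` is needed around `map()` in the timings is that, in Python 3, `map` does no work until it is consumed, and it can only be consumed once. A minimal sketch (the variable names are illustrative):

```python
squares = map(lambda x: x**2, [1, 2, 3])   # nothing is computed yet

first_pass = list(squares)     # consumes the iterator
second_pass = list(squares)    # already exhausted, so this is empty

print(first_pass, second_pass)
```

If the results are needed more than once, store the list rather than the map object.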


## Filtering list comprehensions

In [37]:
import types

## filter
a = ['', 'fee', '', '', '', 'fi', '', '', '', '', 'foo', '', '', '', '', '', 'fum']
b = list(filter(lambda x: len(x) > 0,a))
c = [x for x in a if len(x) > 0]
print(b==c,b)

## square only the ints and filter the rest
a = [1, '4', 9, 'a', 0, 4]
b = [i**2 for i in a if type(i)==type(1)]
c = list(map(lambda x: x**2, filter(lambda x: isinstance(x,int),a)))
print(b==c,b)

True ['fee', 'fi', 'foo', 'fum']
True [1, 81, 0, 16]
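
The two versions above agree on that input, but `type(i) == type(1)` and `isinstance(x, int)` are not interchangeable in general. One case where they differ is booleans, because `bool` is a subclass of `int`. A small sketch with a made-up input list:

```python
a = [1, True, '4', 2.0, 3]

by_type = [x for x in a if type(x) == int]          # exact type match only
by_isinstance = [x for x in a if isinstance(x, int)]  # subclasses also pass

print(by_type)
print(by_isinstance)
```

Which behavior is wanted depends on the data; the first Stack Overflow link under Additional Resources discusses the trade-off.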


#### EXERCISE 1

Using both the map and list comprehension versions, we are going to compare the time it takes to manipulate sentences in a corpus.  Here are a few lines of code that read a book into Python.  The text is then split into sentences.

In [33]:
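## read the whole book; the with block closes the file afterwards
with open('./data/Dracula_Bram_Stoker.txt', 'r') as f:
    text = f.read()
## raw strings keep the regex escapes intact
stop_pattern = r'\.|\?|!'
sentences = re.split(stop_pattern, text)
sentences = [re.sub(r"\r|\n", " ", s.lower()) for s in sentences]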

1. Return a list that has sentences only if they contain the word 'dark'
2. Return a list of the sentences that contain the word 'dark', with 'dark' replaced by 'bright'.

In [43]:
## YOUR CODE HERE



## Nested list comprehensions

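For nested list comprehensions it can be tricky to remember the order in which to write the ``for`` clauses.  In the flattened form you write them in the same way that you build a for loop, from top to bottom.  The first version below keeps the nested structure, with the inner comprehension in square braces.  The second version unpacks the nested lists into a single list, but be careful: the version without square braces reverses the order of the ``for`` clauses.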


In [48]:
l = [['40', '20', '10', '30'], ['20', '20', '20'], ['30', '20'], ['100', '100', '100', '100']]
print([[float(y) for y in x] for x in l])
print([float(y) for x in l for y in x])

[[40.0, 20.0, 10.0, 30.0], [20.0, 20.0, 20.0], [30.0, 20.0], [100.0, 100.0, 100.0, 100.0]]
[40.0, 20.0, 10.0, 30.0, 20.0, 20.0, 20.0, 30.0, 20.0, 100.0, 100.0, 100.0, 100.0]
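
One way to remember the clause order is to write out the equivalent for loops. The sketch below uses a shortened version of the list above; `flat_loop`, `flat_comp`, `nested_loop` and `nested_comp` are illustrative names.

```python
l = [['40', '20'], ['30']]

# Flattened form: the for clauses read in the same order as nested loops.
flat_loop = []
for x in l:
    for y in x:
        flat_loop.append(float(y))
flat_comp = [float(y) for x in l for y in x]

# Nested form: the bracketed inner comprehension is rebuilt for every x.
nested_loop = []
for x in l:
    nested_loop.append([float(y) for y in x])
nested_comp = [[float(y) for y in x] for x in l]

print(flat_loop == flat_comp, flat_comp)
print(nested_loop == nested_comp, nested_comp)
```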


## The conveniences of zip

Dictionaries are incredibly useful data structures.  It is very common to construct a dictionary from two lists, and the zip function is convenient under these circumstances.  Note that zip, like map, returns a lazy iterator, which is both convenient and pythonic, but you may sometimes need to convert it to a list to view its contents.

In [50]:
a1,a2 = [1,2,3],['a','b','c']

%timeit -n 10000 dict(zip(a1,a2))
%timeit -n 10000 dict(zip(*[a1,a2]))

472 ns ± 126 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
435 ns ± 30 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
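
The timings do not show what zip actually produces, so here is a short sketch of the common patterns. The names `pairs`, `lookup`, `nums`, `letters` and `short` are illustrative.

```python
a1, a2 = [1, 2, 3], ['a', 'b', 'c']

pairs = list(zip(a1, a2))          # materialize the iterator to inspect it
lookup = dict(zip(a1, a2))         # keys from a1, values from a2
nums, letters = zip(*pairs)        # "unzip": the pairs come back as two tuples
short = list(zip(a1, ['x', 'y']))  # zip stops at the shortest input

print(pairs)
print(lookup)
print(nums, letters)
print(short)
```

The silent truncation in the last line is worth keeping in mind when the two lists are expected to have the same length.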


#### EXERCISE 2

1. Take the original Dracula text (list of sentences) and create a list of word lists (space delimited)
2. Use a nested list comprehension to create a flattened list with every word in the book
3. Use the Counter dictionary from the [collections](https://docs.python.org/3/library/collections.html) module to create a dictionary that shows the counts for each word. 

In [52]:
## YOUR CODE HERE



## Matrix operations

You can perform simple matrix operations, such as transposing and rotating, on lists of lists.

In [68]:
# matrix transpose
a = [['a','b','c'],['d','e','f']]
b = list(map(list, zip(*a)))
c = [[row[i] for row in a] for i in range(len(a[0]))]

print(a)
print(b==c,b)

[['a', 'b', 'c'], ['d', 'e', 'f']]
True [['a', 'd'], ['b', 'e'], ['c', 'f']]


In [69]:
# rotate (to the right 90 degrees)

b = list(map(list, zip(*a[::-1])))
c = [[row[i] for row in a[::-1]] for i in range(len(a[0]))]
print(b==c,b)

True [['d', 'a'], ['e', 'b'], ['f', 'c']]
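
The same idiom can be adapted to rotate in the other direction. The sketch below wraps both rotations in helper functions (`rotate_right` and `rotate_left` are hypothetical names introduced here) and assumes every row has the same length, since zip truncates to the shortest row.

```python
def rotate_right(m):
    """Rotate a list of lists 90 degrees clockwise (same idiom as above)."""
    return list(map(list, zip(*m[::-1])))

def rotate_left(m):
    """Rotate 90 degrees counterclockwise: transpose, then reverse the rows."""
    return list(map(list, zip(*m)))[::-1]

a = [['a', 'b', 'c'], ['d', 'e', 'f']]

print(rotate_left(a))
print(rotate_left(rotate_right(a)) == a)   # the two rotations undo each other

four_turns = a
for _ in range(4):
    four_turns = rotate_right(four_turns)
print(four_turns == a)                     # four quarter turns give the original
```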


#### EXERCISE 3

1. Look at the following map function.  Can you create the corresponding list comprehension?

In [76]:
a = [[1,2,3],[4,5,6]]
b = list(map(lambda x: x[1]**x[0], zip(*a)))
b

[4, 25, 216]

In [None]:
## YOUR CODE HERE



## List comprehensions are not limited to lists

In [40]:
## the same syntax passed to tuple() is a generator expression; inside braces it builds a dict
my_tuple = tuple(num for num in range(10) if num % 2 == 0)
print(my_tuple)

my_dict = {l:np.random.randint(0,9,1)[0] for l in ['a','b','c']}
print(my_dict)

(0, 2, 4, 6, 8)
{'a': 0, 'b': 2, 'c': 0}
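
Two more variants of the same syntax are set comprehensions and bare generator expressions. A minimal sketch, using a made-up word list and illustrative variable names:

```python
words = ['fee', 'fi', 'foo', 'fum', 'fee']

lengths = {len(w) for w in words}                    # set comprehension: unique lengths
total = sum(len(w) for w in words)                   # generator expression, no list is built
counts = {w: words.count(w) for w in set(words)}     # dict comprehension keyed by word

print(lengths)
print(total)
print(counts)
```

The generator expression inside `sum()` yields one value at a time, which avoids holding an intermediate list in memory.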


### Additional Resources


* [isinstance vs type (stackoverflow)](http://stackoverflow.com/questions/1549801/differences-between-isinstance-and-type-in-python)
* [list comps vs map](http://stackoverflow.com/questions/1247486/python-list-comprehension-vs-map)