# Processing Python Lists

Let us see how to process data in Python lists using built-in functions such as `filter`, `map`, and `sorted`.
We can also use list comprehensions as an alternative to `map`.

- Create Python list of strings.
- Filter for `COMPLETE` orders.
- Get unique statuses (use `map` and `set`).
- Sort the data based on order customer id.

In [1]:
# list of comma-separated order records
orders = ['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '8,2013-07-25 00:00:00.0,2911,PROCESSING',
 '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT',
 '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT']

In [2]:
len(orders)

10
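
Every later step relies on splitting a record on commas and picking a field by position, so a minimal sketch of that parsing may help. The names `order_id` and `order_date` are assumptions inferred from the values; the text itself only names the customer id and the status.

```python
# One record copied from the `orders` list above.
order = '3,2013-07-25 00:00:00.0,12111,COMPLETE'

# Field names are assumptions: only "customer id" and "status"
# are named in the text.
order_id, order_date, customer_id, status = order.split(',')

# split() always returns strings, so numeric fields need int().
parsed = (int(order_id), order_date, int(customer_id), status)
print(parsed)
```

Index `2` is therefore the customer id and index `3` is the status, which is how the fields are addressed in the rest of this section.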

### Lambda function

In [3]:
def sumN(n):
    # sum of the first n natural numbers: 1 + 2 + ... + n
    return (n * (n + 1)) / 2

In [4]:
l = [1,2,3,4]  # expected sums: [1,3,6,10]

In [5]:
[sumN(n) for n in l]  # list comprehension

[1.0, 3.0, 6.0, 10.0]

In [7]:
sumNl = lambda n: (n* (n+1))/2

In [9]:
sumNl(4)

10.0

In [10]:
map?

Init signature: map(self, /, *args, **kwargs)
Docstring:
map(func, *iterables) --> map object

Make an iterator that computes the function using arguments from
each of the iterables.  Stops when the shortest iterable is exhausted.
Type:           type
Subclasses:

In [15]:
list(map(sumN, l))

[1.0, 3.0, 6.0, 10.0]
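
The `map` docstring mentions that several iterables can be passed and that iteration stops at the shortest one. The sketch below illustrates this with two made-up lists; `quantities` and `prices` are hypothetical names and values, not part of the orders data.

```python
# Hypothetical sample data, chosen only to illustrate the behaviour.
quantities = [2, 1, 5, 4]
prices = [10.0, 25.5, 3.0]  # one element shorter on purpose

# With two iterables, the function receives one argument from each.
totals = list(map(lambda q, p: q * p, quantities, prices))
print(totals)  # [20.0, 25.5, 15.0]
```

The fourth quantity is silently ignored because `prices` runs out first.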

In [18]:
# use a lambda function with map to generate a list of values
l = [1,2,3,4]  # expected sums: [1,3,6,10]
list(map(lambda n: (n * (n + 1)) / 2, l))

[1.0, 3.0, 6.0, 10.0]
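
As noted at the start, a list comprehension can replace `map`. The sketch below compares both forms. It also swaps `/` for floor division `//`, which is a change from the cells above: `n * (n + 1)` is always even, so the results stay exact but come back as integers rather than floats.

```python
l = [1, 2, 3, 4]

# map with a lambda; // keeps the results as int.
with_map = list(map(lambda n: n * (n + 1) // 2, l))

# The same computation as a list comprehension.
with_comprehension = [n * (n + 1) // 2 for n in l]

print(with_map)            # [1, 3, 6, 10]
print(with_comprehension)  # [1, 3, 6, 10]
```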

### Filter Data in Python Using filter and lambda

In [19]:
orders

['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '8,2013-07-25 00:00:00.0,2911,PROCESSING',
 '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT',
 '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT']

In [21]:
filter?

Init signature: filter(self, /, *args, **kwargs)
Docstring:
filter(function or None, iterable) --> filter object

Return an iterator yielding those items of iterable for which function(item)
is true. If function is None, return the items that are true.
Type:           type
Subclasses:

In [22]:
order = orders[0]

In [23]:
order

'1,2013-07-25 00:00:00.0,11599,CLOSED'

In [25]:
order.split(',')[3] == 'COMPLETE'

False

In [27]:
order = orders[4]

In [28]:
order.split(',')[3] == 'COMPLETE'

True

In [29]:
# return a list of orders such that the order status of each transaction is 'COMPLETE'
filter(lambda order: order.split(',')[3] == 'COMPLETE', orders)

<filter at 0x7f01f4276980>

In [30]:
list(filter(lambda order: order.split(',')[3] == 'COMPLETE', orders))

['3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE']
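
Calling `filter` on its own returned a filter object rather than a list, because `filter` produces a lazy iterator. A minimal sketch of what that implies, using three records copied from `orders`: the iterator can be consumed only once.

```python
# Three records copied from the `orders` list above.
orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
    '5,2013-07-25 00:00:00.0,11318,COMPLETE',
]

completed = filter(lambda order: order.split(',')[3] == 'COMPLETE', orders)

first_pass = list(completed)   # consumes the iterator
second_pass = list(completed)  # nothing left to yield

print(len(first_pass), len(second_pass))  # 2 0
```

Wrapping the call in `list(...)` right away, as in the cell above, avoids this surprise when the result is needed more than once.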

In [31]:
# Either 'COMPLETE' or 'CLOSED'
list(filter(lambda order: order.split(',')[3] in ('COMPLETE', 'CLOSED'), orders))

['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE']
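
The same filtering can be written as a list comprehension with an `if` clause. The sketch below uses a few records copied from `orders` and checks that both forms agree; the name `wanted` is introduced here only for illustration.

```python
# Four records copied from the `orders` list above.
orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
    '8,2013-07-25 00:00:00.0,2911,PROCESSING',
]

wanted = {'COMPLETE', 'CLOSED'}  # illustrative name

with_filter = list(filter(lambda order: order.split(',')[3] in wanted, orders))
with_comprehension = [order for order in orders if order.split(',')[3] in wanted]

print(with_filter == with_comprehension)  # True
```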

### Getting unique values from list using map and set

In [35]:
# return the unique order statuses
set(map(lambda order: order.split(',')[3], orders))

{'CLOSED', 'COMPLETE', 'PENDING_PAYMENT', 'PROCESSING'}
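
A set comprehension gives the same unique statuses without `map`. If the number of orders per status is also of interest, `collections.Counter` from the standard library is one possible option. This goes slightly beyond the steps listed at the top and is shown only as a sketch, on five records copied from `orders`.

```python
from collections import Counter

# Five records copied from the `orders` list above.
orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
    '4,2013-07-25 00:00:00.0,8827,CLOSED',
    '5,2013-07-25 00:00:00.0,11318,COMPLETE',
]

# Set comprehension: equivalent to set(map(...)).
statuses = {order.split(',')[3] for order in orders}

# Sets are unordered, so sort before printing for a stable display.
print(sorted(statuses))  # ['CLOSED', 'COMPLETE', 'PENDING_PAYMENT']

# Count how many orders carry each status.
status_counts = Counter(order.split(',')[3] for order in orders)
print(status_counts['CLOSED'])  # 2
```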

### Sort Python list using key

In [37]:
sorted?

Signature: sorted(iterable, /, *, key=None, reverse=False)
Docstring:
Return a new list containing all items from the iterable in ascending order.

A custom key function can be supplied to customize the sort order, and the
reverse flag can be set to request the result in descending order.
Type:      builtin_function_or_method

In [38]:
order = orders[0]

In [42]:
order.split(',')[2]

'11599'

In [43]:
int(order.split(',')[2])

11599

In [46]:
# sort by customer id (third field) in descending order by passing a lambda as key
sorted(orders, key=lambda order: int(order.split(',')[2]), reverse=True)

['3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '1,2013-07-25 00:00:00.0,11599,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT',
 '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '8,2013-07-25 00:00:00.0,2911,PROCESSING',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT']
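
A key function may also return a tuple, which sorts on several fields at once. The sketch below is an extension of the example above, not one of the original steps: it orders five records copied from `orders` by status, then by customer id in descending order within each status. The helper name `status_then_customer` is hypothetical.

```python
# Five records copied from the `orders` list above.
orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
    '4,2013-07-25 00:00:00.0,8827,CLOSED',
    '5,2013-07-25 00:00:00.0,11318,COMPLETE',
]

def status_then_customer(order):
    # Hypothetical helper: tuples compare element by element, and
    # negating the customer id reverses the order for that field only.
    fields = order.split(',')
    return (fields[3], -int(fields[2]))

by_status_then_customer = sorted(orders, key=status_then_customer)

# Show the order ids in their sorted sequence.
print([order.split(',')[0] for order in by_status_then_customer])
# ['1', '4', '3', '5', '2']
```

A named function is used instead of a lambda here only because the key needs two steps; `sorted` accepts either.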