This program demonstrates how lambda functions enable dynamic 
filtering, in place of the hardcoded filter conditions that 
applications have typically performed inside loops. 

The approach is to define a filtering function that takes two inputs: 
a collection, and a function that supplies the filter criteria at 
runtime. 

The filtering function loops through the passed collection, invokes 
the supplied function on each record, and checks whether the result 
is true or false.

If true, the row from the collection is appended to the result.
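
The contrast between hardcoded and dynamic filtering described above can be sketched as follows. This is a minimal illustration, not the notebook's own code: the names `closed_orders_hardcoded` and `filter_rows` are hypothetical, and the sample rows are copied from the orders output shown further down.

```python
# Sample rows copied from the orders output shown later in this notebook.
sample_orders = [
    '1,2013-07-25 00:00:00.0,11599,CLOSED',
    '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
    '3,2013-07-25 00:00:00.0,12111,COMPLETE',
]

# Hardcoded filtering: the condition is fixed inside the loop.
# (closed_orders_hardcoded is a hypothetical name used for illustration.)
def closed_orders_hardcoded(rows):
    result = []
    for row in rows:
        if row.split(',')[3] == 'CLOSED':
            result.append(row)
    return result

# Dynamic filtering: the condition is passed in as a function.
# (filter_rows is a hypothetical name used for illustration.)
def filter_rows(rows, predicate):
    result = []
    for row in rows:
        if predicate(row):
            result.append(row)
    return result

# The same loop now serves different criteria chosen at call time.
closed = filter_rows(sample_orders, lambda row: row.split(',')[3] == 'CLOSED')
pending = filter_rows(sample_orders, lambda row: row.split(',')[3] == 'PENDING_PAYMENT')
```

With the dynamic version, a new filter condition needs only a new lambda at the call site, not a new loop.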

In [1]:
# Read a file and return its lines as a list of strings, one record per line
def readData(filepath):
    # The with block closes the file once reading is done, even on error
    with open(filepath) as file:
        data = file.read()
    datalist = data.splitlines()
    return datalist

In [2]:
# Return the items of collection c for which the function f returns a true value
def myFilter(c, f):
    result = []
    for i in c:
        if f(i):
            result.append(i)
    return result
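
For comparison, Python's built-in `filter` and a list comprehension follow the same pattern as `myFilter`. The sketch below assumes a small list of integers and a hypothetical `is_even` predicate, neither of which appears in the notebook, purely to show that the three forms agree.

```python
# myFilter as defined in the notebook, repeated so this block runs on its own.
def myFilter(c, f):
    result = []
    for i in c:
        if f(i):
            result.append(i)
    return result

# Illustrative data and predicate (assumptions, not part of the notebook).
numbers = [1, 2, 3, 4, 5, 6]
is_even = lambda n: n % 2 == 0

via_my_filter = myFilter(numbers, is_even)
# The built-in filter takes the function first and returns a lazy iterator,
# so it is wrapped in list() to materialise the result.
via_builtin = list(filter(is_even, numbers))
via_comprehension = [n for n in numbers if is_even(n)]
```

Note the argument order: `myFilter` takes the collection first, while the built-in `filter` takes the function first.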

In [3]:
orderPath = "/data/retail_db/orders/part-00000"
orders = readData(orderPath)
orderItems = readData("/data/retail_db/order_items/part-00000")

In [4]:
orders[0:5]

['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE']

In [5]:
orderItems[0:5]

['1,1,957,1,299.98,299.98',
 '2,2,1073,1,199.99,199.99',
 '3,2,502,5,250.0,50.0',
 '4,2,403,1,129.99,129.99',
 '5,4,897,2,49.98,24.99']

In [7]:
CompletedOrders = myFilter(orders, lambda o: o.split(',')[3] == 'COMPLETE')
CompletedOrders

['3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '15,2013-07-25 00:00:00.0,2568,COMPLETE',
 '17,2013-07-25 00:00:00.0,2667,COMPLETE',
 '22,2013-07-25 00:00:00.0,333,COMPLETE',
 '26,2013-07-25 00:00:00.0,7562,COMPLETE',
 '28,2013-07-25 00:00:00.0,656,COMPLETE',
 '32,2013-07-25 00:00:00.0,3960,COMPLETE',
 '35,2013-07-25 00:00:00.0,4840,COMPLETE',
 '45,2013-07-25 00:00:00.0,2636,COMPLETE',
 '56,2013-07-25 00:00:00.0,10519,COMPLETE',
 '63,2013-07-25 00:00:00.0,1148,COMPLETE',
 '65,2013-07-25 00:00:00.0,5903,COMPLETE',
 '67,2013-07-25 00:00:00.0,1406,COMPLETE',
 '71,2013-07-25 00:00:00.0,8646,COMPLETE',
 '72,2013-07-25 00:00:00.0,4349,COMPLETE',
 '76,2013-07-25 00:00:00.0,6898,COMPLETE',
 '80,2013-07-25 00:00:00.0,3007,COMPLETE',
 '83,2013-07-25 00:00:00.0,1265,COMPLETE',
 '88,2013-07-25 00:00:00.0,3809,COMPLETE',
 '91,2013-07-25 00:00:00.0,8912,COMPLETE',
 '92,2013-07-2

In [8]:
OrderFilterbydate = myFilter(orders, lambda o: o.split(',')[1][0:10] == '2013-07-25')
OrderFilterbydate

['1,2013-07-25 00:00:00.0,11599,CLOSED',
 '2,2013-07-25 00:00:00.0,256,PENDING_PAYMENT',
 '3,2013-07-25 00:00:00.0,12111,COMPLETE',
 '4,2013-07-25 00:00:00.0,8827,CLOSED',
 '5,2013-07-25 00:00:00.0,11318,COMPLETE',
 '6,2013-07-25 00:00:00.0,7130,COMPLETE',
 '7,2013-07-25 00:00:00.0,4530,COMPLETE',
 '8,2013-07-25 00:00:00.0,2911,PROCESSING',
 '9,2013-07-25 00:00:00.0,5657,PENDING_PAYMENT',
 '10,2013-07-25 00:00:00.0,5648,PENDING_PAYMENT',
 '11,2013-07-25 00:00:00.0,918,PAYMENT_REVIEW',
 '12,2013-07-25 00:00:00.0,1837,CLOSED',
 '13,2013-07-25 00:00:00.0,9149,PENDING_PAYMENT',
 '14,2013-07-25 00:00:00.0,9842,PROCESSING',
 '15,2013-07-25 00:00:00.0,2568,COMPLETE',
 '16,2013-07-25 00:00:00.0,7276,PENDING_PAYMENT',
 '17,2013-07-25 00:00:00.0,2667,COMPLETE',
 '18,2013-07-25 00:00:00.0,1205,CLOSED',
 '19,2013-07-25 00:00:00.0,9488,PENDING_PAYMENT',
 '20,2013-07-25 00:00:00.0,9198,PROCESSING',
 '21,2013-07-25 00:00:00.0,2711,PENDING',
 '22,2013-07-25 00:00:00.0,333,COMPLETE',
 '23,2013-07-25 00

In [9]:
OrderItemsFilteredbyOrderID = myFilter(orderItems, lambda oi: oi.split(',')[1] == '2')
OrderItemsFilteredbyOrderID

['2,2,1073,1,199.99,199.99', '3,2,502,5,250.0,50.0', '4,2,403,1,129.99,129.99']
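
The three filters above all compare one comma-separated field against a value, so the lambdas could be generated by a small helper. The sketch below is one possible way to do that; `field_equals` is a hypothetical helper that is not part of the notebook, and the sample rows are copied from the `orderItems[0:5]` output.

```python
# myFilter as defined in the notebook, repeated so this block runs on its own.
def myFilter(c, f):
    result = []
    for i in c:
        if f(i):
            result.append(i)
    return result

# Hypothetical helper: build a predicate that checks whether the field at
# position `index` of a delimited row equals `value`.
def field_equals(index, value, sep=','):
    return lambda row: row.split(sep)[index] == value

# Sample rows copied from the orderItems[0:5] output above.
order_items_sample = [
    '1,1,957,1,299.98,299.98',
    '2,2,1073,1,199.99,199.99',
    '3,2,502,5,250.0,50.0',
    '4,2,403,1,129.99,129.99',
    '5,4,897,2,49.98,24.99',
]

# Equivalent to: lambda oi: oi.split(',')[1] == '2'
items_for_order_2 = myFilter(order_items_sample, field_equals(1, '2'))
```

On these five sample rows, `items_for_order_2` holds the same three rows as the `In [9]` output.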