**Vanilla Python**

Recipe for map, filter and reduce. Each of them takes:
1. a **function** (the function object itself, i.e. its name without parentheses, so it is handed over rather than executed)
2. an **iterable** (something that you can iterate over, such as a list)
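
To make the difference between a function object and a function call concrete, here is a minimal sketch. The helper `shout` and the word list are made up for illustration only.

```python
def shout(word):
    """Hypothetical helper: upper-case a word and add an exclamation mark."""
    return word.upper() + "!"

words = ["map", "filter", "reduce"]

# Pass the function object itself (no parentheses); map calls it for us.
shouted = list(map(shout, words))
print(shouted)

# shout("map") would hand over the *result* of a call (a string) instead.
# A string is not callable, so consuming the map raises a TypeError.
try:
    list(map(shout("map"), words))
except TypeError as err:
    print("TypeError:", err)
```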

# 1. Map

Map is an N-to-N relationship: we put in an iterable of length N and get back an iterable with N results.

In [14]:
list(map(str, [1,2,3,4,5,6,8]))

['1', '2', '3', '4', '5', '6', '8']

In [15]:
def square_it(x):
    # Despite the name, this squares x and then adds 5
    return x**2 + 5

In [17]:
list(map(square_it, [1,23,4,5,6,7,8]))

[6, 534, 21, 30, 41, 54, 69]

In [21]:
# with multiple arguments
def judge_groceries(x,y):
    # FORMAT STRING
    return f"What's this? A {x}, {y}!"

In [20]:
variable = "William how are you today?"
print(f"My sentence is: {variable}")

My sentence is: William how are you today?


In [22]:
list(map(judge_groceries, ['apple', 'kiwi', 'cherry', 'sprouts'], ['yummy', 'meh', 'yummy', 'yummy!!!']))

["What's this? A apple, yummy!",
 "What's this? A kiwi, meh!",
 "What's this? A cherry, yummy!",
 "What's this? A sprouts, yummy!!!!"]

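When map receives several iterables, it takes one item from each per call. A detail worth knowing, sketched here with made-up data, is that it stops as soon as the shortest iterable is exhausted.

```python
fruits = ["apple", "kiwi", "cherry"]
verdicts = ["yummy", "meh"]  # one verdict short on purpose

# One item is pulled from each iterable per call;
# mapping stops when the shortest iterable runs out.
pairs = list(map(lambda fruit, verdict: f"{fruit}: {verdict}", fruits, verdicts))
print(pairs)
```

In this sketch 'cherry' is silently dropped, so it can be worth checking that the inputs have the same length.
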
# 2. Filter

In [23]:
filter(lambda x: x < 0, [-1, 4, 5, 10, 17, 100, -23])

<filter at 0x7f3af7791490>

In [24]:
list(filter(lambda x: x < 0, [-1, 4, 5, 10, 17, 100, -23]))

[-1, -23]
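
The filter object shown above is a lazy iterator: nothing is computed until it is consumed, and it can only be consumed once. A small sketch of that behaviour, together with the equivalent list comprehension:

```python
numbers = [-1, 4, 5, 10, 17, 100, -23]

negatives = filter(lambda x: x < 0, numbers)  # nothing is computed yet

first_pass = list(negatives)   # consumes the iterator
second_pass = list(negatives)  # already exhausted, so this is empty

# A list comprehension expresses the same selection eagerly.
comprehension = [x for x in numbers if x < 0]

print(first_pass, second_pass, comprehension)
```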

In [26]:
my_function = lambda x: x < 0

In [27]:
my_function(-5)

True

In [28]:
list(filter(lambda x: x in ['car', 'sky', 'hammer'],
           ['house', 'car', 'apple', 'hammer', 'bread']))

['car', 'hammer']

In [31]:
list(map(lambda x: x in ['car', 'sky', 'hammer'],
           ['house', 'car', 'apple', 'hammer', 'bread']))

[False, True, False, True, False]

In [33]:
# Does the same as the lambda above!
def comparison(x):
    return x in ['car', 'sky', 'hammer']

list(map(comparison, ['house', 'car', 'apple', 'hammer', 'bread']))


[False, True, False, True, False]

# 3. Reduce

In [29]:
from functools import reduce

Unlike map and filter, reduce returns the result straight away instead of a lazy iterator **object**, so there is no need to wrap it in list().
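
One way to picture what reduce does is a plain loop that carries a running result. `my_reduce` below is an illustrative stand-in written for these notes, not the functools implementation.

```python
from functools import reduce

def my_reduce(func, iterable, initial):
    """Illustrative loop version of reduce with an explicit start value."""
    result = initial
    for item in iterable:
        result = func(result, item)  # fold the next item into the running result
    return result

multiply = lambda x, y: x * y
values = [1, 2, 3, 4, 5, 8]

loop_result = my_reduce(multiply, values, 1)
builtin_result = reduce(multiply, values, 1)
print(loop_result, builtin_result)
```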

In [34]:
reduce((lambda x, y: x*y), [1,2,3,4,5,8])

960

Goal: turn the list of digits [1,2,4,5,7,8,9,5,3] into the number 124,578,953

In [52]:
reduce((lambda a, d: 10*a + d), [1,2,4,5,7,8,9,5,3], 0)

124578953
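
The third argument to reduce is the starting value of the accumulator, so it changes the result. The sketch below uses `itertools.accumulate` (with its `initial` keyword, available from Python 3.8) to show the intermediate values, and compares a start value of 0 with a start value of 100.

```python
from functools import reduce
from itertools import accumulate

digits = [1, 2, 4, 5, 7, 8, 9, 5, 3]
step = lambda acc, d: 10 * acc + d  # shift left by one decimal place, then add the digit

# accumulate yields every intermediate value that reduce only passes along internally
running = list(accumulate(digits, step, initial=0))
print(running[:4])

from_zero = reduce(step, digits, 0)
from_hundred = reduce(step, digits, 100)  # the start value ends up in front of the digits
print(from_zero, from_hundred)
```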

In [45]:
# Let's say we have this string
s = "The-QUICK-Brown-fox-JUMPS-Over-the-Lazy-Dog"

color = lambda x: x.replace('brown', 'blue')
speed = lambda x: x.replace('quick', 'slow')
dashes = lambda x: x.replace('-', ' ')
work = lambda x:x.replace('lazy', 'industrious')

fs = [str.lower, color, speed, dashes, work, str.title]

In [51]:
def call(a, func):
    return func(a)

reduce(call, # takes the string cleaned so far and the next function from the function list
      fs, # this is our function list
      s # this is our starter string, which needs to be cleaned
      )

'The Slow Blue Fox Jumps Over The Industrious Dog'

In [50]:
work(dashes(speed(color(s.lower())))).title()

'The Slow Blue Fox Jumps Over The Industrious Dog'
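
The reduce-over-a-list-of-functions pattern can be wrapped into a reusable helper. `pipeline` is a hypothetical name chosen for this sketch, which applies the given functions from left to right.

```python
from functools import reduce

def pipeline(*funcs):
    """Hypothetical helper: build one function that applies funcs left to right."""
    return lambda value: reduce(lambda acc, func: func(acc), funcs, value)

clean = pipeline(
    str.lower,
    lambda x: x.replace("-", " "),
    lambda x: x.replace("lazy", "industrious"),
    str.title,
)

result = clean("The-Lazy-Dog")
print(result)
```

Compared with the nested call, the order of the steps reads top to bottom, in the order they run.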

# 4. Apply

In the realm of pandas, apply plays the role of map: it calls a function on every element of a Series.

In [57]:
import pandas as pd
import random

In [60]:
str_lst = ["The",
           "QUICK",
           "Brown",
           "fox",
           "JUMPS",
           "Over",
           "the",
           "Lazy",
           "Dog",
           "-",
          ]

In [77]:
df = pd.DataFrame({'A':[random.choice(str_lst) for i in range(10)],
                  'B':[random.choice(str_lst) for i in range(10)],
                  'C':[1,24,5,2,3,8,2,5,7,1]})

In [78]:
df.dtypes

A    object
B    object
C     int64
dtype: object

In [84]:
df['C'].apply(lambda x: x**2)

0      1
1    576
2     25
3      4
4      9
5     64
6      4
7     25
8     49
9      1
Name: C, dtype: int64

# 5. Applying it to 4.03 activity 2 (healthcare)

In [124]:
df = pd.read_csv('unit4_healthcare_for_all.csv')

In [89]:
df['DOMAIN'].unique()

array(['T2', 'S1', 'R2', 'S2', 'T1', 'R3', 'U1', 'C2', 'C1', 'U3', ' ',
       'R1', 'U2', 'C3', 'U4', 'S3', 'T3'], dtype=object)

In [112]:
import numpy as np  # needed for np.nan below

domain_categories = {"U": "Urban",
                    "C": "City",
                    "S": "Suburban",
                    "T": "Town",
                    "R": "Rural",
                    " ": np.nan,  # blank codes become missing values
                    }

In [99]:
# How do we get the values?
domain_categories['S']

'Suburban'

In [None]:
df['DOMAIN'].apply(#some translation function
                    )

In [114]:
import numpy as np
def clean_domain(x):
    # Check if the first character of x (e.g. 'U' in 'U3') is a key of the dictionary
    if x[0] in domain_categories.keys():
        # if so, return the corresponding value
        return domain_categories[x[0]]
    else:
        # unknown code: mark it as missing
        return np.nan

In [118]:
df['DOMAIN'] = df['DOMAIN'].apply(clean_domain)
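
As an alternative sketch, the same translation can be written without a helper function, assuming every code is a non-empty string: `.str[0]` takes the first character and `Series.map` looks it up in the dictionary, giving NaN for keys that are missing. The small `codes` Series is made-up sample data, not the healthcare file.

```python
import numpy as np
import pandas as pd

domain_categories = {"U": "Urban", "C": "City", "S": "Suburban",
                     "T": "Town", "R": "Rural", " ": np.nan}

# Made-up sample codes standing in for df['DOMAIN']
codes = pd.Series(["T2", "S1", " ", "U4"])

# .str[0] takes the first character; .map looks it up in the dictionary
labels = codes.str[0].map(domain_categories)
print(labels.tolist())
```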

In [119]:
df

Unnamed: 0,STATE,PVASTATE,DOB,MDMAUD,RECP3,GENDER,DOMAIN,INCOME,HOMEOWNR,HV1,...,VETERANS,NUMPROM,CARDPROM,CARDPM12,NUMPRM12,MAXADATE,RFA_2,NGIFTALL,TIMELAG,AVGGIFT
0,IL,,3712,XXXX,,F,Town,,,479,...,,74,27,6,14,9702,L4E,31,4.0,7.741935
1,CA,,5202,XXXX,,M,Suburban,6.0,H,5468,...,,32,12,6,13,9702,L2G,3,18.0,15.666667
2,NC,,0,XXXX,,M,Rural,3.0,U,497,...,,63,26,6,14,9702,L4E,27,12.0,7.481481
3,CA,,2801,XXXX,,F,Rural,1.0,U,1000,...,,66,27,6,14,9702,L4E,16,9.0,6.812500
4,FL,,2001,XXXX,X,F,Suburban,3.0,H,576,...,,113,43,10,25,9702,L2F,37,14.0,6.864865
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
90564,FL,,4803,XXXX,,F,Suburban,6.0,H,733,...,,59,23,5,12,9702,L4D,24,3.0,3.375000
90565,AK,,0,XXXX,,M,City,,,988,...,,14,6,5,12,9702,L1G,1,,25.000000
90566,TX,,5001,XXXX,,M,City,7.0,H,1679,...,,10,4,3,8,9702,L1F,1,,20.000000
90567,MI,,3801,XXXX,X,M,City,,,376,...,,33,14,7,17,9702,L3E,7,3.0,8.285714


In [121]:
df.head()

Unnamed: 0,STATE,PVASTATE,DOB,MDMAUD,RECP3,GENDER,DOMAIN,INCOME,HOMEOWNR,HV1,...,VETERANS,NUMPROM,CARDPROM,CARDPM12,NUMPRM12,MAXADATE,RFA_2,NGIFTALL,TIMELAG,AVGGIFT
0,IL,,3712,XXXX,,F,T2,,,479,...,,74,27,6,14,9702,L4E,31,4.0,7.741935
1,CA,,5202,XXXX,,M,S1,6.0,H,5468,...,,32,12,6,13,9702,L2G,3,18.0,15.666667
2,NC,,0,XXXX,,M,R2,3.0,U,497,...,,63,26,6,14,9702,L4E,27,12.0,7.481481
3,CA,,2801,XXXX,,F,R2,1.0,U,1000,...,,66,27,6,14,9702,L4E,16,9.0,6.8125
4,FL,,2001,XXXX,X,F,S2,3.0,H,576,...,,113,43,10,25,9702,L2F,37,14.0,6.864865


In [123]:
df['DOMAIN'] = pd.Series(map(clean_domain, df['DOMAIN']))

In [100]:
'dasdfsdgf'[0]

'd'

In [102]:
'U' in domain_categories.keys()

True

In [111]:
x = 'U3'
print(x[0])
print(domain_categories[x[0]])
print(domain_categories['U'])

U
Urban
Urban
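
Indexing a dictionary with square brackets raises a KeyError for unknown keys, and `x[0]` raises an IndexError on an empty string. A hedged pure-Python sketch that avoids both uses `dict.get` with a default and the slice `code[:1]`. The function name `translate` is hypothetical.

```python
domain_categories = {"U": "Urban", "C": "City", "S": "Suburban",
                     "T": "Town", "R": "Rural"}

def translate(code, default=None):
    """Hypothetical variant of the lookup that never raises on odd input."""
    # code[:1] is the first character, or '' for an empty string
    return domain_categories.get(code[:1], default)

print(translate("U3"))   # known code
print(translate("X9"))   # unknown code falls back to the default
print(translate(""))     # empty string also falls back to the default
```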
