### Defining a Lambda Function

A lambda function is a small anonymous function.

A lambda function can take any number of arguments, but can only have one expression.

Syntax:
lambda arguments : expression

The example below takes one argument and adds 10 to it.

In [5]:
add10 = lambda argument1 : argument1 + 10

To call the function with an argument, say 25, we write:

In [8]:
add10(25)

35
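
Since a lambda can take any number of arguments, a minimal sketch with two arguments and with none may help. The names `multiply` and `get_default` are illustrative only and are not part of the notebook.

```python
# Hypothetical examples; these names do not appear elsewhere in the notebook.
multiply = lambda a, b: a * b      # two arguments, still a single expression
get_default = lambda: 10           # no arguments at all

print(multiply(3, 4))   # 12
print(get_default())    # 10
```

In both cases the body is one expression whose value is returned automatically; there is no `return` statement.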

### Introducing the map() function
If we have a sequence of numbers and wish to apply the same operation to each of them using a lambda, we can use the map() function.

The map() function applies a given function to each item of an iterable (list, tuple, etc.) and returns a map object, an iterator that yields the results one at a time.

Syntax:
map(function, iterable)

In [15]:
numbers = [ 1, 2, 3, 4, 5]

print(numbers)


[1, 2, 3, 4, 5]


In [18]:
results = map(add10, numbers)

# results is a map object (an iterator), so we convert it into a list
new_numbers = list(results)

In [19]:
print (new_numbers)

[11, 12, 13, 14, 15]
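
One detail worth illustrating is that a map object can be consumed only once. The sketch below passes a lambda directly to map() (rather than naming it first) and converts the result to a list twice; the variable names `first_pass` and `second_pass` are illustrative.

```python
numbers = [1, 2, 3, 4, 5]

# Pass the lambda directly to map() instead of assigning it to a name first.
results = map(lambda n: n + 10, numbers)

first_pass = list(results)    # consumes the iterator
second_pass = list(results)   # nothing is left to yield

print(first_pass)    # [11, 12, 13, 14, 15]
print(second_pass)   # []
```

If the results are needed more than once, it is usually simplest to store the list rather than the map object.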


### Application to NLP Preprocessing Task (1)


Suppose we have extracted data from a webpage.
We should expect some HTML tags, such as <BR>, in the text.
We can define a lambda function that finds each <BR> and replaces it with a space.

One of the most important functions in the re (regular expression) module is sub, which has the following syntax.

   re.sub(pattern, repl, string, count=0, flags=0)

This function replaces occurrences of the RE pattern in string with repl. It replaces all occurrences unless count is given, in which case at most count replacements are made. It returns the modified string.
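
As a small illustration of limiting the number of replacements, the sketch below uses the `count` keyword of Python's `re.sub`. The string `sample` is made up for this example.

```python
import re

sample = "one <BR> two <BR> three"   # hypothetical sample string

# Default behaviour: every match is replaced.
all_replaced = re.sub(r"<BR>", " ", sample)

# With count=1, only the first match is replaced.
first_only = re.sub(r"<BR>", " ", sample, count=1)

print(repr(all_replaced))
print(repr(first_only))
```

Because each tag is replaced by a space and was already surrounded by spaces, the cleaned text contains runs of three spaces where a tag used to be.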

In [46]:
import re
remove_br = lambda x: re.sub(r"""<BR>""", ' ', x)

In [47]:
mytext = "The best cereal is Quakter Cookers. <BR> They are nutritious and tasty. <BR>" 

In [43]:
new_mytext = remove_br(mytext)

In [48]:
print (new_mytext)

The best cereal is Quakter Cookers.   They are nutritious and tasty.  
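
Text scraped from real pages may also contain variants such as `<br>`, `<br/>` or `<br />`, which the pattern `<BR>` would not match. This is an assumption about the data, not something shown in the notebook. A minimal sketch of a more tolerant version, under the hypothetical name `remove_br_any`, could look like this:

```python
import re

# Hypothetical variant of remove_br: matches <br>, <BR>, <br/> and <br />,
# in any letter case, and replaces each with a single space.
remove_br_any = lambda x: re.sub(r"<br\s*/?>", " ", x, flags=re.IGNORECASE)

sample = "line one<br>line two<BR/>line three<br />line four"
print(remove_br_any(sample))
```

Here `\s*` allows optional whitespace before an optional `/`, and `re.IGNORECASE` makes the match case-insensitive.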


Let's create a list of texts. For simplicity, we just append the same text to an empty list four times.


In [57]:
texts = []
for _ in range(4):  # append the same text four times
    texts.append(mytext)
print(texts)

['The best cereal is Quakter Cookers. <BR> They are nutritious and tasty. <BR>', 'The best cereal is Quakter Cookers. <BR> They are nutritious and tasty. <BR>', 'The best cereal is Quakter Cookers. <BR> They are nutritious and tasty. <BR>', 'The best cereal is Quakter Cookers. <BR> They are nutritious and tasty. <BR>']


Now, we apply the lambda function with map() to get a new list of texts.

In [60]:
new_texts = map(remove_br, texts)

In [61]:
print(list(new_texts))

['The best cereal is Quakter Cookers.   They are nutritious and tasty.  ', 'The best cereal is Quakter Cookers.   They are nutritious and tasty.  ', 'The best cereal is Quakter Cookers.   They are nutritious and tasty.  ', 'The best cereal is Quakter Cookers.   They are nutritious and tasty.  ']
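
For comparison, the same transformation can be written as a list comprehension, which builds the list directly instead of going through a map object. The sketch below uses short stand-in strings rather than the notebook's data.

```python
import re

remove_br = lambda x: re.sub(r"<BR>", " ", x)
texts = ["First line. <BR> Second line. <BR>"] * 4   # hypothetical stand-in data

via_map = list(map(remove_br, texts))
via_comprehension = [remove_br(t) for t in texts]

print(via_map == via_comprehension)   # True
```

Which form to use is largely a matter of taste; map() reads naturally when the function already has a name.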


### Application to NLP Preprocessing Task (2)

It is common to use a pandas DataFrame to hold the original text data,
and then to create a new Series (column) for any transformed data.

In [72]:
import pandas as pd

# Create the pandas DataFrame 
df = pd.DataFrame(texts, columns = ['original_text']) 
df

Unnamed: 0,original_text
0,The best cereal is Quakter Cookers. <BR> They ...
1,The best cereal is Quakter Cookers. <BR> They ...
2,The best cereal is Quakter Cookers. <BR> They ...
3,The best cereal is Quakter Cookers. <BR> They ...


Now, we create a new empty column, new_text.


In [73]:
df["new_text"] = ""
df

Unnamed: 0,original_text,new_text
0,The best cereal is Quakter Cookers. <BR> They ...,
1,The best cereal is Quakter Cookers. <BR> They ...,
2,The best cereal is Quakter Cookers. <BR> They ...,
3,The best cereal is Quakter Cookers. <BR> They ...,


Finally, we take the lambda function (remove_br) and use the map() method of the Series to
apply it to the column original_text, placing the results in the column new_text.

In [74]:
df['new_text'] = df.original_text.map(remove_br)
df

Unnamed: 0,original_text,new_text
0,The best cereal is Quakter Cookers. <BR> They ...,The best cereal is Quakter Cookers. They are...
1,The best cereal is Quakter Cookers. <BR> They ...,The best cereal is Quakter Cookers. They are...
2,The best cereal is Quakter Cookers. <BR> They ...,The best cereal is Quakter Cookers. They are...
3,The best cereal is Quakter Cookers. <BR> They ...,The best cereal is Quakter Cookers. They are...


In [78]:
print(df.iloc[3,0])


The best cereal is Quakter Cookers. <BR> They are nutritious and tasty. <BR>


In [80]:
print(df.iloc[3,1])

The best cereal is Quakter Cookers.   They are nutritious and tasty.  
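
As a hedged aside, the sketch below shows two things under stand-in data: assigning the mapped Series to a new column name creates that column directly, and pandas also offers a string method, `Series.str.replace`, that performs the same literal replacement without a lambda. The column name `new_text_str` is hypothetical.

```python
import re
import pandas as pd

remove_br = lambda x: re.sub(r"<BR>", " ", x)
texts = ["First line. <BR> Second line. <BR>"] * 4   # hypothetical stand-in data
df = pd.DataFrame(texts, columns=["original_text"])

# Assigning to a new column name creates the column in one step.
df["new_text"] = df["original_text"].map(remove_br)

# Alternative: literal (non-regex) replacement through the .str accessor.
df["new_text_str"] = df["original_text"].str.replace("<BR>", " ", regex=False)

print(df["new_text"].tolist() == df["new_text_str"].tolist())   # True
```

The map() approach remains the more general one, since it accepts any function, while the `.str` methods cover common string operations.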
