# User defined functions

&nbsp;


## 1- `def` statement 
* the Python `def` statement is a true executable statement: when it runs, it creates a new function object and assigns it to a name.   
* a `def` is not evaluated until it is reached and run, so a function does not need to be fully defined before the program starts, only before it is called.  
* in the `def` statement we define the arguments to be provided. Arguments are optional: a function can take zero arguments.   
* the body of a `def` statement often contains an optional `return` statement, which may appear anywhere in the body.  
* a `def` can appear nested within an `if` statement, so that the function is defined only when a condition holds.  
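
A minimal sketch of the points above. The flag `verbose` and the function name `report` are hypothetical names chosen for illustration. Because `def` is executed like any other statement, the body that ends up bound to the name depends on which branch runs.

```python
# `verbose` and `report` are illustrative names, not part of the original notes.
verbose = False

if verbose:
    def report(value):
        return 'the value is {}'.format(value)
else:
    def report(value):
        return str(value)

# Only the `else` branch ran, so `report` is bound to the short version.
print(report(42))          # 42

# A function is an ordinary object, so it can be bound to a second name.
alias = report
print(alias is report)     # True
```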
&nbsp;

`def function_name(arg_1, arg_2, ..., arg_N):
    statement(s)
    return value`
    
* let us write a simple function. It takes two numbers and multiplies one by the other.  

In [2]:
def multiply(a, b):
    return(a * b)

In [3]:
multiply(3,4)

12

In [4]:
multiply(12,0.3)

3.5999999999999996

* the value returned by a function can be assigned to a name:

In [5]:
f = multiply('too',5)
f

'tootootootootoo'

In [6]:
multiply(3,'Na')

'NaNaNa'

* this is an example of *polymorphism* in Python, a type-dependent behavior. What the expression `a * b` returns depends on the kinds of objects that `a` and `b` are. With two numbers the function performs a multiplication, while with a string and an integer it performs a repetition.      
* any two objects that support **`*`** will work, no matter what their types are.  
* polymorphism means that the meaning of an operation depends on the objects being operated upon.   
* because Python is a dynamically typed language, almost every operation is a polymorphic operation.  
* this is by design, and it accounts to no small extent for Python's conciseness and flexibility.  
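
As a small illustration of the bullets above, the sketch below redefines `multiply` and calls it with a few type combinations. The specific inputs are chosen for illustration only. Objects that do not support `*` with each other fail only when the call actually runs.

```python
def multiply(a, b):
    return a * b

print(multiply(3, 4))          # 12, numeric multiplication
print(multiply([1, 2], 2))     # [1, 2, 1, 2], list repetition

# Two strings do not support `*` with each other, so the call fails at run time.
try:
    multiply('too', 'Na')
except TypeError as err:
    print('TypeError:', err)
```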
&nbsp;


In [7]:
def intersect(seq1, seq2):
    return([x for x in seq1 if x in seq2])

In [8]:
intersect('tribe','entreat')

['t', 'r', 'e']

* the `return` statement is optional:

In [9]:
def test_func(n1,n2):
    print('if you multiply {} by {} you get {}'.format(n1,n2,n1*n2))

In [10]:
test_func(22,0.9)

if you multiply 22 by 0.9 you get 19.8


* to avoid having to specify every argument on each call, we can assign default values to a function's arguments.
* arguments are passed in order, so in the function below the first argument is always assigned to `y` and the second to `x`, unless they are passed by keyword.   

In [12]:
def power(y,x = 2):
    return(y**x)

In [13]:
power(20)

400
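
The sketch below, which repeats the definition of `power`, shows the "unless they are passed by keyword" case: keyword arguments are matched by name, so their order in the call no longer matters.

```python
def power(y, x=2):
    return y ** x

print(power(2, 3))       # 8: positional, y=2 and x=3
print(power(x=3, y=2))   # 8: keywords, order no longer matters
print(power(3, x=3))     # 27: positional y, keyword x
```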

* an argument can be made optional by giving it the default value `None`; the function can then test whether a value was supplied by comparing against the reserved keyword `None`.    

In [14]:
def double_power(y,x = 2,z = None):
    val = y**x if z is None else y**(x**z)
    return(val)

In [15]:
double_power(3)

9

In [16]:
double_power(y=3,z=4)

43046721

* functions can call, and be called from, other functions.

In [17]:
def power_call():
    # use map to convert the comma-separated entries to floats
    i,j,k = map(float,input('enter i,j,k: ').split(','))
    return(double_power(i,j,k))
    

def double_power(y,x = 2,z = None):
    val = y**x if z is None else y**(x**z)
    return(val)

In [18]:
power_call()

enter i,j,k: 1,2,3


1.0

&nbsp;

* if we rearrange the argument order in the function `double_power` so that `y` comes last, we get a `SyntaxError`.
* in a Python function definition, **non-default arguments must always precede default arguments.**   

In [None]:
# raises SyntaxError: the non-default argument y follows default arguments
def double_power(x = 2,z = None, y):
    val = y**x if z is None else y**(x**z)
    return(val)

&nbsp;

## 2- `*args` and `**kwargs`

* the `*` and `**` extensions support passing any number of arguments into a function. 
* by convention, `*args` collects extra positional arguments into a tuple, whereas `**kwargs` collects extra keyword arguments into a dictionary. Any other name can be used after `*` and `**`.

In [23]:
def count_arguments(*args):
    for obj in args:
        print(obj)
    print('\nthere\'s {} object(s) in this argument'.format(len(args)))

In [20]:
count_arguments('bar',3,2,'spam','foo')

bar
3
2
spam
foo

there's 5 object(s) in this argument


In [24]:
def count_dict(small = 15, large = 22, **kwargs):
    for keys in kwargs:
        print(keys)
        
    print('\nsmall = {}, large = {},'.format(small, large), kwargs)

In [25]:
count_dict(small = 15, large = 25, pepperoni=2,sausage=5,beef=4,chicken=3)

pepperoni
sausage
beef
chicken

small = 15, large = 25, {'pepperoni': 2, 'sausage': 5, 'beef': 4, 'chicken': 3}


notice that `kwargs` is printed in its entirety as a dictionary. 

&nbsp;


In [26]:
count_dict(pepperoni='two',sausage='five',beef='four',chicken='three')

pepperoni
sausage
beef
chicken

small = 15, large = 22, {'pepperoni': 'two', 'sausage': 'five', 'beef': 'four', 'chicken': 'three'}


* if a function has a `*args` argument mixed with other arguments, it is important to pay attention to their order in the `def` statement.  
* since `*args` captures every remaining positional argument, any argument defined after `*args` must be passed explicitly by keyword.

In [27]:
def test_func(a,*b,c):
    print(a,b,c)

In [28]:
# an error occurs because *b captures the values 2,3,4,5, so c is reported as missing.
test_func(1,2,3,4,5)

TypeError: test_func() missing 1 required keyword-only argument: 'c'

In [29]:
test_func(1,2,3,4,c=5)

1 (2, 3, 4) 5


In [30]:
test_func(a=1,c=22)

1 () 22


* the `*` extension can be used by itself (without a name) to force all arguments that follow it to be passed by keyword.

In [34]:
# the bare * forces b, c and d to be passed by keyword
def forced_args(a,*,b,c,d):
    print(a,b,c,d)

In [36]:
forced_args(5, 2, 1,'foo')

TypeError: forced_args() takes 1 positional argument but 4 were given

while the first argument passed is automatically assigned to `a`, the `*` extension forces everything that follows it to be passed by keyword.

In [37]:
forced_args(5, c=2, d='foo',b=1)

5 1 2 foo


* unlike `*`, the two-star form `**` cannot appear by itself as an argument.
* unlike `*args`, `**kwargs` does not accept any named arguments after it. `**kwargs` (or its equivalent) has to be the last argument.

In [41]:
# raises SyntaxError: no argument may follow **kwargs
def count_dict(**kwargs,small,large):
    for keys in kwargs:
        print(keys)
    print('\n','small={},'.format(small),'large={},'.format(large),kwargs)
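
For contrast with the invalid definition above, here is a sketch of one ordering that Python accepts: positional arguments, then `*args`, then keyword-only arguments, then `**kwargs` last. The function name `describe_order` and its arguments are hypothetical, chosen only for illustration.

```python
# `describe_order` is an illustrative name, not part of the original notes.
def describe_order(size, *extras, small=15, large=22, **kwargs):
    return size, extras, small, large, kwargs

order = describe_order('family', 'thin crust', 'extra cheese', large=25, pepperoni=2)
print(order)
# ('family', ('thin crust', 'extra cheese'), 15, 25, {'pepperoni': 2})
```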

&nbsp;

In the Gregorian calendar three criteria must be taken into account to identify leap years:

* If the year is evenly divisible by 4, it is a leap year, unless:
    * The year is evenly divisible by 100, in which case it is NOT a leap year, unless:   
        * The year is also evenly divisible by 400, in which case it is a leap year. 
        
        
* the years 2000 and 2400 are leap years, while 1800, 1900, 2100, 2200, 2300 and 2500 are NOT leap years
        
<span style='color:blue'>write the function `is_leap()` that takes integers (years) as arguments, applies the logic above, and prints whether each year is a leap year or not. Use the **`*args`** extension to pass multiple years into the function.</span>

expected output:

`1904 IS a leap year
1932 IS a leap year
1986 IS NOT a leap year
2000 IS a leap year
2008 IS a leap year
2016 IS a leap year
2021 IS NOT a leap year`

In [1]:
#skipped code
def is_leap(*args):
    for y in args:
        if (( y%400 == 0)or (( y%4 == 0 ) and ( y%100 != 0))):
            print("%d IS a Leap Year" %y)
        else:
            print("%d IS NOT a Leap Year" %y)
    

In [2]:
#check your logic

is_leap(1904, 1932, 1986, 2000, 2008, 2016, 2021)

1904 IS a Leap Year
1932 IS a Leap Year
1986 IS NOT a Leap Year
2000 IS a Leap Year
2008 IS a Leap Year
2016 IS a Leap Year
2021 IS NOT a Leap Year
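
One way to check the logic, sketched under the assumption that the standard library's `calendar.isleap` is an acceptable reference: the hypothetical helper `leap_flag` applies the same condition as `is_leap` but returns a boolean, so it can be compared year by year.

```python
import calendar

# `leap_flag` is an illustrative helper using the same condition as `is_leap`.
def leap_flag(year):
    return year % 400 == 0 or (year % 4 == 0 and year % 100 != 0)

years = [1800, 1900, 1904, 1932, 1986, 2000, 2008, 2016, 2021, 2100, 2400]
for year in years:
    # compare the hand-written rule with the standard library
    assert leap_flag(year) == calendar.isleap(year)

print([year for year in years if leap_flag(year)])
# [1904, 1932, 2000, 2008, 2016, 2400]
```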


&nbsp;

## 3- Generators and the `yield` statement:

* we have already come across a few iterables, such as `range()` objects and the lists produced by list comprehensions.   
* a generator is an iterator: an object that supports the iteration protocol and produces its values one at a time, on demand, instead of building them all in advance.  
* generators can be created in two different ways:  
 - 1 -  using a `def` statement whose body contains a `yield` statement instead of a `return` statement.      
 - 2 -  a comprehension expression enclosed in parentheses `(`  `)`.     
         
         
* calling such a function, or evaluating such an expression, returns a generator object that can be assigned to a name.
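
A minimal sketch of both forms, using the hypothetical names `countdown` and `squares`. It also shows two properties that follow from producing values on demand: `next()` pulls one value at a time, and a generator can be consumed only once.

```python
# `countdown` is an illustrative name, not part of the original notes.
def countdown(n):
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)
print(next(gen))      # 3
print(next(gen))      # 2
print(list(gen))      # [1], only the values not yet consumed
print(list(gen))      # [], a generator can be iterated only once

# the second form: a generator expression in parentheses
squares = (n * n for n in range(4))
print(list(squares))  # [0, 1, 4, 9]
```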


the function below yields all numbers between 0 and 10000 that are divisible by 11, which makes it a generator function.

In [48]:
my_list = list(range(10001))

In [49]:
def divisible_11(a_list):
    for num in a_list:
        if num % 11 == 0:
            yield num

In [50]:
iterator_11 = divisible_11(my_list)

In [51]:
type(iterator_11)

generator

we can use `iterator_11` to do something with those values. 

In [52]:
for i in iterator_11:
    my_list[i] = '------'
    
print(my_list[:1000], end = ' ')

['------', 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, '------', 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, '------', 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, '------', 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, '------', 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, '------', 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, '------', 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, '------', 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, '------', 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, '------', 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, '------', 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, '------', 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, '------', 133, 134, 135, 136, 137, 138, 139, 140, 141, 142, '------', 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, '------', 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, '------', 166, 167, 168, 169, 170, 171, 172, 173, 174, 175, '------', 177, 178, 179, 180, 181, 182, 183, 184, 185, 186, '------', 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, '------', 199, 200

* but why do this when we can do it with a simple filtered list comprehension?
    - generators save memory.
    - like `range()` objects, generators do not require the entire list of results to be constructed.   
    - generators are better suited to very large result sets; for small sets, list comprehensions typically run faster.  

In [55]:
# let's try to build the index list using a conventional approach
divisible_list = []
for num in range(10001):
    if num % 11 == 0:
        divisible_list.append(num)

In [56]:
from sys import getsizeof
getsizeof(iterator_11), getsizeof(divisible_list)

(88, 7984)
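
When only an aggregate is needed, a generator expression can be passed straight to a function such as `sum()`, so no intermediate list is ever built. This is a sketch over the same range as above; the names `count` and `total` are illustrative.

```python
# count and add up the multiples of 11 between 0 and 10000 without storing them
count = sum(1 for num in range(10001) if num % 11 == 0)
total = sum(num for num in range(10001) if num % 11 == 0)
print(count, total)   # 910 4549545
```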

* the other way to create generators is a generator expression:

In [57]:
iterator_7 = (num for num in range(10000) if num % 7 == 0)

In [58]:
type(iterator_7)

generator

In [59]:
for i in iterator_7:
    my_list[i] = '_____'

In [60]:
print(my_list[:1000], sep=' ')

['_____', 1, 2, 3, 4, 5, 6, '_____', 8, 9, 10, '------', 12, 13, '_____', 15, 16, 17, 18, 19, 20, '_____', '------', 23, 24, 25, 26, 27, '_____', 29, 30, 31, 32, '------', 34, '_____', 36, 37, 38, 39, 40, 41, '_____', 43, '------', 45, 46, 47, 48, '_____', 50, 51, 52, 53, 54, '------', '_____', 57, 58, 59, 60, 61, 62, '_____', 64, 65, '------', 67, 68, 69, '_____', 71, 72, 73, 74, 75, 76, '_____', 78, 79, 80, 81, 82, 83, '_____', 85, 86, 87, '------', 89, 90, '_____', 92, 93, 94, 95, 96, 97, '_____', '------', 100, 101, 102, 103, 104, '_____', 106, 107, 108, 109, '------', 111, '_____', 113, 114, 115, 116, 117, 118, '_____', 120, '------', 122, 123, 124, 125, '_____', 127, 128, 129, 130, 131, '------', '_____', 134, 135, 136, 137, 138, 139, '_____', 141, 142, '------', 144, 145, 146, '_____', 148, 149, 150, 151, 152, 153, '_____', 155, 156, 157, 158, 159, 160, '_____', 162, 163, 164, '------', 166, 167, '_____', 169, 170, 171, 172, 173, 174, '_____', '------', 177, 178, 179, 180, 181, 

&nbsp;

<span style="color:blue">write a user defined function that iterates over a list of integers 1 to `n` and yields the integers which are perfect squares</span>

In [71]:
#skipped code
def perf_sq(num):
    for i in range(num):
        if (i**0.5).is_integer():
            yield i
    

In [72]:
perf_sq(200)

<generator object perf_sq at 0x106d72a40>

In [73]:
list(perf_sq(200))

[0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196]

similar to `range()` objects, to display the contents of a generator either `print()` the elements one by one or convert the generator object to a list.
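
The floating-point test `(i**0.5).is_integer()` is fine for small inputs but can misjudge very large integers. A possible alternative, sketched here with the hypothetical name `perfect_squares` and assuming Python 3.8 or later for `math.isqrt`, stays in integer arithmetic and covers 1 to `n` inclusive.

```python
import math

# `perfect_squares` is an illustrative name; math.isqrt needs Python 3.8+.
def perfect_squares(n):
    for i in range(1, n + 1):
        root = math.isqrt(i)        # integer square root, rounded down
        if root * root == i:
            yield i

print(list(perfect_squares(200)))
# [1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196]
```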

In [76]:
import random

states =["IL","CA","MI","MD","DE","WI","LA","AL","AK","IO","MI","MA","OH","KY"]

random.seed(26)
random_list=random.choices(states,k=50)

print(random_list)

['MI', 'MI', 'MI', 'AK', 'IL', 'MI', 'MI', 'IL', 'IO', 'AL', 'AK', 'KY', 'KY', 'WI', 'MA', 'MA', 'KY', 'IL', 'KY', 'MD', 'CA', 'AK', 'MA', 'KY', 'WI', 'MA', 'IL', 'KY', 'DE', 'AL', 'CA', 'IL', 'IL', 'AK', 'AL', 'MA', 'MA', 'WI', 'MA', 'MI', 'OH', 'MD', 'KY', 'WI', 'WI', 'AL', 'IL', 'CA', 'AL', 'IO']


In [78]:
def state_index(list_,ST):
    for index, st in enumerate(list_):
        if st == ST:
            yield index

In [79]:
index_gen=state_index(random_list,"IL")

In [80]:
# print all the indices
for i in index_gen : print(i)

4
7
17
26
31
32
46


&nbsp;

&nbsp;

## 4- `lambda` operator

* `lambda` is an **expression** used to create anonymous functions. Because it is an expression, it can appear in places where a `def` statement is not allowed, such as inside a list literal or a function call argument. 
* the body of a `lambda` is similar to that of a `def`, but it is a single bare expression whose value is returned. Being an expression makes it more limited than a `def`.   
* it is designed for simple tasks, and it is very useful in PySpark when used in conjunction with `map` and `reduce`.   

### general syntax:


#### `lambda`:
`lambda x,y: x+y`

if `x=(1,2,3)` is a list or a tuple of 3 elements:

`lambda x: x[2]*(x[0] + x[1])`
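
A short sketch of the places a `lambda` can appear where a `def` statement cannot. The names `operations`, `pairs` and `combine` are illustrative only.

```python
# lambdas inside a list literal
operations = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2]
print([op(3) for op in operations])          # [4, 6, 9]

# a lambda as a function call argument: sort the pairs by their second element
pairs = [(1, 'pear'), (3, 'apple'), (2, 'fig')]
print(sorted(pairs, key=lambda pair: pair[1]))
# [(3, 'apple'), (2, 'fig'), (1, 'pear')]

# the tuple expression from above, bound to a name only for demonstration
combine = lambda x: x[2] * (x[0] + x[1])
print(combine((1, 2, 3)))                    # 9
```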


#### `map()`:

`map(function, iterable)`, for example `map(lambda x: ..., list)` or `map(lambda x: ..., column)`

`map()`, `filter()` and `reduce()`, in conjunction with `lambda`, allow a user to apply functions to an object without having to write a formal loop or a user defined function. 
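
A minimal sketch of the three functions side by side, on a made-up list `numbers`. Note that in Python 3 `reduce` is not a built-in and has to be imported from `functools`.

```python
from functools import reduce

numbers = [1, 2, 3, 4, 5]

squares = list(map(lambda n: n * n, numbers))          # apply to every item
evens = list(filter(lambda n: n % 2 == 0, numbers))    # keep items that pass the test
product = reduce(lambda acc, n: acc * n, numbers)      # fold the items into one value

print(squares)   # [1, 4, 9, 16, 25]
print(evens)     # [2, 4]
print(product)   # 120
```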

In [81]:
def multiply(x,y):
    return(x*y)

In [82]:
multiply(5,6)

30

In [83]:
f = lambda x, y: x*y

In [84]:
f(5,6)

30

* the above is only a demonstration that `lambda` and `def` essentially do the same job. A `lambda` is rarely assigned to a name, because it is almost always applied as an expression within a statement. 

 
* `lambda` is used to perform operations on Python containers and data frames without the need to write a formal user defined function, in a manner that is similar to a list comprehension yet offering more flexibility. 

In [85]:
word_list = ('foo', 'bar', '_molly_', '423','gronk', '_wrong_' ,'hello kitty', 'sling', 'drag', '8', '__make__')

digits = filter(lambda word: word.isdigit(), word_list)
list(digits)

['423', '8']

In [86]:
tup = (3,2)

tup_dir = filter(lambda attri:not attri.endswith('_'), dir(tup))
list(tup_dir)

['count', 'index']

In [87]:
#alternative way
[i for i in dir(tup) if not i.startswith('_')]

['count', 'index']

<span style='color:blue'>convert the following list of speeds in mph to Mach. 1 Mach = 767.269 mph.    
use `round(number, ndigits)` to round to 2 decimal places.  </span>

In [89]:
mph = [1562, 8965, 124, 1125, 754, 3368]

In [None]:
#skipped code


should look like 


`[2.04, 11.68, 0.16, 1.47, 0.98, 4.39]`

* this can be achieved using a simple list comprehension:  

In [90]:
mach = [round(speed/767.269,2) for speed in mph]
mach

[2.04, 11.68, 0.16, 1.47, 0.98, 4.39]
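
The same conversion can be sketched with `map` and `lambda`. The constant name `MPH_PER_MACH` is an illustrative choice; the value is the one given in the exercise.

```python
mph = [1562, 8965, 124, 1125, 754, 3368]
MPH_PER_MACH = 767.269   # 1 Mach in mph, as stated in the exercise

mach = list(map(lambda speed: round(speed / MPH_PER_MACH, 2), mph))
print(mach)   # [2.04, 11.68, 0.16, 1.47, 0.98, 4.39]
```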

### so why use the lambda operator if we can achieve the same using list comprehensions?

* the true power of the `lambda` operator shows when it is used along with functions such as `map()`, `filter()` and `reduce()` to iterate over the rows of one or more columns in a dataframe. It allows fairly complex processing without the need for an explicit *for-statement*. 

In [91]:
# skipping ahead
import numpy as np
import pandas as pd

In [92]:
np.random.seed(11)
df = pd.DataFrame(np.random.random((20)).round(3).reshape(10,2), columns = ['col1','col2'])
df.head()

| | col1 | col2 |
|---|---|---|
| 0 | 0.18 | 0.019 |
| 1 | 0.463 | 0.725 |
| 2 | 0.42 | 0.485 |
| 3 | 0.013 | 0.487 |
| 4 | 0.942 | 0.851 |


* `col3` is created by placing the values in `col1` and `col2` in a tuple for every row. 

In [93]:
#combine x and y and put them in a tuple
df['col3'] = list(map(lambda x, y: (x,y) , df.col1, df.col2))
df

| | col1 | col2 | col3 |
|---|---|---|---|
| 0 | 0.18 | 0.019 | (0.18, 0.019) |
| 1 | 0.463 | 0.725 | (0.463, 0.725) |
| 2 | 0.42 | 0.485 | (0.42, 0.485) |
| 3 | 0.013 | 0.487 | (0.013, 0.487) |
| 4 | 0.942 | 0.851 | (0.942, 0.851) |
| 5 | 0.73 | 0.109 | (0.73, 0.109) |
| 6 | 0.894 | 0.857 | (0.894, 0.857) |
| 7 | 0.165 | 0.632 | (0.165, 0.632) |
| 8 | 0.02 | 0.117 | (0.02, 0.117) |
| 9 | 0.316 | 0.158 | (0.316, 0.158) |
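
When `map` is given several iterables, the `lambda` receives one item from each per call, and iteration stops at the shortest input. A pure-Python sketch with made-up lists standing in for the two columns:

```python
col1 = [0.18, 0.463, 0.42]
col2 = [0.019, 0.725, 0.485, 0.487]   # one extra value on purpose

pairs = list(map(lambda x, y: (x, y), col1, col2))
print(pairs)    # [(0.18, 0.019), (0.463, 0.725), (0.42, 0.485)]

# for plain pairing, zip gives the same result
print(pairs == list(zip(col1, col2)))   # True
```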


* `col4` uses the values in each tuple to calculate $x^2 + 2xy + y^2$ 

In [94]:
df['col4'] = list(map(lambda x: x[0]**2 + 2*x[0]*x[1] + x[1]**2, df.col3))
df

| | col1 | col2 | col3 | col4 |
|---|---|---|---|---|
| 0 | 0.18 | 0.019 | (0.18, 0.019) | 0.039601 |
| 1 | 0.463 | 0.725 | (0.463, 0.725) | 1.411344 |
| 2 | 0.42 | 0.485 | (0.42, 0.485) | 0.819025 |
| 3 | 0.013 | 0.487 | (0.013, 0.487) | 0.25 |
| 4 | 0.942 | 0.851 | (0.942, 0.851) | 3.214849 |
| 5 | 0.73 | 0.109 | (0.73, 0.109) | 0.703921 |
| 6 | 0.894 | 0.857 | (0.894, 0.857) | 3.066001 |
| 7 | 0.165 | 0.632 | (0.165, 0.632) | 0.635209 |
| 8 | 0.02 | 0.117 | (0.02, 0.117) | 0.018769 |
| 9 | 0.316 | 0.158 | (0.316, 0.158) | 0.224676 |


* the `lambda` operator can also be used in conjunction with `filter` to find occurrences of key words in a text.  
* below we read a small sample text file, `data/tweets.txt`. 

In [3]:
# the with-statement closes the file automatically
with open('data/tweets.txt', 'r') as file:
    tweets = file.readlines()

* this is a small sample of tweets collected from accounts that tweet about topics related to data science and machine learning.      
* we would like to check if any of the phrases in our list `key_words` appear in these tweets. 

In [4]:
tweets

['Meet Vestri - relies on a deeplearning technology called dynamic neural advection - learns like a baby and predicts future outcomes!\n',
 'How Can Natural Language Processing Change Business Intelligence? by @MicRum @Medium |\n',
 'You have until Friday to save 65% on tickets to Open Data Science Conference in Boston, May 1-4. Early bird pricing will end and ticket prices will rise, so get yours today! Register https://hubs.ly/H09DPKc0  #BigData #DataScience #AI DeepLearning MachineLearning #ODSC via @odsc\n',
 'New blog post: "What\'s the difference between data science, machine learning, and artificial intelligence?" http://varianceexplained.org/r/ds-ml-ai/\n',
 "In an era of algorithms and big data mining, Hawkey's a reminder that the best election strategy in Australia is probably still turning up to the cricket and sculling a beer.\n",
 'Machine Learning e Data Mining https://click.linksynergy.com/link?id=*YZD2vKyNUY&offerid=358574.1195734&type=2&murl=https%3A%2F%2Fwww.udemy.com

* we can read the tweets into a DataFrame given the structure of the file. 

In [5]:
import pandas as pd

In [6]:
pd.set_option('display.max_colwidth', 60)

In [7]:
tweets = pd.read_table('data/tweets.txt', names = ['text'])
tweets

| | text |
|---|---|
| 0 | Meet Vestri - relies on a deeplearning technology called... |
| 1 | How Can Natural Language Processing Change Business Inte... |
| 2 | You have until Friday to save 65% on tickets to Open Dat... |
| 3 | New blog post: "What's the difference between data scien... |
| 4 | In an era of algorithms and big data mining, Hawkey's a ... |
| 5 | Machine Learning e Data Mining https://click.linksynergy... |
| 6 | EXCLUSIVE: Cambridge Analytica, the pro-Trump data-minin... |
| 7 | How Will Machine Learning Address Cyber Security Problem... |
| 8 | How Can Natural Language Processing Change Business Inte... |
| 9 | AI And Deep Learning – A Review of The Past 12 Months AI... |


<span style='color:blue'>add a new column `split` in which every row from column `text` is converted to lower case and split on whitespace. Use `map` and `lambda`.</span>

In [9]:
#skipped code 
# split each text into a list of lower-cased words
tweets['split']=list(map(lambda x: x.lower().split(),tweets["text"]))

In [10]:
tweets

| | text | split |
|---|---|---|
| 0 | Meet Vestri - relies on a deeplearning technology called... | [meet, vestri, -, relies, on, a, deeplearning, technolog... |
| 1 | How Can Natural Language Processing Change Business Inte... | [how, can, natural, language, processing, change, busine... |
| 2 | You have until Friday to save 65% on tickets to Open Dat... | [you, have, until, friday, to, save, 65%, on, tickets, t... |
| 3 | New blog post: "What's the difference between data scien... | [new, blog, post:, "what's, the, difference, between, da... |
| 4 | In an era of algorithms and big data mining, Hawkey's a ... | [in, an, era, of, algorithms, and, big, data, mining,, h... |
| 5 | Machine Learning e Data Mining https://click.linksynergy... | [machine, learning, e, data, mining, https://click.links... |
| 6 | EXCLUSIVE: Cambridge Analytica, the pro-Trump data-minin... | [exclusive:, cambridge, analytica,, the, pro-trump, data... |
| 7 | How Will Machine Learning Address Cyber Security Problem... | [how, will, machine, learning, address, cyber, security,... |
| 8 | How Can Natural Language Processing Change Business Inte... | [how, can, natural, language, processing, change, busine... |
| 9 | AI And Deep Learning – A Review of The Past 12 Months AI... | [ai, and, deep, learning, –, a, review, of, the, past, 1... |


&nbsp;

`filter()`, as the name suggests, applies a filter to an object: it keeps only the items for which the given function returns a true value.    

In [11]:
statement = 'The Koh-i-Noor is a 106 carats diamond which was once the largest diamond in the world.'

In [12]:
stop_words = ['is', 'was','are','which','a','the','in','on','for','from']

In [13]:
list(filter(lambda x: x not in stop_words, statement.lower().split()))

['koh-i-noor',
 '106',
 'carats',
 'diamond',
 'once',
 'largest',
 'diamond',
 'world.']

In [14]:
' '.join( list(filter(lambda x: x not in stop_words, statement.lower().split())) )

'koh-i-noor 106 carats diamond once largest diamond world.'
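
For comparison, a sketch of the same stop-word removal written as a list comprehension. Storing the stop words in a `set` is an optional choice made here because membership tests on sets are fast; the name `kept` is illustrative.

```python
statement = 'The Koh-i-Noor is a 106 carats diamond which was once the largest diamond in the world.'
stop_words = {'is', 'was', 'are', 'which', 'a', 'the', 'in', 'on', 'for', 'from'}

# keep every lower-cased word that is not a stop word
kept = [word for word in statement.lower().split() if word not in stop_words]
print(' '.join(kept))
# koh-i-noor 106 carats diamond once largest diamond world.
```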

&nbsp;

<span style="color:blue">add a new column `filtered` to the <u>tweets</u> data frame. This column checks the words in the column `split` against the words in the list `key_words` and keeps the ones common to both.</span>   

In [16]:
key_words = ['data', 'science', 'scientist', 'machine', 'learning', 'deeplearning', 'natural', 'language', \
             'big', 'data', 'deep','machinelearning', 'predictive', 'ai', 'mining', 'data-mining']

In [None]:
#skipped code

In [18]:
tweets["filtered"]= list(map(lambda x: list(filter(lambda y : y in key_words, x)),tweets['split']))

In [19]:
tweets

| | text | split | filtered |
|---|---|---|---|
| 0 | Meet Vestri - relies on a deeplearning technology called... | [meet, vestri, -, relies, on, a, deeplearning, technolog... | [deeplearning] |
| 1 | How Can Natural Language Processing Change Business Inte... | [how, can, natural, language, processing, change, busine... | [natural, language] |
| 2 | You have until Friday to save 65% on tickets to Open Dat... | [you, have, until, friday, to, save, 65%, on, tickets, t... | [data, science, deeplearning, machinelearning] |
| 3 | New blog post: "What's the difference between data scien... | [new, blog, post:, "what's, the, difference, between, da... | [data, machine] |
| 4 | In an era of algorithms and big data mining, Hawkey's a ... | [in, an, era, of, algorithms, and, big, data, mining,, h... | [big, data] |
| 5 | Machine Learning e Data Mining https://click.linksynergy... | [machine, learning, e, data, mining, https://click.links... | [machine, learning, data, mining] |
| 6 | EXCLUSIVE: Cambridge Analytica, the pro-Trump data-minin... | [exclusive:, cambridge, analytica,, the, pro-trump, data... | [data-mining] |
| 7 | How Will Machine Learning Address Cyber Security Problem... | [how, will, machine, learning, address, cyber, security,... | [machine, learning] |
| 8 | How Can Natural Language Processing Change Business Inte... | [how, can, natural, language, processing, change, busine... | [natural, language] |
| 9 | AI And Deep Learning – A Review of The Past 12 Months AI... | [ai, and, deep, learning, –, a, review, of, the, past, 1... | [ai, deep, learning, ai, machinelearning] |
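
In row 4 the word `mining,` is not matched because of its trailing comma. One possible refinement, sketched on a shortened copy of that tweet and a reduced key-word set (both chosen here for illustration), is to strip leading and trailing punctuation from each word before filtering.

```python
import string

# a reduced key-word set and a shortened tweet, for illustration only
key_words = {'data', 'big', 'mining', 'machine', 'learning'}
text = "In an era of algorithms and big data mining, Hawkey's a reminder"

raw = text.lower().split()
# remove punctuation at the edges of each word, e.g. 'mining,' -> 'mining'
stripped = [word.strip(string.punctuation) for word in raw]

print(list(filter(lambda w: w in key_words, raw)))        # ['big', 'data']
print(list(filter(lambda w: w in key_words, stripped)))   # ['big', 'data', 'mining']
```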


&nbsp;

`lambda` is a flexible operator, and the objects it returns can be of any container type. 

In [20]:
pizza_meat = pd.DataFrame([['pepperoni', 5],['beef', 7],['steak', 5],['sausage', 10], ['italian sausage', 23],['chicken', 6],['anchoves', 6]], 
             columns = ['toppings', 'price'])

pizza_meat

| | toppings | price |
|---|---|---|
| 0 | pepperoni | 5 |
| 1 | beef | 7 |
| 2 | steak | 5 |
| 3 | sausage | 10 |
| 4 | italian sausage | 23 |
| 5 | chicken | 6 |
| 6 | anchoves | 6 |


In [21]:
pizza_meat['dict'] = list(map(lambda x,y: {x: y}, pizza_meat.toppings, pizza_meat.price))
pizza_meat

| | toppings | price | dict |
|---|---|---|---|
| 0 | pepperoni | 5 | {'pepperoni': 5} |
| 1 | beef | 7 | {'beef': 7} |
| 2 | steak | 5 | {'steak': 5} |
| 3 | sausage | 10 | {'sausage': 10} |
| 4 | italian sausage | 23 | {'italian sausage': 23} |
| 5 | chicken | 6 | {'chicken': 6} |
| 6 | anchoves | 6 | {'anchoves': 6} |


In [22]:
pizza_meat["dict"]

0           {'pepperoni': 5}
1                {'beef': 7}
2               {'steak': 5}
3            {'sausage': 10}
4    {'italian sausage': 23}
5             {'chicken': 6}
6            {'anchoves': 6}
Name: dict, dtype: object
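
The same idea works without pandas. In this sketch, on two short made-up lists, a `lambda` returns a one-item dictionary per row, while a `lambda` that returns `(key, value)` tuples can feed `dict()` to build a single combined dictionary. The names `per_row` and `menu` are illustrative.

```python
toppings = ['pepperoni', 'beef', 'steak']
prices = [5, 7, 5]

# one small dictionary per row, as in the `dict` column above
per_row = list(map(lambda t, p: {t: p}, toppings, prices))
print(per_row)   # [{'pepperoni': 5}, {'beef': 7}, {'steak': 5}]

# (key, value) tuples collected into a single dictionary
menu = dict(map(lambda t, p: (t, p), toppings, prices))
print(menu)      # {'pepperoni': 5, 'beef': 7, 'steak': 5}
```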