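# Apply a Function Using Multiple Columns
Sometimes you want to combine the values of two columns into one, for example to join them into a single string or to build a date from separate day and month columns.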
## lambda is used
Calling apply with the axis=1 parameter runs a function once per row of the dataframe. You can pass a named function, but here we used a lambda. Each row is passed in to the lambda as x (a pandas Series, not the whole dataframe), and we selected the columns by name with the syntax x["day"], x["month"].
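To make the row-by-row behaviour concrete, here is a minimal sketch. The small dataframe called demo is hypothetical and exists only for this illustration; it is not the df built in the cells below.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
demo = pd.DataFrame({"month": ["jan", "feb"], "day": [23, 20]})

# With axis=1 the function is called once per row; each call receives a Series
row_types = demo.apply(lambda x: type(x).__name__, axis=1)

# That Series is indexed by column name, which is why x["month"] works
labels = demo.apply(lambda x: "%s %s" % (x["month"], x["day"]), axis=1)

print(row_types.tolist())
print(labels.tolist())
```

The first print shows the type name of what the lambda received for each row, the second shows the combined month and day strings.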
## functions are okay
We could just as easily have defined a named function beforehand and passed it to apply:

    def myfunc(x):
        # x is expected to be one row of the dataframe (a pandas Series)
        return datetime.strptime("%s %s 2020" % (x["day"], x["month"]), "%d %b %Y")
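
As a sketch of how such a predefined function would be used, it can be passed to apply by name, with axis=1 so that it receives one row at a time. The dataframe called sample is an assumption made for this example only.

```python
import pandas as pd
from datetime import datetime

def myfunc(x):
    # x is one row of the dataframe (a pandas Series)
    return datetime.strptime("%s %s 2020" % (x["day"], x["month"]), "%d %b %Y")

# Hypothetical data, mirroring the month/day layout used in this notebook
sample = pd.DataFrame([["jan", 23], ["feb", 20]], columns=["month", "day"])

# Pass the function itself (no parentheses); axis=1 means one call per row
sample["dto"] = sample.apply(myfunc, axis=1)
print(sample)
```

A named function is easier to test and reuse than a lambda, at the cost of a few extra lines.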

## Useful websites
- https://www.programiz.com/python-programming/datetime/strptime 
- https://www.guru99.com/date-time-and-datetime-classes-in-python.html 
- https://www.w3schools.com/python/python_datetime.asp 
- https://stackoverflow.com/questions/13331698/how-to-apply-a-function-to-two-columns-of-pandas-dataframe
- https://stackoverflow.com/questions/34855859/is-there-a-way-in-pandas-to-use-previous-row-value-in-dataframe-apply-when-previ#34856727


In [33]:
import pandas as pd
from datetime import datetime, date

In [34]:
# make a simple dataframe
rd = [['jan', 23], ['feb', 20], ['mar', 1], ['apr', 15]]

In [35]:
y = datetime.strptime("%s %s 2020" % (12, "feb"), "%d %b %Y")

In [36]:
df = pd.DataFrame(rd, columns = ['month', 'day'])

In [37]:
df

| | month | day |
|---|---|---|
| 0 | jan | 23 |
| 1 | feb | 20 |
| 2 | mar | 1 |
| 3 | apr | 15 |


In [38]:
# apply an anonymous (lambda) function to each row of the data frame and store the result in a new column
df["daymonth"] = df.apply(lambda x: "%s %s" %(x["month"], x["day"]), axis=1)

In [39]:
df

| | month | day | daymonth |
|---|---|---|---|
| 0 | jan | 23 | jan 23 |
| 1 | feb | 20 | feb 20 |
| 2 | mar | 1 | mar 1 |
| 3 | apr | 15 | apr 15 |


In [40]:
df['dto'] = df.apply(lambda x: datetime.strptime("%s %s 2020" % (x["day"], x["month"]), "%d %b %Y"), axis=1)
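
For larger dataframes, a row-by-row apply can be slow. One possible alternative, which is a different technique from the lambda used in this notebook, is to build the date strings with column arithmetic and parse them in one call to pd.to_datetime. This is a sketch under the assumption that the month abbreviations are English and the year is always 2020; the dataframe called sample is hypothetical.

```python
import pandas as pd

# Hypothetical copy of the month/day data used in this notebook
sample = pd.DataFrame(
    [["jan", 23], ["feb", 20], ["mar", 1], ["apr", 15]],
    columns=["month", "day"],
)

# Build strings such as "23 jan 2020" for the whole column at once
text = sample["day"].astype(str) + " " + sample["month"] + " 2020"

# Parse every string in a single call, using the same format as strptime above
sample["dto"] = pd.to_datetime(text, format="%d %b %Y")
print(sample)
```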

In [41]:
df

| | month | day | daymonth | dto |
|---|---|---|---|---|
| 0 | jan | 23 | jan 23 | 2020-01-23 |
| 1 | feb | 20 | feb 20 | 2020-02-20 |
| 2 | mar | 1 | mar 1 | 2020-03-01 |
| 3 | apr | 15 | apr 15 | 2020-04-15 |


# Getting Differences Between Rows
Now that the dataframe has a column of datetime objects, we can do date arithmetic. How many days are there between the dates in consecutive rows?

In [46]:
# Row 0 has no previous row (there is no -1 row), so its difference cannot be
# computed; we seed it with a placeholder instead. The placeholder is a timedelta
# (here -32 days, from two arbitrary dates) so that all 'cells' in the column
# are of the same type.
df.loc[0, "daysdiff"] = datetime(2020, 10, 10) - datetime(2020, 11, 11)

In [48]:
# now we iterate over the remaining rows: previous row's date minus this row's date
for cell in range(1, len(df)):
    df.loc[cell, "daysdiff"] = df.loc[cell-1, "dto"] - df.loc[cell, "dto"]
df

| | month | day | daymonth | dto | daysdiff |
|---|---|---|---|---|---|
| 0 | jan | 23 | jan 23 | 2020-01-23 | -32 days |
| 1 | feb | 20 | feb 20 | 2020-02-20 | -28 days |
| 2 | mar | 1 | mar 1 | 2020-03-01 | -10 days |
| 3 | apr | 15 | apr 15 | 2020-04-15 | -45 days |
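
The same row-to-row differences can also be computed without an explicit loop. The sketch below uses Series.shift to line each date up with the one from the previous row; this is an alternative to the loop above, not the method used in this notebook. The dataframe called dates is hypothetical, and the first row is left as NaT (missing) rather than seeded with a placeholder.

```python
import pandas as pd

# Hypothetical frame holding the same four dates as the dto column
dates = pd.DataFrame(
    {"dto": pd.to_datetime(["2020-01-23", "2020-02-20", "2020-03-01", "2020-04-15"])}
)

# shift(1) moves every value down one row, so each row sees the previous date;
# subtracting gives "previous minus current", matching the loop's sign convention
dates["daysdiff"] = dates["dto"].shift(1) - dates["dto"]
print(dates)
```

Because row 0 has no predecessor, its daysdiff is NaT, which keeps the column a uniform timedelta type without inventing a value.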
