## Elementwise Operations
Pandas series and dataframes support elementwise operations. An elementwise operation is one that is performed on each element of the series/dataframe. This gives us a convenient way to update columns or to create new columns that are computed from other columns in the dataframe.

Consider the pollution dataset below.

In [1]:
import pandas as pd
pollution_data = pd.read_csv('LSTM-Multivariate_pollution.csv', index_col = 'date', parse_dates = True)

Suppose the 'temp' column is measured in Celsius and we want to convert it to Fahrenheit. We could create a new column with the code:

In [2]:
pollution_data['temp_Fahrenheit'] = pollution_data['temp'] * 9 / 5 + 32
pollution_data

date,pollution,dew,temp,press,wnd_dir,wnd_spd,snow,rain,temp_Fahrenheit
2/01/2010 0:00,129,-16,-4.0,1020.0,SE,1.79,0,0,24.8
2/01/2010 1:00,148,-15,-4.0,1020.0,SE,2.68,0,0,24.8
2/01/2010 2:00,159,-11,-5.0,1021.0,SE,3.57,0,0,23.0
2/01/2010 3:00,181,-7,-5.0,1022.0,SE,5.36,1,0,23.0
2/01/2010 4:00,138,-7,-5.0,1022.0,SE,6.25,2,0,23.0
...,...,...,...,...,...,...,...,...,...
31/12/2014 19:00,8,-23,-2.0,1034.0,NW,231.97,0,0,28.4
31/12/2014 20:00,10,-22,-3.0,1034.0,NW,237.78,0,0,26.6
31/12/2014 21:00,10,-22,-3.0,1034.0,NW,242.70,0,0,26.6
31/12/2014 22:00,8,-22,-4.0,1034.0,NW,246.72,0,0,24.8


Notice that the arithmetic is performed elementwise, that is, on each element of the 'temp' column. Alternatively, you could overwrite the 'temp' column instead of creating a new one.

In [3]:
pollution_data['temp'] = pollution_data['temp'] * 9 / 5 + 32
pollution_data

date,pollution,dew,temp,press,wnd_dir,wnd_spd,snow,rain,temp_Fahrenheit
2/01/2010 0:00,129,-16,24.8,1020.0,SE,1.79,0,0,24.8
2/01/2010 1:00,148,-15,24.8,1020.0,SE,2.68,0,0,24.8
2/01/2010 2:00,159,-11,23.0,1021.0,SE,3.57,0,0,23.0
2/01/2010 3:00,181,-7,23.0,1022.0,SE,5.36,1,0,23.0
2/01/2010 4:00,138,-7,23.0,1022.0,SE,6.25,2,0,23.0
...,...,...,...,...,...,...,...,...,...
31/12/2014 19:00,8,-23,28.4,1034.0,NW,231.97,0,0,28.4
31/12/2014 20:00,10,-22,26.6,1034.0,NW,237.78,0,0,26.6
31/12/2014 21:00,10,-22,26.6,1034.0,NW,242.70,0,0,26.6
31/12/2014 22:00,8,-22,24.8,1034.0,NW,246.72,0,0,24.8
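
As a minimal sketch of the same conversion on a tiny hand-made dataframe (the name `toy` and its values are made up for illustration and are not taken from the pollution dataset), the code below also shows why overwriting a column deserves some care: applying the conversion a second time converts values that are already in Fahrenheit.

```python
import pandas as pd

# Hypothetical toy data, not from the pollution dataset.
toy = pd.DataFrame({'temp': [-4.0, -5.0, 0.0]})

# The arithmetic is applied to every element of the 'temp' column.
toy['temp_Fahrenheit'] = toy['temp'] * 9 / 5 + 32

# Converting an already converted column gives different (wrong) values,
# which is what re-running an in-place overwrite would do.
once = toy['temp'] * 9 / 5 + 32
twice = once * 9 / 5 + 32

print(toy)
print(twice.tolist())
```

Writing the result to a new column, as in the first version, can be re-run safely because the original 'temp' values are left untouched.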


Elementwise computations can also involve multiple columns of a dataframe. For example, to create a column 'snow and rain' that is the elementwise sum of the 'snow' and 'rain' columns, we would simply type:

In [4]:
pollution_data['snow and rain'] = pollution_data['snow'] + pollution_data['rain']
pollution_data

date,pollution,dew,temp,press,wnd_dir,wnd_spd,snow,rain,temp_Fahrenheit,snow and rain
2/01/2010 0:00,129,-16,24.8,1020.0,SE,1.79,0,0,24.8,0
2/01/2010 1:00,148,-15,24.8,1020.0,SE,2.68,0,0,24.8,0
2/01/2010 2:00,159,-11,23.0,1021.0,SE,3.57,0,0,23.0,0
2/01/2010 3:00,181,-7,23.0,1022.0,SE,5.36,1,0,23.0,1
2/01/2010 4:00,138,-7,23.0,1022.0,SE,6.25,2,0,23.0,2
...,...,...,...,...,...,...,...,...,...,...
31/12/2014 19:00,8,-23,28.4,1034.0,NW,231.97,0,0,28.4,0
31/12/2014 20:00,10,-22,26.6,1034.0,NW,237.78,0,0,26.6,0
31/12/2014 21:00,10,-22,26.6,1034.0,NW,242.70,0,0,26.6,0
31/12/2014 22:00,8,-22,24.8,1034.0,NW,246.72,0,0,24.8,0
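
The two-column addition can be checked by hand on a small example. This is only a sketch: the dataframe `toy` and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data, not from the pollution dataset.
toy = pd.DataFrame({'snow': [0, 1, 2], 'rain': [0, 0, 3]})

# The + operator adds the values that sit in the same row of each column.
toy['snow and rain'] = toy['snow'] + toy['rain']

# Series.add is the method form of the + operator.
same = toy['snow'].add(toy['rain'])

print(toy['snow and rain'].tolist())  # [0, 1, 5]
```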


You should recall from the previous page that comparisons are also elementwise. We can use this to compute new columns. For example, to create a column 'is_rain' that is True when there is rain and False when there is none, we could type:

In [5]:
pollution_data['is_rain'] = pollution_data['rain'] > 0
pollution_data

date,pollution,dew,temp,press,wnd_dir,wnd_spd,snow,rain,temp_Fahrenheit,snow and rain,is_rain
2/01/2010 0:00,129,-16,24.8,1020.0,SE,1.79,0,0,24.8,0,False
2/01/2010 1:00,148,-15,24.8,1020.0,SE,2.68,0,0,24.8,0,False
2/01/2010 2:00,159,-11,23.0,1021.0,SE,3.57,0,0,23.0,0,False
2/01/2010 3:00,181,-7,23.0,1022.0,SE,5.36,1,0,23.0,1,False
2/01/2010 4:00,138,-7,23.0,1022.0,SE,6.25,2,0,23.0,2,False
...,...,...,...,...,...,...,...,...,...,...,...
31/12/2014 19:00,8,-23,28.4,1034.0,NW,231.97,0,0,28.4,0,False
31/12/2014 20:00,10,-22,26.6,1034.0,NW,237.78,0,0,26.6,0,False
31/12/2014 21:00,10,-22,26.6,1034.0,NW,242.70,0,0,26.6,0,False
31/12/2014 22:00,8,-22,24.8,1034.0,NW,246.72,0,0,24.8,0,False
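
Because every displayed row above has no rain, the output shows only False. A small sketch with invented values (the dataframe `toy` and the column name 'any_precipitation' are hypothetical) makes both outcomes visible, and also shows that two elementwise comparisons can be combined with the `|` operator.

```python
import pandas as pd

# Hypothetical toy data, not from the pollution dataset.
toy = pd.DataFrame({'snow': [0, 1, 0], 'rain': [0, 0, 3]})

# Each element of 'rain' is compared with 0, giving one boolean per row.
toy['is_rain'] = toy['rain'] > 0

# | is the elementwise "or". The parentheses are needed because
# | binds more tightly than > in Python.
toy['any_precipitation'] = (toy['rain'] > 0) | (toy['snow'] > 0)

print(toy)
```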


*** 
## The apply method
While pandas supports elementwise use of simple operators, ordinary Python functions are generally not applied elementwise when they are handed a series or dataframe. For example, suppose we wanted our 'is_rain' column to say 'wet' when there is rain and 'dry' when there is no rain. One could propose the function:

In [6]:
def is_rain(rain):
    if rain > 0:
        return 'wet'
    else:
        return 'dry'

If we try to call this on the rain column as below, we get an error. This is because the whole column is passed to the function at once, so the comparison rain > 0 produces a series of True/False values rather than a single value, and the if statement cannot decide which branch to take.

In [7]:
pollution_data['is_rain'] = is_rain(pollution_data['rain'])

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
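
The error can be reproduced on a small series to see where it comes from. This is a sketch with invented values. The comparison itself succeeds and returns one boolean per element, and it is the `if` statement that fails.

```python
import pandas as pd

def is_rain(rain):
    if rain > 0:
        return 'wet'
    else:
        return 'dry'

# Hypothetical toy values, not from the pollution dataset.
rain = pd.Series([0, 0, 3])

# The comparison is elementwise and works fine on its own.
mask = rain > 0
print(mask.tolist())  # [False, False, True]

# Passing the whole series to the function fails at the `if`.
message = None
try:
    is_rain(rain)
except ValueError as err:
    message = str(err)
    print(message)
```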

To apply the function for each element, we can use the apply method. In this case, we just supply one argument which is the function we would like to apply elementwise.

In [8]:
pollution_data['is_rain'] = pollution_data['rain'].apply(is_rain)
pollution_data

date,pollution,dew,temp,press,wnd_dir,wnd_spd,snow,rain,temp_Fahrenheit,snow and rain,is_rain
2/01/2010 0:00,129,-16,24.8,1020.0,SE,1.79,0,0,24.8,0,dry
2/01/2010 1:00,148,-15,24.8,1020.0,SE,2.68,0,0,24.8,0,dry
2/01/2010 2:00,159,-11,23.0,1021.0,SE,3.57,0,0,23.0,0,dry
2/01/2010 3:00,181,-7,23.0,1022.0,SE,5.36,1,0,23.0,1,dry
2/01/2010 4:00,138,-7,23.0,1022.0,SE,6.25,2,0,23.0,2,dry
...,...,...,...,...,...,...,...,...,...,...,...
31/12/2014 19:00,8,-23,28.4,1034.0,NW,231.97,0,0,28.4,0,dry
31/12/2014 20:00,10,-22,26.6,1034.0,NW,237.78,0,0,26.6,0,dry
31/12/2014 21:00,10,-22,26.6,1034.0,NW,242.70,0,0,26.6,0,dry
31/12/2014 22:00,8,-22,24.8,1034.0,NW,246.72,0,0,24.8,0,dry
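
For a simple two-way choice like this one, a vectorised expression is a possible alternative to `apply`. The sketch below uses NumPy's `where` function on invented values and checks that it agrees with `apply`. Using `np.where` here is a suggestion for comparison, not something the original example relies on.

```python
import numpy as np
import pandas as pd

def is_rain(rain):
    return 'wet' if rain > 0 else 'dry'

# Hypothetical toy values, not from the pollution dataset.
rain = pd.Series([0, 0, 3])

# apply calls is_rain once for each element of the series.
applied = rain.apply(is_rain)

# np.where picks 'wet' where the condition is True and 'dry' elsewhere.
vectorised = pd.Series(np.where(rain > 0, 'wet', 'dry'), index=rain.index)

print(applied.tolist())  # ['dry', 'dry', 'wet']
```

The `apply` method remains the general tool when the logic does not fit into a single vectorised expression.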


In the 'is_rain' example, we called the apply method on a series. We can also call the apply method on a dataframe. For example, suppose we wanted a new column called 'weather' that is 'rain' when there is rain, 'snow' when there is snow (and no rain) and 'dry' if there is neither. This involves looking at two columns, so the function needs access to a whole row of the dataframe rather than a single value. We could define a function like so:

In [9]:
def weather(row):
    # With axis = 1, apply passes each row in as a series indexed by column name.
    if row['rain'] > 0:
        return 'rain'
    elif row['snow'] > 0:
        return 'snow'
    else:
        return 'dry'

To apply it to each row, we would call the apply method like so:

In [10]:
pollution_data['weather'] = pollution_data.apply(weather, axis = 1)
pollution_data

date,pollution,dew,temp,press,wnd_dir,wnd_spd,snow,rain,temp_Fahrenheit,snow and rain,is_rain,weather
2/01/2010 0:00,129,-16,24.8,1020.0,SE,1.79,0,0,24.8,0,dry,dry
2/01/2010 1:00,148,-15,24.8,1020.0,SE,2.68,0,0,24.8,0,dry,dry
2/01/2010 2:00,159,-11,23.0,1021.0,SE,3.57,0,0,23.0,0,dry,dry
2/01/2010 3:00,181,-7,23.0,1022.0,SE,5.36,1,0,23.0,1,dry,snow
2/01/2010 4:00,138,-7,23.0,1022.0,SE,6.25,2,0,23.0,2,dry,snow
...,...,...,...,...,...,...,...,...,...,...,...,...
31/12/2014 19:00,8,-23,28.4,1034.0,NW,231.97,0,0,28.4,0,dry,dry
31/12/2014 20:00,10,-22,26.6,1034.0,NW,237.78,0,0,26.6,0,dry,dry
31/12/2014 21:00,10,-22,26.6,1034.0,NW,242.70,0,0,26.6,0,dry,dry
31/12/2014 22:00,8,-22,24.8,1034.0,NW,246.72,0,0,24.8,0,dry,dry


Notice that in this case we need to provide a second argument. By default (axis = 0), the apply method on a dataframe passes each column to the function as a series. To have each row passed to the function instead, you need the axis = 1 argument.
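
The difference between the two axis settings can be seen in a small sketch. The dataframe `toy`, its row labels and its values are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data, not from the pollution dataset.
toy = pd.DataFrame({'snow': [0, 1, 0], 'rain': [0, 0, 3]},
                   index=['a', 'b', 'c'])

# Default (axis=0): the function receives each column as a series,
# so the result has one value per column.
per_column = toy.apply(lambda col: col.max())

# axis=1: the function receives each row as a series,
# so the result has one value per row.
per_row = toy.apply(lambda row: row.max(), axis=1)

def weather(row):
    if row['rain'] > 0:
        return 'rain'
    elif row['snow'] > 0:
        return 'snow'
    else:
        return 'dry'

labels = toy.apply(weather, axis=1)

print(per_column.to_dict())  # {'snow': 1, 'rain': 3}
print(per_row.to_dict())     # {'a': 0, 'b': 1, 'c': 3}
print(labels.tolist())       # ['dry', 'snow', 'rain']
```

In this toy example, calling `toy.apply(weather)` without `axis=1` would hand the function a column whose labels are 'a', 'b' and 'c', so looking up 'rain' inside it would fail with a KeyError.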