# Transforming and Enriching Data

First, install the required R packages if you have not already done so. See [Installing Required R Packages](../00_Installing_Required_R_Packages.ipynb).

## Loading Necessary Packages

In [1]:
library(tidyverse)

── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.5
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.5.2     ✔ tibble    3.2.1
✔ lubridate 1.9.4     ✔ tidyr     1.3.1
✔ purrr     1.0.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors


## Loading Combined Data

In [2]:
load("01_Combining_Data.RData")
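
`load()` restores objects under the names they were saved with, so the cells below rely on the file containing a data frame named `df`. A minimal sketch of how this could be checked, using a temporary file and a toy data frame (both hypothetical stand-ins for `01_Combining_Data.RData`):

```r
# Toy stand-in for the combined data; the real file comes from the previous notebook
df <- data.frame(id = 1:3)
tmp_file <- tempfile(fileext = ".RData")
save(df, file = tmp_file)
rm(df)

# load() invisibly returns the names of the objects it restored
loaded_names <- load(tmp_file)
stopifnot("df" %in% loaded_names)
dim(df)
```

Capturing the return value of `load()` makes it explicit which objects entered the workspace, which can help when a file holds more than one object.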

## Feature Engineering

### Replace codes with labels for demHomeOwner

In [3]:
df <- df %>%
  mutate(demHomeOwner = recode(DemHomeOwnerCode, 'H' = 'HomeOwner', 'U' = 'Unknown')) %>%
  select(-DemHomeOwnerCode)  # Drop the original column

head(df$demHomeOwner)  # Display the first few values
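
As a quick check of the mapping, the sketch below applies the same `recode()` call to a toy vector of codes (the values are made up for illustration, not taken from the data). With character input, `recode()` leaves unmatched codes unchanged and keeps `NA` as `NA`, so tabulating the result is a simple way to spot unexpected codes.

```r
library(dplyr)

# Hypothetical codes; "X" stands in for an unexpected value
codes <- c("H", "U", "H", NA, "X")

labels <- recode(codes, "H" = "HomeOwner", "U" = "Unknown")

# Count each label, including missing values
table(labels, useNA = "ifany")
```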

### Compute customer age

In [4]:
df <- df %>%
  # Approximate age in whole years as of today (365.25 days per year allows for leap years)
  mutate(customerAge = as.integer(as.numeric(difftime(Sys.Date(), birthDate, units = "days")) / 365.25)) %>%
  select(-birthDate)  # Drop the original column

head(df$customerAge)  # Display the first few values
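
Because the calculation uses `Sys.Date()`, the computed ages change depending on when the notebook is run. If reproducibility matters, one option is to fix the reference date. The sketch below applies the same formula to toy birth dates; `reference_date` and its value are assumptions for illustration.

```r
# Hypothetical birth dates and a fixed reference date
birth_dates <- as.Date(c("1980-06-15", "2000-03-01"))
reference_date <- as.Date("2024-01-01")

# Same calculation as in the notebook: elapsed days divided by 365.25,
# truncated to whole years
age_days <- as.numeric(difftime(reference_date, birth_dates, units = "days"))
customer_age <- as.integer(age_days / 365.25)
customer_age
```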

### Compute average purchase amount per ad

In [5]:
df <- df %>%
  mutate(AvgPurchasePerAd = AvgPurchaseAmount12 / intAdExposureCount12)

head(df$AvgPurchasePerAd)  # Display the first few values
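
If `intAdExposureCount12` can be zero, the division yields `Inf` or `NaN` for those rows. A minimal sketch of one way to handle this, assuming that a missing value is the preferred result when there were no ad exposures (the toy values are made up):

```r
library(dplyr)

toy <- tibble(
  AvgPurchaseAmount12  = c(20, 15, 0),
  intAdExposureCount12 = c(4, 0, 0)
)

toy <- toy %>%
  mutate(AvgPurchasePerAd = if_else(
    intAdExposureCount12 > 0,
    AvgPurchaseAmount12 / intAdExposureCount12,
    NA_real_  # no exposures: leave the ratio undefined
  ))

toy$AvgPurchasePerAd
```

Whether `NA`, zero, or dropping the row is the right choice depends on how the feature is used downstream.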

In [6]:
save(df, file = "02_Transforming_and_Enriching_Data.RData")

In [7]:
dim(df)