<a href="https://cognitiveclass.ai/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0321ENSkillsNetwork25371262-2022-01-01">
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0101EN-Coursera/v2/M1_R_Basics/images/IDSNlogo.png" width="200" align="center">
</a>


<h1>Data Wrangling with dplyr</h1>

Estimated time needed: **60** minutes


## Lab Overview:

In this lab, you will wrangle the Seoul bike-sharing demand historical dataset. It is the core dataset you will use to build a predictive model later.

It contains the following columns:

*   `DATE` - Year-month-day
*   `RENTED BIKE COUNT` - Count of bikes rented at each hour
*   `HOUR` - Hour of the day
*   `TEMPERATURE` - Temperature in Celsius
*   `HUMIDITY` - Unit is `%`
*   `WINDSPEED` - Unit is `m/s`
*   `VISIBILITY` - Unit is 10 m
*   `DEW POINT TEMPERATURE` - The temperature to which the air would have to cool down in order to reach saturation, unit is Celsius
*   `SOLAR RADIATION` - MJ/m2
*   `RAINFALL` - mm
*   `SNOWFALL` - cm
*   `SEASONS` - Winter, Spring, Summer, Autumn
*   `HOLIDAY` - Holiday/No holiday
*   `FUNCTIONAL DAY` - NoFunc (non-functional hours), Fun (functional hours)

For this dataset, you will be asked to use `tidyverse` to perform the following data wrangling tasks:

*   **TASK: Detect and handle missing values**
*   **TASK: Create indicator (dummy) variables for categorical variables**
*   **TASK: Normalize data**

Let's start!


First import the necessary library for this data wrangling task:


In [1]:
# Check if you need to install the `tidyverse` library
#require("tidyverse")
library(tidyverse)

"package 'tidyverse' was built under R version 4.1.3"
-- Attaching packages ------------------------------------------------------------------------------- tidyverse 1.3.2 --
v ggplot2 3.3.6     v purrr   0.3.4
v tibble  3.1.7     v dplyr   1.0.9
v tidyr   1.2.0     v stringr 1.4.0
v readr   2.1.2     v forcats 0.5.1
"package 'ggplot2' was built under R version 4.1.3"
"package 'tibble' was built under R version 4.1.3"
"package 'tidyr' was built under R version 4.1.3"
"package 'readr' was built under R version 4.1.3"
"package 'purrr' was built under R version 4.1.3"
"package 'dplyr' was built under R version 4.1.3"
"package 'stringr' was built under R version 4.1.3"
"package 'forcats' was built under R version 4.1.3"
-- Conflicts ---------------------------------------------------------------------------------- 

Then load the bike-sharing system data from the csv processed in the previous lab:


In [2]:
bike_sharing_df <- read_csv("raw_seoul_bike_sharing.csv")

[1mRows: [22m[34m8760[39m [1mColumns: [22m[34m14[39m
[36m--[39m [1mColumn specification[22m [36m------------------------------------------------------------------------------------------------[39m
[1mDelimiter:[22m ","
[31mchr[39m  (4): DATE, SEASONS, HOLIDAY, FUNCTIONING_DAY
[32mdbl[39m (10): RENTED_BIKE_COUNT, HOUR, TEMPERATURE, HUMIDITY, WIND_SPEED, VISIBI...

[36mi[39m Use `spec()` to retrieve the full column specification for this data.
[36mi[39m Specify the column types or set `show_col_types = FALSE` to quiet this message.


In [3]:
# Or you may read it from here again
# url <- "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/datasets/raw_seoul_bike_sharing.csv"
# Note that some column names in the raw dataset are not standardized if you did not fix them in the previous lab

First take a quick look at the dataset:


In [4]:
summary(bike_sharing_df)
dim(bike_sharing_df)

     DATE           RENTED_BIKE_COUNT      HOUR          TEMPERATURE      
 Length:8760        Min.   :-1.1320   Min.   :-1.6612   Min.   :-2.56760  
 Class :character   1st Qu.:-0.8020   1st Qu.:-0.8306   1st Qu.:-0.79262  
 Mode  :character   Median :-0.2914   Median : 0.0000   Median : 0.06975  
                    Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.00000  
                    3rd Qu.: 0.5524   3rd Qu.: 0.8306   3rd Qu.: 0.80654  
                    Max.   : 4.4008   Max.   : 1.6612   Max.   : 2.22149  
                    NA's   :295                         NA's   :11        
    HUMIDITY          WIND_SPEED        VISIBILITY      DEW_POINT_TEMPERATURE
 Min.   :-2.85950   Min.   :-1.6645   Min.   :-2.3177   Min.   :-2.65489     
 1st Qu.:-0.79687   1st Qu.:-0.7960   1st Qu.:-0.8167   1st Qu.:-0.67179     
 Median :-0.06022   Median :-0.2170   Median : 0.4294   Median : 0.07857     
 Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.00000     
 3rd Qu.: 

From the summary, we can observe that:

Columns `RENTED_BIKE_COUNT`, `TEMPERATURE`, `HUMIDITY`, `WIND_SPEED`, `VISIBILITY`, `DEW_POINT_TEMPERATURE`, `SOLAR_RADIATION`, `RAINFALL`, `SNOWFALL` are numerical variables/columns and require normalization. Moreover, `RENTED_BIKE_COUNT` and `TEMPERATURE` have some missing values (NA's) that need to be handled properly.

`SEASONS`, `HOLIDAY`, `FUNCTIONING_DAY` are categorical variables which need to be converted into indicator columns or dummy variables.
Also, `HOUR` is read as a numerical variable but it is in fact a categorical variable with levels ranging from 0 to 23.

Now that you have some basic ideas about how to process this bike-sharing demand dataset, let's start working on it!


# TASK: Detect and handle missing values


The `RENTED_BIKE_COUNT` column has 295 missing values, and `TEMPERATURE` has 11. These values may never have been recorded, or they may result from malfunctioning bike-sharing systems or weather sensor networks. In any case, the identified missing values have to be handled properly.
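Besides reading the `NA's` row of `summary()`, missing values can be counted directly. The following is a minimal sketch on a small, hypothetical tibble (`toy_df` and its values are invented for illustration and are not part of the lab data); the same calls should work on `bike_sharing_df`.

```r
library(dplyr)

# Hypothetical toy data standing in for the bike-sharing tibble
toy_df <- tibble(
  RENTED_BIKE_COUNT = c(254, NA, 173, NA),
  TEMPERATURE       = c(-5.2, -5.5, NA, -6.2),
  HUMIDITY          = c(37, 38, 39, 40)
)

# Count the missing values in every column
na_counts <- toy_df %>%
  summarise(across(everything(), ~ sum(is.na(.x))))
na_counts

# Share of missing values per column (0.5 means half of the rows are NA)
na_share <- colMeans(is.na(toy_df))
na_share
```

The share of missing values is what informs the choice between dropping rows and imputing them.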


Let's first handle missing values in `RENTED_BIKE_COUNT` column:


`RENTED_BIKE_COUNT` is the response (dependent) variable, i.e., the value we want to predict later from the other predictor (independent) variables. We normally cannot allow missing values in the response variable, so they must be either dropped or imputed properly.

We can see that `RENTED_BIKE_COUNT` only has about 3% missing values (295 / 8760). As such, you can safely drop any rows whose `RENTED_BIKE_COUNT` has missing values.


*TODO:* Drop rows with missing values in the `RENTED_BIKE_COUNT` column


In [5]:
# Drop rows with `RENTED_BIKE_COUNT` column == NA
bike_sharing_df<-bike_sharing_df %>% drop_na(RENTED_BIKE_COUNT)

In [6]:
# Print the dataset dimension again after those rows are dropped
dim(bike_sharing_df)

Now that you have handled the missing values in the `RENTED_BIKE_COUNT` variable, let's continue with the missing values in the `TEMPERATURE` column.


Unlike the `RENTED_BIKE_COUNT` variable, `TEMPERATURE` is not a response variable. However, it is still an important predictor variable - as you could imagine, there may be a positive correlation between `TEMPERATURE` and `RENTED_BIKE_COUNT`. For example, in winter time with lower temperatures, people may not want to ride a bike, while in summer with nicer weather, they are more likely to rent a bike.


How do we handle missing values for `TEMPERATURE`? We could simply remove the rows but it's better to impute them because `TEMPERATURE` should be relatively easy and reliable to estimate statistically.


Let's first take a look at the missing values in the TEMPERATURE column.


In [7]:
bike_sharing_df %>% 
                filter(is.na(TEMPERATURE))

DATE,RENTED_BIKE_COUNT,HOUR,TEMPERATURE,HUMIDITY,WIND_SPEED,VISIBILITY,DEW_POINT_TEMPERATURE,SOLAR_RADIATION,RAINFALL,SNOWFALL,SEASONS,HOLIDAY,FUNCTIONING_DAY
<chr>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<dbl>,<chr>,<chr>,<chr>
07/06/2018,3.8792535,0.9389561,,-0.06022153,0.94093538,-0.361378,0.9437855,0.44994643,-0.1317924,-0.1718813,Summer,No Holiday,Yes
12/06/2018,0.8046113,0.3611369,,-0.64954264,0.45844961,0.8617053,0.6604857,0.9449126,-0.1317924,-0.1718813,Summer,No Holiday,Yes
13/06/2018,3.0121265,0.7945013,,-0.06022153,1.5199183,-0.8512689,0.9437855,0.34634886,-0.1317924,-0.1718813,Summer,No Holiday,Yes
17/06/2018,2.4921617,0.7945013,,-0.01111144,1.5199183,-0.9400411,0.9667558,0.10462119,-0.1317924,-0.1718813,Summer,No Holiday,Yes
20/06/2018,3.1319987,1.0834108,,0.13621884,0.94093538,-0.3301434,1.0280098,0.03555615,-0.1317924,-0.1718813,Summer,No Holiday,Yes
30/06/2018,0.6458196,0.2166822,,1.41308124,-0.02403616,-1.7209075,1.4644446,0.1621754,2.9705138,-0.1718813,Summer,No Holiday,Yes
05/07/2018,0.1523201,-0.2166822,,0.82376013,-0.60301909,-0.6720807,1.2806826,0.7492283,-0.1317924,-0.1718813,Summer,No Holiday,Yes
11/07/2018,-0.1481386,-0.3611369,,1.85507207,-1.08550486,-1.6222717,1.5946094,-0.18314983,-0.1317924,-0.1718813,Summer,No Holiday,Yes
12/07/2018,-0.2119666,-0.7945013,,1.7077418,-0.60301909,-0.9614122,1.5486689,-0.64358348,-0.1317924,-0.1718813,Summer,No Holiday,Yes
21/07/2018,-0.5949347,-1.0834108,,0.92198032,-0.50652193,-0.3843931,1.3113096,-0.65509432,-0.1317924,-0.1718813,Summer,No Holiday,Yes


It seems that all of the missing values for `TEMPERATURE` are found in `SEASONS == Summer`, so it is reasonable to impute those missing values with the summer average temperature.
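If missing temperatures were spread over several seasons, each gap could be filled with the mean of its own season in one step. The sketch below illustrates that variant on a hypothetical tibble (`toy_df` and its values are assumptions, not lab data); it is not the approach used in the next cells, which compute the summer mean explicitly.

```r
library(dplyr)

# Hypothetical toy data: one missing temperature in Summer, one in Winter
toy_df <- tibble(
  SEASONS     = c("Summer", "Summer", "Summer", "Winter", "Winter", "Winter"),
  TEMPERATURE = c(24, NA, 28, -4, -2, NA)
)

# Replace each NA with the mean of the non-missing values of its own season
imputed_df <- toy_df %>%
  group_by(SEASONS) %>%
  mutate(TEMPERATURE = if_else(is.na(TEMPERATURE),
                               mean(TEMPERATURE, na.rm = TRUE),
                               TEMPERATURE)) %>%
  ungroup()
imputed_df
```

Because the mean is computed within each group, the summer gap is filled with 26 and the winter gap with -3 in this toy example.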


*TODO:* Impute missing values for the `TEMPERATURE` column using the summer mean temperature.


In [8]:
# Calculate the summer average temperature
mean_temp<-bike_sharing_df %>% group_by(SEASONS) %>% summarize(avg_temp=mean(TEMPERATURE, na.rm = TRUE))
mean_temp
class(mean_temp)
mean_temp_summer<-mean_temp$avg_temp[mean_temp$SEASONS == 'Summer']
mean_temp_summer

SEASONS,avg_temp
<chr>,<dbl>
Autumn,0.07993194
Spring,0.01296048
Summer,1.14878089
Winter,-1.28998603


In [9]:
# Impute missing values for TEMPERATURE column with summer average temperature
bike_sharing_df <- bike_sharing_df %>% replace_na(list(TEMPERATURE = mean_temp_summer))

In [10]:
# Print the summary of the dataset again to make sure no column has missing values
summary(bike_sharing_df)

     DATE           RENTED_BIKE_COUNT      HOUR            TEMPERATURE       
 Length:8465        Min.   :-1.1320   Min.   :-1.661230   Min.   :-2.567596  
 Class :character   1st Qu.:-0.8020   1st Qu.:-0.794501   1st Qu.:-0.826109  
 Mode  :character   Median :-0.2914   Median : 0.072227   Median : 0.053008  
                    Mean   : 0.0000   Mean   : 0.001015   Mean   :-0.007919  
                    3rd Qu.: 0.5524   3rd Qu.: 0.938956   3rd Qu.: 0.823281  
                    Max.   : 4.4008   Max.   : 1.661230   Max.   : 2.221494  
    HUMIDITY           WIND_SPEED         VISIBILITY       
 Min.   :-2.859497   Min.   :-1.66449   Min.   :-2.317654  
 1st Qu.:-0.796873   1st Qu.:-0.79601   1st Qu.:-0.824966  
 Median :-0.060221   Median :-0.21703   Median : 0.416200  
 Mean   :-0.003883   Mean   : 0.00094   Mean   :-0.004853  
 3rd Qu.: 0.774650   3rd Qu.: 0.55495   3rd Qu.: 0.925818  
 Max.   : 1.953292   Max.   : 5.47630   Max.   : 0.925818  
 DEW_POINT_TEMPERATURE SOLAR_RADIA

In [11]:
# Save the dataset as `seoul_bike_sharing.csv`
write.csv(bike_sharing_df, "seoul_bike_sharing.csv", row.names=FALSE)

# TASK: Create indicator (dummy) variables for categorical variables


Regression models cannot process categorical variables directly, so we need to convert them into indicator variables.


In the bike-sharing demand dataset, `SEASONS`, `HOLIDAY`, `FUNCTIONING_DAY` are categorical variables.
Also, `HOUR` is read as a numerical variable but it is in fact a categorical variable with levels ranging from 0 to 23.


*TODO:* Convert `HOUR` column from numeric into character first:


In [12]:
# Use mutate() to convert the HOUR column to character type and keep the result
bike_sharing_df <- bike_sharing_df %>%
    mutate(HOUR = as.character(HOUR))
# Preview the converted column
bike_sharing_df %>%
    select(HOUR) %>%
    head(10)


HOUR
<chr>
-1.66122994540396
-1.51677516754275
-1.37232038968153
-1.22786561182032
-1.08341083395911
-0.938956056097891
-0.794501278236677
-0.650046500375463
-0.505591722514249
-0.361136944653035


`SEASONS`, `HOLIDAY`, `FUNCTIONING_DAY`,  `HOUR` are all character columns now and are ready to be converted into indicator variables.

For example, `SEASONS` has four categorical values: `Spring`, `Summer`, `Autumn`, `Winter`. We thus need to create four indicator/dummy variables `Spring`, `Summer`, `Autumn`, and `Winter` which only have the value 0 or 1.

So, given a data entry with the value `Spring` in the `SEASONS` column, the values for the four new columns `Spring`, `Summer`, `Autumn`, and `Winter` will be set to 1 for `Spring` and 0 for the others:

| Spring | Summer | Autumn | Winter |
| ------ | ------ | ------ | ------ |
| 1      | 0      | 0      | 0      |
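The encoding illustrated in the table can be tried out on a few rows first. This is a minimal sketch on a hypothetical tibble (`toy_df` and the `ROW_ID` key are assumptions added for illustration). It uses `pivot_wider()`, the function that the tidyr documentation recommends in place of the superseded `spread()`.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data; ROW_ID keeps every row distinct during the reshape
toy_df <- tibble(
  ROW_ID  = 1:4,
  SEASONS = c("Spring", "Summer", "Autumn", "Spring")
)

# One new 0/1 column per distinct value of SEASONS
indicator_df <- toy_df %>%
  mutate(dummy = 1) %>%
  pivot_wider(names_from = SEASONS, values_from = dummy, values_fill = 0)
indicator_df
```

Each row ends up with exactly one 1 among the new `Spring`, `Summer`, and `Autumn` columns.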


*TODO:* Convert `SEASONS`, `HOLIDAY`, `FUNCTIONING_DAY`, and `HOUR` columns into indicator columns.

Note that if `FUNCTIONING_DAY` only contains one categorical value after missing values removal, then you don't need to convert it to an indicator column.


In [13]:
# Convert SEASONS, HOLIDAY, FUNCTIONING_DAY, and HOUR columns into indicator columns.
bike_sharing_df<-bike_sharing_df %>%
  mutate(dummy = 1) %>% # column with single value
  spread(
    key = SEASONS, # column to spread
    value = dummy,
    fill = 0)
bike_sharing_df<-bike_sharing_df %>%
  mutate(dummy = 1) %>% # column with single value
  spread(
    key = HOLIDAY, # column to spread
    value = dummy,
    fill = 0)
bike_sharing_df<-bike_sharing_df %>%
  mutate(dummy = 1) %>% # column with single value
  spread(
    key = HOUR, # column to spread
    value = dummy,
    fill = 0)

In [14]:
# Print the dataset summary again to make sure the indicator columns are created properly
summary(bike_sharing_df)

     DATE           RENTED_BIKE_COUNT  TEMPERATURE           HUMIDITY        
 Length:8465        Min.   :-1.1320   Min.   :-2.567596   Min.   :-2.859497  
 Class :character   1st Qu.:-0.8020   1st Qu.:-0.826109   1st Qu.:-0.796873  
 Mode  :character   Median :-0.2914   Median : 0.053008   Median :-0.060221  
                    Mean   : 0.0000   Mean   :-0.007919   Mean   :-0.003883  
                    3rd Qu.: 0.5524   3rd Qu.: 0.823281   3rd Qu.: 0.774650  
                    Max.   : 4.4008   Max.   : 2.221494   Max.   : 1.953292  
   WIND_SPEED         VISIBILITY        DEW_POINT_TEMPERATURE
 Min.   :-1.66449   Min.   :-2.317654   Min.   :-2.654888    
 1st Qu.:-0.79601   1st Qu.:-0.824966   1st Qu.:-0.702416    
 Median :-0.21703   Median : 0.416200   Median : 0.047946    
 Mean   : 0.00094   Mean   :-0.004853   Mean   :-0.009863    
 3rd Qu.: 0.55495   3rd Qu.: 0.925818   3rd Qu.: 0.851904    
 Max.   : 5.47630   Max.   : 0.925818   Max.   : 1.770715    
 SOLAR_RADIATION    

In [15]:
# Save the dataset as `seoul_bike_sharing_converted.csv`
write_csv(bike_sharing_df, "seoul_bike_sharing_converted.csv")

# TASK: Normalize data


Columns `RENTED_BIKE_COUNT`, `TEMPERATURE`, `HUMIDITY`, `WIND_SPEED`, `VISIBILITY`, `DEW_POINT_TEMPERATURE`, `SOLAR_RADIATION`, `RAINFALL`, `SNOWFALL` are numerical variables/columns with different units and value ranges. Columns with large values may adversely influence (bias) the predictive models and degrade model accuracy. Thus, we need to normalize these numeric columns to transform them into a similar range.


In this project, you are asked to use Min-max normalization:

**Min-max** normalization rescales each value in a column by first subtracting the minimum value of the column and then dividing the result by the difference between the maximum and minimum values of the column. The column is thus rescaled so that its minimum becomes 0 and its maximum becomes 1.

$$x_{new} = \frac{x_{old} - x_{min}}{x_{max} - x_{min}}$$
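The formula can be checked by hand on a short vector. The helper `min_max()` below is a hypothetical name introduced only for this sketch.

```r
# Hypothetical helper implementing the min-max formula for one numeric vector
min_max <- function(x) {
  (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}

x_old <- c(2, 4, 10)
x_new <- min_max(x_old)
x_new
# (2 - 2) / 8 = 0, (4 - 2) / 8 = 0.25, (10 - 2) / 8 = 1
```

Note that a constant column has `max(x) - min(x)` equal to 0, so this formula would produce `NaN` for it.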


*TODO:* Apply min-max normalization on `RENTED_BIKE_COUNT`, `TEMPERATURE`, `HUMIDITY`, `WIND_SPEED`, `VISIBILITY`, `DEW_POINT_TEMPERATURE`, `SOLAR_RADIATION`, `RAINFALL`, `SNOWFALL`


In [16]:
# Use the `mutate()` function to apply min-max normalization on columns 
# `RENTED_BIKE_COUNT`, `TEMPERATURE`, `HUMIDITY`, `WIND_SPEED`, `VISIBILITY`, `DEW_POINT_TEMPERATURE`, `SOLAR_RADIATION`, `RAINFALL`, `SNOWFALL`
numeric_cols <- c("RENTED_BIKE_COUNT", "TEMPERATURE", "HUMIDITY", "WIND_SPEED", "VISIBILITY",
                  "DEW_POINT_TEMPERATURE", "SOLAR_RADIATION", "RAINFALL", "SNOWFALL")
# Apply the same min-max formula to every listed column
bike_sharing_df <- bike_sharing_df %>%
    mutate(across(all_of(numeric_cols), ~ (.x - min(.x)) / (max(.x) - min(.x))))

In [17]:
# Print the summary of the dataset again to make sure the numeric columns range between 0 and 1
summary(bike_sharing_df)

     DATE           RENTED_BIKE_COUNT  TEMPERATURE        HUMIDITY     
 Length:8465        Min.   :0.00000   Min.   :0.0000   Min.   :0.0000  
 Class :character   1st Qu.:0.05965   1st Qu.:0.3636   1st Qu.:0.4286  
 Mode  :character   Median :0.15194   Median :0.5472   Median :0.5816  
                    Mean   :0.20460   Mean   :0.5345   Mean   :0.5933  
                    3rd Qu.:0.30445   3rd Qu.:0.7080   3rd Qu.:0.7551  
                    Max.   :1.00000   Max.   :1.0000   Max.   :1.0000  
   WIND_SPEED       VISIBILITY     DEW_POINT_TEMPERATURE SOLAR_RADIATION   
 Min.   :0.0000   Min.   :0.0000   Min.   :0.0000        Min.   :0.000000  
 1st Qu.:0.1216   1st Qu.:0.4602   1st Qu.:0.4412        1st Qu.:0.000000  
 Median :0.2027   Median :0.8429   Median :0.6107        Median :0.002841  
 Mean   :0.2332   Mean   :0.7131   Mean   :0.5977        Mean   :0.161326  
 3rd Qu.:0.3108   3rd Qu.:1.0000   3rd Qu.:0.7924        3rd Qu.:0.264205  
 Max.   :1.0000   Max.   :1.0000   Max. 

In [18]:
# Save the dataset as `seoul_bike_sharing_converted_normalized.csv`
write_csv(bike_sharing_df, "seoul_bike_sharing_converted_normalized.csv")

## Standardize the column names again for the new datasets


Since you have added many new indicator variables, you need to standardize their column names again by using the following code:
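To see what this standardization does to individual names, here is a minimal sketch on a hypothetical character vector (`col_names` is invented for illustration); it applies the same two steps as the loop in the next cell.

```r
library(stringr)

# Hypothetical column names, similar to those created from category values
col_names <- c("No Holiday", "Spring", "Dew point temperature")

# Uppercase first, then replace every space with an underscore
std_names <- str_replace_all(toupper(col_names), " ", "_")
std_names
```

Names such as `No Holiday` become `NO_HOLIDAY`, which is easier to reference in later code without backticks.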


In [19]:
# Dataset list
dataset_list <- c('seoul_bike_sharing.csv', 'seoul_bike_sharing_converted.csv', 'seoul_bike_sharing_converted_normalized.csv')

for (dataset_name in dataset_list){
    # Read dataset
    dataset <- read_csv(dataset_name)
    # Standardize its column names:
    # Convert all column names to uppercase
    names(dataset) <- toupper(names(dataset))
    # Replace any white space separators by underscore, using str_replace_all function
    names(dataset) <- str_replace_all(names(dataset), " ", "_")
    # Save the dataset back
    write.csv(dataset, dataset_name, row.names=FALSE)
}

[1mRows: [22m[34m8465[39m [1mColumns: [22m[34m14[39m
[36m--[39m [1mColumn specification[22m [36m------------------------------------------------------------------------------------------------[39m
[1mDelimiter:[22m ","
[31mchr[39m  (4): DATE, SEASONS, HOLIDAY, FUNCTIONING_DAY
[32mdbl[39m (10): RENTED_BIKE_COUNT, HOUR, TEMPERATURE, HUMIDITY, WIND_SPEED, VISIBI...

[36mi[39m Use `spec()` to retrieve the full column specification for this data.
[36mi[39m Specify the column types or set `show_col_types = FALSE` to quiet this message.
[1mRows: [22m[34m8465[39m [1mColumns: [22m[34m41[39m
[36m--[39m [1mColumn specification[22m [36m------------------------------------------------------------------------------------------------[39m
[1mDelimiter:[22m ","
[31mchr[39m  (2): DATE, FUNCTIONING_DAY
[32mdbl[39m (39): RENTED_BIKE_COUNT, TEMPERATURE, HUMIDITY, WIND_SPEED, VISIBILITY, ...

[36mi[39m Use `spec()` to retrieve the full column specification for t

# Next Steps


Great! Now that you have processed all of the necessary datasets, you are ready to perform exploratory data analysis to get some initial insights from them.


## Authors

<a href="https://www.linkedin.com/in/yan-luo-96288783/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0321ENSkillsNetwork25371262-2022-01-01" target="_blank">Yan Luo</a>


### Other Contributors

Jeff Grossman


## Change Log

| Date (YYYY-MM-DD) | Version | Changed By | Change Description      |
| ----------------- | ------- | ---------- | ----------------------- |
| 2021-04-08        | 1.0     | Yan        | Initial version created |

## <h3 align="center"> © IBM Corporation 2021. All rights reserved. </h3>
