<a href="https://cognitiveclass.ai/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0321ENSkillsNetwork25371262-2022-01-01">
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0101EN-Coursera/v2/M1_R_Basics/images/IDSNlogo.png" width="200" align="center">
</a>


<h1>Data Wrangling with Regular Expressions</h1>

Estimated time needed: **40** minutes


## Lab Overview:

In the previous data collection labs, you collected some raw datasets from several different sources. In this lab, you need to perform data wrangling tasks in order to improve data quality.


You will again use regular expressions, along with the `stringr` package (part of `tidyverse`), to clean up the bike-sharing systems data that you previously web scraped from the wiki page:

[https://en.wikipedia.org/wiki/List_of_bicycle-sharing_systems](https://en.wikipedia.org/wiki/List_of_bicycle-sharing_systems?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0321ENSkillsNetwork25371262-2022-01-01)

<a href="https://cognitiveclass.ai/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0321ENSkillsNetwork25371262-2022-01-01">
    <img src="https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/module_1/images/l2-list-bike-sharing-systems.png" width="800" align="center">
</a>


One typical challenge of web scraping is that data extracted from HTML pages may contain unnecessary or inconsistently formatted information.
For example:

*   Textual annotations in numeric fields: `1000 (Updated with 1050)`
*   Attached reference links: `Bike sharing system [123]`
*   Inconsistent data formats: `Yes` and `Y` for the logical value `TRUE` or `2021-04-09` and `Apr 09, 2021` for the same date
*   HTML style tags: `<span style="color:blue">Bike sharing system</span>`
*   Special characters: `&nbsp;` for a white space

Real-world scraped data contains many more kinds of noise like these, and most text-related noise can be handled with regular expressions.
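
As a rough illustration of how `stringr` and regular expressions can deal with the kinds of noise listed above, here is a minimal sketch. The input strings and variable names are hypothetical (modeled on the bullet examples, not taken from the lab datasets), and the patterns are one possible choice rather than the only correct one.

```r
library(stringr)

# Hypothetical noisy strings modeled on the examples listed above
annotated <- "1000 (Updated with 1050)"
with_ref  <- "Bike sharing system [123]"
with_tags <- "<span style=\"color:blue\">Bike sharing system</span>"
with_nbsp <- "Bike&nbsp;sharing"
flags     <- c("Yes", "Y", "No")

# Keep only the first run of digits and convert it to a number
first_number <- as.numeric(str_extract(annotated, "\\d+"))
# Drop a numeric reference link such as [123] and any space before it
no_ref <- str_replace_all(with_ref, "\\s*\\[\\d+\\]", "")
# Drop anything that looks like an HTML tag
no_tags <- str_replace_all(with_tags, "<[^>]+>", "")
# Turn the HTML non-breaking-space entity into a plain space
no_nbsp <- str_replace_all(with_nbsp, "&nbsp;?", " ")
# Treat "Yes" and "Y" (in any letter case) as TRUE
is_true <- str_detect(flags, regex("^y(es)?$", ignore_case = TRUE))
```

Each call handles exactly one kind of noise, so the steps can be combined or reordered depending on what a particular column contains.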


To summarize, you will be using `stringr` (part of `tidyverse`) and regular expressions to perform the following data wrangling tasks:

*   TASK: Standardize column names for all collected datasets
*   TASK: Remove undesired reference links from the scraped bike-sharing systems dataset
*   TASK: Extract only the numeric value from undesired text annotations


Let's begin by importing the libraries you will use for these data wrangling tasks.


In [1]:
# Check whether you need to install the `tidyverse` library
#require("tidyverse")
library(tidyverse)

"package 'tidyverse' was built under R version 4.1.3"
-- Attaching packages ------------------------------------------------------------------------------- tidyverse 1.3.2 --
v ggplot2 3.3.6     v purrr   0.3.4
v tibble  3.1.7     v dplyr   1.0.9
v tidyr   1.2.0     v stringr 1.4.0
v readr   2.1.2     v forcats 0.5.1
"package 'ggplot2' was built under R version 4.1.3"
"package 'tibble' was built under R version 4.1.3"
"package 'tidyr' was built under R version 4.1.3"
"package 'readr' was built under R version 4.1.3"
"package 'purrr' was built under R version 4.1.3"
"package 'dplyr' was built under R version 4.1.3"
"package 'stringr' was built under R version 4.1.3"
"package 'forcats' was built under R version 4.1.3"
-- Conflicts ----------------------------------------------------------------------------------

## TASK: Standardize column names for all collected datasets


In the previous data collection labs, you collected four datasets in csv format:

*   `raw_bike_sharing_systems.csv`:  A list of active bike-sharing systems across the world
*   `raw_cities_weather_forecast.csv`: 5-day weather forecasts for a list of cities, from OpenWeather API
*   `raw_worldcities.csv`: A list of major cities' info (such as name, latitude and longitude) across the world
*   `raw_seoul_bike_sharing.csv`: Weather information (Temperature, Humidity, Windspeed, Visibility, Dewpoint, Solar radiation, Snowfall, Rainfall), the number of bikes rented per hour, and date information, from Seoul bike-sharing systems


*Optional:* If you had some difficulties finishing the data collection labs, you may download the datasets directly from the following URLs:


In [2]:
# Download raw_bike_sharing_systems.csv
url <- "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/datasets/raw_bike_sharing_systems.csv"
download.file(url, destfile = "raw_bike_sharing_systems.csv")

# Download raw_cities_weather_forecast.csv
url <- "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/datasets/raw_cities_weather_forecast.csv"
download.file(url, destfile = "raw_cities_weather_forecast.csv")

# Download raw_worldcities.csv
url <- "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/datasets/raw_worldcities.csv"
download.file(url, destfile = "raw_worldcities.csv")

# Download raw_seoul_bike_sharing.csv
url <- "https://cf-courses-data.s3.us.cloud-object-storage.appdomain.cloud/IBMDeveloperSkillsNetwork-RP0321EN-SkillsNetwork/labs/datasets/raw_seoul_bike_sharing.csv"
download.file(url, destfile = "raw_seoul_bike_sharing.csv")

To improve dataset readability for both humans and computer systems, we first need to standardize the column names of the datasets above using the following naming convention:

*   Column names need to be UPPERCASE
*   The word separator needs to be an underscore, such as in `COLUMN_NAME`


You can use the following dataset list and the `names()` function to get and set each of their column names, and convert them according to our defined naming convention.
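
To make the get/convert/set pattern concrete, here is a minimal sketch on a toy data frame. The data frame `toy_df` and its column names are hypothetical and are not part of the lab datasets.

```r
library(stringr)

# Hypothetical toy data frame with mixed-case, space-separated column names
toy_df <- data.frame(a = 1:2, b = 3:4)
names(toy_df) <- c("rented bike count", "Wind Speed")

# Get the current names, convert them, then set them back
new_names <- toupper(names(toy_df))
new_names <- str_replace_all(new_names, " ", "_")
names(toy_df) <- new_names

names(toy_df)
```

Because `names()` works as both a getter and a setter, the same three steps apply unchanged to any data frame read with `read_csv()`.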


In [3]:
dataset_list <- c('raw_bike_sharing_systems.csv', 'raw_seoul_bike_sharing.csv', 'raw_cities_weather_forecast.csv', 'raw_worldcities.csv')

*TODO*: Write a `for` loop to iterate over the above datasets and convert their column names


In [4]:
for (dataset_name in dataset_list){
    # Read dataset
    dataset <- read_csv(dataset_name)
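    # Note: this z-score scales every numeric column (mean 0, sd 1); it changes the values,
    # not the column names, and is why the numeric summaries below are centered at 0
    dataset <- dataset %>% mutate_if(is.numeric, scale)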
    # Convert all column names to uppercase
    dataset_colname <- toupper(names(dataset))
    # Replace any white space separators by underscores, using the str_replace_all function
    dataset_colname <- str_replace_all(dataset_colname, " ", "_")
    names(dataset)<-dataset_colname
    # Save the dataset 
    write.csv(dataset, dataset_name, row.names=FALSE)
}

Rows: 480 Columns: 10
-- Column specification ------------------------------------------------------------------------------------------------
Delimiter: ","
chr (10): COUNTRY, City, Name, SYSTEM, OPERATOR, LAUNCHED, DISCONTINUED, STA...

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
Rows: 8760 Columns: 14
-- Column specification ------------------------------------------------------------------------------------------------
Delimiter: ","
chr  (4): Date, SEASONS, HOLIDAY, FUNCTIONING_DAY
dbl (10): RENTED_BIKE_COUNT, Hour, TEMPERATURE, HUMIDITY, WIND_SPEED, Visibi...

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column ty

*TODO*: Read the resulting datasets back and check whether their column names follow the naming convention
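
Besides inspecting a printed summary by eye, the convention can also be checked programmatically. The sketch below assumes that "following the convention" means a name consists only of uppercase letters, digits, and underscores; the helper name `follows_convention` is hypothetical.

```r
library(stringr)

# Hypothetical helper: TRUE when every name uses only A-Z, 0-9 and underscores
follows_convention <- function(col_names) {
    all(str_detect(col_names, "^[A-Z0-9_]+$"))
}

good_names <- follows_convention(c("COUNTRY", "DAILY_RIDERSHIP"))
bad_names  <- follows_convention(c("Date", "WIND SPEED"))
```

Calling `follows_convention(names(dataset))` inside the loop would give a single TRUE or FALSE per dataset.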


In [5]:
for (dataset_name in dataset_list){
    # Print a summary for each data set to check whether the column names were correctly converted
    dataset <- read_csv(dataset_name)
    print(summary(dataset))
}

Rows: 480 Columns: 10
-- Column specification ------------------------------------------------------------------------------------------------
Delimiter: ","
chr (10): COUNTRY, CITY, NAME, SYSTEM, OPERATOR, LAUNCHED, DISCONTINUED, STA...

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.


   COUNTRY              CITY               NAME              SYSTEM         
 Length:480         Length:480         Length:480         Length:480        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
   OPERATOR           LAUNCHED         DISCONTINUED         STATIONS        
 Length:480         Length:480         Length:480         Length:480        
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
   BICYCLES         DAILY_RIDERSHIP   
 Length:480         Length:480        
 Class :character   Class :character  
 Mode  :character   Mode  :character  


Rows: 8760 Columns: 14
-- Column specification ------------------------------------------------------------------------------------------------
Delimiter: ","
chr  (4): DATE, SEASONS, HOLIDAY, FUNCTIONING_DAY
dbl (10): RENTED_BIKE_COUNT, HOUR, TEMPERATURE, HUMIDITY, WIND_SPEED, VISIBI...

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.


     DATE           RENTED_BIKE_COUNT      HOUR          TEMPERATURE      
 Length:8760        Min.   :-1.1320   Min.   :-1.6612   Min.   :-2.56760  
 Class :character   1st Qu.:-0.8020   1st Qu.:-0.8306   1st Qu.:-0.79262  
 Mode  :character   Median :-0.2914   Median : 0.0000   Median : 0.06975  
                    Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.00000  
                    3rd Qu.: 0.5524   3rd Qu.: 0.8306   3rd Qu.: 0.80654  
                    Max.   : 4.4008   Max.   : 1.6612   Max.   : 2.22149  
                    NA's   :295                         NA's   :11        
    HUMIDITY          WIND_SPEED        VISIBILITY      DEW_POINT_TEMPERATURE
 Min.   :-2.85950   Min.   :-1.6645   Min.   :-2.3177   Min.   :-2.65489     
 1st Qu.:-0.79687   1st Qu.:-0.7960   1st Qu.:-0.8167   1st Qu.:-0.67179     
 Median :-0.06022   Median :-0.2170   Median : 0.4294   Median : 0.07857     
 Mean   : 0.00000   Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.00000     
 3rd Qu.: 

Rows: 160 Columns: 12
-- Column specification ------------------------------------------------------------------------------------------------
Delimiter: ","
chr  (3): CITY, WEATHER, SEASON
dbl  (8): VISIBILITY, TEMP, TEMP_MIN, TEMP_MAX, PRESSURE, HUMIDITY, WIND_SPE...
dttm (1): FORECAST_DATETIME

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.


     CITY             WEATHER            VISIBILITY             TEMP        
 Length:160         Length:160         Min.   :-12.57005   Min.   :-2.0307  
 Class :character   Class :character   1st Qu.:  0.07906   1st Qu.:-0.7870  
 Mode  :character   Mode  :character   Median :  0.07906   Median :-0.1550  
                                       Mean   :  0.00000   Mean   : 0.0000  
                                       3rd Qu.:  0.07906   3rd Qu.: 0.6792  
                                       Max.   :  0.07906   Max.   : 2.6445  
    TEMP_MIN          TEMP_MAX          PRESSURE           HUMIDITY       
 Min.   :-2.0115   Min.   :-2.0397   Min.   :-8.53900   Min.   :-1.89891  
 1st Qu.:-0.7765   1st Qu.:-0.7519   1st Qu.:-0.31104   1st Qu.:-0.74544  
 Median :-0.1417   Median :-0.1623   Median : 0.08709   Median :-0.05335  
 Mean   : 0.0000   Mean   : 0.0000   Mean   : 0.00000   Mean   : 0.00000  
 3rd Qu.: 0.6663   3rd Qu.: 0.7209   3rd Qu.: 0.48522   3rd Qu.: 0.69641  
 Max.   : 2

Rows: 26569 Columns: 11
-- Column specification ------------------------------------------------------------------------------------------------
Delimiter: ","
chr (7): CITY, CITY_ASCII, COUNTRY, ISO2, ISO3, ADMIN_NAME, CAPITAL
dbl (4): LAT, LNG, POPULATION, ID

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.


     CITY            CITY_ASCII             LAT               LNG         
 Length:26569       Length:26569       Min.   :-3.9310   Min.   :-2.2750  
 Class :character   Class :character   1st Qu.:-0.2312   1st Qu.:-0.9117  
 Mode  :character   Mode  :character   Median : 0.3181   Median : 0.1433  
                                       Mean   : 0.0000   Mean   : 0.0000  
                                       3rd Qu.: 0.6650   3rd Qu.: 0.5551  
                                       Max.   : 2.1712   Max.   : 2.5793  
                                                                          
   COUNTRY              ISO2               ISO3            ADMIN_NAME       
 Length:26569       Length:26569       Length:26569       Length:26569      
 Class :character   Class :character   Class :character   Class :character  
 Mode  :character   Mode  :character   Mode  :character   Mode  :character  
                                                                            
               

## Process the web-scraped bike sharing system dataset


By now we have standardized all column names. Next, we will focus on cleaning up the values in the web-scraped bike sharing systems dataset.


In [6]:
# First load the dataset
bike_sharing_df <- read_csv("raw_bike_sharing_systems.csv")

Rows: 480 Columns: 10
-- Column specification ------------------------------------------------------------------------------------------------
Delimiter: ","
chr (10): COUNTRY, CITY, NAME, SYSTEM, OPERATOR, LAUNCHED, DISCONTINUED, STA...

i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.


In [7]:
# Print its head
head(bike_sharing_df)

COUNTRY,CITY,NAME,SYSTEM,OPERATOR,LAUNCHED,DISCONTINUED,STATIONS,BICYCLES,DAILY_RIDERSHIP
<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>,<chr>
Albania,Tirana,Ecovolis,,,March 2011,,8,200,
Argentina,Mendoza,Metrobici,,,2014,,2,40,
Argentina,"San Lorenzo, Santa Fe",Biciudad,Biciudad,,27 November 2016,,8,80,
Argentina,Buenos Aires,Ecobici,Serttel Brasil,Bike In Baires Consortium.[10],2010,,400,4000,21917.0
Argentina,Rosario,Mi Bici Tu Bici[11],,,2 December 2015,,47,480,
Australia,Melbourne[12],Melbourne Bike Share,PBSC & 8D,Motivate,June 2010,30 November 2019[13],53,676,


Even from the first few rows, you can see there is plenty of undesirable embedded textual content, such as the reference link included in `Melbourne[12]`.


In this project, let's focus only on processing the following relevant columns (feel free to process the other columns for more practice):

*   `COUNTRY`: Country name
*   `CITY`: City name
*   `SYSTEM`: Bike-sharing system name
*   `BICYCLES`: Total number of bikes in the system


In [8]:
# Select the four columns
sub_bike_sharing_df <- bike_sharing_df %>% select(COUNTRY, CITY, SYSTEM, BICYCLES)

Let's see the types of the selected columns:


In [9]:
sub_bike_sharing_df %>% 
    summarize_all(class) %>%
    gather(variable, class)

variable,class
<chr>,<chr>
COUNTRY,character
CITY,character
SYSTEM,character
BICYCLES,character
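
The same column-type overview can also be produced with `across()` and `pivot_longer()`, which the `dplyr` and `tidyr` documentation recommend over `summarize_all()` and `gather()`. This is a minimal sketch on a hypothetical toy tibble, not a replacement for the cell above.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy tibble with two character columns and one numeric column
toy <- tibble(COUNTRY = "Albania", BICYCLES = "200", STATIONS = 8)

# One row holding the class of each column, reshaped into variable/class pairs
col_types <- toy %>%
    summarise(across(everything(), class)) %>%
    pivot_longer(everything(), names_to = "variable", values_to = "class")

col_types
```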


They are all interpreted as character columns, but we expect the `BICYCLES` column to be of numeric type. Let's see why it wasn't loaded as a numeric column - possibly some entries contain characters. Let's create a simple function called `find_character` to check that.


In [10]:
# grepl searches each string for non-digit characters, and returns TRUE or FALSE
# if it finds any non-digit characters, then the BICYCLES column is not purely numeric
find_character <- function(strings) grepl("[^0-9]", strings)
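
To see what the negated character class `[^0-9]` does, here is a small sketch that applies the same function to a few sample values. The values are modeled on entries in this dataset, and the `NA` is added only to illustrate how missing values behave.

```r
# Same definition as in the cell above, repeated so this sketch runs on its own
find_character <- function(strings) grepl("[^0-9]", strings)

samples <- c("4000", "4115[22]", "100 (220)", NA)
has_non_digit <- find_character(samples)
has_non_digit
```

Only the second and third values are flagged. `grepl()` returns `FALSE` for `NA`, so missing values are not reported as containing non-digit characters.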

Let's try to find any elements in the `BICYCLES` column containing non-numeric characters.


In [11]:
sub_bike_sharing_df %>% 
    select(BICYCLES) %>% 
    filter(find_character(BICYCLES)) %>%
    slice(1:10)

BICYCLES
<chr>
4115[22]
310[59]
500[72]
[75]
180[76]
600[77]
[78]
initially 800 (later 2500)
100 (220)
370[114]


As you can see, many entries contain non-numeric characters, such as `4115[22]` and `initially 800 (later 2500)` above, or `32 (including 6 rollers) [162]` and `1000[253]` elsewhere in the column. This is very common for a table scraped from a wiki, where no input validation is enforced.

Later, you will use regular expressions to clean them up.


Next, let's take a look at the other columns, namely `COUNTRY`, `CITY`, and `SYSTEM`, to see if they contain any undesired reference links, such as in `Melbourne[12]`.


In [12]:
# Define a 'reference link' pattern:
# `[A-Za-z0-9]+` matches one or more letters or digits
# `\\[` and `\\]` match the literal square brackets around them, as in [12] or [abc]
ref_pattern <- "\\[[A-Za-z0-9]+\\]"
find_reference_pattern <- function(strings) grepl(ref_pattern, strings)
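
A note on character ranges inside `[...]`: a range written as `A-z` covers every code point between `A` and `z`, which includes a few punctuation characters such as `_` and `^` that sit between the uppercase and lowercase letters. The sketch below uses `stringr::str_detect()` to compare it with the stricter `A-Za-z`; the sample characters are chosen only for illustration.

```r
library(stringr)

chars <- c("_", "^", "a", "Z")

# The wide range also accepts the punctuation between "Z" and "a"
wide_range   <- str_detect(chars, "[A-z]")
# Two explicit ranges accept letters only
strict_range <- str_detect(chars, "[A-Za-z]")
```

For reference links such as `[12]` both forms find the same matches, but the explicit form states the intent more precisely.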

In [13]:
# Check whether the COUNTRY column has any reference links
sub_bike_sharing_df %>% 
    select(COUNTRY) %>% 
    filter(find_reference_pattern(COUNTRY)) %>%
    slice(1:10)

COUNTRY
<chr>


Ok, looks like the `COUNTRY` column is clean. Let's check the `CITY` column.


In [14]:
# Check whether the CITY column has any reference links
sub_bike_sharing_df %>% 
    select(CITY) %>% 
    filter(find_reference_pattern(CITY)) %>%
    slice(1:10)

CITY
<chr>
Melbourne[12]
Brisbane[14][15]
Lower Austria[18]
Namur[19]
Brussels[21]
Salvador[23]
Belo Horizonte[24]
João Pessoa[25]
(Pedro de) Toledo[26]
Rio de Janeiro[27]


Hmm, looks like the `CITY` column has some reference links to be removed. Next, let's check the `SYSTEM` column.


In [15]:
# Check whether the SYSTEM column has any reference links
sub_bike_sharing_df %>% 
    select(SYSTEM) %>% 
    filter(find_reference_pattern(SYSTEM)) %>%
    slice(1:10)

SYSTEM
<chr>
EasyBike[58]
4 Gen.[61]
3 Gen. SmooveKey[113]
3 Gen. Smoove[141][142][143][139]
3 Gen. Smoove[179]
3 Gen. Smoove[181]
3 Gen. Smoove[183]


So the `SYSTEM` column also has some reference links.


After some preliminary investigations, we identified that the `CITY` and `SYSTEM` columns have some undesired reference links, and the `BICYCLES` column has both reference links and some textual annotations.

Next, you need to use regular expressions to clean up the unexpected reference links and text annotations in numeric values.


## TASK: Remove undesired reference links using regular expressions


*TODO:* Write a custom function using `stringr::str_replace_all` to replace all reference links with an empty string for columns `CITY` and `SYSTEM`


In [16]:
# Remove reference links such as [12] or [abc]
remove_ref <- function(strings) {
    ref_pattern <- "\\[[A-Za-z0-9]+\\]"
    # Replace all matched substrings with an empty string using str_replace_all()
    str_replace_all(strings, ref_pattern, "")
    # Optionally, trim the result with str_trim()
}
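
The difference between `str_replace()` and `str_replace_all()` matters when a value carries several reference links. Here is a minimal sketch on sample values taken from the `CITY` and `SYSTEM` output above; the variable names are hypothetical.

```r
library(stringr)

samples <- c("Melbourne[12]", "Brisbane[14][15]", "3 Gen. Smoove[141][142][143][139]")
pattern <- "\\[[A-Za-z0-9]+\\]"

# str_replace() removes only the first match in each string
first_only <- str_replace(samples, pattern, "")
# str_replace_all() removes every match; str_trim() drops stray outer spaces
all_removed <- str_trim(str_replace_all(samples, pattern, ""))
```

With `str_replace()`, `Brisbane[14][15]` would still end in `[15]`, which is why the task asks for `str_replace_all()`.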

*TODO:* Use the `dplyr::mutate()` function to apply the `remove_ref` function to the `CITY` and `SYSTEM` columns


In [17]:
# sub_bike_sharing_df %>% mutate(column1=remove_ref(column1), ... )
# BICYCLES is cleaned here as well, because it also contains reference links
result <- sub_bike_sharing_df %>% 
    mutate(CITY = remove_ref(CITY), SYSTEM = remove_ref(SYSTEM), BICYCLES = remove_ref(BICYCLES))
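
When the same function is applied to several columns, `dplyr::across()` avoids repeating it once per column. The following is a sketch on a hypothetical toy tibble; `strip_refs` is a stand-in helper defined here so that the example is self-contained.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data with reference links in two of the three columns
toy <- tibble(
    CITY     = c("Melbourne[12]", "Tirana"),
    SYSTEM   = c("EasyBike[58]", NA),
    BICYCLES = c("676", "200")
)

# Stand-in helper that removes reference links such as [12]
strip_refs <- function(x) str_replace_all(x, "\\[[A-Za-z0-9]+\\]", "")

# Apply the helper to the selected columns only
toy_clean <- toy %>% mutate(across(c(CITY, SYSTEM), strip_refs))
```

Missing values stay `NA`, and columns that are not selected (here `BICYCLES`) are left untouched.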

*TODO:* Use the following code to check whether all reference links are removed:


In [18]:
result %>% 
    select(CITY, SYSTEM, BICYCLES) %>% 
    filter(find_reference_pattern(CITY) | find_reference_pattern(SYSTEM) | find_reference_pattern(BICYCLES))

CITY,SYSTEM,BICYCLES
<chr>,<chr>,<chr>


## TASK: Extract the numeric value using regular expressions


*TODO:* Write a custom function using `stringr::str_extract` to extract the first substring of digits and convert it into a numeric type. For example, extract the value `32` from `32 (including 6 rollers) [162]`.


In [19]:
# Extract the first number
extract_num <- function(columns){
    # Define a digit pattern: one or more consecutive digits
    digits_pattern <- "\\d+"
    # Find the first match using str_extract
    result <- str_extract(columns, digits_pattern)
    # Convert the result to numeric using the as.numeric() function and return it
    as.numeric(result)
}
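
Here is a short sketch of how first-match extraction behaves on values like those seen in the `BICYCLES` column. The helper name `first_number` is hypothetical; the last input shows why reference links should be removed before extracting numbers.

```r
library(stringr)

# Hypothetical stand-in for extract_num(), defined so this sketch runs on its own
first_number <- function(x) as.numeric(str_extract(x, "\\d+"))

samples <- c("32 (including 6 rollers) ", "initially 800 (later 2500)", "", "[75]")
extracted <- first_number(samples)
extracted
```

An empty string yields `NA`, whereas a leftover reference link such as `[75]` would be misread as 75 bicycles. Applying `remove_ref` first turns such entries into empty strings and therefore into `NA`.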

*TODO:* Use the `dplyr::mutate()` function to apply `extract_num` on the `BICYCLES` column


In [20]:
# Use the mutate() function on the BICYCLES column
result<- result %>% mutate(BICYCLES=extract_num(BICYCLES))

*TODO:* Use the summary function to check the descriptive statistics of the numeric `BICYCLES` column


In [21]:
summary(result$BICYCLES)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
      5     100     350    2022    1400   78000      78 
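
To understand where `NA` values in a summary like this come from, it can help to keep the original text next to the extracted number and filter for the rows that failed to parse. This is a minimal sketch on a hypothetical toy tibble, not on the lab dataset.

```r
library(dplyr)
library(stringr)

# Hypothetical toy values: two parse cleanly, three do not
toy <- tibble(BICYCLES = c("4115", "", NA, "planned", "100 (220)"))

# Keep the raw text and add the extracted number in a new column
toy_num <- toy %>%
    mutate(BICYCLES_NUM = as.numeric(str_extract(BICYCLES, "\\d+")))

# Rows where no number could be extracted
unparsed <- toy_num %>% filter(is.na(BICYCLES_NUM))
```

Looking at `unparsed` shows whether the missing values come from empty cells or from text that contains no digits at all.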

*TODO:* Write the cleaned bike-sharing systems dataset into a csv file called `bike_sharing_systems.csv`


In [22]:
# Write dataset to `bike_sharing_systems.csv`
write_csv(result, "bike_sharing_systems.csv")
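
A quick way to confirm that a cleaned column survives the round trip as a numeric column is to read the file back and check its type. The sketch below uses a hypothetical toy tibble and a temporary file so that it does not overwrite any lab file.

```r
library(readr)
library(tibble)

# Hypothetical toy data standing in for the cleaned dataset
toy <- tibble(CITY = c("Tirana", "Mendoza"), BICYCLES = c(200, 40))

# Write to a temporary file and read it back without the column-spec message
path <- tempfile(fileext = ".csv")
write_csv(toy, path)
check <- read_csv(path, show_col_types = FALSE)

bicycles_is_numeric <- is.numeric(check$BICYCLES)
```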

# References:


If you need to refresh your memory about regular expressions, please refer to this good Regular Expression cheat sheet:

<a href="https://www.rstudio.com/wp-content/uploads/2016/09/RegExCheatsheet.pdf?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0321ENSkillsNetwork25371262-2022-01-01" target="_blank">Basic Regular Expressions in R</a>


# Next Steps


Great! Now you have cleaned up the bike-sharing system dataset using regular expressions. Next, you will use other `tidyverse` functions to perform data wrangling on the bike-sharing demand dataset.


## Authors

<a href="https://www.linkedin.com/in/yan-luo-96288783/?utm_medium=Exinfluencer&utm_source=Exinfluencer&utm_content=000026UJ&utm_term=10006555&utm_id=NA-SkillsNetwork-Channel-SkillsNetworkCoursesIBMDeveloperSkillsNetworkRP0321ENSkillsNetwork25371262-2022-01-01" target="_blank">Yan Luo</a>


### Other Contributors

Jeff Grossman


## Change Log

| Date (YYYY-MM-DD) | Version | Changed By | Change Description      |
| ----------------- | ------- | ---------- | ----------------------- |
| 2021-04-08        | 1.0     | Yan        | Initial version created |

## <h3 align="center"> © IBM Corporation 2021. All rights reserved. </h3>
