Step 2: Parse First Name - Final Project #67

Dimetry-Adel · 2021-10-03T20:47:26Z

Hi,

Also, when I run the code for step 2:

top.25 <- head( d$Full.Name,25 )
first.25 <- get_first_name( name=top.25 )

data.frame( top.25, first.25 ) %>% pander()

This error appears.

Could you please explain it?
Thanks


lecy · 2021-10-03T23:47:31Z

You need to write the name-parsing function get_first_name() yourself. It should return only the first name.

The code shows what the output should look like when the function works properly (full name vs. first name):

voznyuky · 2021-10-05T00:34:09Z

Are we splitting strings for this, or am I way off?

lecy · 2021-10-05T00:37:42Z

Yep, you can build on what you did in Lab 3 to find the first or last words in titles.

voznyuky · 2021-10-05T01:31:41Z

Lab 3 is the one I struggled on the most. This is what I have so far, although it is just breaking up the strings. How do I pull the 2nd word from each string, when there are multiple strings? I might need a little extra help on this step of the assignment.

get_first_name <- function( x )
{
first.names <- strsplit( d$Full.Name, " " )
return( first.names )
}

get_first_name()

lecy · 2021-10-05T02:02:06Z

What is the right delimiter to use for the split? Is it a space in this context?
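
One way to answer the delimiter question is to try both candidates on a single name from the data. This is only a sketch; the variable names `nm`, `by.space`, and `by.comma` are made up for illustration.

```r
# One name in the "LAST, First Middle" format used in the data
nm <- "ARQUIZA, Jose Maria Reynaldo Apollo"

# Splitting on a space breaks up the middle names
# and leaves the comma attached to the last name
by.space <- strsplit( nm, " " )[[1]]
by.space    # five pieces: "ARQUIZA," "Jose" "Maria" "Reynaldo" "Apollo"

# Splitting on comma-plus-space separates the last name
# from everything that follows it
by.comma <- strsplit( nm, ", " )[[1]]
by.comma    # two pieces: "ARQUIZA" "Jose Maria Reynaldo Apollo"
```

With the space delimiter, the position of the first name is still the second element here, but the last name keeps its comma and any last name containing a space would shift the positions.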

lecy · 2021-10-05T02:02:34Z

Describe your pseudocode.

lecy · 2021-10-06T21:05:44Z

Some code to get you started:

> x <- head( d$Full.Name, 5 )
> x
[1] "ABBASI, Mohammad"                    "ARQUIZA, Jose Maria Reynaldo Apollo"
[3] "Aaberg, Kelsea"                      "Abadjivor, Enyah"                   
[5] "Abayesu, Precious" 
> 
> # DROP LAST NAMES: 
> x.list <- strsplit( x, ", " )
> x.list
[[1]]
[1] "ABBASI"   "Mohammad"

[[2]]
[1] "ARQUIZA"                    "Jose Maria Reynaldo Apollo"

[[3]]
[1] "Aaberg" "Kelsea"

[[4]]
[1] "Abadjivor" "Enyah"    

[[5]]
[1] "Abayesu"  "Precious"

> 
> # get second element in each vector
> x.list[[1]][2]
[1] "Mohammad"
> x.list[[2]][2]
[1] "Jose Maria Reynaldo Apollo"
> x.list[[3]][2]
[1] "Kelsea"
> 
> # scale this with a loop?
> 
> x.second <- NULL
> for( i in 1:length(x) )
+ {
+    x.second[i] <- x.list[[i]][2]
+ }
> x.second
[1] "Mohammad"                   "Jose Maria Reynaldo Apollo"
[3] "Kelsea"                     "Enyah"                     
[5] "Precious"                  
> 
> # ALTERNATIVELY use a lapply (list apply) function: 
> # 
> # GET SECOND VALUE IN EACH VECTOR
> # function(x){ x[2] }
> 
> x2 <- lapply( x.list, function(x){ x[2] } )
> x2
[[1]]
[1] "Mohammad"

[[2]]
[1] "Jose Maria Reynaldo Apollo"

[[3]]
[1] "Kelsea"

[[4]]
[1] "Enyah"

[[5]]
[1] "Precious"

> x3 <- unlist( x2 )
> x3
[1] "Mohammad"                   "Jose Maria Reynaldo Apollo"
[3] "Kelsea"                     "Enyah"                     
[5] "Precious"

Now you have simplified the problem. The next step is to get the first name from each string:

"Jose Maria Reynaldo Apollo" --> "Jose"

Split it apart again, this time using spaces, and extract the first value in each vector.
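
A minimal sketch of that second step, assuming the last names have already been dropped as in the output above. The vector is retyped here so the example runs on its own, and `x3.list` and `first.names` are illustrative names. It uses sapply() rather than lapply() plus unlist(), which is one of several equivalent options.

```r
# Names with the last names already dropped (same values as x3 above)
x3 <- c( "Mohammad", "Jose Maria Reynaldo Apollo", "Kelsea", "Enyah", "Precious" )

# Split each string on spaces; the result is a list of character vectors
x3.list <- strsplit( x3, " " )

# Keep the first element of each vector;
# sapply() simplifies the result to a character vector
first.names <- sapply( x3.list, function(v){ v[1] } )
first.names
```

Here "Jose Maria Reynaldo Apollo" is reduced to "Jose", and the single-word names pass through unchanged.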

Sean-In-The-Library · 2021-10-08T20:44:52Z

Lab 3 is the one I struggled on the most. This is what I have so far, although it is just breaking up the strings. How do I pull the 2nd word from each string, when there are multiple strings? I might need a little extra help on this step of the assignment.

get_first_name <- function( x )
{
first.names <- strsplit( d$Full.Name, " " )
return( first.names )
}

get_first_name()

Same, this step is really rough for me. Pretty humbling...

lecy · 2021-10-08T20:57:29Z

Here is some code to get you started:

#83 (comment)

For the second part you basically repeat the same steps, but use a space as the split value and then grab the first element in each vector instead of the second. This will drop middle names for cases like Jose:

"ARQUIZA, Jose Maria Reynaldo Apollo"

lecy · 2021-10-08T20:58:51Z

If writing code that doesn't work makes you humble then at this point I might be a saint :-)

lecy · 2021-10-08T21:02:12Z

Here is some test data with both 2020 and 2019 formats (some have space after the comma, some don't):

x <- 
c("ABBASI, Mohammad", "ARQUIZA, Jose Maria Reynaldo Apollo", 
"Aaberg,Kelsea", "Abadjivor, Enyah", "Abayesu,Precious", "Abbas, James", 
"Abbaszadegan, Morteza", "Abbe, Scott", "Abbl, Norma", "Abbott, Joshua", 
"Abbott, Joshua", "Abdollahi,Amir", "Abdou, Olgeanna", "Abdurhman, Abdurazak", 
"Abel, John", "Abele, Kelsey", "Aberle,James", "Abhyankar, Aditya", 
"Abi Karam, Karam", "AbiNader,Millan", "Aboalam, Safaa", "Abraha, Naomi", 
"Abramchuk, Mykola", "Abrams, Cristen", "Abrams,Kristen")

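Putting the two steps together, one possible sketch of a function that copes with both formats is shown below. It is not the course's reference solution. It assumes every name contains exactly one comma, and it swaps the literal ", " delimiter for the regular expression ",\\s*" (a comma followed by zero or more whitespace characters) so that the space after the comma becomes optional. The vector `test.names` is a small subset of the test data above, retyped so the example runs on its own.

```r
get_first_name <- function( name )
{
  # Step 1: split on a comma followed by optional whitespace,
  # then keep the part after the comma (second element)
  parts <- strsplit( name, ",\\s*" )
  after.comma <- sapply( parts, function(v){ v[2] } )

  # Step 2: split the remainder on spaces and keep the first word
  words <- strsplit( after.comma, " " )
  first <- sapply( words, function(v){ v[1] } )

  return( first )
}

# Subset of the test data covering both formats and a two-word last name
test.names <- c( "ARQUIZA, Jose Maria Reynaldo Apollo", "Aaberg,Kelsea", "Abi Karam, Karam" )
first.test <- get_first_name( name=test.names )
first.test
```

A name with no comma at all would come back as NA under these assumptions, so it is worth checking the result for missing values before using it.
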
lecy mentioned this issue Oct 6, 2021

Final Lab - Batch Processing #74

Open

lecy added the final-project label Oct 8, 2021

aawoods97 mentioned this issue Oct 9, 2021

First Name Function #94

Open
