Multiple header rows mini-vignette? #492

Closed
apreshill opened this Issue Jul 9, 2018 · 8 comments

apreshill commented Jul 9, 2018

Hi Jenny,

I accept the challenge to contribute a mini-vignette 😄 Do you have a preference on dataset? Would you prefer a small toy example dataset as a demo (as opposed to the data I used in my blog post)?

Thanks!
Alison

jennybc added a commit that referenced this issue Jul 10, 2018

Add a 2nd worksheet to clippy.xls[x]
Should be a good example for #492

jennybc commented Jul 10, 2018

I just added a second sheet to the clippy.xls[x] example that already ships with readxl. I think it's sufficiently annoying? See what you think. I'm happy to get suggestions for change or if you want me to add a couple of rows.

[Screenshot, 2018-07-10: the new second worksheet of clippy.xlsx]

jennybc commented Jul 10, 2018

I assume it will go something like this and here's how to access the example sheet:

library(magrittr)
library(readxl)

(cnames <- readxl_example("clippy.xlsx") %>% 
    read_excel(sheet = 2, n_max = 0) %>% 
    names())
#> [1] "name"    "species" "death"   "weight"

readxl_example("clippy.xlsx") %>% 
  read_excel(sheet = 2, skip = 2, col_names = cnames)
#> # A tibble: 1 x 4
#>   name   species   death               weight
#>   <chr>  <chr>     <dttm>               <dbl>
#> 1 Clippy paperclip 2007-01-01 00:00:00    0.9

Created on 2018-07-10 by the reprex package (v0.2.0.9000).

brianwdavis commented Jul 11, 2018

Thanks for taking this on @apreshill!

If you look at the issue I opened (#486), I often have files where the first row isn't enough of a header to uniquely identify a variable, and we often need the data dictionary to be kept with the name. Following the clippy.xls example, I suppose I'd want the column "death"/"(date is approximate)" changed to "date"/"of death", and another column "date"/"of birth" added (with value 1/1/1997). However, I can also see how this would add unnecessary complexity.

My suggestion, if you want to add that as a use case, is to move the metadata extraction up to Step 1, then build full_cnames <- paste(cnames, unlist(clippy_meta), sep = "_") and offer that as an alternative set of names in Step 2.
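A minimal sketch of that paste step, in base R only. The header values below are hypothetical stand-ins (they are not read from clippy.xlsx); in practice cnames and clippy_meta would presumably come from two separate read_excel() calls, one per header row.

```r
# Hypothetical first header row; "date" appears twice, so these names
# alone cannot uniquely identify a column.
cnames <- c("name", "species", "date", "date")

# Hypothetical second header row, held as a list the way a one-row
# data frame of metadata would be.
clippy_meta <- list("given", "type", "of death", "of birth")

# Glue row 1 and row 2 together, element by element.
full_cnames <- paste(cnames, unlist(clippy_meta), sep = "_")

# Fail early if the combined names are still not unique.
stopifnot(anyDuplicated(full_cnames) == 0)

full_cnames
```

The resulting vector could then be passed as col_names in the second read, in place of cnames. Names such as "date_of death" contain a space, so they would need backticks when used in code, or a further cleaning step.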

apreshill commented Jul 11, 2018

Hi @brianwdavis and @jennybc -

This sounds good, and it's a nice solution, although for my own use cases (I've had several) the simpler solution was all I needed. So I'm wondering if I could frame the paste solution as an alternative, after showing the basic two steps first. It feels like the first hurdle is understanding that you can change the default arguments to the function and call it twice with different arguments each time; from there we can add complexity for more specific use cases. But I'm not totally set on this sequence. @jennybc?

Thanks!
Alison

jennybc commented Jul 11, 2018

I'm hoping this vignette could kick off with the simplest possible example of this phenomenon + solution, then ramp up to some of the more complicated (and realistic) ones. Hopefully @brianwdavis will contribute his.

I'm at useR! right now and not in a position to review this at the moment. I think the main thing is to coordinate what needs to be true about, e.g., the clippy example sheet to support the vignette. It's always possible to add a 3rd sheet with even more pathological behaviour.

brianwdavis commented Jul 12, 2018

I think the clippy.xlsx file and mini-vignette (vignette-ette?) look great as is for getting the concept out in the wild.

jennybc commented Jul 13, 2018

@brianwdavis now that #494 is merged, you are welcome to make a PR adding your more complicated example.

jennybc commented Dec 14, 2018

Closing this, since we have a useful basic article now. @brianwdavis (or anyone): I remain open to PRs that extend the article with common, but more complicated situations.

jennybc closed this Dec 14, 2018
