Description
Most of the time I work with labelled tibbles, and many warnings appear during dplyr joins because the attributes of the variables used for the join are not equal.
I think labels are nicely done in R (in RStudio with View()), compared to SAS labels.
I have seen this bigger issue before: #2701
There, strengejacke made the key point: when working with SAS tibbles, or labelled tibbles in general, it is confusing to see so many warnings, even though the join now works.
Finally, it makes dplyr less compatible with haven imports.
For the next iteration of dplyr, would it be possible to be more flexible with labelled tibbles? I mean "no warning".
Otherwise, I will continue to use sjlabelled::remove_all_labels(): without labels, there is no issue!
A small example is below:
table_1 <- sjlabelled::set_label(x = tibble::as_tibble(mtcars),
                                 label = paste('label', 1:ncol(mtcars)))
View(table_1)
table_2 <- sjlabelled::set_label(x = tibble::tibble(carb = rep(1, 7)),
                                 label = 'Carb lab')
View(table_2)
dplyr::inner_join(table_1, table_2)
Joining, by = "carb"
# A tibble: 49 x 11
mpg cyl disp hp drat wt qsec vs am gear carb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
2 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
3 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
4 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
5 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
6 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
7 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
8 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
9 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
10 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
# ... with 39 more rows
Warning message:
Column `carb` has different attributes on LHS and RHS of join
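The workaround mentioned above could look roughly like this. It is only a sketch: it rebuilds the two tables with base `attr()` instead of sjlabelled, and `strip_key_labels()` is a hypothetical helper, not a dplyr or sjlabelled function. It removes the `label` attribute from the join column only, and leaves the other columns untouched before the join.

```r
library(dplyr)

# Rebuild the two example tables, setting the "label" attribute with base R
table_1 <- tibble::as_tibble(mtcars)
for (i in seq_along(table_1)) {
  attr(table_1[[i]], "label") <- paste("label", i)
}

table_2 <- tibble::tibble(carb = rep(1, 7))
attr(table_2$carb, "label") <- "Carb lab"

# Hypothetical helper: drop the "label" attribute from the join keys only
strip_key_labels <- function(df, keys) {
  for (k in keys) {
    attr(df[[k]], "label") <- NULL
  }
  df
}

# With the labels removed, the key columns carry identical (empty) attributes
joined <- inner_join(
  strip_key_labels(table_1, "carb"),
  strip_key_labels(table_2, "carb"),
  by = "carb"
)
```

Both `carb` columns are plain doubles by the time they reach the join, so the attribute mismatch that triggers the warning in the example is gone. The drawback is the same as with sjlabelled::remove_all_labels(): the label on the key column has to be set again afterwards if it is still needed.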
Many thanks.
Guillaume