"Error: Names must be unique" in ggwithinplot #396

XiaoyuZeng opened this issue Mar 17, 2020 · 7 comments

XiaoyuZeng commented Mar 17, 2020

The problem:

I tried to plot my data with ggwithinstats from ggstatsplot, but the function ran for a long time and produced no output.

I eventually interrupted it. Waiting longer did not help either, so the problem was not simply run time.

Next, I wondered whether the cause was the large number of data points (N = 2000), so I tried a smaller sample of 250 data points. This time I got the error "Names must be unique."

ERROR: Names must be unique. Backtrace: 
1. ggstatsplot::ggwithinstats(...) 
27. vctrs:::validate_unique(names = names) 
28. vctrs:::stop_names_must_be_unique(which(duplicated(names))) 
29. vctrs:::stop_names(...) 
30. vctrs:::stop_vctrs(...)

I also checked the traceback:

31.abort(message, class = c(class, "vctrs_error"), ...)
30.stop_vctrs(message, class = c(class, "vctrs_error_names"), locations = locations, ...)
29.stop_names("Names must be unique.", class = "vctrs_error_names_must_be_unique", locations = locations)
27.validate_unique(names = names)
26.vctrs::vec_as_names(names, repair = "check_unique")
25.withCallingHandlers(expr, simpleError = function(cnd) { abort(conditionMessage(cnd), parent = cnd) })
23.doTryCatch(return(expr), name, parentenv, handler)
22.tryCatchOne(expr, names, parentenv, handlers[[1L]])
21.tryCatchList(expr, classes, parentenv, handlers)
20.tryCatch(instrument_base_errors(expr), vctrs_error_subscript = function(cnd) { cnd$subscript_action <- subscript_action(type) cnd$subscript_elt <- "column" cnd_signal(cnd) ...
19.with_subscript_errors(vctrs::vec_as_names(names, repair = "check_unique"))
18.rename_impl(NULL, .vars, quo(c(...)), strict = .strict)
17.tidyselect::vars_rename(names(.data), !!!enquos(...)) = ., variable = skim_variable)
15.dplyr::rename(.data = ., variable = skim_variable)
12.freduce(value, `_function_list`)
10.eval(quote(`_fseq`(`_lhs`)), env, env)
9.eval(quote(`_fseq`(`_lhs`)), env, env)
8.withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
7.dplyr::left_join(x = df_results %>% dplyr::group_modify(.f = ~tibble::as_tibble(skimr::skim(purrr::keep(.x = ., .p = ..f))), keep = FALSE) %>% dplyr::ungroup(x = .), y = dplyr::tally(df_results), by = purrr::map_chr(.x = grouping.vars, .f = rlang::as_string)) %>% dplyr::mutate(.data = ., n = n - n_missing) %>% purrr::set_names(x = ., ...
6.groupedstats::grouped_summary(data = data, grouping.vars = { { x } ...
5.eval(lhs, parent, parent)
4.eval(lhs, parent, parent)
3.groupedstats::grouped_summary(data = data, grouping.vars = { { x } ...
2.mean_labeller(data = data, x = { { x } ...
1.ggwithinstats(data = emotion_rating_dt_50, x = variable, y = Emotion_rating, point.path = FALSE, mean.path = FALSE, effsize.type = "partial_eta", p.adjust.method = "fdr", ggtheme = theme_classic(), palette = "Darjeeling2", package = "wesanderson", ggstatsplot.layer = FALSE, xlab = "Dilemma types"

What I have tried:

  1. I googled the error but did not find much helpful information.
  2. I updated r-base and all R packages. That did not help.
  3. I checked whether the problem was specific to ggwithinstats. ggbetweenstats worked well, even on the large sample (N = 2000).
  4. I checked whether the input data, which must be in long format, was the problem. I did not find anything suspicious.
  5. I checked whether any column names in the data frame were duplicated. They were not, so I am confused about what "Names must be unique" means.
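For reference, the duplicate-name check described in step 5 can be sketched in base R as follows. The small data frame is a hypothetical stand-in that only borrows the column names used in the reprex further down the thread; it is not the reporter's real data.

```r
# Hypothetical long-format data (illustrative values only)
long_df <- data.frame(
  hubei = c("HuBei", "HuBei"),
  variable = c("self", "other"),
  Emotion_rating = c(7, 10)
)

# Column names that occur more than once (empty if all names are unique)
dup_cols <- names(long_df)[duplicated(names(long_df))]

# TRUE only if at least one column name is repeated
has_dups <- length(dup_cols) > 0
print(has_dups)
```

A clean result here only shows that the input has unique names. It does not rule out a duplicate being created later, inside the functions that the plotting call runs.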


Here comes the reprex:

Any ideas?

Can you please post output from sessioninfo::sessioninfo()?

Updated. Sorry.

Hmm, strange. I can't seem to find anything amiss about the package versions. They seem to be the same ones I have on my local machine and I don't get any error for the function where you get the error. In your traceback, you can see that the function mean_labeller is where the execution runs into problems.

But it runs just fine, not only on my local machine but also on virtual Travis and AppVeyor machines for both R 3.6 and R 4.0.

What do you get if you run the following?

ggstatsplot:::mean_labeller(data = mtcars, am, wt)

#> Registered S3 method overwritten by 'broom.mixed':
#>   method      from 
#>   tidy.gamlss broom
#> Registered S3 methods overwritten by 'car':
#>   method                          from
#>   influence.merMod                lme4
#>   cooks.distance.influence.merMod lme4
#>   dfbeta.influence.merMod         lme4
#>   dfbetas.influence.merMod        lme4

ggstatsplot:::mean_labeller(data = mtcars, am, wt)
#> # A tibble: 2 x 4
#>   am       wt label                               n_label      
#>   <fct> <dbl> <chr>                               <chr>        
#> 1 0      3.77 list(~italic(widehat(mu))== 3.769 ) "0\n(n = 19)"
#> 2 1      2.41 list(~italic(widehat(mu))== 2.411 ) "1\n(n = 13)"

Created on 2020-03-17 by the reprex package (v0.3.0)

It runs just fine.

So this seems to have something to do with your data and I can't diagnose that without a self-contained reprex.

Your original reprex points to data on your local machine:
data <- import("E:/Zengxiaoyu/zxy_projcet/!ncov/data/Covid_Q1Q2data_minus200_0316.xlsx")

XiaoyuZeng commented Mar 17, 2020

Thanks, Indrajeet, for the feedback and for guiding me to make the problem clearer.

I made another reprex with minimal data. The error remains.

#> Registered S3 methods overwritten by 'broom.mixed':
#>   method         from 
#>   augment.lme    broom
#>   augment.merMod broom
#>   glance.lme     broom
#>   glance.merMod  broom
#>   glance.stanreg broom
#>   tidy.brmsfit   broom
#>   tidy.gamlss    broom
#>   tidy.lme       broom
#>   tidy.merMod    broom
#>   tidy.rjags     broom
#>   tidy.stanfit   broom
#>   tidy.stanreg   broom
#> Registered S3 methods overwritten by 'car':
#>   method                          from
#>   influence.merMod                lme4
#>   cooks.distance.influence.merMod lme4
#>   dfbeta.influence.merMod         lme4
#>   dfbetas.influence.merMod        lme4
df <- read.table(
  header = TRUE,
  text = "  hubei self other province
1 HuBei    7    10        5
2 HuBei    2     0        0
3 HuBei    0     0      -22
4 HuBei    2     2        9
5 HuBei   11    -1       -4
6 HuBei    0     0        3"
)

# reshape to long format: one row per rating
long_df <- tidyr::pivot_longer(df, 2:4, names_to = "variable", values_to = "Emotion_rating")

ggstatsplot::ggwithinstats(
  data = long_df,
  x = variable, # > 2 groups
  y = Emotion_rating,
  point.path = FALSE
)
#> Note: 95% CI for effect size estimate was computed with 100 bootstrap samples.
#> Error: Names must be unique.
Created on 2020-03-18 by the reprex package (v0.3.0)

Thanks a lot! I can reproduce this. This actually seems to trace to groupedstats package, so I will move the issue there: IndrajeetPatil/groupedstats#24
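One possible reading of the traceback, offered as an assumption rather than a confirmed diagnosis: frame 15 calls dplyr::rename(variable = skim_variable), and the data in the reprex already has a column named variable. The sketch below reproduces that kind of collision on a toy tibble. The toy columns and values are hypothetical and only mimic the names seen in the traceback.

```r
# Toy table that already has a `variable` column, plus a `skim_variable`
# column like the one renamed in the traceback (values are illustrative)
toy <- tibble::tibble(variable = "self", skim_variable = "Emotion_rating")

# Renaming `skim_variable` to `variable` would create two columns with the
# same name, so the call is expected to fail; capture the condition instead
res <- tryCatch(
  dplyr::rename(toy, variable = skim_variable),
  error = function(e) e
)
collided <- inherits(res, "error")

# Moving the existing column out of the way first avoids the clash
ok <- dplyr::rename(
  dplyr::rename(toy, condition = variable),
  variable = skim_variable
)
print(collided)
print(names(ok))
```

If this reading is right, using a different `names_to` value in the pivot_longer call might sidestep the error until the package is fixed. That workaround is untested here, and the actual cause is for the groupedstats issue to determine.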

