create_report fails due to empty data object after na.omit #74

Closed
kmishra9 opened this issue Jun 24, 2018 · 9 comments

Comments

@kmishra9 commented Jun 24, 2018

Hey there,

I'm trying to generate some reports. These previously generated fine, but the same function calls no longer work.

Here's the traceback:

Error in sum(ind) : invalid 'type' (list) of argument
25. split_columns(data)
24. plot_correlation(data = structure(list(hh_id = integer(0), flu_season = integer(0),
       studyID = integer(0), year = integer(0), week = integer(0),
       visit_type = character(0), num_MAARI = integer(0), flu_tested = integer(0),
       flu_test_type = character(0), flu_test_positive = integer(0),  ...
23. do.call(fun_name, c(list(data = data), report_config[[fun_name]])) at <text>#14
22. do_call("plot_correlation", na_omit = TRUE) at <text>#3
21. eval(expr, envir, enclos)
20. eval(expr, envir, enclos)
19. withVisible(eval(expr, envir, enclos))
18. withCallingHandlers(withVisible(eval(expr, envir, enclos)), warning = wHandler,
       error = eHandler, message = mHandler)
17. handle(ev <- withCallingHandlers(withVisible(eval(expr, envir,
       enclos)), warning = wHandler, error = eHandler, message = mHandler))
16. timing_fn(handle(ev <- withCallingHandlers(withVisible(eval(expr,
       envir, enclos)), warning = wHandler, error = eHandler, message = mHandler)))
15. evaluate_call(expr, parsed$src[[i]], envir = envir, enclos = enclos,
       debug = debug, last = i == length(out), use_try = stop_on_error !=
           2L, keep_warning = keep_warning, keep_message = keep_message,
       output_handler = output_handler, include_timing = include_timing)
14. evaluate::evaluate(...)
13. evaluate(code, envir = env, new_device = FALSE, keep_warning = !isFALSE(options$warning),
       keep_message = !isFALSE(options$message), stop_on_error = if (options$error &&
           options$include) 0L else 2L, output_handler = knit_handlers(options$render,
           options))
12. in_dir(input_dir(), evaluate(code, envir = env, new_device = FALSE,
       keep_warning = !isFALSE(options$warning), keep_message = !isFALSE(options$message),
       stop_on_error = if (options$error && options$include) 0L else 2L,
       output_handler = knit_handlers(options$render, options)))
11. block_exec(params)
10. call_block(x)
9. process_group.block(group)
8. process_group(group)
7. withCallingHandlers(if (tangle) process_tangle(group) else process_group(group),
       error = function(e) {
           setwd(wd)
           cat(res, sep = "\n", file = output %n% "") ...
6. process_file(text, output)
5. knitr::knit(knit_input, knit_output, envir = envir, quiet = quiet,
       encoding = encoding)
4. render(input = report_dir, output_file = output_file, output_dir = output_dir,
       intermediates_dir = output_dir, params = list(data = data,
           report_config = config, response = y), ...)
3. withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))
2. suppressWarnings(render(input = report_dir, output_file = output_file,
       output_dir = output_dir, intermediates_dir = output_dir,
       params = list(data = data, report_config = config, response = y),
       ...))
1. DataExplorer::create_report(data = patientHistory, output_file = paste0(filenames[3],
       reportSuffix), output_dir = reportDir)

The error seems to be occurring specifically after correlation_analysis on multiple different datasets. Here's a screenshot:

image

@kmishra9 (Author) commented Jun 24, 2018

When I configure the report to exclude the correlation analysis, I get a nearly identical error for the principal components analysis.

image

@boxuancui (Owner) commented Jun 24, 2018

Would you mind sharing the structure of the data, e.g., str(patientHistory)? If possible, would you mind sharing a reproducible sample?

@DSQueen commented Jun 27, 2018

I get the exact same error when I try to create a report. I converted my dataset to all factors and numbers. I assume the call to sum(ind) must be receiving something other than numeric data.

I split my data frame into factor-only and numeric-only data frames. The report generates successfully for the numeric-only data frame, but the factor-only data frame gives the sum(ind) error. When I replaced all NA values in the factor-only data frame with "test", the report generated successfully as well.

UPDATE: After removing many columns I did not need, I was able to create a report successfully. Unfortunately, I'm not sure which factor column, or what about its structure, was causing the error.
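
One way to narrow down which columns are responsible is to count missing values per column and check how many rows would survive na.omit. The following is a minimal base R sketch, not the reporter's actual code; the data frame df and its column names are hypothetical stand-ins, since the real data in this thread was not shared.

```r
# Hypothetical stand-in data; the real data sets in this thread were not shared.
df <- data.frame(
  id    = 1:4,
  score = c(1.5, NA, 3.0, 4.2),
  group = factor(c("a", NA, "b", "a")),
  empty = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_)
)

# Number of missing values in each column.
na_per_column <- colSums(is.na(df))

# Columns in which every value is missing; a single such column
# makes every row incomplete.
all_na_columns <- names(df)[na_per_column == nrow(df)]

# Number of rows with no missing value in any column,
# i.e. the rows that na.omit() would keep.
n_complete <- sum(complete.cases(df))

print(na_per_column)
print(all_na_columns)
print(n_complete)
```

If n_complete is zero, na.omit(df) has no rows, which would be consistent with the empty columns (integer(0), character(0)) visible in the traceback at the top of this issue.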

@kmishra9 (Author) commented Jun 29, 2018

The data I'm working with is unfortunately sensitive, so I can't provide a sample :/. Here's an obfuscated str() of the dataset:

'data.frame':	762 obs. of  30 variables:
 $ some_id          : int  123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ...
 $ some_var         : int  123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ...
 $ some_id2         : int  123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ...
 $ some_var2        : int  123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ...
 $ some_var3        : int  123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ...
 $ some_var4        : chr  "SomeChars" "SomeChars" "SomeChars" "SomeChars" ...
 $ some_var5        : int  NA NA 123456789 NA 123456789 123456789 NA NA NA NA ...
 $ some_var6        : int  NA NA 123456789 123456789 NA NA NA NA 123456789 NA ...
 $ some_var7        : chr  "" "" "SomeChars" "SomeChars" ...
 $ some_var8        : int  NA NA NA NA NA NA NA NA NA NA ...
 $ some_var9        : int  NA 123456789 123456789 123456789 123456789 123456789 123456789 123456789 NA 123456789 ...
 $ some_var10       : int  123456789 NA 123456789 NA NA NA NA NA NA NA ...
 $ some_var11       : int  123456789 NA 123456789 NA NA NA NA NA NA NA ...
 $ some_var12       : int  NA NA 123456789 123456789 NA NA NA NA 123456789 NA ...
 $ some_var13       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ some_var14       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ some_var15       : int  NA NA NA 123456789 NA NA NA NA NA NA ...
 $ some_var16       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ some_var17       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ some_var18       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ some_var19       : chr  "SomeChars" "SomeChars" "SomeChars" "SomeChars" ...
 $ some_var20       : int  123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ...
 $ some_var21       : int  NA NA NA NA NA NA NA NA NA NA ...
 $ some_var22       : int  NA NA 123456789 NA NA NA NA NA NA NA ...
 $ some_var23       : num  123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 123456789 ...
 $ some_var24       : chr  "SomeChars" "SomeChars" "SomeChars" "SomeChars" ...
 $ some_var25       : chr  "SomeChars" "SomeChars" "SomeChars" "SomeChars" ...
 $ some_var26       : int  NA 123456789 NA 123456789 123456789 123456789 NA 123456789 123456789 123456789 ...
 $ some_var27       : chr  "SomeChars" "SomeChars" "SomeChars" "SomeChars" ...
 $ some_var28       : num  0.123456789 0.123456789 0.123456789 0.123456789 0.123456789 0.123456789 0.123456789 0.123456789 0.123456789 0.123456789 ...

Hope it's helpful

@NabeelahB commented Jul 17, 2018

I am having the same error. Has a fix been found?

@boxuancui (Owner) commented Jul 17, 2018

Hi @NabeelahB, without a reproducible example it is difficult to find the root cause.

I was traveling for the past month, and will take some time to look into the issue soon. Apologies for any inconvenience.

@boxuancui self-assigned this Jul 17, 2018
@boxuancui added the type: bug label Jul 17, 2018
@boxuancui added this to the v0.7.0 milestone Jul 17, 2018

@NabeelahB commented Jul 17, 2018

Have uploaded the file. Hope this helps
sample.xlsx

@boxuancui (Owner) commented Jul 18, 2018

@kmishra9 @DSQueen @NabeelahB
At first glance, this is caused by the missing values in your datasets. When create_report calls plot_correlation and plot_prcomp, your input data is automatically passed through na.omit for cleaning. With @NabeelahB's sample.xlsx, na.omit(data) returns an empty data object, so the function fails. I will think about handling this better in the next release. For now, please use the config argument in create_report to remove plot_correlation and plot_prcomp, e.g.,

create_report(
	...,
	config = list(
		"introduce" = list(),
		"plot_str" = list(
			"type" = "diagonal",
			"fontSize" = 35,
			"width" = 1000,
			"margin" = list("left" = 350, "right" = 250)
		),
		"plot_missing" = list(),
		"plot_histogram" = list(),
		"plot_bar" = list(),
		"plot_boxplot" = list(),
		"plot_scatterplot" = list()
	)
)

Note: The default config argument is shown below. See the first example in ?create_report.

config <- list(
	"introduce" = list(),
	"plot_str" = list(
		"type" = "diagonal",
		"fontSize" = 35,
		"width" = 1000,
		"margin" = list("left" = 350, "right" = 250)
	),
	"plot_missing" = list(),
	"plot_histogram" = list(),
	"plot_bar" = list(),
	"plot_correlation" = list("use" = "pairwise.complete.obs"),  ## Remove this
	"plot_prcomp" = list(),                                      ## Remove this
	"plot_boxplot" = list(),
	"plot_scatterplot" = list()
)
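
The workaround above could also be applied conditionally. The sketch below uses only base R and drops the two sections when na.omit would leave no rows. The helper safe_config and the data frame df are hypothetical names used for illustration and are not part of DataExplorer; default_config simply copies the list shown in this thread.

```r
# Default report sections as listed in this thread.
default_config <- list(
  "introduce" = list(),
  "plot_str" = list(
    "type" = "diagonal",
    "fontSize" = 35,
    "width" = 1000,
    "margin" = list("left" = 350, "right" = 250)
  ),
  "plot_missing" = list(),
  "plot_histogram" = list(),
  "plot_bar" = list(),
  "plot_correlation" = list("use" = "pairwise.complete.obs"),
  "plot_prcomp" = list(),
  "plot_boxplot" = list(),
  "plot_scatterplot" = list()
)

# Hypothetical helper (not part of DataExplorer): remove the sections that
# rely on na.omit() when no complete rows would remain.
safe_config <- function(data, config = default_config) {
  if (nrow(na.omit(data)) == 0) {
    config[c("plot_correlation", "plot_prcomp")] <- NULL
  }
  config
}

# Hypothetical data in which every row has at least one NA.
df <- data.frame(x = c(1, NA, 3), y = c(NA, 2, NA))
cfg <- safe_config(df)
print(names(cfg))

# Assuming DataExplorer is installed, the result would be passed on as:
# DataExplorer::create_report(df, config = cfg)
```

This only mirrors the manual workaround; it does not change how plot_correlation or plot_prcomp treat missing values.
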
@boxuancui changed the title from "Error during Report Generation" to "create_report fails due to empty data object after na.omit" Jul 18, 2018

@kmishra9 (Author) commented Jul 18, 2018

Yup, that was my default solution. Thanks!

boxuancui added a commit that referenced this issue Oct 16, 2018
@boxuancui closed this Oct 16, 2018