Error in serialize(data, node$con) : error writing to connection #652

Closed
GrizzledLotus opened this issue Mar 14, 2023 · 5 comments

@GrizzledLotus

Project Robyn

Describe issue

Starting suddenly about a week ago, I keep getting some version of this error. It usually happens when running OutputCollect <- robyn_outputs(), but it has also occurred when running OutputModels <- robyn_run(). This instance came from running:

this_sr_v1b1_a <- robyn_onepagers(select_model = "1_100_14",
                        OutputCollect = OutputCollect_sr_v1b1,
                        InputCollect = InputCollect_sr_v1b1,
                        export=FALSE,
                        quiet=TRUE)

I used the code below, taken from this post, to try to clear the parallelization issue: https://stackoverflow.com/questions/64519640/error-in-summary-connectionconnection-invalid-connection. That solution only worked once.

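# Clear foreach's registered parallel backend. The function has to be assigned
# a name and called to take effect; the name used here is illustrative.
unregister_dopar <- function() {
    env <- foreach:::.foreachGlobals
    rm(list = ls(name = env), pos = env)
}
unregister_dopar()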

When the error happens during robyn_outputs(), the output folder is created, but the run then crashes while creating the one-pagers. I've tried restarting my session and restarting my computer, and the serialization error still occurs. I get the error whether or not I run the two initial lines of code that force multi-core use. I've gotten it with both Robyn 3.9 and 3.10. I had successfully run half a dozen models on the same machine and setup before the error started suddenly.

This error has happened both on my local machine and on a Windows virtual machine dedicated solely to my Robyn computing.

Provide reproducible example

I can provide an anonymized dataset if absolutely necessary. Let me know if it's needed.

Environment & Robyn version

sessionInfo()
R version 4.2.2 (2022-10-31 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.utf8 LC_CTYPE=English_United States.utf8 LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C LC_TIME=English_United States.utf8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] Robyn_3.10.0.9000

loaded via a namespace (and not attached):
[1] reticulate_1.28 shape_1.4.6 minpack.lm_1.2-3 tidyselect_1.2.0 purrr_1.0.1 h2o_3.40.0.1 splines_4.2.2
[8] lattice_0.20-45 colorspace_2.1-0 vctrs_0.5.2 generics_0.1.3 yaml_2.3.7 utf8_1.2.3 survival_3.4-0
[15] rlang_1.0.6 nloptr_2.0.3 pillar_1.8.1 withr_2.5.0 prophet_1.0 glue_1.6.2 lares_5.2.0
[22] rngtools_1.5.2 doRNG_1.8.6 foreach_1.5.2 lifecycle_1.0.3 plyr_1.8.8 rpart.plot_3.1.1 stringr_1.5.0
[29] munsell_0.5.0 gtable_0.3.1 rvest_1.0.3 zip_2.2.2 codetools_0.2-18 doParallel_1.0.17 parallel_4.2.2
[36] fansi_1.0.4 Rcpp_1.0.10 scales_1.2.1 RcppParallel_5.1.7 jsonlite_1.8.4 png_0.1-8 ggplot2_3.4.1
[43] digest_0.6.31 stringi_1.7.12 openxlsx_4.2.5.2 dplyr_1.1.0 grid_4.2.2 cli_3.6.0 tools_4.2.2
[50] bitops_1.0-7 magrittr_2.0.3 RCurl_1.98-1.10 glmnet_4.1-6 patchwork_1.1.2 tibble_3.2.0 tidyr_1.3.0
[57] pkgconfig_2.0.3 Matrix_1.5-1 xml2_1.3.3 pROC_1.18.0 ggridges_0.5.4 lubridate_1.9.2 timechange_0.2.0
[64] httr_1.4.5 rstudioapi_0.14 iterators_1.0.14 R6_2.5.1 rpart_4.1.19 compiler_4.2.2

@laresbernardo
Collaborator

Hi @GrizzledLotus, this is a known issue and it's related to memory limits being reached. We are working on a way to reduce the size of the returned objects to avoid these crashes. For now, please try turning off parallel computing (cores = 1), reducing iterations and trials, or using cloud resources with larger specs. We will let all users know when we improve it.
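
A minimal sketch of the workaround described above, assuming an InputCollect object has already been built. The iteration and trial counts are illustrative assumptions, not recommended values, and the call is guarded so the snippet does nothing when Robyn or InputCollect is unavailable.

```r
# Conservative settings: no parallel workers, fewer iterations and trials.
# The numbers below are illustrative assumptions, not tuned recommendations.
run_args <- list(cores = 1, iterations = 1000, trials = 3)

if (requireNamespace("Robyn", quietly = TRUE) && exists("InputCollect")) {
  # cores, iterations and trials are passed through to robyn_run()
  OutputModels <- do.call(
    Robyn::robyn_run,
    c(list(InputCollect = InputCollect), run_args)
  )
}
```

Fewer iterations and trials produce a smaller OutputModels object, which is the quantity the memory explanation above points at.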

@laresbernardo self-assigned this Mar 14, 2023
@laresbernardo
Collaborator

Closing this task now that we have deployed some improvements that reduce the size of the outputs. Feel free to reach out again anytime.

@LMiddles

LMiddles commented Jul 5, 2023

Getting the error
Error in serialize(data, node$con): error writing to connection

Full rundown from command

Calculating clusters for model selection using Pareto fronts...
Auto selected k = 7 (clusters) based on minimum WSS variance of 5%
Collecting 129 pareto-optimum results into: /Users/dynamiceconometrics/Documents/source bmx/robyn output/Robyn_202307051159_init/
Exporting general plots into directory...
Exporting pareto results as CSVs into directory...
Exporting pareto one-pagers into directory...
Generating only cluster results one-pagers (7)...
Plotting 7 selected models on 31 cores...
| | 0%
Failed exporting results, but returned model results anyways:
Error in serialize(data, node$con): error writing to connection

when running
OutputCollect <- robyn_outputs(
  InputCollect, OutputModels,
  pareto_fronts = "auto", # automatically pick how many pareto-fronts to fill min_candidates (100)
  min_candidates = 100, # top pareto models for clustering. Default to 100
  calibration_constraint = 0.1, # range c(0.01, 0.1) & default at 0.1
  csv_out = "pareto", # "pareto", "all", or NULL (for none)
  clusters = TRUE, # Set to TRUE to cluster similar models by ROAS. See ?robyn_clusters
  export = TRUE, # this will create files locally
  plot_folder = robyn_directory, # path for plots exports and files creation
  plot_pareto = TRUE # Set to FALSE to deactivate plotting and saving model one-pagers
)

InputCollect has 39 elements (1.9 MB).
OutputModels has 17 elements (309.9 MB).

I am running this on a virtual machine with 32 cores and more than enough memory.

Current version of Robyn - ‘3.10.3’

The dataset is less than 3 years of data (127 obs of 17 variables)

Is this a memory problem or something else?
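
One rough way to reason about the question above is to measure the object and multiply by the worker count. This is only a sketch under a loud assumption: that each parallel worker receives its own serialized copy of the object, which may or may not match what Robyn actually sends to its workers. The helper name is hypothetical.

```r
# Hypothetical helper: rough in-memory size of one copy, and of one copy per worker.
# ASSUMPTION: every worker holds a full copy of the object.
approx_worker_footprint_mb <- function(obj, workers) {
  size_mb <- as.numeric(object.size(obj)) / 1024^2
  c(per_copy_mb = size_mb, all_workers_mb = size_mb * workers)
}

# Stand-in object; with real data, pass OutputModels instead.
x <- numeric(1e6)
est <- approx_worker_footprint_mb(x, workers = 31)
print(round(est, 1))
```

Under that assumption, a 309.9 MB object copied to 31 workers would total on the order of 9 to 10 GB, before counting any intermediate plotting objects.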

@gufengzhou
Contributor

In this thread the solution was dropping one core. Does it work for you?

@gufengzhou reopened this Jul 5, 2023
@LMiddles

LMiddles commented Jul 5, 2023

cores was left at the default, which is the number of cores minus 1 (31 here), and the error is still there.
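
Since the default already leaves one core free, a lower explicit cap is one more thing to try. A sketch, assuming the result is then supplied as the cores setting mentioned earlier in the thread; safe_cores is a hypothetical helper and the cap of 4 is arbitrary.

```r
# Hypothetical helper: pick a worker count well below the machine's core count.
safe_cores <- function(reserve = 1, cap = Inf) {
  n <- parallel::detectCores()
  if (is.na(n)) n <- 1  # detectCores() can return NA on some platforms
  max(1, min(n - reserve, cap))
}

cores_to_use <- safe_cores(reserve = 1, cap = 4)  # cap of 4 is an arbitrary choice
print(cores_to_use)
```

Fewer workers means fewer simultaneous connections and copies of the data, at the cost of a longer run.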

@gufengzhou closed this as not planned Oct 25, 2023