errors do not cancel computation -> endless loop #44

MaximilianPi · 2024-01-19T09:44:26Z

Errors in the workers do not seem to abort the computations and result in an infinite loop (tested on macOS and Linux):

backend <- parabar::start_backend(3L)
parabar::configure_bar(type = "modern", format = ":percent :eta", width = round(getOption("width") / 2), clear = FALSE)
results_tuning <- parabar::par_lapply(backend, 1:10, function(i) {
  print() # first error
  stop("Error...")
  return(0)
})
print("End")
parabar::stop_backend(backend)

It works if you interrupt and rerun it a second time with the same workers/backend.

mihaiconstantin · 2024-01-30T10:07:25Z

Thanks for reporting this! Do you know if this behavior also occurs when using the R6Class API?

mihaiconstantin · 2024-02-04T13:46:57Z

Consider for a moment the code below:

# Specification instance.
specification <- Specification$new()

# Specification details.
specification$set_cores(cores = 3)
specification$set_type(type = "psock")

# Backend instance.
backend <- AsyncBackend$new()

# Start the backend.
backend$start(specification)

# Run the task.
backend$sapply(1:10, function(x) {
    stop("First intended error.")
    stop("Second intended error.")
    return(0)
})

# Read the output.
backend$get_output(wait = TRUE)

# Stop it.
backend$stop()

The call backend$get_output(wait = TRUE) successfully reports that an error has occurred in the sub-session:

Error: ! in callr subprocess.
Caused by error in `checkForRemoteErrors(val)`:
! 3 nodes produced errors; first error: First intended error.
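
As the `Caused by` line shows, the inner message is raised by `checkForRemoteErrors`, which belongs to base R's `parallel` package. Below is a minimal sketch that reproduces only that layer, without parabar or callr, assuming a local PSOCK cluster can be started. The names `cluster` and `message_text` are mine, for illustration only.

```r
library(parallel)

# Three PSOCK workers, matching the number of cores used above.
cluster <- makePSOCKcluster(3L)

# Capture the error message instead of letting it abort the script.
message_text <- tryCatch(
    {
        parLapply(cluster, 1:10, function(x) {
            stop("First intended error.")
        })

        # Only reached if no worker fails.
        NA_character_
    },
    error = function(e) conditionMessage(e),
    finally = stopCluster(cluster)
)

# The aggregated message names the first error the workers reported.
print(message_text)
```

The point is that a failing task already surfaces as an ordinary R error at this level, so it can be caught with `tryCatch` in the calling session.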

This is what we expect to see because in AsyncBackend.R we check the sub-session for errors and raise them in the interactive session, as the lines below show:

parabar/R/AsyncBackend.R

Lines 269 to 273 in 8bbeaab

    
    # If an error ocurred in the session.
    if (!is.null(output$error)) {
        # Throw error in the main session.
        Exception$async_task_error(output$error)
    }

Since the backend works as intended, I tend to believe the issue is with the context classes in which this backend operates (i.e., maybe ProgressTrackingContext.R).

mihaiconstantin · 2024-02-04T14:46:37Z

It looks like the problem is, indeed, with the progress tracking, and not with the backend.

While the tasks are being executed, each worker reports its progress. The .show_progress method of ProgressTrackingContext.R monitors this progress from the interactive session and displays it (e.g., as a progress bar). However, since each worker throws an error after the first task execution, subsequent executions are stopped and, consequently, no more progress is reported. Despite this, .show_progress keeps waiting for tasks to be executed, unaware that no further tasks will run:

parabar/R/ProgressTrackingContext.R

Lines 216 to 219 in 8bbeaab

    
    # While there are still tasks to be processed.
    while (tasks_processed < total) {
        # Get the current number of tasks processed.
        current_tasks_processed <- length(readLines(log, warn = FALSE))
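
To make the stall concrete, here is a minimal base-R sketch of the same polling pattern. It is not the parabar code: the temporary log file, the three pre-written lines, and the `max_polls` guard are assumptions added so that the example terminates. Without that guard the loop would never exit, because the line count stays below `total`.

```r
# Hypothetical stand-in for the log file the workers append to.
log <- tempfile(fileext = ".log")

# Pretend three tasks were logged before the workers stopped on an error.
writeLines(rep("task done", 3L), log)

total <- 10L            # Number of tasks we expect to be processed.
tasks_processed <- 0L   # Number of tasks seen in the log so far.
polls <- 0L             # How many times the log was read.
max_polls <- 5L         # Demo-only guard so this sketch terminates.

# Same condition as the quoted loop, plus the demo-only guard.
while (tasks_processed < total && polls < max_polls) {
    # Count the lines written to the log so far.
    tasks_processed <- length(readLines(log, warn = FALSE))
    polls <- polls + 1L
    Sys.sleep(0.01)
}

# The count never reaches the total, so the loop only ended via the guard.
stalled <- tasks_processed < total
cat("processed:", tasks_processed, "of", total, "- stalled:", stalled, "\n")

unlink(log)
```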

We need to let .show_progress know when the tasks stop executing. Otherwise, the progress bar would just get stuck at the point in time where an error occurs. In your example this happens right at the beginning, but it can also happen later, e.g.:

function(x) {
    Sys.sleep(0.01)
    if(x == 50) {
        stop("First intended error.")
        stop("Second intended error.")
    }
    return(0)
}
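
One way to let the progress loop know that nothing more will arrive is to give it a second exit condition. The sketch below is only an illustration of that idea under my own assumptions, not the change made in the package: `watch_progress` and `task_finished` are hypothetical names, and the backend is simulated by a counter that reports completion on the third poll.

```r
# Poll a progress log until all tasks are logged or the task has ended.
# `task_finished` is a hypothetical zero-argument function returning TRUE
# once the backend is done, whether it succeeded or failed.
watch_progress <- function(log, total, task_finished, wait = 0.01) {
    tasks_processed <- 0L

    while (tasks_processed < total) {
        # Ask first, then read, so lines written just before the task
        # ended are still counted in the read below.
        finished <- task_finished()

        tasks_processed <- length(readLines(log, warn = FALSE))

        # Stop waiting if no further progress can be reported.
        if (finished) break

        Sys.sleep(wait)
    }

    tasks_processed
}

# Simulate 49 logged tasks followed by an error on the 50th.
log <- tempfile(fileext = ".log")
writeLines(rep("task done", 49L), log)

# Simulated backend state: reports completion on the third poll.
polls <- 0L
task_finished <- function() {
    polls <<- polls + 1L
    polls >= 3L
}

processed <- watch_progress(log, total = 100L, task_finished = task_finished)
unlink(log)
```

With this shape the loop returns the number of tasks actually logged instead of hanging, and the caller can then fetch the output and let the backend raise the error as usual.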

mihaiconstantin · 2024-02-05T16:04:44Z

@MaximilianPi, this is now fixed in #49 and will be in the next release.

MaximilianPi · 2024-02-06T11:10:20Z

Great, thanks! @mihaiconstantin

mihaiconstantin added the bug label Jan 30, 2024

mihaiconstantin self-assigned this Jan 30, 2024

mihaiconstantin added a commit that referenced this issue Feb 5, 2024

Fix: interrupt progress bar if the task is completed

00c68bd

Addresses #44.

mihaiconstantin mentioned this issue Feb 5, 2024

Fix hanging progress bar on task errors in ProgressTrackingContext #49

Merged

mihaiconstantin added a commit that referenced this issue Feb 5, 2024

Test: add tests for progress tracking when tasks throw errors

5bcd70d

In relation to #44.

mihaiconstantin closed this as completed in 22ff294 Feb 5, 2024
