Callback API #427

Closed
hadley opened this issue Jun 8, 2016 · 2 comments

@hadley
Member

hadley commented Jun 8, 2016

Mostly a placeholder for now, but for the next release of readr we need a callback interface so that we have a way to handle files that are bigger than memory.

Rough sketch of the interface after discussions with @jcheng5:

read_csv_chunked <- function(..., callback) {
  callback <- as_chunk_callback(callback)

  callback$begin()
  on.exit(callback$finally(), add = TRUE)
  pos <- 1

  while (callback$continue() && has_more_rows()) {
    data <- read_more_rows()
    callback$receive(data, pos)
    pos <- pos + nrow(data)
  }

  return(callback$result())
}

as_chunk_callback <- function(x) UseMethod("as_chunk_callback")
as_chunk_callback.function <- function(x) {
  SideEffectChunkCallback$new(x)
}
as_chunk_callback.R6ClassGenerator <- function(x) {
  as_chunk_callback(x$new())
}
as_chunk_callback.ChunkCallback <- function(x) {
  x
}

# This would be used if the result should be thrown away.
# `ChunkCallback` is the base class (not shown in this sketch).
SideEffectChunkCallback <- R6::R6Class("SideEffectChunkCallback", inherit = ChunkCallback,
  private = list(
    callbackFunc = NULL,
    cancel = FALSE
  ),
  public = list(
    initialize = function(callbackFunc) {
      check_callback_fun(callbackFunc)
      private$callbackFunc <- callbackFunc
    },
    receive = function(data, index) {
      result <- private$callbackFunc(data, index)
      private$cancel <- identical(result, FALSE)
    },
    continue = function() {
      !private$cancel
    }
  )
)

# Used if the result of each chunk should be combined
# at the end
DataFrameCallback <- R6::R6Class("DataFrameCallback", inherit = ChunkCallback,
  private = list(
    callbackFunc = NULL,
    results = list()
  ),
  public = list(
    initialize = function(callbackFunc) {
      private$callbackFunc <- callbackFunc
    },
    receive = function(data, index) {
      result <- private$callbackFunc(data, index)
      private$results <- c(private$results, list(result))
    },
    result = function() {
      dplyr::bind_rows(private$results)
    }
  )
)
check_callback_fun <- function(x) {
  n_args <- length(formals(x))
  if (n_args < 2) {
    stop("`callbackFunc` must have two or more arguments", call. = FALSE)
  }
}
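
To make the control flow above concrete, here is a minimal sketch of the same protocol in base R only. It is not the readr implementation: plain lists of closures stand in for the R6 classes, the driver chunks an in-memory data frame rather than a file, and every name in it (`new_chunk_callback`, `side_effect_callback`, `data_frame_callback`, `read_chunked`, `chunk_size`) is hypothetical.

```r
# Base-R stand-in for the R6 classes: a callback is a list of closures.
# Defaults play the role of the (not shown) ChunkCallback base class.
new_chunk_callback <- function(receive,
                               continue = function() TRUE,
                               result = function() invisible(NULL),
                               begin = function() invisible(NULL),
                               finally = function() invisible(NULL)) {
  list(begin = begin, receive = receive, continue = continue,
       result = result, finally = finally)
}

# Side-effect callback: discards results, stops once `f` returns FALSE.
side_effect_callback <- function(f) {
  cancel <- FALSE
  new_chunk_callback(
    receive = function(data, pos) {
      cancel <<- identical(f(data, pos), FALSE)
    },
    continue = function() !cancel
  )
}

# Accumulating callback: keeps each chunk's result, row-binds them at the end.
data_frame_callback <- function(f) {
  results <- list()
  new_chunk_callback(
    receive = function(data, pos) {
      results[[length(results) + 1L]] <<- f(data, pos)
    },
    result = function() do.call(rbind, results)
  )
}

# Driver loop mirroring the sketch, but slicing a data frame instead of a file.
read_chunked <- function(df, callback, chunk_size = 10L) {
  callback$begin()
  on.exit(callback$finally(), add = TRUE)
  pos <- 1L
  while (callback$continue() && pos <= nrow(df)) {
    last <- min(pos + chunk_size - 1L, nrow(df))
    data <- df[pos:last, , drop = FALSE]
    callback$receive(data, pos)
    pos <- pos + nrow(data)
  }
  callback$result()
}

# Usage: keep only the 3-gear cars, processing mtcars ten rows at a time.
three_gear <- read_chunked(
  mtcars,
  data_frame_callback(function(x, pos) x[x$gear == 3, , drop = FALSE])
)

# Usage: record chunk start positions and cancel after the second chunk.
seen <- integer()
read_chunked(mtcars, side_effect_callback(function(x, pos) {
  seen <<- c(seen, pos)
  length(seen) < 2
}))
```

The point of splitting `receive`, `continue`, and `result` is that the driver never needs to know whether a callback accumulates anything: the side-effect variant only flips a cancel flag, while the accumulating variant only overrides `result`.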
@hadley hadley modified the milestone: 0.3.0 Jul 13, 2016
@hadley hadley added the feature a feature request or enhancement label Jul 13, 2016
@vspinu
Member

vspinu commented Dec 18, 2016

Wasn't this one "fixed" by now?

@jimhester
Collaborator

I think now that #498 is merged this is done and can be closed.

@lock lock bot locked and limited conversation to collaborators Sep 24, 2018