Callback API #427

Closed
hadley opened this issue Jun 8, 2016 · 2 comments

@hadley hadley commented Jun 8, 2016

Mostly a placeholder for now, but for the next release of readr we need to make sure we have a callback interface so that we have a way to handle files that are bigger than memory.

Rough sketch of the interface after discussions with @jcheng5:

read_csv_chunked <- function(..., callback) {
  callback <- as_chunk_callback(callback)

  callback$begin()
  on.exit(callback$finally(), add = TRUE)
  pos <- 1

  while (callback$continue() && has_more_rows()) {
    data <- read_more_rows()
    callback$receive(data, pos)
    pos <- pos + nrow(data)
  }

  return(callback$result())
}

as_chunk_callback <- function(x) UseMethod("as_chunk_callback")
as_chunk_callback.function <- function(x) {
  SideEffectChunkCallback$new(x)
}
as_chunk_callback.R6ClassGenerator <- function(x) {
  as_chunk_callback(x$new())
}
as_chunk_callback.ChunkCallback <- function(x) {
  x
}

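# This would be used if the result should be thrown away
# (assumes a ChunkCallback base class, not shown in this sketch, that
# supplies default begin(), continue(), result() and finally() methods)
SideEffectChunkCallback <- R6::R6Class("SideEffectChunkCallback", inherit = ChunkCallback,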
  private = list(
    callbackFunc = NULL,
    cancel = FALSE
  ),
  public = list(
    initialize = function(callbackFunc) {
      check_callback_fun(callbackFunc)
      private$callbackFunc <- callbackFunc
    },
    receive = function(data, index) {
      result <- private$callbackFunc(data, index)
      private$cancel <- identical(result, FALSE)
    },
    continue = function() {
      !private$cancel
    }
  )
)

# Used if the result of each chunk should be combined
# at the end
DataFrameCallback <- R6::R6Class("DataFrameCallback", inherit = ChunkCallback,
  private = list(
    callbackFunc = NULL,
    results = list()
  ),
  public = list(
    initialize = function(callbackFunc) {
      private$callbackFunc <- callbackFunc
    },
    receive = function(data, index) {
      result <- private$callbackFunc(data, index)
      private$results <- c(private$results, list(result))
    },
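    # read_csv_chunked() returns callback$result(), so the combined
    # data frame is produced here rather than in finally()
    result = function() {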
      dplyr::bind_rows(private$results)
    }
  )
)
check_callback_fun <- function(x) {
  n_args <- length(formals(x))
  if (n_args < 2) {
    stop("`callbackFunc` must have two or more arguments", call. = FALSE)
  }
}
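
The protocol sketched above can be tried out without readr or R6. What follows is a minimal sketch only, assuming base R closures in place of the R6 classes and an in-memory data frame in place of a file. The names `read_chunked_sketch`, `side_effect_callback` and `data_frame_callback` are hypothetical and exist only for this illustration; they are not part of readr.

```r
# Hypothetical base-R sketch: closures stand in for the R6 classes above.

# Driver: feeds `df` to the callback `chunk_size` rows at a time.
read_chunked_sketch <- function(df, callback, chunk_size = 2) {
  callback$begin()
  on.exit(callback$finally(), add = TRUE)
  pos <- 1

  while (callback$continue() && pos <= nrow(df)) {
    end <- min(pos + chunk_size - 1, nrow(df))
    data <- df[pos:end, , drop = FALSE]
    callback$receive(data, pos)
    pos <- pos + nrow(data)
  }

  callback$result()
}

# Side-effect callback: nothing is kept; returning FALSE from `f` stops reading.
side_effect_callback <- function(f) {
  cancel <- FALSE
  list(
    begin = function() invisible(NULL),
    receive = function(data, index) {
      cancel <<- identical(f(data, index), FALSE)
    },
    continue = function() !cancel,
    result = function() invisible(NULL),
    finally = function() invisible(NULL)
  )
}

# Accumulating callback: keeps each chunk's result and row-binds them at the end.
data_frame_callback <- function(f) {
  results <- list()
  list(
    begin = function() invisible(NULL),
    receive = function(data, index) {
      results[[length(results) + 1]] <<- f(data, index)
    },
    continue = function() TRUE,
    result = function() do.call(rbind, results),
    finally = function() invisible(NULL)
  )
}

df <- data.frame(x = 1:5, y = c(10, 20, 30, 40, 50))

# Record the starting row of each chunk; stop once a chunk starts at row 3 or later.
seen <- integer()
read_chunked_sketch(df, side_effect_callback(function(data, index) {
  seen <<- c(seen, index)
  index < 3
}))

# Keep only the rows with y > 20 from each chunk, then combine them.
big <- read_chunked_sketch(df, data_frame_callback(function(data, index) {
  data[data$y > 20, , drop = FALSE]
}))
```

With this input, the first call visits the chunks starting at rows 1 and 3 and then stops, because the callback returned `FALSE`; the second call returns the three rows where `y > 20`. Splitting `continue()` from `receive()` keeps the early-exit decision inside the callback object, so the driver loop never needs to inspect the user function's return value itself.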

@vspinu vspinu commented Dec 18, 2016

Wasn't this one "fixed" by now?


@jimhester jimhester commented Jan 23, 2017

I think now that #498 is merged this is done and can be closed.

@jimhester jimhester closed this Jan 23, 2017
@lock lock bot locked and limited conversation to collaborators Sep 24, 2018