Handling current_block_id and effective_grid for tooling on top of blockApply #71

Yes, I know I really shouldn't be tangling with either of these two variables, but bear with me here...

**beachmat** implements the [`colBlockApply()` function](https://github.com/LTLA/beachmat/blob/master/R/colBlockApply.R), which was originally a wrapper around `blockApply` with `grid=` just set to `colAutoGrid(x)` so that I didn't have to keep on typing it all out. (Same for `rowBlockApply`.) Over time, though, it evolved to add some additional features that were useful to me:

- Avoid the overhead of a `*gCMatrix` to `SparseArraySeed` back to `*gCMatrix` conversion when dealing with functions that were capable of taking `*gCMatrix` inputs.
- Split up in-memory matrices (specifically, `*gCMatrix` and ordinary matrices) prior to calling **BiocParallel** functions, to avoid the cost of serializing the entire matrix when only a fragment is used in each worker.
- Avoid the overhead of block processing altogether when the matrix is in memory and no parallelization is requested, in which case we can just apply `FUN` on the full matrix directly.

To do this, sometimes I would pass `FUN` to `blockApply()`, and other times I would apply `FUN` directly to the matrix or its split-up fragments. This worked pretty well, provided I added the grid ID attributes to the matrix prior to calling `FUN` manually. However, that approach no longer works after the changes I requested in #69. Oops.
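For illustration, here is a rough base-R sketch of that dispatch. It is not the actual **beachmat** code: `sketch_col_block_apply()`, `n_blocks` and `parallel` are hypothetical names, and a plain `lapply()` stands in for the **BiocParallel** call.

```r
# Hypothetical illustration only; not the beachmat implementation.
sketch_col_block_apply <- function(x, FUN, ..., n_blocks = 2L, parallel = FALSE) {
    # In memory and no parallelization requested: skip block processing
    # and apply FUN once to the full matrix.
    if (!parallel) {
        return(list(FUN(x, ...)))
    }

    # Otherwise, assign each column to one of 'n_blocks' contiguous chunks.
    idx <- seq_len(ncol(x))
    group <- ceiling(idx * n_blocks / length(idx))

    # Split the matrix up front, so that each worker would only need to
    # receive its own fragment rather than the entire matrix.
    fragments <- lapply(split(idx, group), function(j) x[, j, drop = FALSE])

    # Stand-in for a parallel apply over the fragments.
    unname(lapply(fragments, FUN, ...))
}

m <- matrix(1:12, nrow = 3)
direct <- sketch_col_block_apply(m, colSums)
chunked <- sketch_col_block_apply(m, colSums, n_blocks = 2L, parallel = TRUE)
```

In both branches `FUN` is called manually rather than through `blockApply()`, which is why any grid information has to be supplied by the caller.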

I can mimic the creation of `current_block_id` and `effective_grid` so that they get found by `effectiveGrid()` and friends, but this makes my code pretty fragile to any changes you make in the `effectiveGrid()` discovery mechanism. So I wonder whether it would be possible to expose a setter mechanism for my use case.
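To make the request concrete, here is a minimal base-R sketch of what a setter/getter pair could look like. Every name in it (`.block_context`, `set_block_context()`, `current_block_id_sketch()`, `call_with_block_context()`) is hypothetical, and storing the context in a dedicated environment is just one possible design; it is not meant to describe how `effectiveGrid()` discovers its values.

```r
# Hypothetical sketch; none of these names are part of DelayedArray.
.block_context <- new.env(parent = emptyenv())
assign("grid", NULL, envir = .block_context)
assign("block_id", NULL, envir = .block_context)

# Setter: record the grid and block ID, returning the previous values
# invisibly so that the caller can restore them afterwards.
set_block_context <- function(grid, block_id) {
    old <- list(
        grid = get("grid", envir = .block_context),
        block_id = get("block_id", envir = .block_context)
    )
    assign("grid", grid, envir = .block_context)
    assign("block_id", block_id, envir = .block_context)
    invisible(old)
}

# Getter that FUN could call, analogous in spirit to a block ID lookup.
current_block_id_sketch <- function() {
    id <- get("block_id", envir = .block_context)
    if (is.null(id)) stop("no block context is set")
    id
}

# Call FUN manually on a block, with the context set for the duration of
# the call and restored on exit (even if FUN throws an error).
call_with_block_context <- function(FUN, block, grid, block_id, ...) {
    old <- set_block_context(grid, block_id)
    on.exit(set_block_context(old$grid, old$block_id), add = TRUE)
    FUN(block, ...)
}

report <- function(block) c(id = current_block_id_sketch(), ncol = ncol(block))
m <- matrix(1:12, nrow = 3)
res <- call_with_block_context(report, m[, 3:4, drop = FALSE],
    grid = list(1:2, 3:4), block_id = 2L)
```

With something along these lines, downstream code would depend only on the setter's signature and not on where the getter looks for its values.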

Of course, I'd be happy to push some of my changes to `blockApply()` itself, and then `colBlockApply()` could revert to being the wrapper that it used to be. However, some of those changes are a bit opinionated, e.g., they assume that `FUN` is capable of taking `*gCMatrix` inputs because that's also what **beachmat** v3 supports.
