Add Backed AnnData Support via Low-Memory Chunked Processing and Numba Kernels #64
…ifferential expression calculations
- Added `_is_backed_array` function to check for backed/HDF5 arrays.
- Introduced `_load_chunk` to handle loading of matrix chunks.
- Updated `parallel_differential_expression` to automatically select a processing strategy based on data type.
- Enhanced error handling for backed AnnData in differential expression functions.
- Improved documentation for clarity on parameters and return values.
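The chunk-loading helpers named in this commit might look roughly like the sketch below. The function names mirror the commit message, but the bodies and the `iter_gene_chunks` driver are illustrative assumptions, not the actual implementation:

```python
import numpy as np

def _load_chunk(X, start, stop):
    # Illustrative stand-in for the commit's `_load_chunk`: slice a
    # column (gene) range and materialize it as a dense ndarray, which
    # also works for backed/HDF5 matrices that support column slicing.
    return np.asarray(X[:, start:stop])

def iter_gene_chunks(X, chunk_size):
    # Hypothetical driver: yield (start, dense_chunk) pairs so downstream
    # code only ever holds `chunk_size` genes in memory at once.
    n_genes = X.shape[1]
    for start in range(0, n_genes, chunk_size):
        yield start, _load_chunk(X, start, min(start + chunk_size, n_genes))
```

With a backed matrix, only each dense chunk is resident at a time, which is the source of the memory savings this PR targets.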
…rential_expression_vec_wrapper
- Introduced a `low_memory` option for optimized processing.
- Added a `gene_chunk_size` parameter to control chunk size during computations.
- Added a `show_progress` parameter to `parallel_differential_expression` and `parallel_differential_expression_vec_wrapper` for better user feedback during processing.
- Updated fold-change calculations to return NaN for zero means in specific cases, improving the robustness of differential expression calculations.
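The NaN handling for zero means can be illustrated with a minimal sketch. The function name and the exact guard conditions are assumptions; the commit only states that NaN is returned "in specific cases":

```python
import numpy as np

def log2_fold_change(mean_target, mean_ref):
    # Assumed guard: a zero mean in either group makes the ratio
    # undefined or degenerate, so return NaN instead of +/-inf.
    if mean_target == 0.0 or mean_ref == 0.0:
        return float("nan")
    return float(np.log2(mean_target / mean_ref))
```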
- Introduced a new module, `_parallel.py`, containing utilities for parallel processing in differential expression calculations.
- Implemented functions for default parallelization settings, Numba thread management, and processing targets in chunks.
- Added a vectorized Wilcoxon rank-sum test implementation for improved performance.
- Created a new test suite in `test_parallel.py` to validate the parallelization helpers.
- Moved rank-sum buffer preparation and kernel functions from `_single_cell.py` to `_parallel.py` for better organization and performance.
- Introduced a new `prepare_ranksum_buffers` function to allocate per-thread buffers.
- Added a vectorized implementation of the rank-sum test using Numba for parallel processing.
- Updated `_single_cell.py` to use the new rank-sum test functions, enhancing modularity and code clarity.
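A per-thread buffer allocator in the spirit of `prepare_ranksum_buffers` could be sketched as follows; the signature, dtype, and buffer shape are assumptions for illustration:

```python
import numpy as np

def prepare_ranksum_buffers(num_threads, buffer_size):
    # One scratch array per thread, so parallel rank-sum kernels can
    # write into pre-allocated memory instead of allocating inside the
    # hot loop (allocation inside Numba-parallel regions is costly).
    return [np.empty(buffer_size, dtype=np.float64) for _ in range(num_threads)]
```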
…ameters and refactor processing logic
- Added `num_workers` and `num_threads` parameters to `_parallel_differential_expression_chunked` and `parallel_differential_expression` for finer control over parallel processing.
- Refactored target processing to use `process_target_in_chunk` and `process_targets_parallel` for better modularity and performance.
- Updated documentation to clarify the new parameters and their impact on processing behavior.
- Updated progress descriptions in `process_targets_parallel` and `_parallel_differential_expression_chunked` to include the number of workers and Numba thread status for better user feedback.
- Enhanced logging to report the number of threads configured for Numba, improving transparency during execution.
- Added detailed explanations to the README of how in-memory and backed AnnData objects are handled, clarifying execution strategies and memory management.
- Updated the `parallel_differential_expression` docstring to specify the roles of `num_workers` and `num_threads` in low-memory mode.
- Enhanced documentation for the parallel processing utilities in `_parallel.py`, emphasizing their modularity and reusability.
- Introduced `is_integer_data` and `should_use_numba` functions to determine Numba applicability based on data type, improving performance for integer-like data.
- Updated `_parallel_differential_expression_chunked` to log a warning when Numba is disabled due to non-integer values, so users are informed of the fallback to SciPy.
- Adjusted the default `num_threads` parameter in several functions for usability and consistency.
- Added tests for the new functions to cover various scenarios, improving overall test coverage.
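A heuristic in the spirit of `is_integer_data` might look like the following. This is a sketch under stated assumptions; the actual check, its sampling strategy, and its signature in `_parallel.py` may differ:

```python
import numpy as np

def is_integer_data(X, sample_size=1000):
    # Assumed heuristic: treat data as integer-like (e.g. raw counts
    # stored as floats) if a sample of values survives truncation
    # unchanged. Integer-like data can use the histogram-based kernel.
    flat = np.asarray(X).ravel()
    sample = flat[: min(sample_size, flat.size)]
    return bool(np.all(sample == np.floor(sample)))
```

A check like this is what lets the pipeline keep Numba acceleration for raw counts while falling back to SciPy for log-transformed values.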
…unctions
- Changed the `num_workers` parameter in `parallel_differential_expression_vec_wrapper` to accept `None`, allowing more flexible worker configuration.
- Updated index creation in `build_small_anndata` and the `var` DataFrame to use `pd.Index`, improving compatibility with pandas operations.
- Modified `_sort_results` to accept both `pd.DataFrame` and `pl.DataFrame`, improving versatility in handling different DataFrame types.
…ions
- Reformatted the `_compute_means` call in `process_target_in_chunk` for better readability.
- Simplified iterable creation in `process_targets_parallel` by removing unnecessary line breaks.
- Streamlined the DataFrame creation in `build_small_anndata`, improving conciseness.
…upport
- Added dual-kernel functionality for the Wilcoxon rank-sum test, optimizing performance for both integer and float data types.
- Updated documentation to clarify when the histogram-based and sorting-based kernels are used.
- Refactored related functions for better organization and clarity, ensuring seamless integration with existing processing workflows.
- Adjusted parameter handling in `_parallel_differential_expression_chunked` to support the enhanced Numba integration.
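To illustrate the histogram-based kernel idea for integer data, here is a plain-NumPy sketch of the general technique, not the PR's Numba implementation. The name `ranksum_int_hist` and its signature are hypothetical, and the tie correction to the variance is omitted for brevity:

```python
import numpy as np
from math import erf, sqrt

def ranksum_int_hist(x, y, max_val):
    # Count occurrences of each integer value in both groups; the
    # combined histogram yields average ranks without an O(n log n) sort.
    cx = np.bincount(x, minlength=max_val + 1)
    cy = np.bincount(y, minlength=max_val + 1)
    c = cx + cy
    cum = np.cumsum(c)
    # The tied block at value v occupies ranks [cum[v]-c[v]+1, cum[v]];
    # its average rank is the midpoint below.
    ranks = cum - (c - 1) / 2.0
    rank_sum_x = float(np.sum(ranks * cx))
    n1, n2 = len(x), len(y)
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    # Normal approximation to the U distribution (tie correction omitted).
    mu = n1 * n2 / 2.0
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mu) / sigma
    # Two-sided p-value from the standard normal CDF.
    p = 2.0 * (1.0 - 0.5 * (1.0 + erf(abs(z) / sqrt(2.0))))
    return u, p
```

For small integer counts this replaces sorting with two `bincount` passes, which is why a histogram kernel pays off on raw count matrices while float data still needs the sorting-based kernel.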
…tings
- Replaced print statements with logging for design parameter output in `pseudobulk_dex`, enhancing traceability.
- Added warnings in `parallel_differential_expression_vec_wrapper` for ignored parameters, improving user awareness of configuration limitations.
- Renamed a test for clarity regarding low-memory float data handling.
…ksum processing
- Added return type hints to function signatures in `_parallel.py`, improving code clarity and type safety.
- Reformatted function definitions for better readability, in line with modern Python conventions.
- Updated documentation comments to reflect the changed function signatures.
… documentation for pdex.
- Added support for polars DataFrame conversion in integration tests, ensuring compatibility across data types.
- Relaxed test assertions to allow small numerical differences between implementations, improving test robustness.
- Refactored correlation assertions for clarity and consistent error messaging.
Code Review
This pull request introduces a new low-memory, chunked processing mode for differential expression analysis, primarily benefiting backed AnnData objects and very large in-memory datasets. A new `_parallel.py` module centralizes the parallelization helpers, including a dual-kernel Numba-accelerated Wilcoxon rank-sum test that automatically dispatches between a histogram-based kernel for integer data and a sorting-based kernel for float data.

The main `parallel_differential_expression` function in `_single_cell.py` was refactored to act as a dispatcher, automatically selecting between the original shared-memory multiprocessing approach and the new chunked mode based on whether the AnnData object is backed or `low_memory=True` is explicitly set. The `README.md` was updated with extensive documentation on these execution strategies and the parallelization controls (`num_workers`, `num_threads`). Integration tests were added to validate the consistency of results between the standard and low-memory modes across various data types (dense/sparse, counts/log1p) and chunk sizes.

Review comments raised concerns about the use of `fastmath=True` in the Numba kernel due to potential precision issues in statistical calculations, suggested removing an unreachable `else` block in `process_target_in_chunk`, and questioned the high `atol` tolerance used in the integration tests for comparing p-values, which might mask numerical differences or be a consequence of `fastmath`.
- Updated assertion formatting in integration tests for better readability and consistent error messages.
- Adjusted the numerical tolerance in `test_pdex.py` to allow only smaller differences, enhancing test robustness.
This PR introduces a major enhancement to `pdex` by adding support for backed AnnData objects, enabling differential expression analysis on datasets that are too large to fit into memory. This is achieved through a new low-memory, chunked processing pipeline, accelerated by dual Numba kernels for the Wilcoxon rank-sum test.

Changes
Features & Enhancements

Low-Memory Chunked Processing:
- A new chunked implementation in `_single_cell.py` processes genes in chunks, drastically reducing peak memory usage.
- The `parallel_differential_expression` function now automatically detects if the input `AnnData` is backed and switches to this low-memory implementation.
- A `low_memory=True` option has been added to allow users to force the chunked implementation for large in-memory datasets.

Numba-accelerated Wilcoxon Kernels:
- The new `_parallel.py` module introduces high-performance Numba kernels for the Wilcoxon rank-sum test.

Enhanced Parallelization Control:
- The `parallel_differential_expression` function now offers two levels of parallelization for the low-memory backend:
  - `num_workers`: Controls the number of threads for target-level parallelism.
  - `num_threads`: Controls the number of Numba threads for gene-level parallelism within the Wilcoxon kernel.

Testing
- A new integration test suite (`tests/test_integration.py`) has been added to ensure that the results of the new low-memory implementation are consistent with the standard in-memory implementation across various scenarios (dense/sparse, count/log-transformed data).
- A new unit test suite (`tests/test_parallel.py`) provides unit tests for the new Numba kernels and parallelization helper functions.
- Existing tests in `tests/test_pdex.py` have been extended to validate the low-memory implementation against the standard one.

Documentation
- The `README.md` file has been significantly updated to explain the new features, including the automatic backend selection for backed AnnData and the new parallelization options.
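The two-level scheme described in the PR (worker threads over targets, Numba threads inside each kernel) can be sketched with stdlib threads standing in for the target level. The per-target job below is a hypothetical mean-difference computation, not the PR's actual statistics:

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def process_target(target, X, labels):
    # Hypothetical per-target job: compare cells of one target group
    # against all remaining cells (mean difference per gene).
    mask = labels == target
    return target, X[mask].mean(axis=0) - X[~mask].mean(axis=0)

def process_targets_parallel(targets, X, labels, num_workers=4):
    # Target-level parallelism: one thread per in-flight target.
    # In the real pipeline, each job would additionally fan out to
    # `num_threads` Numba threads inside its rank-sum kernel.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return dict(pool.map(lambda t: process_target(t, X, labels), targets))
```

Threads (rather than processes) make sense at the outer level here because the heavy inner work would run in Numba kernels that release the GIL.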