feat: Add var handling + validation#133

Merged
felix0097 merged 10 commits into main from ff/var-handling
Feb 10, 2026

@felix0097
Collaborator

This should implement: #109

@felix0097 felix0097 self-assigned this Feb 5, 2026
@codecov
codecov Bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 87.09677% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 91.00%. Comparing base (38c96d4) to head (1ee9a8b).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/annbatch/loader.py | 86.20% | 4 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #133      +/-   ##
==========================================
- Coverage   93.41%   91.00%   -2.41%     
==========================================
  Files          10       10              
  Lines         805      823      +18     
==========================================
- Hits          752      749       -3     
- Misses         53       74      +21     
| Files with missing lines | Coverage Δ |
|---|---|
| src/annbatch/types.py | 100.00% <100.00%> (ø) |
| src/annbatch/utils.py | 85.58% <100.00%> (-5.41%) ⬇️ |
| src/annbatch/loader.py | 88.92% <86.20%> (-3.96%) ⬇️ |

... and 1 file with indirect coverage changes

@felix0097 felix0097 requested a review from ilan-gold February 5, 2026 11:49
Comment thread src/annbatch/loader.py Outdated
Comment thread src/annbatch/loader.py Outdated
Comment thread src/annbatch/utils.py Outdated
X=g["X"] if isinstance(g["X"], zarr.Array) else ad.io.sparse_dataset(g["X"]), obs=ad.io.read_elem(g["obs"])
X=g["X"] if isinstance(g["X"], zarr.Array) else ad.io.sparse_dataset(g["X"]),
obs=ad.io.read_elem(g["obs"]),
var=pd.DataFrame(index=pd.Index(g["var/_index"][:], name=g["var"].attrs.get("_index"))),
Collaborator

Use g["var"].attrs.get("_index") to look up the actual index column, i.e., g[f"var/{index_name_from_attrs}"], instead of hard-coding g["var/_index"].
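
A minimal sketch of the lookup suggested above, not the code that was merged. `FakeGroup` and `read_var_index` are hypothetical names, and the stand-in class only mimics the two things the suggestion relies on: item access and an `attrs` mapping. Falling back to `"_index"` when the attribute is missing is also an assumption.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class FakeGroup:
    """Hypothetical stand-in for an on-disk group: child arrays plus attrs."""

    children: dict
    attrs: dict = field(default_factory=dict)

    def __getitem__(self, key: str) -> Any:
        return self.children[key]


def read_var_index(var_group, default_column: str = "_index"):
    """Return (column_name, values) for the index of a var group.

    attrs["_index"] names the column that stores the index, so the values
    are read from that column rather than from a hard-coded "_index" key.
    """
    column = var_group.attrs.get("_index", default_column)
    return column, list(var_group[column])


# The index lives under "gene_ids", not under "_index".
var = FakeGroup(
    children={"gene_ids": ["ENSG01", "ENSG02"]},
    attrs={"_index": "gene_ids"},
)
column, values = read_var_index(var)
```

With a real zarr group, the same pattern would read `g[f"var/{column}"][:]`. Reading `g["var/_index"]` directly fails for a store like this one, where the index is kept under a different key.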

Collaborator Author

fixed it 👍

felix0097 and others added 3 commits February 10, 2026 16:15
Co-authored-by: Ilan Gold <ilanbassgold@gmail.com>
Co-authored-by: Ilan Gold <ilanbassgold@gmail.com>
@felix0097 felix0097 requested a review from ilan-gold February 10, 2026 15:19
Comment thread tests/test_dataset.py Outdated
Comment on lines +534 to +543
# Test add_anndata raises error when adding second anndata with different var
loader = Loader(chunk_size=10, preload_nchunks=4, batch_size=20)
loader.add_anndata(adata1_on_disk)
with pytest.raises(ValueError, match="All datasets must have identical var DataFrames"):
    loader.add_anndata(adata2_on_disk)

# Test add_anndatas raises error when passing list with mismatched var
loader = Loader(chunk_size=10, preload_nchunks=4, batch_size=20)
with pytest.raises(ValueError, match="All datasets must have identical var DataFrames"):
    loader.add_anndatas([adata1_on_disk, adata2_on_disk])
Collaborator

Also add tests for add_datasets.

Comment thread src/annbatch/loader.py
Comment on lines +402 to +403
var
    :class:`~pandas.DataFrame` var, generally from :attr:`anndata.AnnData.var`.
Collaborator

Maybe mention here and in add_anndata that the added var must match any previously added ones
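
The rule the docstring should state (the first `var` is stored and every later one must match it) can be sketched as follows. This is an illustration under stated assumptions, not the `Loader` implementation: `VarRegistry` is a hypothetical helper, and plain tuples of feature names stand in for the pandas DataFrames that the real check compares. Only the error message is taken from the tests in this PR.

```python
class VarRegistry:
    """Hypothetical helper: remembers the first var and rejects mismatches."""

    def __init__(self):
        self._var_names = None  # set by the first dataset that is added

    def add(self, var_names):
        var_names = tuple(var_names)
        if self._var_names is None:
            # First dataset: its var becomes the reference.
            self._var_names = var_names
        elif var_names != self._var_names:
            # Later datasets must match the reference exactly.
            raise ValueError("All datasets must have identical var DataFrames")


registry = VarRegistry()
registry.add(["gene_a", "gene_b"])
registry.add(["gene_a", "gene_b"])  # same var: accepted

try:
    registry.add(["gene_a", "gene_c"])  # different var: rejected
except ValueError as err:
    message = str(err)
```

Because the reference is set by the first call, the order in which datasets are added decides which `var` later additions are validated against.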

@felix0097 felix0097 requested a review from ilan-gold February 10, 2026 15:49
@ilan-gold ilan-gold added the skip-gpu-ci Whether gpu ci should be skipped label Feb 10, 2026
Comment thread pyproject.toml Outdated
@felix0097 felix0097 merged commit 0628019 into main Feb 10, 2026
11 checks passed
@felix0097 felix0097 deleted the ff/var-handling branch February 10, 2026 17:34