fix: fix ImageDataset axis order and add tests by ganow · Pull Request #105 · KamitaniLab/bdpy

ganow · 2025-04-14T13:01:03Z

Problem

ImageDataset had three bugs, with no tests to catch them:

1. Wrong axis order in returned image array
   __getitem__ returned images in HWC format (H, W, C), but PyTorch convolutional layers and most vision models expect CHW format (C, H, W). DataLoader stacks the arrays as-is, so networks received (N, H, W, C) batches and failed with shape mismatches.
2. Incorrect auto-detection of stimulus names
   When stimulus_names=None, file stems were extracted using a custom _removesuffix helper instead of Path.stem. This was unnecessarily verbose and fragile.
3. Non-deterministic order of auto-detected stimulus names
   When stimulus_names=None, Path.glob() was used without sorting, so the order of stimulus names was filesystem-dependent and not reproducible.
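
The stimulus-name issues can be illustrated without bdpy. The sketch below is a minimal example under stated assumptions: `list_stimulus_names` and the file names are hypothetical and are not taken from bdpy's code. It takes stems with `Path.stem` and sorts them explicitly, because `Path.glob()` makes no promise about ordering.

```python
import tempfile
from pathlib import Path


def list_stimulus_names(root: Path, extension: str = "png") -> list[str]:
    """Return file stems under `root`, sorted so the order is reproducible.

    Hypothetical helper for illustration; not the bdpy implementation.
    """
    # Path.glob() yields entries in an unspecified, filesystem-dependent order,
    # so sort explicitly before using the names as dataset indices.
    return sorted(path.stem for path in root.glob(f"*.{extension}"))


with tempfile.TemporaryDirectory() as tmpdir:
    root = Path(tmpdir)
    # Create the files in non-alphabetical order on purpose.
    for name in ("c.png", "a.png", "b.png"):
        (root / name).touch()
    names = list_stimulus_names(root)
    print(names)  # ['a', 'b', 'c']
```

Sorting at discovery time means index `i` refers to the same stimulus on every machine, which matters when the dataset is paired with separately stored features or labels.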

Fix

Added .transpose(2, 0, 1) to convert image arrays from HWC to CHW before returning.
Replaced the _removesuffix-based list comprehension with path.stem.
Added sorted() to ensure auto-detected stimulus names are always in alphabetical order.
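
As a minimal sketch of the axis fix, the NumPy snippet below applies the same `.transpose(2, 0, 1)` call to a dummy array. The array contents and variable names are assumptions chosen for illustration; only the transpose call itself comes from this PR.

```python
import numpy as np

# A non-square dummy image in HWC layout: height 4, width 6, 3 channels.
hwc = np.zeros((4, 6, 3), dtype=np.float32)
hwc[..., 0] = 0.25  # red channel
hwc[..., 1] = 0.50  # green channel
hwc[..., 2] = 0.75  # blue channel

# Move the channel axis to the front: (H, W, C) -> (C, H, W).
chw = hwc.transpose(2, 0, 1)

print(hwc.shape)  # (4, 6, 3)
print(chw.shape)  # (3, 4, 6)

# Each leading index now selects one full channel plane.
assert np.all(chw[0] == 0.25)
assert np.all(chw[2] == 0.75)
```

Note that `transpose` returns a view with permuted strides rather than a copy, so the conversion itself is cheap.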

Tests

Added tests/dl/torch/test_dataset.py with 9 test cases:

test_getitem_returns_chw_shape — verifies CHW axis order using a non-square image (H≠W) to fully discriminate every axis
test_getitem_preserves_channels — verifies per-channel values are correctly mapped after transpose
test_dataloader_integration_batch_shape — end-to-end check via DataLoader (the original failure path)
test_value_range_normalized_to_unit_interval — verifies pixel values are in [0, 1]
test_len_matches_stimulus_names — verifies __len__
test_explicit_stimulus_names_respected — verifies explicit stimulus_names are used as-is
test_explicit_stimulus_names_preserve_input_order — verifies input order is preserved when stimulus_names is given
test_auto_detected_stimulus_names_use_stem — verifies file stems are used when stimulus_names=None
test_auto_detected_stimulus_names_are_sorted — verifies alphabetical ordering of auto-detected names

Also removed the empty TestImageDataset stub from test_torch.py.
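
To show why a non-square image matters for the shape test, here is a hedged sketch of three of the checks written against a stand-in class. `ToyImageDataset` is hypothetical and only mimics the behavior described above (CHW output, values scaled to [0, 1] by dividing by 255, which is an assumption); the real tests in tests/dl/torch/test_dataset.py exercise bdpy's ImageDataset with image files on disk.

```python
import unittest

import numpy as np


class ToyImageDataset:
    """Stand-in for illustration only; not bdpy's ImageDataset."""

    def __init__(self, images_hwc):
        self._images = images_hwc  # list of uint8 arrays shaped (H, W, C)

    def __len__(self):
        return len(self._images)

    def __getitem__(self, index):
        # Assumed normalization: scale uint8 pixels to [0, 1].
        image = self._images[index].astype(np.float32) / 255.0
        return image.transpose(2, 0, 1)  # HWC -> CHW


class TestToyImageDataset(unittest.TestCase):
    def setUp(self):
        # H=4 and W=6 differ, so a wrong axis order cannot pass by accident.
        image = np.zeros((4, 6, 3), dtype=np.uint8)
        image[..., 0], image[..., 1], image[..., 2] = 200, 100, 50
        self.dataset = ToyImageDataset([image])

    def test_getitem_returns_chw_shape(self):
        self.assertEqual(self.dataset[0].shape, (3, 4, 6))

    def test_getitem_preserves_channels(self):
        chw = self.dataset[0]
        np.testing.assert_allclose(chw[:, 0, 0] * 255.0, [200, 100, 50], rtol=1e-5)

    def test_value_range_normalized_to_unit_interval(self):
        chw = self.dataset[0]
        self.assertGreaterEqual(chw.min(), 0.0)
        self.assertLessEqual(chw.max(), 1.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestToyImageDataset)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

With a square image, an output of shape (C, W, H) would have the same shape as (C, H, W), so the shape assertion alone would not distinguish the two.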

github-actions · 2026-05-13T05:29:26Z

Coverage Report

File	Stmts	Miss	Cover	Missing
bdpy/bdata
bdata.py	399	195	51%	79, 104, 109, 113, 118, 122, 132–134, 190, 233–239, 252–262, 276–277, 310, 314, 318–356, 405–411, 419–420, 425–426, 443–450, 468–469, 475, 508, 539, 548, 560, 589–598, 610, 625, 661, 683–691, 696–729, 738, 750–757, 761–767, 771–799, 803–824, 828–862, 866–868, 872–874, 878–887
featureselector.py	64	12	81%	62–67, 69–74
metadata.py	67	1	99%	84
utils.py	113	37	67%	71, 82, 85–86, 95, 127–173, 201, 246, 258, 263
bdpy/dataform
datastore.py	107	85	21%	59–75, 90–93, 97–98, 102–113, 116–119, 122–127, 131–132, 137–158, 190–197, 222–259, 262–265
features.py	298	165	45%	31–32, 43–46, 90–92, 101–103, 107, 111, 115, 119, 152–153, 157–161, 168–197, 214–215, 224–225, 232–236, 274, 288, 305–319, 323, 327, 331, 335, 339, 343, 347, 351, 355, 359, 364–394, 398–418, 422–462, 465, 470–477, 491–493, 496–499, 502–505, 508–512, 515–516, 536–549
kvs.py	177	13	93%	21, 24, 114, 118, 127–131, 171, 173, 185, 254, 282
pd.py	9	5	44%	25–27, 43–44
sparse.py	67	7	90%	29, 52–58, 74, 109, 123
utils.py	12	12	0%	3–18
bdpy/dataset
utils.py	45	45	0%	3–98
bdpy/distcomp
distcomp.py	92	18	80%	33, 35, 49, 53, 55, 66–70, 74, 76, 81–82, 89–93, 97
bdpy/dl
caffe.py	60	60	0%	4–129
bdpy/dl/torch
base.py	43	24	44%	31–41, 48, 54, 60, 63, 73–83, 90, 96, 102, 105
dataset.py	74	40	46%	37–39, 67–72, 75, 78–88, 122–130, 133, 136–149, 196, 199
models.py	333	226	32%	148–169, 297–316, 327–331, 345–350, 442–494, 515–517, 528–587, 611–614, 625–684, 708–711, 722–771, 790–793, 804–853, 872–875
torch.py	121	55	55%	188–225, 228, 231–243, 246–281
bdpy/dl/torch/domain
core.py	46	2	96%	47, 63
feature_domain.py	24	1	96%	30
image_domain.py	81	3	96%	91, 94, 257
bdpy/evals
metrics.py	95	45	53%	49–53, 82–112, 130–142, 151–152, 157, 172–179
bdpy/feature
feature.py	30	2	93%	69–70
bdpy/fig
__init__.py	5	5	0%	6–10
draw_group_image_set.py	90	90	0%	3–182
fig.py	88	88	0%	16–164
makeplots2.py	263	263	0%	1–608
makeplots.py	336	336	0%	1–729
tile_images.py	59	59	0%	1–193
bdpy/ml
crossvalidation.py	59	27	54%	47–48, 113–114, 117–118, 138, 164–196
learning.py	313	97	69%	9, 47–48, 52, 56, 63, 95–108, 113–129, 132, 162–174, 188–213, 297, 313, 317–319, 322–323, 333, 343–344, 349–350, 360–368, 371–372, 380, 415–422, 443, 456, 464, 473, 505–507, 546, 559, 562, 571, 580, 585, 606
model.py	140	120	14%	29–39, 54–70, 86–144, 156–169, 184–222, 225, 230–250, 254–258, 271–285
searchlight.py	16	13	19%	32–51
bdpy/mri
fmriprep.py	497	451	9%	25–34, 38, 44–62, 65–75, 78–89, 92–160, 163–194, 230–360, 367–380, 384, 388–390, 394, 398–400, 410–434, 437–454, 457–464, 471–472, 475–491, 494, 498, 502–815, 819–831, 842–862
glm.py	40	36	10%	46–95
image.py	24	19	21%	29–54
load_epi.py	28	18	36%	36–50, 56–63, 82–88
load_mri.py	19	16	16%	16–36
roi.py	248	217	12%	37–100, 165–235, 241–314, 320–387, 399–466, 473–499
spm.py	158	139	12%	26–155, 162–166, 170, 174–179, 183–300
bdpy/opendata
__init__.py	1	1	0%	1
openneuro.py	210	210	0%	1–329
bdpy/pipeline
config.py	36	2	94%	37–38
bdpy/preproc
interface.py	52	16	69%	111–123, 148–157
preprocessor.py	129	69	47%	35, 44, 112–114, 121–128, 138–189, 196–227
select_top.py	23	1	96%	55
bdpy/recon
utils.py	55	55	0%	4–146
bdpy/recon/torch
icnn.py	161	161	0%	15–478
bdpy/recon/torch/modules
critic.py	44	2	95%	58, 132
encoder.py	29	1	97%	29
generator.py	72	5	93%	47, 52, 68, 128, 309
latent.py	34	3	91%	16, 21, 32
bdpy/recon/torch/task
inversion.py	83	11	87%	22, 40, 45, 50, 57, 62, 67, 72, 96, 210, 225
bdpy/stats
corr.py	43	3	93%	57, 68, 102
bdpy/task
callback.py	71	4	94%	114, 161, 166, 234
core.py	16	1	94%	50
bdpy/util
info.py	47	36	23%	19–79
utils.py	36	8	78%	60, 116–121, 140–142
TOTAL	5981	3636	39%

Tests	Skipped	Failures	Errors	Time
218	0 💤	0 ❌	0 🔥	18.256s ⏱️

github-actions · 2026-05-13T05:29:26Z

Coverage Report

File	Stmts	Miss	Cover	Missing
bdpy/bdata
bdata.py	399	195	51%	79, 104, 109, 113, 118, 122, 132–134, 190, 233–239, 252–262, 276–277, 310, 314, 318–356, 405–411, 419–420, 425–426, 443–450, 468–469, 475, 508, 539, 548, 560, 589–598, 610, 625, 661, 683–691, 696–729, 738, 750–757, 761–767, 771–799, 803–824, 828–862, 866–868, 872–874, 878–887
featureselector.py	64	12	81%	62–67, 69–74
metadata.py	67	1	99%	84
utils.py	113	37	67%	71, 82, 85–86, 95, 127–173, 201, 246, 258, 263
bdpy/dataform
datastore.py	107	85	21%	59–75, 90–93, 97–98, 102–113, 116–119, 122–127, 131–132, 137–158, 190–197, 222–259, 262–265
features.py	298	165	45%	31–32, 43–46, 90–92, 101–103, 107, 111, 115, 119, 152–153, 157–161, 168–197, 214–215, 224–225, 232–236, 274, 288, 305–319, 323, 327, 331, 335, 339, 343, 347, 351, 355, 359, 364–394, 398–418, 422–462, 465, 470–477, 491–493, 496–499, 502–505, 508–512, 515–516, 536–549
kvs.py	177	13	93%	21, 24, 114, 118, 127–131, 171, 173, 185, 254, 282
pd.py	9	5	44%	25–27, 43–44
sparse.py	67	7	90%	29, 52–58, 74, 109, 123
utils.py	12	12	0%	3–18
bdpy/dataset
utils.py	45	45	0%	3–98
bdpy/distcomp
distcomp.py	92	18	80%	33, 35, 49, 53, 55, 66–70, 74, 76, 81–82, 89–93, 97
bdpy/dl
caffe.py	60	60	0%	4–129
bdpy/dl/torch
base.py	43	24	44%	31–41, 48, 54, 60, 63, 73–83, 90, 96, 102, 105
dataset.py	74	74	0%	1–192
models.py	333	226	32%	148–169, 297–316, 327–331, 345–350, 442–494, 515–517, 528–587, 611–614, 625–684, 708–711, 722–771, 790–793, 804–853, 872–875
torch.py	121	55	55%	188–225, 228, 231–243, 246–281
bdpy/dl/torch/domain
core.py	46	2	96%	47, 63
feature_domain.py	24	1	96%	30
image_domain.py	81	3	96%	91, 94, 257
bdpy/evals
metrics.py	95	45	53%	49–53, 82–112, 130–142, 151–152, 157, 172–179
bdpy/feature
feature.py	30	2	93%	69–70
bdpy/fig
__init__.py	5	5	0%	6–10
draw_group_image_set.py	90	90	0%	3–182
fig.py	88	88	0%	16–164
makeplots2.py	263	263	0%	1–608
makeplots.py	336	336	0%	1–729
tile_images.py	59	59	0%	1–193
bdpy/ml
crossvalidation.py	59	27	54%	47–48, 113–114, 117–118, 138, 164–196
learning.py	315	97	69%	9, 47–48, 52, 56, 63, 95–108, 113–129, 132, 162–174, 188–213, 297, 313, 317–319, 322–323, 333, 343–344, 349–350, 360–368, 371–372, 380, 415–422, 443, 456, 464, 473, 505–507, 546, 559, 562, 571, 580, 585, 606
model.py	140	120	14%	29–39, 54–70, 86–144, 156–169, 184–222, 225, 230–250, 254–258, 271–285
searchlight.py	16	13	19%	32–51
bdpy/mri
fmriprep.py	497	451	9%	25–34, 38, 44–62, 65–75, 78–89, 92–160, 163–194, 230–360, 367–380, 384, 388–390, 394, 398–400, 410–434, 437–454, 457–464, 471–472, 475–491, 494, 498, 502–815, 819–831, 842–862
glm.py	40	36	10%	46–95
image.py	24	19	21%	29–54
load_epi.py	28	18	36%	36–50, 56–63, 82–88
load_mri.py	19	16	16%	16–36
roi.py	248	217	12%	37–100, 165–235, 241–314, 320–387, 399–466, 473–499
spm.py	158	139	12%	26–155, 162–166, 170, 174–179, 183–300
bdpy/opendata
__init__.py	1	1	0%	1
openneuro.py	210	210	0%	1–329
bdpy/pipeline
config.py	36	2	94%	37–38
bdpy/preproc
interface.py	52	16	69%	111–123, 148–157
preprocessor.py	129	69	47%	35, 44, 112–114, 121–128, 138–189, 196–227
select_top.py	23	1	96%	55
bdpy/recon
utils.py	55	55	0%	4–146
bdpy/recon/torch
icnn.py	161	161	0%	15–478
bdpy/recon/torch/modules
critic.py	44	2	95%	58, 132
encoder.py	29	1	97%	29
generator.py	72	5	93%	47, 52, 68, 128, 309
latent.py	34	3	91%	16, 21, 32
optimizer.py	22	9	59%	8–26
bdpy/recon/torch/task
inversion.py	88	15	83%	11–16, 22, 40, 45, 50, 57, 62, 67, 72, 96, 210, 225
bdpy/stats
corr.py	43	3	93%	57, 68, 102
bdpy/task
callback.py	71	4	94%	114, 161, 166, 234
core.py	16	1	94%	50
bdpy/util
info.py	47	36	23%	19–79
utils.py	36	8	78%	60, 116–121, 140–142
TOTAL	5998	3683	39%

Tests	Skipped	Failures	Errors	Time
209	0 💤	0 ❌	0 🔥	15.920s ⏱️

Add tests/dl/torch/test_dataset.py with 9 test cases covering CHW axis order, per-channel values, DataLoader integration, value normalization, length, explicit stimulus ordering, and auto-detection via Path.stem. Also sort auto-detected stimulus names for deterministic ordering, and remove the empty TestImageDataset stub from test_torch.py.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

KenyaOtsuka

Thanks for the fix. I left a few minor comments.

Optionally, since this changes the behavior of ImageDataset from HWC to CHW, it might be helpful to document that ImageDataset now returns images in CHW format.

KenyaOtsuka · 2026-05-13T13:50:28Z

+from pathlib import Path
+
+import numpy as np
+import torch


torch seems to be unused in this file. Could you remove it?

That's true. Thank you for mentioning it.

fixed in c82dfac

KenyaOtsuka · 2026-05-13T13:53:56Z

+        root = Path(self.tmpdir.name)
+        _save_image(root / "a.jpg", r=200, g=100, b=50)
+        _save_image(root / "b.jpg", r=10, g=20, b=30)
+        _save_image(root / "c.jpg", r=0, g=128, b=255)


For tests that check exact pixel values, it may be better to use a lossless format such as PNG.

fixed in 57451e7

KenyaOtsuka

LGTM. Thank you!

ganow marked this pull request as ready for review April 14, 2025 13:02

ganow added the bug label Apr 14, 2025

ganow commented Apr 15, 2025

Comment thread bdpy/dl/torch/dataset.py Outdated

ganow changed the base branch from main to dev October 24, 2025 11:18

ganow and others added 2 commits May 13, 2026 14:27

bugfix: fix the behavior of ImageDataset

d047bea

Bugfix: bdpy/dl/torch/dataset.py

d1c0b5c

ganow force-pushed the fix-image-dataset branch from cb5aeb3 to d1c0b5c Compare May 13, 2026 05:27

ganow requested a review from KenyaOtsuka May 13, 2026 05:29

ganow changed the title ~~bugfix: fix the behavior of ImageDataset~~ fix: fix the behavior of ImageDataset May 13, 2026

ganow changed the title ~~fix: fix the behavior of ImageDataset~~ fix: fix ImageDataset axis order and add tests May 13, 2026

KenyaOtsuka reviewed May 13, 2026

ganow added 3 commits May 13, 2026 23:10

refactor: remove unused import of torch in test_dataset.py

c82dfac

test: update ImageDataset tests to use PNG format for images

57451e7

doc: update ImageDataset docstring to clarify stimulus_names behavior and add notes on image format

662f818

KenyaOtsuka approved these changes May 14, 2026

KenyaOtsuka merged commit 0bdb419 into dev May 14, 2026
4 of 6 checks passed

KenyaOtsuka deleted the fix-image-dataset branch May 14, 2026 02:49

ganow mentioned this pull request May 18, 2026

Release v0.26: backmerge release into main #119

Closed

KenyaOtsuka merged 6 commits into dev from fix-image-dataset
