Multifile scanimage #297

pauladkisson · 2024-03-20T18:16:53Z

Fixes #296

src/roiextractors/extractors/tiffimagingextractors/scanimagetiffimagingextractor.py

weiglszonja · 2024-03-20T19:14:43Z

This looks great @pauladkisson, thank you! I can confirm it works with the data from Dombeck; I'm just wondering now whether it is possible to speed up the initialisation of the "parent" extractor. We're parsing the metadata for each file when initialising the individual extractors, but they're all the same except for these keys:

Different values for common keys: [('frameTimestamps_sec', '0.000000000', '133.152635375'), ('frameNumbers', '1', '4001'), ('frameNumberAcquisition', '1', '4001')]

I think frameTimestamps_sec is also parsed in extract_timestamps_from_file(), so we wouldn't lose it if we just do

extract_extra_metadata(file_path)
parse_metadata(self.metadata)

for the first TIF file, but let me know what you think about this!
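A minimal sketch of the idea above, not the code in this PR: read the metadata once from the first file and reuse the shared part for the remaining files. The function `collect_metadata` and its `read_metadata` callback are hypothetical names introduced for illustration; the `extract_all_metadata` flag only mirrors the option name used later in this thread. Dropping the three per-file keys from the propagated copies is an assumption of this sketch.

```python
from typing import Callable, Dict, List

# Keys reported above as differing between files; everything else is assumed shared.
PER_FILE_KEYS = ("frameTimestamps_sec", "frameNumbers", "frameNumberAcquisition")


def collect_metadata(
    file_paths: List[str],
    read_metadata: Callable[[str], Dict[str, str]],
    extract_all_metadata: bool = True,
) -> List[Dict[str, str]]:
    """Return one metadata dict per file, optionally reading only the first file."""
    if not file_paths:
        return []
    if extract_all_metadata:
        # Slow path: one metadata read per file.
        return [read_metadata(path) for path in file_paths]
    # Fast path: read the first file only and propagate the shared keys.
    first = read_metadata(file_paths[0])
    shared = {key: value for key, value in first.items() if key not in PER_FILE_KEYS}
    return [first] + [dict(shared) for _ in file_paths[1:]]
```

With the fast path, the cost of metadata parsing no longer grows with the number of files, at the price of not having the per-file keys for files after the first.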

CodyCBakerPhD · 2024-03-20T19:37:24Z

Can we add some tests, even if they're just manually formed out of symlink copies of the single-file ScanImageTiff files we currently have? They would showcase and verify the naming pattern we expect from the multifile output of the ScanImage software. In particular, the file names determine the ordering of the sequences, so we expect them to follow a stringent pattern, presumably one compatible with natsort.
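To illustrate why the ordering matters, here is a small sketch of natural ordering for file names matched by a glob pattern. It uses a standard-library stand-in (`natural_key`) rather than natsort itself so that it runs without extra dependencies; `find_files` and the example file names are hypothetical and only meant to show the expected behaviour.

```python
import re
import tempfile
from pathlib import Path
from typing import List


def natural_key(name: str) -> List[object]:
    """Split a name into text and integer chunks so that 2 sorts before 10."""
    return [int(chunk) if chunk.isdigit() else chunk for chunk in re.split(r"(\d+)", name)]


def find_files(folder: str, pattern: str) -> List[str]:
    """Return the names of files in `folder` matching `pattern`, in natural order."""
    return sorted((path.name for path in Path(folder).glob(pattern)), key=natural_key)


with tempfile.TemporaryDirectory() as folder:
    # Empty placeholder files; only the names matter for the ordering check.
    for index in (10, 2, 1):
        (Path(folder) / f"prefix_00001_{index}.tif").touch()
    print(find_files(folder, "prefix_00001_*.tif"))
```

A plain lexicographic sort would place `prefix_00001_10.tif` before `prefix_00001_2.tif`; zero-padded names avoid the problem, but a natural sort does not depend on the padding.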

CodyCBakerPhD · 2024-03-20T19:38:26Z

And yes, expedited metadata parsing should be considered for speed. For a very long imaging session using this strategy, parsing metadata for each subfile can be expensive.


pauladkisson · 2024-03-27T20:26:56Z

Added tests and option to skip metadata i/o for multifile extractors. @weiglszonja lmk what you think!

weiglszonja · 2024-04-02T11:24:49Z

Sorry for the delay @pauladkisson, is this example data dual or single channel? Can we add a test for dual channel multifile as well?

I checked with the data I have from the Dombeck lab (see this note):

from roiextractors.extractors.tiffimagingextractors.scanimagetiffimagingextractor import \
    ScanImageTiffSinglePlaneMultiFileImagingExtractor


extractor1 = ScanImageTiffSinglePlaneMultiFileImagingExtractor(folder_path="/Volumes/LaCie/CN_GCP/Dombeck/2620749R2_231211", extract_all_metadata=False, file_pattern="2620749R2_231211_00001*.tif", channel_name="Channel 1", plane_name="0")
video1 = extractor1.get_video(start_frame=0, end_frame=100)

extractor2 = ScanImageTiffSinglePlaneMultiFileImagingExtractor(folder_path="/Volumes/LaCie/CN_GCP/Dombeck/2620749R2_231211", extract_all_metadata=False, file_pattern="2620749R2_231211_00001*.tif", channel_name="Channel 2", plane_name="0")
video2 = extractor1.get_video(start_frame=0, end_frame=100)

from numpy.testing import assert_array_equal

# This should not be True 
assert_array_equal(video1, video2)

However, it looks like the same video is returned for both channels, as these plots also show:

However, when I use ScanImageTiffSinglePlaneImagingExtractor instead, the frames look as expected:

pauladkisson · 2024-04-02T17:36:51Z

Hey @weiglszonja, thanks for taking a look. It shouldn't matter whether the example data is single channel or multi channel, since ScanImageTiffSinglePlaneImagingExtractor takes care of all of the indexing logic. What would matter is if these files are split midcycle, which is typical of the file name pattern prefix_00001_00001.tif. This is an issue I just discovered from Lawrence Niu -- see #299. Do you know if the Dombeck files are split midcycle? Better yet, can you provide me two or three of these files via Google Drive, so that I can take a look at them myself?
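As a rough illustration of what "split midcycle" means here, the sketch below flags files that begin partway through a channel/plane cycle. It assumes raw frames are interleaved in cycles of `num_channels * num_planes` frames and ignores any additional per-slice frame grouping; the function name and the frame counts are hypothetical and do not come from roiextractors.

```python
from typing import List


def files_split_midcycle(
    frames_per_file: List[int], num_channels: int, num_planes: int = 1
) -> List[bool]:
    """Flag each file that starts partway through a channel/plane cycle."""
    cycle_length = num_channels * num_planes
    flags = []
    frames_so_far = 0
    for num_frames in frames_per_file:
        # A file starts midcycle when the frames before it do not fill whole cycles.
        flags.append(frames_so_far % cycle_length != 0)
        frames_so_far += num_frames
    return flags


# Two channels, one plane: raw frame counts of 5, 5 and 4 put the second file midcycle.
print(files_split_midcycle([5, 5, 4], num_channels=2))
```

When every file holds a whole number of cycles, each file can be indexed independently; otherwise the channel of a file's first frame depends on the files before it.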

pauladkisson · 2024-04-04T00:22:16Z

Hey @weiglszonja, looks like there is a typo in your example code. You have

...
video2 = extractor1.get_video(start_frame=0, end_frame=100)
...

But it should be

...
video2 = extractor2.get_video(start_frame=0, end_frame=100)
...

I'm not seeing any problems with the multifile imaging extractors on my end:

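import numpy as np

from roiextractors.extractors.tiffimagingextractors.scanimagetiffimagingextractor import (
    ScanImageTiffSinglePlaneImagingExtractor,
    ScanImageTiffSinglePlaneMultiFileImagingExtractor,
)

file_path = '/Volumes/T7/CatalystNeuro/NWB/Dombeck/raw_data/2620749R2_231211_00001_00001.tif'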
imaging_extractor = ScanImageTiffSinglePlaneImagingExtractor(
    file_path=file_path,
    channel_name='Channel 1',
    plane_name='0',
)
video_ch1_single = imaging_extractor.get_video(end_frame=10)
imaging_extractor = ScanImageTiffSinglePlaneImagingExtractor(
    file_path=file_path,
    channel_name='Channel 2',
    plane_name='0',
)
video_ch2_single = imaging_extractor.get_video(end_frame=10)

imaging_extractor = ScanImageTiffSinglePlaneMultiFileImagingExtractor(
    folder_path='/Volumes/T7/CatalystNeuro/NWB/Dombeck/raw_data',
    file_pattern='2620749R2_231211_00001_*.tif',
    channel_name='Channel 1',
    plane_name='0',
)
video_ch1_multi = imaging_extractor.get_video(end_frame=10)

imaging_extractor = ScanImageTiffSinglePlaneMultiFileImagingExtractor(
    folder_path='/Volumes/T7/CatalystNeuro/NWB/Dombeck/raw_data',
    file_pattern='2620749R2_231211_00001_*.tif',
    channel_name='Channel 2',
    plane_name='0',
)
video_ch2_multi = imaging_extractor.get_video(end_frame=10)

print(f'ch1 multi == ch1 single: {np.array_equal(video_ch1_multi, video_ch1_single)}') # True
print(f'ch2 multi == ch2 single: {np.array_equal(video_ch2_multi, video_ch2_single)}') # True
print(f'ch1 single == ch2 single: {np.array_equal(video_ch1_single, video_ch2_single)}') # False
print(f'ch1 multi == ch2 multi: {np.array_equal(video_ch1_multi, video_ch2_multi)}') # False

weiglszonja · 2024-04-04T08:31:12Z

> Hey @weiglszonja, looks like there is a typo in your example code. You have

Ah, my bad then. Sorry @pauladkisson, and thank you for checking.

codecov · 2024-04-04T15:59:19Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.59%. Comparing base (86126dd) to head (39779da).

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #297      +/-   ##
==========================================
+ Coverage   79.25%   79.59%   +0.34%     
==========================================
  Files          39       39              
  Lines        3100     3147      +47     
==========================================
+ Hits         2457     2505      +48     
+ Misses        643      642       -1

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `79.59% <100.00%> (+0.34%)` | ⬆️ |


| Files | Coverage Δ | |
|---|---|---|
| src/roiextractors/extractorlist.py | `100.00% <ø> (ø)` | |
| ...ctors/extractors/tiffimagingextractors/__init__.py | `100.00% <ø> (ø)` | |
| ...imagingextractors/scanimagetiffimagingextractor.py | `99.17% <100.00%> (+0.70%)` | ⬆️ |

pauladkisson added 2 commits March 20, 2024 11:05

added singleplane multifile extractor

43bafd5

added multiplane multifile extractor

3de1be3

pauladkisson requested a review from weiglszonja March 20, 2024 18:17

weiglszonja reviewed Mar 20, 2024


pauladkisson added 12 commits March 26, 2024 14:32

updated frames_per_slice to work better with single-plane data

d888d99

added multifile extractors to the init

d2dfcd5

added multifile extractors to extractor list

91cd671

added multifile tests

3eac45a

added multifile multiplane tests

51eae64

improved tests to actually check file names

54b23fd

switched to natsorted

18b26e7

added option to only extract first files metadata and then propagate it to all other files

1027fd4

added some docstring info

ca16bd8

added metadata tests

1e8cf55

moved natsort import to __init__

6d99d63

retrigger checks

a09263a

pauladkisson requested a review from weiglszonja March 27, 2024 20:26

pauladkisson added 3 commits March 29, 2024 15:47

propagated parsed_metadata along with metadata

0f5b811

updated docstrings

f94c2f4

added coverage for extract_all_metadata options

e598a71

weiglszonja approved these changes Apr 4, 2024

Merge branch 'main' into multifile_scanimage

39779da

pauladkisson merged commit 05a5675 into main Apr 4, 2024
16 checks passed

pauladkisson deleted the multifile_scanimage branch April 4, 2024 15:59

weiglszonja mentioned this pull request Apr 9, 2024

Add interface for single and multi-file ScanImage TIFF files catalystneuro/neuroconv#809

Merged

6 tasks
