Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multifile scanimage #297

Merged
merged 18 commits into from
Apr 4, 2024
Merged

Multifile scanimage #297

merged 18 commits into from
Apr 4, 2024

Conversation

pauladkisson
Copy link
Member

Fixes #296

@weiglszonja
Copy link
Contributor

This looks great @pauladkisson, thank you! I can confirm it works with the data from Dombeck; I'm just wondering now whether it is possible to speed up the initialisation of the "parent" extractor. We're parsing the metadata for each file when initialising the individual extractors, but they're all time same except for these keys:

Different values for common keys: [('frameTimestamps_sec', '0.000000000', '133.152635375'), ('frameNumbers', '1', '4001'), ('frameNumberAcquisition', '1', '4001')]

I think frameTimestamps_sec is parsed also in extract_timestamps_from_file() so we don't lose this if we just do

extract_extra_metadata(file_path)
parse_metadata(self.metadata)

for the first TIF file, but let me know what you think about this!

@CodyCBakerPhD
Copy link
Member

Can we add some tests, even if they're just manually formed out of symlink copies of the current single file ScanImageTiff files we currently have? To showcase and verify the naming pattern we expect from this multifile procedure generated by the ScanImage software (in particular, the naming pattern of the files providing the ordering of sequences is something we expect to follow a stringent pattern, presumably compatible with natsort)

@CodyCBakerPhD
Copy link
Member

And yes, consideration of expedited metadata parsing should be considered for speed. For a very long imaging session using this strategy, parsing metadata for each subfile can be expensive

@pauladkisson
Copy link
Member Author

Added tests and option to skip metadata i/o for multifile extractors. @weiglszonja lmk what you think!

@weiglszonja
Copy link
Contributor

Sorry for the delay @pauladkisson, is this example data dual or single channel? Can we add test for dual channel multifile as well?

I checked with the data I have from the Dombeck lab (see this note):

from roiextractors.extractors.tiffimagingextractors.scanimagetiffimagingextractor import \
    ScanImageTiffSinglePlaneMultiFileImagingExtractor


extractor1 = ScanImageTiffSinglePlaneMultiFileImagingExtractor(folder_path="/Volumes/LaCie/CN_GCP/Dombeck/2620749R2_231211", extract_all_metadata=False, file_pattern="2620749R2_231211_00001*.tif", channel_name="Channel 1", plane_name="0")
video1 = extractor1.get_video(start_frame=0, end_frame=100)

extractor2 = ScanImageTiffSinglePlaneMultiFileImagingExtractor(folder_path="/Volumes/LaCie/CN_GCP/Dombeck/2620749R2_231211", extract_all_metadata=False, file_pattern="2620749R2_231211_00001*.tif", channel_name="Channel 2", plane_name="0")
video2 = extractor1.get_video(start_frame=0, end_frame=100)

from numpy.testing import assert_array_equal

# This should not be True 
assert_array_equal(video1, video2)

However looks like the same video is returned for both channels, also looking at these plots:

Screenshot 2024-04-02 at 13 18 33

However when I'm using ScanImageTiffSinglePlaneImagingExtractor instead, the frames look as expected:

Screenshot 2024-04-02 at 13 23 10

@pauladkisson
Copy link
Member Author

pauladkisson commented Apr 2, 2024

Hey @weiglszonja, thanks for taking a look. It shouldn't matter whether the example data is single channel or multi channel, since ScanImageTiffSinglePlaneImagingExtractor takes care of all of the indexing logic. What would matter is if these files are split midcycle, which is typical of the file name pattern prefix_00001_00001.tif. This is an issue I just discovered from Lawrence Niu -- see #299. Do you know if the Dombeck files are split midcycle? Better yet, can you provide me two or three of these files via Google Drive, so that I can take a look at them myself?

@pauladkisson
Copy link
Member Author

pauladkisson commented Apr 4, 2024

Hey @weiglszonja, looks like there is a typo in your example code. You have

...
video2 = extractor1.get_video(start_frame=0, end_frame=100)
...

But it should be

...
video2 = extractor2.get_video(start_frame=0, end_frame=100)
...

I'm not seeing any problems with the multifile imaging extractors on my end:

file_path = '/Volumes/T7/CatalystNeuro/NWB/Dombeck/raw_data/2620749R2_231211_00001_00001.tif'
imaging_extractor = ScanImageTiffSinglePlaneImagingExtractor(
    file_path=file_path,
    channel_name='Channel 1',
    plane_name='0',
)
video_ch1_single = imaging_extractor.get_video(end_frame=10)
imaging_extractor = ScanImageTiffSinglePlaneImagingExtractor(
    file_path=file_path,
    channel_name='Channel 2',
    plane_name='0',
)
video_ch2_single = imaging_extractor.get_video(end_frame=10)

imaging_extractor = ScanImageTiffSinglePlaneMultiFileImagingExtractor(
    folder_path='/Volumes/T7/CatalystNeuro/NWB/Dombeck/raw_data',
    file_pattern='2620749R2_231211_00001_*.tif',
    channel_name='Channel 1',
    plane_name='0',
)
video_ch1_multi = imaging_extractor.get_video(end_frame=10)

imaging_extractor = ScanImageTiffSinglePlaneMultiFileImagingExtractor(
    folder_path='/Volumes/T7/CatalystNeuro/NWB/Dombeck/raw_data',
    file_pattern='2620749R2_231211_00001_*.tif',
    channel_name='Channel 2',
    plane_name='0',
)
video_ch2_multi = imaging_extractor.get_video(end_frame=10)

print(f'ch1 multi == ch1 single: {np.array_equal(video_ch1_multi, video_ch1_single)}') # True
print(f'ch2 multi == ch2 single: {np.array_equal(video_ch2_multi, video_ch2_single)}') # True
print(f'ch1 single == ch2 single: {np.array_equal(video_ch1_single, video_ch2_single)}') # False
print(f'ch1 multi == ch2 multi: {np.array_equal(video_ch1_multi, video_ch2_multi)}') # False

@weiglszonja
Copy link
Contributor

Hey @weiglszonja, looks like there is a typo in your example code. You have

Ah, my bad then. Sorry @pauladkisson, and thank you for checking.

Copy link

codecov bot commented Apr 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 79.59%. Comparing base (86126dd) to head (39779da).

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #297      +/-   ##
==========================================
+ Coverage   79.25%   79.59%   +0.34%     
==========================================
  Files          39       39              
  Lines        3100     3147      +47     
==========================================
+ Hits         2457     2505      +48     
+ Misses        643      642       -1     
Flag Coverage Δ
unittests 79.59% <100.00%> (+0.34%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
src/roiextractors/extractorlist.py 100.00% <ø> (ø)
...ctors/extractors/tiffimagingextractors/__init__.py 100.00% <ø> (ø)
...imagingextractors/scanimagetiffimagingextractor.py 99.17% <100.00%> (+0.70%) ⬆️

@pauladkisson pauladkisson merged commit 05a5675 into main Apr 4, 2024
16 checks passed
@pauladkisson pauladkisson deleted the multifile_scanimage branch April 4, 2024 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature]: Dedicated MultiFile ScanImage ImagingExtractors
3 participants