Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for ROIs with different size or resolution #74

Open
wants to merge 41 commits into
base: master
Choose a base branch
from

Conversation

ageorgou
Copy link
Contributor

@ageorgou ageorgou commented Sep 8, 2020

Will fix #36.

  • New header version for latest LabView setup
  • Put reading/handling of ROI information in its own class for more flexibility
  • Read extra columns in newer setup (resolution, FOV, ...)
  • Create an imaging plane for each ROI acquisition series if needed

@ageorgou ageorgou marked this pull request as draft September 8, 2020 07:36
Currently does exactly the same as before, but should help us
introduce more flexibility, particularly to customise for
different versions of the LabView setup.
@ageorgou
Copy link
Contributor Author

ageorgou commented Sep 8, 2020

This is still far from done but it's a start! It may be that subclasses for RoiReader are not the way to go (instead, we could create the mappings and specify behaviour through functions or other classes -- composition rather than inheritance). Since there is no immediate prospect of further LabView versions, however, we might as well not overcomplicate this right now.

The main remaining changes are:

  • adding the new ROI.dat columns and their types to RoiReaderv300
  • adding a method to RoiReader that takes a ROI number and an NWBFile (or perhaps our own NwbFile) and creates/returns the imaging plane corresponding to the acquisition for that ROI. For older versions, this should give the ImagingPlane created for the corresponding Z-plane. For the latest version, and only if variable shapes are enabled, it should create a new ImagingPlane based on that ROI's coordinates and resolution (thinking about this, maybe it should take a channel too, so we don't duplicate between Red and Green ROIs?) This method should be called from read_functional_data, when determining the imaging plane for the acquisition named ROI_NNN_Red/Green.

To have full access to the header information (not used yet, but
it will be to check for variable resolution/shape), we need to
create the reader earlier on, since we don't store the header
object anywhere. This also lets us use the reader across
different methods if required (e.g when adding the acquisitions
for each ROI).
Right now the Readers for newer versions don't do anything
different, but some of the structure is there to allow them to.
Copy link
Collaborator

@jonc125 jonc125 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good to me so far!

@@ -184,6 +185,8 @@ def rel(file_name):
self.read_user_config()
header_fields = self.parse_experiment_header_ini(rel('Experiment Header.ini'))
speed_data, expt_start_time = self.read_speed_data(rel('Speed_Data/Speed data 001.txt'))
if expt_start_time is None:
expt_start_time = pd.Timestamp.now() # FIXME get experiment start time from header if not present in speed data
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an interesting one, because there seems to be some inconsistency in the data! For instance, LabViewData2020/200225_19_09_16 has in the header

Experiment Date Time = "25/02/2020 18:22:17"

but the folder name implies a later start, as does the speed data, and all 3 differ. (Speed data starts:

25/02/2020	19:12:13.160587	4	0.00	332.316

initial_time = speed_data.index[0] - initial_offset
return speed_data, initial_time
else:
return None, None # TODO throw warning here!!
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we need to warn if the file isn't there. Except that if we have neither speed data nor a modern header we need to bail out with no start time.

file_path = rel('Single cycle relative times.txt')
assert os.path.isfile(file_path)
timings = LabViewTimingsPre2018(relative_times_path=file_path,
roi_path=roi_path,
dwell_time=self.imaging_info.dwell_time / 1e6)
elif self.labview_version is LabViewVersions.v231:
elif self.labview_version is LabViewVersions.v231 or self.labview_version is LabViewVersions.v300:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't just else be enough here?

src/silverlabnwb/rois.py Show resolved Hide resolved
assert len(roi_row_idx) == 1
roi_row_idx = roi_row_idx[0]
resolution = nwb_file.roi_data.resolution[roi_row_idx]
z_plane = nwb_file.nwb_file.processing['Acquired_ROIs'].get("ImageSegmentation")[plane_name].imaging_plane
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we're still assuming here that each ROI falls entirely within an existing Z plane?

this can (at least currently) only happen in post-2018 versions (2.3.1
and 3.0.0) of LabView, because the latter have trial information in the
header files. In the pre-2018, the speed data is the only way a
deduction of trial times is possible, so in this case we can't do
without a speed data file (and a ValueError is raised.)

This set of changes involves transferring some calculations from the
epochs (which contain the speed data) to the trials in the file. A
side-effect of this are small signature changes, due to the 1e-9 small
difference in start and stop times between epoch and trial.
trial and epoch start_time and end_time are now consistent with each
other.
tests fail for pointing mode sample data (but not for the miniscan).
the shapes and datatypes etc all match up, but the signature values are
different. This is expected because it seems, at least for the pointing
mode, the values in the TDMS file are written as
(roi1_cycle0,...,roi10_cycle0, roi1_cycle1,...,roi10_cycle1,
roi1_cycle2,...) and we are reading the first cycles_per_trial values
for roi0. But I don't understand why the miniscan test passes?

However, this code does create a reasonable-looking variable roi file for real life
data (more tests needed in the future for this though)

The failure is related to the order of things inside the TDMS file, and
the code executed in the two inner loops (channel and roi) of
_write_roi_data. But I am struggling to figure out how one would do this
well.

Maybe we need to handle legacy and variable roi code completely
separately, although it would be more elegant if we didn't have to do
this.
Copy link
Contributor Author

@ageorgou ageorgou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the change from epochs to trial as the "main" interval type. Is it the case that there will always be at least one trial, even if we don't have speed data?

epoch_name=trial,
start_time=start_time if i == 0 else start_time + 1e-9,
stop_time=stop_time + 1e-9,
timeseries=[speed_data_ts])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this argument be an empty list? If that doesn't cause an error, it seem simpler to change this argument (e.g. timeseries=[speed_data_ts] if speed_data_ts is not None else []) rather than switching to trials as the "main" interval type.

# We also record exact start & end times in the trial table, since our epochs
# correspond to trials.
self.nwb_file.add_trial(start_time=start_time, stop_time=stop_time)
self.nwb_file.add_trial(start_time=start_time if i == 0 else start_time + 1e-9, stop_time=stop_time + 1e-9)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I didn't remember these were different! I wonder if there was a reason that was the case.

@@ -711,14 +712,18 @@ def read_functional_data(self, folder_path):
# https://github.com/NeurodataWithoutBorders/pynwb/issues/673
for roi_num, roi_ind in self.roi_mapping[plane_name].items():
roi_name = 'ROI_{:03d}'.format(roi_num)
roi_dimensions = plane[roi_ind, 'dimensions']
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is used below?

Comment on lines 716 to 719
if 'pixels_per_miniscan' in self.roi_mapping.keys() and 'num_lines' in self.roi_mapping.keys():
all_roi_dimensions[roi_num] = [int(plane[roi_ind, 'pixels_per_miniscan']), int(plane[roi_ind, 'num_lines'])]
else:
all_roi_dimensions[roi_num] = plane[roi_ind, 'dimensions']
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what the intention is here. self.roi_mapping maps planes to another dictionary, so I don't think pixels_per_miniscan will ever be a valid key.

@@ -985,7 +989,7 @@ def add_rois(self, roi_path):
organised by ROI number and channel name, so we can iterate there. Issue #16.
"""
self.log('Loading ROI locations from {}', roi_path)
roi_data = self.roi_reader.read_roi_table(roi_path)
self.roi_data = self.roi_reader.read_roi_table(roi_path)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do we gain by making this an attribute instead of a local variable? Is it intended to be used in other methods?

nwb_file.add_imaging_plane(
name=new_plane_name,
description='Imaging plane for variable size ROI nr. ' + str(roi_number),
origin_coords=z_plane.origin_coords,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use the real origin coordinates of the ROI here? (x_start, y_start, z_start)

name=new_plane_name,
description='Imaging plane for variable size ROI nr. ' + str(roi_number),
origin_coords=z_plane.origin_coords,
grid_spacing=z_plane.grid_spacing*resolution
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This always trips me up! So if resolution = 3, does that mean that the pixels are 3 times as close, or that the distance between them is 3 times larger (than the "base" case)? This suggests the latter, right?

I guess we only need the resolution variable if we're adding a new plane, so it can be moved here.

@@ -660,15 +674,14 @@ def read_functional_data(self, folder_path):
self.log('Loading functional data from {}', folder_path)
assert os.path.isdir(folder_path)
# Figure out timestamps, measured in seconds
epoch_names = self.nwb_file.epochs[:, 'epoch_name']
trials = [int(s[6:]) for s in epoch_names] # names start with 'trial_'
trials = [s+1 for s in self.nwb_file.trials.id.data]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can see, we only ever care about the number of trials/epochs here, so there's a chance to simplify this a bit!

src/silverlabnwb/nwb_file.py Outdated Show resolved Hide resolved
ageorgou and others added 5 commits October 12, 2020 10:34
rois.py now stores copy of ROI.dat as a dataframe
and gives access to n_lines_in_roi, pixels_per_line and n_rois

calculation of n_lines_in_roi and pixels_per_line moved from timings.py to
rois.py

adapt timings.py formatting to accommodate variable size rois

timings tests inelegantly avoid using abstract factory pattern
(get_reader) and instantiate subclasses directly, but simpler
this way because we don't need to create synthetic header files.
Copy link
Contributor Author

@ageorgou ageorgou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! I think the main part is getting the imaging mode in the RoiReader, and checking the result for the variable shape data (manually and with tests).

"""
This method determines the number of lines and the number of pixels per line by using information
contained in x/y_start, x/y_stop, and angle_deg.
:param roi_number:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use this formatting (:+1:), we might as well explain what the parameter is! (also when overridden)

Comment on lines 102 to 103
n_pixels_per_line = 1
n_lines_in_roi = 1
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not particularly useful here, but just in case: you can also chain assignments together, like

n_pixels_per_line = n_lines_per_roi = 1

(as long as you're assigning the same number)

Comment on lines 178 to 179
n_y_pixels = int(self.roi_data['num_lines'][roi_number])
n_x_pixels = int(self.roi_data['pixels_per_miniscan'][roi_number])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, should we convert these columns to integer when we read the file?

src/silverlabnwb/rois.py Outdated Show resolved Hide resolved
@@ -976,7 +979,7 @@ def read_zstack(self, zstack_folder):
self.zstack[plane_name][channel] = group_name
self._write()

def add_rois(self, roi_path):
def add_rois(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably update the docstring and move the parts about the ROI.dat file to the RoiReader class, now that we no longer use the file directly here.

@@ -74,21 +74,44 @@ def read_roi_table(self, roi_path):
for column in self.column_mapping:
if column not in self.type_mapping:
self.type_mapping[column] = np.float16
roi_data = pd.read_csv(
self.roi_data = pd.read_csv(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's define this in the constructor so we can tell if we've read the file yet or not. See get_n_rois for an example of why. (I thought the linter would complain!)


def get_row_attributes(self, roi_row):
return {
field: getattr(roi_row, field)
for field in self.column_mapping.values()
}

def get_n_rois(self):
return len(self.roi_data['roi_index'])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we haven't read the file yet and we call this method, we'll get an error because self.roi_data is not defined. If we initialise it in the constructor with e.g. None, we could do instead:

return 0 if self.roi_data is None else len(self.roi_data['roi_index'])

(we could also drop the column and get the len of the whole data frame)

This gives a small bit of protection, although you could argue that we want to get an error if we don't read the file first...

Comment on lines 58 to 59
n_lines_per_cycle = np.sum(self.roi_reader.get_lines_pixels(roi_index)[0]
for roi_index in np.arange(0, self.n_rois))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! The plain Python sum and range should also work here.

for roi_index in np.arange(0, self.n_rois):
pixel_time_offsets_by_roi[roi_index] = []
n_lines_in_roi, n_pixels_per_line = self.roi_reader.get_lines_pixels(roi_index)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to avoid calling get_lines_pixels twice for each ROI (we already did in line 58), we could hold the results in an array. I don't imagine it's particularly expensive though.

Without the copy, creating an instance of a RoiReaderv300 would
change the column list (and other attributes) of any existing
readers, even the ClassicalRoiReader. This isn't an issue at the
moment because we only ever use one reader for the whole runtime,
but there's no reason it should.
Split the size lookup in two steps: for pointing mode, it's shared
for all RoiReaders, but for miniscans the method is different.

Also, moved the new behaviour (using the num_lines and
pixel_per_miniscan columns) to the base v3.0.0 class, rather than
the subclass for variable shapes, since the new columns are used
in the new LabVIEW setup regardless of whether variable ROIs are
allowed (I think!)
Partly reverts 4fb96 to simplify the code a bit. Document the
difference in how we store times between trials and epochs.
Also set `trial_times` in all cases, whereas before it was only set
for newer versions. This lets us use it throughout the file.
- It seems the line numbers and pixels can be read directly from
the corresponding column, regardless of the angle.
- We still need to add the epoch name even if there is no speed
data (should have been part of 8db9623).
- `get_roi_imaging_plane` should maybe change so that it takes a
0-based index, like the other methods (or the other way around).
We should make a smaller version of this. I edited the signature
file manually after renaming the input directory, so there may be
some inconsistencies that still need correcting!
num_lines was changed by programmatic manipulation, without running
the tests. If the tests pass for these files, I'm confident that
the signatures for the remaining ones can be overwritten by
generating them again with the latest version of the code.
- Pandas will stop liking indexing with [:, np.newaxis], and we
need to get the underlying numpy array instead.
- np.sum will change behaviour when a generator is passed; we can
use the builtin sum instead.
- The syntax for accessing the channel data in nptdms has changed.
- Mapping has been in abc.collections for a while now, and the old
import will stop working in Python 3.9.
ageorgou and others added 2 commits November 13, 2020 12:36
As with some other datasets, on some platforms these are stored as
32-bit and on others as 64-bit.
using the convert_num_lines_to_int.py script on
the files where the sigs weren't update yet
@codecov-io
Copy link

codecov-io commented Nov 13, 2020

Codecov Report

Merging #74 (8115f8f) into master (358c551) will increase coverage by 1.48%.
The diff coverage is 85.59%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #74      +/-   ##
==========================================
+ Coverage   59.32%   60.80%   +1.48%     
==========================================
  Files          17       19       +2     
  Lines        1866     2008     +142     
  Branches      275      292      +17     
==========================================
+ Hits         1107     1221     +114     
- Misses        716      738      +22     
- Partials       43       49       +6     
Impacted Files Coverage Δ
src/silverlabnwb/signature.py 56.73% <ø> (ø)
tests/test_labview_import.py 75.71% <50.00%> (-0.76%) ⬇️
src/silverlabnwb/rois.py 78.31% <78.31%> (ø)
src/silverlabnwb/nwb_file.py 90.21% <82.43%> (-1.93%) ⬇️
src/silverlabnwb/header.py 88.96% <87.50%> (+0.32%) ⬆️
src/silverlabnwb/metadata.py 89.04% <100.00%> (ø)
src/silverlabnwb/timings.py 98.24% <100.00%> (+4.39%) ⬆️
tests/test_header.py 100.00% <100.00%> (ø)
tests/test_rois.py 100.00% <100.00%> (ø)
tests/test_timings.py 100.00% <100.00%> (ø)
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 358c551...8115f8f. Read the comment docs.

@alessandrofelder alessandrofelder marked this pull request as ready for review November 18, 2020 09:28
@ageorgou
Copy link
Contributor Author

Latest changes look good. Making a note to check that everything works with variable-shape ROIs as it's not yet tested on Travis.

@ageorgou
Copy link
Contributor Author

I suggest we merge this (without creating a new release yet). Have created #80 and #81 for the remaining issues/questions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Variable-shape ROIs
4 participants