Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Trialized time series faceting #232

Conversation

h-mayorquin
Copy link
Contributor

@h-mayorquin h-mayorquin commented Oct 31, 2022

The idea is to add a widget for time series that can show trialized data both by filtering (selecting only data where a column -or more- are equal to one value) and faceting (showing the behavior across a grid when filtering for some values).

So far the widget works well as a stand alone:

image

But does not render properly when called from within:
nwb2widget(nwb)

Then there is no error but it never renders:
image

@bendichter

@CodyCBakerPhD
Copy link
Collaborator

CodyCBakerPhD commented Oct 31, 2022

Something I want to add for my data

  • faceting over data columns; it seems fast enough, would be nice way to visualize multiple ROI's at once

Something I need to add for my data

  • TimeInterval table selector; I have multiple tables, and may not always have a trials table

Something to discuss adding would be

  • observers for changes to data column selection; right now I have to push the render button each time, but if I want to rapidly scan through a bunch of ROI's this gets a bit annoying

A widget I'll make that harnesses parts of this (separate PR)

  • a 'heatmap' view of multiple ROI's plotted over the space of the image/pixel masks

@CodyCBakerPhD
Copy link
Collaborator

Also just noting that when you use the plotly interact with the faceting it adjusts each plot; is there maybe an option to control that? Or is it just because they all 'share' the x-axis? (basically I would want different x-axis scales each spanning the data range for each condition)

plotly_facet_all_axes_affected.mp4

@bendichter
Copy link
Collaborator

@CodyCBakerPhD

Also just noting that when you use the plotly interact with the faceting it adjusts each plot; is there maybe an option to control that? Or is it just because they all 'share' the x-axis? (basically I would want different x-axis scales each spanning the data range for each condition)

I don't understand

@CodyCBakerPhD
Copy link
Collaborator

I don't understand

If you watch the video, as I attempt to zoom into one of those tiny data plotting regions for a single facet, it also zooms into that region for all the other facets. Likewise, from a zoomed out perspective, looking at all the data for one facet might just leave a tiny blip for the data on another facet. For those cases I'd prefer the plot to span the axis for each facet in a more reasonable way and then indicate the range on the x-ticks

I can also see why you'd want to be able to view everything aligned to the same x-axis for consistency, so maybe this could be a 'lock-axis' option or something

@bendichter
Copy link
Collaborator

bendichter commented Oct 31, 2022

@h-mayorquin it doesn't look like you have a way of controlling the time range. It looks like in @CodyCBakerPhD 's example you are plotting data for the next 100 seconds? That's way too long. There should be text boxes that allow you to control the time before and after the time of alignment, similar to the PSTH widget

@h-mayorquin
Copy link
Contributor Author

@h-mayorquin it doesn't look like you have a way of controlling the time range. It looks like in @CodyCBakerPhD 's example you are plotting data for the next 100 seconds? That's way too long. There should be text boxes that allow you to control the time before and after the time of alignment, similar to the PSTH widget

Yes, in the Murthy conversion the trials are very short so is OK. I will see how to add this for the general case though.

Right now the trials are aligned to the beginning of the trial. What do you suggest doing when there is no data before. As in, here there is only data within the trials and they are non-contiguous? My default is returning none in this case.

@h-mayorquin
Copy link
Contributor Author

I will take a look into the x-axis locking while zooming on faceting thing @CodyCBakerPhD

@bendichter any idea why would this not render from within the widget accordion but would render on its own?

@CodyCBakerPhD
Copy link
Collaborator

any idea why would this not render from within the widget accordion but would render on its own?

I've seen that sometimes if the figure isn't a go.FigureWidget or isn't properly otherwise embedded in a widgets.Output box, I/we can play around with it tomorrow

@h-mayorquin
Copy link
Contributor Author

@h-mayorquin it doesn't look like you have a way of controlling the time range. It looks like in @CodyCBakerPhD 's example you are plotting data for the next 100 seconds? That's way too long. There should be text boxes that allow you to control the time before and after the time of alignment, similar to the PSTH widget

OK, I did this in the last commit.
@bendichter
Thinking more about this.
Before this widget ploted the traces within the trial, that is, the traces were plotted from (start_time, to stop_time) . I thought this made sense because start_time and stop_time are the meaningful interval for the trial (after all, that's the reason for them to be in the trials table).

I thought that the control that made sense was:

(start_time + "user_selected_time", stop_time + "another_user_selected_time)`:

Where both the user selected times (controlled by widgets) could be positive or negative.

But looking at the alignment control options in the PSTH:
image

It seems that the plotting interval is:
("time_chosen_for_alignment" + "user_selected_time", "time_chosen_for_alignment" + "another_user_selected_time)`:

Where the "time_chosen_for_alignment" is usually the start_time but it can be any other column that follows the convention of ending in _time.

Why does this make more sense than plotting over the whole trial? In what sense was the plotting interval in @CodyCBakerPhD example way too long you mentioned above?

@h-mayorquin
Copy link
Contributor Author

@CodyCBakerPhD
I addressed your in person comment of disabling the filtering selection whe not usted in the last commit.
I think I can disable the x-locking behavior but I don't think it is a good idea to disable it in general. Plotly documentation advice against it and I guess they made it a default for a reason:
https://plotly.com/python/facet-plots/

We can have a checkbox that controls for whether the behavior is active or not if you and @bendichter think that is desirable.

@CodyCBakerPhD
Copy link
Collaborator

CodyCBakerPhD commented Nov 3, 2022

Plotly documentation advice against it and I guess they made it a default for a reason:
https://plotly.com/python/facet-plots/

We can have a checkbox that controls for whether the behavior is active or not if you and @bendichter think that is desirable.

Yeah an optional toggle is all I'm asking for there. Common axes are a good default, but in cases where some slices are very thin on certain facets it can make more sense to try to expand different axes to note the trends one might see on the zoomed-in scale

@h-mayorquin
Copy link
Contributor Author

The widget so far:

showcase.mp4

@CodyCBakerPhD
Copy link
Collaborator

CodyCBakerPhD commented Nov 3, 2022

@h-mayorquin If you want more examples to test/develop with, here's a snippet for using my latest files (recommend using on DANDI Hub for optimal bandwidth speeds)

import h5py
import fsspec
from fsspec.implementations.cached import CachingFileSystem
from pynwb import NWBHDF5IO
from nwbwidgets.timeseries import TrializedTimeSeries

s3_url = "https://dandiarchive.s3.amazonaws.com/blobs/297/fd6/297fd6b9-4c28-48c4-8c86-e9ef2a0fb07d"
cfs = CachingFileSystem(
    fs=fsspec.filesystem("http"),
    cache_storage="/home/jovyan/fsspec_cache",  # Local folder for the cache
)
file_system = cfs.open(s3_url, "rb")
file = h5py.File(file_system)
io = NWBHDF5IO(file=file, load_namespaces=True)
nwbfile = io.read()

# Choose series
#roi_response_series = nwbfile.processing["ophys"]["DfOverF"].roi_response_series["NeuronDfOverF"]
roi_response_series = nwbfile.processing["ophys"]["Fluorescence"].roi_response_series["NeuronFluorescence"]

# Choose table
#target_table = nwbfile.processing["behavior"]["SwimIntervals"]
target_table = nwbfile.processing["behavior"]["ActivityStates"]

TrializedTimeSeries(time_series=roi_response_series, trials_table=target_table)

@h-mayorquin
Copy link
Contributor Author

h-mayorquin commented Nov 9, 2022

OK, I had to do some changes but now it renders with nwb2widget(nwb):

@bendichter
Copy link
Collaborator

@h-mayorquin great! What did you need to change?

@h-mayorquin
Copy link
Contributor Author

@h-mayorquin great! What did you need to change?

There was a silent error because the nwb2widget function requires neurodata_vis_spec items to have a neurodata_vis_spec as an argument if they are callable. I added a router function (route_trialized_time_series) with this argument.

@h-mayorquin
Copy link
Contributor Author

@CodyCBakerPhD I added a toggle to control the x-locking behavior in the last commit. Could you test it to confirm that is the behavior that you wanted?

@h-mayorquin h-mayorquin marked this pull request as ready for review November 9, 2022 19:28
)
else:
return neurodata_vis_spec[NWBDataInterface](df_over_f, neurodata_vis_spec)
return neurodata_vis_spec[NWBDataInterface](df_over_f, neurodata_vis_spec)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a problem here when we make the neurodata_vis_spec[RoiResponseSeries] an OrderedDict to include the TrializedTimeSeries as an option (see the diff in the view.py module).

Because show_df_over_f is a callable the function nwb2widget expects this output to be a visualization (not an ordered dict) and a naive extension of the visualization specification generates an error. The solution I implemented was to roll back the show_df_over_f to return the visualization for the whole Container/DataInterface (see the else in the diff representing the old code) and then the visualization can be properly display as a root of the lazy_tab for that module. That is how the widget is working right now.

The other option was to modify nwb2widget to handle the case when callables return OrderedDict but that seemed like a larger modification given that nwb2widget is such a central function and this might have non-tested side effects in other visualizations. I opted for the simple solution here as that only touches the visualizations concerning the RoiExtractors and DfOverF objects.

return data.flatten()


def trialize_time_series(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you use this function?

def align_by_time_intervals(

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of the function implemented there is to build the dataframe into the form where the faceting functionality of plotly can be used.

There is a place inside trialized_time_series where the data from trials is extracted and also the timestamps. For that part, after your last suggestion above, I took a look a the functions in that utils module for timeseries. I don't think they fit that well here without introducing more unnecessary computation.

Basically, right now I have a vectorized bisect function (as provided by numpy) which I use to extract the set of trialized indexes and then I use them on both the data and the timestamps. The functions in the utils module are non vectorized and they return the data in a list without the indexes which I need for the timestamps. Using align_by_time_intervals, for example, would reduce the performance here both by losing the vectorization and then by requiring some more computation to get the indexes of the timestamps.

I think the generalization would be to have a general vectorized function that return the trialized indexes. That hypothetical function then can be used both here and inside bisect_timeseries_by_times to avoid the per trial bisection there.

What do you think?

return empty_figure


def calculate_moving_average_over_trials(df, moving_average_window):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems specific to your widget. Maybe it would be better as a static method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanted to separate the logic of plot creation (here in the function build_faceting_figure) from the logic of the widget. In my view this separation of concerns (composition) makes it easier to modify and test the plotting function. Reason being:

  1. It is easier for someone who does not know the function to get the plotting function and just use it by itself to understand its behavior. In opposition to this, in the AlignMultiTraceTimeSeriesByTrialsVariable you need to go very deep in the function to understand what is the plotting function.
  2. The plotting function can be tested or modified with minimal modification to the widget class. I think keeping the points of contact between the plotting function and all the other widget class responsibilities small leads to better design.

Basically I view the widget class a place with the following responsabilities:

  1. Data loading from the appropriate neuro data type.
  2. Control definition (maybe can be done with composition using your concept of foreign controlers).
  3. Handling the interaction between controls and the (external) plotting function.

What do you think? Are there downsides that I am not seeing for this separation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That said, I can create a plotting class and make all the methods that are floating now by themselves methods of that class to give the code more coherence.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better way to organize would be to separate calculation from plotting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this got me thinking:
#239

return df_sort


def add_moving_average_traces(figure, df, facet_col, facet_row):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems specific to your widget. Maybe it would be better as a static method.

return figure


def build_faceting_figure(df, facet_col, facet_row, data_label="data", trial_label="trial"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems specific to your widget. Maybe it would be better as a static method.

self.figure_widget.update_xaxes(matches=matches)


def route_trialized_time_series(time_series: TimeSeries, neurodata_vis_spec=None, **kwargs):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no routing going on here..

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would also be good if you subclassed TrializedTimeSeries to have proper labels for specific modalities

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking that would be the role of the routing function. Initializing the visualization -for lack of a better term- to match the neurodata type that is specified for it in the visualization spec dictionary. Besides the column of the data that we are visualizing, what other labels do you think should be changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is done in the last commit and in the routing function.

@bendichter
Copy link
Collaborator

@h-mayorquin the sorting looks right. Good thinking using natsort. Could you show a screenshot that displays this working correctly?

@h-mayorquin
Copy link
Contributor Author

@h-mayorquin the sorting looks right. Good thinking using natsort. Could you show a screenshot that displays this working correctly?

Here is some recording of it working on the test data that I have been using:

sharing.mp4

@h-mayorquin
Copy link
Contributor Author

h-mayorquin commented Dec 6, 2022

@bendichter
I ordered the categories so they appear ordered when facetting.
I also added some more machinery to the routing function so when the time series is an RoiResponseSeries the choice over column says ROI and it says electrode when it is an ElectricalSeries in the commit just below.

image

@bendichter bendichter merged commit 29b8d06 into NeurodataWithoutBorders:master Dec 6, 2022
@CodyCBakerPhD CodyCBakerPhD deleted the trialized_time_series_faceting branch December 6, 2022 16:05
@h-mayorquin h-mayorquin mentioned this pull request Dec 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants