Skip to content

Conversation

@zm711
Copy link
Member

@zm711 zm711 commented Oct 23, 2024

This is a light changing of the error messaging in the core extension file along with some docstring touch ups for numpydoc.

Related to #3495

@zm711 zm711 added the documentation Improvements or additions to documentation label Oct 23, 2024
Copy link
Member Author

@zm711 zm711 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fix my own typo haha.

Copy link
Collaborator

@h-mayorquin h-mayorquin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Collaborator

@JoeZiminski JoeZiminski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! am currently updating some old teaching code to sorting analyzer so appreciating the utility of such additions 🙏

AnalyzerExtension that select some random spikes.
AnalyzerExtension that select somes random spikes.
This allows for a subsampling of spikes for further calculations and is important
for managing that amount of memory and speed of computation in the analyzer.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is very nice addition, could 'for further calculations' and the below ' This will be used by the waveforms/templates extensions.' be combined and emphasise that this choice could potentially have important consequences for results. e.g. 'The samples spikes will be used for calculating waveforms and templates and as such determine many downstream parameters (e.g. quality metrics). Therefore it is important that spikes a sufficient number of spikes are sampled and that these are distributed evenly through the dataset'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm torn. I was just trying to clarify. I prefer

one line short summary
params
return
examples
notes

And do that big explanation in the notes. The statement you're making I would put into notes, but I didn't want to change stuff too much although since I made the diff it's my fault you looked.

The real point of this PR is just to improve error messaging. I think we could improve further the docstrings etc in a separate PR (and this is a reminder for me to focus on my task so we don't get bogged down on other stuff for a PR with a specific purpose :) ).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds good, happy to leave this for another day. It's easier said than done to stick religiously to one change in a PR, there are always some appealing changes to be made that inevitably catch your eye!

error_msg = (
f"You have entered an operator {operator} in your `operators` argument which is "
f"not supported. Please use any of ['average', 'std', 'median', 'mad'] instead."
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice, in that case discard above comment 😆

]
if len(bad_operator_list) > 0:
raise ValueError(
f"Computing templates with operators {bad_operator_list} requires the 'waveforms' extension"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting I assumed waveforms always needed to be computed for templates, what other ways ways of doing it are there?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should dig into the templates code. Alessio + Sam ( I think it was just them but maybe Heberto too) developed an accumulator method that uses the waveforms to make an average without saving them so you would

analyzer.compute(['random_spikes', 'templates']) and it will read the waveforms from random spikes while making the templates then discard them to save on memory. I think the only think it breaks would be doing PCA later, but it is way less storage intensive since you don't save all the extra waveforms. It is limited in the types of operators it can do though as you can see from the error.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep this accumulator was a great stuf during the analyzer dev.
the only drawback is that MAD cannot be computed that way. but saving ram and disk space is cool.
you can see the idea in waveform_tools.py
every worker accumulate snippet in parralel and the sum + divide is done at teh end.
this is quite fast

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cool! will check that out

bad_operator_list = [
operator for operator in self.params["operators"] if operator not in ("average", "std")
]
if len(bad_operator_list) > 0:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe if any(bad_operator_list) is slightly more readable (?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is stylistic. I don't use any that often in code I think checking the length of the list is just my default. I don't know what others think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's clear enough :)

Co-authored-by: Joe Ziminski <55797454+JoeZiminski@users.noreply.github.com>
@samuelgarcia
Copy link
Member

ok for me.
@alejoe91 you want to have a look ?

@zm711 zm711 requested a review from JoeZiminski October 29, 2024 12:29
@alejoe91 alejoe91 dismissed JoeZiminski’s stale review November 4, 2024 15:10

Joe's comments have been addressed

@alejoe91 alejoe91 merged commit 5f81566 into SpikeInterface:main Nov 4, 2024
15 checks passed
@zm711 zm711 deleted the error-get-data branch November 4, 2024 20:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

documentation Improvements or additions to documentation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants