Merged
31 changes: 23 additions & 8 deletions neo/rawio/spikeglxrawio.py
@@ -246,12 +246,20 @@ def _parse_header(self):
     chan_name = local_chan
     chan_id = f"{stream_name}#{chan_name}"
     event_channels.append((chan_name, chan_id, "event"))
-    # add events_memmap
-    data = np.memmap(info["bin_file"], dtype="int16", mode="r", offset=0, order="C")
-    data = data.reshape(-1, info["num_chan"])
-    # The digital word is usually the last channel, after all the individual analog channels
-    extracted_word = data[:, len(info["analog_channels"])]
-    self._events_memmap = np.unpackbits(extracted_word.astype(np.uint8)[:, None], axis=1)
+    # Create memmap for digital word but defer unpacking until needed
+    # The digital word is stored as the last channel, after all the individual analog channels
+    # For example: if there are 8 analog channels (indices 0-7), the digital word is at index 8
+    num_samples = info["sample_length"]
+    num_channels = info["num_chan"]
+    data = np.memmap(
+        info["bin_file"],
+        dtype="int16",
+        mode="r",
+        shape=(num_samples, num_channels),
+        order="C",
+    )
+    digital_word_channel_index = len(info["analog_channels"])
+    self._events_memmap_digital_word = data[:, digital_word_channel_index]
     event_channels = np.array(event_channels, dtype=_event_channel_dtype)

     # No spikes
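
For intuition, a minimal standalone sketch (stand-in array and made-up size, not code from the PR) of what the deferral buys: slicing one memmap column stays lazy, while np.unpackbits materializes one uint8 per digital line per sample.

import numpy as np

# Stand-in for the memmap column; a real memmap slice would stay on disk.
digital_word = np.zeros(1_000_000, dtype=np.int16)

# Unpacking allocates an (n_samples, 8) uint8 array in RAM:
unpacked = np.unpackbits(digital_word.astype(np.uint8)[:, None], axis=1)
print(unpacked.shape, unpacked.nbytes)  # (1000000, 8) 8000000 -> 8 bytes per sample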
@@ -342,7 +350,9 @@ def _get_event_timestamps(self, block_index, seg_index, event_channel_index, t_s
     info = self.signals_info_dict[0, "nidq"]  # There are no events that are not in the nidq stream
     dig_ch = info["digital_channels"]
     if len(dig_ch) > 0:
-        event_data = self._events_memmap
+        # Unpack bits on-demand - this is when memory allocation happens
+        event_data = np.unpackbits(self._events_memmap_digital_word.astype(np.uint8)[:, None], axis=1)
Contributor:
Should we cache this?

Contributor:
I think I agree with this. We were previously caching this, so if we are doing the unpacking we should cache, right?

Contributor Author:
We don't cache any other call to data. Why would this be different?

We were not caching before by design; we were just loading the data at init because the PR came from someone outside the team, I think.

I prefer my resources to be freed outside of the call to get_data. How are you thinking about it?

Contributor:
I think that's a fair point. How expensive is unpackbits? I don't know it that well. It wouldn't be too hard to say:

if self.event_data is not None:
    event_data = self.event_data
else:
    self.event_data = event_data = np.unpackbits(self._events_memmap_digital_word.astype(np.uint8)[:, None], axis=1)

I guess the ultimate question is whether this is a cheap operation: if it is, it doesn't hurt to do it in the init, it's just contrary to our strategy. If it is expensive (in time or memory) we should move it. If it is expensive in memory, then caching is expensive. If it is expensive in time, and possibly something a user would want multiple times, it would be nice to have a cache.

I'm curious to hear Sam on this one. I could be convinced not to cache it.

Contributor:
Why the giant allocation? Is the memory actually being used to store the events? I just don't know what events look like for this reader.

Contributor:
My question was more:

  1. Is the demultiplexing that occurs with the unpackbits (plus the memory allocation) relatively slow or relatively fast, and how does it scale with the length of the recording?
  2. Does it always demultiplex to the same n_channels, such that one could estimate the memory allocation from the length of the recording, or is this variable depending on what the user does?

I think moving this out of parse_header (sorry, I said init earlier) is smart because we just need the memmap. My further question for caching, though, is about speed and memory allocation afterwards. I don't use events from the neo API at all, so it's hard for me to imagine use cases. But maybe someone would try to get the events multiple times, in which case it would be nice to have a cache if the process is relatively slow. If fast, then absolutely no caching.
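
On question 2, a back-of-envelope sketch (hypothetical rate and duration): the word is truncated to uint8, so it always demultiplexes to exactly 8 lines, and the allocation grows linearly with recording length.

sampling_rate_hz = 30_000  # hypothetical nidq rate
hours = 2
n_samples = sampling_rate_hz * 3_600 * hours  # 216_000_000 samples, the figure used in the benchmarks below
print(n_samples * 8 / 1e9)  # ~1.7 GB for the unpacked uint8 array, independent of user settings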

Contributor Author (@h-mayorquin, Oct 9, 2025):

The general question is whether we cache data requests in general. There is a trade-off between memory allocation (the allocation remains after the data request, which is bad) and making the next request faster. So this is the lens through which to discuss it, I think:

To my surprise, unpackbits turned out to be more expensive than I expected. However, it's not unusually costly compared to other operations we perform, such as demultiplexing signals (see Intan) or scaling them (see SpikeInterface's return_scaled). In those cases we don't use caching, so I think we should be consistent: either apply caching across all of them or not use it at all.

Here are some performance comparisons:

echo "=== np.unpackbits ===" && python3 -m timeit -n 3 -r 3 "import numpy as np; d = np.random.randint(-32768, 32767, 216_000_000, dtype='int16'); np.unpackbits(d.astype(np.uint8)[:, None], axis=1)" && \
echo "=== bitwise_and ===" && python3 -m timeit -n 3 -r 3 "import numpy as np; d = np.random.randint(-32768, 32767, 216_000_000, dtype='int16'); np.bitwise_and(d, 1) > 0" && \
echo "=== stim decode ===" && python3 -m timeit -n 3 -r 3 "import numpy as np; d = np.random.randint(-32768, 32767, 216_000_000, dtype='int16'); mag = np.bitwise_and(d, 0xFF); sign = np.bitwise_and(np.right_shift(d, 8), 0x01); np.where(sign == 1, -mag.astype(np.int16), mag.astype(np.int16))" && \
echo "=== arithmetic ===" && python3 -m timeit -n 3 -r 3 "import numpy as np; d = np.random.randint(-32768, 32767, 216_000_000, dtype='int16'); d * 0.195"
=== np.unpackbits ===
3 loops, best of 3: 2.25 sec per loop
=== bitwise_and ===
3 loops, best of 3: 461 msec per loop
=== stim decode ===
3 loops, best of 3: 2.89 sec per loop
=== arithmetic ===
3 loops, best of 3: 1.27 sec per loop

I think we should merge this as it is, because it is consistent with the way the library works: there are similarly expensive uncached data requests in other places. It is OK to change this, but then we should adopt a general policy and change it consistently in all the places.

Contributor Author:
Actually, I think that rawio should remain simple and without this type of complication. People can still cache the data on the spot if they want.
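
For instance (hypothetical usage; assumes neo's standard rawio event API):

# Call once, keep the result; no reader-side cache needed:
timestamps, durations, labels = reader.get_event_timestamps(
    block_index=0, seg_index=0, event_channel_index=0
)
# reuse `timestamps` as often as needed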

Contributor:
I'm fine with this argument in general. Let's wait until tomorrow in case Sam wants to fight strongly for the other direction; if not, we merge tomorrow afternoon?

Contributor Author:
Ok.


         channel = dig_ch[event_channel_index]
         ch_idx = 7 - int(channel[2:])  # They are in the reverse order
         this_stream = event_data[:, ch_idx]
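
The 7 - int(channel[2:]) flip follows from np.unpackbits emitting the most significant bit first; a quick standalone check:

import numpy as np

# Digital line 0 is the least significant bit of the word, so after an
# MSB-first unpack it lands in column 7 (hence ch_idx = 7 - line_number):
word = np.array([0b00000001], dtype=np.uint8)
print(np.unpackbits(word[:, None], axis=1))  # [[0 0 0 0 0 0 0 1]]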
@@ -682,7 +692,12 @@ def extract_stream_info(meta_file, meta):
info["sampling_rate"] = float(meta[k])
info["num_chan"] = num_chan

info["sample_length"] = int(meta["fileSizeBytes"]) // 2 // num_chan
# Calculate sample_length from actual file size instead of metadata to handle stub/modified files
# The metadata fileSizeBytes may not match the actual .bin file (e.g., in test stub files)
# Original calculation (only correct for unmodified files):
# info["sample_length"] = int(meta["fileSizeBytes"]) // 2 // num_chan
actual_file_size = Path(meta_file).with_suffix(".bin").stat().st_size
info["sample_length"] = actual_file_size // 2 // num_chan # 2 bytes per int16 sample
info["gate_num"] = gate_num
info["trigger_num"] = trigger_num
info["device"] = device