Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NsxFile() incorrectly loads data when indexing for less than a full file #6

Open
tlancaster-blackrock opened this issue Feb 10, 2022 · 5 comments

Comments

@tlancaster-blackrock
Copy link
Contributor

tlancaster-blackrock commented Feb 10, 2022

Issue originally raised by Lukas Kunz:

It seems to me that this function reads the data with flipped dimensions. Hence, the output is not channels x time, but rather time x channels. This is not really problematic when reading all channels at once, because one can then simply transpose the data array to obtain the correct dimensions. However, when reading only one channel (e.g., nsxFileContent = nsxFileObj.getdata(1, 0)), then incorrect data is returned because getdata() internally seems to index the data as if they had the correct dimensions (channels x time), although they do not, as described above (time x channels).

  • identify ns5 file
    microFile = glob.glob(sessions[iSess] + '\Micro*.ns5')
  • read the ns5 file
    nsxFileObj = NsxFile(microFile[0])
    nsxFileContent = nsxFileObj.getdata('all', 0)
    nsxFileObj.close()
  • obtain the raw data in the ns5 file, organized as N-by-M array with
  • N = number of channels and M = number of samples
    microData = nsxFileContent['data'][0].T

The python utilities only load the correct data when loading the entire file at once (which can be a problem when the ns5 file is very large).
Loading only a subset of channels (1, 2, ..., n-1) will lead to incorrect data based on internal incorrect indexing (the data for channel i will represent a mixture of data from various channels and various time points). This cannot be corrected by the user post-hoc.

@Sam-NT
Copy link

Sam-NT commented Jan 31, 2023

Hi @tlancaster-blackrock , is there an update for when this issue might be patched? Just spent a day tracking down this issue to report it before checking Github to find it's already been posted

Thanks for the help!

@cboulay
Copy link
Contributor

cboulay commented Apr 18, 2023

@Sam-NT , please see if the latest push to #22 fixes this for you.

@Sam-NT
Copy link

Sam-NT commented May 1, 2023

Hey @cboulay, I just gave it a test and found some differences between the latest commit on the main branch (which I was using previously) and the latest commit to #22. For reference, here are the commands I ran:

from Python-Utilities import brpylib
fpath = 'test.ns5'
nsx = brpylib.NsxFile(fpath)
all_data = nsx.getdata()
ch1_data = nsx.getdata(elec_ids=[1])

Notable differences:

  1. It seems the output dimensions of brpylib.NsxFile.getdata() are transposed (i.e. main has dimensions (samples, num_ch) but in Ptp fix #22 the dimensions are (num_ch, samples))
  2. Both all_data and ch1_data computed via Ptp fix #22 are not split into multiple segments, but the same file run through the latest commit of main splits the data into multiple segments
    • Note: I think the NS5 file with which I am testing should have multiple segments, as it was recorded via two synchronized NeuroPort systems
    • It is critical for those of us using synchronized NSPs that we know which timepoints are synchronized and which are not
  3. In Ptp fix #22, ch1_data appears to contain the same data as all_data. In other words, even though I specify elec_ids=[1] as an argument to the getdata() function for ch1_data, the function still returns the full set of data
  4. The data contained in all_data and ch1_data output by getdata() in Ptp fix #22 is not the same as the data contained in any of the segments of all_data output by getdata() in the latest commit of main

It's possible I am missing some updated instruction for how to use these functions, so please let me know if I am misusing them. I hope the feedback helps!

@cboulay
Copy link
Contributor

cboulay commented May 1, 2023

Very helpful, thanks!

I think I can fix 1 and 3 easily. However, I might need a different test file for the other 2.
@Sam-NT , would you be willing to provide a pair of test files?

@jmgoodman
Copy link

jmgoodman commented May 11, 2023

I am having similar issues.

In #22, the results from querying individual channels simply gives me the entire array instead.

In the latest commit of main, querying individual electrodes gives me an appropriately sized array, but it produces the same result no matter what electrode I select, and this result does not correspond to the data from the electrode I queried. The following test case demonstrates this:

from brpylib import NsxFile
import numpy as np

chan0 = 1
chan1 = 5

# test data both when pulling from two channels separately and when querying the full data
f = NsxFile('sampleData.ns6')
dfull = f.getdata()
dsingle0 = f.getdata(elec_ids=[chan0])
dsingle1 = f.getdata(elec_ids=[chan1])
f.close()

y = dfull['data'][0]
y0 = dsingle0['data'][0]
y1 = dsingle1['data'][0]

# these should be channel-matched (and thus yield 0 difference everywhere)
dy0 = y[:,chan0] - y0[:,0]
dy1 = y[:,chan1] - y1[:,0]

# these should be entirely different channels (and thus yield nonzero almost everywhere)
dysingle = y0 - y1

# anti-test for equivalence of y0 and y1 (as in, it "passes" when it exhibits this pathological behavior)
assert(np.all(dysingle == 0))

# anti-tests for non-equivalence of y0 and y1 with the corresponding columns of y (as in, it "passes" when it exhibits this pathological behavior)
assert(not np.all(dy0==0))
assert(not np.all(dy1==0))
print('tests failed successfully')

# test for the non-equivalence of the two channels in y
assert(not np.all(y[:,chan0] - y[:,chan1] == 0))
print('last test passed successfully')

Sorry for being a pest about this issue if it's something that's already quite acutely on your radar!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants