Importing Neuroscan Evoked and Epochs-like files #12367

Open
withmywoessner opened this issue Jan 17, 2024 · 7 comments
@withmywoessner
Contributor

Describe the new feature or enhancement

Add support for importing Neuroscan .avg and .eeg files as Evoked and Epochs objects.

Describe your proposed implementation

I have written simple scripts that use struct to read the raw byte data. I could add these scripts to MNE in the form of a mne.read_neuroscan_epochs() function. Here is an example:

import os
import struct

import numpy as np


def eeg_to_ascii(
        file_name, chanlist='all', triallist='all', typerange='all',
        accepttype='all', rtrange='all', responsetype='all',
        data_format='auto'):
    """Read epoched data from a binary Neuroscan .eeg file.

    Extracts and scales the data and returns it in volts, together with the
    channel names, sample rate, epoch start time, and sweep headers.
    """

    # Check if file ends with .eeg
    if not file_name.endswith('.eeg'):
        raise ValueError("File must be a binary EEG file (.eeg).")

    if not os.path.isfile(file_name):
        raise ValueError(f"File {file_name} not found.")

    with open(file_name, 'rb') as f:
        try:
            # Read general part of the ERP header and set variables
            f.read(20)  # skip revision number
            f.read(342)  # skip 342 more bytes (362 bytes skipped in total)

            nsweeps = struct.unpack('<H', f.read(2))[0]  # number of sweeps
            f.read(4)  # skip 4 bytes
            # number of points per waveform
            pnts = struct.unpack('<H', f.read(2))[0]
            chan = struct.unpack('<H', f.read(2))[0]  # number of channels
            f.read(4)  # skip 4 bytes
            rate = struct.unpack('<H', f.read(2))[0]  # sample rate (Hz)
            f.read(127)  # skip 127 bytes
            xmin = struct.unpack('<f', f.read(4))[0]  # in s
            xmax = struct.unpack('<f', f.read(4))[0]  # in s
            f.read(387)  # skip 387 bytes

            # Read electrode configuration
            chan_names = []
            baselines = []
            sensitivities = []
            calibs = []
            factors = []
            for elec in range(chan):
                chan_name = f.read(10).decode('ascii').strip('\x00')
                chan_names.append(chan_name)
                f.read(37)  # skip 37 bytes
                baseline = struct.unpack('<H', f.read(2))[0]
                baselines.append(baseline)
                f.read(10)  # skip 10 bytes
                sensitivity = struct.unpack('<f', f.read(4))[0]
                sensitivities.append(sensitivity)
                f.read(8)  # skip 8 bytes
                calib = struct.unpack('<f', f.read(4))[0]
                calibs.append(calib)
                factor = calib * sensitivity / 204.8
                factors.append(factor)

        except struct.error:
            raise ValueError(
                "Error reading binary file. File may be corrupted or not in the expected format.")
        except Exception as e:
            raise ValueError(f"Error reading file: {e}")

    # Read and process epoch datapoints data
    data = np.empty((nsweeps, len(chan_names), pnts), dtype=float)
    sweep_headers = []

    # Constants for the sweep header size in bytes and data point size in bytes
    SWEEP_HEAD_SIZE = 13
    DATA_POINT_SIZE = 4

    with open(file_name, 'rb') as f:
        # Ensure the file pointer is at the beginning of the EEG data
        f.seek((900 + chan * 75))

        for sweep in range(nsweeps):
            # Read the sweep header
            try:
                # f.read(SWEEP_HEAD_SIZE)
                accept = struct.unpack('<c', f.read(1))[0]
                ttype = struct.unpack('<h', f.read(2))[0]
                correct = struct.unpack('<h', f.read(2))[0]
                rt = struct.unpack('<f', f.read(4))[0]
                response = struct.unpack('<h', f.read(2))[0]
                # reserved  struct.unpack('<h', f.read(2))[0]
                f.read(2)  # skip 2 bytes
                sweep_headers.append(
                    (accept, ttype, correct, rt, response, sweep))
            except struct.error:
                raise ValueError(
                    "Error reading sweep header. File may be corrupted or not in the expected format.")
            except Exception as e:
                raise ValueError(f"Error reading sweep header: {e}")

            for point in range(pnts):
                for channel in range(chan):
                    try:
                        # Read the data point as a 4-byte integer
                        value = struct.unpack('<l', f.read(DATA_POINT_SIZE))[0]

                        # Scale the data point to microvolts and store it in the data array
                        data[sweep, channel, point] = value * factors[channel]
                    except struct.error:
                        raise ValueError(
                            "Error reading data points. File may be corrupted or not in the expected format.")
                    except Exception as e:
                        raise ValueError(f"Error reading data points: {e}")

    # Convert data from microvolts to volts
    data = data * 1e-6
    # Return the data (in volts), channel names, sample rate, start time, and sweep headers
    return data, chan_names, rate, xmin, sweep_headers
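
The per-value struct.unpack loop above could probably be collapsed into one read per sweep. The following is a minimal sketch, assuming the same layout the script relies on: a 13-byte little-endian sweep header followed by pnts x chan 4-byte signed integers with channels interleaved. The names SWEEP_HEADER and read_sweep are hypothetical and not part of MNE.

```python
import io
import struct

import numpy as np

# Assumed sweep header layout (13 bytes, little-endian, no padding):
# accept (char), type (int16), correct (int16), rt (float32),
# response (int16), 2 reserved bytes.
SWEEP_HEADER = struct.Struct("<chhfh2x")


def read_sweep(f, n_points, n_channels, factors):
    """Read one sweep and return (header tuple, data in volts)."""
    raw_header = f.read(SWEEP_HEADER.size)
    if len(raw_header) != SWEEP_HEADER.size:
        raise ValueError("Unexpected end of file while reading sweep header.")
    header = SWEEP_HEADER.unpack(raw_header)

    n_bytes = n_points * n_channels * 4  # 4-byte signed integers
    raw_data = f.read(n_bytes)
    if len(raw_data) != n_bytes:
        raise ValueError("Unexpected end of file while reading sweep data.")
    # Samples are stored point by point, with channels interleaved.
    samples = np.frombuffer(raw_data, dtype="<i4").reshape(n_points, n_channels)
    # Scale each channel to microvolts, then convert to volts.
    data = samples.T * np.asarray(factors)[:, np.newaxis] * 1e-6
    return header, data  # data has shape (n_channels, n_points)


# Synthetic sweep for illustration: 2 points x 3 channels.
buf = SWEEP_HEADER.pack(b"\x01", 5, 1, 0.25, 2)
buf += np.arange(6, dtype="<i4").tobytes()
header, data = read_sweep(
    io.BytesIO(buf), n_points=2, n_channels=3, factors=[1.0, 2.0, 4.0])
```

Reading a whole sweep at once avoids one Python-level call per sample, and the explicit length checks replace the broad except clauses with a specific error when the file is truncated.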

Describe possible alternatives

I am also working on C++ code for this that MNE could make use of.

Ultimately, it may also be simpler to write a separate library myself and just create EpochsArrays from scratch.

Additional context

No response

@larsoner
Member

mne.read_evokeds_cnt and mne.read_epochs_cnt or similar seem reasonable to me; we have similar functions for MFF and EEGLAB. It would be great to reuse or refactor as much code from read_raw_cnt as possible. Or is this a different Neuroscan format altogether, separate from cnt? If so, what do you currently use to read the raw data, if anything?

@withmywoessner
Contributor Author

@larsoner, I just use the struct library to read the raw binary data. According to this site (which MNE cites as a reference for cnt.py), the various formats are structured as follows:
[Images: layout of the Neuroscan file formats]
I believe the file headers are the same as for cnt, but as you can see, the actual data is formatted differently for each.

@larsoner
Member

Yeah, if we can reuse all the header and info-setting code, then the new epochs and evoked functions can hopefully be very short!

@withmywoessner
Contributor Author

withmywoessner commented Jan 17, 2024

Okay, thanks! Before I start: should I work on this in the cnt.py file? If so, should the file be renamed to neuroscan.py? There is also curry.py, which handles the newer Neuroscan file format. Maybe that should be placed in a neuroscan folder as well.

@larsoner
Member

Yes, I think cnt.py is the right place. I wouldn't start a new folder; better to stick with our original naming. But it would be good to add a note to read_raw_cnt (and the functions that you add) saying that it's for reading older Neuroscan files. For example, searching https://mne.tools/dev/generated/mne.io.read_raw_cnt.html for "neuroscan" turns up nothing at all; I had to figure it out by searching "mne neuroscan" and finding https://mne.tools/stable/auto_tutorials/io/20_reading_eeg_data.html#neuroscan-cnt-cnt (which should maybe also be updated to mention this old/new format stuff).

@withmywoessner
Contributor Author

withmywoessner commented Jan 27, 2024

Hey @larsoner, I don't think Neuroscan stores the event times of epochs with respect to the original data, just a list of epochs and some metadata related to response latencies and event codes. Is it all right if I include an option to make up event sampling times? I am not really familiar with the KIT and EEGLAB file formats, so I am unsure whether the readers for those file types also do this.

@larsoner
Member

> Is it all right if I include an option to make up event sampling times? I am not really familiar with the kit and eeglab file formats so I am unsure if the readers also do this for those file types

Yes, I would just make them up as np.arange(0, len(events)) * np.ceil((tmax - tmin) * sfreq).astype(int) or similar. Or, even better, check what we do in the read_epochs_* functions to see if we similarly allow inventing event times:

https://mne.tools/stable/generated/mne.read_epochs_kit.html
https://mne.tools/stable/generated/mne.read_epochs_eeglab.html
https://mne.tools/stable/generated/mne.read_epochs_fieldtrip.html

I suspect we have no option but to make up times of some sort, so I wouldn't bother adding an option to control it (just try to do something reasonable).
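
As a rough illustration of the suggestion above, here is a minimal sketch that invents evenly spaced event samples and packs them into the usual MNE events layout (sample, previous value, event id). The helper name make_up_events is hypothetical, and using the sweep header's trial type as the event id is an assumption.

```python
import numpy as np


def make_up_events(event_codes, tmin, tmax, sfreq):
    """Build an (n_epochs, 3) events array with invented sample times."""
    event_codes = np.asarray(event_codes, dtype=int)
    # Space the events one epoch length apart so that epochs never overlap.
    step = int(np.ceil((tmax - tmin) * sfreq))
    samples = np.arange(len(event_codes)) * step
    # Columns: sample index, previous value (always 0 here), event id.
    return np.column_stack([samples, np.zeros_like(samples), event_codes])


# Three epochs of 2 s each at 100 Hz, with trial types 1, 2, 1.
events = make_up_events([1, 2, 1], tmin=-0.5, tmax=1.5, sfreq=100)
```

An array of this shape could then be passed as the events argument when building an EpochsArray from the decoded data.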
