
# <h1 style="text-align: center; color: #2E86AB; font-size: 32px; font-weight: bold;">📊 SPIKE GLX TO NWB CONVERSION PROCESS</h1>

## <h2 style="color: #A23B72; font-size: 24px;">🎯 **PURPOSE OF THIS NOTEBOOK**</h2>

This notebook performs **two main functions**:

### <h3 style="color: #F18F01; font-size: 20px;">🔍 **CELL 1: FINDING SPIKE GLX BIN FILES**</h3>
- **Searches** through the directory structure to locate SpikeGLX `.bin` files
- **Identifies** each mouse folder and its corresponding recording sessions
- **Prepares** the file paths for conversion
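
The search described above could be sketched as follows. This is only an illustration, not the notebook's own code: the helper name `find_spikeglx_bins` is hypothetical, and it assumes each SpikeGLX `.bin` file has its `.meta` file beside it, which is how SpikeGLX normally writes recordings.

```python
from pathlib import Path


def find_spikeglx_bins(root):
    """Map each folder under `root` to the SpikeGLX .bin files it contains.

    Hypothetical helper: a .bin file is only reported when a .meta file
    with the same base name sits next to it.
    """
    found = {}
    for bin_path in sorted(Path(root).rglob("*.bin")):
        # Skip binaries that have no companion .meta file
        if not bin_path.with_suffix(".meta").exists():
            continue
        found.setdefault(bin_path.parent, []).append(bin_path)
    return found
```

Calling `find_spikeglx_bins` on a mouse folder returns a dictionary keyed by folder, so each recording session can be passed on to the conversion step separately.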

### <h3 style="color: #C73E1D; font-size: 20px;">⚡ **CELL 2: CREATING RAW NWB FILES**</h3>
- **Converts** SpikeGLX `.bin` files to NWB format for each mouse
- **Extracts** only the **electrical signals** from Neuropixels:
  - **LFP (Local Field Potential)** signals
  - **AP (Action Potential)** signals
- **Other electrical signals** such as TTL, piezo, and I/O signals
- **Creates** a `-raw.nwb` file containing **ONLY** the raw electrical data
- **Saves** the raw NWB files to the specified output directory
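
As a rough sketch of how the `-raw.nwb` file name can be derived: the conversion cell reuses the name of an existing NWB file in the output folder and strips its `-processed-behavior` suffix. The helper `raw_nwb_path` below is hypothetical, and skipping files that already end in `-raw` is an added assumption so that reruns do not produce `-raw-raw.nwb`.

```python
from pathlib import Path


def raw_nwb_path(output_dir, suffix="-processed-behavior"):
    """Build the '-raw.nwb' path from an existing processed NWB file name.

    Hypothetical helper; `suffix` is the part of the processed file name
    that is dropped before '-raw.nwb' is appended.
    """
    output_dir = Path(output_dir)
    # Assumption: ignore files that are already raw outputs
    candidates = sorted(
        p for p in output_dir.glob("*.nwb") if not p.stem.endswith("-raw")
    )
    if not candidates:
        raise FileNotFoundError(f"No processed .nwb file found in {output_dir}")
    base_name = candidates[0].stem.replace(suffix, "")
    return output_dir / f"{base_name}-raw.nwb"
```

Raising an error when no processed file exists makes a missing MATLAB output visible immediately instead of failing later with an undefined file name.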

---

## <h2 style="color: #2E86AB; font-size: 24px;">**OUTPUT STRUCTURE**</h2>

For each mouse session, the process creates one `-raw.nwb` file in that session's folder under the mouse's `-nwb` directory.

In [None]:
# this code converts the SpikeGLX data to NWB format for all the sessions of a mouse
import os
from datetime import datetime
from dateutil import tz
from pathlib import Path
from neuroconv.converters import SpikeGLXConverterPipe


directory = 'G:\\MiceFolders_ephys\\'     
MouseNames = ['PG076']
for MouseName in range(len(MouseNames)):
    print(MouseName)
    CurrMouseName = MouseNames[MouseName]
    print(CurrMouseName)
    # transferbinMeta({CurrMouseName},'Z:\\analysis\\Parviz_Ghaderi\\MiceFolders\\','E:\\MiceFolders\\')
    SessionNames =[name for name in os.listdir(directory + CurrMouseName + '\\RECORDING\\ELECTROPHYSIOLOGY\\') if os.path.isdir(os.path.join(directory + CurrMouseName + '\\RECORDING\\ELECTROPHYSIOLOGY\\', name))]
    print(SessionNames)
    for SessionName in range(len(SessionNames)):
        CurrSessionName = SessionNames[SessionName]
        print(CurrSessionName)
        g_names = [name for name in os.listdir(directory + CurrMouseName + '\\RECORDING\\ELECTROPHYSIOLOGY\\' + CurrSessionName + '\\') if os.path.isdir(os.path.join(directory + CurrMouseName + '\\RECORDING\\ELECTROPHYSIOLOGY\\' + CurrSessionName + '\\', name))]
        print(g_names)
        for id_g_name in range(1):   # only the first run folder (the main session) is converted; the others are skipped
            Curr_g_name = g_names[id_g_name]
            print(Curr_g_name)
            folder_path = directory + CurrMouseName + '\\RECORDING\\ELECTROPHYSIOLOGY\\' + CurrSessionName + '\\' + Curr_g_name
            streams_list=(SpikeGLXConverterPipe.get_streams(folder_path=folder_path))
            streams_list = [stream for stream in streams_list if stream != "nidq"]
            converter = SpikeGLXConverterPipe(folder_path=folder_path, verbose=True,streams=streams_list)
            # Extract what metadata we can from the source files
            metadata = converter.get_metadata()
            # For data provenance we add the time zone information to the conversion
            session_start_time = metadata["NWBFile"]["session_start_time"].replace(tzinfo=tz.gettz("US/Pacific"))
            metadata["NWBFile"].update(session_start_time=session_start_time)
            extension = ".nwb"
            filename = None  # reset for every session so a stale name is never reused
            for file in os.listdir(directory + CurrMouseName + '-nwb\\' + CurrSessionName + '\\' + Curr_g_name + '\\'):
                if file.endswith(extension):
                    filename = os.path.splitext(file)[0]
                    filename = filename.replace("-processed-behavior", "")
                    print(filename)
            if filename is None:
                raise FileNotFoundError("No .nwb file found to derive the output file name from")
            nwbfile_path =directory + CurrMouseName + '-nwb\\' + CurrSessionName + '\\' + Curr_g_name + '\\' + filename + '-raw.nwb'
            converter.run_conversion(nwbfile_path=nwbfile_path,metadata =metadata,overwrite=True)

# <h1 style="text-align: center; color: #2E86AB; font-size: 32px; font-weight: bold;">🔄 MERGING RAW + PROCESSED DATA</h1>

## <h2 style="color: #A23B72; font-size: 24px;">📋 **PREREQUISITES**</h2>

Before running the next cell, you should have already created these files using **MATLAB**:

### <h3 style="color: #F18F01; font-size: 20px;">📁 **Required Files from MATLAB Processing**</h3>

- **`-processed-behavior.nwb`** - Combined behavioral and processed data

---

## <h2 style="color: #C73E1D; font-size: 24px;">**NEXT STEP: MERGE RAW + PROCESSED**</h2>

The **next cell** will:

### <h3 style="color: #2E86AB; font-size: 20px;">⚡ **What it does:**</h3>

1. **Takes** the `-raw.nwb` files (containing only electrical signals)
2. **Takes** the `-processed-behavior.nwb` files (containing behavioral + processed data)
3. **Merges** them into a **new combined file**
4. **Creates** the `-processed-behavior-raw.nwb` file
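
The file pairing in these steps can be sketched as below, following the glob patterns used in the merge cell (`*-raw.nwb` and `*-behavior.nwb`). The helper name `merge_paths` is hypothetical, and returning `None` for an incomplete session is an assumption rather than the notebook's behavior.

```python
from pathlib import Path


def merge_paths(session_dir, output_dir):
    """Return (raw file, behavior file, merged output path) for one session.

    Hypothetical helper: returns None when either input file is missing.
    """
    session_dir = Path(session_dir)
    raw_files = sorted(session_dir.glob("*-raw.nwb"))
    behavior_files = sorted(session_dir.glob("*-behavior.nwb"))
    if not raw_files or not behavior_files:
        return None
    raw_path, behavior_path = raw_files[0], behavior_files[0]
    # The merged file keeps the behavior file's name and gains a '-raw' suffix
    merged_path = Path(output_dir) / f"{behavior_path.stem}-raw.nwb"
    return raw_path, behavior_path, merged_path
```

Checking for `None` before opening the files avoids the `IndexError` that indexing an empty glob result with `[0]` would raise.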

### <h3 style="color: #A23B72; font-size: 20px;">**Output Structure:**</h3>


In [None]:
# this code combines the raw and processed files into one file
import os
import glob
import shutil
from datetime import datetime
from dateutil import tz
from pathlib import Path
from pynwb import NWBHDF5IO, NWBFile, TimeSeries, get_manager
from pynwb.ecephys import LFP, ElectricalSeries
from hdmf.common import DynamicTableRegion
manager = get_manager()
from neuroconv.converters import SpikeGLXConverterPipe
import dandi
import subprocess

directory = 'G:\\MiceFolders_ephys\\'
directory2write='G:\\nwb-behaviour-processed-raw\\'  
# ['PG019-nwb','PG027-nwb','PG028-nwb','PG030-nwb','PG032-nwb','PG038-nwb','PG061-nwb','PG062-nwb','PG064-nwb','PG082-nwb','PG083-nwb','PG084-nwb','PG085-nwb']
# [ 'PG019-nwb','PG027-nwb','PG028-nwb','PG030-nwb','PG032-nwb','PG038-nwb','PG061-nwb','PG062-nwb','PG064-nwb','PG076-nwb','PG082-nwb','PG083-nwb','PG084-nwb','PG085-nwb']
MouseNames =  ['PG076-nwb']

for MouseName in range(len(MouseNames)):
    print(MouseName)
    CurrMouseName = MouseNames[MouseName]
    print(CurrMouseName)
    SessionNames =[name for name in os.listdir(directory + CurrMouseName ) if os.path.isdir(os.path.join(directory + CurrMouseName , name))]
    print(SessionNames)
    for SessionName in range(len(SessionNames)):
        CurrSessionName = SessionNames[SessionName]
        print(CurrSessionName)
        g_names = [name for name in os.listdir(directory + CurrMouseName +'\\'+ CurrSessionName ) if os.path.isdir(os.path.join(directory + CurrMouseName +'\\'+ CurrSessionName , name))]
        print(g_names)
        for id_g_name in range(1):   # only the first run folder (the main session) is merged; the others are skipped
            Curr_g_name = g_names[id_g_name]
            print(Curr_g_name)
            
            # Find files with names ending with '-raw.nwb' and '-behavior.nwb'
            nwbfile_path1 = glob.glob(directory + CurrMouseName +'\\'+ CurrSessionName +'\\'+Curr_g_name +'\\'+ '/*-raw.nwb')[0]
            nwbfile_path2 = glob.glob(directory + CurrMouseName +'\\'+ CurrSessionName +'\\'+Curr_g_name +'\\'+ '/*-behavior.nwb')[0]
            
            # Create output directory if it doesn't exist
            os.makedirs(directory2write, exist_ok=True)
            
            # Define output path using just the basename
            nwbfile_path3 = os.path.join(directory2write, os.path.basename(nwbfile_path2).replace(".nwb", "-raw.nwb"))
            print(f"Output file: {nwbfile_path3}")
            
            # Read both files
            with NWBHDF5IO(nwbfile_path1, "r", manager=manager) as io1:
                nwbfile1 = io1.read()  # Raw file (has acquisition data)
                with NWBHDF5IO(nwbfile_path2, "r", manager=manager) as io2:
                    nwbfile2 = io2.read()  # Behavior file (has electrodes and processed data)
                    
                    # First, ensure the behavior file has the electrode information
                    if not hasattr(nwbfile2, 'electrodes') or nwbfile2.electrodes is None:
                        print("Warning: Behavior file doesn't have electrodes table!")
                        continue
                    
                    # Copy electrode groups from raw file if they don't exist in behavior file
                    if hasattr(nwbfile1, 'electrode_groups') and nwbfile1.electrode_groups is not None:
                        for group_name, group in nwbfile1.electrode_groups.items():
                            if group_name not in nwbfile2.electrode_groups:
                                nwbfile2.add_electrode_group(group)
                    
                    # Copy devices from raw file if they don't exist in behavior file
                    if hasattr(nwbfile1, 'devices') and nwbfile1.devices is not None:
                        for device_name, device in nwbfile1.devices.items():
                            if device_name not in nwbfile2.devices:
                                nwbfile2.add_device(device)
                    
                    # FIXED: Properly handle electrode information from raw file
                    # Instead of trying to replace electrodes, we'll work with the existing ones
                    if hasattr(nwbfile1, 'electrodes') and nwbfile1.electrodes is not None:
                        print(f"Raw file has {len(nwbfile1.electrodes)} electrodes, behavior file has {len(nwbfile2.electrodes)}")
                        
                        # For single-stream cases (like PG019), we need to handle this differently
                        # The behavior file already has the correct electrode table, so we'll use it
                        if len(nwbfile1.electrodes) > len(nwbfile2.electrodes):
                            print("Raw file has more electrodes than behavior file - this might indicate a multi-stream setup")
                            print("Using behavior file electrodes as the reference")
                        else:
                            print("Using existing electrodes from behavior file")
                    
                    # Now add the electrical series with proper electrode references
                    for acquisition_name, acquisition_data in nwbfile1.acquisition.items():
                        # Check if this acquisition already exists
                        if acquisition_name not in nwbfile2.acquisition:
                            # Get the original electrode reference from the raw file
                            original_electrodes_ref = acquisition_data.electrodes
                            
                            if original_electrodes_ref is not None:
                                # Get the electrode indices from the original reference
                                electrode_indices = original_electrodes_ref.data
                                
                                # For single-stream cases, we need to map the electrode indices correctly
                                # The raw file might have different electrode indexing than the behavior file
                                if len(nwbfile1.electrodes) != len(nwbfile2.electrodes):
                                    print(f"Warning: Electrode count mismatch. Raw: {len(nwbfile1.electrodes)}, Behavior: {len(nwbfile2.electrodes)}")
                                    # Use all available electrodes in the behavior file
                                    num_electrodes = len(nwbfile2.electrodes)
                                    electrode_indices = list(range(min(num_electrodes, len(electrode_indices))))
                                    print(f"Using first {len(electrode_indices)} electrodes from behavior file")
                                
                                # Create a new DynamicTableRegion that references the correct electrodes
                                # in the behavior file's electrode table
                                electrodes_ref = DynamicTableRegion(
                                    name='electrodes',
                                    data=electrode_indices,  # Use the mapped electrode indices
                                    description=f'electrode references for {acquisition_name}',
                                    table=nwbfile2.electrodes  # Reference the electrodes table in the behavior file
                                )
                                
                                print(f"Created electrode reference for {acquisition_name} with {len(electrode_indices)} electrodes")
                            else:
                                print(f"Warning: {acquisition_name} has no electrode reference, using all electrodes")
                                # Fallback: use all electrodes if no specific reference exists
                                num_electrodes = len(nwbfile2.electrodes)
                                electrodes_ref = DynamicTableRegion(
                                    name='electrodes',
                                    data=list(range(num_electrodes)),
                                    description=f'electrode references for {acquisition_name} (fallback)',
                                    table=nwbfile2.electrodes
                                )
                            
                            # Create a new ElectricalSeries with proper electrode reference
                            acquisition_copy = ElectricalSeries(
                                name=acquisition_data.name,
                                data=acquisition_data.data,
                                rate=acquisition_data.rate,
                                electrodes=electrodes_ref,  # Use the correct DynamicTableRegion
                                description=acquisition_data.description if hasattr(acquisition_data, 'description') else None
                            )
                            nwbfile2.add_acquisition(acquisition_copy)
                            print(f"Added {acquisition_name} with proper electrode reference")
                    
                    print("What I should see in the 3rd file:")
                    print(nwbfile2)
                    
                    # Write the merged file
                    with NWBHDF5IO(nwbfile_path3, "w", manager=manager) as io3:
                        io3.write(nwbfile2, link_data=False)
                    # Simple validation
                    try:
                        result = subprocess.run(["nwbinspector", nwbfile_path3], capture_output=True, text=True)
                        if result.returncode == 0:
                            print("✅ Merged file validation passed - no critical issues!")
                            # Move file immediately to final destination
                            final_destination = r'Z:\analysis\Parviz_Ghaderi\Ghaderi_dataset2025\nwb-behaviour-processed-raw'
                            os.makedirs(final_destination, exist_ok=True)
                            final_path = os.path.join(final_destination, os.path.basename(nwbfile_path3))
                            shutil.move(nwbfile_path3, final_path)
                            print(f"File moved to: {final_path}")
                        else:
                            print("❌ Merged file has critical issues!")
                            print("Issues found:")
                            print(result.stdout)
                    except FileNotFoundError:
                        print("⚠️ nwbinspector not installed - skipping validation")

# <h1 style="text-align: center; color: #2E86AB; font-size: 32px; font-weight: bold;">🎬 ATTACHING VIDEO FILES TO NWB</h1>

## <h2 style="color: #A23B72; font-size: 24px;">📋 **PREREQUISITES**</h2>

Before running this cell, you should have:

### <h3 style="color: #F18F01; font-size: 20px;">**Required Files**</h3>

- **`-processed-behavior-raw.nwb`** - Complete NWB file with all data (raw + processed + behavioral)
- **`.avi` video files** - Recording videos for each session with matching names

---

## <h2 style="color: #C73E1D; font-size: 24px;">🎥 **OPTIONAL STEP: VIDEO ATTACHMENT**</h2>

This **final cell** will:

### <h3 style="color: #2E86AB; font-size: 20px;">⚡ **What it does:**</h3>

1. **Locates** the complete NWB files (`-processed-behavior-raw.nwb`)
2. **Finds** corresponding `.avi` video files for each session
3. **Attaches** video file links/references to the NWB file
4. **Creates** a comprehensive dataset with video recordings included

### <h3 style="color: #A23B72; font-size: 20px;">📹 **Video File Requirements:**</h3>

- **File format:** `.avi` files
- **Naming convention:** Same base name as the NWB file, with `-processed-behavior-raw.nwb` replaced by `.avi`
- **Location:** The same folder as the NWB file
- **Content:** Behavioral recording videos from the experiment
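
A minimal sketch of how a video can be matched to its NWB file, assuming the video sits next to the NWB file and shares its base name. The helper name `matching_video` is hypothetical; the suffix mirrors the string replaced in the video cell.

```python
from pathlib import Path


def matching_video(nwb_path, nwb_suffix="-processed-behavior-raw.nwb"):
    """Return the .avi file that belongs to `nwb_path`, or None if it is missing.

    Hypothetical helper: the video name is the NWB file name with
    `nwb_suffix` replaced by '.avi'.
    """
    nwb_path = Path(nwb_path)
    if not nwb_path.name.endswith(nwb_suffix):
        raise ValueError(f"{nwb_path.name} does not end with {nwb_suffix}")
    video_name = nwb_path.name[: -len(nwb_suffix)] + ".avi"
    video_path = nwb_path.with_name(video_name)
    return video_path if video_path.exists() else None
```

Running this check over all sessions before attaching anything gives a quick list of sessions whose video is missing or misnamed.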

---

## <h2 style="color: #2E86AB; font-size: 24px;">📁 **Final Output Structure**</h2>


In [None]:
# this code adds the videos to the NWB files for all mice and all sessions
import os
import glob
from datetime import datetime
from dateutil import tz
from pathlib import Path
from neuroconv.datainterfaces import VideoInterface

directory = 'G:\\MiceFolders_ephys\\'     
MouseNames = ['PG076-nwb']
for MouseName in range(len(MouseNames)):
    print(MouseName)
    CurrMouseName = MouseNames[MouseName]
    print(CurrMouseName)
    SessionNames =[name for name in os.listdir(directory + CurrMouseName ) if os.path.isdir(os.path.join(directory + CurrMouseName , name))]
    print(SessionNames)
    for SessionName in range(len(SessionNames)):
        CurrSessionName = SessionNames[SessionName]
        print(CurrSessionName)
        g_names = [name for name in os.listdir(directory + CurrMouseName +'\\'+ CurrSessionName ) if os.path.isdir(os.path.join(directory + CurrMouseName +'\\'+ CurrSessionName , name))]
        print(g_names)
        for id_g_name in range(1):   # only the first run folder (the main session) is processed; the others are skipped
            Curr_g_name = g_names[id_g_name]
            print(Curr_g_name)
            
            # Find the merged file, whose name ends with 'behavior-raw.nwb'
            nwbfile_path = glob.glob(directory + CurrMouseName +'\\'+ CurrSessionName +'\\'+Curr_g_name +'\\'+ '/*behavior-raw.nwb')[0]
            # Folder that holds the NWB file; the video path is expressed relative to it
            nwbfile_dir = os.path.dirname(nwbfile_path)
            video_file_path = nwbfile_path.replace("-processed-behavior-raw.nwb", ".avi")
            video_file_relative_path = os.path.relpath(video_file_path, nwbfile_dir)
            
            print(video_file_relative_path)
            interface = VideoInterface(file_paths=[video_file_relative_path], verbose=True)
            metadatavideo = interface.get_metadata()

            # For data provenance we add the time zone information to the conversion
            #session_start_time = datetime(2024, 10, 5, 12, 30, 0, tzinfo=tz.gettz("US/Pacific"))
            #metadatavideo["NWBFile"].update(session_start_time=session_start_time)
            # Choose a path for saving the nwb file and run the conversion
            # Remove duplicate file names from the metadata
            # interface.run_conversion(nwbfile_path=nwbfile_path, metadata=metadatavideo, overwrite=False)

# <h1 style="text-align: center; color: #2E86AB; font-size: 32px; font-weight: bold;">🔍 INSPECTING NWB FILES</h1>

## <h2 style="color: #A23B72; font-size: 24px;">📋 **PREREQUISITES**</h2>

Before running the inspection cell, make sure you have:

### <h3 style="color: #F18F01; font-size: 20px;">**Required Setup**</h3>

- **`nwbinspector`** installed: `pip install nwbinspector`
- **NWB files** located in: `Z:\analysis\Parviz_Ghaderi\Ghaderi_dataset2025\nwb-behaviour-processed-raw`
- **Access** to the directory containing your NWB files

---

## <h2 style="color: #C73E1D; font-size: 24px;">🔍 **INSPECTING YOUR NWB FILES**</h2>

The **next cell** will:

### <h3 style="color: #2E86AB; font-size: 20px;">⚡ **What it does:**</h3>

1. **Runs** `nwbinspector` on your complete dataset directory
2. **Validates** all NWB files for compliance with NWB standards
3. **Checks** for any issues or warnings in the file structure
4. **Reports** the overall health of your NWB dataset

### <h3 style="color: #A23B72; font-size: 20px;">📊 **What you'll see:**</h3>

- **Validation results** for each NWB file
- **Any warnings** or errors that need attention
- **Summary** of the inspection process
- **Confirmation** that your files are properly formatted

---

## <h2 style="color: #2E86AB; font-size: 24px;">📁 **Your Dataset Location**</h2>


In [None]:
# Inspect all NWB files in your dataset directory
import subprocess
import os

# Your dataset directory
dataset_directory = r"Z:\analysis\Parviz_Ghaderi\Ghaderi_dataset2025\nwb-behaviour-processed-raw"

# Check if directory exists
if os.path.exists(dataset_directory):
    print(f"🔍 Inspecting NWB files in: {dataset_directory}")
    print("=" * 80)
    
    # Run nwbinspector
    try:
        result = subprocess.run(
            ["nwbinspector", dataset_directory], 
            capture_output=True, 
            text=True, 
            check=True
        )
        print("✅ Inspection completed successfully!")
        print("\nInspection Results:")
        print(result.stdout)
        
    except subprocess.CalledProcessError as e:
        print("❌ Inspection encountered errors:")
        print(e.stderr)
        
    except FileNotFoundError:
        print("❌ nwbinspector not found. Please install it with: pip install nwbinspector")
        
else:
    print(f"❌ Directory not found: {dataset_directory}")
    print("Please check the path and ensure the directory exists.")