# Converting MATLAB wbstructs to pandas dataframes

Check the README.md for more information on when to use this tool. This notebook simply calls the functions in the wbstruct_converter.py file; to get a better understanding of the code, please refer to that file. If you run into unhandled errors, please open an issue on GitHub.

In [None]:
import utils.wbstruct_to_dicts as wbstruct_dictioniaries
import utils.wbstruct_dicts_to_dataframes as wbstruct_dataframes
import dill

# in case we want to reload our module
import importlib
importlib.reload(wbstruct_dictioniaries)
importlib.reload(wbstruct_dataframes)

### Convert the wbstructs to python dictionaries

In [None]:
# directory paths where all recordings are located: the first is Rebecca's, the second is Kerem Uzel's
directory_paths = ["Y:\\lisc\\project\\neurobiology\\zimmer\\Rebecca",
                   "Y:\\lisc\\project\\neurobiology\\zimmer\\Kerem_Uzel\\Whole_brain_imaging\\Cleaned_up_datasets\\WT\\ItamarCorrected_OLQ_URY_2023\\"]

# target file name: wbstruct.mat is the file from the wba that we want to convert into a DataFrame
target_file = "wbstruct.mat"

# since we usually have multiple recordings per animal we want to include only those that are relevant for our analysis
# following is specific to Rebecca's and Kerem's Data
include_Rebecca = ["Ctrl", "used Datasets"]
include_Kerem = ["Head", "Tail"]
exclude = ["not_used", "not used", "notUsed", "cat-2_tdc-1_tph-1"]
recording_type = "deltaFOverF_bc"
simple = False
save_as = 'csv'
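
The include and exclude lists above select which recording folders are considered. As a rough illustration of that idea only, the sketch below walks a directory tree and keeps every target file whose relative path contains at least one include token and no exclude token. The helper `find_target_files` and the substring-matching rule are assumptions made for this example; they are not taken from `wbstruct_to_dicts`, whose actual filtering logic may differ.

```python
import os
import tempfile


def find_target_files(root, target_file, include, exclude):
    """Hypothetical helper: return paths to `target_file` under `root` whose
    path (relative to `root`) contains at least one `include` token and
    none of the `exclude` tokens."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if target_file not in filenames:
            continue
        # Match on the relative path so the location of `root` is irrelevant.
        rel_dir = os.path.relpath(dirpath, root)
        if any(token in rel_dir for token in exclude):
            continue
        if any(token in rel_dir for token in include):
            matches.append(os.path.join(dirpath, target_file))
    return sorted(matches)


# Build a throwaway directory tree so the sketch runs anywhere.
demo_root = tempfile.mkdtemp()
for sub in ("Head/rec1", "Tail/rec2", "Head/not_used/rec3", "Other/rec4"):
    folder = os.path.join(demo_root, *sub.split("/"))
    os.makedirs(folder)
    open(os.path.join(folder, "wbstruct.mat"), "w").close()

found = find_target_files(demo_root, "wbstruct.mat",
                          include=["Head", "Tail"], exclude=["not_used"])
relative = [os.path.relpath(p, demo_root).replace(os.sep, "/") for p in found]
print(relative)
```

In this toy tree only the `Head/rec1` and `Tail/rec2` recordings survive: `Head/not_used/rec3` hits an exclude token and `Other/rec4` matches no include token.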

In [None]:
# import and save Kerem's data
datasets_kerem = wbstruct_dictioniaries.get_datasets_dict(directory_paths[1], target_file, include_Kerem, exclude, recording_type, simple=True, with_coloring=True)

In [None]:
datasets_kerem.keys()

The dictionary returned by the function is also saved as a pickle file named "datasets.pkl". To load this file, run the following cell:

In [None]:
# load kerem's dictionary
with open('datasets.pkl','rb') as f:
    datasets_kerem = dill.load(f)
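
To check that a saved dictionary survives the trip to disk and back, a small round-trip test can help. The sketch below uses the standard-library `pickle` module so that it runs without extra dependencies; `dill` offers `dump` and `load` functions with the same basic call pattern. The dictionary contents and the temporary file location are made up for illustration and do not reflect the real structure of "datasets.pkl".

```python
import os
import pickle
import tempfile

# Made-up stand-in for a datasets dictionary (structure is an assumption).
datasets = {
    "recording_a": {"ID1": ["AVAL", "None"], "traces": [[0.1, 0.2], [0.3, 0.4]]},
}

# Write to a temporary folder so no real file is overwritten.
path = os.path.join(tempfile.mkdtemp(), "datasets.pkl")
with open(path, "wb") as f:
    pickle.dump(datasets, f)

with open(path, "rb") as f:
    restored = pickle.load(f)

# The restored object should compare equal to the original.
print(restored == datasets)
```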

Now we can use the dictionaries as they are or convert them to pandas dataframes.

### Convert dictionaries to dataframes
For the following function **get_dataframes()**, we assume that "ID1" contains all IDs (non-identified objects have the ID "None").

In [None]:
dataframes_kerem = wbstruct_dataframes.get_dataframes(datasets_kerem, recording_type, with_2_traces=True, with_coloring=True)
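
The "ID1" assumption stated above means that every trace has an entry, with "None" standing in for non-identified objects. One possible way to turn such a list into unique column labels is sketched here. The function `make_column_labels` and the `neuron_` fallback prefix are hypothetical choices for this example; how **get_dataframes()** actually labels non-identified objects is not shown in this notebook.

```python
def make_column_labels(ids, prefix="neuron_"):
    """Hypothetical helper: keep real IDs and give non-identified entries
    (None or the string "None") a positional fallback label."""
    labels = []
    for index, neuron_id in enumerate(ids):
        if neuron_id is None or neuron_id == "None":
            labels.append(f"{prefix}{index:03d}")
        else:
            labels.append(neuron_id)
    return labels


labels = make_column_labels(["AVAL", "None", "RIML", None])
print(labels)
```

Using the position in the fallback label keeps column names unique even when many objects are non-identified.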

In [None]:
# save Kerem's dataframes as a pickle file
with open('../preprocessing/dataframes_kerem_0602.pkl','wb') as f:
    dill.dump(dataframes_kerem,f)

In [None]:
print("Available Recordings:",list(dataframes_kerem.keys()))

### If the IDs are in separate Excel Sheets [OLD]
Use this section if the IDs of the neurons are stored in separate Excel sheets. To construct the final dataframe, we have to merge the collected IDs with the datasets from the dictionary. This step is outdated, because a newer data struct was found that contains the values as well as the IDs and the behaviour state annotation.
I am keeping this section because we might still need it if we want to work with the raw F traces rather than the bleach-corrected and deltaF/F traces.
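
The merge described above can be pictured with a small pandas sketch. It assumes a trace table whose columns are neuron numbers and an ID table with the columns `neuron` and `ID`; both layouts, and all values, are invented for illustration and need not match the real Excel sheets or what `get_dataframes_from_excel` does internally.

```python
import pandas as pd

# Made-up traces: one column per neuron number, one row per time point.
traces = pd.DataFrame(
    [[0.10, 0.20, 0.30],
     [0.15, 0.25, 0.35]],
    columns=[1, 2, 3],
)

# Made-up ID table, as it might look after reading an Excel sheet.
id_table = pd.DataFrame({"neuron": [1, 3], "ID": ["AVAL", "RIML"]})

# Map neuron numbers to IDs; neurons without an ID keep their number.
id_map = dict(zip(id_table["neuron"].tolist(), id_table["ID"].tolist()))
labelled = traces.rename(columns=id_map)

print(list(labelled.columns))
```

Columns without an entry in the ID table are left unchanged by `rename`, so non-identified neurons stay addressable by their number.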

In [None]:
# following directory contains all the excel files that hold the information about the recordings
directory_path_ID = "Y:\\lisc\\project\\neurobiology\\zimmer\\Rebecca\\Analyses"
target_ID = ".xlsx"
include_ID = ["Analyses_"]
exclude_ID = ["._","cat-2_tdc-1_tph-1"]

In [None]:
dictofIDs_og = wbstruct_dictioniaries.get_IDs_dict(directory_path_ID, target_ID, include_ID, exclude_ID)

In [None]:
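# NOTE: datasets_rebecca is not defined earlier in this notebook; create it first
# (presumably with get_datasets_dict on directory_paths[0] and include_Rebecca)
dataframes_rebecca = wbstruct_dataframes.get_dataframes_from_excel(datasets_rebecca, dictofIDs_og, recording_type, save_as='csv')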

In [None]:
print("Available Recordings:",list(dataframes_rebecca.keys()))
dataframes_rebecca['20200629_w1'].head()