# Getting Started
This is an introduction to importing data into DrCELL. The example file comes from "[Li N (2022); Data and simulations related to: Thalamus-driven functional populations in frontal cortex activity supports decision-making. Yang et al (2022) Nat Neurosci.](https://doi.org/10.1038/s41593-022-01171-w)". The authors published their [data](https://doi.org/10.5281/zenodo.6846161) and explain the background in their [paper](https://doi.org/10.1038/s41593-022-01171-w). Here the file only illustrates the import process into DrCELL, so you can substitute your own data.

In [11]:
import numpy as np
from scipy import io as sio

# change to your path
example_mat_file_path = r"C:\path\to\Data_CompileData1_YangEtAl22.mat"
# load data
example_mat_file = sio.loadmat(example_mat_file_path)
data_array = np.concatenate((example_mat_file["neuron_PSTH_lick_left_correct"],
                             example_mat_file["neuron_PSTH_lick_right_correct"]), axis=1)
matrix_array = np.concatenate((example_mat_file["neuron_info_cell_type"],
                               example_mat_file["neuron_info_photoinhibition"],
                               example_mat_file["neuron_info_activity_mode_w"],
                               example_mat_file["neuron_info_connectivity"],
                               example_mat_file["neuron_info_depth"],
                               example_mat_file["neuron_info_mice_session"]), axis=1)

data_array

array([[0.        , 0.        , 0.        , ..., 0.20833333, 0.20833333,
        0.20833333],
       [0.        , 0.        , 0.        , ..., 0.18518519, 0.18518519,
        0.18518519],
       [0.        , 0.        , 0.        , ..., 0.17241379, 0.17241379,
        0.17241379],
       ...,
       [7.65625   , 7.578125  , 7.5       , ..., 1.63636364, 1.63636364,
        1.63636364],
       [0.66666667, 0.66666667, 0.66666667, ..., 3.33333333, 3.33333333,
        3.33333333],
       [6.33333333, 6.33333333, 6.        , ..., 5.5       , 5.5       ,
        5.5       ]])
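
Before building the dataframes, it can help to confirm that both arrays describe the same datapoints, i.e. that they have the same number of rows (one per neuron). The following is a minimal sketch using small synthetic arrays in place of the example file; `check_row_alignment` and the `demo_*` names are hypothetical and not part of DrCELL.

```python
import numpy as np


def check_row_alignment(data_array, matrix_array):
    """Return the shared row count, or raise if the arrays disagree.

    Hypothetical helper for illustration; not part of DrCELL.
    """
    if data_array.ndim != 2 or matrix_array.ndim != 2:
        raise ValueError("both arrays are expected to be 2-D (rows = datapoints)")
    if data_array.shape[0] != matrix_array.shape[0]:
        raise ValueError(
            f"row mismatch: {data_array.shape[0]} traces vs "
            f"{matrix_array.shape[0]} metadata rows"
        )
    return data_array.shape[0]


# Synthetic stand-ins: 5 neurons, two conditions with 100 samples each
left = np.random.rand(5, 100)
right = np.random.rand(5, 100)
# axis=1 appends the second condition's samples to each neuron's row
demo_data_array = np.concatenate((left, right), axis=1)
# 16 metadata columns per neuron, all missing in this toy example
demo_matrix_array = np.full((5, 16), np.nan)

n_rows = check_row_alignment(demo_data_array, demo_matrix_array)
print(n_rows, demo_data_array.shape)
```

With the real file, the same call would take `data_array` and `matrix_array` from the cell above.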

Bring your data into the following format:

- data_df is a dataframe containing the high-dimensional data that will be reduced. In this case, these are the traces of the different neurons. For each neuron, the samples are reduced to two dimensions for plotting.
- matrix_df is a dataframe that describes the metadata for each datapoint. Here it contains information such as the cell type or depth of each neuron.
- data_variables is a list of the column names from matrix_df that contain relevant metadata. They can be used to filter for specific datapoints or to color the datapoints by these categories.
- display_hover_variables is a list of the column names from matrix_df that should be displayed when hovering over a datapoint. This can provide additional information to put the resulting projection into context. There is also the built-in column "pdIndex", which gives an index over all datapoints.
- config is a dictionary that contains data_variables, display_hover_variables, and "recording_type". "recording_type" can be "None", "2P" for Two Photon Microscopy data, or "Ephys" for electrophysiological data. It mainly changes the way the graphs in the hover tool are plotted.
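
The constraints described above can be checked before saving. The sketch below is a hypothetical helper, not part of DrCELL. It assumes that every listed variable must be a column of matrix_df and that the built-in "pdIndex" is accepted in either list; neither assumption is confirmed by the DrCELL documentation.

```python
def validate_config(config, matrix_columns):
    """Collect problems in a DrCELL-style config dict.

    Hypothetical helper for illustration; not part of DrCELL.
    """
    allowed_recording_types = {"None", "2P", "Ephys"}
    # Assumption: the built-in "pdIndex" counts as a known column
    known_columns = set(matrix_columns) | {"pdIndex"}
    problems = []

    recording_type = config.get("recording_type")
    if recording_type not in allowed_recording_types:
        problems.append(f"unknown recording_type: {recording_type!r}")

    for key in ("data_variables", "display_hover_variables"):
        for name in config.get(key, []):
            if name not in known_columns:
                problems.append(f"{key}: {name!r} is not a column of matrix_df")
    return problems


# With real data, pass list(matrix_df.columns) instead of this toy list
matrix_columns = ["cell_type", "photoinhibition", "depth"]

good_config = {
    "recording_type": "2P",
    "data_variables": ["cell_type", "photoinhibition"],
    "display_hover_variables": ["pdIndex", "depth", "cell_type", "photoinhibition"],
}
bad_config = {
    "recording_type": "ephys",  # wrong capitalisation
    "data_variables": ["celltype"],  # typo in the column name
    "display_hover_variables": ["pdIndex", "depth"],
}

print(validate_config(good_config, matrix_columns))
print(validate_config(bad_config, matrix_columns))
```

Returning a list of problems rather than raising on the first one makes it possible to fix all typos in a single pass.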


In [12]:
import pandas as pd

import drcell

# converts the arrays to dataframes and assigns column names to the matrix_df
data_df = pd.DataFrame(data_array)
matrix_df = pd.DataFrame(matrix_array,
                         columns=["cell_type", "photoinhibition", "activity_mode_w_c0", "activity_mode_w_c1",
                                  "activity_mode_w_c2", "activity_mode_w_c3", "activity_mode_w_c4",
                                  "activity_mode_w_c5", "connectivity_c0", "connectivity_c1", "connectivity_c2",
                                  "depth", "mice_session_c0", "mice_session_c1", "mice_session_c2",
                                  "mice_session_c3"])

config = {
    # can be "None", "2P" for Two Photon Microscopy data, or "Ephys" for electrophysiological data.
    # This mainly changes the way the graphs in the hover tool are plotted.
    "recording_type": "2P",
    # variables from the matrix_df that are selectable in the Color and Filter settings
    "data_variables": ["cell_type", "photoinhibition"],
    # variables from the matrix_df that get displayed in the hover tool
    "display_hover_variables": ["pdIndex", "depth", "cell_type", "photoinhibition"],
}

# Then just combine all of these variables into a DrCELL.h5 file
drcell.save_as_dr_cell_h5("example_drcell.h5", data_df, matrix_df, config)

matrix_df

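| | cell_type | photoinhibition | activity_mode_w_c0 | activity_mode_w_c1 | activity_mode_w_c2 | activity_mode_w_c3 | activity_mode_w_c4 | activity_mode_w_c5 | connectivity_c0 | connectivity_c1 | connectivity_c2 | depth | mice_session_c0 | mice_session_c1 | mice_session_c2 | mice_session_c3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 303.0 | 1.0 | 1.0 | 2.0 | 1.0 |
| 1 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 803.0 | 1.0 | 1.0 | 3.0 | 1.0 |
| 2 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 803.0 | 1.0 | 1.0 | 5.0 | 1.0 |
| 3 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 703.0 | 1.0 | 1.0 | 6.0 | 1.0 |
| 4 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 603.0 | 1.0 | 1.0 | 7.0 | 1.0 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 9621 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 560.0 | 10.0 | 6.0 | 16.0 | 6.0 |
| 9622 | 0.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 585.0 | 10.0 | 6.0 | 17.0 | 6.0 |
| 9623 | 2.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1.0 | NaN | 635.0 | 10.0 | 6.0 | 18.0 | 6.0 |
| 9624 | 1.0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 660.0 | 10.0 | 6.0 | 19.0 | 6.0 |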
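
The column names passed to matrix_df were written out by hand: fields that span several columns get a `_c0`, `_c1`, ... suffix, while single-column fields keep their plain name. A minimal sketch of generating the same names from the width of each field is shown below. `expand_column_names` is a hypothetical helper, not part of DrCELL, and the widths are taken from the column list used in this example.

```python
def expand_column_names(field_widths):
    """Build flat column names from a mapping of field name -> column count.

    Hypothetical helper for illustration; not part of DrCELL.
    """
    names = []
    for field, width in field_widths.items():
        if width == 1:
            # single-column fields keep their plain name
            names.append(field)
        else:
            # multi-column fields get one suffixed name per column
            names.extend(f"{field}_c{i}" for i in range(width))
    return names


# Widths as implied by the hand-written column list in this example.
# With the loaded file, a width could be read from the array shape,
# e.g. example_mat_file["neuron_info_depth"].shape[1].
field_widths = {
    "cell_type": 1,
    "photoinhibition": 1,
    "activity_mode_w": 6,
    "connectivity": 3,
    "depth": 1,
    "mice_session": 4,
}

column_names = expand_column_names(field_widths)
print(len(column_names))
print(column_names)
```

The order of the dictionary entries has to match the order in which the arrays were concatenated into matrix_array.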


Now you can open your data in DrCELL as shown in the [README.md](https://github.com/lucakoe/DrCELL), by specifying the path to the file or the folder it is in.