
Working with DLStream output

Jens Schweihoff edited this page Feb 23, 2021 · 3 revisions

Understanding a software's output is crucial for its adoption and adaptation, especially in the open source community. Even if authors think that their software's output is easy to understand, it never hurts to explain it again. In the following section, we will go over the general layout of DLStream's output step by step, so that you can start analysing your experimental results right away.

Let's start with the general output. DeepLabStream outputs a raw video and a table-based csv file for every DLC-enabled session that you run with it. While the video is a simple recording of the entire session (if you pressed "Start recording" or started an experiment) and quite straightforward, the table-based file can be harder to understand and work with. You can open the csv file with any text editor or Excel (or directly in Python).

Here is a modified example from one of the experiments used for the paper:


;Animal1;Animal1;Animal1;Animal1;Animal1;Animal1;Experiment;Experiment;Time
;neck;neck;nose;nose;tailroot;tailroot;Status;Trial;
;x;y;x;y;x;y;;;
Frame;;;;;;;;;
1;464.0;131.0;475.0;127.0;427.0;131.0;False;False;0.001
2;423.0;336.0;409.0;345.0;436.0;318.0;False;False;0.033
3;425.0;336.0;412.0;346.0;437.0;318.0;True;False;0.066
4;425.0;336.0;412.0;346.0;437.0;318.0;True;False;0.1
5;424.0;336.0;411.0;348.0;437.0;317.0;True;True;0.133
6;424.0;336.0;411.0;348.0;437.0;317.0;True;True;0.166
7;424.0;336.0;414.0;348.0;437.0;317.0;True;False;0.199
8;424.0;336.0;414.0;348.0;437.0;317.0;True;False;0.243

You can see that the general structure is made up of entries separated by ";" (a delimiter). When importing the file into Excel, for example, you need to specify the delimiter so that Excel can detect new columns/cells.

In addition, you can see that the header actually spans multiple rows, which describe the underlying numbers. We go from left to right and from top to bottom. As you have probably already guessed, "Frame" refers to the discrete time series (the frame sequence) and therefore also acts as an index.
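To make the multi-row header concrete, here is a minimal sketch that uses only Python's built-in csv module. It reads a shortened copy of the example above and joins the three header rows column by column. The variable names and the underscore-joined naming scheme are illustrative choices, not something DLStream prescribes.

```python
import csv
import io

# Shortened copy of the example output above (first two frames only).
raw = """;Animal1;Animal1;Animal1;Animal1;Animal1;Animal1;Experiment;Experiment;Time
;neck;neck;nose;nose;tailroot;tailroot;Status;Trial;
;x;y;x;y;x;y;;;
Frame;;;;;;;;;
1;464.0;131.0;475.0;127.0;427.0;131.0;False;False;0.001
2;423.0;336.0;409.0;345.0;436.0;318.0;False;False;0.033
"""

rows = list(csv.reader(io.StringIO(raw), delimiter=";"))
header_rows, index_row, data_rows = rows[:3], rows[3], rows[4:]

# Join the three header levels of each column, skipping empty cells.
names = ["_".join(part for part in column if part) for column in zip(*header_rows)]
names[0] = index_row[0]  # the first column is the index, named "Frame"

print(names)

# Map the flat names onto the first data row (values are still strings here).
first_frame = dict(zip(names, data_rows[0]))
print(first_frame["Animal1_nose_x"], first_frame["Experiment_Status"])
```

Each resulting name, such as Animal1_nose_x or Experiment_Status, reads top to bottom through the header rows.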

"Animal1" stands for the tracked animal and its pose estimation data (in multiple animal experiments the second animal would be "Animal2" and so on), while "Experiment" is a collection of experiment-based parameters that can change frame by frame. The entry "Time" measures the time that has passed since the start of the experiment and gives a good estimate of your overall performance (e.g. 0.032 s, or 32 ms, between Frame 1 and 2). The performance time is the time between two full loops in DLStream (including the whole experiment), and its most time-demanding component is the prediction itself. In the next header rows we see the DLC-style reporting of body parts (neck, nose, tailroot) and their estimated x/y positions.
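The loop time can be estimated from the differences between consecutive "Time" entries. The sketch below does this with the standard library on a shortened copy of the example. It assumes that "Time" is given in seconds and that it is the last column of every row; the variable names are illustrative.

```python
import csv
import io

# Shortened copy of the example output above (first four frames only).
raw = """;Animal1;Animal1;Animal1;Animal1;Animal1;Animal1;Experiment;Experiment;Time
;neck;neck;nose;nose;tailroot;tailroot;Status;Trial;
;x;y;x;y;x;y;;;
Frame;;;;;;;;;
1;464.0;131.0;475.0;127.0;427.0;131.0;False;False;0.001
2;423.0;336.0;409.0;345.0;436.0;318.0;False;False;0.033
3;425.0;336.0;412.0;346.0;437.0;318.0;True;False;0.066
4;425.0;336.0;412.0;346.0;437.0;318.0;True;False;0.1
"""

rows = list(csv.reader(io.StringIO(raw), delimiter=";"))
data_rows = rows[4:]  # skip the three header rows and the "Frame" index row

# Assumption: "Time" is the last column and is measured in seconds.
times = [float(row[-1]) for row in data_rows]

# Time between two consecutive frames, rounded to the precision of the file.
intervals = [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]
mean_interval = sum(intervals) / len(intervals)

print(intervals)
print(f"mean loop time: {mean_interval * 1000:.1f} ms")
```

A clearly larger interval, like the one between Frame 7 and 8 in the full example, points to a single slower loop.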

"Status" is a boolean parameter that states whether the experiment was started using the "Start Experiment" button. The unprocessed csv file includes all pose estimation data from the moment of "Start Analysis", which has to be initiated first so that the network is at "full speed" before any experiment starts. If you want to restrict your output to the experiment, you can use this column to filter for True values.
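Filtering on "Status" could look like the following minimal sketch, again using only the standard library and a shortened copy of the example. It locates the column through the second header row, which is an illustrative choice, and compares against the text "True" because the csv module returns every cell as a string.

```python
import csv
import io

# Shortened copy of the example output above (first four frames only).
raw = """;Animal1;Animal1;Animal1;Animal1;Animal1;Animal1;Experiment;Experiment;Time
;neck;neck;nose;nose;tailroot;tailroot;Status;Trial;
;x;y;x;y;x;y;;;
Frame;;;;;;;;;
1;464.0;131.0;475.0;127.0;427.0;131.0;False;False;0.001
2;423.0;336.0;409.0;345.0;436.0;318.0;False;False;0.033
3;425.0;336.0;412.0;346.0;437.0;318.0;True;False;0.066
4;425.0;336.0;412.0;346.0;437.0;318.0;True;False;0.1
"""

rows = list(csv.reader(io.StringIO(raw), delimiter=";"))

# "Status" is named in the second header row; find its column position.
status_col = rows[1].index("Status")

# Keep only data rows recorded while the experiment was running.
experiment_rows = [row for row in rows[4:] if row[status_col] == "True"]
experiment_frames = [int(row[0]) for row in experiment_rows]

print(experiment_frames)  # [3, 4]
```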

"Trial" refers to the experiment-specific output. It can be a bool (as in the example) or a string that specifies which trial was active (e.g. in the conditioning task the trial column would specify whether it was an aversive or appetitive stimulation trial). Look at the experiment section to learn more about this output. If you are using a simple stimulation (e.g. optogenetic), a boolean is the easiest option for any later analysis.
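One way to turn the frame-by-frame "Trial" column into trial blocks is to group consecutive frames that share the same value. The sketch below uses itertools.groupby on a shortened copy of the example. Treating the text "False" as "no trial active" is an assumption that fits the boolean example; a string-based experiment may mark inactive frames differently.

```python
import csv
import io
from itertools import groupby

# Shortened copy of the example output above (frames 4 to 7 only).
raw = """;Animal1;Animal1;Animal1;Animal1;Animal1;Animal1;Experiment;Experiment;Time
;neck;neck;nose;nose;tailroot;tailroot;Status;Trial;
;x;y;x;y;x;y;;;
Frame;;;;;;;;;
4;425.0;336.0;412.0;346.0;437.0;318.0;True;False;0.1
5;424.0;336.0;411.0;348.0;437.0;317.0;True;True;0.133
6;424.0;336.0;411.0;348.0;437.0;317.0;True;True;0.166
7;424.0;336.0;414.0;348.0;437.0;317.0;True;False;0.199
"""

rows = list(csv.reader(io.StringIO(raw), delimiter=";"))
trial_col = rows[1].index("Trial")
frames = [(int(row[0]), row[trial_col]) for row in rows[4:]]

# Group consecutive frames with the same Trial value into (value, first, last).
blocks = []
for value, group in groupby(frames, key=lambda item: item[1]):
    frame_numbers = [frame for frame, _ in group]
    if value != "False":  # assumption: "False" means no trial is active
        blocks.append((value, frame_numbers[0], frame_numbers[-1]))

print(blocks)  # [('True', 5, 6)]
```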

If you are familiar with python programming and pandas, I recommend using the following lines for easy import of DLStream's csv output:

import numpy as np
import pandas as pd

df = pd.read_csv('path_to_csv',header=[0,1,2],sep=';',index_col=0)

Sometimes pd.read_csv will not read the empty header cells correctly and labels them "Unnamed: ..." instead (due to an issue with multiple indexing and unique names). One way to solve this is to remove the automatic naming again with the following function (see https://stackoverflow.com/questions/41221079/rename-multiindex-columns-in-pandas):

def remove_unnamed_lvl_multiindex(df: pd.DataFrame):
    # Replace the automatic "Unnamed: ..." labels with empty strings on every header level.
    for i, columns_old in enumerate(df.columns.levels):
        columns_new = np.where(columns_old.str.contains('Unnamed'), '', columns_old)
        df.rename(columns=dict(zip(columns_old, columns_new)), level=i, inplace=True)
    return df

df = remove_unnamed_lvl_multiindex(df)
df.head()
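If the three-level column index is more than you need, a possible alternative is to collapse it into flat column names after loading. The following is only a sketch, not part of DLStream: the underscore-joined names are an illustrative choice, and it assumes that pandas labels empty header cells with text starting with "Unnamed".

```python
import io

import pandas as pd

# Shortened copy of the example output above (first four frames only).
raw = """;Animal1;Animal1;Animal1;Animal1;Animal1;Animal1;Experiment;Experiment;Time
;neck;neck;nose;nose;tailroot;tailroot;Status;Trial;
;x;y;x;y;x;y;;;
Frame;;;;;;;;;
1;464.0;131.0;475.0;127.0;427.0;131.0;False;False;0.001
2;423.0;336.0;409.0;345.0;436.0;318.0;False;False;0.033
3;425.0;336.0;412.0;346.0;437.0;318.0;True;False;0.066
4;425.0;336.0;412.0;346.0;437.0;318.0;True;False;0.1
"""

df = pd.read_csv(io.StringIO(raw), header=[0, 1, 2], sep=";", index_col=0)

# Join the parts of each column tuple, dropping empty or "Unnamed" placeholders.
df.columns = [
    "_".join(
        str(part)
        for part in column
        if str(part) and not str(part).startswith("Unnamed")
    )
    for column in df.columns
]

print(list(df.columns))

# With flat names, restricting the data to the experiment is a one-liner.
experiment = df[df["Experiment_Status"]]
print(experiment.index.tolist())
```

With your own data, replace io.StringIO(raw) with the path to the csv file.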

In the future, we plan to release a few utility scripts that will help you process results obtained from DeepLabStream.

Growing utility script repo:

Check out Dlstream Utils to get some useful utility scripts.