# Generating XML Output

MouseBytes is an open-access, web-based repository for integrating and sharing cognitive data. It has mainly focused on behavioral data obtained with Bussey-Saksida touchscreen technology, since such systems automate the execution of several cognitive tasks and produce standardized outputs. Because MouseBytes accepts only the XML format, the following script produces output in a format that can be uploaded to MouseBytes. Results obtained with devices other than touchscreen systems can therefore be converted with this script into a format that MouseBytes can handle.

Please pay attention to all the documentation and guidelines provided in this script, and contact us at <b>mousebytes@uwo.ca</b> if you have any questions or concerns.

In [46]:
import sys
import time
import xml.etree.ElementTree as ET  # cElementTree was removed in Python 3.9
from xml.dom import minidom

from lxml import etree

#### Each XML output consists of two main nodes as below: <br/>
<b>1) SessionInformation:</b> All constant features such as "Animal ID" and "Date/Time" are stored in this node. Regardless of the cognitive task, the same set of features listed later in this script should be added to this node. <br/>

<b>2) MarkerData:</b> This node contains the dynamic features; the set of features added here depends on the cognitive task. Each cognitive task has some key features that are commonly used for analyzing and visualizing the data. We therefore expect external systems to map the names of their key features to those defined in MouseBytes, to facilitate comparison and reproducibility.
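
To make the two-node layout concrete, the sketch below builds a tiny document with one entry in each node and prints it. It uses the standard-library `xml.etree.ElementTree` only so that the snippet runs without extra packages (the cells in this script use `lxml.etree`, whose `Element` and `SubElement` calls look the same). The values are placeholders borrowed from the examples in this script.

```python
import xml.etree.ElementTree as ET

# Root element with the two main nodes described above.
root = ET.Element("LiEvent")
session = ET.SubElement(root, "SessionInformation")
markers = ET.SubElement(root, "MarkerData")

# One constant feature: an <Information> entry holding a Name/Value pair.
info = ET.SubElement(session, "Information")
ET.SubElement(info, "Name").text = "Animal ID"
ET.SubElement(info, "Value").text = "10"

# One dynamic feature: a <Marker> entry with its name, type, and value.
marker = ET.SubElement(markers, "Marker")
ET.SubElement(marker, "Name").text = "End Summary - No Correction Trials"
ET.SubElement(marker, "SourceType").text = "Evaluation"
ET.SubElement(marker, "Results").text = "35"

ET.indent(root, space="   ")  # pretty-print in place (requires Python 3.9+)
print(ET.tostring(root, encoding="unicode"))
```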


In [47]:
# XML files with different outputs are built depending on the cognitive task.
# Select the cognitive task for which the XML file should be generated.
print('The following numbers are used for generating the XML file for each cognitive task: \n 1= 5-Choice \n 2= PD \n 3= PAL')

TaskId = input('Please enter the right number/code for your task (for example, enter 1 if your task is 5Choice): ')
TaskId = int(TaskId)
if TaskId not in [1,2,3]:
    print("The task code is invalid; it must be 1, 2, or 3.")
    time.sleep(6)
    sys.exit()

The following numbers are used for generating the XML file for each cognitive task: 
 1= 5-Choice 
 2= PD 
 3= PAL
Please enter the right number/code for your task (for example, enter 1 if your task is 5Choice): 1
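
Note that `int()` raises `ValueError` when the entered text is not a number (for example, `5Choice`). A minimal sketch of a more forgiving check is given below; `parse_task_id` and `TASK_NAMES` are hypothetical names introduced here for illustration and are not part of the script.

```python
from typing import Optional

# Task codes and names, as printed by the prompt in this script.
TASK_NAMES = {1: "5-Choice", 2: "PD", 3: "PAL"}

def parse_task_id(raw: str) -> Optional[int]:
    """Return the task code as an int, or None if the input is not a valid code."""
    try:
        task_id = int(raw.strip())
    except ValueError:
        # Non-numeric input such as "5Choice" ends up here.
        return None
    return task_id if task_id in TASK_NAMES else None

print(parse_task_id("1"))        # 1
print(parse_task_id("5Choice"))  # None
print(parse_task_id("7"))        # None
```

The caller can then decide whether to ask again or exit when `None` is returned.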


In [48]:
# Adding main nodes to xml file
LiEvent = etree.Element("LiEvent")
SessionInformation = etree.SubElement(LiEvent, "SessionInformation")
MarkerData = etree.SubElement(LiEvent, "MarkerData")

### Adding static features to SessionInformation node:

Features belonging to this node are constant, and the variable "SessionInfo_Features_Name" contains the list of such features. Note that you can add extra features to this list. The following features MUST have a value; otherwise, the generated XML file cannot be uploaded to the system. 
<ul>
<li><b>Date/Time:</b> This feature shows the date of the test.</li>
<li><b>Analysis Name:</b> This feature should indicate the name of the task. For example, if the task is PAL, it must contain the keyword pal.</li>
<li><b>Animal ID:</b></li>
<li><b>Max_Number_Trials:</b> The maximum number of trials that the animal should perform during a test.</li>
<li><b>Max_Schedule_Time:</b> The maximum amount of time (in seconds) for performing the test.</li>
</ul>
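
Since an upload fails when any of the required features above is empty, it can help to check them before building the XML. The following is a minimal sketch under that assumption; `REQUIRED_FEATURES` and `missing_required_features` are hypothetical helpers, not part of MouseBytes or of this script.

```python
# Features that must be non-empty, copied from the list above.
REQUIRED_FEATURES = ["Date/Time", "Analysis Name", "Animal ID",
                     "Max_Number_Trials", "Max_Schedule_Time"]

def missing_required_features(session_info: dict) -> list:
    """Return the required feature names that are absent or blank."""
    missing = []
    for name in REQUIRED_FEATURES:
        value = session_info.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing

example = {"Date/Time": "9/12/2014 12:07:04 PM", "Animal ID": "10",
           "Analysis Name": "", "Max_Number_Trials": "30"}
print(missing_required_features(example))
# ['Analysis Name', 'Max_Schedule_Time']
```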

In [49]:
# "SessionInfo_Features_Name" contains the list of constant features that need to be added to the "SessionInformation" node.
# As mentioned above, features such as "Animal ID" and "Analysis Name" must have a value. Other features can remain empty.

SessionInfo_Features_Name = ['Database', 'Date/Time', 'Environment', 'Machine Name', 'Analysis Name', 'Schedule Name',
                        'Guid', 'Schedule Run ID', 'Version', 'Version Name', 'Animal ID', 'Application_Version',
                        'Max_Number_Trials', 'Max_Schedule_Time', 'Schedule_Description', 'Schedule_Start_Time']

SessionInfo_Features_Val= [''] * len (SessionInfo_Features_Name)

# Creating a dictionary for the SessionInfo items (feature name -> value)
SessioInfo_dict = dict(zip(SessionInfo_Features_Name, SessionInfo_Features_Val))

# Examples of initializing the required features
SessioInfo_dict['Date/Time'] = '9/12/2014 12:07:04 PM'
SessioInfo_dict['Animal ID']= '10'
SessioInfo_dict['Max_Number_Trials'] = '30'
SessioInfo_dict['Max_Schedule_Time'] = '3600'

if TaskId == 1:
    SessioInfo_dict['Analysis Name'] = '5-Choice'
    
elif TaskId == 2:
    SessioInfo_dict['Analysis Name'] = 'PD'

elif TaskId == 3:
    SessioInfo_dict['Analysis Name'] = 'PAL'


print(SessioInfo_dict)

{'Database': '', 'Date/Time': '9/12/2014 12:07:04 PM', 'Environment': '', 'Machine Name': '', 'Analysis Name': '5-Choice', 'Schedule Name': '', 'Guid': '', 'Schedule Run ID': '', 'Version': '', 'Version Name': '', 'Animal ID': '10', 'Application_Version': '', 'Max_Number_Trials': '30', 'Max_Schedule_Time': '3600', 'Schedule_Description': '', 'Schedule_Start_Time': ''}


In [50]:
# Adding constant features to the "SessionInformation" node.
for feature, value in SessioInfo_dict.items():
    Information = etree.SubElement(SessionInformation, "Information")
    etree.SubElement(Information, "Name").text = feature
    etree.SubElement(Information, "Value").text = value 

### Adding dynamic features to MarkerData node:

Features belonging to this node fall into two groups: some are repeated several times, depending on the maximum number of trials, while others occur only once. Two functions are therefore defined and used to add the dynamic features to this node: <b>AddDynamicFeature()</b> and <b>AddDynamicFeature_withRepetition()</b>.

<br/>In addition, a variable called <b>SourceType</b> indicates the type of the feature. For example, some features are temporal and measure an amount of time, so the value of SourceType for such features is "Measure". There are three types of features; hence, SourceType can take any of the following three values: <br />

<b>1) Evaluation:</b> This type is used for features that calculate performance, such as "Accuracy", "Omission", etc.<br />
<b>2) Measure:</b> This type is used for temporal features such as "Correct Touch Latency", etc. <br /> 
<b>3) Count:</b> This type is used for features that calculate a count/number, such as "Premature Responses" in the 5-Choice task.
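
Each SourceType stores its value in different child tags of a `Marker`: `Results` for "Evaluation", `Time` and `Duration` for "Measure", and `Count` for "Count". The sketch below expresses that mapping as a lookup table. `VALUE_TAGS` and `build_marker` are hypothetical names used only for illustration, the standard-library `xml.etree.ElementTree` stands in for `lxml.etree`, and, unlike the functions in this script, the sketch always writes the value tags even when they are empty.

```python
import xml.etree.ElementTree as ET

# Child tags that carry the value(s) for each SourceType.
VALUE_TAGS = {
    "Evaluation": ("Results",),
    "Measure": ("Time", "Duration"),
    "Count": ("Count",),
}

def build_marker(name, source_type, **values):
    """Build a standalone <Marker> element; `values` maps child tag to text."""
    if source_type not in VALUE_TAGS:
        raise ValueError(f"Unknown SourceType: {source_type!r}")
    marker = ET.Element("Marker")
    ET.SubElement(marker, "Name").text = name
    ET.SubElement(marker, "SourceType").text = source_type
    for tag in VALUE_TAGS[source_type]:
        # Missing values become empty strings in this sketch.
        ET.SubElement(marker, tag).text = values.get(tag, "")
    return marker

m = build_marker("Trial Analysis - Correct Response Latency", "Measure",
                 Time="155606000", Duration="630000")
print(ET.tostring(m, encoding="unicode"))
```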

In [51]:
# Function definition to add a feature that appears only once to "MarkerData" node in XML file
def AddDynamicFeature(SourceType, FeatureName: str, FeatureVal=None, TimeVal=None, DurationVal=None):
    
    # If SourceType is "Evaluation" or "Count", "FeatureVal" must be passed with a value;
    # "TimeVal" and "DurationVal" are not needed.
    
    # If SourceType is "Measure", "TimeVal" and "DurationVal" must be passed, and None should be passed for "FeatureVal".
    # "DurationVal" is the main feature and holds the amount of time; "TimeVal" only records when the event
    # happened. Its value can be empty, but it must still be passed to the function.

    if SourceType == 'Evaluation':
        Marker = etree.SubElement(MarkerData, "Marker")
        etree.SubElement(Marker, "Name").text = FeatureName
        etree.SubElement(Marker, "SourceType").text = 'Evaluation'
        if FeatureVal!='':
            etree.SubElement(Marker, "Results").text = FeatureVal
        
    elif SourceType == 'Measure':
        Marker = etree.SubElement(MarkerData, "Marker")
        etree.SubElement(Marker, "Name").text = FeatureName
        etree.SubElement(Marker, "SourceType").text = 'Measure'
        if TimeVal!='' or DurationVal!='':
            etree.SubElement(Marker, "Time").text = TimeVal 
            etree.SubElement(Marker, "Duration").text = DurationVal
        
    elif SourceType == 'Count':
        Marker = etree.SubElement(MarkerData, "Marker")
        etree.SubElement(Marker, "Name").text = FeatureName
        etree.SubElement(Marker, "SourceType").text = 'Count'
        if FeatureVal!='':
            etree.SubElement(Marker, "Count").text = FeatureVal
    else:
        print('The value of input param SourceType is invalid!')
      

In [52]:
# Function definition to add a feature that appears multiple times to MarkerData node in XML file. 
def AddDynamicFeature_withRepetition(SourceType, FeatureName: str, count: int, FeatureVal=None, TimeVal=None, DurationVal=None):
    
    # If SourceType is "Evaluation" or "Count", "FeatureVal" must be passed with a value;
    # "TimeVal" and "DurationVal" are not needed.
    
    # If SourceType is "Measure", "TimeVal" and "DurationVal" must be passed, and None should be passed for "FeatureVal".
    # "DurationVal" is the main feature and holds the amount of time; "TimeVal" only records when the event
    # happened. Its values can be empty, but it must still be passed to the function.
    
    # "count" is the number of times the feature is repeated; its value must be greater than 1.
    # "FeatureVal", "TimeVal", and "DurationVal" are lists of strings, and the length of each list that is passed
    # must equal "count" (i.e. the number of repetitions).
    
    if(count>1):
        for i in range(count):
        
            if SourceType == 'Evaluation':
                if len(FeatureVal) == count:
                    Marker = etree.SubElement(MarkerData, "Marker")
                    etree.SubElement(Marker, "Name").text = FeatureName
                    etree.SubElement(Marker, "SourceType").text = 'Evaluation'
                    if FeatureVal[i]!='':
                        etree.SubElement(Marker, "Results").text = FeatureVal[i]
                else:
                    print('The length of FeatureVal should be equal to the number of times feature is repeated (i.e count).')

            elif SourceType == 'Measure':
                if len(TimeVal) == count and len(DurationVal) == count:
                    Marker = etree.SubElement(MarkerData, "Marker")
                    etree.SubElement(Marker, "Name").text = FeatureName
                    etree.SubElement(Marker, "SourceType").text = 'Measure'
                    if TimeVal[i]!='' or DurationVal[i]!='':
                        etree.SubElement(Marker, "Time").text = TimeVal[i]
                        etree.SubElement(Marker, "Duration").text = DurationVal[i]
                else:
                    print('The length of TimeVal and DurationVal should be equal to the number of times feature is repeated (i.e count).')

            elif SourceType == 'Count':
                if len(FeatureVal) == count: 
                    Marker = etree.SubElement(MarkerData, "Marker")
                    etree.SubElement(Marker, "Name").text = FeatureName
                    etree.SubElement(Marker, "SourceType").text = 'Count'
                    if FeatureVal[i]!='':
                        etree.SubElement(Marker, "Count").text = FeatureVal[i]
                else:
                    print('The length of FeatureVal should be equal to the number of times feature is repeated (i.e count).')
            else:
                print('The value of input param SourceType is invalid!')
        
    else:
        print('Input parameter "count" shows the number of repetitions, and it must be greater than 1.')
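
Because every list passed to the function above must contain exactly `count` items, it can be convenient to check the lengths once before calling it. The helper below is a minimal sketch of such a pre-check; `check_lengths` is a hypothetical name and is not part of the script.

```python
def check_lengths(count, **lists):
    """Return the names of the lists that are missing or whose length differs from `count`."""
    return [name for name, values in lists.items()
            if values is None or len(values) != count]

time_vals = ['155606000', '277627000', '312959000']
duration_vals = ['630000', '930000']  # one value short on purpose

problems = check_lengths(3, TimeVal=time_vals, DurationVal=duration_vals)
print(problems)  # ['DurationVal']
```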
    

### Adding key features of each cognitive task to the XML file:
Each cognitive task has its own key features. As mentioned earlier, these key features are commonly used for analyzing and visualizing the data, so their names should be mapped to those defined in MouseBytes to facilitate comparison and reproducibility.

XML files with different outputs are built depending on the cognitive task. Below, you can find the names of the key features for some cognitive tasks. (We will update this file for other tasks later.)

<ul> 
 <li><b>5-Choice</b>
     <ul>
       <li>Threshold - Accuracy %</li>
       <li>Threshold - Omission %</li>
       <li>Threshold - Stimulus Duration</li>
       <li>Perseverative Correct - Total</li>
       <li>Premature Responses - Total</li>
       <li>Trial Analysis - Accuracy%</li>
       <li>Trial Analysis - Omission%</li>
       <li>Trial Analysis - Correct Response Latency</li>
       <li>Trial Analysis - Reward Collection Latency</li>
       
    </ul>
 </li>
 <br/>
 <li><b>PD</b>
     <ul>
       <li>End Summary - % Correct</li>
       <li>End Summary - No Correction Trials</li>
       <li>Correct touch latency</li>
       <li>Correct Reward Collection</li>
    </ul>
 </li>
 <br/>
 <li><b>PAL</b>
     <ul>
       <li>End Summary - % Correct</li>
       <li>End Summary - No Correction Trials</li>
       <li>Correct touch latency</li>
       <li>Correct Reward Collection</li>
    </ul>
 </li>
 
</ul>
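
The key feature names above can also be kept in a lookup table so that a set of marker names coming from an external system can be checked against them. This is only a sketch: `KEY_FEATURES` and `missing_key_features` are hypothetical names, and the strings are copied from the list above.

```python
# Key feature names per task, copied from the list above.
KEY_FEATURES = {
    "5-Choice": [
        "Threshold - Accuracy %",
        "Threshold - Omission %",
        "Threshold - Stimulus Duration",
        "Perseverative Correct - Total",
        "Premature Responses - Total",
        "Trial Analysis - Accuracy%",
        "Trial Analysis - Omission%",
        "Trial Analysis - Correct Response Latency",
        "Trial Analysis - Reward Collection Latency",
    ],
    "PD": [
        "End Summary - % Correct",
        "End Summary - No Correction Trials",
        "Correct touch latency",
        "Correct Reward Collection",
    ],
    "PAL": [
        "End Summary - % Correct",
        "End Summary - No Correction Trials",
        "Correct touch latency",
        "Correct Reward Collection",
    ],
}

def missing_key_features(task, marker_names):
    """Return the key features of `task` that do not appear in `marker_names`."""
    present = set(marker_names)
    return [name for name in KEY_FEATURES[task] if name not in present]

emitted = ["End Summary - % Correct", "Correct touch latency"]
print(missing_key_features("PD", emitted))
# ['End Summary - No Correction Trials', 'Correct Reward Collection']
```

The comparison is exact, so differences in spacing or capitalization count as a missing feature.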

The next code cells show examples of calling these functions to add the key features of each cognitive task, both for features that occur once and for features that occur multiple times. You can add your own extra features to the XML file by calling the corresponding functions.

In [53]:
# Examples of adding the key features to the xml file depending on which cognitive task is selected by the user.

if TaskId == 1: #(5-Choice)
    
    AddDynamicFeature("Evaluation", "Threshold - Accuracy %", FeatureVal="80.0", TimeVal=None, DurationVal=None)
    AddDynamicFeature("Evaluation", "Threshold - Omission %", FeatureVal="30.0", TimeVal=None, DurationVal=None)
    AddDynamicFeature("Evaluation", "Threshold - Stimulus Duration", FeatureVal="0.6", TimeVal=None, DurationVal=None)
    AddDynamicFeature("Count", "Perseverative Correct - Total", FeatureVal="5", TimeVal=None, DurationVal=None)
    AddDynamicFeature("Count", "Premature Responses - Total", FeatureVal="2", TimeVal=None, DurationVal=None)
         
    # Adding a feature with repetition
    AddDynamicFeature_withRepetition("Evaluation", "Trial Analysis - Accuracy%", 5,
                                     ['100', '', '70', '75', ''], TimeVal=None, DurationVal=None) 
    # Note that the value for some trials can be an empty string, but it must still be included in the list, as in this example.
    
    AddDynamicFeature_withRepetition("Evaluation", "Trial Analysis - Omission%", 5,
                                     ['', '30', '', '', '20'], TimeVal=None, DurationVal=None)
    
    AddDynamicFeature_withRepetition("Measure", "Trial Analysis - Correct Response Latency", 5,
                                     FeatureVal=None,
                                     TimeVal=['155606000', '277627000', '312959000', '511027000', '546697000'],
                                     DurationVal=['630000', '930000', '1474000', '857000', '770000'])
    
    AddDynamicFeature_withRepetition("Measure", "Trial Analysis - Reward Collection Latency", 5,
                                     FeatureVal=None,
                                     TimeVal=['721434000', '738198000', '759511000', '809336000', '910001000'],
                                     DurationVal=['783000', '681000', '582000', '727000', '621000'])                                         
    
elif TaskId == 2: #(PD)
                                              
    AddDynamicFeature("Evaluation", "End Summary - % Correct", FeatureVal="", TimeVal=None, DurationVal=None)
    AddDynamicFeature("Evaluation", "End Summary - No Correction Trials", FeatureVal="35", TimeVal=None, DurationVal=None)
    
    #with repetition
    AddDynamicFeature_withRepetition("Measure", "Correct touch latency", 5,
                                     FeatureVal=None,
                                     TimeVal=['77597000', '103225000', '163159000', '192688000', '216698000'],
                                     DurationVal=['2414000', '1269000', '6314000', '1920000', '1374000'])
                                              
    AddDynamicFeature_withRepetition("Measure", "Correct Reward Collection", 5,
                                     FeatureVal=None,
                                     TimeVal=['80011000', '104494000', '169473000', '194608000', '218072000'],
                                     DurationVal=['1248000', '927000', '2478000', '1002000', '1009000'])

elif TaskId == 3: #(PAL)
    
    AddDynamicFeature("Evaluation", "End Summary - % Correct", FeatureVal="", TimeVal=None, DurationVal=None)
    AddDynamicFeature("Evaluation", "End Summary - No Correction Trials", FeatureVal="35", TimeVal=None, DurationVal=None)
    
    #with repetition
    AddDynamicFeature_withRepetition("Measure", "Correct touch latency", 5,
                                     FeatureVal=None,
                                     TimeVal=['97597000', '123225000', '173659000', '223688000', '286699000'],
                                     DurationVal=['3514000', '1369000', '7314000', '2320000', '1674000'])
                                              
    AddDynamicFeature_withRepetition("Measure", "Correct Reward Collection", 5,
                                     FeatureVal=None,
                                     TimeVal=['90011000', '124494000', '178473000', '224608000', '224608001'],
                                     DurationVal=['1458000', '105000', '2578000', '112000', '112001'])
    
else:
    print('Wrong task code selected by user!')

In [58]:
# doc = etree.tostring(LiEvent, encoding="utf-8", method="html", xml_declaration=True, pretty_print=True)
# doc = doc.decode("utf-8")
# print(doc)

In [55]:
xmlstr = minidom.parseString(etree.tostring(LiEvent, encoding='utf8', method='html', xml_declaration=True)).toprettyxml(indent="   ")
with open("output.xml", "w", encoding="utf-8") as f:
    f.write(xmlstr)
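
Before uploading, the written file can be read back to confirm that it parses and contains the expected nodes. The sketch below does this with the standard-library `xml.etree.ElementTree`; `summarize` is a hypothetical helper, and the small sample document is written to a temporary file only so that the snippet runs on its own. In practice you would pass the path of your generated file (for example `output.xml`).

```python
import os
import tempfile
import xml.etree.ElementTree as ET

def summarize(path):
    """Return the root tag and the number of Information and Marker entries."""
    root = ET.parse(path).getroot()
    n_info = len(root.findall("SessionInformation/Information"))
    n_markers = len(root.findall("MarkerData/Marker"))
    return root.tag, n_info, n_markers

# A small stand-in for a generated file, used only for this demonstration.
sample = (
    "<LiEvent>"
    "<SessionInformation>"
    "<Information><Name>Animal ID</Name><Value>10</Value></Information>"
    "</SessionInformation>"
    "<MarkerData>"
    "<Marker><Name>A</Name><SourceType>Count</SourceType><Count>5</Count></Marker>"
    "<Marker><Name>B</Name><SourceType>Count</SourceType><Count>2</Count></Marker>"
    "</MarkerData>"
    "</LiEvent>"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.xml")
    with open(path, "w", encoding="utf-8") as f:
        f.write(sample)
    result = summarize(path)

print(result)  # ('LiEvent', 1, 2)
```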