# Analysis of AVID output files

All information about an AVID session is stored in its XML-style session file. Because the raw file is hard to read, we will instead look at how to analyze `.avid` files programmatically.

## Loading
First, we specify the file we want to open. Here we use `output/example.avid` from the basic example notebook.

When loading artefacts this way, we can set the argument `check_validity`. It determines whether the loaded artefacts are checked for validity (i.e. do the respective files exist?) and marked as `invalid` if needed. This matters when analyzing outputs that were produced on a different machine, e.g. when a workflow ran on a high-performance cluster and your current machine does not have access to the session data. In that case, you can avoid _every_ artefact being marked as invalid by setting `check_validity=False`. This reads in the session as is, without checking or changing anything.
For this example the setting makes no difference, since we have access to all the files used in the example session.

In [1]:
from avid.common.artefact import ArtefactCollection
import avid.common.artefact.fileHelper as fileHelper

In [2]:
filepath = "output/example.avid"
artefacts = ArtefactCollection(fileHelper.load_artefact_collection_from_xml(filepath, check_validity=False))

In [3]:
artefacts

ArtefactCollection(15 artefacts)
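To make the effect of `check_validity` more concrete, here is a minimal sketch of what such a check amounts to: test whether the file behind each `url` exists and flag the entry otherwise. It uses plain dictionaries as stand-ins for artefacts, and the helper `mark_missing_files_invalid` is hypothetical. It is not AVID's actual implementation.

```python
import os
import tempfile


def mark_missing_files_invalid(records):
    """Return copies of the records, flagged invalid if the 'url' file is missing.

    Hypothetical helper for illustration; plain dicts stand in for artefacts.
    """
    checked = []
    for record in records:
        updated = dict(record)
        file_missing = not os.path.isfile(record["url"])
        updated["invalid"] = record.get("invalid", False) or file_missing
        checked.append(updated)
    return checked


with tempfile.TemporaryDirectory() as tmp:
    # One file that exists and one path that does not.
    existing = os.path.join(tmp, "pat1_TP1_MR.txt")
    with open(existing, "w") as handle:
        handle.write("dummy")

    records = [
        {"case": "pat1", "url": existing, "invalid": False},
        {"case": "pat2", "url": os.path.join(tmp, "missing.txt"), "invalid": False},
    ]
    checked = mark_missing_files_invalid(records)

print([record["invalid"] for record in checked])
```

On a machine without access to the session data, every entry would end up flagged in this way, which is the situation `check_validity=False` avoids.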

## Selecting specific artefacts
Just like in an AVID workflow script, we can use _Selectors_ to filter for specific artefacts. For example, we can get all artefacts with a specific _actionTag_. Unlike in a workflow script, where we usually hand over _Selectors_ to various actions, here we use them directly on our collection of artefacts:

In [4]:
from avid.selectors import ActionTagSelector, TimepointSelector

In [5]:
mr_selector = ActionTagSelector('MR')
mr_artefacts = mr_selector.getSelection(artefacts)

In [6]:
mr_artefacts

ArtefactCollection(6 artefacts)

These results can then be further filtered by subsequent selectors. As usual, selectors can also be combined to apply several conditions at once:

In [7]:
tp1_selector = TimepointSelector('TP1')
mr_artefacts_tp1 = tp1_selector.getSelection(mr_artefacts)
print(f"Further filtering of mr_artefacts: {mr_artefacts_tp1}")

mr_tp1_selector = mr_selector + tp1_selector
mr_tp1_artefacts = mr_tp1_selector.getSelection(artefacts)
print(f"Combined filtering of both selectors: {mr_tp1_artefacts}")

print(f"Both methods yield the same result: {mr_artefacts_tp1 == mr_tp1_artefacts}")

Further filtering of mr_artefacts: ArtefactCollection(4 artefacts)
Combined filtering of both selectors: ArtefactCollection(4 artefacts)
Both methods yield the same result: True
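The equivalence shown above follows from the combined selector acting as a logical AND of both conditions. The sketch below illustrates that idea with plain Python predicates over dictionaries. The helpers `by_key` and `both` and the `CT` record are hypothetical and only mimic the behaviour; they are not part of AVID.

```python
# Plain dicts stand in for artefacts; the 'CT' entry is made up for contrast.
records = [
    {"case": "pat1", "timePoint": "TP1", "actionTag": "MR"},
    {"case": "pat1", "timePoint": "TP2", "actionTag": "MR"},
    {"case": "pat1", "timePoint": "TP1", "actionTag": "CT"},
]


def by_key(key, value):
    """Build a predicate that matches records whose `key` equals `value`."""
    return lambda record: record.get(key) == value


def both(first, second):
    """Combine two predicates so that a record must satisfy both."""
    return lambda record: first(record) and second(record)


is_mr = by_key("actionTag", "MR")
is_tp1 = by_key("timePoint", "TP1")

# Filter in two steps ...
mr_only = [record for record in records if is_mr(record)]
sequential = [record for record in mr_only if is_tp1(record)]

# ... or in one step with the combined predicate.
is_mr_and_tp1 = both(is_mr, is_tp1)
combined = [record for record in records if is_mr_and_tp1(record)]

print(sequential == combined)
```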


Without looking into the details of these artefact collections, we can already gather some insights. For example, this code shows what percentage of artefacts with the actionTag `MR` are valid:

In [8]:
from avid.selectors import ActionTagSelector, ValiditySelector

In [9]:
mr_selector = ActionTagSelector('MR')
mr_artefacts = mr_selector.getSelection(artefacts)
mr_count = len(mr_artefacts)

valid_selector = ValiditySelector()
valid_mr_artefacts = valid_selector.getSelection(mr_artefacts)
valid_mr_count = len(valid_mr_artefacts)

percentage = 100 * valid_mr_count / mr_count

print(f"Valid MR artefacts: {valid_mr_count}/{mr_count} ({percentage:2.0f}%)")

Valid MR artefacts: 6/6 (100%)
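If a selector matches no artefacts at all, the division above would raise a `ZeroDivisionError`. A small, hypothetical helper such as `format_valid_share` below can guard against that case. It works on plain counts, so it makes no assumptions about AVID beyond the `len()` calls already used above.

```python
def format_valid_share(valid_count, total_count, label="MR"):
    """Format the share of valid artefacts, tolerating an empty selection."""
    if total_count == 0:
        # Nothing was selected, so a percentage is not defined.
        return f"Valid {label} artefacts: 0/0 (n/a)"
    percentage = 100 * valid_count / total_count
    return f"Valid {label} artefacts: {valid_count}/{total_count} ({percentage:.0f}%)"


print(format_valid_share(6, 6))
print(format_valid_share(0, 0))
```

With the counts from the cell above, the call would be `format_valid_share(valid_mr_count, mr_count)`.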


## Accessing artefact information
If we are interested in the actual contents of the artefacts, we can simply iterate over the artefact collection. Each artefact's metadata can be accessed by key, as with a dictionary.

In [10]:
for artefact in mr_artefacts:
    print(artefact)

Artefact({'case': 'pat1', 'caseInstance': None, 'timePoint': 'TP1', 'actionTag': 'MR', 'type': 'result', 'format': 'itk', 'url': '..\\data\\img\\pat1_TP1_MR.txt', 'objective': 'MR', 'result_sub_tag': None, 'result_sub_count': None, 'invalid': False, 'input_ids': None, 'action_class': None, 'action_instance_uid': None, 'id': 'bbe232b9-5740-11ec-85a6-e9d058c65a83', 'timestamp': '1638869608.3333993', 'execution_duration': None}, {})
Artefact({'case': 'pat1', 'caseInstance': None, 'timePoint': 'TP2', 'actionTag': 'MR', 'type': 'result', 'format': 'itk', 'url': '..\\data\\img\\pat1_TP2_MR.txt', 'objective': 'MR', 'result_sub_tag': None, 'result_sub_count': None, 'invalid': False, 'input_ids': None, 'action_class': None, 'action_instance_uid': None, 'id': 'bbe232ba-5740-11ec-85a6-e9d058c65a83', 'timestamp': '1638869608.3335395', 'execution_duration': None}, {})
Artefact({'case': 'pat2', 'caseInstance': None, 'timePoint': 'TP1', 'actionTag': 'MR', 'type': 'result', 'format': 'itk', 'url': '..

In [11]:
for artefact in mr_artefacts:
    print(f"{artefact['case']}:  {artefact['url']}")

pat1:  ..\data\img\pat1_TP1_MR.txt
pat1:  ..\data\img\pat1_TP2_MR.txt
pat2:  ..\data\img\pat2_TP1_MR1.txt
pat2:  ..\data\img\pat2_TP1_MR2.txt
pat2:  ..\data\img\pat2_TP2_MR1.txt
pat3:  ..\data\img\pat3_TP1_MR.txt
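Key-based access also makes it easy to aggregate the metadata. As a minimal sketch, the listing above can be grouped by case. Plain dictionaries with the `case` and `url` values printed above stand in for the artefacts here, so the snippet runs without a session file. With a real collection, the same loop body should work on `mr_artefacts`, assuming key access behaves as shown in the previous cell.

```python
from collections import defaultdict

# Stand-in records copied from the output above (raw strings keep the backslashes).
records = [
    {"case": "pat1", "url": r"..\data\img\pat1_TP1_MR.txt"},
    {"case": "pat1", "url": r"..\data\img\pat1_TP2_MR.txt"},
    {"case": "pat2", "url": r"..\data\img\pat2_TP1_MR1.txt"},
    {"case": "pat2", "url": r"..\data\img\pat2_TP1_MR2.txt"},
    {"case": "pat2", "url": r"..\data\img\pat2_TP2_MR1.txt"},
    {"case": "pat3", "url": r"..\data\img\pat3_TP1_MR.txt"},
]

# Collect all URLs that belong to the same case.
urls_by_case = defaultdict(list)
for record in records:
    urls_by_case[record["case"]].append(record["url"])

for case, urls in sorted(urls_by_case.items()):
    print(f"{case}: {len(urls)} file(s)")
```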
