# ISA Create Mode example

## Abstract:
    
In this notebook, we'll show how to generate an ISA-Tab and an ISA JSON representation of a metabolomics study.
The study uses GC-MS and 13C NMR on 3 distinct sample types (liver, blood and heart) collected from study subjects assigned to 3 distinct study arms.

GC-MS acquisitions were carried out in duplicate: extracts were derivatized using BSA (bis(trimethylsilyl)acetamide) and acquired on an Agilent QTOF in both positive and negative modes.
13C NMR free induction decays were acquired in duplicate on a Bruker Avance, using CPMG and PSEQ pulse sequences.



### 1. Loading the ISA-API model and relevant libraries

In [1]:
# If executing the notebook on Google Colab, uncomment the following command
# and run it to install the required Python libraries and make the test datasets available.

# !pip install -r requirements.txt

In [2]:
from isatools import isatab
from isatools.isajson import ISAJSONEncoder
from collections import OrderedDict
from isatools.model import (
    Investigation,
    OntologyAnnotation,
    FactorValue,
    Characteristic
)
from isatools.create.model import (
    Treatment,
    NonTreatment,
    StudyCell,
    StudyArm,
    ProductNode,
    SampleAndAssayPlan,
    StudyDesign,
    QualityControl
)
from isatools.create.constants import (
    BASE_FACTORS,
    SCREEN,
    RUN_IN,
    WASHOUT,
    FOLLOW_UP,
    SAMPLE,
    EXTRACT,
    LABELED_EXTRACT,
    DATA_FILE
)
from isatools.isatab import dump_tables_to_dataframes as dumpdf
import os
import json

### 2. Setting variables:

In [3]:
NAME = 'name'
FACTORS_0_VALUE = OntologyAnnotation(term='nitroglycerin')
FACTORS_0_VALUE_ALT = OntologyAnnotation(term='alcohol')
FACTORS_0_VALUE_THIRD = OntologyAnnotation(term='water')

FACTORS_1_VALUE = 5
FACTORS_1_UNIT = OntologyAnnotation(term='kg/m^3')

FACTORS_2_VALUE = 100.0
FACTORS_2_VALUE_ALT = 50.0
FACTORS_2_UNIT = OntologyAnnotation(term='s')

TEST_EPOCH_0_NAME = 'test epoch 0'
TEST_EPOCH_1_NAME = 'test epoch 1'
TEST_EPOCH_2_NAME = 'test epoch 2'

TEST_STUDY_ARM_NAME_00 = 'test arm'
TEST_STUDY_ARM_NAME_01 = 'another arm'
TEST_STUDY_ARM_NAME_02 = 'yet another arm'

TEST_STUDY_DESIGN_NAME = 'test study design'

TEST_EPOCH_0_RANK = 0

SCREEN_DURATION_VALUE = 100
FOLLOW_UP_DURATION_VALUE = 5*366
WASHOUT_DURATION_VALUE = 30
DURATION_UNIT = OntologyAnnotation(term='day')

### 3. Declaration of ISA Sample / Biomaterial templates for liver, blood and heart

In [4]:
sample_list = [
        {
            'node_type': SAMPLE,
            'characteristics_category': OntologyAnnotation(term='organism part'),
            'characteristics_value': OntologyAnnotation(term='liver'),
            'size': 1,
            'technical_replicates': None,
            'is_input_to_next_protocols': True
        },
        {
            'node_type': SAMPLE,
            'characteristics_category': OntologyAnnotation(term='organism part'),
            'characteristics_value': OntologyAnnotation(term='blood'),
            'size': 1,
            'technical_replicates': None,
            'is_input_to_next_protocols': True
        },
        {
            'node_type': SAMPLE,
            'characteristics_category': OntologyAnnotation(term='organism part'),
            'characteristics_value': OntologyAnnotation(term='heart'),
            'size': 1,
            'technical_replicates': None,
            'is_input_to_next_protocols': True
        }
]
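
The three templates differ only in the organism part. A minimal sketch of generating them with a helper follows; plain strings stand in for the `SAMPLE` constant and for `OntologyAnnotation` objects so that the snippet runs without isatools, and `make_sample_template` is a hypothetical name.

```python
# Stand-in for isatools.create.constants.SAMPLE (assumption: any label works here).
SAMPLE_STANDIN = 'sample'


def make_sample_template(organism_part):
    """Return one template dict shaped like the entries of sample_list."""
    return {
        'node_type': SAMPLE_STANDIN,
        'characteristics_category': 'organism part',  # OntologyAnnotation in the notebook
        'characteristics_value': organism_part,       # OntologyAnnotation in the notebook
        'size': 1,
        'technical_replicates': None,
        'is_input_to_next_protocols': True,
    }


sample_templates = [make_sample_template(part) for part in ('liver', 'blood', 'heart')]
print([t['characteristics_value'] for t in sample_templates])
```

Generating the list this way keeps the shared fields in one place, which helps when a new sample type is added.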

### 4. Declaration of ISA Assay templates as Python `OrderedDict`

In [5]:
# A Mass Spectrometry based metabolite profiling assay

ms_assay_dict = OrderedDict([
    ('measurement_type', OntologyAnnotation(term='metabolite profiling')),
    ('technology_type', OntologyAnnotation(term='mass spectrometry')),
    ('extraction', {}),
    ('extract', [
        {
            'node_type': EXTRACT,
            'characteristics_category': OntologyAnnotation(term='extract type'),
            'characteristics_value': OntologyAnnotation(term='polar fraction'),
            'size': 1,
            'is_input_to_next_protocols': True
        },
        {
            'node_type': EXTRACT,
            'characteristics_category': OntologyAnnotation(term='extract type'),
            'characteristics_value': OntologyAnnotation(term='lipids'),
            'size': 1,
            'is_input_to_next_protocols': True
        }
    ]),
    ('derivatization', {
        '#replicates': 1,
        OntologyAnnotation(term='derivatization'): ['silylation'],
        OntologyAnnotation(term='derivatization'): ['bis(trimethylsilyl)acetamide'],
    }),
    ('labeled extract', [
        {
            'node_type': LABELED_EXTRACT,
            'characteristics_category': OntologyAnnotation(term='labeled extract type'),
            'characteristics_value': '',
            'size': 1,
            'is_input_to_next_protocols': True
        }
    ]),
    ('mass spectrometry', {
        '#replicates': 2,
        OntologyAnnotation(term='instrument'): ['Agilent QTOF'],
        OntologyAnnotation(term='injection_mode'): ['GC'],
        OntologyAnnotation(term='acquisition_mode'): ['positive mode','negative mode']
    }),
    ('raw spectral data file', [
        {
            'node_type': DATA_FILE,
            'size': 1,
            'is_input_to_next_protocols': False
        }
    ])
])


# A high-throughput imaging based phenotyping assay

phti_assay_dict = OrderedDict([
    ('measurement_type', OntologyAnnotation(term='phenotyping')),
    ('technology_type', OntologyAnnotation(term='high-throughput imaging')),
            ('extraction', {}),
            ('extract', [
                {
                    'node_type': EXTRACT,
                    'characteristics_category': OntologyAnnotation(term='extract type'),
                    'characteristics_value': OntologyAnnotation(term='supernatant'),
                    'size': 1,
                    'technical_replicates': None,
                    'is_input_to_next_protocols': True
                },
                {
                    'node_type': EXTRACT,
                    'characteristics_category': OntologyAnnotation(term='extract type'),
                    'characteristics_value': OntologyAnnotation(term='pellet'),
                    'size': 1,
                    'technical_replicates': None,
                    'is_input_to_next_protocols': True
                }
            ]),
            ('phenotyping by high throughput imaging', {
                OntologyAnnotation(term='instrument'): ['lemnatech gigant'],
                OntologyAnnotation(term='acquisition_mode'): ['UV light','near-IR light','far-IR light','visible light'],
                OntologyAnnotation(term='camera position'): ['top','120 degree','240 degree','360 degree'],
                OntologyAnnotation(term='imaging daily schedule'): ['06.00','19.00']
            }),
            ('raw_spectral_data_file', [
                {
                    'node_type': DATA_FILE,
                    'size': 1,
                    'technical_replicates': 2,
                    'is_input_to_next_protocols': False
                }
            ])
        ])

# A liquid chromatography diode-array based metabolite profiling assay

lcdad_assay_dict = OrderedDict([
    ('measurement_type', OntologyAnnotation(term='metabolite identification')),
    ('technology_type', OntologyAnnotation(term='liquid chromatography diode-array detector')),
            ('extraction', {}),
            ('extract', [
                {
                    'node_type': EXTRACT,
                    'characteristics_category': OntologyAnnotation(term='extract type'),
                    'characteristics_value': OntologyAnnotation(term='supernatant'),
                    'size': 1,
                    'technical_replicates': None,
                    'is_input_to_next_protocols': True
                },
                {
                    'node_type': EXTRACT,
                    'characteristics_category': OntologyAnnotation(term='extract type'),
                    'characteristics_value': OntologyAnnotation(term='pellet'),
                    'size': 1,
                    'technical_replicates': None,
                    'is_input_to_next_protocols': True
                }
            ]),
            ('lcdad_spectroscopy', {
                OntologyAnnotation(term='instrument'): ['Shimadzu DAD 400'],
            }),
            ('raw_spectral_data_file', [
                {
                    'node_type': DATA_FILE,
                    'size': 1,
                    'technical_replicates': 2,
                    'is_input_to_next_protocols': False
                }
            ])
        ])


# An NMR spectroscopy based metabolite profiling assay:
nmr_assay_dict = OrderedDict([
    ('measurement_type', OntologyAnnotation(term='metabolite profiling')),
    ('technology_type', OntologyAnnotation(term='nmr spectroscopy')),
            ('extraction', {}),
            ('extract', [
                {
                    'node_type': EXTRACT,
                    'characteristics_category':  OntologyAnnotation(term='extract type'),
                    'characteristics_value': OntologyAnnotation(term='supernatant'),
                    'size': 1,
                    'technical_replicates': None,
                    'is_input_to_next_protocols': True
                },
                {
                    'node_type': EXTRACT,
                    'characteristics_category':  OntologyAnnotation(term='extract type'),
                    'characteristics_value': OntologyAnnotation(term='pellet'),
                    'size': 1,
                    'technical_replicates': None,
                    'is_input_to_next_protocols': True
                }
            ]),
            ('nmr spectroscopy', {
                OntologyAnnotation(term='instrument'): [OntologyAnnotation(term='Bruker AvanceII 1 GHz')],
                OntologyAnnotation(term='acquisition_mode'): [OntologyAnnotation(term='1D 13C NMR')],
                OntologyAnnotation(term='pulse_sequence'): [OntologyAnnotation(term='CPMG')]
            }),
            ('raw_spectral_data_file', [
                {
                    'node_type': DATA_FILE,
                    'size': 1,
                    'technical_replicates': 1,
                    'is_input_to_next_protocols': False
                }
            ])
    ])
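
In `ms_assay_dict`, the `derivatization` entry declares two keys built from the same term. If `OntologyAnnotation` objects with equal terms compare and hash as equal (an assumption about isatools; the mass spectrometry table later in this notebook lists only `bis(trimethylsilyl)acetamide`, which is consistent with it), the second entry silently replaces the first. The sketch below reproduces the effect with a hypothetical stand-in class, `Term`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """Hypothetical stand-in for an ontology term that hashes by value."""
    term: str


params = {
    '#replicates': 1,
    Term('derivatization'): ['silylation'],
    Term('derivatization'): ['bis(trimethylsilyl)acetamide'],  # replaces the entry above
}

print(len(params))                     # two keys survive, not three
print(params[Term('derivatization')])  # only the last value is kept
```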

### 5. Declaring Study Design key elements in terms of Treatments and Non-Treatment elements, Study Cell & Arms

In [6]:
first_treatment = Treatment(factor_values=(
    FactorValue(factor_name=BASE_FACTORS[0], value=FACTORS_0_VALUE),
    FactorValue(factor_name=BASE_FACTORS[1], value=FACTORS_1_VALUE, unit=FACTORS_1_UNIT),
    FactorValue(factor_name=BASE_FACTORS[2], value=FACTORS_2_VALUE, unit=FACTORS_2_UNIT)
))
second_treatment = Treatment(factor_values=(
    FactorValue(factor_name=BASE_FACTORS[0], value=FACTORS_0_VALUE_ALT),
    FactorValue(factor_name=BASE_FACTORS[1], value=FACTORS_1_VALUE, unit=FACTORS_1_UNIT),
    FactorValue(factor_name=BASE_FACTORS[2], value=FACTORS_2_VALUE, unit=FACTORS_2_UNIT)
))
third_treatment = Treatment(factor_values=(
    FactorValue(factor_name=BASE_FACTORS[0], value=FACTORS_0_VALUE_ALT),
    FactorValue(factor_name=BASE_FACTORS[1], value=FACTORS_1_VALUE, unit=FACTORS_1_UNIT),
    FactorValue(factor_name=BASE_FACTORS[2], value=FACTORS_2_VALUE_ALT, unit=FACTORS_2_UNIT)
))
fourth_treatment = Treatment(factor_values=(
    FactorValue(factor_name=BASE_FACTORS[0], value=FACTORS_0_VALUE_THIRD),
    FactorValue(factor_name=BASE_FACTORS[1], value=FACTORS_1_VALUE, unit=FACTORS_1_UNIT),
    FactorValue(factor_name=BASE_FACTORS[2], value=FACTORS_2_VALUE, unit=FACTORS_2_UNIT)
))
screen = NonTreatment(element_type=SCREEN, duration_value=SCREEN_DURATION_VALUE, duration_unit=DURATION_UNIT)
run_in = NonTreatment(element_type=RUN_IN, duration_value=WASHOUT_DURATION_VALUE, duration_unit=DURATION_UNIT)
washout = NonTreatment(element_type=WASHOUT, duration_value=WASHOUT_DURATION_VALUE, duration_unit=DURATION_UNIT)
follow_up = NonTreatment(element_type=FOLLOW_UP, duration_value=FOLLOW_UP_DURATION_VALUE, duration_unit=DURATION_UNIT)
potential_concomitant_washout = NonTreatment(element_type=WASHOUT, duration_value=FACTORS_2_VALUE,
                                                          duration_unit=FACTORS_2_UNIT)
cell_screen = StudyCell(SCREEN, elements=(screen,))
cell_run_in = StudyCell(RUN_IN, elements=(run_in,))
cell_other_run_in = StudyCell('OTHER RUN-IN', elements=(run_in,))
cell_screen_and_run_in = StudyCell('SCREEN AND RUN-IN', elements=[screen, run_in])
cell_concomitant_treatments = StudyCell('CONCOMITANT TREATMENTS',
                                                     elements=([{second_treatment, fourth_treatment}]))
cell_washout_00 = StudyCell(WASHOUT, elements=(washout,))
cell_washout_01 = StudyCell('ANOTHER WASHOUT', elements=(washout,))
cell_single_treatment_00 = StudyCell('SINGLE TREATMENT FIRST', elements=[first_treatment])
cell_single_treatment_01 = StudyCell('SINGLE TREATMENT SECOND', elements=[second_treatment])
cell_single_treatment_02 = StudyCell('SINGLE TREATMENT THIRD', elements=[third_treatment])
cell_multi_elements = StudyCell('MULTI ELEMENTS',
                                             elements=[{first_treatment, second_treatment,
                                                        fourth_treatment}, washout, second_treatment])
cell_multi_elements_padded = StudyCell('MULTI ELEMENTS PADDED',
                                                    elements=[first_treatment, washout, {
                                                        second_treatment,
                                                        fourth_treatment
                                                    }, washout, third_treatment, washout])
cell_follow_up = StudyCell(FOLLOW_UP, elements=(follow_up,))
cell_follow_up_01 = StudyCell('ANOTHER FOLLOW_UP', elements=(follow_up,))
qc = QualityControl()

ms_sample_assay_plan = SampleAndAssayPlan.from_sample_and_assay_plan_dict("ms_sap", sample_list, ms_assay_dict)
nmr_sample_assay_plan = SampleAndAssayPlan.from_sample_and_assay_plan_dict("nmr_sap", sample_list, nmr_assay_dict)

first_arm = StudyArm(name=TEST_STUDY_ARM_NAME_00, group_size=3, arm_map=OrderedDict([
    (cell_screen, None), (cell_run_in, None),
    (cell_single_treatment_00, ms_sample_assay_plan),
    (cell_follow_up, ms_sample_assay_plan)
]))
second_arm = StudyArm(name=TEST_STUDY_ARM_NAME_01, group_size=5, arm_map=OrderedDict([
    (cell_screen, None), (cell_run_in, None),
    (cell_multi_elements, ms_sample_assay_plan),
    (cell_follow_up, ms_sample_assay_plan)
]))
third_arm = StudyArm(name=TEST_STUDY_ARM_NAME_02, group_size=3, arm_map=OrderedDict([
    (cell_screen, None), (cell_run_in, None),
    (cell_multi_elements_padded, ms_sample_assay_plan),
    (cell_follow_up, ms_sample_assay_plan)
]))
third_arm_no_run_in = StudyArm(name=TEST_STUDY_ARM_NAME_02, group_size=3, arm_map=OrderedDict([
    (cell_screen, None),
    (cell_multi_elements_padded, ms_sample_assay_plan),
    (cell_follow_up, ms_sample_assay_plan)
]))
arm_same_name_as_third = StudyArm(name=TEST_STUDY_ARM_NAME_02, group_size=5, arm_map=OrderedDict([
    (cell_screen, None), (cell_run_in, None),
    (cell_single_treatment_01, ms_sample_assay_plan),
    (cell_follow_up, ms_sample_assay_plan)
]))
# Sample QC (for mass spectrometry and other assays)
pre_run_sample_type = ProductNode(
    id_='pre/00', node_type=SAMPLE, name='water', size=2, characteristics=(
        Characteristic(category='dilution', value=10, unit='mg/L'),
    )
)
post_run_sample_type = ProductNode(
    id_='post/00', node_type=SAMPLE, name='ethanol', size=2, characteristics=(
        Characteristic(category='dilution', value=1000, unit='mg/L'),
        Characteristic(category='dilution', value=100, unit='mg/L'),
        Characteristic(category='dilution', value=10, unit='mg/L'),
        Characteristic(category='dilution', value=1, unit='mg/L'),
        Characteristic(category='dilution', value=0.1, unit='mg/L')
    ))
dummy_sample_type = ProductNode(id_='dummy/01', node_type=SAMPLE, name='dummy')
more_dummy_sample_type = ProductNode(id_='dummy/02', node_type=SAMPLE, name='more dummy')
interspersed_sample_types = [(dummy_sample_type, 20)]
qc = QualityControl(
    interspersed_sample_type=interspersed_sample_types,
    pre_run_sample_type=pre_run_sample_type,
    post_run_sample_type=post_run_sample_type
)

In [7]:
single_arm = StudyArm(name=TEST_STUDY_ARM_NAME_00, group_size=10, arm_map=OrderedDict([
    (cell_screen, ms_sample_assay_plan), (cell_run_in,ms_sample_assay_plan),
    (cell_single_treatment_00, nmr_sample_assay_plan),
    (cell_follow_up, nmr_sample_assay_plan)
]))
study_design = StudyDesign(study_arms=(single_arm,))
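
The single arm strings four cells together. As a rough check on the design, the sketch below lays the cells out with the durations set in section 2 and sums the non-treatment periods. It uses plain tuples rather than `StudyCell` objects, and the treatment (100 s) is left out of the day total because it is expressed in a different unit.

```python
# (cell name, duration value, unit), taken from the variables defined in section 2.
arm_timeline = [
    ('screen', 100, 'day'),
    ('run-in', 30, 'day'),                   # run_in reuses WASHOUT_DURATION_VALUE
    ('SINGLE TREATMENT FIRST', 100.0, 's'),
    ('follow-up', 5 * 366, 'day'),
]

# Sum only the periods expressed in days.
total_days = sum(value for _, value, unit in arm_timeline if unit == 'day')
print(total_days)  # 100 + 30 + 1830 = 1960
```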


### 6. Generating an ISA Study from the ISA Study Design object

In [8]:
study = study_design.generate_isa_study()

In [9]:
study

isatools.model.Study(filename='s_study_01.txt', identifier='s_01', title='Study Design', description='None', submission_date='', public_release_date='', contacts=[], design_descriptors=[], publications=[], factors=[isatools.model.StudyFactor(name='INTENSITY', factor_type=isatools.model.OntologyAnnotation(term='intensity', term_source=None, term_accession='', comments=[]), comments=[]), isatools.model.StudyFactor(name='Sequence Order', factor_type=isatools.model.OntologyAnnotation(term='sequence order', term_source=None, term_accession='', comments=[]), comments=[]), isatools.model.StudyFactor(name='AGENT', factor_type=isatools.model.OntologyAnnotation(term='perturbation agent', term_source=None, term_accession='', comments=[]), comments=[]), isatools.model.StudyFactor(name='DURATION', factor_type=isatools.model.OntologyAnnotation(term='time', term_source=None, term_accession='', comments=[]), comments=[])], protocols=[isatools.model.Protocol(name='sample collection', protocol_type=isat

In [10]:
treatment_assay = next(iter(study.assays))

In [11]:
treatment_assay.graph

<networkx.classes.digraph.DiGraph at 0x12ffa49a0>

In [12]:
[(process.name, getattr(process.prev_process, 'name', None), getattr(process.next_process, 'name', None)) for process in treatment_assay.process_sequence]

[('AT0-S1-assay0---extraction-Acquisition-R1',
  None,
  'AT0-S1-assay0---nmr-spectroscopy-Acquisition-R2'),
 ('AT0-S1-assay0---nmr-spectroscopy-Acquisition-R1',
  'AT0-S1-assay0---extraction-Acquisition-R1',
  None),
 ('AT0-S1-assay0---nmr-spectroscopy-Acquisition-R2',
  'AT0-S1-assay0---extraction-Acquisition-R1',
  None),
 ('AT0-S2-assay0---extraction-Acquisition-R1',
  None,
  'AT0-S2-assay0---nmr-spectroscopy-Acquisition-R2'),
 ('AT0-S2-assay0---nmr-spectroscopy-Acquisition-R1',
  'AT0-S2-assay0---extraction-Acquisition-R1',
  None),
 ('AT0-S2-assay0---nmr-spectroscopy-Acquisition-R2',
  'AT0-S2-assay0---extraction-Acquisition-R1',
  None),
 ('AT0-S3-assay0---extraction-Acquisition-R1',
  None,
  'AT0-S3-assay0---nmr-spectroscopy-Acquisition-R2'),
 ('AT0-S3-assay0---nmr-spectroscopy-Acquisition-R1',
  'AT0-S3-assay0---extraction-Acquisition-R1',
  None),
 ('AT0-S3-assay0---nmr-spectroscopy-Acquisition-R2',
  'AT0-S3-assay0---extraction-Acquisition-R1',
  None),
 ('AT0-S4-assay0---
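
The list comprehension above relies on `getattr` with a default, so that a process without a predecessor or successor yields `None` instead of raising `AttributeError`. A minimal illustration with a hypothetical `Step` class (not the isatools `Process` class):

```python
class Step:
    """Hypothetical stand-in for a process node with optional neighbours."""

    def __init__(self, name):
        self.name = name
        self.prev_process = None
        self.next_process = None


extraction = Step('extraction')
acquisition = Step('nmr spectroscopy')
extraction.next_process = acquisition
acquisition.prev_process = extraction

# getattr(None, 'name', None) returns None, so missing links need no special case.
links = [
    (s.name, getattr(s.prev_process, 'name', None), getattr(s.next_process, 'name', None))
    for s in (extraction, acquisition)
]
print(links)
```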

In [13]:
a_graph = treatment_assay.graph

In [14]:
len(a_graph.nodes)

360

In [15]:
isa_investigation = Investigation(studies=[study])

In [24]:
isa_tables = dumpdf(isa_investigation)

2021-12-03 16:20:00,430 [INFO]: isatab.py(_all_end_to_end_paths:1131) >> [0]
2021-12-03 16:20:00,523 [INFO]: isatab.py(_longest_path_and_attrs:1091) >> [[0, 2, 1], [0, 4, 3], [0, 6, 5], [0, 8, 7], [0, 10, 9], [0, 12, 11], [0, 14, 13], [0, 16, 15], [0, 18, 17], [0, 20, 19], [0, 22, 21], [0, 24, 23], [0, 26, 25], [0, 28, 27], [0, 30, 29], [0, 32, 31], [0, 34, 33], [0, 36, 35], [0, 38, 37], [0, 40, 39], [0, 42, 41], [0, 44, 43], [0, 46, 45], [0, 48, 47], [0, 50, 49], [0, 52, 51], [0, 54, 53], [0, 56, 55], [0, 58, 57], [0, 60, 59], [0, 62, 61], [0, 64, 63], [0, 66, 65], [0, 68, 67], [0, 70, 69], [0, 72, 71], [0, 74, 73], [0, 76, 75], [0, 78, 77], [0, 80, 79], [0, 82, 81], [0, 84, 83], [0, 86, 85], [0, 88, 87], [0, 90, 89], [0, 92, 91], [0, 94, 93], [0, 96, 95], [0, 98, 97], [0, 100, 99], [0, 102, 101], [0, 104, 103], [0, 106, 105], [0, 108, 107], [0, 110, 109], [0, 112, 111], [0, 114, 113], [0, 116, 115], [0, 118, 117], [0, 120, 119], [0, 122, 121], [0, 124, 123], [0, 126, 125], [0, 128, 1

2021-12-03 16:20:05,844 [INFO]: isatab.py(_all_end_to_end_paths:1131) >> [1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59, 61, 63, 65, 67, 69, 71, 73, 75, 77, 79, 81, 83, 85, 87, 89, 91, 93, 95, 97, 99, 101, 103, 105, 107, 109, 111, 113, 115, 117, 119]
2021-12-03 16:20:05,869 [INFO]: isatab.py(_longest_path_and_attrs:1091) >> [[1, 661, 662, 663, 664, 665], [1, 661, 662, 663, 664, 667], [1, 661, 662, 663, 664, 669], [1, 661, 662, 663, 664, 671], [1, 661, 673, 674, 675, 676], [1, 661, 673, 674, 675, 678], [1, 661, 673, 674, 675, 680], [1, 661, 673, 674, 675, 682], [3, 684, 696, 697, 698, 705], [3, 684, 685, 686, 687, 688], [3, 684, 685, 686, 687, 690], [3, 684, 685, 686, 687, 692], [3, 684, 685, 686, 687, 694], [3, 684, 696, 697, 698, 699], [3, 684, 696, 697, 698, 701], [3, 684, 696, 697, 698, 703], [5, 707, 708, 709, 710, 711], [5, 707, 708, 709, 710, 713], [5, 707, 708, 709, 710, 715], [5, 707, 708, 709, 710, 717], [5, 7

2021-12-03 16:20:05,881 [INFO]: isatab.py(_longest_path_and_attrs:1091) >> [[1, 661, 662, 663, 664, 665], [1, 661, 662, 663, 664, 667], [1, 661, 662, 663, 664, 669], [1, 661, 662, 663, 664, 671], [1, 661, 673, 674, 675, 676], [1, 661, 673, 674, 675, 678], [1, 661, 673, 674, 675, 680], [1, 661, 673, 674, 675, 682], [3, 684, 696, 697, 698, 705], [3, 684, 685, 686, 687, 688], [3, 684, 685, 686, 687, 690], [3, 684, 685, 686, 687, 692], [3, 684, 685, 686, 687, 694], [3, 684, 696, 697, 698, 699], [3, 684, 696, 697, 698, 701], [3, 684, 696, 697, 698, 703], [5, 707, 708, 709, 710, 711], [5, 707, 708, 709, 710, 713], [5, 707, 708, 709, 710, 715], [5, 707, 708, 709, 710, 717], [5, 707, 719, 720, 721, 722], [5, 707, 719, 720, 721, 724], [5, 707, 719, 720, 721, 726], [5, 707, 719, 720, 721, 728], [7, 730, 731, 732, 733, 734], [7, 730, 731, 732, 733, 736], [7, 730, 731, 732, 733, 738], [7, 730, 731, 732, 733, 740], [7, 730, 742, 743, 744, 745], [7, 730, 742, 743, 744, 747], [7, 730, 742, 743, 744, 

In [17]:
#[type(x) for x in study.assays[0].graph.nodes()]

In [18]:
#[(getattr(el, 'name', None), type(el))for el in treatment_assay.graph.nodes()]

In [19]:
from isatools.model import _build_assay_graph

In [20]:
gph = _build_assay_graph(treatment_assay.process_sequence)

In [25]:
[key for key in isa_tables.keys()]

['s_study_01.txt',
 'a_AT0_metabolite-profiling_mass-spectrometry.txt',
 'a_AT0_metabolite-profiling_nmr-spectroscopy.txt']
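
The size of each table can be estimated from the design. The arithmetic below assumes that each of the 10 subjects yields one sample per organ part in each of the 4 cells, that the mass spectrometry plan applies to the screen and run-in cells and the NMR plan to the treatment and follow-up cells (as in `single_arm`), and that parameter values and replicates multiply. These figures are estimates derived from the design parameters, not values returned by isatools.

```python
group_size = 10
organ_parts = 3      # liver, blood, heart
cells = 4            # screen, run-in, treatment, follow-up
cells_per_plan = 2   # each plan is attached to two cells in single_arm

study_rows = group_size * organ_parts * cells
samples_per_plan = group_size * organ_parts * cells_per_plan

# NMR: 2 extract types, one value per parameter, 1 technical replicate.
nmr_rows = samples_per_plan * 2 * 1

# MS: 2 extract types, 2 acquisition modes, 2 acquisition replicates.
ms_rows = samples_per_plan * 2 * 2 * 2

print(study_rows, nmr_rows, ms_rows)  # 120 120 480
```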

In [26]:
isa_tables['s_study_01.txt']

Unnamed: 0,Source Name,Characteristics[Study Subject],Term Source REF,Term Accession Number,Protocol REF,Parameter Value[Sampling order],Parameter Value[Study cell],Date,Performer,Sample Name,Characteristics[organism part],Comment[study step with treatment],Factor Value[Sequence Order],Factor Value[AGENT],Factor Value[DURATION],Unit,Factor Value[INTENSITY],Unit.1
0,GRP1_SBJ05,Human,NCIT,http://purl.obolibrary.org/obo/NCIT_C14225,sample collection,001,screen,2021-12-03,Unknown,GRP1_SBJ02_screen_SMP-blood-1,blood,NO,0,,100.0,day,,
1,GRP1_SBJ05,Human,NCIT,http://purl.obolibrary.org/obo/NCIT_C14225,sample collection,088,SINGLE TREATMENT FIRST,2021-12-03,Unknown,GRP1_SBJ01_SINGLE-TREATMENT-FIRST_SMP-heart-1,heart,YES,2,nitroglycerin,100.0,s,5.0,kg/m^3
2,GRP1_SBJ05,Human,NCIT,http://purl.obolibrary.org/obo/NCIT_C14225,sample collection,087,SINGLE TREATMENT FIRST,2021-12-03,Unknown,GRP1_SBJ10_SINGLE-TREATMENT-FIRST_SMP-heart-1,heart,YES,2,nitroglycerin,100.0,s,5.0,kg/m^3
3,GRP1_SBJ05,Human,NCIT,http://purl.obolibrary.org/obo/NCIT_C14225,sample collection,086,SINGLE TREATMENT FIRST,2021-12-03,Unknown,GRP1_SBJ08_SINGLE-TREATMENT-FIRST_SMP-heart-1,heart,YES,2,nitroglycerin,100.0,s,5.0,kg/m^3
4,GRP1_SBJ05,Human,NCIT,http://purl.obolibrary.org/obo/NCIT_C14225,sample collection,085,SINGLE TREATMENT FIRST,2021-12-03,Unknown,GRP1_SBJ07_SINGLE-TREATMENT-FIRST_SMP-heart-1,heart,YES,2,nitroglycerin,100.0,s,5.0,kg/m^3
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
115,GRP1_SBJ05,Human,NCIT,http://purl.obolibrary.org/obo/NCIT_C14225,sample collection,034,run-in,2021-12-03,Unknown,GRP1_SBJ04_run-in_SMP-blood-1,blood,NO,1,,30.0,day,,
116,GRP1_SBJ05,Human,NCIT,http://purl.obolibrary.org/obo/NCIT_C14225,sample collection,033,run-in,2021-12-03,Unknown,GRP1_SBJ03_run-in_SMP-blood-1,blood,NO,1,,30.0,day,,
117,GRP1_SBJ05,Human,NCIT,http://purl.obolibrary.org/obo/NCIT_C14225,sample collection,032,run-in,2021-12-03,Unknown,GRP1_SBJ09_run-in_SMP-blood-1,blood,NO,1,,30.0,day,,
118,GRP1_SBJ05,Human,NCIT,http://purl.obolibrary.org/obo/NCIT_C14225,sample collection,119,follow-up,2021-12-03,Unknown,GRP1_SBJ06_follow-up_SMP-heart-1,heart,NO,3,,1830.0,day,,


In [27]:
isa_tables['a_AT0_metabolite-profiling_nmr-spectroscopy.txt']

Unnamed: 0,Sample Name,Comment[study step with treatment],Protocol REF,Performer,Extract Name,Characteristics[extract type],Protocol REF.1,Parameter Value[instrument],Parameter Value[acquisition_mode],Parameter Value[pulse_sequence],Performer.1,Raw Data File
0,GRP1_SBJ01_SINGLE-TREATMENT-FIRST_SMP-blood-1,YES,assay0 - extraction,Unknown,AT0-S8-Extract-R2,supernatant,assay0 - nmr spectroscopy,Bruker AvanceII 1 GHz,1D 13C NMR,CPMG,Unknown,AT0-S8-raw_spectral_data_file-R2-
1,GRP1_SBJ01_SINGLE-TREATMENT-FIRST_SMP-blood-1,YES,assay0 - extraction,Unknown,AT0-S8-Extract-R1,pellet,assay0 - nmr spectroscopy,Bruker AvanceII 1 GHz,1D 13C NMR,CPMG,Unknown,AT0-S8-raw_spectral_data_file-R1-
2,GRP1_SBJ01_SINGLE-TREATMENT-FIRST_SMP-heart-1,YES,assay0 - extraction,Unknown,AT0-S28-Extract-R1,pellet,assay0 - nmr spectroscopy,Bruker AvanceII 1 GHz,1D 13C NMR,CPMG,Unknown,AT0-S28-raw_spectral_data_file-R1-
3,GRP1_SBJ01_SINGLE-TREATMENT-FIRST_SMP-heart-1,YES,assay0 - extraction,Unknown,AT0-S28-Extract-R2,supernatant,assay0 - nmr spectroscopy,Bruker AvanceII 1 GHz,1D 13C NMR,CPMG,Unknown,AT0-S28-raw_spectral_data_file-R2-
4,GRP1_SBJ01_SINGLE-TREATMENT-FIRST_SMP-liver-1,YES,assay0 - extraction,Unknown,AT0-S18-Extract-R2,supernatant,assay0 - nmr spectroscopy,Bruker AvanceII 1 GHz,1D 13C NMR,CPMG,Unknown,AT0-S18-raw_spectral_data_file-R2-
...,...,...,...,...,...,...,...,...,...,...,...,...
115,GRP1_SBJ10_follow-up_SMP-blood-1,NO,assay0 - extraction,Unknown,AT0-S37-Extract-R2,supernatant,assay0 - nmr spectroscopy,Bruker AvanceII 1 GHz,1D 13C NMR,CPMG,Unknown,AT0-S37-raw_spectral_data_file-R2-
116,GRP1_SBJ10_follow-up_SMP-heart-1,NO,assay0 - extraction,Unknown,AT0-S57-Extract-R1,pellet,assay0 - nmr spectroscopy,Bruker AvanceII 1 GHz,1D 13C NMR,CPMG,Unknown,AT0-S57-raw_spectral_data_file-R1-
117,GRP1_SBJ10_follow-up_SMP-heart-1,NO,assay0 - extraction,Unknown,AT0-S57-Extract-R2,supernatant,assay0 - nmr spectroscopy,Bruker AvanceII 1 GHz,1D 13C NMR,CPMG,Unknown,AT0-S57-raw_spectral_data_file-R2-
118,GRP1_SBJ10_follow-up_SMP-liver-1,NO,assay0 - extraction,Unknown,AT0-S47-Extract-R2,supernatant,assay0 - nmr spectroscopy,Bruker AvanceII 1 GHz,1D 13C NMR,CPMG,Unknown,AT0-S47-raw_spectral_data_file-R2-


In [28]:
isa_tables['a_AT0_metabolite-profiling_mass-spectrometry.txt']

Unnamed: 0,Sample Name,Comment[study step with treatment],Protocol REF,Performer,Extract Name,Characteristics[extract type],Protocol REF.1,Parameter Value[derivatization],Performer.1,Labeled Extract Name,Protocol REF.2,Parameter Value[instrument],Parameter Value[injection_mode],Parameter Value[acquisition_mode],Performer.2,Raw Spectral Data File
0,GRP1_SBJ01_run-in_SMP-blood-1,NO,assay0 - extraction,Unknown,AT0-S38-Extract-R2,lipids,assay0 - derivatization,bis(trimethylsilyl)acetamide,Unknown,AT0-S38-LE-R2,assay0 - mass spectrometry,Agilent QTOF,GC,negative mode,Unknown,AT0-S38-raw-spectral-data-file-R6
1,GRP1_SBJ01_run-in_SMP-blood-1,NO,assay0 - extraction,Unknown,AT0-S38-Extract-R1,polar fraction,assay0 - derivatization,bis(trimethylsilyl)acetamide,Unknown,AT0-S38-LE-R1,assay0 - mass spectrometry,Agilent QTOF,GC,negative mode,Unknown,AT0-S38-raw-spectral-data-file-R4
2,GRP1_SBJ01_run-in_SMP-blood-1,NO,assay0 - extraction,Unknown,AT0-S38-Extract-R1,polar fraction,assay0 - derivatization,bis(trimethylsilyl)acetamide,Unknown,AT0-S38-LE-R1,assay0 - mass spectrometry,Agilent QTOF,GC,negative mode,Unknown,AT0-S38-raw-spectral-data-file-R3
3,GRP1_SBJ01_run-in_SMP-blood-1,NO,assay0 - extraction,Unknown,AT0-S38-Extract-R1,polar fraction,assay0 - derivatization,bis(trimethylsilyl)acetamide,Unknown,AT0-S38-LE-R1,assay0 - mass spectrometry,Agilent QTOF,GC,positive mode,Unknown,AT0-S38-raw-spectral-data-file-R2
4,GRP1_SBJ01_run-in_SMP-blood-1,NO,assay0 - extraction,Unknown,AT0-S38-Extract-R1,polar fraction,assay0 - derivatization,bis(trimethylsilyl)acetamide,Unknown,AT0-S38-LE-R1,assay0 - mass spectrometry,Agilent QTOF,GC,positive mode,Unknown,AT0-S38-raw-spectral-data-file-R1
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
475,GRP1_SBJ10_screen_SMP-liver-1,NO,assay0 - extraction,Unknown,AT0-S17-Extract-R2,lipids,assay0 - derivatization,bis(trimethylsilyl)acetamide,Unknown,AT0-S17-LE-R2,assay0 - mass spectrometry,Agilent QTOF,GC,negative mode,Unknown,AT0-S17-raw-spectral-data-file-R6
476,GRP1_SBJ10_screen_SMP-liver-1,NO,assay0 - extraction,Unknown,AT0-S17-Extract-R2,lipids,assay0 - derivatization,bis(trimethylsilyl)acetamide,Unknown,AT0-S17-LE-R2,assay0 - mass spectrometry,Agilent QTOF,GC,negative mode,Unknown,AT0-S17-raw-spectral-data-file-R5
477,GRP1_SBJ10_screen_SMP-liver-1,NO,assay0 - extraction,Unknown,AT0-S17-Extract-R1,polar fraction,assay0 - derivatization,bis(trimethylsilyl)acetamide,Unknown,AT0-S17-LE-R1,assay0 - mass spectrometry,Agilent QTOF,GC,negative mode,Unknown,AT0-S17-raw-spectral-data-file-R4
478,GRP1_SBJ10_screen_SMP-liver-1,NO,assay0 - extraction,Unknown,AT0-S17-Extract-R1,polar fraction,assay0 - derivatization,bis(trimethylsilyl)acetamide,Unknown,AT0-S17-LE-R1,assay0 - mass spectrometry,Agilent QTOF,GC,positive mode,Unknown,AT0-S17-raw-spectral-data-file-R2
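
Each labeled extract appears on several rows because acquisition parameters and replicates are combined. A sketch of that expansion with `itertools.product`, using the values from the `mass spectrometry` entry of `ms_assay_dict`; treating the expansion as a plain Cartesian product is an assumption about how isatools builds the rows.

```python
from itertools import product

instruments = ['Agilent QTOF']
injection_modes = ['GC']
acquisition_modes = ['positive mode', 'negative mode']
replicates = range(1, 3)  # '#replicates': 2

# One tuple per acquisition run of a single labeled extract.
runs = list(product(instruments, injection_modes, acquisition_modes, replicates))
for run in runs:
    print(run)
print(len(runs))  # 4 acquisitions per labeled extract
```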


In [29]:
final_dir = os.path.abspath(os.path.join('notebook-output', 'sd-test'))

### 7. Serialization as ISA-JSON and ISA-Tab

In [None]:
isa_j = json.dumps(isa_investigation, cls=ISAJSONEncoder, sort_keys=True, indent=4, separators=(',', ': '))
# Make sure the output directory exists, then write the string `isa_j` to 'isa_as_json_from_dumps2.json'.
os.makedirs(final_dir, exist_ok=True)
with open(os.path.join(final_dir, "isa_as_json_from_dumps2.json"), "w") as out_fp:
    out_fp.write(isa_j)
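
`json.dumps` accepts a `cls` argument naming a `json.JSONEncoder` subclass, which is how `ISAJSONEncoder` is plugged in above. The general mechanism can be sketched with a hypothetical encoder and class (neither is part of isatools):

```python
import json


class Annotation:
    """Hypothetical object that json cannot serialize on its own."""

    def __init__(self, term):
        self.term = term


class AnnotationEncoder(json.JSONEncoder):
    def default(self, o):
        # Called only for objects the base encoder does not know how to handle.
        if isinstance(o, Annotation):
            return {'term': o.term}
        return super().default(o)


text = json.dumps({'unit': Annotation('day')}, cls=AnnotationEncoder,
                  sort_keys=True, indent=4, separators=(',', ': '))
print(text)
```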

In [None]:
isatab.dump(isa_obj=isa_investigation, output_path=final_dir)
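
ISA-Tab study and assay files are tab-delimited text, so they can be inspected with the standard library. The sketch below parses a two-row excerpt held in memory; the column subset is abridged from the study table shown earlier, and reading a real file would replace `io.StringIO` with `open(path, newline='')`. Note that `csv.DictReader` keeps only the last of any repeated column names (such as `Unit`), so `csv.reader` may be preferable for full files.

```python
import csv
import io

# Abridged, hand-copied excerpt of the study table (tab-delimited).
excerpt = (
    "Source Name\tSample Name\tCharacteristics[organism part]\n"
    "GRP1_SBJ05\tGRP1_SBJ02_screen_SMP-blood-1\tblood\n"
    "GRP1_SBJ05\tGRP1_SBJ06_follow-up_SMP-heart-1\theart\n"
)

rows = list(csv.DictReader(io.StringIO(excerpt), delimiter='\t'))
organ_parts = sorted({row['Characteristics[organism part]'] for row in rows})
print(organ_parts)  # ['blood', 'heart']
```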

### 8. Performing syntactic validation by invoking ISA Validator

In [None]:
with open(os.path.join(final_dir, 'i_investigation.txt')) as isa:
    validation_report = isatab.validate(isa)

In [None]:
validation_report["errors"]
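
An empty `errors` list is the outcome to look for. The sketch below shows one way to summarize such a report; the example dictionary is hypothetical, and apart from the `errors` key used above, its keys and the shape of each entry are assumptions rather than the documented isatools output.

```python
# Hypothetical report; only the 'errors' key is taken from the notebook.
example_report = {
    'errors': [],
    'warnings': [{'message': 'example warning'}],
}


def summarize(report):
    """Return (is_valid, number of errors, number of warnings)."""
    errors = report.get('errors', [])
    warnings = report.get('warnings', [])
    return (len(errors) == 0, len(errors), len(warnings))


print(summarize(example_report))  # (True, 0, 1)
```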

## Conclusion:

With this notebook, we have shown how to use study design information to generate a populated instance of an ISA Study object and write it to file as ISA-JSON and ISA-Tab.



## About this notebook

- authors: philippe.rocca-serra@oerc.ox.ac.uk, massimiliano.izzo@oerc.ox.ac.uk
- license: CC-BY 4.0
- support: isatools@googlegroups.com
- issue tracker: https://github.com/ISA-tools/isa-api/issues