# Study Designer: Repeated Intervention Example

Outline:
12 healthy volunteers were randomly assigned to each of 8 study arms.
Each study arm consists of a sequence of 2 epochs: a dietary intervention and a chemical intervention, in either order.
 - Dietary intervention (agent: diet; intensity: {low fat, high fat}; duration: {4 weeks}), defining interventions A and B.
 - Chemical intervention (agent: chemical compound; intensity: {high dose, low dose}; duration: {4 weeks}), defining interventions 1 and 2.

The possible sequences of treatments are: 
 - A followed by 1
 - 1 followed by A
 - B followed by 1
 - 1 followed by B
 - A followed by 2
 - 2 followed by A
 - B followed by 2
 - 2 followed by B
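
As a quick check, the eight sequences listed above can be enumerated with the standard library. This is a minimal sketch: the labels `A`, `B`, `1` and `2` come from the outline, the variable names are illustrative, and it assumes each arm pairs exactly one dietary with one chemical intervention, in either order.

```python
from itertools import product

diet_interventions = ['A', 'B']      # dietary interventions from the outline
chemical_interventions = ['1', '2']  # chemical interventions from the outline

sequences = []
for diet, chem in product(diet_interventions, chemical_interventions):
    sequences.append((diet, chem))  # dietary epoch first
    sequences.append((chem, diet))  # chemical epoch first

for first, second in sequences:
    print('{} followed by {}'.format(first, second))
print('Number of study arms:', len(sequences))
```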

In each arm, 10 urine samples were collected from each volunteer over the course of the study.
Metabolites from the polar fraction were analysed using LC-MS in both positive and negative mode on an Agilent 6550 iFunnel Q-TOF mass spectrometry platform.
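
The scale of the study follows from these numbers. The arithmetic below is a sketch that assumes one LC-MS acquisition per sample per ionisation mode, with no QC samples or technical replicates; the variable names are illustrative.

```python
volunteers_per_arm = 12
study_arms = 8
urine_samples_per_volunteer = 10
ionisation_modes = ['positive', 'negative']

total_volunteers = volunteers_per_arm * study_arms
total_samples = total_volunteers * urine_samples_per_volunteer
# Assumption: each sample is acquired once in each ionisation mode.
total_lcms_runs = total_samples * len(ionisation_modes)

print('Volunteers:', total_volunteers)
print('Urine samples:', total_samples)
print('LC-MS runs:', total_lcms_runs)
```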



In [36]:
from itertools import permutations, product

import qgrid
from ipywidgets import (Checkbox, Dropdown, HBox, IntSlider, Label, Layout,
                        RadioButtons, SelectMultiple, Text, VBox)
from isatools.create.models import *
from isatools.isatab import dump_tables_to_dataframes as dumpdf
from isatools.model import Investigation
from qgrid import show_grid

qgrid.nbinstall(overwrite=True)  # install the qgrid notebook extension

label_layout = Layout(width='100%')  # make sure the labels display in full

## Sample planning section

### Study design type

Please specify if the study is an intervention or an observation.

In [37]:
rad_study_design = Dropdown(options=['Intervention', 'Observation'], value='Intervention', disabled=False)
VBox([Label('Study design type?', layout=label_layout), rad_study_design])

### Intervention study

If specifying an intervention study, please answer the following:
 - Are study subjects exposed to a single intervention or to multiple interventions?
 - Are there 'hard to change' factors, which restrict randomization of experimental units?
 
*Note: if you chose 'observation' as the study design type, the following choices will be disabled and you should skip to the Observation study section*
 

In [38]:
# Only an intervention study gets a design object; an observation study has none.
if rad_study_design.value == 'Intervention':
    study_design = InterventionStudyDesign()
else:
    study_design = None
# Disable the intervention widgets when the study is not an intervention study.
intervention_ui_disabled = not isinstance(study_design, InterventionStudyDesign)
intervention_type = RadioButtons(options=['single', 'multiple'], value='multiple', disabled=intervention_ui_disabled)
intervention_type_vbox = VBox([Label('Single intervention or multiple interventions?', layout=label_layout), intervention_type])
HBox([intervention_type_vbox])

#### Factorial design - intervention types

If specifying a factorial design, please list the intervention types here.

In [39]:
# `factorial_design` is not defined in any earlier cell. As an assumption,
# treat an intervention study with multiple interventions as a factorial design.
factorial_design = (not intervention_ui_disabled) and intervention_type.value == 'multiple'
factorial_design_ui_disabled = not factorial_design
chemical_intervention = Checkbox(value=True, description='Chemical intervention', disabled=factorial_design_ui_disabled)
behavioural_intervention = Checkbox(value=False, description='Behavioural intervention', disabled=factorial_design_ui_disabled)
surgical_intervention = Checkbox(value=False, description='Surgical intervention', disabled=factorial_design_ui_disabled)
biological_intervention = Checkbox(value=True, description='Biological intervention', disabled=factorial_design_ui_disabled)
radiological_intervention = Checkbox(value=False, description='Radiological intervention', disabled=factorial_design_ui_disabled)
# Show every checkbox, including the behavioural one.
VBox([chemical_intervention, behavioural_intervention, surgical_intervention, biological_intervention, radiological_intervention])

In [40]:
# Chemical intervention: factor-level inputs (built only when the box is ticked).
# A Checkbox object is always truthy, so test its .value.
if chemical_intervention.value:
    chem_agent_levels = Text(value='aspirin', description='Agent:', disabled=False)
    chem_dose_levels = Text(value='low,high', description='Dose levels:', disabled=False)
    chem_duration_levels = Text(value='4 weeks', description='Duration of exposure:', disabled=False)
    vb1 = VBox([Label("Chemical intervention factor levels:", layout=label_layout),
                chem_agent_levels, chem_dose_levels, chem_duration_levels])

# Biological (dietary) intervention: separate widgets, so the chemical
# values above are not overwritten.
if biological_intervention.value:
    diet_agent_levels = Text(value='diet', description='Agent:', disabled=False)
    diet_dose_levels = Text(value='low fat,high fat', description='Dose levels:', disabled=False)
    diet_duration_levels = Text(value='4 weeks', description='Duration of exposure:', disabled=False)
    vb2 = VBox([Label("Biological intervention factor levels:", layout=label_layout),
                diet_agent_levels, diet_dose_levels, diet_duration_levels])

HBox([vb1, vb2])

In [49]:
def add_levels(factory, agent_widget, dose_widget, duration_widget):
    """Add the comma-separated levels typed into each widget to the factory."""
    for factor, widget in zip(BASE_FACTORS, (agent_widget, dose_widget, duration_widget)):
        for level in widget.value.split(','):
            factory.add_factor_value(factor, level.strip())

factory_chem = TreatmentFactory(intervention_type=INTERVENTIONS['CHEMICAL'], factors=BASE_FACTORS)
add_levels(factory_chem, chem_agent_levels, chem_dose_levels, chem_duration_levels)
chem_treatments = factory_chem.compute_full_factorial_design()
print('Number of chemical treatments: {}'.format(len(chem_treatments)))

factory_diet = TreatmentFactory(intervention_type=INTERVENTIONS['BIOLOGICAL'], factors=BASE_FACTORS)
add_levels(factory_diet, diet_agent_levels, diet_dose_levels, diet_duration_levels)
diet_treatments = factory_diet.compute_full_factorial_design()
print('Number of diet treatments: {}'.format(len(diet_treatments)))

all_treatments = set(diet_treatments) | set(chem_treatments)

# `treatment_sequence` is needed later by IsaModelObjectFactory but was never
# created. This assumes TreatmentSequence is provided by the wildcard import
# from isatools.create.models, and that `treatments` meant `all_treatments`.
treatment_sequence = TreatmentSequence()
for treatment in all_treatments:
    treatment_sequence.add_treatment(treatment, 1)
    treatment_sequence.add_treatment(treatment, 2)

# Every ordered pair of distinct treatments is a candidate two-epoch sequence.
num_repeats = 2
treatment_sequences = list(permutations(all_treatments, num_repeats))
print("number of treatment sequences: ", len(treatment_sequences), "| number of treatments: ", len(all_treatments))
for sequence in treatment_sequences:
    print(sequence)


Number of chemical treatments: 2
Number of diet treatments: 2
number of treatment sequences:  12 | number of treatments:  4
(Treatment(factor_type=biological intervention, factor_values=[isatools.model.FactorValue(factor_name=isatools.model.StudyFactor(name='AGENT', factor_type=isatools.model.OntologyAnnotation(term='perturbation agent', term_source=None, term_accession='', comments=[]), comments=[]), value='diet', unit=None), isatools.model.FactorValue(factor_name=isatools.model.StudyFactor(name='DURATION', factor_type=isatools.model.OntologyAnnotation(term='time', term_source=None, term_accession='', comments=[]), comments=[]), value='4 weeks', unit=None), isatools.model.FactorValue(factor_name=isatools.model.StudyFactor(name='INTENSITY', factor_type=isatools.model.OntologyAnnotation(term='intensity', term_source=None, term_accession='', comments=[]), comments=[]), value='high fat', unit=None)]), Treatment(factor_type=chemical intervention, factor_values=[isatools.model.FactorValue(f
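
The output reports 12 sequences, whereas the outline lists 8. The difference is that `permutations` also pairs treatments of the same intervention type (A with B, 1 with 2). The sketch below reproduces the counts with plain tuples standing in for `Treatment` objects; `full_factorial` is a hypothetical helper written for this illustration, not the isatools implementation.

```python
from itertools import permutations, product


def full_factorial(intervention_type, levels):
    """Return one treatment tuple per combination of factor levels."""
    names = sorted(levels)
    combos = product(*(levels[name] for name in names))
    return {(intervention_type,) + combo for combo in combos}


chem_levels = {'agent': ['aspirin'], 'intensity': ['low', 'high'], 'duration': ['4 weeks']}
diet_levels = {'agent': ['diet'], 'intensity': ['low fat', 'high fat'], 'duration': ['4 weeks']}

treatments = full_factorial('chemical', chem_levels) | full_factorial('biological', diet_levels)

# All ordered pairs of distinct treatments, as in the notebook cell.
ordered_pairs = list(permutations(treatments, 2))
# Keep only pairs that combine one chemical and one biological treatment.
cross_type_pairs = [pair for pair in ordered_pairs if pair[0][0] != pair[1][0]]

print('treatments:', len(treatments))
print('ordered pairs:', len(ordered_pairs))
print('cross-type pairs:', len(cross_type_pairs))
```

Filtering to cross-type pairs gives the 8 study arms of the outline; the remaining 4 pairs are the same-type orderings.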

Next, specify whether all study groups are the same size, i.e. have the same number of subjects (in other words, whether the groups are balanced).

In [9]:
group_balanced = RadioButtons(options=['Balanced', 'Unbalanced'], value='Balanced', disabled=False)
VBox([Label('Are study groups balanced?', layout=label_layout), group_balanced])

Provide the number of subjects per study group:

In [10]:
group_size = IntSlider(value=5, min=0, max=100, step=1, description='Group size:', disabled=False, continuous_update=False, orientation='horizontal', readout=True, readout_format='d')
group_size

In [11]:
plan = SampleAssayPlan(group_size=group_size.value)

In [12]:
rad_sample_type = SelectMultiple(options=['Blood', 'Sweat', 'Tears', 'Urine','Liver'], value=['Liver','Sweat'], disabled=False)
VBox([Label('Sample type?', layout=label_layout), rad_sample_type])

How many times was each sample type collected?

In [13]:
sampling_size = IntSlider(value=3, min=0, max=100, step=1, description='Sample size:', disabled=False, continuous_update=False, orientation='horizontal', readout=True, readout_format='d')
sampling_size

In [14]:
# Register each selected sample type and how many times it is collected.
for sample_type in rad_sample_type.value:
    plan.add_sample_type(sample_type)
    plan.add_sample_plan_record(sample_type, sampling_size.value)

isa_object_factory = IsaModelObjectFactory(plan, treatment_sequence)

## Generate ISA model objects from the sample plan and render the study-sample table

*Check state of the Sample Assay Plan after entering sample planning information:*

In [15]:
import json
from isatools.create.models import SampleAssayPlanEncoder
print(json.dumps(plan, cls=SampleAssayPlanEncoder, sort_keys=True, indent=4, separators=(',', ': ')))

{
    "assay_plan": [],
    "assay_types": [],
    "group_size": 5,
    "sample_plan": [
        {
            "sample_type": "Blood",
            "sampling_size": 3
        }
    ],
    "sample_qc_plan": [],
    "sample_types": [
        "Blood"
    ]
}
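
The `cls=` argument hands objects that `json` cannot serialise by itself to a `json.JSONEncoder` subclass. The sketch below shows the general pattern with a hypothetical `MiniPlan` class and `MiniPlanEncoder`; it is not the `SampleAssayPlanEncoder` implementation, only an illustration of how such an encoder can work.

```python
import json


class MiniPlan:
    """Hypothetical stand-in for a sample plan."""

    def __init__(self, group_size):
        self.group_size = group_size
        self.sample_plan = {}  # sample type -> sampling size


class MiniPlanEncoder(json.JSONEncoder):
    def default(self, o):
        # Called only for objects json cannot serialise natively.
        if isinstance(o, MiniPlan):
            return {
                'group_size': o.group_size,
                'sample_plan': [
                    {'sample_type': name, 'sampling_size': size}
                    for name, size in sorted(o.sample_plan.items())
                ],
                'sample_types': sorted(o.sample_plan),
            }
        return super().default(o)


mini_plan = MiniPlan(group_size=5)
mini_plan.sample_plan['Blood'] = 3
encoded = json.dumps(mini_plan, cls=MiniPlanEncoder, sort_keys=True, indent=4)
print(encoded)
```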


In [16]:
isa_investigation = Investigation(identifier='inv101')
isa_study = isa_object_factory.create_study_from_plan()
isa_study.filename = 's_study.txt'
isa_investigation.studies = [isa_study]
dataframes = dumpdf(isa_investigation)
sample_table = next(iter(dataframes.values()))
show_grid(sample_table)

In [17]:
print('Total rows generated: {}'.format(len(sample_table)))

Total rows generated: 120
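
The row count can be sanity-checked by hand. One reading that reproduces 120 is sketched below. It assumes that each of the 4 treatments is added at ranks 1 and 2, that the table has one row per subject per collection, and that the plan is the one printed above (group size 5, `Blood` collected 3 times). This is inferred from the numbers, not taken from the isatools source.

```python
n_treatments = 4
ranks_per_treatment = 2                      # each treatment added at rank 1 and rank 2
subjects_per_group = 5                       # "group_size" in the printed plan
collections_per_sample_type = {'Blood': 3}   # "sample_plan" in the printed plan

expected_rows = (n_treatments * ranks_per_treatment * subjects_per_group
                 * sum(collections_per_sample_type.values()))
print('Expected rows:', expected_rows)
```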


## Assay planning 

### Select assay technology type to map to sample type from sample plan

In [18]:
rad_assay_type = RadioButtons(options=['DNA microarray', 'DNA sequencing', 'Mass spectrometry', 'NMR spectroscopy'], value='DNA microarray', disabled=False)
VBox([Label('Assay type to map to sample type "{}"?'.format(rad_sample_type.value), layout=label_layout), rad_assay_type])

In [19]:
if rad_assay_type.value == 'DNA microarray':
    assay_type = AssayType(measurement_type='transcription profiling', technology_type='DNA microarray')
    print('Selected measurement type "transcription profiling" and technology type "DNA microarray"')
else:
    raise Exception('Assay type not implemented')

Selected measurement type "genome sequencing" and technology type "nucleotide microarray"
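
Only the 'DNA microarray' option is handled by the code in this step. If more options were to be supported, a lookup table could replace the `if`/`else` chain. The sketch below uses a hypothetical `AssayChoice` named tuple instead of `AssayType`. Apart from the 'DNA microarray' entry, the measurement and technology pairings are illustrative assumptions rather than values from this notebook.

```python
from collections import namedtuple

# Hypothetical stand-in for AssayType, so the sketch runs without isatools.
AssayChoice = namedtuple('AssayChoice', ['measurement_type', 'technology_type'])

ASSAY_CHOICES = {
    'DNA microarray': AssayChoice('transcription profiling', 'DNA microarray'),
    # The entries below are illustrative assumptions.
    'DNA sequencing': AssayChoice('genome sequencing', 'nucleotide sequencing'),
    'Mass spectrometry': AssayChoice('metabolite profiling', 'mass spectrometry'),
    'NMR spectroscopy': AssayChoice('metabolite profiling', 'NMR spectroscopy'),
}


def resolve_assay_choice(label):
    """Return the AssayChoice for a widget label, or fail loudly."""
    try:
        return ASSAY_CHOICES[label]
    except KeyError:
        raise NotImplementedError('Assay type not implemented: {}'.format(label))


choice = resolve_assay_choice('DNA microarray')
print('Selected measurement type "{}" and technology type "{}"'.format(
    choice.measurement_type, choice.technology_type))
```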


### Topology modifications

In [20]:
technical_replicates = IntSlider(value=2, min=0, max=5, step=1, description='Technical repeats:', disabled=False, continuous_update=False, orientation='horizontal', readout=True, readout_format='d')
technical_replicates

In [21]:
ad_mod_affy27 = Checkbox(value=True, description='DNA Chip: A-AFFY-27')
ad_mod_affy28 = Checkbox(value=True, description='DNA Chip: A-AFFY-28')
ad_mod_affy29 = Checkbox(value=False, description='DNA Chip: A-AFFY-29')
VBox([ad_mod_affy27, ad_mod_affy28, ad_mod_affy29])

In [22]:
array_designs = set()
if ad_mod_affy27.value:
    array_designs.add('A-AFFY-27')
if ad_mod_affy28.value:
    array_designs.add('A-AFFY-28')
if ad_mod_affy29.value:
    array_designs.add('A-AFFY-29')
top_mods = AssayTopologyModifiers(technical_replicates=technical_replicates.value, array_designs=array_designs)
print('Technical replicates: {}'.format(top_mods.technical_replicates))
assay_type.topology_modifiers = top_mods
plan.add_assay_type(assay_type)
# rad_sample_type is a SelectMultiple, so .value is a tuple of sample types:
# add one assay plan record per selected sample type.
for sample_type in rad_sample_type.value:
    plan.add_assay_plan_record(sample_type, assay_type)
assay_plan = next(iter(plan.assay_plan))
print('Added assay plan: {0} -> {1}/{2}'.format(assay_plan[0].value.term, assay_plan[1].measurement_type.term, assay_plan[1].technology_type.term))
if len(top_mods.array_designs) > 0:
    print('Array Designs: {}'.format(list(top_mods.array_designs)))

Technical replicates: 2
Added assay plan: Blood -> genome sequencing/DNA microarray
Array Designs: ['A-AFFY-28', 'A-AFFY-27']


## Generate ISA model objects from the assay plan and render the assay table

*Check state of Sample Assay Plan after entering assay plan information:*

In [23]:
print(json.dumps(plan, cls=SampleAssayPlanEncoder, sort_keys=True, indent=4, separators=(',', ': ')))

{
    "assay_plan": [
        {
            "assay_type": {
                "measurement_type": "genome sequencing",
                "technology_type": "DNA microarray",
                "topology_modifiers": {
                    "array_designs": [
                        "A-AFFY-27",
                        "A-AFFY-28"
                    ],
                    "technical_replicates": 2
                }
            },
            "sample_type": "Blood"
        }
    ],
    "assay_types": [
        {
            "measurement_type": "genome sequencing",
            "technology_type": "DNA microarray",
            "topology_modifiers": {
                "array_designs": [
                    "A-AFFY-27",
                    "A-AFFY-28"
                ],
                "technical_replicates": 2
            }
        }
    ],
    "group_size": 5,
    "sample_plan": [
        {
            "sample_type": "Blood",
            "sampling_size": 3
        }
    ],
    "sample_qc_plan": [],
 

In [24]:
isa_investigation.studies = [isa_object_factory.create_assays_from_plan()]
for assay in isa_investigation.studies[-1].assays:
    print('Assay generated: {0}, {1} samples, {2} processes, {3} data files'
          .format(assay.filename, len(assay.samples), len(assay.process_sequence), len(assay.data_files)))
dataframes = dumpdf(isa_investigation)

Assay generated: a_tp_A-AFFY-28_A-AFFY-27_assay.txt, 120 samples, 1440 processes, 480 data files
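
The reported counts are consistent with each sample being run once per technical replicate on each array design. The check below is a sketch based only on the printed numbers. The ratio of processes to data files is derived from the output, not from the isatools source.

```python
n_samples = 120
n_technical_replicates = 2
selected_array_designs = ['A-AFFY-27', 'A-AFFY-28']

# Assumption: one data file per sample, per replicate, per array design.
expected_data_files = n_samples * n_technical_replicates * len(selected_array_designs)

reported_processes = 1440
processes_per_data_file = reported_processes / expected_data_files

print('Expected data files:', expected_data_files)
print('Processes per data file:', processes_per_data_file)
```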


In [25]:
show_grid(dataframes[next(iter(dataframes.keys()))])