# ISA-API Comprehensive Examples

This notebook reproduces all examples from the official ISA-API documentation at https://isa-tools.org/isa-api/content/

## Table of Contents

1. [Installation](#installation)
2. [Creating ISA Objects](#creating-objects)
3. [Creating Simple ISA-Tab](#creating-isatab)
4. [Creating Simple ISA-JSON](#creating-isajson)
5. [Reading ISA Files](#reading)
6. [Validating ISA-Tab](#validating-isatab)
7. [Validating ISA-JSON](#validating-isajson)
8. [Converting Between Formats](#conversions)
9. [Batch Validation](#batch-validation)
10. [Advanced Examples](#advanced)

## 1. Installation {#installation}

The ISA-API is available as the `isatools` package on PyPI:

```bash
pip install isatools
```

The package supports Python 3.6+.
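
To confirm which version ended up in the active environment, the standard library can query the installed distribution. This is a minimal sketch: `installed_version` is a helper name introduced here, not part of ISA-API, and `importlib.metadata` requires Python 3.8 or later.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str = "isatools") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# Report the version without failing when the package is missing.
version = installed_version()
print(f"isatools version: {version or 'not installed'}")
```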

## 2. Creating ISA Objects {#creating-objects}

The ISA model is built around three nested objects: an Investigation contains Studies, and each Study contains Assays.

In [None]:
# Import all ISA model classes
from isatools.model import (
    Investigation,
    Study,
    Assay,
    Source,
    Sample,
    Material,
    Process,
    Protocol,
    DataFile,
    OntologyAnnotation,
    OntologySource,
    Person,
    Publication,
    Characteristic,
    Comment,
    StudyFactor,
    batch_create_materials,
    plink
)

print("✓ Imported ISA model classes")

## 3. Creating Simple ISA-Tab {#creating-isatab}

This example is based on `createSimpleISAtab.py` from the official examples.

In [None]:
def create_simple_isatab():
    """
    Returns a simple but complete ISA-Tab 1.0 descriptor.
    Based on: isatools/examples/createSimpleISAtab.py
    """
    
    # Create Investigation
    investigation = Investigation()
    investigation.identifier = "i1"
    investigation.title = "My Simple ISA Investigation"
    investigation.description = (
        "We could alternatively use the class constructor's parameters to "
        "set some default values at the time of creation, however we want "
        "to demonstrate how to use the object's instance variables to set values."
    )
    investigation.submission_date = "2016-11-03"
    investigation.public_release_date = "2016-11-03"

    # Create Study
    study = Study(filename="s_study.txt")
    study.identifier = "s1"
    study.title = "My ISA Study"
    study.description = (
        "Like with the Investigation, we could use the class constructor to "
        "set some default values, but have chosen to demonstrate in this "
        "example the use of instance variables to set initial values."
    )
    study.submission_date = "2016-11-03"
    study.public_release_date = "2016-11-03"
    investigation.studies.append(study)

    # Add ontology sources
    obi = OntologySource(
        name='OBI',
        description="Ontology for Biomedical Investigations"
    )
    investigation.ontology_source_references.append(obi)
    
    ncbitaxon = OntologySource(
        name='NCBITaxon',
        description="NCBI Taxonomy"
    )
    investigation.ontology_source_references.append(ncbitaxon)

    # Add design descriptor
    intervention_design = OntologyAnnotation(term_source=obi)
    intervention_design.term = "intervention design"
    intervention_design.term_accession = "http://purl.obolibrary.org/obo/OBI_0000115"
    study.design_descriptors.append(intervention_design)

    # Add contact
    contact = Person(
        first_name="Alice",
        last_name="Robertson",
        affiliation="University of Life",
        roles=[OntologyAnnotation(term='submitter')]
    )
    study.contacts.append(contact)
    
    # Add publication
    publication = Publication(
        title="Experiments with Elephants",
        author_list="A. Robertson, B. Robertson"
    )
    publication.pubmed_id = "12345678"
    publication.status = OntologyAnnotation(term="published")
    study.publications.append(publication)

    # Create source material
    source = Source(name='source_material')
    study.sources.append(source)

    # Create sample prototype with characteristics
    prototype_sample = Sample(name='sample_material', derives_from=[source])
    characteristic_organism = Characteristic(
        category=OntologyAnnotation(term="Organism"),
        value=OntologyAnnotation(
            term="Homo sapiens",
            term_source=ncbitaxon,
            term_accession="http://purl.bioontology.org/ontology/NCBITAXON/9606"
        )
    )
    prototype_sample.characteristics.append(characteristic_organism)

    # Create batch of 3 samples
    study.samples = batch_create_materials(prototype_sample, n=3)

    # Create sample collection protocol
    sample_collection_protocol = Protocol(
        name="sample collection",
        protocol_type=OntologyAnnotation(term="sample collection")
    )
    study.protocols.append(sample_collection_protocol)
    
    # Create sample collection process
    sample_collection_process = Process(executes_protocol=sample_collection_protocol)
    for src in study.sources:
        sample_collection_process.inputs.append(src)
    for sam in study.samples:
        sample_collection_process.outputs.append(sam)
    study.process_sequence.append(sample_collection_process)

    # Create assay
    assay = Assay(filename="a_assay.txt")
    
    # Add extraction protocol
    extraction_protocol = Protocol(
        name='extraction',
        protocol_type=OntologyAnnotation(term="material extraction")
    )
    study.protocols.append(extraction_protocol)
    
    # Add sequencing protocol
    sequencing_protocol = Protocol(
        name='sequencing',
        protocol_type=OntologyAnnotation(term="material sequencing")
    )
    study.protocols.append(sequencing_protocol)

    # Build assay graph for each sample
    for i, sample in enumerate(study.samples):
        # Extraction process
        extraction_process = Process(executes_protocol=extraction_protocol)
        extraction_process.inputs.append(sample)
        
        material = Material(name="extract-{}".format(i))
        material.type = "Extract Name"
        extraction_process.outputs.append(material)

        # Sequencing process
        sequencing_process = Process(executes_protocol=sequencing_protocol)
        sequencing_process.name = "assay-name-{}".format(i)
        sequencing_process.inputs.append(extraction_process.outputs[0])

        # Data file
        datafile = DataFile(
            filename="sequenced-data-{}".format(i),
            label="Raw Data File",
            generated_from=[sample]
        )
        sequencing_process.outputs.append(datafile)

        # Link processes
        plink(extraction_process, sequencing_process)

        # Add to assay
        assay.samples.append(sample)
        assay.data_files.append(datafile)
        assay.other_material.append(material)
        assay.process_sequence.append(extraction_process)
        assay.process_sequence.append(sequencing_process)

    # Describe the assay once, after the graph has been built
    assay.measurement_type = OntologyAnnotation(term="gene sequencing")
    assay.technology_type = OntologyAnnotation(term="nucleotide sequencing")
    study.assays.append(assay)

    return investigation


# Create the ISA descriptor
investigation = create_simple_isatab()
print(f"Created investigation: {investigation.identifier}")
print(f"  Title: {investigation.title}")
print(f"  Studies: {len(investigation.studies)}")
print(f"  Study samples: {len(investigation.studies[0].samples)}")
print(f"  Study assays: {len(investigation.studies[0].assays)}")

### Export to ISA-Tab format

In [None]:
from isatools import isatab
import os

# Export as ISA-Tab string
isatab_string = isatab.dumps(investigation)
print("ISA-Tab output (first 500 characters):")
print(isatab_string[:500])
print("\n... (output truncated)")

# Write to directory
output_dir = './example_isatab'
os.makedirs(output_dir, exist_ok=True)
isatab.dump(investigation, output_dir)

files = os.listdir(output_dir)
print(f"\n✓ Created {len(files)} ISA-Tab files in '{output_dir}':")
for f in sorted(files):
    print(f"  - {f}")
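
ISA-Tab filenames carry a prefix for the table type: `i_` for the investigation, `s_` for a study, and `a_` for an assay, as in the filenames used above. The sketch below groups the files in a directory by that prefix using only the standard library. `group_isatab_files` and `PREFIX_ROLES` are hypothetical names introduced for illustration.

```python
from collections import defaultdict
from pathlib import Path

# Filename prefixes used by ISA-Tab for investigation, study, and assay tables.
PREFIX_ROLES = {"i_": "investigation", "s_": "study", "a_": "assay"}


def group_isatab_files(directory):
    """Group the .txt files in a directory by their ISA-Tab filename prefix."""
    groups = defaultdict(list)
    for path in sorted(Path(directory).glob("*.txt")):
        # Files without a recognised prefix are collected under "other".
        role = PREFIX_ROLES.get(path.name[:2], "other")
        groups[role].append(path.name)
    return dict(groups)
```

Calling `group_isatab_files(output_dir)` after the export cell should list one investigation file, one study file, and one assay file, assuming the export wrote the filenames set on the `Study` and `Assay` objects.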

## 4. Creating Simple ISA-JSON {#creating-isajson}

This example exports the ISA objects built above as ISA-JSON.

In [None]:
import json
from isatools.isajson import ISAJSONEncoder

# Convert investigation to ISA-JSON
isa_json_string = json.dumps(
    investigation,
    cls=ISAJSONEncoder,
    sort_keys=True,
    indent=4,
    separators=(',', ': ')
)

print("ISA-JSON output (first 1000 characters):")
print(isa_json_string[:1000])
print("\n... (output truncated)")

# Save to file
with open('example_isa_simple.json', 'w') as f:
    f.write(isa_json_string)

print(f"\n✓ Saved ISA-JSON ({len(isa_json_string)} characters) to: example_isa_simple.json")

## 5. Reading ISA Files {#reading}

Examples of reading both ISA-Tab and ISA-JSON files.

### Reading ISA-Tab

In [None]:
from isatools import isatab
import os

# Read the ISA-Tab we just created
with open(os.path.join(output_dir, 'i_investigation.txt')) as fp:
    loaded_investigation = isatab.load(fp)

print(f"Loaded ISA-Tab investigation: {loaded_investigation.identifier}")
print(f"  Title: {loaded_investigation.title}")
print(f"  Description: {loaded_investigation.description[:100]}...")
print(f"  Number of studies: {len(loaded_investigation.studies)}")

for study in loaded_investigation.studies:
    print(f"\n  Study: {study.identifier} - {study.title}")
    print(f"    Sources: {len(study.sources)}")
    print(f"    Samples: {len(study.samples)}")
    print(f"    Protocols: {len(study.protocols)}")
    print(f"    Assays: {len(study.assays)}")
    print(f"    Contacts: {len(study.contacts)}")
    print(f"    Publications: {len(study.publications)}")
    
    for assay in study.assays:
        print(f"      Assay: {assay.filename}")
        print(f"        Measurement: {assay.measurement_type.term if assay.measurement_type else 'N/A'}")
        print(f"        Technology: {assay.technology_type.term if assay.technology_type else 'N/A'}")
        print(f"        Data files: {len(assay.data_files)}")

### Reading ISA-JSON

In [None]:
from isatools import isajson

# Read ISA-JSON file
with open('example_isa_simple.json') as fp:
    loaded_json_investigation = isajson.load(fp)

print(f"Loaded ISA-JSON investigation: {loaded_json_investigation.identifier}")
print(f"  Title: {loaded_json_investigation.title}")
print(f"  Number of studies: {len(loaded_json_investigation.studies)}")
print(f"  Number of ontology sources: {len(loaded_json_investigation.ontology_source_references)}")

## 6. Validating ISA-Tab {#validating-isatab}

Based on the `validateISAtab.py` example.

In [None]:
from isatools import isatab
import os

# Validate ISA-Tab using default configuration
with open(os.path.join(output_dir, 'i_investigation.txt')) as fp:
    validation_report = isatab.validate(fp)

print("ISA-Tab Validation Report:")
print(f"  Errors: {len(validation_report.get('errors', []))}")
print(f"  Warnings: {len(validation_report.get('warnings', []))}")
print(f"  Info: {len(validation_report.get('info', []))}")

if validation_report.get('errors'):
    print("\nErrors found:")
    for error in validation_report['errors'][:5]:
        print(f"  - {error}")
else:
    print("\n✓ Validation successful! No errors found.")

if validation_report.get('warnings'):
    print("\nWarnings (first 5):")
    for warning in validation_report['warnings'][:5]:
        print(f"  - {warning}")
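
The cell above reads the report as a dictionary with `errors`, `warnings`, and `info` lists. The same logic can be wrapped in a small helper so that it is reusable across reports. This is a sketch under two assumptions: `summarize_report` is a hypothetical name, and treating "no errors" as valid is a convention chosen here, not something the validator defines.

```python
def summarize_report(report, limit=5):
    """Condense a validation report dict into counts plus the first few errors."""
    errors = report.get("errors", [])
    warnings = report.get("warnings", [])
    info = report.get("info", [])
    return {
        # Assumption: a report without errors counts as valid.
        "is_valid": not errors,
        "counts": {"errors": len(errors), "warnings": len(warnings), "info": len(info)},
        "first_errors": errors[:limit],
    }


# Hand-made stand-in for a real report, used only to exercise the helper.
sample_report = {"errors": [], "warnings": [{"message": "example warning"}], "info": []}
print(summarize_report(sample_report))
```

Because the helper uses `.get()` with defaults, it also tolerates reports that omit one of the three keys.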

### Validate with custom configuration

You can provide a custom configuration directory for validation:

In [None]:
# Example with custom config (commented out - requires config directory)
# with open(os.path.join('./tabdir/', 'i_investigation.txt')) as fp:
#     validation_report = isatab.validate(
#         fp,
#         './my_custom_covid_study_isaconfig_v2021/'
#     )

print("Custom configuration validation would be used for specific study types")

## 7. Validating ISA-JSON {#validating-isajson}

Based on the `validateISAjson.py` example.

In [None]:
from isatools import isajson

# Validate ISA-JSON file
with open('example_isa_simple.json') as fp:
    json_validation_report = isajson.validate(fp)

print("ISA-JSON Validation Report:")
print(f"  Errors: {len(json_validation_report.get('errors', []))}")
print(f"  Warnings: {len(json_validation_report.get('warnings', []))}")

if json_validation_report.get('errors'):
    print("\nErrors found:")
    for error in json_validation_report['errors'][:5]:
        print(f"  - {error}")
else:
    print("\n✓ Validation successful! No errors found.")

if json_validation_report.get('warnings'):
    print("\nWarnings (first 5):")
    for warning in json_validation_report['warnings'][:5]:
        print(f"  - {warning}")

## 8. Converting Between Formats {#conversions}

Examples of converting between ISA-Tab and ISA-JSON formats.

### Converting ISA-Tab to ISA-JSON

In [None]:
import json

from isatools.convert import isatab2json

# Convert ISA-Tab directory to ISA-JSON
# validate_first=True will validate before conversion
# use_new_parser=True uses the newer parser implementation
isa_json_converted = isatab2json.convert(
    output_dir,
    validate_first=True,
    use_new_parser=True
)

# Save the converted JSON
with open('converted_from_tab.json', 'w') as f:
    json.dump(isa_json_converted, f, indent=2)

print("✓ Converted ISA-Tab to ISA-JSON")
print(f"  Output saved to: converted_from_tab.json")
print(f"  Investigation ID: {isa_json_converted['identifier']}")

### Converting ISA-JSON to ISA-Tab

In [None]:
from isatools.convert import json2isatab
import os

# Convert ISA-JSON to ISA-Tab
json_to_tab_dir = './converted_from_json'
os.makedirs(json_to_tab_dir, exist_ok=True)

# With validation (default)
with open('example_isa_simple.json') as fp:
    json2isatab.convert(fp, json_to_tab_dir)

print("✓ Converted ISA-JSON to ISA-Tab")
print(f"  Output directory: {json_to_tab_dir}")

files = os.listdir(json_to_tab_dir)
print(f"  Created {len(files)} files:")
for f in sorted(files):
    print(f"    - {f}")

### Convert without validation

In [None]:
# Convert without validation (faster, but riskier)
json_to_tab_dir_no_val = './converted_from_json_no_validation'
os.makedirs(json_to_tab_dir_no_val, exist_ok=True)

with open('example_isa_simple.json') as fp:
    json2isatab.convert(fp, json_to_tab_dir_no_val, validate_first=False)

print("✓ Converted ISA-JSON to ISA-Tab (without validation)")
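
After a conversion it can be useful to check that the basic metadata survived the round trip. The sketch below compares two ISA-JSON documents that have already been loaded as dictionaries. `compare_isajson` is a hypothetical helper, and the comparison covers only the top-level `identifier`, `title`, and number of `studies`, so it is a sanity check and not a full equivalence test.

```python
def compare_isajson(doc_a, doc_b, fields=("identifier", "title")):
    """Return the top-level differences between two ISA-JSON dicts.

    The result maps each differing field to a (value_a, value_b) tuple and
    adds a "study_count" entry when the number of studies differs.
    """
    differences = {}
    for field in fields:
        if doc_a.get(field) != doc_b.get(field):
            differences[field] = (doc_a.get(field), doc_b.get(field))
    n_a = len(doc_a.get("studies", []))
    n_b = len(doc_b.get("studies", []))
    if n_a != n_b:
        differences["study_count"] = (n_a, n_b)
    return differences


# In-memory stand-ins; real input would come from json.load() on two files.
original = {"identifier": "i1", "title": "My Simple ISA Investigation", "studies": [{}]}
converted = {"identifier": "i1", "title": "My Simple ISA Investigation", "studies": [{}]}
print(compare_isajson(original, converted))
```

An empty dictionary means the compared fields match. To apply it to this notebook's outputs, load `example_isa_simple.json` and `converted_from_tab.json` with `json.load()` and pass both dictionaries in.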

## 9. Batch Validation {#batch-validation}

Examples of validating multiple ISA files at once.

### Batch validate ISA-Tab directories

In [None]:
import os

from isatools import isatab

# List of ISA-Tab directories to validate
my_tabs = [
    output_dir,
    json_to_tab_dir
]

# Batch validate and write report
batch_report_path = 'batch_validation_report_tab.txt'
batch_report = isatab.batch_validate(my_tabs, batch_report_path)

print("Batch ISA-Tab Validation:")
print(f"  Validated {len(my_tabs)} directories")
print(f"  Report saved to: {batch_report_path}")

# Display report summary
if os.path.exists(batch_report_path):
    with open(batch_report_path, 'r') as f:
        report_content = f.read()
        print(f"\nReport preview (first 500 characters):")
        print(report_content[:500])

### Batch validate ISA-JSON files

In [None]:
import os

from isatools import isajson

# List of ISA-JSON files to validate
my_jsons = [
    'example_isa_simple.json',
    'converted_from_tab.json'
]

# Batch validate and write report
batch_json_report_path = 'batch_validation_report_json.txt'
batch_json_report = isajson.batch_validate(my_jsons, batch_json_report_path)

print("Batch ISA-JSON Validation:")
print(f"  Validated {len(my_jsons)} files")
print(f"  Report saved to: {batch_json_report_path}")

# Display report summary
if os.path.exists(batch_json_report_path):
    with open(batch_json_report_path, 'r') as f:
        report_content = f.read()
        print(f"\nReport preview (first 500 characters):")
        print(report_content[:500])

### Reformatting validation reports

The report returned by `isatab.validate()` can be written out as CSV:

In [None]:
import os

from isatools import utils

# Format the validation report as CSV
csv_report_path = 'validation_report.csv'
with open(csv_report_path, 'w') as report_file:
    report_file.write(utils.format_report_csv(validation_report))

print(f"✓ Formatted validation report as CSV: {csv_report_path}")

# Display CSV preview
if os.path.exists(csv_report_path):
    with open(csv_report_path, 'r') as f:
        csv_content = f.read()
        print(f"\nCSV Report preview (first 300 characters):")
        print(csv_content[:300])

## 10. Advanced Examples {#advanced}

Additional features and utilities.

### Using Comments to annotate ISA objects

In [None]:
# Create a study with comments
study_with_comments = Study(filename="s_commented.txt")
study_with_comments.identifier = "s_commented"
study_with_comments.title = "Study with Comments"

# Add comments to study
study_with_comments.comments.append(
    Comment(name="Study Start Date", value="2025-01-01")
)
study_with_comments.comments.append(
    Comment(name="Study End Date", value="2025-12-31")
)

print("Study with comments:")
for comment in study_with_comments.comments:
    print(f"  {comment.name}: {comment.value}")

### Using Study Factors

In [None]:
# Create study factors
treatment_factor = StudyFactor(
    name="treatment",
    factor_type=OntologyAnnotation(term="treatment")
)
treatment_factor.comments.append(
    Comment(name="Description", value="Drug treatment factor")
)

study_with_comments.factors.append(treatment_factor)

print(f"\nStudy factor added: {treatment_factor.name}")
print(f"  Type: {treatment_factor.factor_type.term}")
print(f"  Comments: {len(treatment_factor.comments)}")

### Using plink() to connect processes

In [None]:
# plink() helps connect processes in the workflow
# It was already used in the assay creation above

# Create two processes
process1 = Process(executes_protocol=Protocol(name="step1"))
process2 = Process(executes_protocol=Protocol(name="step2"))

# Add output to process1
intermediate = Material(name="intermediate_material")
intermediate.type = "Extract Name"
process1.outputs.append(intermediate)

# Add same material as input to process2
process2.inputs.append(intermediate)

# plink() records the ordering between the two processes
plink(process1, process2)

print("Process linking example:")
print(f"  Process 1 outputs: {len(process1.outputs)}")
print(f"  Process 2 inputs: {len(process2.inputs)}")
print(f"  process1.next_process is process2: {process1.next_process is process2}")
print("  Both processes share the intermediate material")
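
Once processes are linked, a workflow can be traversed from its first step. The sketch below assumes that linking is recorded in a `next_process` attribute on each process. It runs on plain stand-in objects so that it works without `isatools` installed. `walk_process_chain` is a hypothetical helper, not part of ISA-API.

```python
from types import SimpleNamespace


def walk_process_chain(first):
    """Yield each process reachable from `first` by following `next_process`."""
    seen = set()
    current = first
    # Stop at the end of the chain, or if a cycle leads back to a visited node.
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        yield current
        current = getattr(current, "next_process", None)


# Stand-in objects that mimic two linked processes.
step2 = SimpleNamespace(name="step2", next_process=None)
step1 = SimpleNamespace(name="step1", next_process=step2)
print([p.name for p in walk_process_chain(step1)])  # ['step1', 'step2']
```

The `seen` set guards against accidental cycles, which would otherwise make the traversal loop forever.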

### Batch creating materials

In [None]:
# batch_create_materials() efficiently creates multiple materials
# from a prototype (already used above)

prototype = Sample(name="sample")
prototype.characteristics.append(
    Characteristic(
        category=OntologyAnnotation(term="age"),
        value=OntologyAnnotation(term="adult")
    )
)

# Create 10 samples from prototype
samples = batch_create_materials(prototype, n=10)

print(f"Created {len(samples)} samples:")
for i, sample in enumerate(samples[:5]):
    print(f"  {i+1}. {sample.name}")
print(f"  ... and {len(samples) - 5} more")
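
The prototype pattern behind batch creation can be illustrated in plain Python: each copy is an independent deep copy of the prototype with its own name. This sketch is not the ISA-API implementation. `StandInSample` and `clone_from_prototype` are hypothetical, and the `-<index>` naming is an assumption, so compare it with the names printed by the cell above.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class StandInSample:
    """Hypothetical stand-in for an ISA material; not an isatools class."""
    name: str
    characteristics: list = field(default_factory=list)


def clone_from_prototype(prototype, n):
    """Deep-copy the prototype n times, suffixing each copy's name with its index."""
    clones = []
    for i in range(n):
        clone = deepcopy(prototype)
        clone.name = f"{prototype.name}-{i}"
        clones.append(clone)
    return clones


proto = StandInSample(name="sample", characteristics=["age: adult"])
copies = clone_from_prototype(proto, 3)
print([c.name for c in copies])  # ['sample-0', 'sample-1', 'sample-2']
```

Deep copying matters here: appending a characteristic to one copy leaves the prototype and the other copies unchanged.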

## Summary

This notebook has demonstrated the following features from the ISA-API documentation:

✓ Creating ISA Investigation, Study, and Assay objects  
✓ Adding ontology annotations and metadata  
✓ Creating source materials, samples, and data files  
✓ Defining protocols and process workflows  
✓ Exporting to ISA-Tab format  
✓ Exporting to ISA-JSON format  
✓ Reading ISA-Tab and ISA-JSON files  
✓ Validating ISA metadata  
✓ Converting between ISA-Tab and ISA-JSON  
✓ Batch validation of multiple files  
✓ Advanced features: Comments, Study Factors, plink(), batch materials  

## Resources

- **Official Documentation**: https://isa-tools.org/isa-api/content/
- **GitHub Repository**: https://github.com/ISA-tools/isa-api
- **PyPI Package**: https://pypi.org/project/isatools/
- **ISA Community**: https://www.isacommons.org
- **More Examples**: Check the `isa-cookbook/` directory in this repository