This notebook creates a Workflow RO-Crate for the BatchConvert tool at the root of the repo.  
The notebook could be extended to document a run of the workflow, which would yield a Workflow Run RO-Crate.

The notebook uses the general-purpose Python rocrate library to create the crate (and the underlying JSON file).  
Adding an item to the crate follows this sequence:
- create an item 
- add it to the crate
- link the item to other items in the crate

Some functions of the rocrate package both create an item and add it to the crate (e.g. add_workflow).  
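
The three steps listed above can be sketched without any library, using plain dictionaries for the JSON-LD graph. This only illustrates the kind of content that ends up in the metadata file; it is not how rocrate is implemented. The helper `add_entity` is hypothetical, and the `ComputerLanguage` entry for `#bash` is an assumption made for the example.

```python
import json

# Skeleton of an RO-Crate 1.1 metadata graph: the metadata descriptor and the root dataset.
graph = [
    {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    },
    {"@id": "./", "@type": "Dataset", "hasPart": []},
]
root = graph[1]


def add_entity(graph, entity):
    """Hypothetical helper: append an entity, refusing duplicate @id values."""
    if any(existing["@id"] == entity["@id"] for existing in graph):
        raise ValueError(f"duplicate @id: {entity['@id']}")
    graph.append(entity)
    return entity


# Step 1: create the items.
workflow = {
    "@id": "batchconvert",
    "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
    "name": "batchconvert",
}
bash = {"@id": "#bash", "@type": "ComputerLanguage", "name": "Bash"}  # assumed entry

# Step 2: add them to the crate.
add_entity(graph, workflow)
add_entity(graph, bash)

# Step 3: link the items to each other by @id reference.
workflow["programmingLanguage"] = {"@id": bash["@id"]}
root["hasPart"].append({"@id": workflow["@id"]})
root["mainEntity"] = {"@id": workflow["@id"]}

metadata = {"@context": "https://w3id.org/ro/crate/1.1/context", "@graph": graph}
print(json.dumps(metadata, indent=2))
```

Links are stored as `{"@id": ...}` references rather than nested objects, which is why an item has to exist in the graph before other items can point to it.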

In [1]:
import os, sys
from pprint import pprint
from pathlib import Path
from rocrate_validator import services, models


sys.path.append("./bin")
#sys.path.append(str(Path.cwd().joinpath("bin")))
#pprint(sys.path)

print("Working directory :", os.getcwd())

from make_workflow_crate import create_workflow_crate, write_workflow_run_crate

def validate_crate(path, severity = models.Severity.RECOMMENDED):
    """Validate the crate at `path` against the "workflow-ro-crate-1.0" profile."""

    # Create an instance of `ValidationSettings` class to configure the validation
    settings = services.ValidationSettings(rocrate_uri = path,
                                            profile_identifier = "workflow-ro-crate-1.0",
                                            # Use the caller-supplied severity (e.g. models.Severity.REQUIRED)
                                            requirement_severity = severity
                                            )

    # Call the validation service with the settings
    result = services.validate(settings)

    # Check if the validation was successful
    if not result.has_issues():
        print("RO-Crate is valid!")

    else:
        print("RO-Crate is invalid!")
        
        # Explore the issues
        for issue in result.get_issues():
            # Every issue object has a reference to the check that failed, the severity of the issue, and a message describing the issue.
            print(f"Detected issue of severity {issue.severity.name} with check \"{issue.check.identifier}\": {issue.message}")

Working directory : /Users/thomasl/Documents/repos/BatchConvert


In [2]:
crate = create_workflow_crate(repo_root_dir = "")

In [3]:
crate.mainEntity.as_jsonld()

{'@id': 'batchconvert',
 '@type': ['File', 'SoftwareSourceCode', 'ComputationalWorkflow'],
 'name': 'batchconvert',
 'programmingLanguage': {'@id': '#bash'},
 'input': [{'@id': '#conversion_format'},
  {'@id': '#src_dir'},
  {'@id': '#dest_dir'},
  {'@id': '#merge_files'},
  {'@id': '#concatenation_order'}],
 'hasPart': [{'@id': 'pff2omezarr.nf'}, {'@id': 'pff2ometiff.nf'}],
 'url': ['https://github.com/Euro-BioImaging/BatchConvert']}
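
The output above stores links to other items as `{'@id': ...}` references. As a small sketch of how to read them back, the hypothetical helper `referenced_ids` below (not part of rocrate) collects the identifiers referenced by one property. The dictionary is copied from the output above.

```python
# Copy of the JSON-LD returned by crate.mainEntity.as_jsonld() above.
main_entity = {
    "@id": "batchconvert",
    "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
    "name": "batchconvert",
    "programmingLanguage": {"@id": "#bash"},
    "input": [
        {"@id": "#conversion_format"},
        {"@id": "#src_dir"},
        {"@id": "#dest_dir"},
        {"@id": "#merge_files"},
        {"@id": "#concatenation_order"},
    ],
    "hasPart": [{"@id": "pff2omezarr.nf"}, {"@id": "pff2ometiff.nf"}],
    "url": ["https://github.com/Euro-BioImaging/BatchConvert"],
}


def referenced_ids(entity, prop):
    """Hypothetical helper: return the @id of every item referenced by `prop`."""
    value = entity.get(prop, [])
    if isinstance(value, dict):  # a single reference is not wrapped in a list
        value = [value]
    # Plain values (e.g. URL strings) are not references and are skipped.
    return [v["@id"] for v in value if isinstance(v, dict) and "@id" in v]


print(referenced_ids(main_entity, "input"))
print(referenced_ids(main_entity, "hasPart"))
print(referenced_ids(main_entity, "programmingLanguage"))
```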

In [5]:
# Write and validate the crate
path = "test_crate"
crate.write(path)
validate_crate(path)

RO-Crate is valid!


In [4]:
dest_dir = "/Users/thomasl/Documents/z-stack-acquifer"
write_workflow_run_crate(batch_convert_repo_dir = "",
                         dest_dir = dest_dir)
validate_crate(dest_dir)

RO-Crate is valid!


In [None]:
"""
Shows how to create an RO-Crate from a directory without duplicating the data.  
The package appears to detect that the images are already in a subfolder of the "output" directory.
"""
from rocrate.rocrate import ROCrate
import os

src = "/Users/thomasl/Documents/z-stack-acquifer"
# crate = ROCrate(source = src, init=True)
# The line above works but lists all files of all subdirectories.
# To add a directory as a single entry, do not use init=True.

crate = ROCrate()
crate.add_directory(os.path.join(src, "images")) # alias of add_dataset

crate.write(src)