# Tutorial 4: Adding non-SPARC tools and models to the knowledge graph

Many tools are being developed in research efforts around the world. These tools are being collated in tool and model registries such as WorkflowHub and the Physiome Model Repository (PMR). This tutorial shows how external tools and models can be added to the knowledge graph so that they can be used when assembling workflows.


## Requirements
pip install sparc-assemble


## Using sparc_assemble to download an existing CWL workflow from WorkflowHub
An example tool is sourced from [WorkflowHub](https://workflowhub.eu/workflows/525/ro_crate?version=1).
WorkflowHub stores CWL workflows as RO-Crates, so the workflow is already in a FAIR format and there is no need to convert it to SDS.
The code below downloads the workflow and its related tools as CWL files from WorkflowHub and saves the CWL files in the plugin folder.


In [None]:
from sparc_assemble import WorkflowHub

zip_url = "https://workflowhub.eu/workflows/525/ro_crate?version=1"
workflow_hub = WorkflowHub(zip_url)

# generate ZIP filename
zip_filename = workflow_hub.generate_zip_filename(zip_url)
local_zip_path = zip_filename+'.zip'

# Download the ZIP file
workflow_hub.download_zip_file(zip_url, local_zip_path)

# Unzip the entire contents of the ZIP archive
output_directory = zip_filename
workflow_hub.unzip_folder(local_zip_path, output_directory)
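
Once the archive is unpacked, it can help to confirm what it contains before adding anything to the knowledge graph. The sketch below is not part of sparc_assemble. It uses only the Python standard library and assumes the crate follows the RO-Crate convention of a `ro-crate-metadata.json` file whose root dataset (the entity with `@id` equal to `./`) points to the workflow through `mainEntity`. The helper names `find_cwl_files` and `main_workflow` are hypothetical.

```python
import json
from pathlib import Path


def find_cwl_files(crate_dir):
    """Return all .cwl files under crate_dir as sorted relative POSIX paths."""
    root = Path(crate_dir)
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*.cwl"))


def main_workflow(crate_dir):
    """Return the @id of the crate's main workflow, or None if it is not declared."""
    metadata_path = Path(crate_dir) / "ro-crate-metadata.json"
    if not metadata_path.is_file():
        return None
    with metadata_path.open(encoding="utf-8") as handle:
        graph = json.load(handle).get("@graph", [])
    for entity in graph:
        if entity.get("@id") == "./":
            main = entity.get("mainEntity")
            # mainEntity is normally a reference of the form {"@id": "..."}
            return main.get("@id") if isinstance(main, dict) else main
    return None
```

With the variables from the cell above, `find_cwl_files(output_directory)` would list the CWL files that were extracted, and `main_workflow(output_directory)` would report which of them the crate declares as the workflow entry point.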


## Using sparc_assemble to add an external tool to the knowledge graph

The code below shows how to extract the metadata from the CWL files and use it to populate the knowledge graph.


In [None]:
from sparc_assemble import KG

tool_library = r"../../resources/tools"
save_kg_path = r"./kg_workflowhub_tools.owl"

# initialising KG from default ontology "EDAM"
kg = KG()

# adding tools to KG
kg.add_tools(tool_library=tool_library)

# listing tools in KG
kg.list()

kg.save(save_path=save_kg_path)
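
To give a feel for the kind of metadata a CWL file carries, the sketch below summarises the name, class, inputs, and outputs of an already-parsed CWL document. It is an illustration only and does not reproduce how sparc_assemble extracts metadata internally. The function `summarise_cwl` and the `example_tool` dictionary are hypothetical; in practice the dictionary would come from loading a `.cwl` file with a YAML parser (for example `yaml.safe_load` from PyYAML, assumed to be installed separately).

```python
def _as_mapping(section):
    """Normalise a CWL inputs/outputs section to {id: definition}."""
    if isinstance(section, dict):
        return section
    # List form: each entry is a dict carrying its own "id".
    return {entry["id"]: entry for entry in (section or [])}


def _type_of(definition):
    # Short form is "name: type"; long form is "name: {type: ...}".
    return definition.get("type") if isinstance(definition, dict) else definition


def summarise_cwl(doc):
    """Collect the name, class, and typed inputs/outputs of a parsed CWL document."""
    inputs = _as_mapping(doc.get("inputs"))
    outputs = _as_mapping(doc.get("outputs"))
    return {
        "name": doc.get("label") or doc.get("id"),
        "class": doc.get("class"),
        "inputs": {name: _type_of(d) for name, d in inputs.items()},
        "outputs": {name: _type_of(d) for name, d in outputs.items()},
    }


# A hand-written stand-in for a parsed CWL tool description (illustrative only).
example_tool = {
    "cwlVersion": "v1.2",
    "class": "CommandLineTool",
    "label": "example_tool",
    "inputs": {"input_file": "File", "threshold": {"type": "float"}},
    "outputs": [{"id": "result", "type": "File"}],
}

summary = summarise_cwl(example_tool)
print(summary["inputs"])   # {'input_file': 'File', 'threshold': 'float'}
print(summary["outputs"])  # {'result': 'File'}
```

Handling both the mapping form and the list form matters because CWL allows either for `inputs` and `outputs`, and files from a registry may mix the two.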


The code that generated this knowledge graph can be found in the [Tutorial 2 folder](tutorials/tutorial_2) on GitHub. Tutorial 2 shows how to build a knowledge graph for automated workflow assembly using sparc-assemble.

## Using sparc_assemble to add an external model to the knowledge graph


An example CellML model is identified from the Physiome Model Repository. This CellML model is stored in a [SPARC dataset](https://sparc.science/datasets/135) (refer to Tutorial 1 for more information).

The code below shows how to download the metadata and the model file of the CellML model locally using the model ID and URL.


In [None]:
!pip install biomodels
import biomodels

# Retrieve metadata for the specified model using its unique identifier
metadata = biomodels.get_metadata("BIOMD0000000012")
# Download the model file in XML format based on the model ID and URL, and store it locally
metadata_file = biomodels.get_file("BIOMD0000000012_url.xml", model_id="BIOMD12")
# Convert the file's path to a string and get the directory path where the file is stored
metadata_file_cache_path = str(metadata_file.parent)

The code below shows how to load the model file of the CellML model and extract the species (inputs) and reactions (outputs) from the model.

In [None]:
!pip install simplesbml
import simplesbml

# Construct the full path to the downloaded model file by combining the directory path and file name
model_path = metadata_file_cache_path + '/BIOMD0000000012_url.xml'
# Load the SBML model from the specified file path using simplesbml
model = simplesbml.loadSBMLFile(model_path)
# Retrieve the name of the model
model_name = model.model.name
# Extract all species (inputs) from the model and store them in a list
inputs = list(model.model.species.all_elements)
# Extract all reactions (outputs) from the model and store them in a list
outputs = list(model.model.reactions.all_elements)
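
As a dependency-free cross-check of the species and reactions extracted above, the sketch below reads the same information straight from the SBML XML using only the Python standard library. It is not a replacement for simplesbml: it simply collects the `id` attribute of every `species` and `reaction` element, ignoring XML namespaces. The function `list_species_and_reactions` and the `example_sbml` fragment are hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def list_species_and_reactions(sbml_text):
    """Return (species ids, reaction ids) found in an SBML document."""
    root = ET.fromstring(sbml_text)
    species, reactions = [], []
    for element in root.iter():
        name = _local(element.tag)
        if name == "species":
            species.append(element.get("id"))
        elif name == "reaction":
            reactions.append(element.get("id"))
    return species, reactions


# Tiny hand-written SBML fragment, used here only for illustration.
example_sbml = """<sbml xmlns="http://www.sbml.org/sbml/level2/version4" level="2" version="4">
  <model id="toy">
    <listOfSpecies>
      <species id="A" compartment="cell"/>
      <species id="B" compartment="cell"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="A_to_B"/>
    </listOfReactions>
  </model>
</sbml>"""

species_ids, reaction_ids = list_species_and_reactions(example_sbml)
print(species_ids)   # ['A', 'B']
print(reaction_ids)  # ['A_to_B']
```

For the downloaded model, the file contents could be passed in as bytes, for example `list_species_and_reactions(open(model_path, "rb").read())`, and the resulting counts compared with the lengths of `inputs` and `outputs`.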

The code below shows how to add a CellML model to the knowledge graph using the model name and the species (inputs) and reactions (outputs) extracted from the model.

In [15]:
# defining the path to save the knowledge graph
save_kg_path = "./kg_biomodels_model.owl"

# initialising KG from default ontology "EDAM"
kg = KG()

# adding biomodels to KG
kg.add_biomodels_model(inputs, outputs, model_name, model_path)

# listing tools in KG
kg.list()

kg.save(save_path=save_kg_path)
