# PICO Research Question Nanopublication Creator

Creates PICO nanopublications from a JSON configuration file.

**Template:** [Defining a PICO-based research question](https://w3id.org/np/RA5e5XeXy_-aNK5giB7kBAEQslTLVydHeM4YYEzhmEE2w)

This template uses the Cochrane PICO ontology for structured research questions.

---

## Instructions

1. **Create a JSON file** with your PICO details (see template below)
2. **Set the path** to your JSON file in Section 1
3. **Run All Cells** → Get your `.trig` file

---
# 📁 SECTION 1: INPUT FILE (EDIT THIS)
---

In [1]:
# Path to your PICO JSON file.
# Keep exactly one assignment uncommented; the others are alternative inputs.
# PICO_FILE = "../inputs/quantum-biodiversity/pico-quantum-biodiversity.json"
# PICO_FILE = "../inputs/pets-biodiversity/pets-biodiversity-pico.json"
# PICO_FILE = "../inputs/agriculture-privacy-eo/pico-agriculture-privacy-eo.json"
# PICO_FILE = "../inputs/climate-differential-privacy/pico-climate-differential-privacy.json"
# PICO_FILE = "../inputs/crossborder-eo-privacy/pico-crossborder-eo-privacy.json"
# PICO_FILE = "../inputs/federated-learning-eo/pico-federated-learning-eo.json"
# PICO_FILE = "../inputs/indigenous-forest-privacy/pico-indigenous-forest-privacy.json"
# PICO_FILE = "../inputs/urban-imagery-privacy/pico-urban-imagery-privacy.json"
PICO_FILE = "../inputs/wildfire-sentinel2-ml/pico-wildfire-sentinel2-ml.json"

---
# ⚙️ SECTION 2: SETUP
---

In [2]:
# Install dependencies (uncomment if needed)
# !pip install nanopub rdflib

In [3]:
import json
import re
from rdflib import Graph, Dataset, Namespace, Literal, URIRef
from rdflib.namespace import RDF, RDFS, XSD, FOAF
from datetime import datetime, timezone
from pathlib import Path

# Namespaces (matching Nanodash)
NP = Namespace("http://www.nanopub.org/nschema#")
DCT = Namespace("http://purl.org/dc/terms/")
NT = Namespace("https://w3id.org/np/o/ntemplate/")
NPX = Namespace("http://purl.org/nanopub/x/")
PROV = Namespace("http://www.w3.org/ns/prov#")
ORCID = Namespace("https://orcid.org/")

# Cochrane PICO ontology
PICO = Namespace("http://data.cochrane.org/ontologies/pico/")

# Science Live ontology for question types
SCIENCELIVE = Namespace("https://w3id.org/sciencelive/o/terms/")

# PICO template URIs (NEW TEMPLATE)
PICO_TEMPLATE = URIRef("https://w3id.org/np/RA5e5XeXy_-aNK5giB7kBAEQslTLVydHeM4YYEzhmEE2w")
PICO_TEMPLATE_NS = Namespace("https://w3id.org/np/RA5e5XeXy_-aNK5giB7kBAEQslTLVydHeM4YYEzhmEE2w/")

# Template references
PROV_TEMPLATE = URIRef("https://w3id.org/np/RA7lSq6MuK_TIC6JMSHvLtee3lpLoZDOqLJCLXevnrPoU")
PUBINFO_TEMPLATE_1 = URIRef("https://w3id.org/np/RA0J4vUn_dekg-U1kK3AOEt02p9mT2WO03uGxLDec1jLw")
PUBINFO_TEMPLATE_2 = URIRef("https://w3id.org/np/RAukAcWHRDlkqxk7H2XNSegc1WnHI569INvNr-xdptDGI")
PUBINFO_TEMPLATE_3 = URIRef("https://w3id.org/np/RAoTD7udB2KtUuOuAe74tJi1t3VzK0DyWS7rYVAq1GRvw")

# Question type mapping (new Science Live URIs)
QUESTION_TYPE_MAP = {
    "causation": SCIENCELIVE.CausationResearchQuestion,
    "descriptive": SCIENCELIVE.DescriptiveResearchQuestion,
    "effectiveness": SCIENCELIVE.EffectivenessResearchQuestions,
    "experience": SCIENCELIVE.ExperienceResearchQuestions,
    "prediction": SCIENCELIVE.PredictionResearchQuestions,
}

VALID_QUESTION_TYPES = list(QUESTION_TYPE_MAP.keys())

print("✓ Setup complete")

✓ Setup complete


---
# 📖 SECTION 3: LOAD & VALIDATE
---

In [4]:
# Load PICO from JSON
print(f"Loading: {PICO_FILE}")

with open(PICO_FILE, 'r', encoding='utf-8') as f:
    config = json.load(f)

# Extract fields
AUTHOR_ORCID = config['author']['orcid']
AUTHOR_NAME = config['author']['name']

TITLE = config['pico']['title']
POPULATION = config['pico']['population']
INTERVENTION = config['pico']['intervention']
COMPARISON = config['pico']['comparison']
OUTCOME = config['pico']['outcome']
RESEARCH_QUESTION = config['pico']['research_question']
QUESTION_TYPE = config['pico']['question_type']
RATIONALE = config['pico'].get('rationale', '')  # Optional (not used in new template)

OUTPUT_FILENAME = config['output']['filename']

print(f"✓ Loaded PICO: {TITLE[:50]}...")

Loading: ../inputs/wildfire-sentinel2-ml/pico-wildfire-sentinel2-ml.json
✓ Loaded PICO: Machine Learning Algorithms for Wildfire Detection...


In [5]:
# Validate
print("Validating...")

errors = []
if not AUTHOR_ORCID:
    errors.append("author.orcid is required")
if not AUTHOR_NAME:
    errors.append("author.name is required")
if not TITLE or len(TITLE) < 10:
    errors.append("pico.title must be at least 10 characters")
if not POPULATION:
    errors.append("pico.population is required")
if not INTERVENTION:
    errors.append("pico.intervention is required")
if not OUTCOME:
    errors.append("pico.outcome is required")
if not RESEARCH_QUESTION:
    errors.append("pico.research_question is required")
if QUESTION_TYPE not in VALID_QUESTION_TYPES:
    errors.append(f"pico.question_type must be one of: {VALID_QUESTION_TYPES}")

if errors:
    print("❌ Validation errors:")
    for e in errors:
        print(f"   - {e}")
    raise ValueError("Please fix the errors in your JSON file")
else:
    print("✓ All fields valid")

Validating...
✓ All fields valid
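
The validation cell only checks that `author.orcid` is non-empty, yet the value is later used to build the author URI via `ORCID[AUTHOR_ORCID]`. A stricter check could look like the sketch below. The helper names `orcid_check_digit` and `is_valid_orcid` are hypothetical and not part of this notebook; the check digit follows the ISO 7064 MOD 11-2 scheme that ORCID iDs use.

```python
import re

# Bare ORCID iD: four groups of four, last character may be X (no URL prefix)
ORCID_PATTERN = re.compile(r"^[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]$")


def orcid_check_digit(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Return True if the string is a bare ORCID iD with a correct check digit."""
    if not ORCID_PATTERN.match(orcid):
        return False
    digits = orcid.replace("-", "")
    return orcid_check_digit(digits[:15]) == digits[15]


print(is_valid_orcid("0000-0002-1784-2920"))  # True
print(is_valid_orcid("0000-0000-0000-0000"))  # False: template placeholder
```

If adopted, a line such as `if not is_valid_orcid(AUTHOR_ORCID): errors.append(...)` would also catch a JSON file where the `0000-0000-0000-0000` placeholder from the template was never replaced, or where the full `https://orcid.org/...` URL was pasted instead of the bare identifier.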


In [6]:
# Generate a URI-safe ID from the title for the PICO resource
def slugify(text, max_length=50):
    """Create a URL-safe slug from text."""
    # Convert to lowercase and replace spaces with hyphens
    slug = text.lower().strip()
    slug = re.sub(r'[^a-z0-9\s_-]', '', slug)  # Remove special chars (keep underscores for the next step)
    slug = re.sub(r'[\s_]+', '-', slug)  # Replace spaces/underscores with hyphens
    slug = re.sub(r'-+', '-', slug)  # Remove consecutive hyphens
    slug = slug.strip('-')  # Remove leading/trailing hyphens
    return slug[:max_length]

PICO_ID = slugify(TITLE)
print(f"‚úì Generated PICO ID: {PICO_ID}")

‚úì Generated PICO ID: machine-learning-algorithms-for-wildfire-detection
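
Two edge cases of the `slugify` cell are worth keeping in mind: truncating at `max_length` can leave a trailing hyphen when the cut falls right after a word, and accented letters are dropped rather than transliterated. The variant below is a minimal sketch of one way to handle both. The name `slugify_safe` is hypothetical, and this is not the function the notebook actually runs.

```python
import re
import unicodedata


def slugify_safe(text: str, max_length: int = 50) -> str:
    """Create a URL-safe slug, transliterating accents and trimming after truncation."""
    # Decompose accented characters and drop the combining marks (e.g. "ê" -> "e")
    ascii_text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    slug = ascii_text.lower().strip()
    slug = re.sub(r"[^a-z0-9\s_-]", "", slug)  # Remove special chars, keep underscores for now
    slug = re.sub(r"[\s_]+", "-", slug)        # Spaces and underscores become hyphens
    slug = re.sub(r"-+", "-", slug)            # Collapse consecutive hyphens
    # Strip hyphens after truncating so the slug never starts or ends with one
    return slug[:max_length].strip("-")


print(slugify_safe("alpha beta gamma", max_length=11))  # alpha-beta
print(slugify_safe("Forêt privée: étude"))              # foret-privee-etude
```

Because the slug becomes part of the resource URI that `npx:introduces` points to, changing the slug function changes the identifier, so any switch should happen before publishing rather than between versions of the same question.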


---
# 🔨 SECTION 4: BUILD NANOPUBLICATION
---

In [7]:
# Create dataset with named graphs
TEMP_NP = Namespace("http://purl.org/nanopub/temp/np/")

this_np = URIRef("http://purl.org/nanopub/temp/np/")
head_graph = URIRef("http://purl.org/nanopub/temp/np/Head")
assertion_graph = URIRef("http://purl.org/nanopub/temp/np/assertion")
provenance_graph = URIRef("http://purl.org/nanopub/temp/np/provenance")
pubinfo_graph = URIRef("http://purl.org/nanopub/temp/np/pubinfo")

ds = Dataset()

# Bind prefixes
ds.bind("this", "http://purl.org/nanopub/temp/np/")
ds.bind("sub", TEMP_NP)
ds.bind("np", NP)
ds.bind("dct", DCT)
ds.bind("nt", NT)
ds.bind("npx", NPX)
ds.bind("xsd", XSD)
ds.bind("rdfs", RDFS)
ds.bind("orcid", ORCID)
ds.bind("prov", PROV)
ds.bind("foaf", FOAF)
ds.bind("pico", PICO)
ds.bind("sciencelive", SCIENCELIVE)

print("✓ Dataset created")

✓ Dataset created


In [8]:
# HEAD
head = ds.graph(head_graph)
head.add((this_np, RDF.type, NP.Nanopublication))
head.add((this_np, NP.hasAssertion, assertion_graph))
head.add((this_np, NP.hasProvenance, provenance_graph))
head.add((this_np, NP.hasPublicationInfo, pubinfo_graph))
print(f"✓ Head: {len(head)} triples")

✓ Head: 4 triples


In [9]:
# ASSERTION - Using Cochrane PICO ontology structure
assertion = ds.graph(assertion_graph)

# Create local resource URIs
pico_uri = TEMP_NP[PICO_ID]
population_uri = TEMP_NP["population"]
intervention_uri = TEMP_NP["interventionGroup"]
comparator_uri = TEMP_NP["comparatorGroup"]
outcome_uri = TEMP_NP["outcomeGroup"]

# Main PICO resource
assertion.add((pico_uri, RDF.type, PICO.PICO))  # st1: type PICO
assertion.add((pico_uri, RDF.type, QUESTION_TYPE_MAP[QUESTION_TYPE]))  # st1b: question type
assertion.add((pico_uri, RDFS.label, Literal(TITLE)))  # st2: label
assertion.add((pico_uri, DCT.description, Literal(RESEARCH_QUESTION)))  # st3: description

# Population (P)
assertion.add((pico_uri, PICO.population, population_uri))  # st4a
assertion.add((population_uri, DCT.description, Literal(POPULATION)))  # st4b

# Intervention (I)
assertion.add((pico_uri, PICO.interventionGroup, intervention_uri))  # st5a
assertion.add((intervention_uri, DCT.description, Literal(INTERVENTION)))  # st5b

# Comparator (C)
assertion.add((pico_uri, PICO.comparatorGroup, comparator_uri))  # st6a
assertion.add((comparator_uri, DCT.description, Literal(COMPARISON)))  # st6b

# Outcome (O)
assertion.add((pico_uri, PICO.outcomeGroup, outcome_uri))  # st7a
assertion.add((outcome_uri, DCT.description, Literal(OUTCOME)))  # st7b

print(f"✓ Assertion: {len(assertion)} triples")

✓ Assertion: 12 triples


In [10]:
# PROVENANCE
provenance = ds.graph(provenance_graph)
author_uri = ORCID[AUTHOR_ORCID]
provenance.add((assertion_graph, PROV.wasAttributedTo, author_uri))
print(f"✓ Provenance: {len(provenance)} triples")

✓ Provenance: 1 triples


In [11]:
# PUBINFO
pubinfo = ds.graph(pubinfo_graph)
now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")

pubinfo.add((author_uri, FOAF.name, Literal(AUTHOR_NAME)))
pubinfo.add((this_np, DCT.created, Literal(now, datatype=XSD.dateTime)))
pubinfo.add((this_np, DCT.creator, author_uri))
pubinfo.add((this_np, DCT.license, URIRef("https://creativecommons.org/licenses/by/4.0/")))
pubinfo.add((this_np, NPX.wasCreatedAt, URIRef("https://nanodash.knowledgepixels.com/")))

# CRITICAL: npx:introduces enables federated SPARQL queries to find this resource
# Without this, queries using SERVICE to join on the resource URI will fail
pubinfo.add((this_np, NPX.introduces, pico_uri))

# Label (truncate if needed)
label = f"PICO Research Question: {TITLE}"
if len(label) > 100:
    label = label[:97] + "..."
pubinfo.add((this_np, RDFS.label, Literal(label)))

# Template references (updated for new template)
pubinfo.add((this_np, NT.wasCreatedFromProvenanceTemplate, PROV_TEMPLATE))
pubinfo.add((this_np, NT.wasCreatedFromPubinfoTemplate, PUBINFO_TEMPLATE_1))
pubinfo.add((this_np, NT.wasCreatedFromPubinfoTemplate, PUBINFO_TEMPLATE_2))
pubinfo.add((this_np, NT.wasCreatedFromPubinfoTemplate, PUBINFO_TEMPLATE_3))
pubinfo.add((this_np, NT.wasCreatedFromTemplate, PICO_TEMPLATE))

print(f"✓ Pubinfo: {len(pubinfo)} triples")

✓ Pubinfo: 12 triples


---
# 📄 SECTION 5: OUTPUT
---

In [12]:
# Serialize and save
trig_output = ds.serialize(format="trig")

output_path = Path(f"{OUTPUT_FILENAME}.trig")
with open(output_path, "w", encoding="utf-8") as f:
    f.write(trig_output)

print(f"✓ Saved to: {output_path.absolute()}")

✓ Saved to: /Users/annef/Documents/FAIR2Adapt/systematic-review-pipeline/notebooks/wildfire-sentinel2-ml-pico.trig


In [13]:
# Display output
print("=" * 70)
print("NANOPUBLICATION (TriG format)")
print("=" * 70)
print(trig_output)

NANOPUBLICATION (TriG format)
@prefix dct: <http://purl.org/dc/terms/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix np: <http://www.nanopub.org/nschema#> .
@prefix npx: <http://purl.org/nanopub/x/> .
@prefix nt: <https://w3id.org/np/o/ntemplate/> .
@prefix orcid: <https://orcid.org/> .
@prefix pico: <http://data.cochrane.org/ontologies/pico/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix sciencelive: <https://w3id.org/sciencelive/o/terms/> .
@prefix sub: <http://purl.org/nanopub/temp/np/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

sub:pubinfo {
    sub: rdfs:label "PICO Research Question: Machine Learning Algorithms for Wildfire Detection and Burned Area Mappin..." ;
        dct:created "2026-01-06T10:11:08+00:00"^^xsd:dateTime ;
        dct:creator orcid:0000-0002-1784-2920 ;
        dct:license <https://creativecommons.org/licenses/by/4.0/> ;
        npx:introduces sub:machine-learning-algorithms-fo

In [14]:
# Summary
print("=" * 70)
print("SUMMARY")
print("=" * 70)
print(f"Input:    {PICO_FILE}")
print(f"Output:   {output_path}")
print(f"Author:   {AUTHOR_NAME} (orcid:{AUTHOR_ORCID})")
print(f"Type:     {QUESTION_TYPE}")
print(f"PICO ID:  {PICO_ID}")
print()
print("PICO:")
print(f"  Title: {TITLE[:60]}..." if len(TITLE) > 60 else f"  Title: {TITLE}")
print(f"  P: {POPULATION[:55]}..." if len(POPULATION) > 55 else f"  P: {POPULATION}")
print(f"  I: {INTERVENTION[:55]}..." if len(INTERVENTION) > 55 else f"  I: {INTERVENTION}")
print(f"  C: {COMPARISON[:55]}..." if len(COMPARISON) > 55 else f"  C: {COMPARISON}")
print(f"  O: {OUTCOME[:55]}..." if len(OUTCOME) > 55 else f"  O: {OUTCOME}")
print()
print("Template: https://w3id.org/np/RA5e5XeXy_-aNK5giB7kBAEQslTLVydHeM4YYEzhmEE2w")
print()
print("Next steps:")
print(f"  Sign:    nanopub sign {output_path}")
print(f"  Publish: nanopub publish {output_path.stem}.signed.trig")

SUMMARY
Input:    ../inputs/wildfire-sentinel2-ml/pico-wildfire-sentinel2-ml.json
Output:   wildfire-sentinel2-ml-pico.trig
Author:   Anne Fouilloux (orcid:0000-0002-1784-2920)
Type:     descriptive
PICO ID:  machine-learning-algorithms-for-wildfire-detection

PICO:
  Title: Machine Learning Algorithms for Wildfire Detection and Burne...
  P: Geographic regions affected by wildfires globally, with...
  I: Machine learning and deep learning algorithms applied t...
  C: Different ML/DL architectures compared against each oth...
  O: Algorithm performance metrics (accuracy, precision, rec...

Template: https://w3id.org/np/RA5e5XeXy_-aNK5giB7kBAEQslTLVydHeM4YYEzhmEE2w

Next steps:
  Sign:    nanopub sign wildfire-sentinel2-ml-pico.trig
  Publish: nanopub publish wildfire-sentinel2-ml-pico.signed.trig


---
# 🚀 SECTION 6: SIGN & PUBLISH (OPTIONAL)
---

In [15]:
PUBLISH = True
USE_TEST_SERVER = False

In [16]:
if PUBLISH:
    from nanopub import Nanopub, NanopubConf, load_profile
    
    profile = load_profile()
    print(f"Loaded profile: {profile.name}")
    
    conf = NanopubConf(profile=profile, use_test_server=USE_TEST_SERVER)
    np_obj = Nanopub(rdf=output_path, conf=conf)
    
    np_obj.sign()
    print("✓ Signed")

    signed_path = Path(f"{OUTPUT_FILENAME}.signed.trig")
    np_obj.store(signed_path)
    print(f"✓ Saved: {signed_path}")

    np_obj.publish()
    print(f"✓ Published: {np_obj.source_uri}")
else:
    print("Publishing disabled. Set PUBLISH = True to enable.")

Loaded profile: Anne Fouilloux
✓ Signed
✓ Saved: wildfire-sentinel2-ml-pico.signed.trig
✓ Published: https://w3id.org/np/RAjO8tdVOla9I77PeXF4iY92ULngrpx5_ZSKFkVrCmsW0


---
# 📋 JSON TEMPLATE

Create a JSON file with this structure:

```json
{
  "author": {
    "orcid": "0000-0000-0000-0000",
    "name": "Your Name"
  },
  "pico": {
    "title": "Your systematic review title",
    "population": "Who or what is being studied",
    "intervention": "What intervention or exposure",
    "comparison": "Comparison group (or 'Not applicable')",
    "outcome": "What outcomes are measured",
    "research_question": "Your full research question",
    "question_type": "descriptive",
    "rationale": "Why this review is needed (optional, not used in this template)"
  },
  "output": {
    "filename": "my-pico"
  }
}
```

**Question types:** `causation`, `descriptive`, `effectiveness`, `experience`, `prediction`

**Note:** The `evaluation` type is not available in this template.
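
When many input files are needed, the same structure can be generated from Python instead of being written by hand. The sketch below assumes a hypothetical helper, `build_pico_config`, that is not part of this notebook. It mirrors the keys of the template above and rejects question types the notebook would refuse later.

```python
import json

# Question types accepted by the notebook's validation cell
QUESTION_TYPES = ("causation", "descriptive", "effectiveness", "experience", "prediction")


def build_pico_config(*, orcid, name, title, population, intervention, outcome,
                      research_question, comparison="Not applicable",
                      question_type="descriptive", output_filename="my-pico"):
    """Return a dict with the same layout as the JSON template."""
    if question_type not in QUESTION_TYPES:
        raise ValueError(f"question_type must be one of: {list(QUESTION_TYPES)}")
    return {
        "author": {"orcid": orcid, "name": name},
        "pico": {
            "title": title,
            "population": population,
            "intervention": intervention,
            "comparison": comparison,
            "outcome": outcome,
            "research_question": research_question,
            "question_type": question_type,
        },
        "output": {"filename": output_filename},
    }


# Placeholder values taken from the template; replace them with real content
example = build_pico_config(
    orcid="0000-0000-0000-0000",
    name="Your Name",
    title="Your systematic review title",
    population="Who or what is being studied",
    intervention="What intervention or exposure",
    outcome="What outcomes are measured",
    research_question="Your full research question",
)
print(json.dumps(example, indent=2, ensure_ascii=False))
```

The printed JSON can be saved to a file and referenced from `PICO_FILE` in Section 1. The optional `rationale` key is left out here because the notebook does not use it.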

---

## Template Changes

This notebook uses the **Cochrane PICO ontology** template:
- Template: `https://w3id.org/np/RA5e5XeXy_-aNK5giB7kBAEQslTLVydHeM4YYEzhmEE2w`
- Uses `pico:population`, `pico:interventionGroup`, `pico:comparatorGroup`, `pico:outcomeGroup`
- Question types from Science Live ontology (`https://w3id.org/sciencelive/o/terms/`)

---