# PCC Research Question Nanopublication Creator

Creates PCC nanopublications from a JSON configuration file.

**Template:** [Defining a PCC-based research question](https://w3id.org/np/RAmR-xqMgOq3oTJmOVDQFL2p5usID6zqRapizHy0UJb04)

This template uses the PCC framework (Population, Concept, Context) for scoping reviews, data papers, and qualitative research.

---

## When to use PCC vs PICO

| Framework | Use Case | Components |
|-----------|----------|------------|
| **PICO** | Clinical trials, intervention studies | Population, Intervention, Comparison, Outcome |
| **PCC** | Scoping reviews, data papers, qualitative research | Population, Concept, Context |

---

## Instructions

1. **Create a JSON file** with your PCC details (see template below)
2. **Set the path** to your JSON file in Section 1
3. **Run All Cells** → Get your `.trig` file

---
# 📁 SECTION 1: INPUT FILE (EDIT THIS)
---

In [None]:
# Path to your PCC JSON file
PCC_FILE = "../config/pcc-procambarus-clarkii.json"

---
# ⚙️ SECTION 2: SETUP
---

In [None]:
# Install dependencies (uncomment if needed)
# !pip install nanopub rdflib

In [None]:
import json
import re
from rdflib import Graph, Dataset, Namespace, Literal, URIRef
from rdflib.namespace import RDF, RDFS, XSD, FOAF
from datetime import datetime, timezone
from pathlib import Path

# Namespaces (matching Nanodash)
NP = Namespace("http://www.nanopub.org/nschema#")
DCT = Namespace("http://purl.org/dc/terms/")
NT = Namespace("https://w3id.org/np/o/ntemplate/")
NPX = Namespace("http://purl.org/nanopub/x/")
PROV = Namespace("http://www.w3.org/ns/prov#")
ORCID = Namespace("https://orcid.org/")

# Science Live ontology for PCC
SLV = Namespace("https://w3id.org/sciencelive/o/terms/")

# PCC template URI
PCC_TEMPLATE = URIRef("https://w3id.org/np/RAmR-xqMgOq3oTJmOVDQFL2p5usID6zqRapizHy0UJb04")

# Template references
PROV_TEMPLATE = URIRef("https://w3id.org/np/RA7lSq6MuK_TIC6JMSHvLtee3lpLoZDOqLJCLXevnrPoU")
PUBINFO_TEMPLATE_1 = URIRef("https://w3id.org/np/RA0J4vUn_dekg-U1kK3AOEt02p9mT2WO03uGxLDec1jLw")
PUBINFO_TEMPLATE_2 = URIRef("https://w3id.org/np/RAukAcWHRDlkqxk7H2XNSegc1WnHI569INvNr-xdptDGI")

print("✓ Setup complete")

---
# 📖 SECTION 3: LOAD & VALIDATE
---

In [None]:
# Load PCC from JSON
print(f"Loading: {PCC_FILE}")

with open(PCC_FILE, 'r', encoding='utf-8') as f:
    config = json.load(f)

# Extract fields
AUTHOR_ORCID = config['author']['orcid']
AUTHOR_NAME = config['author']['name']

TITLE = config['pcc']['title']
POPULATION = config['pcc']['population']
CONCEPT = config['pcc']['concept']
CONTEXT = config['pcc']['context']
RESEARCH_QUESTION = config['pcc']['research_question']

OUTPUT_FILENAME = config['output']['filename']

# Only append an ellipsis when the title was actually truncated
print(f"✓ Loaded PCC: {TITLE[:50]}{'...' if len(TITLE) > 50 else ''}")
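
The direct `config['author']['orcid']` lookups above raise a `KeyError` on a missing key, before the validation cell gets a chance to report it. A minimal sketch of a more forgiving lookup follows. It assumes a hypothetical helper `get_field` (not part of this notebook) that walks a dotted path and returns a default for anything missing, so the "is required" checks can fire instead:

```python
def get_field(config, dotted_path, default=""):
    """Return config[a][b]... for the path 'a.b', or `default` if any key is missing."""
    node = config
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node

# Example config with "pcc.context" and the whole "output" section missing
example = {
    "author": {"orcid": "0000-0000-0000-0000", "name": "Your Name"},
    "pcc": {"title": "Your scoping review title"},
}

title = get_field(example, "pcc.title")
context = get_field(example, "pcc.context")  # missing key: empty string
filename = get_field(example, "output.filename", default="my-pcc")  # missing section
```

With this approach an absent field becomes an empty string, which the `if not ...` checks in the validation cell already treat as an error.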

In [None]:
# Validate
print("Validating...")

errors = []
if not AUTHOR_ORCID:
    errors.append("author.orcid is required")
if not AUTHOR_NAME:
    errors.append("author.name is required")
if not TITLE or len(TITLE) < 10:
    errors.append("pcc.title must be at least 10 characters")
if not POPULATION:
    errors.append("pcc.population is required")
if not CONCEPT:
    errors.append("pcc.concept is required")
if not CONTEXT:
    errors.append("pcc.context is required")
if not RESEARCH_QUESTION:
    errors.append("pcc.research_question is required")

if errors:
    print("❌ Validation errors:")
    for e in errors:
        print(f"   - {e}")
    raise ValueError("Please fix the errors in your JSON file")
else:
    print("✓ All fields valid")
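
The validation above only checks that the ORCID is non-empty. ORCID iDs have four hyphen-separated groups of four characters, and the last character is a check digit computed with ISO 7064 MOD 11-2. A minimal sketch of a stricter check, assuming a hypothetical helper named `is_valid_orcid` that receives the bare iD (no URL prefix):

```python
import re

ORCID_PATTERN = re.compile(r"[0-9]{4}-[0-9]{4}-[0-9]{4}-[0-9]{3}[0-9X]")

def is_valid_orcid(orcid):
    """Check the layout and the ISO 7064 MOD 11-2 check digit of a bare ORCID iD."""
    if not ORCID_PATTERN.fullmatch(orcid):
        return False
    digits = orcid.replace("-", "")
    total = 0
    for char in digits[:15]:  # The first 15 digits feed the checksum
        total = (total + int(char)) * 2
    result = (12 - total % 11) % 11
    expected = "X" if result == 10 else str(result)
    return digits[15] == expected

checks = {
    "0000-0002-1825-0097": is_valid_orcid("0000-0002-1825-0097"),
    "0000-0000-0000-0000": is_valid_orcid("0000-0000-0000-0000"),
}
```

The all-zero placeholder fails the checksum, so a check like this would also catch a config file where the placeholder ORCID was never replaced.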

In [None]:
# Generate a URI-safe ID from the title for the PCC resource
def slugify(text, max_length=50):
    """Create a URL-safe slug from text."""
    slug = text.lower().strip()
    # Convert separators first; otherwise underscores are deleted as "special chars"
    slug = re.sub(r'[\s_]+', '-', slug)  # Replace spaces/underscores with hyphens
    slug = re.sub(r'[^a-z0-9-]', '', slug)  # Remove remaining special chars
    slug = re.sub(r'-+', '-', slug)  # Collapse consecutive hyphens
    # Truncate before stripping so the cut cannot leave a trailing hyphen
    return slug[:max_length].strip('-')

PCC_ID = slugify(TITLE)
print(f"✓ Generated PCC ID: {PCC_ID}")
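
Because `slugify` keeps only `a-z`, `0-9`, and hyphens, a precomposed accented letter is dropped entirely, so a title starting with "Écrevisse" yields a slug starting with "crevisse". A minimal sketch of an accent-aware variant, assuming a hypothetical function `slugify_ascii` built on the standard `unicodedata` module. Unlike `slugify`, it turns every run of non-alphanumeric characters into a hyphen rather than deleting punctuation:

```python
import re
import unicodedata

def slugify_ascii(text, max_length=50):
    """Slugify after stripping accents, so 'É' becomes 'e' instead of vanishing."""
    # Decompose accented characters, then drop the combining marks
    decomposed = unicodedata.normalize("NFKD", text)
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii")
    slug = ascii_text.lower()
    # Any run of characters outside a-z0-9 becomes a single hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", slug)
    # Truncate, then strip, so the cut cannot leave a trailing hyphen
    return slug[:max_length].strip("-")

slug = slugify_ascii("Écrevisse de Louisiane: impacts écologiques")
```

Characters with no ASCII decomposition (for example CJK text) are still dropped, so a title written entirely in such a script would produce an empty slug and needs separate handling.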

---
# 🔨 SECTION 4: BUILD NANOPUBLICATION
---

In [None]:
# Create dataset with named graphs
TEMP_NP = Namespace("http://purl.org/nanopub/temp/np/")

this_np = URIRef("http://purl.org/nanopub/temp/np/")
head_graph = URIRef("http://purl.org/nanopub/temp/np/Head")
assertion_graph = URIRef("http://purl.org/nanopub/temp/np/assertion")
provenance_graph = URIRef("http://purl.org/nanopub/temp/np/provenance")
pubinfo_graph = URIRef("http://purl.org/nanopub/temp/np/pubinfo")

ds = Dataset()

# Bind prefixes
ds.bind("this", "http://purl.org/nanopub/temp/np/")
ds.bind("sub", TEMP_NP)
ds.bind("np", NP)
ds.bind("dct", DCT)
ds.bind("nt", NT)
ds.bind("npx", NPX)
ds.bind("xsd", XSD)
ds.bind("rdfs", RDFS)
ds.bind("orcid", ORCID)
ds.bind("prov", PROV)
ds.bind("foaf", FOAF)
ds.bind("slv", SLV)

print("✓ Dataset created")

In [None]:
# HEAD
head = ds.graph(head_graph)
head.add((this_np, RDF.type, NP.Nanopublication))
head.add((this_np, NP.hasAssertion, assertion_graph))
head.add((this_np, NP.hasProvenance, provenance_graph))
head.add((this_np, NP.hasPublicationInfo, pubinfo_graph))
print(f"✓ Head: {len(head)} triples")

In [None]:
# ASSERTION - Using Science Live PCC ontology structure
assertion = ds.graph(assertion_graph)

# Create local resource URIs
pcc_uri = TEMP_NP[PCC_ID]
population_uri = TEMP_NP["population"]
concept_uri = TEMP_NP["concept"]
context_uri = TEMP_NP["context"]

# Main PCC resource (st1: type PCC)
assertion.add((pcc_uri, RDF.type, SLV.PccReviewQuestion))

# Label (st2)
assertion.add((pcc_uri, RDFS.label, Literal(TITLE)))

# Description (st3)
assertion.add((pcc_uri, DCT.description, Literal(RESEARCH_QUESTION)))

# Population (st4a, st4b)
assertion.add((pcc_uri, SLV.hasPccPopulation, population_uri))
assertion.add((population_uri, DCT.description, Literal(POPULATION)))

# Concept (st5a, st5b)
assertion.add((pcc_uri, SLV.hasPccConcept, concept_uri))
assertion.add((concept_uri, DCT.description, Literal(CONCEPT)))

# Context (st6a, st6b)
assertion.add((pcc_uri, SLV.hasPccContext, context_uri))
assertion.add((context_uri, DCT.description, Literal(CONTEXT)))

print(f"✓ Assertion: {len(assertion)} triples")

In [None]:
# PROVENANCE
provenance = ds.graph(provenance_graph)
author_uri = ORCID[AUTHOR_ORCID]
provenance.add((assertion_graph, PROV.wasAttributedTo, author_uri))
print(f"✓ Provenance: {len(provenance)} triples")
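
`ORCID[AUTHOR_ORCID]` appends the configured value to `https://orcid.org/`, so a config that stores the full ORCID URL instead of the bare iD would end up with a doubled prefix. A minimal sketch of a normalisation step, assuming a hypothetical helper `bare_orcid` that accepts either form:

```python
ORCID_PREFIXES = ("https://orcid.org/", "http://orcid.org/", "orcid.org/")

def bare_orcid(value):
    """Return the bare iD whether `value` is an iD or an orcid.org URL."""
    value = value.strip()
    for prefix in ORCID_PREFIXES:
        if value.lower().startswith(prefix):
            # Drop the prefix and any trailing slash
            return value[len(prefix):].strip("/")
    return value

author_id = bare_orcid("https://orcid.org/0000-0002-1825-0097")
```

The result can then be passed to the `ORCID` namespace as before, for example `ORCID[bare_orcid(AUTHOR_ORCID)]`.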

In [None]:
# PUBINFO
pubinfo = ds.graph(pubinfo_graph)
now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")

pubinfo.add((author_uri, FOAF.name, Literal(AUTHOR_NAME)))
pubinfo.add((this_np, DCT.created, Literal(now, datatype=XSD.dateTime)))
pubinfo.add((this_np, DCT.creator, author_uri))
pubinfo.add((this_np, DCT.license, URIRef("https://creativecommons.org/licenses/by/4.0/")))
pubinfo.add((this_np, NPX.wasCreatedAt, URIRef("https://nanodash.knowledgepixels.com/")))
pubinfo.add((this_np, RDFS.label, Literal(f"PCC: {TITLE}")))

# Nanopub type
pubinfo.add((this_np, NPX.hasNanopubType, SLV.PccReviewQuestion))

# Introduces
pubinfo.add((this_np, NPX.introduces, pcc_uri))

# Template references
pubinfo.add((this_np, NT.wasCreatedFromTemplate, PCC_TEMPLATE))
pubinfo.add((this_np, NT.wasCreatedFromProvenanceTemplate, PROV_TEMPLATE))
pubinfo.add((this_np, NT.wasCreatedFromPubinfoTemplate, PUBINFO_TEMPLATE_1))
pubinfo.add((this_np, NT.wasCreatedFromPubinfoTemplate, PUBINFO_TEMPLATE_2))

print(f"✓ Pubinfo: {len(pubinfo)} triples")
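
The `strftime` pattern in this cell hard-codes `.000` for the milliseconds. If real millisecond precision is wanted, `datetime.isoformat` can produce it. A minimal sketch, assuming a hypothetical helper `utc_timestamp` (the fixed date is only there to make the example reproducible):

```python
from datetime import datetime, timezone

def utc_timestamp(moment=None):
    """Format a timezone-aware datetime as a UTC xsd:dateTime string with milliseconds."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    # Convert to UTC; note that a naive datetime would be interpreted as local time
    moment = moment.astimezone(timezone.utc)
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")

fixed = datetime(2024, 1, 2, 3, 4, 5, 678000, tzinfo=timezone.utc)
stamp = utc_timestamp(fixed)
```

The returned string has the same shape as the one built with `strftime`, so it can be passed to `Literal(..., datatype=XSD.dateTime)` unchanged.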

---
# 💾 SECTION 5: SAVE OUTPUT
---

In [None]:
# Serialize to TriG
trig_output = ds.serialize(format='trig')

# Save to file
output_path = Path(f"{OUTPUT_FILENAME}.trig")
with open(output_path, 'w', encoding='utf-8') as f:
    f.write(trig_output)

print(f"✓ Saved: {output_path}")
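
The output path is built by appending `.trig` to `output.filename`, and Section 6 builds the signed path the same way with `.signed.trig`. A minimal sketch that derives both in one place, assuming a hypothetical helper `output_paths` and a hypothetical `output` directory:

```python
from pathlib import Path

def output_paths(filename, directory="."):
    """Return (unsigned, signed) .trig paths for a base filename."""
    base = Path(directory) / filename
    # Append suffixes by hand: Path.with_suffix would clobber dots in names like "v1.2"
    unsigned = base.with_name(base.name + ".trig")
    signed = base.with_name(base.name + ".signed.trig")
    return unsigned, signed

unsigned_path, signed_path = output_paths("my-pcc", directory="output")

# Before writing, the target directory can be created if it does not exist:
# unsigned_path.parent.mkdir(parents=True, exist_ok=True)
```

Keeping the directory in both paths avoids the mismatch that arises when a signed filename is rebuilt from `Path.stem`, which discards the directory part.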

In [None]:
# Display output
print("=" * 70)
print("NANOPUBLICATION (TriG format)")
print("=" * 70)
print(trig_output)

In [None]:
# Summary
print("=" * 70)
print("SUMMARY")
print("=" * 70)
print(f"Input:    {PCC_FILE}")
print(f"Output:   {output_path}")
print(f"Author:   {AUTHOR_NAME} (orcid:{AUTHOR_ORCID})")
print(f"PCC ID:   {PCC_ID}")
print()
print("PCC:")
print(f"  Title:      {TITLE[:55]}..." if len(TITLE) > 55 else f"  Title:      {TITLE}")
print(f"  Population: {POPULATION[:55]}..." if len(POPULATION) > 55 else f"  Population: {POPULATION}")
print(f"  Concept:    {CONCEPT[:55]}..." if len(CONCEPT) > 55 else f"  Concept:    {CONCEPT}")
print(f"  Context:    {CONTEXT[:55]}..." if len(CONTEXT) > 55 else f"  Context:    {CONTEXT}")
print()
print(f"Template: {PCC_TEMPLATE}")
print()
print("Next steps:")
print(f"  Sign:    nanopub sign {output_path}")
print(f"  Publish: nanopub publish {output_path.stem}.signed.trig")

---
# 🚀 SECTION 6: SIGN & PUBLISH (OPTIONAL)
---

In [None]:
PUBLISH = False
USE_TEST_SERVER = True

In [None]:
if PUBLISH:
    from nanopub import Nanopub, NanopubConf, load_profile
    
    profile = load_profile()
    print(f"Loaded profile: {profile.name}")
    
    conf = NanopubConf(profile=profile, use_test_server=USE_TEST_SERVER)
    np_obj = Nanopub(rdf=output_path, conf=conf)
    
    np_obj.sign()
    print("✓ Signed")

    signed_path = Path(f"{OUTPUT_FILENAME}.signed.trig")
    np_obj.store(signed_path)
    print(f"✓ Saved: {signed_path}")

    np_obj.publish()
    print(f"✓ Published: {np_obj.source_uri}")
else:
    print("Publishing disabled. Set PUBLISH = True to enable.")

---
# 📋 JSON TEMPLATE

Create a JSON file with this structure:

```json
{
  "author": {
    "orcid": "0000-0000-0000-0000",
    "name": "Your Name"
  },
  "pcc": {
    "title": "Your scoping review / data paper title",
    "population": "Who or what is being studied",
    "concept": "What concept/phenomenon is being explored or measured",
    "context": "Setting, time period, geography",
    "research_question": "Your full research question"
  },
  "output": {
    "filename": "my-pcc"
  }
}
```
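
The same structure can also be written from Python, which avoids quoting and comma mistakes from hand-editing. A minimal sketch using only the standard library: the values are the placeholders from the template above, and the `my-pcc.json` filename and temporary directory are assumptions for illustration:

```python
import json
import tempfile
from pathlib import Path

config = {
    "author": {"orcid": "0000-0000-0000-0000", "name": "Your Name"},
    "pcc": {
        "title": "Your scoping review / data paper title",
        "population": "Who or what is being studied",
        "concept": "What concept/phenomenon is being explored or measured",
        "context": "Setting, time period, geography",
        "research_question": "Your full research question",
    },
    "output": {"filename": "my-pcc"},
}

# Written to a temporary directory here; point this at your own config folder instead
config_path = Path(tempfile.mkdtemp()) / "my-pcc.json"
config_path.write_text(
    json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
)

# Round-trip check: the file loads back to the same structure
reloaded = json.loads(config_path.read_text(encoding="utf-8"))
```

`ensure_ascii=False` keeps accented characters in names and titles readable in the file rather than writing them as `\u` escapes.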

---

## PCC vs PICO

| PICO letter | PICO | PCC |
|-------------|------|-----|
| P | Population | Population |
| I | Intervention | (none) |
| C | Comparison | Concept |
| O | Outcome | Context |

Use **PCC** for:
- Scoping reviews
- Data papers (describing/documenting data)
- Qualitative research
- Descriptive studies
- Mapping/charting exercises

Use **PICO** for:
- Clinical trials
- Intervention studies
- Effectiveness research

---