# Create Geolocation Nanopublications

This notebook generates **geolocation nanopublications** from a JSON configuration file.

A geolocation nanopublication records the geographical area or region covered by a research paper's findings and encodes it with GeoSPARQL so that it can be used in spatial queries.

**Template:** [Documenting geographical coverage of research](https://w3id.org/np/RAsPVd3bNOPg5vxQGc1Tqn69v3dSY-ASrAhEFioutCXao)

## Supported Geometry Types

This notebook supports four ways to specify a geographical location:

1. **WKT Geometry** - Full GeoSPARQL geometry with WKT literal (recommended for spatial queries)
2. **Bounding Box** - Simple bbox using dcat:bbox
3. **Both** - WKT geometry and bounding box together
4. **Label only** - Just a location name without coordinates
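
The four cases above can be told apart from a single config entry. The following is a minimal sketch, not part of the notebook itself: `geometry_mode` is a hypothetical helper name, and it assumes the same rule the generation function uses (WKT counts only when `geometry.type` is `"wkt"` and a `wkt` value is present).

```python
def geometry_mode(entry):
    """Return which of the four supported modes a config entry uses.

    Hypothetical helper for illustration; `entry` is one item of the
    "nanopublications" list in the JSON config.
    """
    # "or {}" also covers a missing key or an explicit null in the JSON
    geometry = entry.get("geometry") or {}
    has_wkt = geometry.get("type") == "wkt" and bool(geometry.get("wkt"))
    has_bbox = bool(entry.get("bbox"))

    if has_wkt and has_bbox:
        return "both"
    if has_wkt:
        return "wkt"
    if has_bbox:
        return "bbox"
    return "label-only"


print(geometry_mode({"location_label": "Northern Europe"}))
```

An entry that carries only a `location_label` is reported as `label-only`, which matches the fourth case in the list.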

---
## Configuration

In [None]:
# Path to your geolocation config file
CONFIG_FILE = "../config/dggs_geolocation.json"

---
## Setup

In [None]:
import json
import sys
from pathlib import Path
from datetime import datetime, timezone

# Add parent directory to path for imports
sys.path.insert(0, str(Path('.').resolve().parent))

try:
    from nanopub_utils import load_config, get_timestamp, escape_literal, PREFIXES
    print("✓ Loaded nanopub_utils")
except ImportError:
    print("⚠ nanopub_utils not found, using inline functions")
    
    def load_config(path):
        with open(path, 'r', encoding='utf-8') as f:
            return json.load(f)
    
    def get_timestamp():
        return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    
    def escape_literal(text):
        return text.replace('\\', '\\\\').replace('"', '\\"').replace('\n', '\\n').replace('\r', '\\r')

# Output directory
OUTPUT_DIR = Path("../output/geolocation")
OUTPUT_DIR.mkdir(parents=True, exist_ok=True)
print(f"✓ Output directory: {OUTPUT_DIR}")

---
## Load Configuration

In [None]:
config = load_config(CONFIG_FILE)

metadata = config['metadata']
nanopubs = config['nanopublications']

print(f"✓ Loaded configuration")
print(f"  Source paper: {metadata['source_paper']['title'][:60]}...")
print(f"  Creator: {metadata['creator_name']}")
print(f"  Geolocations to create: {len(nanopubs)}")
print()
for np_config in nanopubs:
    geom_type = "none"
    if np_config.get('geometry'):
        geom_type = np_config['geometry'].get('type', 'unknown')
    elif np_config.get('bbox'):
        geom_type = "bbox only"
    print(f"  - {np_config['location_label']} ({geom_type})")

---
## Template URIs

In [None]:
# Geolocation template
GEOLOCATION_TEMPLATE = "https://w3id.org/np/RAsPVd3bNOPg5vxQGc1Tqn69v3dSY-ASrAhEFioutCXao"

# Standard templates
PROVENANCE_TEMPLATE = "https://w3id.org/np/RA7lSq6MuK_TIC6JMSHvLtee3lpLoZDOqLJCLXevnrPoU"
PUBINFO_TEMPLATE_1 = "https://w3id.org/np/RA0J4vUn_dekg-U1kK3AOEt02p9mT2WO03uGxLDec1jLw"
PUBINFO_TEMPLATE_2 = "https://w3id.org/np/RAukAcWHRDlkqxk7H2XNSegc1WnHI569INvNr-xdptDGI"
PUBINFO_TEMPLATE_3 = "https://w3id.org/np/RAoTD7udB2KtUuOuAe74tJi1t3VzK0DyWS7rYVAq1GRvw"

print("✓ Template URIs configured")

---
## Generate Nanopublications

In [None]:
def generate_geolocation_trig(np_config, metadata):
    """Generate a geolocation nanopublication in TriG format.
    
    Follows the structure from the template RAsPVd3bNOPg5vxQGc1Tqn69v3dSY-ASrAhEFioutCXao
    
    Supports:
    - WKT geometry via geometry.wkt
    - Bounding box via bbox
    - Both together
    - Neither (label-only location)
    """
    
    creator_orcid = metadata['creator_orcid']
    # Escaped because the name is written into a quoted TriG literal
    creator_name = escape_literal(metadata['creator_name'])
    is_part_of = metadata.get('is_part_of', {})
    
    paper_doi = np_config['paper_doi']
    location_id = np_config['location_id']
    # Escaped because the label is written into a quoted TriG literal
    location_label = escape_literal(np_config['location_label'])
    quotation = escape_literal(np_config['quotation'])
    quotation_end = np_config.get('quotation_end')
    comment = escape_literal(np_config['comment'])
    
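    # Geometry options ("or {}" also covers an explicit null in the JSON)
    geometry = np_config.get('geometry') or {}
    bbox = np_config.get('bbox')
    
    # WKT is used only when the type is "wkt" and a non-empty value is present
    has_wkt = geometry.get('type') == 'wkt' and bool(geometry.get('wkt'))
    geometry_id = geometry.get('geometry_id', f"{location_id}-geometry") if has_wkt else None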
    
    timestamp = get_timestamp()
    
    # Build list of introduced resources
    introduced = [f"sub:{location_id}"]
    if has_wkt:
        introduced.append(f"sub:{geometry_id}")
    
    # Start building TriG
    trig = f'''@prefix this: <http://purl.org/nanopub/temp/np/> .
@prefix sub: <http://purl.org/nanopub/temp/np/> .
@prefix np: <http://www.nanopub.org/nschema#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix nt: <https://w3id.org/np/o/ntemplate/> .
@prefix npx: <http://purl.org/nanopub/x/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix orcid: <https://orcid.org/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix cito: <http://purl.org/spar/cito/> .
@prefix geo: <http://www.opengis.net/ont/geosparql#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .

sub:Head {{
  this: a np:Nanopublication ;
    np:hasAssertion sub:assertion ;
    np:hasProvenance sub:provenance ;
    np:hasPublicationInfo sub:pubinfo .
}}

sub:assertion {{
  <https://doi.org/{paper_doi}> dct:spatial sub:{location_id} ;
    cito:hasQuotedText "{quotation}" ;
'''
    
    # Optional quotation end
    if quotation_end:
        trig += f'''    cito:hasQuotedTextEnd "{escape_literal(quotation_end)}" ;
'''
    
    # Comment
    trig += f'''    rdfs:comment "{comment}" .
  
  orcid:{creator_orcid} cito:quotes <https://doi.org/{paper_doi}> .
'''
    
    # WKT Geometry (if provided)
    if has_wkt:
        wkt_value = escape_literal(geometry['wkt'])
        trig += f'''
  sub:{geometry_id} geo:asWKT "{wkt_value}"^^geo:wktLiteral .
  
  sub:{location_id} a geo:Feature ;
    geo:hasGeometry sub:{geometry_id} ;
    rdfs:label "{location_label}" .
'''
    else:
        # Location without explicit geometry
        trig += f'''
  sub:{location_id} a geo:Feature ;
    rdfs:label "{location_label}" .
'''
    
    # Optional bounding box
    if bbox:
        trig += f'''  sub:{location_id} dcat:bbox "{escape_literal(bbox)}" .
'''
    
    # Close assertion, add provenance
    trig += f'''}}

sub:provenance {{
  sub:assertion prov:wasAttributedTo orcid:{creator_orcid} .
}}

sub:pubinfo {{
  orcid:{creator_orcid} foaf:name "{creator_name}" .
  
  this: dct:created "{timestamp}"^^xsd:dateTime ;
    dct:creator orcid:{creator_orcid} ;
    dct:license <https://creativecommons.org/licenses/by/4.0/> ;
'''
    
    # Add introduced resources
    introduced_str = ", ".join(introduced)
    trig += f'''    npx:introduces {introduced_str} ;
'''
    
    trig += f'''    npx:wasCreatedAt <https://nanodash.knowledgepixels.com/> ;
    nt:wasCreatedFromProvenanceTemplate <{PROVENANCE_TEMPLATE}> ;
    nt:wasCreatedFromPubinfoTemplate <{PUBINFO_TEMPLATE_1}> , <{PUBINFO_TEMPLATE_2}> , <{PUBINFO_TEMPLATE_3}> ;
    nt:wasCreatedFromTemplate <{GEOLOCATION_TEMPLATE}> .
'''
    
    # Optional: link to systematic review
    if is_part_of and is_part_of.get('uri'):
        review_uri = is_part_of['uri']
        review_label = escape_literal(is_part_of.get('label', ''))
        trig += f'''
  this: dct:isPartOf <{review_uri}> .
'''
        if review_label:
            trig += f'''  <{review_uri}> nt:hasLabelFromApi "{review_label}" .
'''
    
    # Plain (non-f) string: a single brace closes the pubinfo graph
    trig += '''}
'''
    
    return trig

print("✓ Generation function defined")

In [None]:
# Generate all nanopublications
generated_files = []

for np_config in nanopubs:
    np_id = np_config['id']
    trig_content = generate_geolocation_trig(np_config, metadata)
    
    output_file = OUTPUT_DIR / f"{np_id}.trig"
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(trig_content)
    
    generated_files.append(output_file)
    print(f"✓ Generated: {output_file.name}")
    print(f"    Location: {np_config['location_label']}")
    
    geom = np_config.get('geometry') or {}
    if geom.get('wkt'):
        print(f"    Geometry: WKT ({geom['wkt'][:40]}...)")
    if np_config.get('bbox'):
        print(f"    BBox: {np_config['bbox'][:40]}...")

print(f"\n✓ Generated {len(generated_files)} geolocation nanopublication(s)")

---
## Preview Generated Content

In [None]:
# Preview the first generated file
if generated_files:
    print(f"Preview of {generated_files[0].name}:")
    print("=" * 70)
    with open(generated_files[0], 'r', encoding='utf-8') as f:
        print(f.read())

---
## Summary

In [None]:
print("=" * 70)
print("SUMMARY")
print("=" * 70)
print(f"Config file:  {CONFIG_FILE}")
print(f"Output dir:   {OUTPUT_DIR}")
print(f"Generated:    {len(generated_files)} file(s)")
print()
print("Files:")
for f in generated_files:
    print(f"  - {f}")
print()
print("Next steps:")
print("  1. Review the generated .trig files")
print("  2. Use sign_and_publish_nanopub.ipynb to sign and publish")
print("  3. Or upload to https://nanodash.knowledgepixels.com/")

---
## JSON Configuration Template

```json
{
  "metadata": {
    "source_paper": {
      "title": "Your Paper Title",
      "doi": "10.xxxx/xxxxx"
    },
    "creator_orcid": "0000-0002-XXXX-XXXX",
    "creator_name": "Your Name",
    "is_part_of": {
      "uri": "https://w3id.org/np/RAxxxxx/your-review",
      "label": "Your Systematic Review Title"
    }
  },
  "nanopublications": [
    {
      "id": "geolocation_unique_id",
      "paper_doi": "10.xxxx/xxxxx",
      "location_id": "location-uri-suffix",
      "location_label": "Name of Location (e.g., Lake Zone, Tanzania)",
      "quotation": "Exact quote from paper describing the location (max 500 chars)",
      "quotation_end": "Optional end of longer quote (max 500 chars)",
      "comment": "Your explanation of the geographical relevance (max 800 chars)",
      "geometry": {
        "type": "wkt",
        "geometry_id": "location-geometry",
        "wkt": "POLYGON((lon1 lat1, lon2 lat2, ...))"
      },
      "bbox": "POLYGON((west south, east south, east north, west north, west south))"
    }
  ]
}
```
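
Before generating files, it can help to check each entry against the template above. The sketch below is illustrative only: `validate_entry`, `REQUIRED_KEYS`, and `MAX_LENGTHS` are hypothetical names. The required fields are the ones the notebook reads with direct key access, and the limits are the character counts quoted in the template placeholders.

```python
# Fields the notebook accesses with entry[...] and therefore needs to exist
REQUIRED_KEYS = ("id", "paper_doi", "location_id", "location_label",
                 "quotation", "comment")

# Character limits taken from the placeholder hints in the template
MAX_LENGTHS = {"quotation": 500, "quotation_end": 500, "comment": 800}


def validate_entry(entry):
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for key in REQUIRED_KEYS:
        if not entry.get(key):
            problems.append(f"missing required field: {key}")
    for key, limit in MAX_LENGTHS.items():
        value = entry.get(key)
        if value and len(value) > limit:
            problems.append(f"{key} is {len(value)} chars (max {limit})")
    return problems


example = {
    "id": "geo_01",
    "paper_doi": "10.xxxx/xxxxx",
    "location_id": "lake-zone",
    "location_label": "Lake Zone, Tanzania",
    "quotation": "q" * 501,  # deliberately one character too long
}
for problem in validate_entry(example):
    print(problem)
```

Returning a list rather than raising on the first problem lets all issues in an entry be reported in one pass.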

### Geometry Options

You can specify location geometry in several ways:

**Option 1: WKT Geometry (recommended for spatial queries)**
```json
"geometry": {
  "type": "wkt",
  "geometry_id": "my-location-geometry",
  "wkt": "POLYGON((30.25 -1.0, 35.15 -1.0, 35.15 -4.75, 30.25 -4.75, 30.25 -1.0))"
}
```

**Option 2: Bounding Box only**
```json
"bbox": "POLYGON((west south, east south, east north, west north, west south))"
```
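
The bounding-box polygon can be built from its four edges instead of being typed by hand. This is a minimal sketch under that assumption; `bbox_to_wkt` is a hypothetical helper, not a function from the notebook or from `nanopub_utils`.

```python
def bbox_to_wkt(west, south, east, north):
    """Build a closed WKT rectangle in the corner order shown above.

    Coordinates are in degrees, longitude first. Only south <= north is
    checked: west > east is left alone because it can be legitimate for a
    box that crosses the antimeridian.
    """
    if south > north:
        raise ValueError("south must not be greater than north")
    return (
        f"POLYGON(({west} {south}, {east} {south}, "
        f"{east} {north}, {west} {north}, {west} {south}))"
    )


print(bbox_to_wkt(west=30.25, south=-4.75, east=35.15, north=-1.0))
# → POLYGON((30.25 -4.75, 35.15 -4.75, 35.15 -1.0, 30.25 -1.0, 30.25 -4.75))
```

The first and last coordinate pairs are identical, which closes the ring as WKT polygons require.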

**Option 3: Both WKT and BBox**
```json
"geometry": {
  "type": "wkt",
  "geometry_id": "detailed-geometry",
  "wkt": "POLYGON((...complex polygon...))"
},
"bbox": "POLYGON((simple bounding box))"
```

**Option 4: Label only (no coordinates)**

Omit both `geometry` and `bbox` and provide only the label:
```json
"location_label": "Northern Europe"
```

### WKT Examples

| Type | Example | Use case |
|------|---------|----------|
| Point | `POINT(2.35 48.86)` | Single location (Paris) |
| Polygon | `POLYGON((0 0, 10 0, 10 10, 0 10, 0 0))` | Study area boundary |
| BBox | `POLYGON((west south, east south, east north, west north, west south))` | Rectangular extent |

**Note:** With the default coordinate reference system of `geo:wktLiteral` (CRS84), WKT coordinates are in longitude-latitude order (x, y), not latitude-longitude.
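
Because many sources list coordinates as latitude, longitude, swapping them is an easy mistake. The sketch below shows one way to guard against it; `point_wkt` is a hypothetical helper that takes named arguments and emits them in the order WKT expects.

```python
def point_wkt(lat, lon):
    """Return a WKT POINT with longitude first, then latitude.

    The range checks catch many (not all) accidental swaps, since a
    longitude beyond +/-90 is rejected when passed as a latitude.
    """
    if not -90 <= lat <= 90:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180 <= lon <= 180:
        raise ValueError(f"longitude out of range: {lon}")
    return f"POINT({lon} {lat})"


# Paris: latitude 48.86, longitude 2.35
print(point_wkt(lat=48.86, lon=2.35))
# → POINT(2.35 48.86)
```

The result matches the Point example in the table above.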