[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Hawksight-AI/semantica/blob/main/cookbook/introduction/14_Ontology.ipynb)

# Ontology

## Overview

This notebook demonstrates how to generate and validate ontologies using Semantica's ontology modules. You'll learn to use `OntologyEngine`, `ClassInferrer`, `PropertyGenerator`, and `OntologyValidator`.

**Documentation**: [API Reference](https://semantica.readthedocs.io/reference/ontology/)

### Learning Objectives

- Use `OntologyEngine` to generate ontologies
- Use `ClassInferrer` to infer classes
- Use `PropertyGenerator` to generate properties
- Use `OntologyValidator` to validate ontologies

## Installation

Install Semantica from PyPI:

```bash
pip install semantica
# Or with all optional dependencies:
pip install "semantica[all]"
```

---

## Step 1: Generate Ontology

Generate an ontology from a list of entities and the relationships between them.

### Module Map

- `OntologyEngine`: Unified API for generation, validation, evaluation, and OWL export
- `OntologyGenerator`: 6-stage pipeline to build classes and properties
- `ClassInferrer`: Derive class candidates and hierarchies from entities
- `PropertyGenerator`: Infer object/data properties from relationships
- `OntologyValidator`: Structural and symbolic validation (reasoners optional)
- `NamespaceManager`: Consistent IRIs (PascalCase classes, camelCase properties)
- `OWLGenerator`/`OWLExporter`: OWL serialization and file export
- `RequirementsSpecManager`/`CompetencyQuestionsManager`: Requirements specifications and competency questions
- `LLMGenerator`: Text-to-ontology generation
- `ReuseManager`/`DomainOntologies`: Reuse of known vocabularies and domain templates
- `NamingConventions`: Enforce naming guidelines

### Engine Options

- `base_uri`: namespace root (e.g., `http://example.org/onto#`)
- `name`: ontology name (default: `GeneratedOntology`)
- `version`: version string (default: `1.0`)

### Detailed Guide

#### Core Classes & Methods

- `OntologyEngine`: `from_data(data, **options)`, `from_text(text, provider=None, model=None, **options)`, `infer_classes(entities)`, `validate(ontology)`, `evaluate(ontology)`, `to_owl(ontology, format)`, `export_owl(ontology, path, format)`
- `OntologyGenerator`: `generate_ontology(data, **options)`, `optimize_ontology(ontology, **options)`, `remove_redundancy(ontology)`, `improve_coherence(ontology)`
- `ClassInferrer`: `infer_classes(entities)`, `build_class_hierarchy(classes)`, `validate_classes(classes)`
- `PropertyGenerator`: `infer_properties(entities, relationships, classes)`
- `OntologyValidator`: `validate_ontology(ontology)` (returns `valid`, `consistent`, `errors`, `warnings`, `metrics`)
- `NamespaceManager`: `generate_class_iri(name)`, `generate_property_iri(name)`, `register_namespace(prefix, uri)`, `get_all_namespaces()`

#### Extended Modules

- `OWLGenerator` vs `OWLExporter`: in-memory OWL generation vs file export
- `RequirementsSpecManager` & `CompetencyQuestionsManager`: plan and validate coverage
- `LLMGenerator`: bootstrap ontology from text with a provider/model
- `ReuseManager` & `DomainOntologies`: reuse known vocabularies and templates
- `NamingConventions`: enforce consistent, readable names

#### Building Ontologies Step-by-Step

1) Prepare `entities` and `relationships`
2) Generate ontology with engine
3) Infer classes
4) Infer properties
5) Validate
6) Export OWL

#### Property Types

- Object properties: connect classes (domain → range)
- Data properties: attach literals (strings, numbers, dates)
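
To make the distinction concrete, here is a minimal sketch in plain Python that does not call Semantica. It assumes the `source`/`target` relationship shape used in Step 1; the helper `split_properties`, the `RESERVED_KEYS` set, and the use of Python type names as ranges are hypothetical choices for illustration, not the library's behavior.

```python
# Keys that identify an entity rather than describe it (an assumption of this sketch).
RESERVED_KEYS = {"id", "type"}


def split_properties(entities: list, relationships: list) -> tuple:
    """Return (object_properties, data_properties) derived from plain dicts."""
    type_by_id = {e["id"]: e["type"] for e in entities}

    # Object properties: one per distinct (relationship type, domain, range) triple.
    object_props, seen = [], set()
    for rel in relationships:
        key = (rel["type"], type_by_id[rel["source"]], type_by_id[rel["target"]])
        if key not in seen:
            seen.add(key)
            object_props.append(
                {"name": key[0], "type": "object", "domain": key[1], "range": key[2]}
            )

    # Data properties: one per distinct (class, attribute) pair; the range is
    # the Python type name of the literal value.
    data_props, seen = [], set()
    for ent in entities:
        for attr, value in ent.items():
            key = (ent["type"], attr)
            if attr in RESERVED_KEYS or key in seen:
                continue
            seen.add(key)
            data_props.append(
                {"name": attr, "type": "data", "domain": key[0], "range": type(value).__name__}
            )
    return object_props, data_props


demo_entities = [
    {"id": "p1", "type": "Person", "name": "Alice", "birthDate": "1990-01-01"},
    {"id": "c1", "type": "Company", "name": "Acme Corp", "foundedYear": 2005},
]
demo_relationships = [{"source": "p1", "target": "c1", "type": "WORKS_FOR"}]

demo_object_props, demo_data_props = split_properties(demo_entities, demo_relationships)
print(demo_object_props)
print(demo_data_props)
```

The relationship becomes a single object property from `Person` to `Company`, while each non-identifying attribute becomes a data property on its class.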

#### Validator Output

- `valid`, `consistent`, `errors`, `warnings`, `metrics`
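
How `valid`, `errors`, and `warnings` relate can be sketched with a toy structural check. The code below is a plain-Python stand-in, not `OntologyValidator` itself: the function `check_structure` and its rules are assumptions chosen for illustration, and it treats every property as an object property. Consistency checking is left out because it generally calls for a reasoner.

```python
def check_structure(ontology: dict) -> dict:
    """Toy structural check: every property must point at declared classes."""
    class_names = {c["name"] for c in ontology.get("classes", [])}
    errors, warnings = [], []
    used = set()

    for prop in ontology.get("properties", []):
        for side in ("domain", "range"):
            target = prop.get(side)
            if target is None:
                # A missing domain or range is suspicious but not fatal here.
                warnings.append(f"{prop['name']}: missing {side}")
            elif target not in class_names:
                errors.append(f"{prop['name']}: unknown {side} class '{target}'")
            else:
                used.add(target)

    # Declared classes that no property refers to only produce warnings.
    for name in sorted(class_names - used):
        warnings.append(f"class '{name}' is not used by any property")

    # In this sketch an ontology is valid exactly when there are no errors.
    return {"valid": not errors, "errors": errors, "warnings": warnings}


toy = {
    "classes": [{"name": "Person"}, {"name": "Organization"}, {"name": "Product"}],
    "properties": [
        {"name": "ceoOf", "domain": "Person", "range": "Organization"},
        {"name": "locatedIn", "domain": "Organization", "range": "City"},
    ],
}
report = check_structure(toy)
print(report["valid"], report["errors"], report["warnings"])
```

Errors make the result invalid, whereas warnings flag things worth reviewing without failing the check.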

#### OWL Export Formats

- `turtle`, `owl-xml`
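
For orientation, the sketch below hand-writes a tiny Turtle document so you can see roughly what the `turtle` format looks like. It does not call Semantica's exporter; `to_turtle`, the dict layout, and the `http://example.org/onto#` base URI are illustrative assumptions, and names are assumed to be simple identifiers that need no escaping.

```python
OWL = "http://www.w3.org/2002/07/owl#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"


def to_turtle(ontology: dict, base_uri: str) -> str:
    """Serialize classes and object properties as a small Turtle document."""
    lines = [
        f"@prefix owl: <{OWL}> .",
        f"@prefix rdfs: <{RDFS}> .",
        f"@prefix : <{base_uri}> .",
        "",
    ]
    # Each class becomes an owl:Class declaration.
    for cls in ontology.get("classes", []):
        lines.append(f":{cls['name']} a owl:Class .")
    # Each property becomes an owl:ObjectProperty with a domain and a range.
    for prop in ontology.get("properties", []):
        lines.append(f":{prop['name']} a owl:ObjectProperty ;")
        lines.append(f"    rdfs:domain :{prop['domain']} ;")
        lines.append(f"    rdfs:range :{prop['range']} .")
    return "\n".join(lines) + "\n"


sample_ontology = {
    "classes": [{"name": "Person"}, {"name": "Organization"}],
    "properties": [{"name": "ceoOf", "domain": "Person", "range": "Organization"}],
}
ttl = to_turtle(sample_ontology, "http://example.org/onto#")
print(ttl)
```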

#### Namespace Management

- Use a stable `base_uri` and versioned IRIs
- PascalCase for classes; camelCase for properties
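
The naming rule can be illustrated with a small, library-independent sketch. The helpers `to_pascal_case`, `to_camel_case`, `class_iri`, and `property_iri` are hypothetical names for this example and are not Semantica's `NamespaceManager` API; the word-splitting rules are likewise an assumption.

```python
import re


def _words(name: str) -> list:
    """Split on underscores, hyphens, spaces, and lower-to-upper case boundaries."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return [w for w in re.split(r"[\s_\-]+", spaced) if w]


def to_pascal_case(name: str) -> str:
    """'tech company' -> 'TechCompany'."""
    return "".join(w[0].upper() + w[1:].lower() for w in _words(name))


def to_camel_case(name: str) -> str:
    """'WORKS_FOR' -> 'worksFor'."""
    pascal = to_pascal_case(name)
    return pascal[0].lower() + pascal[1:] if pascal else pascal


def class_iri(base_uri: str, name: str) -> str:
    """Build a class IRI: base URI plus a PascalCase local name."""
    return base_uri + to_pascal_case(name)


def property_iri(base_uri: str, name: str) -> str:
    """Build a property IRI: base URI plus a camelCase local name."""
    return base_uri + to_camel_case(name)


BASE = "http://example.org/onto#"
print(class_iri(BASE, "tech company"))
print(property_iri(BASE, "WORKS_FOR"))
```

Deriving both kinds of IRI from one `base_uri` keeps identifiers predictable when the ontology is regenerated.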


In [None]:
from semantica.ontology import OntologyEngine

engine = OntologyEngine()

entities = [
    {"id": "e1", "type": "Organization", "name": "Apple Inc."},
    {"id": "e2", "type": "Person", "name": "Tim Cook"}
]

relationships = [
    {"source": "e2", "target": "e1", "type": "CEO_of"}
]

ontology = engine.from_data({
    "entities": entities,
    "relationships": relationships
})

print("Generated ontology")
print(f"Classes: {len(ontology.get('classes', []))}")
print(f"Properties: {len(ontology.get('properties', []))}")


## Step 2: Class Inference

Infer classes from entities.


In [None]:
from semantica.ontology import ClassInferrer

class_inferrer = ClassInferrer()

classes = class_inferrer.infer_classes(entities)

print(f"Inferred {len(classes)} classes")
for cls in classes[:3]:
    print(f"  - {cls.get('name', cls)}")
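
As a rough mental model of class inference (not Semantica's actual algorithm), one can group entities by their `type` field and count the instances of each type. The function `infer_classes_naive` and the `instance_count` key are hypothetical names used only in this sketch.

```python
from collections import Counter


def infer_classes_naive(entities: list) -> list:
    """Group entities by `type` and return one class record per distinct type."""
    counts = Counter(e["type"] for e in entities if "type" in e)
    # Sort by class name so the output order is deterministic.
    return [
        {"name": name, "instance_count": n}
        for name, n in sorted(counts.items())
    ]


sample_entities = [
    {"id": "e1", "type": "Organization", "name": "Apple Inc."},
    {"id": "e2", "type": "Person", "name": "Tim Cook"},
    {"id": "e3", "type": "Organization", "name": "Acme Corp"},
]
naive_classes = infer_classes_naive(sample_entities)
print(naive_classes)
```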


## Step 3: Property Generation

Generate properties from relationships.


In [None]:
from semantica.ontology import PropertyGenerator

property_generator = PropertyGenerator()

properties = property_generator.infer_properties(entities, relationships, classes)

print(f"Generated {len(properties)} properties")


## Step 4: Ontology Validation

Validate the generated ontology.


In [None]:
from semantica.ontology import OntologyValidator

validator = OntologyValidator()

validation_result = validator.validate_ontology(ontology)

print("Ontology validation:")
print(f"  Valid: {validation_result.valid}")
print(f"  Consistent: {validation_result.consistent}")
print(f"  Errors: {len(validation_result.errors)}")
print(f"  Warnings: {len(validation_result.warnings)}")


## Summary

You've learned how to work with ontologies:

- **OntologyEngine**: Generate ontologies from entities and relationships
- **ClassInferrer**: Infer classes from entities
- **PropertyGenerator**: Generate properties from relationships
- **OntologyValidator**: Validate ontologies

### Validate & Evaluate

```python
result = engine.validate(ontology)
print("Valid:", result.valid, "Consistent:", result.consistent)
print("Errors:", len(result.errors), "Warnings:", len(result.warnings))
# Optional: evaluate coverage/metrics if available
metrics = engine.evaluate(ontology)
print("Evaluation metrics keys:", list(metrics.keys()) if isinstance(metrics, dict) else metrics)
```

### Inspect Classes & Properties

```python
for cls in ontology.get("classes", [])[:5]:
    print("Class:", cls.get("name"), "label:", cls.get("label"))
for prop in ontology.get("properties", [])[:5]:
    print("Property:", prop.get("name"), "type:", prop.get("type"), "domain:", prop.get("domain"), "range:", prop.get("range"))
```

### Modeling Checklist

- Define scope and competency questions
- Choose stable base URI and naming conventions
- Map entities → classes, relationships → object properties, attributes → data properties
- Add domain/range and only essential constraints
- Validate; export OWL; iterate with visualization

Next: Learn how to export data in the Export notebook.


### Worked Example: Object vs Data Properties

Differentiate object properties (link classes) from data properties (literal values).

In [None]:
from semantica.ontology import ClassInferrer, PropertyGenerator

richer_entities = [
    {"id": "p1", "type": "Person", "name": "Alice", "birthDate": "1990-01-01"},
    {"id": "c1", "type": "Company", "name": "Acme Corp", "foundedYear": 2005}
]

richer_relationships = [
    {"source_id": "p1", "target_id": "c1", "source_type": "Person", "target_type": "Company", "type": "WORKS_FOR"}
]

inferrer = ClassInferrer()
classes2 = inferrer.infer_classes(richer_entities)

propgen = PropertyGenerator()
props2 = propgen.infer_properties(richer_entities, richer_relationships, classes2)

obj_props = [p for p in props2 if p.get("type") == "object"]
data_props = [p for p in props2 if p.get("type") == "data"]

print("Object properties:", [p.get("name") for p in obj_props])
print("Data properties:", [p.get("name") for p in data_props])


### Vocabulary Reuse: Align to Schema.org

Align a property to a known vocabulary term for interoperability.

In [None]:
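schema_works_for = "https://schema.org/worksFor"
# Assumes the generator normalized the WORKS_FOR relationship to camelCase `worksFor`.
aligned = False
for p in obj_props:
    if p.get("name") == "worksFor":
        p["sameAs"] = schema_works_for
        aligned = True
        print("Aligned", p.get("name"), "to", p["sameAs"])
if not aligned:
    print("No property named 'worksFor' found; check the generated property names")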


### Visualization (Optional)

Render the class hierarchy if visualization dependencies are available.

In [None]:
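# Visualization is optional: skip gracefully if its dependencies are missing.
try:
    from semantica.visualization import OntologyVisualizer
except ImportError:
    print("Visualization dependencies not installed; skipping")
else:
    viz = OntologyVisualizer()
    fig = viz.visualize_hierarchy(ontology, output="interactive")
    print("Visualization generated")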
