# 🎯 ET(K)L Formal Governance Chain: Where This Work Fits

## The Four-Layer ET(K)L Architecture

This semantic foundation work enables the **formal governance chain** that makes ET(K)L transformational.



### Semantic Foundation as Connective Tissue

What you'll build in this notebook serves as the **semantic connective tissue** that:

- **Links Executive Targets to Technical Implementation**: Formal ontologies ensure business intent is preserved through all transformation layers
- **Enables Automated Governance**: Semantic rules make business policies machine-readable and enforceable
- **Supports Knowledge-Driven Agents**: AI agents can understand business context through formal semantic relationships
- **Provides Formal Value Traceability**: Every technical decision can be traced back to business value through semantic graphs

### From Interpretation to Provability

Traditional data projects rely on **interpretation** of business value. ET(K)L enables **provable** business value through formal semantic chains:

- **Instead of**: "This dashboard probably helps with decision-making"
- **ET(K)L provides**: "This semantic relationship formally connects Executive Target A to Metric B via SOW requirement C"
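
To make the idea of a formal chain concrete, here is a minimal, dependency-free sketch. It stores facts as plain `(subject, predicate, object)` tuples and searches for the chain that links a target to a metric. The identifiers and predicate names (`specifiedBy`, `measuredBy`, `displays`) are hypothetical placeholders, not terms from the project ontologies; a real graph would use full URIs and a triple store.

```python
from collections import deque

# Hypothetical facts, for illustration only
TRIPLES = [
    ("ExecutiveTargetA", "specifiedBy", "SOWRequirementC"),
    ("SOWRequirementC", "measuredBy", "MetricB"),
    ("DashboardD", "displays", "MetricB"),
]

def trace(start, goal, triples):
    """Breadth-first search; return the triples linking start to goal, or None."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for s, p, o in triples:
            if s == node and o not in seen:
                seen.add(o)
                queue.append((o, path + [(s, p, o)]))
    return None

chain = trace("ExecutiveTargetA", "MetricB", TRIPLES)
for s, p, o in chain:
    print(f"{s} --{p}--> {o}")
```

The returned chain is the "proof": each hop is an explicit, stored statement rather than an analyst's interpretation. When no chain exists, `trace` returns `None`, which is how an untraceable artifact would show up.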

---

**Next**: Let's build the semantic infrastructure that makes this formal governance chain possible.


# 💼 Business Motivation: Why Semantic Foundations Matter

## Real-World Challenge: European Energy Trading Optimization

Before diving into technical implementation, let's understand **why** this semantic foundation work is critical for modern enterprises.

### The Business Problem

**EuroEnergy Trading Solutions** needed to optimize renewable energy trading recommendations across complex European markets. Traditional data approaches failed because:

- **Fragmented Data Sources**: Market data, regulatory requirements, and trading rules existed in silos
- **Context Loss**: Business rules were hardcoded in queries, making them brittle and hard to maintain  
- **Compliance Complexity**: European energy regulations required formal traceability that spreadsheets couldn't provide
- **Decision Latency**: Analysts spent more time gathering data than making strategic decisions

### The ET(K)L Transformation

By building semantic foundations first, EuroEnergy achieved:

- **Unified Knowledge Model**: All trading rules, market data, and regulations represented as connected semantic concepts
- **Automated Compliance**: Regulatory requirements became machine-readable constraints  
- **Intelligent Insights**: AI agents could reason about trading opportunities using business context
- **Provable Decisions**: Every trading recommendation traced back to formal business rules and market conditions

### What You'll Learn to Build

This notebook will show you how to create the **semantic infrastructure** that made this transformation possible:

1. **Semantic Ontologies**: Formal models of business concepts and relationships
2. **Knowledge Graphs**: Connected data that preserves business meaning  
3. **SPARQL Reasoning**: Queries that understand business context, not just data structure
4. **Governance Integration**: Technical implementation that enforces business rules

---

**The Foundation Comes First**: Without proper semantic infrastructure, even the most sophisticated AI agents and data pipelines will struggle with context and meaning. Let's build that foundation.

# 🏗️ What We'll Build: Your ET(K)L Semantic Foundation

## The Technical Journey: From Business Need to Semantic Infrastructure

Now that you understand the business imperative, let's map out exactly what we'll build to create this semantic foundation.

### 🎯 Learning Journey Overview

**Phase 1: Semantic Concepts**
- Build formal ontologies that capture business concepts (not just data schemas)
- Create reusable semantic models that grow with your organization
- Establish the vocabulary that enables knowledge-driven transformation

**Phase 2: Knowledge Graphs**  
- Transform business data into connected knowledge
- Preserve context and meaning through semantic relationships
- Enable reasoning and inference over business concepts

**Phase 3: Business-Aware Querying**
- Write SPARQL queries that understand business context
- Demonstrate how semantic queries differ from SQL data extraction
- Show formal traceability from query results to business outcomes

**Phase 4: ET(K)L Integration**
- Connect semantic foundation to governance chains
- Enable automated business rule enforcement
- Prepare foundation for AI agent integration

### 🔗 ET(K)L Connection Points

Each technical component directly supports ET(K)L principles:

| Technical Component | ET(K)L Principle | Business Impact |
|-------------------|-----------------|----------------|
| **Formal Ontologies** | Knowledge as Input | Business concepts shape data transformation |
| **Semantic Relationships** | Semantics over Strings | Reusable logic across domains and teams |
| **Context-Aware Queries** | Enterprise Alignment | Technical queries serve business outcomes |
| **Modular Vocabularies** | Composable Architecture | Knowledge modules portable across projects |
| **Business-Rule Integration** | Sociotechnical Evolution | Teams collaborate using shared semantic language |

---

**Ready to Build?** Let's start with environment setup, then dive into creating semantic infrastructure that transforms how your organization handles knowledge.


In [1]:
# 🧪 Quick Environment Verification Test
print("🔬 Testing core functionality after uv migration...")

# Test 1: Basic imports
try:
    import pandas as pd
    import networkx as nx
    import plotly.graph_objects as go
    from SPARQLWrapper import SPARQLWrapper, JSON
    print("✅ All imports successful")
except ImportError as e:
    print(f"❌ Import failed: {e}")

# Test 2: SemanticKnowledgeGraph instantiation  
try:
    test_kg = SemanticKnowledgeGraph()
    print("✅ SemanticKnowledgeGraph class available")
except NameError:
    print("❌ SemanticKnowledgeGraph not defined - run connection cell first")

# Test 3: Simple NetworkX graph
try:
    test_graph = nx.Graph()
    test_graph.add_edge('A', 'B')
    print(f"✅ NetworkX working - test graph has {len(test_graph.nodes())} nodes")
except Exception as e:
    print(f"❌ NetworkX failed: {e}")

# Test 4: Simple Plotly figure
try:
    test_fig = go.Figure(data=go.Scatter(x=[1,2,3], y=[4,5,6]))
    print("✅ Plotly working - test figure created")
except Exception as e:
    print(f"❌ Plotly failed: {e}")

print("\n🎯 Environment verification complete!")
print("💡 If all tests pass, the notebook is ready for semantic experiments.")

🔬 Testing core functionality after uv migration...
✅ All imports successful
❌ SemanticKnowledgeGraph not defined - run connection cell first
✅ NetworkX working - test graph has 2 nodes
✅ Plotly working - test figure created

🎯 Environment verification complete!
💡 If all tests pass, the notebook is ready for semantic experiments.


---

## ✅ Environment Verification Test

**Quick test to verify all components are working correctly after the uv migration.**

### 🔬 Environment Verification

**What this does:** This cell performs a quick test to verify our development environment is working correctly after migrating from pip to uv dependency management.

**Why it's important:** In the AGENTIC-DATA-SCRAPER platform, we use semantic knowledge graphs to understand and process business data. Before we can work with complex ontologies, we need to ensure all our Python packages are properly installed and accessible.

**Key concepts:**
- **Import testing**: Verifying that essential packages (pandas, networkx, plotly, etc.) are available
- **Class instantiation**: Checking that our custom SemanticKnowledgeGraph class is ready
- **Environment validation**: Making sure our development setup works before complex operations

**What to expect:** You should see green checkmarks (✅) for each test if everything is working correctly.

## Setup and Dependencies

### 📦 Dependency Management with uv

**What this does:** This cell verifies that our project dependencies are properly managed using uv (a fast Python package manager) instead of the traditional pip approach.

**Why we use uv:** The AGENTIC-DATA-SCRAPER platform requires many specialized packages for semantic processing (like rdflib, SPARQLWrapper, networkx). Traditional pip can be slow and sometimes creates conflicts. uv provides:
- **Faster installation**: Reported by the uv project as 10-100x faster than pip
- **Better dependency resolution**: Prevents version conflicts
- **Reproducible environments**: Ensures everyone has the same package versions

**Key packages we're testing:**
- **SPARQLWrapper**: For querying semantic knowledge graphs
- **rdflib**: For working with RDF (Resource Description Framework) data
- **pandas**: For data manipulation and analysis
- **networkx**: For graph analysis and visualization
- **plotly**: For interactive data visualization

**What to expect:** Green checkmarks mean all our semantic processing tools are ready to use.
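
As a complement to the try/except import test, the following sketch checks whether modules can be located without actually importing them, using only the standard library. The helper name `missing_packages` is an assumption for illustration. Note that it takes import names (for example `SPARQLWrapper`), which can differ from the distribution names listed in `pyproject.toml` (for example `sparqlwrapper`).

```python
from importlib.util import find_spec

def missing_packages(module_names):
    """Return the top-level module names that cannot be found on the import path."""
    return [name for name in module_names if find_spec(name) is None]

required = ["pandas", "rdflib", "SPARQLWrapper", "networkx", "plotly"]
missing = missing_packages(required)
if missing:
    print("Missing:", ", ".join(missing), "(try: uv sync)")
else:
    print("All required modules found")
```

Because nothing is imported, this check is fast and has no side effects, but it only proves that a module is installed, not that it imports cleanly.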

In [None]:
# ✅ Dependencies are managed by uv - no installation needed
# All required packages are already specified in pyproject.toml:
# - sparqlwrapper>=2.0.0
# - rdflib>=7.0.0  
# - pandas>=2.2.0
# - networkx (via other dependencies)
# - plotly (may need to be added)
# - matplotlib, seaborn, ipywidgets

print("✅ Using uv-managed dependencies from pyproject.toml")

# Import test to verify key packages are available
try:
    import pandas as pd
    import requests
    import rdflib
    from SPARQLWrapper import SPARQLWrapper, JSON
    print("✅ Core semantic packages available")
except ImportError as e:
    print(f"❌ Missing package: {e}")
    print("💡 Run: uv sync")

try:
    import networkx as nx
    import matplotlib.pyplot as plt
    print("✅ Visualization packages available")
except ImportError as e:
    print(f"❌ Missing visualization package: {e}")
    print("💡 May need: uv add networkx matplotlib")

try:
    import plotly.graph_objects as go
    import seaborn as sns
    import ipywidgets as widgets
    print("✅ Advanced visualization packages available")
except ImportError as e:
    print(f"❌ Missing advanced package: {e}")
    print("💡 May need: uv add plotly seaborn ipywidgets")

print("✅ Dependency check complete")

### 🌐 Semantic Knowledge Graph Connection Class

**What this does:** This cell defines our main `SemanticKnowledgeGraph` class - the core interface for communicating with our semantic knowledge graph database.

**Understanding semantic knowledge graphs:**
A semantic knowledge graph is like a smart database that understands relationships between concepts. Instead of just storing data in tables, it stores knowledge as interconnected concepts with meaningful relationships.

**Key components of our class:**

**1. SPARQL Endpoint Connection:**
- **SPARQL** is the W3C standard query language for RDF graphs, much as SQL is for relational databases
- Our knowledge graph runs on Apache Jena Fuseki (a SPARQL server)
- The endpoint URL (`http://localhost:3030/ds/sparql`) is where we send queries

**2. Namespace Prefixes:**
Think of these as shortcuts for long web addresses:
- `gist:` = Core business concepts (organizations, people, etc.)
- `bridge:` = Connections between business strategy and technical implementation
- `sow:` = Statement of Work contracts
- `rdfs:` & `owl:` = Standard semantic web vocabularies

**3. Query Method:**
- Sends SPARQL queries to the knowledge graph
- Converts results to pandas DataFrames (familiar table format)
- Automatically simplifies long URIs to readable names

**Why this matters:** This class is how our AGENTIC-DATA-SCRAPER platform reads business requirements from the semantic knowledge graph and understands how to generate appropriate data processing code.

**What to expect:** A connection test showing how many triples (facts) are in our knowledge graph.
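
The two string-handling ideas described above, namespace prefixes and URI simplification, can be tried in isolation. This is a small standalone sketch; the function names `prefix_block` and `local_name` are illustrative and are not part of the class itself.

```python
# Two of the namespaces used in this notebook
PREFIXES = {
    "gist": "https://w3id.org/semanticarts/ontology/gistCore#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}

def prefix_block(prefixes):
    """Build the PREFIX header that is prepended to every SPARQL query."""
    return "\n".join(f"PREFIX {name}: <{uri}>" for name, uri in prefixes.items())

def local_name(uri):
    """Keep only the part after the last '#' or '/' so results are readable."""
    return uri.split("#")[-1].split("/")[-1]

print(prefix_block(PREFIXES))
print(local_name("https://w3id.org/semanticarts/ontology/gistCore#Organization"))  # Organization
```

Shortening is lossy: two classes with the same local name in different namespaces become indistinguishable, which is why keeping the full URI alongside the short name is useful.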

In [None]:
# Core imports
import requests
import json
import pandas as pd
from typing import Dict, List, Any, Optional
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx
import plotly.graph_objects as go
import plotly.express as px
from plotly.subplots import make_subplots
import ipywidgets as widgets
from IPython.display import display, HTML, Markdown
import warnings
warnings.filterwarnings('ignore')

# RDF libraries
from rdflib import Graph, Namespace, URIRef, Literal
from rdflib.namespace import RDF, RDFS, OWL
from SPARQLWrapper import SPARQLWrapper, JSON

# Set up plotting
plt.style.use('default')  # Updated for broader compatibility
sns.set_palette("husl")

print("📊 Libraries loaded successfully")
print(f"🐍 Python version: {__import__('sys').version}")
print("📦 Environment: uv-managed dependencies")

### 📊 Knowledge Graph Statistics & Analysis

**What this does:** This cell analyzes our semantic knowledge graph to understand what data we have and how it's organized.

**Understanding the metrics:**

**1. Total Triples:**
- A "triple" is a basic fact in semantic format: Subject-Predicate-Object
- Example: "CompanyA hasBusinessModel DataStrategy" 
- More triples = more detailed knowledge

**2. Classes (Types of Things):**
- Classes define what types of entities exist (like Organization, Contract, Task)
- Instance counts show how many real examples we have of each class
- This helps us understand our data coverage

**3. Properties (Types of Relationships):**
- Properties connect entities with meaningful relationships
- Usage counts show which relationships are most common
- Helps identify the main patterns in our business data

**Why this analysis matters:**
In the AGENTIC-DATA-SCRAPER platform, we need to understand the scope and completeness of our semantic data before generating code. This analysis tells us:
- What business concepts are available
- How detailed our knowledge is
- Which relationships are most important for code generation

**What to expect:** Summary statistics showing the scale of our semantic knowledge graph, plus lists of the most common classes and properties.
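
The three metrics can be illustrated on a tiny in-memory example before running them against the real graph. This sketch uses plain tuples and `collections.Counter`; the sample facts extend the `CompanyA hasBusinessModel DataStrategy` example above and are otherwise made up.

```python
from collections import Counter

RDF_TYPE = "rdf:type"

# Illustrative facts only
triples = [
    ("CompanyA", RDF_TYPE, "Organization"),
    ("CompanyB", RDF_TYPE, "Organization"),
    ("CompanyA", "hasBusinessModel", "DataStrategy"),
    ("DataStrategy", RDF_TYPE, "BusinessModel"),
]

total_triples = len(triples)                                            # metric 1
class_counts = Counter(o for s, p, o in triples if p == RDF_TYPE)       # metric 2
property_usage = Counter(p for s, p, o in triples if p != RDF_TYPE)     # metric 3

print(f"Total triples: {total_triples}")
print("Instances per class:", class_counts.most_common())
print("Uses per property:", property_usage.most_common())
```

Typing statements (`rdf:type`) are what produce the class counts, so they are excluded from the property tally here to keep the two metrics separate.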

### 📈 Data Visualization & Distribution Analysis

**What this does:** This cell creates visual charts to help us understand the structure and distribution of our semantic knowledge graph data.

**Understanding the visualizations:**

**1. Bar Chart - Top Classes by Instance Count:**
- Shows which types of business entities we have the most data about
- Helps identify where our knowledge graph is strongest
- Example: If "DataProcessingTask" has 50 instances, we have lots of task data

**2. Pie Chart - Instance Distribution by Ontology Level:**
- Our AGENTIC-DATA-SCRAPER platform uses 4 ontology levels:
  - **Gist**: Foundational business concepts (organizations, people)
  - **DBC Bridge**: Data Business Canvas (strategy alignment)
  - **SOW**: Statement of Work contracts 
  - **Complete SOW**: Detailed contract specifications
- The pie chart shows how much data we have at each level

**Why visualization matters:**
Visual analysis helps us quickly identify:
- **Data gaps**: Which areas need more examples
- **Data concentration**: Where we have the most detailed information
- **Balance**: Whether our 4-level architecture is well-populated

**What to expect:** 
- A horizontal bar chart showing entity counts
- A pie chart showing the percentage split across ontology levels
- This gives you a visual "health check" of our semantic data
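
Assigning a class to one of the four ontology levels comes down to inspecting its full URI. The sketch below shows one way to do it with the namespaces used in this notebook; the class name `Contract` in the `sow#` namespace is a hypothetical example. The order of the checks matters: `complete-sow` contains the substring `sow`, so it has to be tested first.

```python
from collections import Counter

def ontology_level(class_uri):
    """Map a full class URI to one of the four ontology levels."""
    if "gistCore" in class_uri:
        return "Gist"
    if "gist-dbc-bridge" in class_uri:
        return "DBC Bridge"
    if "complete-sow" in class_uri:   # must precede the plain 'sow' check
        return "Complete SOW"
    if "/sow#" in class_uri:
        return "SOW"
    return "Other"

base = "https://agentic-data-scraper.com/ontology/"
sample = [
    "https://w3id.org/semanticarts/ontology/gistCore#Organization",
    base + "gist-dbc-bridge#DataAsset",
    base + "sow#Contract",                        # hypothetical class name
    base + "complete-sow#SemanticStatementOfWork",
]
print(Counter(ontology_level(uri) for uri in sample))
```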

## Knowledge Graph Connection

In [None]:
class SemanticKnowledgeGraph:
    """Interface to the semantic knowledge graph with interactive capabilities"""
    
    def __init__(self, endpoint_url: str = "http://localhost:3030/ds/sparql"):
        self.endpoint_url = endpoint_url
        self.sparql = SPARQLWrapper(endpoint_url)
        self.sparql.setReturnFormat(JSON)
        
        # Define namespace prefixes
        self.prefixes = {
            'gist': 'https://w3id.org/semanticarts/ontology/gistCore#',
            'bridge': 'https://agentic-data-scraper.com/ontology/gist-dbc-bridge#',
            'sow': 'https://agentic-data-scraper.com/ontology/sow#',
            'csow': 'https://agentic-data-scraper.com/ontology/complete-sow#',
            'rdfs': 'http://www.w3.org/2000/01/rdf-schema#',
            'rdf': 'http://www.w3.org/1999/02/22-rdf-syntax-ns#',
            'owl': 'http://www.w3.org/2002/07/owl#'
        }
        
        self.prefix_string = '\n'.join([f'PREFIX {k}: <{v}>' for k, v in self.prefixes.items()])
        
    def query(self, sparql_query: str) -> pd.DataFrame:
        """Execute SPARQL query and return results as DataFrame (empty on error or no results)"""
        full_query = f"{self.prefix_string}\n\n{sparql_query}"
        
        try:
            self.sparql.setQuery(full_query)
            results = self.sparql.query().convert()
            
            if 'results' in results and 'bindings' in results['results']:
                bindings = results['results']['bindings']
                if not bindings:
                    return pd.DataFrame()
                
                # Convert to DataFrame
                data = []
                for binding in bindings:
                    row = {}
                    for var, value in binding.items():
                        if value['type'] == 'uri':
                            # Simplify URIs by taking the fragment/last part
                            row[var] = value['value'].split('#')[-1].split('/')[-1]
                            row[f'{var}_full'] = value['value']  # Keep full URI
                        else:
                            row[var] = value['value']
                    data.append(row)
                
                return pd.DataFrame(data)
            
            elif 'boolean' in results:
                return pd.DataFrame({'result': [results['boolean']]})
            
            else:
                return pd.DataFrame()
                
        except Exception as e:
            print(f"❌ Query error: {e}")
            return pd.DataFrame()
    
    def test_connection(self) -> bool:
        """Test connection to the knowledge graph"""
        test_query = "SELECT (COUNT(*) as ?count) WHERE { ?s ?p ?o }"
        result = self.query(test_query)
        
        if not result.empty and 'count' in result.columns:
            count = int(result['count'].iloc[0])
            print(f"✅ Connected to knowledge graph with {count:,} triples")
            return True
        else:
            print("❌ Failed to connect to knowledge graph")
            return False
    
    def get_statistics(self) -> Dict[str, Any]:
        """Get basic statistics about the knowledge graph"""
        stats = {}
        
        # Total triples
        total_query = "SELECT (COUNT(*) as ?count) WHERE { ?s ?p ?o }"
        result = self.query(total_query)
        stats['total_triples'] = int(result['count'].iloc[0]) if not result.empty else 0
        
        # Classes with instance counts
        classes_query = """
        SELECT ?class (COUNT(?instance) as ?count) WHERE {
            ?instance a ?class .
            FILTER(
                STRSTARTS(STR(?class), "https://w3id.org/semanticarts/ontology/gistCore#") ||
                STRSTARTS(STR(?class), "https://agentic-data-scraper.com/ontology/")
            )
        }
        GROUP BY ?class
        ORDER BY DESC(?count)
        """
        classes_df = self.query(classes_query)
        stats['classes'] = classes_df.to_dict('records') if not classes_df.empty else []
        
        # Properties
        properties_query = """
        SELECT DISTINCT ?property (COUNT(*) as ?usage) WHERE {
            ?s ?property ?o .
            FILTER(
                STRSTARTS(STR(?property), "https://agentic-data-scraper.com/ontology/")
            )
        }
        GROUP BY ?property
        ORDER BY DESC(?usage)
        """
        props_df = self.query(properties_query)
        stats['properties'] = props_df.to_dict('records') if not props_df.empty else []
        
        return stats

# Initialize connection
kg = SemanticKnowledgeGraph()
if kg.test_connection():
    print("🚀 Ready for semantic experiments!")
else:
    print("⚠️  Make sure Fuseki is running: docker-compose -f docker-compose.semantic.yml up -d")

## Knowledge Graph Statistics and Overview

This section provides a comprehensive statistical analysis of our knowledge graph structure. We extract and display key metrics that help us understand the scale, diversity, and organization of our semantic data.

The analysis includes:
- **Total Triples**: The fundamental RDF statements (subject-predicate-object) that make up our knowledge graph
- **Class Distribution**: Which types of entities are most common in our data
- **Property Usage**: Which relationships and attributes are used most frequently
- **Inheritance Patterns**: How our ontology classes relate to each other through hierarchical relationships

These statistics are crucial for understanding the quality and completeness of our knowledge graph, identifying potential gaps, and optimizing query performance.

In [None]:
# Get comprehensive statistics
stats = kg.get_statistics()

print(f"📊 Knowledge Graph Overview")
print("=" * 40)
print(f"Total Triples: {stats['total_triples']:,}")
print(f"Classes: {len(stats['classes'])}")
print(f"Properties: {len(stats['properties'])}")

if stats['classes']:
    print("\n🏗️  Top Classes by Instance Count:")
    for i, cls in enumerate(stats['classes'][:10]):
        print(f"  {i+1:2d}. {cls['class']:30} {cls['count']:>5} instances")

if stats['properties']:
    print("\n🔗 Top Properties by Usage:")
    for i, prop in enumerate(stats['properties'][:10]):
        print(f"  {i+1:2d}. {prop['property']:30} {prop['usage']:>5} uses")

### Statistical Visualization

Now that we have the raw statistics, let's create visualizations to better understand the distribution and composition of our knowledge graph. The following charts help us quickly identify:

- Which entity types (classes) dominate our data
- How our ontology concepts are distributed across different vocabularies
- The relative importance of different relationship types

Visual analysis makes it easier to spot patterns and potential issues in our knowledge graph structure that might not be obvious from raw numbers alone.

In [None]:
# Visualize class distribution
if stats['classes']:
    classes_df = pd.DataFrame(stats['classes'])
    # SPARQL results arrive as strings; convert counts once so that sums add
    # numbers instead of concatenating text
    classes_df['count'] = classes_df['count'].astype(int)
    
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))
    
    # Bar chart of top classes
    top_classes = classes_df.head(10)
    ax1.barh(top_classes['class'], top_classes['count'])
    ax1.set_title('Top 10 Classes by Instance Count')
    ax1.set_xlabel('Number of Instances')
    
    # Pie chart of ontology distribution
    # 'complete-sow' is tested before 'sow' because it contains that substring
    classes_df['ontology'] = classes_df['class_full'].apply(lambda x: 
        'Gist' if 'gistCore' in x 
        else 'DBC Bridge' if 'gist-dbc-bridge' in x
        else 'Complete SOW' if 'complete-sow' in x
        else 'SOW' if 'sow' in x
        else 'Other'
    )
    
    ontology_counts = classes_df.groupby('ontology')['count'].sum()
    ax2.pie(ontology_counts.values, labels=ontology_counts.index, autopct='%1.1f%%')
    ax2.set_title('Instance Distribution by Ontology Level')
    
    plt.tight_layout()
    plt.show()
else:
    print("No class data available for visualization")

### 🔍 Interactive SPARQL Query Interface

**What this does:** This cell creates an interactive interface for exploring our semantic knowledge graph using SPARQL queries, without needing to write code manually.

**Understanding SPARQL queries:**
- **SPARQL** is the standard query language for semantic (RDF) data, as SQL is for relational databases
- It finds patterns in graph data by matching subjects, predicates, and objects
- Example: "Find all organizations that have a business model"

**The interactive interface provides:**

**1. Predefined Queries:**
- **All Classes**: Shows what types of entities exist in our knowledge graph
- **Gist Organizations**: Lists all business organizations
- **Data Assets**: Shows available data sources and their semantic mappings
- **SOW Contracts**: Displays contract information and business challenges
- **Property Usage**: Shows which relationships are most commonly used

**2. Custom Query Area:**
- You can modify existing queries or write your own
- Results appear as tables below the query so you can inspect them

**Why this is valuable:**
In the AGENTIC-DATA-SCRAPER platform, business analysts and developers need to explore semantic data without being SPARQL experts. This interface allows:
- Quick data exploration
- Understanding available business concepts
- Validating semantic mappings
- Identifying patterns for code generation

**What to expect:** 
- A dropdown menu with example queries
- A text area where you can edit SPARQL
- An execute button that runs queries and shows results
- Interactive tables displaying the query results
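
To show what "matching subjects, predicates, and objects" means in practice, here is a toy matcher for a single triple pattern, written in plain Python. It is a teaching sketch, not how a SPARQL engine is implemented; the function name `match` and the sample data are assumptions. Terms beginning with `?` are variables, everything else must match exactly.

```python
def match(pattern, triples):
    """Yield one {variable: value} binding for each triple that fits the pattern."""
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable binds on first use and must agree on reuse
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding

triples = [
    ("CompanyA", "rdf:type", "Organization"),
    ("CompanyA", "hasBusinessModel", "DataStrategy"),
    ("CompanyB", "rdf:type", "Organization"),
]

# "Find all organizations that have a business model" = two patterns joined on ?org
orgs = {b["?org"] for b in match(("?org", "rdf:type", "Organization"), triples)}
with_model = {b["?org"] for b in match(("?org", "hasBusinessModel", "?model"), triples)}
print(sorted(orgs & with_model))  # ['CompanyA']
```

A SPARQL `WHERE` clause with several triple patterns performs this kind of join on shared variables for you.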

## 4-Level Connectivity Analysis

### Ontology Inheritance Analysis

This section explores how our domain-specific classes relate to foundational semantic web vocabularies, particularly the Gist Core ontology. Understanding these inheritance relationships is crucial because:

- **Semantic Interoperability**: By extending standard vocabularies, our data can integrate with other systems that use the same foundational concepts
- **Reasoning Capabilities**: Inheritance allows semantic reasoners to infer additional facts based on class hierarchies
- **Data Validation**: Class hierarchies provide structure for validating that our data conforms to expected patterns
- **Query Optimization**: Understanding the class hierarchy helps write more efficient SPARQL queries

The analysis shows which of our custom business concepts (like DataContract, ExecutiveTarget) are built upon standard Gist classes, creating a bridge between domain-specific business knowledge and universal semantic concepts.
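
The reasoning benefit of inheritance can be sketched without a reasoner: given direct `rdfs:subClassOf` pairs, every ancestor of a class follows by transitivity. The hierarchy below is a hypothetical illustration. `DataContract` and `ExecutiveTarget` are named in this notebook, but the superclass names are placeholders and are not taken from the project's ontology files.

```python
def ancestors(cls, subclass_pairs):
    """Return every class reachable from cls by following subClassOf edges."""
    parents = {}
    for child, parent in subclass_pairs:
        parents.setdefault(child, set()).add(parent)

    found, stack = set(), [cls]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in found:   # also guards against cycles
                found.add(parent)
                stack.append(parent)
    return found

# Hypothetical (subclass, superclass) pairs
pairs = [
    ("DataContract", "gist:Agreement"),
    ("gist:Agreement", "gist:Commitment"),
    ("ExecutiveTarget", "gist:Goal"),
]

print(sorted(ancestors("DataContract", pairs)))  # ['gist:Agreement', 'gist:Commitment']
```

With the inferred ancestors available, a query for instances of the top-level class can also return instances of the domain-specific subclass. A reasoner or a SPARQL property path such as `rdfs:subClassOf*` provides the same effect on a real graph.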

In [None]:
# Analyze inheritance relationships
# Plain English: "Show me all the custom classes we created and which standard Gist classes they extend from.
# This shows how our domain-specific ontologies build upon the foundational Gist vocabulary"
inheritance_query = """
SELECT ?subclass ?superclass WHERE {
    ?subclass rdfs:subClassOf ?superclass .
    FILTER(
        STRSTARTS(STR(?subclass), "https://agentic-data-scraper.com/ontology/") &&
        STRSTARTS(STR(?superclass), "https://w3id.org/semanticarts/ontology/gistCore#")
    )
}
ORDER BY ?subclass
"""

inheritance_results = kg.query(inheritance_query)

print("🔗 Inheritance Chain Analysis")
print("=" * 40)

if not inheritance_results.empty:
    print(f"✅ Found {len(inheritance_results)} inheritance relationships:")
    print()
    for _, row in inheritance_results.iterrows():
        print(f"  {row['subclass']:35} → gist:{row['superclass']}")
    
    print("\n📊 Inheritance Summary:")
    gist_parents = inheritance_results.groupby('superclass').size().sort_values(ascending=False)
    for parent, count in gist_parents.items():
        print(f"  gist:{parent}: {count} subclasses")
else:
    print("❌ No inheritance relationships found")

## Business Value Chain Analysis

This analysis demonstrates one of the most powerful aspects of semantic knowledge graphs: connecting technical data processing activities to business outcomes and executive accountability. 

We trace the complete value creation pipeline:
1. **Data Processing Tasks** - Technical activities that transform raw data
2. **Value Propositions** - Business benefits created by these tasks
3. **Executive Targets** - Strategic goals that the value supports
4. **Ownership** - Which executives are accountable for achieving these targets

This end-to-end traceability enables:
- **Impact Assessment**: Understanding which technical changes affect business objectives
- **Resource Prioritization**: Focusing development efforts on high-value activities
- **Accountability Mapping**: Clear lines of responsibility from code to C-suite
- **ROI Measurement**: Quantifying the business impact of data initiatives

By representing these relationships semantically, we can automatically generate reports, detect orphaned processes, and ensure all technical work aligns with business strategy.
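
Detecting orphaned processes, for example, reduces to a set difference: tasks that exist minus tasks that create some value. The sketch below shows the idea on in-memory tuples. `bridge:DataProcessingTask` and `bridge:createsBusinessValue` are the terms used in this notebook, while the task and value identifiers are made up for illustration.

```python
# Illustrative facts; instance names are hypothetical
triples = [
    ("task:CleanPrices", "rdf:type", "bridge:DataProcessingTask"),
    ("task:ArchiveLogs", "rdf:type", "bridge:DataProcessingTask"),
    ("task:CleanPrices", "bridge:createsBusinessValue", "value:FasterTrades"),
]

# All declared tasks
tasks = {s for s, p, o in triples
         if p == "rdf:type" and o == "bridge:DataProcessingTask"}

# Tasks linked to at least one value proposition
valued = {s for s, p, o in triples if p == "bridge:createsBusinessValue"}

orphans = sorted(tasks - valued)
print("Tasks with no business value link:", orphans)  # ['task:ArchiveLogs']
```

Against the live graph, the same check is typically written in SPARQL with `FILTER NOT EXISTS { ?task bridge:createsBusinessValue ?value }`.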

In [None]:
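# Analyze business value creation chains
# Plain English: "Find data processing tasks that create business value, and show me what specific
# value they create, which executive targets they support, and who owns those targets"
# Note: the OPTIONAL block introduces ?canvas without joining it to ?task or ?value, so every
# task row is paired with every owned target in the graph. Add the triple pattern that links
# the value (or task) to its canvas in your ontology to restrict the results to true chains.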
value_chain_query = """
SELECT ?task ?value ?target ?owner WHERE {
    ?task a bridge:DataProcessingTask .
    ?task bridge:createsBusinessValue ?value .
    ?value a bridge:ValueProposition .
    
    OPTIONAL {
        ?canvas bridge:alignsWithTarget ?target .
        ?target a bridge:ExecutiveTarget .
        ?target bridge:ownedBy ?owner .
        ?owner a gist:Person .
    }
}
"""

value_results = kg.query(value_chain_query)

print("💰 Business Value Chain Analysis")
print("=" * 40)

if not value_results.empty:
    print(f"✅ Found {len(value_results)} value creation relationship(s):")
    print()
    for i, row in value_results.iterrows():
        print(f"Value Chain {i+1}:")
        print(f"  Task:              {row['task']}")
        print(f"  Creates Value:     {row['value']}")
        if pd.notna(row.get('target')):
            print(f"  Executive Target:  {row['target']}")
        if pd.notna(row.get('owner')):
            print(f"  Target Owner:      {row['owner']}")
        print()
    
    display(value_results)
else:
    print("❌ No value creation chains found")

## Interactive SPARQL Query Interface

This section provides an interactive environment for exploring our knowledge graph using SPARQL queries. SPARQL (SPARQL Protocol and RDF Query Language) is the standard query language for semantic web technologies, similar to how SQL queries relational databases.

Key features of this interface:
- **Predefined Queries**: Common analytical queries with plain English explanations
- **Custom Query Execution**: Write and test your own SPARQL queries
- **Result Formatting**: Automatically display results in readable tables
- **Error Reporting**: Feedback on query syntax and execution errors when a query is run

The interface bridges the gap between technical SPARQL syntax and business questions, allowing users to:
- Explore entity relationships without learning complex query syntax
- Understand what each query accomplishes through natural language descriptions
- Experiment with custom queries to answer specific business questions
- Export results for further analysis

This democratizes access to the knowledge graph, enabling both technical and business users to extract insights from our semantic data.
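
Result formatting starts from the SPARQL 1.1 JSON results structure, in which every value arrives as a string, optionally tagged with a `datatype`. The sketch below converts such a response into plain row dictionaries and casts the common numeric datatypes. The helper name `bindings_to_rows` and the sample response are assumptions; only a small subset of XSD datatypes is handled.

```python
XSD = "http://www.w3.org/2001/XMLSchema#"

# Minimal datatype-to-Python mapping; anything else stays a string
CASTS = {
    XSD + "integer": int,
    XSD + "double": float,
}

def bindings_to_rows(results):
    """Turn a SPARQL JSON result dict into a list of {variable: value} rows."""
    rows = []
    for binding in results["results"]["bindings"]:
        row = {}
        for var, term in binding.items():
            cast = CASTS.get(term.get("datatype"), str)
            row[var] = cast(term["value"])
        rows.append(row)
    return rows

# Hand-written sample response in the SPARQL 1.1 JSON results layout
sample = {
    "head": {"vars": ["class", "count"]},
    "results": {"bindings": [{
        "class": {"type": "uri",
                  "value": "https://w3id.org/semanticarts/ontology/gistCore#Organization"},
        "count": {"type": "literal", "datatype": XSD + "integer", "value": "12"},
    }]},
}

rows = bindings_to_rows(sample)
print(rows[0]["count"] + 1)  # 13
```

Casting at this stage means that counts behave as numbers in later steps such as sorting, summing, or computing summary statistics.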

In [None]:
# Interactive query widget
def create_interactive_query_interface():
    # Predefined queries with plain English explanations
    predefined_queries = {
        # Plain English: "Show me all types of entities and count how many instances we have of each"
        "All Classes": """
SELECT DISTINCT ?class (COUNT(?instance) as ?count) WHERE {
    ?instance a ?class .
}
GROUP BY ?class
ORDER BY DESC(?count)
LIMIT 20""",
        
        # Plain English: "List all organizations in our knowledge graph and their labels"
        "Gist Organizations": """
SELECT ?org ?label WHERE {
    ?org a gist:Organization .
    OPTIONAL { ?org rdfs:label ?label }
}""",
        
        # Plain English: "Find all data assets, their readable names, and what semantic concepts they map to"
        "Data Assets": """
SELECT ?asset ?label ?mapping WHERE {
    ?asset a bridge:DataAsset .
    OPTIONAL { ?asset rdfs:label ?label }
    OPTIONAL { ?asset bridge:hasSemanticMapping ?mapping }
}""",
        
        # Plain English: "Show Statement of Work contracts with their business challenges and desired outcomes"
        "SOW Contracts": """
SELECT ?sow ?challenge ?outcome WHERE {
    ?sow a csow:SemanticStatementOfWork .
    OPTIONAL { ?sow csow:hasBusinessChallenge ?challenge }
    OPTIONAL { ?sow csow:hasDesiredOutcome ?outcome }
}""",
        
        # Plain English: "Count how many times each relationship type is used in our domain ontologies"
        "Property Usage": """
SELECT ?property (COUNT(*) as ?usage) WHERE {
    ?s ?property ?o .
    FILTER(STRSTARTS(STR(?property), "https://agentic-data-scraper.com/ontology/"))
}
GROUP BY ?property
ORDER BY DESC(?usage)"""
    }
    
    # Widget setup
    query_dropdown = widgets.Dropdown(
        options=list(predefined_queries.keys()),
        value=list(predefined_queries.keys())[0],
        description='Query:',
        style={'description_width': 'initial'}
    )
    
    query_text = widgets.Textarea(
        value=predefined_queries[query_dropdown.value],
        placeholder='Enter your SPARQL query here...',
        description='SPARQL:',
        style={'description_width': 'initial'},
        layout=widgets.Layout(width='100%', height='200px')
    )
    
    execute_button = widgets.Button(
        description='Execute Query',
        button_style='primary',
        icon='play'
    )
    
    output_area = widgets.Output()
    
    def update_query(change):
        query_text.value = predefined_queries[change['new']]
    
    def execute_query(button):
        with output_area:
            output_area.clear_output()
            print(f"🔍 Executing query: {query_dropdown.value}")
            print("=" * 50)
            
            try:
                result = kg.query(query_text.value)
                
                if result.empty:
                    print("No results found")
                else:
                    print(f"✅ Found {len(result)} result(s)")
                    display(result)
                    
                    # Show basic statistics if numeric columns exist
                    numeric_cols = result.select_dtypes(include=['int64', 'float64']).columns
                    if len(numeric_cols) > 0:
                        print("\n📊 Numeric Summary:")
                        display(result[numeric_cols].describe())
                        
            except Exception as e:
                print(f"❌ Query error: {e}")
    
    query_dropdown.observe(update_query, names='value')
    execute_button.on_click(execute_query)
    
    # Layout
    interface = widgets.VBox([
        widgets.HTML("<h3>🔍 Interactive SPARQL Query Interface</h3>"),
        query_dropdown,
        query_text,
        execute_button,
        output_area
    ])
    
    return interface

# Create and display the interface
query_interface = create_interactive_query_interface()
display(query_interface)

## Knowledge Graph Visualization

Visualization transforms abstract RDF triples into intuitive network diagrams that reveal the structure and relationships within our knowledge graph. This interactive visualization helps users:

**Understand Graph Structure**:
- See how entities connect through relationships
- Identify central nodes and connection patterns
- Discover unexpected relationships between business concepts

**Visual Analysis Capabilities**:
- **Node Sizing**: Larger nodes represent entities with more connections
- **Color Coding**: Different colors distinguish entity types (organizations, processes, targets)
- **Interactive Exploration**: Click and drag to explore specific areas of interest
- **Layout Algorithms**: Automatic positioning that groups related concepts together

**Business Value**:
- **Pattern Recognition**: Spot business process bottlenecks or opportunities
- **Impact Analysis**: Visualize how changes might ripple through connected systems
- **Communication Tool**: Present complex data relationships to stakeholders
- **Data Quality**: Identify orphaned entities or missing connections

The visualization focuses on custom business relationships while filtering out technical RDF metadata, providing a clear view of domain-specific knowledge patterns.
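
The node-sizing and orphan-detection ideas above can be sketched with the standard library alone. This is a minimal illustration, not the notebook's implementation: the helper names `node_sizes` and `orphaned`, the size formula, and the sample identifiers are all assumptions made for the example.

```python
from collections import Counter

def node_sizes(edges, min_size=10, scale=4):
    """Map each node to a marker size that grows with its number of connections."""
    degree = Counter()
    for subject, obj in edges:
        degree[subject] += 1
        degree[obj] += 1
    return {node: min_size + scale * count for node, count in degree.items()}

def orphaned(nodes, edges):
    """Return the nodes that do not appear in any edge."""
    connected = {node for edge in edges for node in edge}
    return sorted(set(nodes) - connected)

# Hypothetical identifiers, used only to exercise the helpers.
edges = [
    ("org:euroenergy", "canvas:trading"),
    ("canvas:trading", "sow:power-gen"),
]
all_nodes = ["org:euroenergy", "canvas:trading", "sow:power-gen", "task:unlinked"]

sizes = node_sizes(edges)
orphans = orphaned(all_nodes, edges)
print(sizes)
print(orphans)
```

A dictionary like `sizes` could be turned into a list and passed as the Plotly marker `size`, while `orphans` gives a quick data-quality check for entities that have no relationships yet.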

In [None]:
def create_knowledge_graph_visualization():
    """Create an interactive network visualization of the knowledge graph"""
    
    # Query for relationships
    # Plain English: "Find pairs of entities connected by our custom relationships,
    # limiting to 50 connections to keep the visualization manageable"
    relationships_query = """
    SELECT ?subject ?predicate ?object WHERE {
        ?subject ?predicate ?object .
        FILTER(
            STRSTARTS(STR(?predicate), "https://agentic-data-scraper.com/ontology/") &&
            isURI(?object)
        )
    }
    LIMIT 50
    """
    
    relationships = kg.query(relationships_query)
    
    if relationships.empty:
        print("❌ No relationships found for visualization")
        return
    
    # Create NetworkX graph
    G = nx.DiGraph()
    
    # Add nodes and edges
    for _, row in relationships.iterrows():
        subject = row['subject']
        predicate = row['predicate']
        obj = row['object']
        
        G.add_edge(subject, obj, label=predicate)
    
    # Create layout
    pos = nx.spring_layout(G, k=3, iterations=50)
    
    # Prepare data for Plotly
    edge_x = []
    edge_y = []
    edge_info = []
    
    for edge in G.edges(data=True):
        x0, y0 = pos[edge[0]]
        x1, y1 = pos[edge[1]]
        edge_x.extend([x0, x1, None])
        edge_y.extend([y0, y1, None])
        edge_info.append(edge[2]['label'])
    
    edge_trace = go.Scatter(
        x=edge_x, y=edge_y,
        line=dict(width=0.5, color='#888'),
        hoverinfo='none',
        mode='lines'
    )
    
    node_x = []
    node_y = []
    node_text = []
    node_colors = []
    
    for node in G.nodes():
        x, y = pos[node]
        node_x.append(x)
        node_y.append(y)
        node_text.append(node)
        
        # Color nodes by ontology level
        if 'gistCore' in node:
            node_colors.append('red')  # Level 1: Gist
        elif 'gist-dbc-bridge' in node:
            node_colors.append('blue')  # Level 2: DBC
        elif 'sow' in node:
            node_colors.append('green')  # Level 3: SOW
        else:
            node_colors.append('orange')  # Other
    
    node_trace = go.Scatter(
        x=node_x, y=node_y,
        mode='markers+text',
        hoverinfo='text',
        text=node_text,
        textposition="middle center",
        marker=dict(
            showscale=False,
            color=node_colors,
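            # Scale marker size with each node's number of connections (degree)
            size=[10 + 4 * G.degree(node) for node in G.nodes()],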
            line=dict(width=2)
        )
    )
    
    # Create figure with updated Plotly API
    fig = go.Figure(
        data=[edge_trace, node_trace],
        layout=go.Layout(
            title=dict(
                text='🌐 Knowledge Graph Visualization',
                font=dict(size=16)
            ),
            showlegend=False,
            hovermode='closest',
            margin=dict(b=20,l=5,r=5,t=40),
            annotations=[
                dict(
                    text="Colors: Red=Gist, Blue=DBC Bridge, Green=SOW, Orange=Other",
                    showarrow=False,
                    xref="paper", yref="paper",
                    x=0.005, y=-0.002,
                    xanchor="left", yanchor="bottom",
                    font=dict(size=12)
                )
            ],
            xaxis=dict(showgrid=False, zeroline=False, showticklabels=False),
            yaxis=dict(showgrid=False, zeroline=False, showticklabels=False)
        )
    )
    
    fig.show()
    
    print(f"📊 Visualization Statistics:")
    print(f"  Nodes: {len(G.nodes())}")
    print(f"  Edges: {len(G.edges())}")
    print(f"  Density: {nx.density(G):.3f}")

# Create visualization
create_knowledge_graph_visualization()

## Semantic Reasoning Experiments

Semantic reasoning is where knowledge graphs truly excel beyond traditional databases. Reasoning engines can automatically infer new facts from existing data using logical rules and ontological relationships.

**Types of Reasoning Demonstrated**:

**Transitive Relationships**: 
- Follow multi-step connections (org → canvas → SOW → contract → task)
- Automatically discover indirect relationships without explicit links
- Enables impact analysis across the entire business-to-technical stack

**Class Hierarchy Reasoning**:
- Automatically infer that instances of specific classes are also instances of their parent classes
- Query for general types and get specific instances automatically
- Enables flexible querying without knowing exact entity types

**Property Reasoning**:
- Use inverse properties to query relationships from either direction
- Apply property chains to discover multi-step relationships
- Leverage property characteristics (symmetric, transitive, functional)

**Business Applications**:
- **Compliance**: Automatically verify that all business models have implementing SOWs
- **Gap Analysis**: Identify missing links in value creation chains
- **Change Impact**: Predict what business processes are affected by technical changes
- **Data Lineage**: Trace data flow from source to business outcome

This demonstrates how semantic technologies enable intelligent automation and discovery that would require complex programming in traditional systems.
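
To make class hierarchy reasoning concrete, here is a minimal pure-Python sketch that infers every type an instance has by following `rdfs:subClassOf` links transitively. It is an illustration under assumptions, not how the notebook's `kg` object works: the function names are invented for this example, and the `ex:` classes are hypothetical parents of `bridge:DataProcessingTask`.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

def superclasses(cls, parents):
    """Collect all ancestors of cls by walking subClassOf links transitively."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:  # also protects against cycles
                seen.add(parent)
                stack.append(parent)
    return seen

def infer_types(triples):
    """Return (instance, class) pairs, including classes implied by the hierarchy."""
    parents = defaultdict(set)
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            parents[s].add(o)
    inferred = set()
    for s, p, o in triples:
        if p == RDF_TYPE:
            inferred.add((s, o))
            inferred.update((s, ancestor) for ancestor in superclasses(o, parents))
    return inferred

# Hypothetical hierarchy: the ex: classes are placeholders, not real ontology terms.
triples = [
    ("bridge:DataProcessingTask", SUBCLASS_OF, "ex:Task"),
    ("ex:Task", SUBCLASS_OF, "ex:Activity"),
    ("task:load-prices", RDF_TYPE, "bridge:DataProcessingTask"),
]

types = infer_types(triples)
print(sorted(types))
```

Asking for all `ex:Activity` instances now returns `task:load-prices` even though no triple states that type directly, which is the behavior described under "Class Hierarchy Reasoning".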

In [None]:
# Test semantic reasoning capabilities
def test_semantic_reasoning():
    """Test various semantic reasoning queries"""
    
    reasoning_tests = {
        # Plain English: "Follow the chain from organizations all the way to data processing tasks,
        # showing how business entities connect to technical implementation"
        "Transitive Relationships": """
        SELECT ?org ?task WHERE {
            ?org a gist:Organization .
            ?org bridge:hasBusinessModel ?canvas .
            ?canvas bridge:implementedBySOW ?sow .
            ?sow bridge:realizesContract ?contract .
            ?contract bridge:executedByTask ?task .
            # This shows transitive relationship: org -> canvas -> sow -> contract -> task
        }""",
        
        # Plain English: "Show instances of our custom classes and what standard Gist classes they inherit from"
        "Class Hierarchy": """
        SELECT ?instance ?specificType ?generalType WHERE {
            ?instance a ?specificType .
            ?specificType rdfs:subClassOf ?generalType .
            FILTER(STRSTARTS(STR(?generalType), "https://w3id.org/semanticarts/ontology/gistCore#"))
        }""",
        
        # Plain English: "Find tasks and what business value they create (reverse relationship lookup)"
        "Inverse Relationships": """
        SELECT ?value ?task WHERE {
            ?task bridge:createsBusinessValue ?value .
            # Find what creates specific business values
        }""",
        
        # Plain English: "Count how many two-hop paths (through one intermediate entity)
        # connect each organization to a data processing task"
        "Multi-hop Connections": """
        SELECT ?start ?end (COUNT(?intermediate) as ?paths) WHERE {
            ?start a gist:Organization .
            ?start ?p1 ?intermediate .
            ?intermediate ?p2 ?end .
            ?end a bridge:DataProcessingTask .
        }
        GROUP BY ?start ?end
        """
    }
    
    print("🧠 Semantic Reasoning Tests")
    print("=" * 50)
    
    for test_name, query in reasoning_tests.items():
        print(f"\n🔍 {test_name}:")
        result = kg.query(query)
        
        if not result.empty:
            print(f"  ✅ Found {len(result)} result(s)")
            if len(result) <= 5:  # Show results if few enough
                display(result)
            else:
                print(f"  📊 Sample results:")
                display(result.head())
        else:
            print(f"  ❌ No results found")

test_semantic_reasoning()

## Custom Query Experiments

### 🎨 Professional KuzuDB + yFiles Visualization Class

**What this does:** This cell defines our advanced visualization system that combines KuzuDB (high-performance graph database) with yFiles (professional graph visualization) to create enterprise-grade semantic knowledge graph exploration.

**Understanding the technology stack:**

**1. KuzuDB - High-Performance Graph Database:**
- **In-memory processing**: Extremely fast query execution
- **Optimized for analytics**: Perfect for complex graph pattern analysis  
- **Cypher-compatible**: Uses familiar graph query language
- **Why we use it**: Traditional databases struggle with highly connected semantic data

**2. yFiles - Professional Graph Visualization:**
- **Enterprise-grade**: Used by major corporations for network visualization
- **Interactive exploration**: Zoom, pan, filter, and drill-down capabilities
- **Professional layouts**: Automatic arrangement of complex graphs
- **Export capabilities**: Save visualizations for presentations and reports

**Our 4-level color scheme:**
- 🔴 **Red (Gist)**: Foundation layer - core business concepts
- 🔵 **Teal (Bridge)**: Strategy layer - Data Business Canvas
- 🔵 **Blue (SOW)**: Planning layer - Statement of Work contracts  
- 🟢 **Green (Contracts)**: Execution layer - Data processing tasks

**Key capabilities of our class:**
1. **Data Loading**: Converts SPARQL results into KuzuDB format
2. **Schema Creation**: Defines optimized database structure for semantic data
3. **Professional Visualization**: Creates interactive graph exploration interface
4. **Layer Analysis**: Shows connectivity patterns between ontology levels

**Why this matters for AGENTIC-DATA-SCRAPER:**
Professional visualization helps stakeholders understand:
- How business requirements connect to technical implementation
- Where semantic gaps exist that need filling
- The complexity and relationships in their data pipeline requirements

**What to expect:** A comprehensive class definition that will be used to create professional interactive semantic visualizations.

### 🚀 KuzuDB + yFiles Implementation & Execution

**What this does:** This is the main execution cell that brings everything together - it initializes our KuzuDB database, loads our semantic data, and creates the professional yFiles visualization.

**Step-by-step process:**

**1. Database Initialization:**
- Creates a high-performance in-memory KuzuDB instance
- This database will temporarily hold our semantic data for fast querying

**2. Schema Creation:**
- Defines the structure for storing semantic entities and relationships
- Creates optimized tables for our 4-level ontology architecture

**3. Data Loading:**
- Pulls all semantic data from our SPARQL endpoint
- Transforms it into KuzuDB's optimized format
- Preserves all relationships and metadata

**4. yFiles Visualization:**
- Creates a professional interactive graph widget
- Uses the Cypher query: `MATCH (a)-[b]->(c) RETURN * LIMIT 100`
- This finds up to 100 connected entity pairs to visualize

**5. Analysis & Insights:**
- Shows connectivity statistics
- Displays layer distribution
- Provides professional summary of our semantic architecture
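
The layer distribution and connectivity statistics from step 5 can be approximated without KuzuDB. The sketch below classifies URIs by substring and counts edges that cross layers. Treat it as an assumption-laden illustration: the substring markers mirror the ones used in the Plotly visualization cell, the helper names are invented, and the two `agentic-data-scraper.com` class URIs are hypothetical examples.

```python
from collections import Counter

# (substring, layer) pairs checked in order; unmatched URIs fall into "Other".
LAYER_MARKERS = [
    ("gistCore", "Gist"),
    ("gist-dbc-bridge", "Bridge"),
    ("sow", "SOW"),
]

def layer_of(uri):
    """Return the ontology layer whose marker substring appears in the URI."""
    for marker, layer in LAYER_MARKERS:
        if marker in uri:
            return layer
    return "Other"

def layer_distribution(uris):
    """Count how many URIs fall into each layer."""
    return Counter(layer_of(uri) for uri in uris)

def cross_layer_edges(edges):
    """Count edges whose endpoints sit in different layers, keyed by (from, to)."""
    pairs = ((layer_of(s), layer_of(o)) for s, o in edges)
    return Counter(pair for pair in pairs if pair[0] != pair[1])

GIST_ORG = "https://w3id.org/semanticarts/ontology/gistCore#Organization"
# Hypothetical URIs for illustration only.
CANVAS = "https://agentic-data-scraper.com/ontology/gist-dbc-bridge#DataBusinessCanvas"
SOW = "https://agentic-data-scraper.com/ontology/sow#SemanticStatementOfWork"

distribution = layer_distribution([GIST_ORG, CANVAS, SOW])
crossings = cross_layer_edges([(CANVAS, GIST_ORG), (SOW, CANVAS), (SOW, SOW)])
print(distribution)
print(crossings)
```

Substring matching is fragile (any URI containing `sow` lands in the SOW layer), so a production version would more likely compare against full namespace prefixes.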

**What makes this powerful:**
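- **Performance**: KuzuDB is an embedded graph database designed for analytical queries over highly connected data, where join-heavy relational queries tend to slow down; measure on your own graph before relying on a specific speedup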
- **Interactivity**: yFiles provides professional-grade exploration capabilities
- **Scalability**: Can handle large enterprise knowledge graphs
- **Professional quality**: Suitable for presentations to executives and stakeholders

**Expected outcome:**
A complete professional visualization of our AGENTIC-DATA-SCRAPER semantic knowledge graph, demonstrating how business requirements flow through our 4-level architecture to generate technical implementations.

**This demonstrates:** The power of linked data reusability across multiple ontology levels - showing how semantic technologies enable automated code generation from business requirements.

## Performance Analysis

Understanding query performance is crucial for building responsive semantic applications. This section benchmarks different types of SPARQL queries to identify performance patterns and optimization opportunities.

**Query Types Benchmarked**:

**Simple Operations**:
- **Triple Counts**: Baseline performance for basic graph traversal
- **Class Instance Retrieval**: How quickly we can find entities of specific types

**Pattern Matching**:
- **Property Patterns**: Performance of filtering by specific relationships
- **Complex Joins**: Multi-step relationship traversal costs

**Reasoning Operations**:
- **Inheritance Queries**: Cost of class hierarchy navigation
- **Transitive Relationships**: Performance impact of multi-hop connections

**Performance Insights**:
- Identifies query patterns that scale well with graph size
- Reveals bottlenecks that may require index optimization
- Guides query optimization strategies for production applications
- Helps estimate resource requirements for larger knowledge graphs

**Optimization Strategies**:
- Index commonly queried properties
- Limit result sets for exploratory queries
- Use FILTER clauses to reduce intermediate results
- Consider materialized views for expensive reasoning queries

This analysis helps ensure our semantic applications remain responsive as the knowledge graph grows in size and complexity.
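
When measuring queries, a small timing helper keeps the methodology consistent. The sketch below is a generic illustration rather than the notebook's benchmark: `time_call` is a hypothetical helper, and the warm-up count and the choice of the median are assumptions that seemed reasonable for short, noisy measurements.

```python
import statistics
import time

def time_call(fn, repeats=5, warmup=1):
    """Time fn() with a monotonic high-resolution clock and summarize the runs."""
    for _ in range(warmup):
        fn()  # warm caches so the first measured run is not an outlier
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return {
        "median_s": statistics.median(samples),
        "min_s": min(samples),
        "max_s": max(samples),
        "runs": repeats,
    }

# Stand-in workload; in the notebook this could be `lambda: kg.query(query)`.
stats = time_call(lambda: sum(range(100_000)))
print(stats)
```

The median is less sensitive than the mean to a single slow run, which matters when each query is executed only a handful of times.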

In [None]:
import time

def benchmark_queries():
    """Benchmark different types of queries for performance analysis"""
    
    # Named `queries` so the dictionary does not shadow the enclosing function
    queries = {
        "Simple Count": "SELECT (COUNT(*) as ?count) WHERE { ?s ?p ?o }",
        "Class Instances": "SELECT * WHERE { ?s a ?type } LIMIT 100",
        "Property Patterns": "SELECT * WHERE { ?s bridge:hasBusinessModel ?o } LIMIT 10",
        "Complex Join": """
        SELECT ?org ?canvas ?sow WHERE {
            ?org a gist:Organization .
            ?org bridge:hasBusinessModel ?canvas .
            ?canvas bridge:implementedBySOW ?sow .
        }""",
        "Inheritance Query": """
        SELECT ?sub ?super WHERE {
            ?sub rdfs:subClassOf ?super .
        } LIMIT 20"""
    }
    
    print("⚡ Query Performance Benchmark")
    print("=" * 40)
    
    performance_results = []
    
    for query_name, query in queries.items():
        # Run query multiple times for average
        times = []
        for _ in range(3):
            # perf_counter is monotonic and higher resolution than time.time()
            start_time = time.perf_counter()
            result = kg.query(query)
            end_time = time.perf_counter()
            times.append(end_time - start_time)
        
        avg_time = sum(times) / len(times)
        result_count = len(result) if not result.empty else 0
        
        performance_results.append({
            'Query': query_name,
            'Avg Time (s)': round(avg_time, 4),  # keep numeric so it can be plotted directly
            'Results': result_count
        })
        
        print(f"  {query_name:20} {avg_time:.4f}s ({result_count} results)")
    
    # Create performance DataFrame
    perf_df = pd.DataFrame(performance_results)
    
    # Visualize performance
    fig, ax = plt.subplots(figsize=(10, 6))
    times_float = [float(t) for t in perf_df['Avg Time (s)']]
    bars = ax.bar(perf_df['Query'], times_float)
    ax.set_title('Query Performance Comparison')
    ax.set_ylabel('Average Time (seconds)')
    ax.set_xlabel('Query Type')
    plt.xticks(rotation=45, ha='right')
    
    # Add value labels on bars
    for bar, time_val in zip(bars, times_float):
        height = bar.get_height()
        ax.text(bar.get_x() + bar.get_width()/2., height + 0.001,
                f'{time_val:.4f}s', ha='center', va='bottom')
    
    plt.tight_layout()
    plt.show()
    
    return perf_df

performance_data = benchmark_queries()
display(performance_data)

## Export and Save Results

This section demonstrates how to persist and share the insights generated from our semantic knowledge graph analysis. Exporting results is crucial for:

**Data Integration**:
- Export RDF data for import into other semantic systems
- Generate standard formats (Turtle, JSON-LD, N-Triples) for interoperability
- Create flat file exports for integration with traditional business intelligence tools

**Visualization and Reporting**:
- Save interactive visualizations for stakeholder presentations
- Generate static reports summarizing key findings
- Export network diagrams in formats suitable for documentation

**Professional Visualization**:
- Leverage advanced graph visualization libraries like yFiles
- Create production-ready visual representations
- Generate high-quality exports for publications and presentations

**Backup and Versioning**:
- Preserve snapshots of knowledge graph state
- Enable reproducible analysis and audit trails
- Support version control for ontology evolution

**Integration Capabilities**:
- Bridge semantic technologies with enterprise data architectures
- Enable embedding of knowledge graph insights into existing workflows
- Support both batch and real-time export scenarios

The export functionality ensures that our semantic analysis can be integrated into broader organizational knowledge management and decision-making processes.

## Next Steps and Experimentation Ideas

This notebook provides a comprehensive foundation for experimenting with the semantic knowledge graph. Here are some ideas for further exploration:

### 🔬 **Experiment Ideas**
1. **Add New Ontology Classes**: Extend the ontologies with domain-specific classes
2. **Create Complex Queries**: Build multi-hop reasoning queries
3. **Visualization Enhancements**: Create specialized visualizations for different aspects
4. **Performance Optimization**: Test query optimization strategies
5. **Data Integration**: Load real business data and map it to the ontologies

### 🚀 **Application Development**
1. **Semantic Search**: Build search interfaces using the knowledge graph
2. **Business Intelligence**: Create dashboards based on semantic queries
3. **Automated Reasoning**: Implement inference rules for business logic
4. **Data Quality**: Use semantic constraints for data validation
5. **Integration APIs**: Build REST APIs over the semantic layer

### 📊 **Analytics and Insights**
1. **Graph Analytics**: Use NetworkX for advanced graph analysis
2. **Pattern Discovery**: Find interesting patterns in the semantic data
3. **Anomaly Detection**: Identify semantic inconsistencies
4. **Recommendation Systems**: Build recommendations using semantic similarity
5. **Predictive Models**: Create ML models using semantic features

---

**Happy experimenting! 🎉**

The semantic infrastructure is now ready for building sophisticated knowledge-driven applications.

### 🏢 Level 1: Organization - EuroEnergy Trading Solutions

**What this creates:** A real organization instance in our semantic knowledge graph that demonstrates how business entities are represented.

### 📋 Level 2: Data Business Canvas - Renewable Energy Trading Strategy

**What this creates:** A complete business strategy instance showing how organizations plan their data-driven initiatives.

### 📄 Level 3: Statement of Work - Power Generation Analytics Implementation

**What this creates:** A detailed SOW contract instance that bridges business requirements to technical implementation.

In [None]:
# Level 1: Organization - EuroEnergy Trading Solutions
# Builds the data for a real organization instance that shows how business
# entities are represented in the semantic knowledge graph.

def create_euroenergy_organization():
    """Generate realistic organization data for European power trading company"""
    
    # Create SPARQL INSERT query to add real organization data
    # Plain English: "Create EuroEnergy Trading Solutions as a real organization instance
    # with headquarters in Amsterdam, focusing on renewable energy trading"
    org_insert_query = """
    INSERT DATA {
        <https://agentic-data-scraper.com/instances/org/euroenergy-trading> a gist:Organization ;
            rdfs:label "EuroEnergy Trading Solutions B.V." ;
            gist:hasName "EuroEnergy Trading Solutions" ;
            gist:isLocatedAt <https://agentic-data-scraper.com/instances/place/amsterdam> ;
            bridge:hasBusinessFocus "Renewable Energy Trading" ;
            bridge:operatesInMarket "European Power Exchange" ;
            bridge:hasRegulation "EU Renewable Energy Directive 2018/2001" .
            
        <https://agentic-data-scraper.com/instances/place/amsterdam> a gist:Place ;
            rdfs:label "Amsterdam, Netherlands" ;
            gist:hasName "Amsterdam" .
            
        <https://agentic-data-scraper.com/instances/person/ceo-martinez> a gist:Person ;
            rdfs:label "Elena Martinez" ;
            gist:hasName "Elena Martinez" ;
            bridge:hasRole "Chief Executive Officer" ;
            bridge:worksFor <https://agentic-data-scraper.com/instances/org/euroenergy-trading> .
    }
    """
    
    print("🏢 Creating EuroEnergy Trading Solutions Organization")
    print("=" * 55)
    print("📍 Location: Amsterdam, Netherlands")
    print("🎯 Focus: Renewable Energy Trading in European Markets")
    print("👤 CEO: Elena Martinez")
    print("📋 Regulation: EU Renewable Energy Directive 2018/2001")
    
    # Visualization of organization structure
    org_data = {
        'Company': ['EuroEnergy Trading Solutions B.V.'],
        'Location': ['Amsterdam, Netherlands'],
        'CEO': ['Elena Martinez'],
        'Market Focus': ['European Power Exchange'],
        'Regulation': ['EU Directive 2018/2001']
    }
    
    org_df = pd.DataFrame(org_data)
    
    # Create visual representation
    fig, ax = plt.subplots(figsize=(12, 6))
    
    # Company info as organized layout
    y_pos = [4, 3, 2, 1, 0]
    labels = ['Company', 'Location', 'CEO', 'Market Focus', 'Regulation']
    values = ['EuroEnergy Trading Solutions B.V.', 'Amsterdam, Netherlands', 
              'Elena Martinez', 'European Power Exchange', 'EU Directive 2018/2001']
    
    colors = ['#FF6B6B', '#4ECDC4', '#45B7D1', '#96CEB4', '#FFEAA7']
    
    for i, (label, value, color) in enumerate(zip(labels, values, colors)):
        ax.barh(y_pos[i], 1, color=color, alpha=0.7)
        ax.text(0.05, y_pos[i], f"{label}: {value}", 
                va='center', fontweight='bold', fontsize=10)
    
    ax.set_xlim(0, 1)
    ax.set_ylim(-0.5, 4.5)
    ax.set_yticks([])
    ax.set_xticks([])
    ax.set_title('🏢 EuroEnergy Trading Solutions - Organization Profile', 
                 fontsize=14, fontweight='bold', pad=20)
    
    # Remove spines
    for spine in ax.spines.values():
        spine.set_visible(False)
    
    plt.tight_layout()
    plt.show()
    
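    # Note: org_insert_query is only built above; this function does not send it to the graph.
    print("\n✅ Organization profile prepared (INSERT query built, not yet executed)")
    print("🔗 This demonstrates Level 1 (Gist Foundation) with real business entity")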

# Execute organization creation
create_euroenergy_organization()

### Power Generation SOW Example

This section demonstrates how semantic knowledge graphs bridge high-level business requirements to specific technical implementations. The Statement of Work (SOW) represents a formal contract that:

- **Translates Business Needs**: Converts executive targets into actionable technical requirements
- **Defines Scope**: Specifies exactly what data processing capabilities will be built
- **Establishes Accountability**: Links technical deliverables to business stakeholders
- **Enables Traceability**: Creates a semantic chain from business strategy to code implementation

The SOW instance shows how semantic technologies can automatically generate contract documents that maintain formal linkages between business intent and technical execution, enabling automated compliance checking and impact analysis.
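
The traceability chain can be illustrated with a small pure-Python walk over triples. This is a sketch under assumptions: the four predicates are the ones used in the reasoning queries earlier in this notebook, while the `trace` helper and the instance identifiers are hypothetical.

```python
CHAIN = [
    "bridge:hasBusinessModel",
    "bridge:implementedBySOW",
    "bridge:realizesContract",
    "bridge:executedByTask",
]

def trace(start, triples, chain=CHAIN):
    """Follow the predicate chain from start.

    Returns (reached, missing): the last set of nodes reached and the first
    predicate that had no matching triple, or None if the chain is complete.
    """
    frontier = {start}
    for predicate in chain:
        next_nodes = {o for s, p, o in triples if p == predicate and s in frontier}
        if not next_nodes:
            return frontier, predicate
        frontier = next_nodes
    return frontier, None

# Hypothetical instance data for illustration.
triples = [
    ("org:euroenergy", "bridge:hasBusinessModel", "canvas:trading"),
    ("canvas:trading", "bridge:implementedBySOW", "sow:power-gen"),
    ("sow:power-gen", "bridge:realizesContract", "contract:ingest"),
    ("contract:ingest", "bridge:executedByTask", "task:load-prices"),
]

complete = trace("org:euroenergy", triples)
broken = trace("org:euroenergy", triples[:2])
print(complete)
print(broken)
```

In the second call the walk stops at the SOW and reports `bridge:realizesContract` as the missing link, which is the kind of gap an automated compliance check would flag.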

# 🚀 Next Steps & Further Exploration

## Immediate Applications
Now that you understand semantic knowledge graphs, consider these next steps:

### 🏢 For Your Organization
1. **Inventory Current Data**: Map existing data sources and their relationships
2. **Identify Use Cases**: Find high-value scenarios where semantic graphs provide advantages
3. **Start Small**: Begin with a focused domain before expanding enterprise-wide
4. **Build Expertise**: Train team members on semantic technologies and graph thinking

### 🛠️ Technical Development
1. **Production Deployment**: Scale from notebook to enterprise-grade systems
2. **Integration Patterns**: Connect semantic graphs with existing data pipelines
3. **Performance Optimization**: Implement caching, indexing, and query optimization
4. **Security & Governance**: Establish access controls and data lineage tracking

## Advanced Topics to Explore

### 🎓 Learning Path
- **Ontology Engineering**: Formal methods for knowledge modeling
- **Semantic Web Standards**: W3C specifications for interoperability
- **Graph Databases**: Neo4j, Amazon Neptune, and other specialized platforms
- **Machine Learning**: Embedding semantic graphs in AI/ML workflows

### 📚 Recommended Resources
- **Books**: "Semantic Web for the Working Ontologist" by Allemang & Hendler
- **Standards**: W3C RDF, OWL, and SPARQL specifications
- **Tools**: Protégé for ontology development, GraphDB for enterprise deployment
- **Communities**: Semantic Web community forums and working groups

### 🌐 Industry Applications
- **Healthcare**: Clinical decision support and drug discovery
- **Finance**: Risk analysis and regulatory reporting
- **Manufacturing**: Supply chain optimization and quality management
- **Government**: Policy analysis and citizen services

## The Future of Intelligent Data

Semantic knowledge graphs represent the foundation for truly intelligent information systems. As you've seen, they enable:
- **Contextual Understanding**: Data that knows its meaning and relationships
- **Adaptive Systems**: Architectures that evolve with changing business needs
- **Human-AI Collaboration**: Interfaces that support both human insight and machine intelligence
- **Organizational Learning**: Systems that capture and leverage institutional knowledge

---

**Congratulations!** You've completed a comprehensive introduction to semantic knowledge graphs. The journey from data to insight begins with understanding relationships—and you now have the tools to build those connections.

*Ready to transform your organization's data into intelligent assets? The semantic web awaits your contributions.*

# 📁 Export and Save Results

## ⚠️ Prerequisites - READ THIS FIRST!

**Before running the export cell below, you MUST:**

1. **Initialize the Knowledge Graph**: Run the cell that contains `kg = SemanticKnowledgeGraph()`
2. **Load Data**: Ensure the knowledge graph has been populated with data from your analysis
3. **Verify Connection**: The `kg` variable must be available in your notebook session

## 🎯 What This Export Does

This export functionality will generate a comprehensive backup of your semantic knowledge graph analysis including:

### 📊 RDF Data Formats
- **Turtle** (.ttl) - Human-readable RDF format
- **XML** (.rdf) - Standard RDF/XML format  
- **N-Triples** (.n3) - Line-oriented RDF format (the conventional N-Triples extension is `.nt`; `.n3` normally denotes Notation3)
- **JSON-LD** (.jsonld) - JSON-based linked data format

### 📈 Analysis Results
- Statistical summaries and insights
- Class distribution data
- Ontology level analysis
- Entity relationship mappings

### 🎨 Visualization Data
- Network graph node/edge data
- Interactive visualization components
- Chart and diagram exports

### 📋 Summary Report
- Comprehensive markdown report
- File inventory and descriptions
- Integration guidance

## 🚀 Expected Output

After successful execution, you will have:
- All files saved to `semantic_exports/` directory
- Timestamped filenames for version control
- Ready-to-use data for external systems
- Professional documentation for stakeholders
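
The timestamped naming scheme could look like the sketch below. It is an illustration, not the export cell itself: the `export_paths` helper, the `knowledge_graph` file stem, and the `%Y%m%d_%H%M%S` timestamp format are assumptions, and the extension mapping is copied from the format list above. The fixed date and the temporary directory are only there to keep the example repeatable.

```python
import tempfile
from datetime import datetime
from pathlib import Path

# Format label -> file extension, as listed in this section.
EXTENSIONS = {
    "turtle": "ttl",
    "xml": "rdf",
    "n-triples": "n3",
    "json-ld": "jsonld",
}

def export_paths(base_dir, stem, now=None):
    """Create base_dir if needed and return one timestamped path per format."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    out_dir = Path(base_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    return {fmt: out_dir / f"{stem}_{stamp}.{ext}" for fmt, ext in EXTENSIONS.items()}

with tempfile.TemporaryDirectory() as tmp:
    paths = export_paths(
        Path(tmp) / "semantic_exports",
        "knowledge_graph",
        now=datetime(2024, 1, 2, 3, 4, 5),  # arbitrary fixed time for the demo
    )
    names = sorted(path.name for path in paths.values())

print(names)
```

Putting the timestamp in the filename means repeated exports never overwrite each other, which supports the snapshot and audit-trail goals described earlier.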

## 🛑 What If It Fails?

If you see errors like `name 'kg' is not defined`, one of the following is likely true:
1. You haven't run the knowledge graph initialization cell yet
2. The notebook session was restarted
3. The `kg` variable is out of scope

**Solution**: Go back and run the cells that create and populate the knowledge graph first!
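
A quick guard at the top of the export cell can turn the `NameError` into a clearer message. The helper below is a hypothetical sketch: `missing_names` is not part of the notebook, and it only checks that the name exists, not that the graph has been populated.

```python
def missing_names(namespace, required=("kg",)):
    """Return the required variable names that are absent from a namespace dict."""
    return [name for name in required if name not in namespace]

missing = missing_names(globals())
if missing:
    print(f"Run the knowledge graph setup cells first; missing: {', '.join(missing)}")
else:
    print("Knowledge graph variable found; safe to run the export.")
```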