[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Hawksight-AI/semantica/blob/main/cookbook/use_cases/renewable_energy/02_Smart_Grid_Management.ipynb)

# Smart Grid Management - Stream Processing & Real-Time Monitoring

## Overview

This notebook demonstrates **smart grid management** using Semantica, with a focus on **stream processing**, **real-time monitoring**, and **failure prediction**. The pipeline ingests grid sensor data, builds a temporal knowledge graph from it, and uses temporal pattern detection to monitor grid health and flag likely failures.

### Key Features

- **Stream Processing**: Ingests grid sensor readings as a stream (a sample log file stands in for the live stream here)
- **Real-Time Monitoring**: Monitors grid health in real time
- **Failure Prediction**: Uses temporal pattern detection to predict grid failures
- **Anomaly Detection**: Detects anomalies in grid sensor data
- **Temporal Pattern Detection**: Identifies patterns in sensor data streams

### Pipeline Architecture

1. **Phase 0**: Setup & Configuration
2. **Phase 1**: Grid Sensor Stream Ingestion
3. **Phase 2**: Real-Time Data Parsing
4. **Phase 3**: Sensor Entity Extraction
5. **Phase 4**: Temporal Knowledge Graph Construction
6. **Phase 5**: Real-Time Monitoring
7. **Phase 6**: Failure Prediction & Anomaly Detection
8. **Phase 7**: Alert Generation & Visualization

---

## Installation


In [None]:
%pip install -qU semantica networkx matplotlib plotly pandas groq


---

## Phase 0: Setup & Configuration


In [None]:
import os
from semantica.core import Semantica, ConfigManager
from semantica.ingest import StreamIngestor

os.environ["GROQ_API_KEY"] = os.getenv("GROQ_API_KEY", "your-key")

config_dict = {
    "project_name": "Smart_Grid_Management",
    "extraction": {"provider": "groq", "model": "llama-3.1-8b-instant"},
    "knowledge_graph": {"backend": "networkx", "temporal": True}
}

config = ConfigManager().load_from_dict(config_dict)
core = Semantica(config=config)
print("Configured for smart grid management with stream processing focus")


---

## Phase 1: Real Data Ingestion (Sensor Stream)

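Ingest grid sensor data. In production the readings would arrive over a live stream (for example Kafka) through `StreamIngestor`; as a fallback, this notebook writes a small sample sensor log to disk and loads it with `FileIngestor`.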


In [None]:
from semantica.ingest import StreamIngestor, FileIngestor
from semantica.normalize import TextNormalizer
from semantica.split import TextSplitter
import os

os.makedirs("data", exist_ok=True)

# Option 1: Ingest from sensor stream (simulated Kafka)
# In production: stream_ingestor = StreamIngestor()
# stream_documents = stream_ingestor.ingest("kafka://localhost:9092/grid-sensors", method="kafka")

# Fallback: Sample sensor stream data
sensor_data = """
2024-01-01 10:00:00 - Sensor S001: Voltage 230V, Status: Normal
2024-01-01 10:01:00 - Sensor S002: Voltage 225V, Status: Normal
2024-01-01 10:02:00 - Sensor S001: Voltage 210V, Status: Warning (voltage drop)
2024-01-01 10:03:00 - Sensor S003: Voltage 200V, Status: Alert (potential failure)
2024-01-01 10:04:00 - Sensor S001: Voltage 205V, Status: Warning
"""

with open("data/grid_sensors.txt", "w") as f:
    f.write(sensor_data)

documents = FileIngestor().ingest("data/grid_sensors.txt")
print(f"Ingested {len(documents)} documents from sensor stream")
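
The sample log lines above follow a regular shape (timestamp, sensor ID, voltage, status, optional note), so they can also be parsed into structured records before any LLM-based extraction. The following is a minimal sketch using only the standard library; the regular expression and the `parse_sensor_line` helper are illustrative assumptions based on the sample format, not part of Semantica.

```python
import re
from datetime import datetime

# Pattern for the sample log format used in this notebook (an assumption:
# real sensor feeds may use a different layout).
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) - "
    r"Sensor (?P<sensor>\w+): Voltage (?P<voltage>\d+(?:\.\d+)?)V, "
    r"Status: (?P<status>\w+)(?: \((?P<note>[^)]*)\))?$"
)

def parse_sensor_line(line):
    """Return a dict for one log line, or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return {
        "timestamp": datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S"),
        "sensor": match["sensor"],
        "voltage": float(match["voltage"]),
        "status": match["status"],
        "note": match["note"],  # None when the line has no parenthesised note
    }

sample = """
2024-01-01 10:00:00 - Sensor S001: Voltage 230V, Status: Normal
2024-01-01 10:02:00 - Sensor S001: Voltage 210V, Status: Warning (voltage drop)
"""

# Blank lines do not match the pattern and are dropped.
readings = [r for r in map(parse_sensor_line, sample.splitlines()) if r]
for r in readings:
    print(r["timestamp"], r["sensor"], r["voltage"], r["status"], r["note"])
```

Structured records like these are convenient for threshold checks and time-ordered analysis alongside the knowledge graph.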


---

## Phase 2: Text Normalization & Advanced Chunking

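Normalize the sensor text, then split it into chunks. Token chunking produces fixed-size chunks; sentence chunking (commented out in the cell) is an alternative for structured sensor logs.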


In [None]:
# Normalize sensor data
normalizer = TextNormalizer()
normalized_documents = []
for doc in documents:
    normalized_text = normalizer.normalize(
        doc.content if hasattr(doc, 'content') else str(doc),
        clean_html=True,
        normalize_entities=True,
        normalize_numbers=True,
        remove_extra_whitespace=True
    )
    normalized_documents.append(normalized_text)

print(f"Normalized {len(normalized_documents)} documents")

# Use token chunking for fixed-size sensor data chunks
# Alternative: sentence chunking for structured sensor logs
splitter = TextSplitter(method="token", chunk_size=100, chunk_overlap=10)
# splitter = TextSplitter(method="sentence", chunk_size=500, chunk_overlap=50)

chunked_docs = []
for doc_text in normalized_documents:
    chunks = splitter.split(doc_text)
    chunked_docs.extend([chunk.content if hasattr(chunk, 'content') else str(chunk) for chunk in chunks])

print(f"Created {len(chunked_docs)} chunks using token chunking")
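
To make the `chunk_size` and `chunk_overlap` parameters concrete, here is a minimal sketch of fixed-size chunking with overlap. It splits on whitespace as a stand-in for a real tokenizer, and the `chunk_tokens` helper is hypothetical; it illustrates the general technique and is not how `TextSplitter` is necessarily implemented.

```python
def chunk_tokens(text, chunk_size, chunk_overlap):
    """Split text into chunks of chunk_size whitespace tokens.

    Consecutive chunks share chunk_overlap tokens.
    """
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    tokens = text.split()
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + chunk_size]))
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks

demo = " ".join(f"t{i}" for i in range(10))
chunks = chunk_tokens(demo, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
# t0 t1 t2 t3
# t3 t4 t5 t6
# t6 t7 t8 t9
```

The overlap means a reading that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated tokens.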


---

## Phase 3-4: Temporal Knowledge Graph Construction

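Build a temporal knowledge graph from the chunks, then query it with `TemporalGraphQuery`: first for alerts at a specific point in time, then for how the graph evolves.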


In [None]:
from semantica.kg import TemporalGraphQuery

# Build temporal knowledge graph
result = core.build_knowledge_base(
    sources=chunked_docs,
    custom_entity_types=["Sensor", "Grid", "Failure", "Alert", "Prediction"],
    graph=True,
    temporal=True
)

kg = result["knowledge_graph"]

# Initialize temporal graph query engine
temporal_query = TemporalGraphQuery(
    enable_temporal_reasoning=True,
    temporal_granularity="minute"  # Fine-grained for real-time sensors
)

# Query graph at specific time point
query_results = temporal_query.query_at_time(
    kg,
    query={"type": "Alert"},
    at_time="2024-01-01 10:03:00"
)

# Analyze temporal evolution
evolution = temporal_query.analyze_evolution(kg)

print(f"Built temporal grid KG with {len(kg.get('entities', []))} entities")
print(f"Temporal queries: {len(query_results)} alerts at query time")
print("Focus: Stream processing, real-time monitoring, failure prediction, anomaly detection")
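
The idea behind a point-in-time query can be sketched independently of the library: keep a validity interval on each entity and select those whose interval contains the query time. The entity layout (`valid_from`, `valid_until`) and the `entities_at` helper below are hypothetical and only illustrate the concept; the structure Semantica actually stores and returns may differ.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"

def entities_at(entities, entity_type, at_time):
    """Return entities of entity_type whose validity interval contains at_time."""
    t = datetime.strptime(at_time, FMT)
    selected = []
    for entity in entities:
        if entity["type"] != entity_type:
            continue
        start = datetime.strptime(entity["valid_from"], FMT)
        end = entity.get("valid_until")
        # An open-ended interval (no valid_until) is treated as still valid.
        if start <= t and (end is None or t < datetime.strptime(end, FMT)):
            selected.append(entity)
    return selected

# Hypothetical entities modelled on the sample sensor log.
entities = [
    {"id": "alert-S003", "type": "Alert",
     "valid_from": "2024-01-01 10:03:00", "valid_until": None},
    {"id": "warning-S001", "type": "Warning",
     "valid_from": "2024-01-01 10:02:00", "valid_until": None},
]

print([e["id"] for e in entities_at(entities, "Alert", "2024-01-01 10:03:00")])
# ['alert-S003']
print([e["id"] for e in entities_at(entities, "Alert", "2024-01-01 10:02:00")])
# []
```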


---

## Phase 5-6: Real-Time Monitoring & Failure Prediction


In [None]:
from semantica.reasoning import GraphReasoner

# Detect failure patterns
reasoner = GraphReasoner(kg)
failure_patterns = reasoner.find_patterns(pattern_type="failure")

# Temporal pattern detection
temporal_patterns = temporal_query.detect_temporal_patterns(kg, pattern_type="sequence")

# Identify alerts
alerts = [e for e in kg.get("entities", []) if e.get("type") == "Alert"]

print(f"Real-time monitoring: {len(alerts)} alerts generated")
print(f"Failure prediction: {len(failure_patterns)} failure patterns detected")
print(f"Temporal patterns: {len(temporal_patterns)} temporal patterns detected")
print("\n=== Pipeline Summary ===")
print(f"✓ Ingested {len(documents)} documents from sensor stream")
print(f"✓ Normalized {len(normalized_documents)} documents")
print(f"✓ Created {len(chunked_docs)} chunks using token chunking")
print(f"✓ Built temporal KG with {len(kg.get('entities', []))} entities")
print("✓ This cookbook emphasizes stream processing, temporal KGs, and real-time monitoring")
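
For comparison with the graph-based pattern detection, anomaly detection on the raw readings can be sketched with plain thresholds. The voltage limits below are assumptions chosen only so that the result matches the statuses in the sample log; real limits depend on the grid's nominal voltage and tolerance. The `classify` and `sustained_drops` helpers are hypothetical.

```python
# Assumed thresholds (volts), picked to reproduce the sample log's statuses.
WARNING_BELOW = 220.0
ALERT_BELOW = 205.0

def classify(voltage):
    """Map a voltage reading to a status label."""
    if voltage < ALERT_BELOW:
        return "Alert"
    if voltage < WARNING_BELOW:
        return "Warning"
    return "Normal"

def sustained_drops(readings, min_run=2):
    """Return sensors with at least min_run consecutive non-Normal readings.

    readings is an iterable of (sensor_id, voltage) pairs in time order.
    """
    runs = {}        # current streak of non-Normal readings per sensor
    flagged = set()
    for sensor, voltage in readings:
        if classify(voltage) == "Normal":
            runs[sensor] = 0
        else:
            runs[sensor] = runs.get(sensor, 0) + 1
            if runs[sensor] >= min_run:
                flagged.add(sensor)
    return sorted(flagged)

# Voltages from the sample sensor log, in time order.
readings = [("S001", 230), ("S002", 225), ("S001", 210), ("S003", 200), ("S001", 205)]

print([classify(v) for _, v in readings])
# ['Normal', 'Normal', 'Warning', 'Alert', 'Warning']
print(sustained_drops(readings))
# ['S001']
```

A single low reading may be noise, whereas a run of low readings from the same sensor is a simple temporal pattern worth escalating; `min_run` controls that trade-off.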


---

## Phase 7: Visualization


In [None]:
from semantica.visualization import KGVisualizer

visualizer = KGVisualizer()
visualizer.visualize(kg, output_path="smart_grid_kg.html", layout="temporal")

print("Smart grid management analysis complete")
print("Emphasizes: Stream processing, real-time monitoring, failure prediction, anomaly detection")
