[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Hawksight-AI/semantica/blob/main/cookbook/use_cases/supply_chain/01_Supply_Chain_Data_Integration.ipynb)

# Supply Chain Data Integration Pipeline

## Overview

This notebook shows how to use Python/FastMCP MCP servers as data sources for supply chain data ingestion. It connects to a supply chain database MCP server via URL, ingests logistics, inventory, and shipment data, and then builds a supply chain knowledge graph from the results.

**IMPORTANT**: This implementation supports ONLY Python-based MCP servers and FastMCP servers. Users can bring their own Python/FastMCP MCP servers via URL connections.


**Documentation**: [API Reference](https://semantica.readthedocs.io/use-cases/)

## Installation

Install Semantica from PyPI:

```bash
pip install semantica
# Or with all optional dependencies:
pip install semantica[all]
```

### Modules Used (20+)

- **Ingestion**: FileIngestor, WebIngestor, FeedIngestor, StreamIngestor, DBIngestor, EmailIngestor, RepoIngestor, MCPIngestor
- **Parsing**: MCPParser, JSONParser, StructuredDataParser, CSVParser
- **Extraction**: NERExtractor, RelationExtractor, EventDetector, TripletExtractor
- **KG**: GraphBuilder, TemporalGraphQuery, GraphAnalyzer, ConnectivityAnalyzer
- **Analytics**: CentralityCalculator, CommunityDetector
- **Reasoning**: InferenceEngine, RuleManager, ExplanationGenerator
- **Export**: JSONExporter, CSVExporter, RDFExporter, ReportGenerator
- **Visualization**: KGVisualizer, TemporalVisualizer, AnalyticsVisualizer

### Pipeline

**Connect to Supply Chain MCP Server → Ingest Logistics Data via MCP → Parse MCP Responses → Extract Supply Chain Entities → Build Supply Chain KG → Analyze Supply Chain → Generate Reports → Visualize**

---

## Step 1: Connect to Supply Chain Database MCP Server

Connect to a Python/FastMCP MCP server that provides supply chain data via URL. The MCP server can expose resources (inventory databases, shipment records) and tools (logistics queries, inventory checks).
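
The connection cell in this step passes placeholder credentials in a `headers` dictionary. As a minimal sketch of one way to avoid hard-coding them, the helper below reads credentials from environment variables and returns no headers for local servers. The variable names `SUPPLY_CHAIN_MCP_TOKEN` and `SUPPLY_CHAIN_MCP_API_KEY` and the function `build_auth_headers` are hypothetical, not part of Semantica.

```python
import os
from urllib.parse import urlparse

# Hypothetical environment variable names; adjust to your deployment.
TOKEN_VAR = "SUPPLY_CHAIN_MCP_TOKEN"
API_KEY_VAR = "SUPPLY_CHAIN_MCP_API_KEY"
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def build_auth_headers(url, env=None):
    """Return auth headers for remote servers and no headers for local ones."""
    env = os.environ if env is None else env
    host = urlparse(url).hostname or ""
    if host in LOCAL_HOSTS:
        return {}
    headers = {}
    token = env.get(TOKEN_VAR)
    if token:
        headers["Authorization"] = f"Bearer {token}"
    api_key = env.get(API_KEY_VAR)
    if api_key:
        headers["X-API-Key"] = api_key
    return headers


# Passing env explicitly keeps the example independent of the real environment.
local_headers = build_auth_headers("http://localhost:8000/mcp", env={})
remote_headers = build_auth_headers(
    "https://api.example.com/supplychain-mcp",
    env={TOKEN_VAR: "demo-token"},
)
print(local_headers)
print(remote_headers)
```

The returned dictionary has the same shape as the `headers` argument used in the connection cell, so it could be passed there in place of the inline literal.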


In [None]:
!pip install semantica


In [None]:
from semantica.ingest import MCPIngestor, ingest_mcp, DBIngestor, FileIngestor
from semantica.parse import MCPParser, JSONParser, StructuredDataParser, CSVParser
from semantica.semantic_extract import NERExtractor, RelationExtractor, EventDetector, TripletExtractor
from semantica.kg import GraphBuilder, TemporalGraphQuery, GraphAnalyzer, ConnectivityAnalyzer
from semantica.kg import CentralityCalculator, CommunityDetector
from semantica.reasoning import InferenceEngine, RuleManager, ExplanationGenerator
from semantica.export import JSONExporter, CSVExporter, RDFExporter, ReportGenerator
from semantica.visualization import KGVisualizer, TemporalVisualizer, AnalyticsVisualizer
import json
from datetime import datetime, timedelta

# Initialize MCP ingestor
mcp_ingestor = MCPIngestor()

# Connect to supply chain database MCP server via URL
# Replace with your actual MCP server URL
# Example: http://localhost:8000/mcp or https://api.example.com/supplychain-mcp
supply_chain_mcp_url = "http://localhost:8000/mcp"

# Connect to MCP server with authentication (if required)
mcp_ingestor.connect(
    "supply_chain_server",
    url=supply_chain_mcp_url,
    headers={
        "Authorization": "Bearer your_token",
        "X-API-Key": "your_api_key"
    } if "api.example.com" in supply_chain_mcp_url else {}
)

# List available resources (inventory databases, shipment records)
resources = mcp_ingestor.list_available_resources("supply_chain_server")
print(f"\n📊 Available Resources ({len(resources)}):")
for resource in resources[:5]:  # Show first 5
    print(f"  - {resource.uri}: {resource.name}")
    if resource.description:
        print(f"    {resource.description[:80]}...")

# List available tools (logistics queries, inventory checks)
tools = mcp_ingestor.list_available_tools("supply_chain_server")
print(f"\n🔧 Available Tools ({len(tools)}):")
for tool in tools[:5]:  # Show first 5
    print(f"  - {tool.name}: {tool.description or 'No description'}")


## Step 2: Ingest Supply Chain Data from MCP Server

Ingest logistics data, inventory, and shipment information using both resource-based and tool-based methods.
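
The cell in this step falls back to sample data only when the ingestion calls return nothing; if a call raises instead, the fallback is never reached. The sketch below shows one possible way to cover both cases with plain Python. `ingest_with_fallback` and `unreachable_server` are hypothetical names, and the fetchers here are stand-ins rather than real MCP calls.

```python
def ingest_with_fallback(fetchers, fallback):
    """Run each fetcher, keep non-empty results, and use fallback if none succeed."""
    collected, errors = [], []
    for name, fetch in fetchers.items():
        try:
            result = fetch()
        except Exception as exc:  # broad on purpose: any failure counts as "no data"
            errors.append((name, repr(exc)))
            continue
        if result:
            collected.append(result)
    if not collected:
        collected.append(fallback)
    return collected, errors


def unreachable_server():
    # Stand-in for an ingestion call against a server that is down.
    raise ConnectionError("MCP server not reachable")


fetchers = {
    "inventory": unreachable_server,  # raises
    "shipments": lambda: [],          # succeeds but returns nothing
}
sample = {"inventory": [], "shipments": []}
data, errors = ingest_with_fallback(fetchers, fallback=sample)
print(data)
print(errors)
```

In the notebook, each fetcher would be a small lambda wrapping one of the `mcp_ingestor.ingest_resources(...)` or `mcp_ingestor.ingest_tool_output(...)` calls.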


In [None]:
# Initialize parsers
mcp_parser = MCPParser()
json_parser = JSONParser()
structured_parser = StructuredDataParser()
csv_parser = CSVParser()

supply_chain_data = []

# Method 1: Resource-based ingestion
# Ingest from MCP resources (inventory databases, shipment records)
inventory_data = mcp_ingestor.ingest_resources(
    "supply_chain_server",
    resource_uris=["resource://inventory/database", "resource://shipments/records"]
)

for item in inventory_data:
    supply_chain_data.append(item)
    print(f"  Ingested resource: {item}")

# Method 2: Tool-based ingestion
# Call MCP tools to retrieve data dynamically
# Example: Query inventory levels
inventory_levels = mcp_ingestor.ingest_tool_output(
    "supply_chain_server",
    tool_name="query_inventory",
    arguments={
        "warehouse_id": "WH001",
        "product_category": "Electronics"
    }
)

if inventory_levels:
    supply_chain_data.append(inventory_levels)
    print("  Retrieved inventory levels")

# Example: Get shipment status
shipment_status = mcp_ingestor.ingest_tool_output(
    "supply_chain_server",
    tool_name="get_shipment_status",
    arguments={
        "shipment_id": "SH001",
        "include_tracking": True
    }
)

if shipment_status:
    supply_chain_data.append(shipment_status)
    print("  Retrieved shipment status")

# Fallback sample data, used when the ingestion calls above returned nothing
if not supply_chain_data:
    sample_data = {
        "inventory": [
            {
                "warehouse_id": "WH001",
                "product_id": "P001",
                "product_name": "Laptop",
                "quantity": 150,
                "location": "Aisle 3, Shelf 2",
                "last_updated": (datetime.now() - timedelta(days=1)).isoformat()
            },
            {
                "warehouse_id": "WH001",
                "product_id": "P002",
                "product_name": "Mouse",
                "quantity": 500,
                "location": "Aisle 1, Shelf 5",
                "last_updated": (datetime.now() - timedelta(hours=12)).isoformat()
            },
            {
                "warehouse_id": "WH002",
                "product_id": "P001",
                "product_name": "Laptop",
                "quantity": 200,
                "location": "Aisle 2, Shelf 1",
                "last_updated": datetime.now().isoformat()
            }
        ],
        "shipments": [
            {
                "shipment_id": "SH001",
                "origin": "WH001",
                "destination": "WH002",
                "product_id": "P001",
                "quantity": 50,
                "status": "in_transit",
                "estimated_arrival": (datetime.now() + timedelta(days=2)).isoformat(),
                "timestamp": (datetime.now() - timedelta(days=1)).isoformat()
            },
            {
                "shipment_id": "SH002",
                "origin": "WH002",
                "destination": "Customer A",
                "product_id": "P002",
                "quantity": 100,
                "status": "delivered",
                "estimated_arrival": (datetime.now() - timedelta(days=1)).isoformat(),
                "timestamp": (datetime.now() - timedelta(days=3)).isoformat()
            }
        ]
    }
    supply_chain_data.append(sample_data)
    print(f"  Loaded {len(sample_data['inventory'])} inventory records")
    print(f"  Loaded {len(sample_data['shipments'])} shipment records")

print(f"\n📊 Total supply chain data items ingested: {len(supply_chain_data)}")


## Step 3: Parse Supply Chain Data

Parse the supply chain data received from MCP server responses.
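
A response payload may arrive as a dictionary, a JSON string, or raw bytes. The sketch below uses only the standard library to coerce all three into a dictionary and then pull out the two record lists. `normalize_payload` and `split_records` are hypothetical helpers for illustration and do not reflect how `MCPParser` works internally.

```python
import json


def normalize_payload(payload):
    """Coerce a dict, JSON string, or UTF-8 JSON bytes into a Python dict."""
    if isinstance(payload, dict):
        return payload
    if isinstance(payload, (bytes, bytearray)):
        payload = payload.decode("utf-8")
    if isinstance(payload, str):
        decoded = json.loads(payload)
        if isinstance(decoded, dict):
            return decoded
        raise ValueError("expected a JSON object at the top level")
    raise TypeError(f"unsupported payload type: {type(payload).__name__}")


def split_records(item):
    """Return (inventory, shipments); a payload may carry one list, both, or neither."""
    inventory = list(item.get("inventory", []))
    shipments = list(item.get("shipments", []))
    return inventory, shipments


raw = (
    b'{"inventory": [{"warehouse_id": "WH001", "product_id": "P001"}], '
    b'"shipments": [{"shipment_id": "SH001"}]}'
)
inventory, shipments = split_records(normalize_payload(raw))
print(len(inventory), len(shipments))
```

Reading both keys independently matters for payloads shaped like the sample data, where a single dictionary holds inventory and shipments together.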


In [None]:
parsed_supply_chain_data = []

# Parse MCP responses
for data_item in supply_chain_data:
    # Parse MCP response (handles JSON, text, binary)
    if isinstance(data_item, dict):
        parsed_item = data_item
    else:
        parsed_item = mcp_parser.parse_response(data_item, response_type="json")
    
    parsed_supply_chain_data.append(parsed_item)
    print("  Parsed data item")

# Extract inventory and shipments
inventory_records = []
shipment_records = []

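for item in parsed_supply_chain_data:
    if not isinstance(item, dict):
        continue
    # A single payload (such as the sample data) can carry both collections,
    # so each key is checked independently rather than in one elif chain.
    if "inventory" in item:
        inventory_records.extend(item["inventory"])
    if "shipments" in item:
        shipment_records.extend(item["shipments"])
    # Bare records: warehouse + product marks inventory, shipment_id marks a shipment
    if "warehouse_id" in item and "product_id" in item:
        inventory_records.append(item)
    elif "shipment_id" in item:
        shipment_records.append(item)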


## Step 4: Extract Supply Chain Entities and Relationships

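Extract supply chain entities (warehouses, products, shipments) and the relationships between them from the parsed records. Storage locations are kept as properties on the `stocks` relationship rather than as separate entities. Because the records are already structured, the cell maps fields to entities and relationships directly; the extractor objects it instantiates are not invoked.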
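
The extraction cell creates entities for warehouses, products, and shipments, but a shipment destination such as `Customer A` appears only as a relationship endpoint. The sketch below shows one way to detect such endpoints and register placeholder entities for them. `find_dangling_endpoints` and the `ExternalParty` type label are hypothetical and chosen only for this illustration.

```python
def find_dangling_endpoints(entities, relationships):
    """Return endpoint ids that appear in relationships but have no entity."""
    known = {entity["id"] for entity in entities}
    dangling = []
    for rel in relationships:
        for endpoint in (rel["source"], rel["target"]):
            if endpoint not in known and endpoint not in dangling:
                dangling.append(endpoint)
    return dangling


entities = [
    {"id": "WH002", "type": "Warehouse", "name": "WH002", "properties": {}},
    {"id": "SH002", "type": "Shipment", "name": "SH002", "properties": {}},
    {"id": "P002", "type": "Product", "name": "Mouse", "properties": {}},
]
relationships = [
    {"source": "WH002", "target": "SH002", "type": "ships_from", "properties": {}},
    {"source": "SH002", "target": "Customer A", "type": "ships_to", "properties": {}},
    {"source": "SH002", "target": "P002", "type": "contains", "properties": {}},
]

missing = find_dangling_endpoints(entities, relationships)
# Register each missing endpoint under a placeholder type (hypothetical label).
entities.extend(
    {"id": endpoint, "type": "ExternalParty", "name": endpoint, "properties": {}}
    for endpoint in missing
)
print(missing)
```

Whether a graph builder tolerates edges to unknown nodes depends on its implementation, so running a check like this before building makes the behavior explicit.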


In [None]:
ner_extractor = NERExtractor()
relation_extractor = RelationExtractor()
event_detector = EventDetector()
triplet_extractor = TripletExtractor()

supply_chain_entities = []
supply_chain_relationships = []

# Extract from inventory records
for inventory in inventory_records:
    if isinstance(inventory, dict):
        warehouse_id = inventory.get("warehouse_id", "")
        product_id = inventory.get("product_id", "")
        product_name = inventory.get("product_name", "")
        location = inventory.get("location", "")
        
        # Warehouse entity
        supply_chain_entities.append({
            "id": warehouse_id,
            "type": "Warehouse",
            "name": warehouse_id,
            "properties": {}
        })
        
        # Product entity
        supply_chain_entities.append({
            "id": product_id,
            "type": "Product",
            "name": product_name or product_id,
            "properties": {}
        })
        
        # Warehouse-Product relationship (inventory)
        supply_chain_relationships.append({
            "source": warehouse_id,
            "target": product_id,
            "type": "stocks",
            "properties": {
                "quantity": inventory.get("quantity", 0),
                "location": location,
                "last_updated": inventory.get("last_updated", "")
            }
        })

# Extract from shipment records
for shipment in shipment_records:
    if isinstance(shipment, dict):
        shipment_id = shipment.get("shipment_id", "")
        origin = shipment.get("origin", "")
        destination = shipment.get("destination", "")
        product_id = shipment.get("product_id", "")
        
        # Shipment entity
        supply_chain_entities.append({
            "id": shipment_id,
            "type": "Shipment",
            "name": shipment_id,
            "properties": {
                "status": shipment.get("status", ""),
                "quantity": shipment.get("quantity", 0),
                "estimated_arrival": shipment.get("estimated_arrival", ""),
                "timestamp": shipment.get("timestamp", "")
            }
        })
        
        # Origin-Destination relationships
        if origin:
            supply_chain_relationships.append({
                "source": origin,
                "target": shipment_id,
                "type": "ships_from",
                "properties": {"timestamp": shipment.get("timestamp", "")}
            })
        
        if destination:
            supply_chain_relationships.append({
                "source": shipment_id,
                "target": destination,
                "type": "ships_to",
                "properties": {"timestamp": shipment.get("timestamp", "")}
            })
        
        # Shipment-Product relationship
        if product_id:
            supply_chain_relationships.append({
                "source": shipment_id,
                "target": product_id,
                "type": "contains",
                "properties": {"quantity": shipment.get("quantity", 0)}
            })

# Remove duplicates
seen_entities = set()
unique_entities = []
for entity in supply_chain_entities:
    entity_key = (entity["id"], entity["type"])
    if entity_key not in seen_entities:
        seen_entities.add(entity_key)
        unique_entities.append(entity)

supply_chain_entities = unique_entities

print(f"  - Warehouses: {len([e for e in supply_chain_entities if e['type'] == 'Warehouse'])}")
print(f"  - Products: {len([e for e in supply_chain_entities if e['type'] == 'Product'])}")
print(f"  - Shipments: {len([e for e in supply_chain_entities if e['type'] == 'Shipment'])}")


## Step 5: Build Supply Chain Knowledge Graph

Build a knowledge graph from the extracted supply chain entities and relationships.
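
To make the reported metrics concrete, the sketch below computes density, degree centrality, and the number of weakly connected components for a small edge list in plain Python. It uses the common textbook definitions (directed density m / (n(n-1)), degree divided by n-1); the Semantica analyzers may define or normalize these values differently.

```python
from collections import defaultdict


def directed_density(num_nodes, num_edges):
    """Edges present divided by edges possible in a directed graph."""
    if num_nodes < 2:
        return 0.0
    return num_edges / (num_nodes * (num_nodes - 1))


def degree_centrality(nodes, edges):
    """Total degree (in + out) of each node divided by n - 1."""
    degree = defaultdict(int)
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    scale = len(nodes) - 1
    return {node: (degree[node] / scale if scale > 0 else 0.0) for node in nodes}


def count_components(nodes, edges):
    """Count weakly connected components with a small union-find."""
    parent = {node: node for node in nodes}

    def find(node):
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for source, target in edges:
        parent[find(source)] = find(target)
    return len({find(node) for node in nodes})


nodes = ["WH001", "WH002", "P001", "P002", "SH001"]
edges = [
    ("WH001", "P001"),   # stocks
    ("WH002", "P001"),   # stocks
    ("WH001", "SH001"),  # ships_from
    ("SH001", "WH002"),  # ships_to
    ("SH001", "P001"),   # contains
]

density = directed_density(len(nodes), len(edges))
centrality = degree_centrality(nodes, edges)
components = count_components(nodes, edges)
print(density, centrality["P001"], components)
```

Here `P002` has no edges, so it forms its own component and has centrality 0.0, which is the kind of isolated node these metrics help surface.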


In [None]:
builder = GraphBuilder()
temporal_query = TemporalGraphQuery()
graph_analyzer = GraphAnalyzer()
connectivity_analyzer = ConnectivityAnalyzer()

# Build knowledge graph
supply_chain_kg = builder.build(supply_chain_entities, supply_chain_relationships)

# Analyze graph structure
metrics = graph_analyzer.compute_metrics(supply_chain_kg)
centrality_calculator = CentralityCalculator()
community_detector = CommunityDetector()
connectivity = connectivity_analyzer.analyze_connectivity(supply_chain_kg)

# Calculate graph metrics
centrality_result = centrality_calculator.calculate_degree_centrality(supply_chain_kg)
centrality_scores = centrality_result.get('centrality', {})
communities = community_detector.detect_communities(supply_chain_kg)

print(f"  Entities: {len(supply_chain_kg.get('entities', []))}")
print(f"  Relationships: {len(supply_chain_kg.get('relationships', []))}")
print(f"  Graph density: {metrics.get('density', 0):.3f}")
print(f"  Communities detected: {len(communities)}")
print(f"  Connected components: {connectivity.get('connected_components', 0)}")


## Step 6: Analyze Supply Chain

Analyze supply chain patterns using temporal queries and inference rules.
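
The three rules in this step are threshold checks over facts. The sketch below evaluates the same thresholds in plain Python so the expected conclusions can be verified by hand. It is an illustration only, not a model of how `InferenceEngine` parses or applies rule strings, and it omits the `product_type(Electronics)` condition because the facts added in the notebook carry no product category. `evaluate_rules` is a hypothetical name.

```python
LOW_STOCK_BELOW = 100       # quantity < 100   -> low_stock_alert
HIGH_INVENTORY_ABOVE = 500  # quantity > 500   -> high_inventory
DELAYED_AFTER_DAYS = 5      # in transit > 5 d -> delayed_shipment


def evaluate_rules(facts):
    """Apply the three thresholds to each fact and return (label, subject) pairs."""
    conclusions = []
    for fact in facts:
        if "warehouse" in fact:  # inventory fact
            subject = f"{fact['warehouse']}/{fact['product']}"
            if fact["quantity"] < LOW_STOCK_BELOW:
                conclusions.append(("low_stock_alert", subject))
            elif fact["quantity"] > HIGH_INVENTORY_ABOVE:
                conclusions.append(("high_inventory", subject))
        elif "shipment_id" in fact:  # shipment fact
            in_transit = fact["status"] == "in_transit"
            if in_transit and fact["days_since_shipment"] > DELAYED_AFTER_DAYS:
                conclusions.append(("delayed_shipment", fact["shipment_id"]))
    return conclusions


# Illustrative facts in the same shape the notebook passes to add_fact().
facts = [
    {"warehouse": "WH001", "product": "P001", "quantity": 50},
    {"warehouse": "WH001", "product": "P002", "quantity": 500},
    {"warehouse": "WH002", "product": "P001", "quantity": 800},
    {"shipment_id": "SH001", "status": "in_transit", "days_since_shipment": 8},
    {"shipment_id": "SH002", "status": "delivered", "days_since_shipment": 9},
]
conclusions = evaluate_rules(facts)
for label, subject in conclusions:
    print(label, subject)
```

Note that the comparisons are strict: a quantity of exactly 500 is not flagged as high inventory. With the notebook's sample data (quantities 150, 500, and 200, and an in-transit shipment about one day old), none of these thresholds is crossed, so this plain-Python version would return an empty list for it.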


In [None]:
# Temporal analysis
start_time = (datetime.now() - timedelta(days=30)).isoformat()
end_time = datetime.now().isoformat()

temporal_results = temporal_query.query_time_range(
    graph=supply_chain_kg,
    query="Find shipments in the last 30 days",
    start_time=start_time,
    end_time=end_time
)

# Inference engine for supply chain rules
inference_engine = InferenceEngine()
rule_manager = RuleManager()
explanation_generator = ExplanationGenerator()

# Supply chain analysis rules
inference_engine.add_rule("IF quantity < 100 AND product_type(Electronics) THEN low_stock_alert")
inference_engine.add_rule("IF status(in_transit) AND days_since_shipment > 5 THEN delayed_shipment")
inference_engine.add_rule("IF stocks(Warehouse, Product) AND quantity > 500 THEN high_inventory")

# Add facts from supply chain data
for inventory in inventory_records:
    if isinstance(inventory, dict):
        inference_engine.add_fact({
            "warehouse": inventory.get("warehouse_id", ""),
            "product": inventory.get("product_id", ""),
            "quantity": inventory.get("quantity", 0),
            "product_name": inventory.get("product_name", "")
        })

for shipment in shipment_records:
    if isinstance(shipment, dict):
        # Fall back to "now" when the timestamp is missing or empty
        shipped_at = datetime.fromisoformat(shipment.get("timestamp") or datetime.now().isoformat())
        days_since = (datetime.now() - shipped_at).days
        inference_engine.add_fact({
            "shipment_id": shipment.get("shipment_id", ""),
            "status": shipment.get("status", ""),
            "days_since_shipment": days_since
        })

# Generate supply chain insights
supply_chain_insights = inference_engine.forward_chain()

print(f"  Temporal entities: {len(temporal_results.get('entities', []))}")
print(f"  Supply chain insights: {len(supply_chain_insights)}")

# Display insights
for insight in supply_chain_insights[:3]:
    print(f"  - {insight}")


## Step 7: Export and Visualize

Export the supply chain knowledge graph and generate visualizations.
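
To illustrate what a flat entity export can look like, the sketch below writes entities to CSV with the standard library, serializing the nested `properties` dictionary as a JSON string in a single column. The column layout and the helper `export_entities_csv` are assumptions for this example; the file produced by `CSVExporter` may be structured differently.

```python
import csv
import json
import os
import tempfile


def export_entities_csv(entities, path):
    """Write one row per entity; nested properties become a JSON string."""
    fieldnames = ["id", "type", "name", "properties"]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        for entity in entities:
            writer.writerow({
                "id": entity["id"],
                "type": entity["type"],
                "name": entity["name"],
                "properties": json.dumps(entity.get("properties", {}), sort_keys=True),
            })
    return path


entities = [
    {"id": "WH001", "type": "Warehouse", "name": "WH001", "properties": {}},
    {"id": "SH001", "type": "Shipment", "name": "SH001",
     "properties": {"status": "in_transit", "quantity": 50}},
]

# TemporaryDirectory removes the files on exit, so read them back inside the block.
with tempfile.TemporaryDirectory() as temp_dir:
    csv_path = export_entities_csv(entities, os.path.join(temp_dir, "entities.csv"))
    with open(csv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))

print(rows[1]["properties"])
```

The notebook cell uses `tempfile.mkdtemp()`, which leaves the directory in place until it is deleted manually; the context manager shown here is an alternative when the exported files are needed only within the cell.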


In [None]:
import tempfile
import os

temp_dir = tempfile.mkdtemp()

json_exporter = JSONExporter()
csv_exporter = CSVExporter()
rdf_exporter = RDFExporter()
report_generator = ReportGenerator()

# Export knowledge graph
json_exporter.export_knowledge_graph(supply_chain_kg, os.path.join(temp_dir, "supply_chain_kg.json"))
csv_exporter.export_entities(supply_chain_entities, os.path.join(temp_dir, "supply_chain_entities.csv"))
rdf_exporter.export_knowledge_graph(supply_chain_kg, os.path.join(temp_dir, "supply_chain_kg.rdf"))

# Generate report
report_data = {
    "summary": f"Supply chain data integration from MCP server identified {len(supply_chain_insights)} insights",
    "warehouses": len([e for e in supply_chain_entities if e['type'] == 'Warehouse']),
    "products": len([e for e in supply_chain_entities if e['type'] == 'Product']),
    "shipments": len([e for e in supply_chain_entities if e['type'] == 'Shipment']),
    "inventory_records": len(inventory_records),
    "insights": len(supply_chain_insights)
}

report = report_generator.generate_report(report_data, format="markdown")

print(f"  JSON: {os.path.join(temp_dir, 'supply_chain_kg.json')}")
print(f"  CSV: {os.path.join(temp_dir, 'supply_chain_entities.csv')}")
print(f"  RDF: {os.path.join(temp_dir, 'supply_chain_kg.rdf')}")

# Visualize
kg_visualizer = KGVisualizer()
temporal_visualizer = TemporalVisualizer()
analytics_visualizer = AnalyticsVisualizer()

kg_viz = kg_visualizer.visualize_network(supply_chain_kg, output="interactive")
temporal_viz = temporal_visualizer.visualize_timeline(supply_chain_kg, output="interactive")
analytics_viz = analytics_visualizer.visualize_analytics(supply_chain_kg, output="interactive")


# Cleanup: Disconnect from MCP server
mcp_ingestor.disconnect("supply_chain_server")
print("  Disconnected from MCP server")

print("📊 Total modules used: 20+")
