[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/Hawksight-AI/semantica/blob/main/cookbook/use_cases/supply_chain/02_Supply_Chain_Risk_Management.ipynb)

# Supply Chain Risk Management - Dependency Analysis & Risk Detection

## Overview

This notebook demonstrates **supply chain risk management** using Semantica, with a focus on **dependency analysis**, **risk pattern detection**, and **conflict resolution**. The pipeline detects supply chain risks by analyzing dependencies, correlating external feeds, and resolving conflicts between sources, using reasoning and graph analytics.

### Key Features

- **Dependency Analysis**: Analyzes supply chain dependencies using graph reasoning
- **Risk Pattern Detection**: Detects risk patterns in the supply chain using reasoning
- **Conflict Detection**: Detects and resolves conflicts in risk data from multiple sources
- **Risk Impact Analysis**: Analyzes risk impact using graph analytics
- **Temporal Risk Tracking**: Tracks risk evolution over time
- **External Feed Correlation**: Correlates external threat feeds with supply chain data

### Learning Objectives

- Understand how to detect and resolve conflicts in multi-source risk data
- Learn to analyze supply chain dependencies using reasoning
- Master risk pattern detection using graph reasoning
- Explore risk impact analysis using graph analytics
- Practice temporal risk tracking and evolution analysis
- Analyze supply chain risks and mitigation strategies

### Pipeline Flow

```mermaid
graph TD
    A[Multi-Source Risk Ingestion] --> B[Document Parsing]
    B --> C[Text Processing]
    C --> D[Entity Extraction]
    D --> E[Relationship Extraction]
    E --> F[Deduplication]
    F --> G[Conflict Detection]
    G --> H[KG Construction]
    H --> I[Embedding Generation]
    I --> J[Vector Store]
    H --> K[Dependency Analysis]
    H --> L[Risk Pattern Detection]
    H --> M[Risk Impact Analysis]
    H --> N[Temporal Risk Queries]
    J --> O[GraphRAG Queries]
    K --> P[Visualization]
    L --> P
    M --> P
    H --> Q[Export]
```


---


In [None]:
%pip install -qU semantica networkx matplotlib plotly pandas faiss-cpu beautifulsoup4 groq sentence-transformers scikit-learn


---

## Configuration & Setup

Configure API keys and set up constants for the supply chain risk management pipeline.


In [None]:
import os

os.environ["GROQ_API_KEY"] = os.getenv("GROQ_API_KEY", "your-key-here")

# Configuration constants
EMBEDDING_DIMENSION = 384
EMBEDDING_MODEL = "sentence-transformers/all-MiniLM-L6-v2"
CHUNK_SIZE = 1000
CHUNK_OVERLAP = 200


---

## Multi-Source Risk Data Ingestion

Ingest supply chain risk data from multiple sources including risk RSS feeds, external threat feeds, and disruption APIs.


In [None]:
from semantica.ingest import FeedIngestor, WebIngestor, FileIngestor
from contextlib import redirect_stderr
from io import StringIO
import os

os.makedirs("data", exist_ok=True)

documents = []

# Ingest from supply chain risk RSS feeds
risk_feeds = [
    "https://www.scrm.com/rss",
    "https://www.riskmanagement.com/rss"
]

for feed_url in risk_feeds:
    try:
        with redirect_stderr(StringIO()):
            feed_ingestor = FeedIngestor()
            feed_docs = feed_ingestor.ingest(feed_url, method="rss")
            documents.extend(feed_docs)
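    except Exception as exc:
        # Feed unreachable or malformed: report it, then rely on the sample data fallback
        print(f"Skipping feed {feed_url}: {exc}")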

# Example: Web ingestion from weather/disruption APIs (commented - requires API keys)
# web_ingestor = WebIngestor()
# weather_docs = web_ingestor.ingest("https://api.weather.com/disruptions", method="api")

# Fallback: Sample risk data
if not documents:
    risk_data = """
    Supplier A depends on raw materials from Region R1 (high risk region).
    Disruption in Region R1 impacts Supplier A, causing supply chain risk.
    External feed: Weather alert in Region R1 may disrupt logistics.
    Risk mitigation: Identify alternative suppliers in Region R2.
    Supplier B depends on Region R1, creating dependency risk.
    Impact: High risk of supply chain disruption if Region R1 fails.
    """
    with open("data/supply_chain_risks.txt", "w", encoding="utf-8") as f:
        f.write(risk_data)
    file_ingestor = FileIngestor()
    documents = file_ingestor.ingest("data/supply_chain_risks.txt")

print(f"Ingested {len(documents)} documents from risk sources")


In [None]:
from semantica.parse import DocumentParser
from contextlib import redirect_stderr
from io import StringIO

parser = DocumentParser()

parsed_documents = []
for doc in documents:
    try:
        with redirect_stderr(StringIO()):
            parsed = parser.parse(
                doc.content if hasattr(doc, 'content') else str(doc),
                format="auto"
            )
            parsed_documents.append(parsed)
    except Exception:
        parsed_documents.append(doc.content if hasattr(doc, 'content') else str(doc))

print(f"Parsed {len(parsed_documents)} documents")


---

## Text Processing

Normalize risk data and split documents using relation-aware chunking to preserve dependency relationships.
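
To make the roles of `CHUNK_SIZE` and `CHUNK_OVERLAP` concrete, here is a minimal standard-library sketch of sentence-based chunking with overlap. It is not Semantica's `relation_aware` method; the `chunk_sentences` helper and its carry-over policy are assumptions made for illustration only.

```python
import re


def chunk_sentences(text, chunk_size=1000, chunk_overlap=200):
    """Group whole sentences into chunks of roughly chunk_size characters.

    Trailing sentences totalling up to chunk_overlap characters are repeated
    at the start of the next chunk, so a relation that spans two sentences
    is less likely to be cut in half.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    chunks, current = [], []
    for sentence in sentences:
        candidate = " ".join(current + [sentence])
        if current and len(candidate) > chunk_size:
            chunks.append(" ".join(current))
            # Carry over trailing sentences that fit within the overlap budget.
            carried, used = [], 0
            for prev in reversed(current):
                if used + len(prev) > chunk_overlap:
                    break
                carried.insert(0, prev)
                used += len(prev)
            current = carried
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


sample = (
    "Supplier A depends on raw materials from Region R1. "
    "Disruption in Region R1 impacts Supplier A. "
    "Supplier B depends on Region R1."
)
demo_chunks = chunk_sentences(sample, chunk_size=100, chunk_overlap=50)
for chunk in demo_chunks:
    print(chunk)
```

With these small limits the middle sentence appears in both chunks, which is the effect the overlap setting is meant to produce.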


In [None]:
from semantica.normalize import TextNormalizer
from semantica.split import TextSplitter
from contextlib import redirect_stderr
from io import StringIO

normalizer = TextNormalizer()
normalized_docs = []

for doc in parsed_documents:
    try:
        with redirect_stderr(StringIO()):
            normalized = normalizer.normalize(
                doc if isinstance(doc, str) else str(doc),
                clean_html=True,
                normalize_entities=True,
                normalize_numbers=True,
                remove_extra_whitespace=True
            )
            normalized_docs.append(normalized)
    except Exception:
        normalized_docs.append(doc if isinstance(doc, str) else str(doc))

# Use relation-aware chunking to preserve dependency relationships
relation_splitter = TextSplitter(
    method="relation_aware",
    chunk_size=CHUNK_SIZE,
    chunk_overlap=CHUNK_OVERLAP
)

chunked_docs = []
for doc_text in normalized_docs:
    try:
        with redirect_stderr(StringIO()):
            chunks = relation_splitter.split(doc_text)
            chunked_docs.extend([chunk.content if hasattr(chunk, 'content') else str(chunk) for chunk in chunks])
    except Exception:
        chunked_docs.append(doc_text)

print(f"Processed {len(chunked_docs)} relation-aware chunks")


---

## Entity Extraction

Extract supply chain risk entities including dependencies, risks, disruptions, impacts, mitigations, and regions.


In [None]:
from semantica.semantic_extract import NERExtractor
from contextlib import redirect_stderr
from io import StringIO

extractor = NERExtractor(
    provider="groq",
    model="llama-3.1-8b-instant"
)

entity_types = [
    "Dependency", "Risk", "Disruption", "Impact", "Mitigation", "Region"
]

all_entities = []
for chunk in chunked_docs[:10]:  # Limit for demo
    try:
        with redirect_stderr(StringIO()):
            entities = extractor.extract(
                chunk,
                entity_types=entity_types
            )
            all_entities.extend(entities)
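    except Exception as exc:
        # Report the failure (for example a missing API key) instead of hiding it
        print(f"Entity extraction failed for a chunk: {exc}")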

print(f"Extracted {len(all_entities)} entities")


---

## Relationship Extraction

Extract risk relationships including depends_on, causes, impacts, mitigates, and located_in.
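
As a rough picture of the output this step aims for, the sketch below pulls `depends_on` triples out of the sample fallback text with a regular expression. The pattern and the `extract_depends_on` helper are hypothetical and only fit sentences shaped like the sample data; the notebook itself relies on an LLM-backed extractor.

```python
import re

# Hypothetical pattern tailored to the sample text ("Supplier X depends on ... Region Y").
DEPENDS_ON = re.compile(r"(Supplier \w+) depends on .*?(Region \w+)")


def extract_depends_on(text):
    """Return (source, 'depends_on', target) triples, one sentence at a time."""
    triples = []
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        match = DEPENDS_ON.search(sentence)
        if match:
            triples.append((match.group(1), "depends_on", match.group(2)))
    return triples


sample_text = """
Supplier A depends on raw materials from Region R1 (high risk region).
Supplier B depends on Region R1, creating dependency risk.
Risk mitigation: Identify alternative suppliers in Region R2.
"""
triples = extract_depends_on(sample_text)
print(triples)
```

Both suppliers resolve to the same region, which is the shared dependency the later analysis steps look for.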


In [None]:
from semantica.semantic_extract import RelationExtractor
from contextlib import redirect_stderr
from io import StringIO

relation_extractor = RelationExtractor(
    provider="groq",
    model="llama-3.1-8b-instant"
)

relation_types = [
    "depends_on", "causes", "impacts",
    "mitigates", "located_in"
]

all_relationships = []
for chunk in chunked_docs[:10]:  # Limit for demo
    try:
        with redirect_stderr(StringIO()):
            relationships = relation_extractor.extract(
                chunk,
                relation_types=relation_types
            )
            all_relationships.extend(relationships)
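    except Exception as exc:
        # Report the failure (for example a missing API key) instead of hiding it
        print(f"Relationship extraction failed for a chunk: {exc}")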

print(f"Extracted {len(all_relationships)} relationships")


---

## Deduplication

Deduplicate risk entities to ensure accurate risk analysis.
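
The effect of a similarity threshold can be sketched with the standard library alone. The example below compares entity names with `difflib.SequenceMatcher`; this is an assumed stand-in for illustration, not a description of how `DuplicateDetector` scores candidates.

```python
from difflib import SequenceMatcher


def dedupe_by_name(entities, threshold=0.9):
    """Keep the first entity from each group of near-identical names."""
    kept = []
    for entity in entities:
        name = entity["name"].strip().lower()
        is_duplicate = any(
            SequenceMatcher(None, name, other["name"].strip().lower()).ratio() >= threshold
            for other in kept
        )
        if not is_duplicate:
            kept.append(entity)
    return kept


regions = [
    {"name": "Region R1", "type": "Region"},
    {"name": "region r1", "type": "Region"},
    {"name": "Region R2", "type": "Region"},
]
unique_regions = dedupe_by_name(regions, threshold=0.95)
print([e["name"] for e in unique_regions])
```

With this character-based measure, "Region R1" and "Region R2" score about 0.89, so a threshold of 0.85 would merge two distinct regions. Short identifiers that differ by a single character are worth checking by hand before lowering a threshold.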


In [None]:
from semantica.deduplication import DuplicateDetector

detector = DuplicateDetector()

# Deduplicate entities
risks = [e for e in all_entities if e.get("type") == "Risk"]
regions = [e for e in all_entities if e.get("type") == "Region"]

risk_duplicates = detector.detect_duplicates(risks, threshold=0.9)
region_duplicates = detector.detect_duplicates(regions, threshold=0.85)

deduplicated_risks = detector.resolve_duplicates(risks, risk_duplicates)
deduplicated_regions = detector.resolve_duplicates(regions, region_duplicates)

# Update entities list
all_entities = [e for e in all_entities if e.get("type") not in ["Risk", "Region"]]
all_entities.extend(deduplicated_risks)
all_entities.extend(deduplicated_regions)

print(f"Deduplicated: {len(risks)} -> {len(deduplicated_risks)} risks")
print(f"Deduplicated: {len(regions)} -> {len(deduplicated_regions)} regions")


---

## Conflict Detection

Detect conflicts in risk data from multiple sources. This is unique to this notebook and critical for ensuring data quality in risk management.
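
The idea behind a highest-confidence strategy can be sketched as follows. The claim records, field names, and both helper functions are hypothetical; they only illustrate grouping claims about the same attribute and keeping the most confident one.

```python
from collections import defaultdict


def detect_conflicts(claims):
    """Group claims by (entity, attribute); return groups whose values disagree."""
    grouped = defaultdict(list)
    for claim in claims:
        grouped[(claim["entity"], claim["attribute"])].append(claim)
    return {
        key: group
        for key, group in grouped.items()
        if len({c["value"] for c in group}) > 1
    }


def resolve_highest_confidence(conflicts):
    """For each conflicting group, keep the claim with the highest confidence."""
    return {
        key: max(group, key=lambda c: c["confidence"])
        for key, group in conflicts.items()
    }


# Hypothetical claims from two feeds; the values are made up for the example.
claims = [
    {"entity": "Region R1", "attribute": "risk_level", "value": "high",
     "confidence": 0.9, "source": "feed_a"},
    {"entity": "Region R1", "attribute": "risk_level", "value": "medium",
     "confidence": 0.6, "source": "feed_b"},
    {"entity": "Region R2", "attribute": "risk_level", "value": "low",
     "confidence": 0.8, "source": "feed_a"},
]
demo_conflicts = detect_conflicts(claims)
demo_resolved = resolve_highest_confidence(demo_conflicts)
print(demo_resolved)
```

Only Region R1 is reported as a conflict, because Region R2 has a single claim. Keeping the `source` field on the winning claim preserves provenance for later review.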


In [None]:
from semantica.conflicts import ConflictDetector

conflict_detector = ConflictDetector()

# Detect conflicts in risk data
conflicts = conflict_detector.detect_conflicts(
    entities=all_entities,
    relationships=all_relationships
)

print(f"Detected {len(conflicts)} conflicts in risk data")

# Resolve conflicts using highest confidence strategy
if conflicts:
    resolved = conflict_detector.resolve_conflicts(
        conflicts,
        strategy="highest_confidence"
    )
    print(f"Resolved {len(resolved)} conflicts")


---

## Knowledge Graph Construction

Build a knowledge graph from risk entities and relationships to enable dependency and risk analysis.


In [None]:
from semantica.kg import GraphBuilder

builder = GraphBuilder()

kg = builder.build(
    entities=all_entities,
    relationships=all_relationships
)

print(f"Built KG with {len(kg.get('entities', []))} entities and {len(kg.get('relationships', []))} relationships")


---

## Embedding Generation & Vector Store

Generate embeddings for risk documents and store them in a vector database for semantic search.
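
Semantic search over stored embeddings comes down to ranking vectors by similarity to a query vector. The sketch below uses toy 3-dimensional vectors and plain cosine similarity; the vectors, document ids, and `search` helper are illustrative assumptions, whereas the notebook uses 384-dimensional model embeddings and a FAISS-backed store.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(index, query_vector, top_k=2):
    """Return the ids of the top_k vectors most similar to query_vector."""
    scored = [(cosine_similarity(vec, query_vector), doc_id) for doc_id, vec in index.items()]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)[:top_k]]


# Toy vectors standing in for real embeddings.
toy_index = {
    "weather_alert": [0.9, 0.1, 0.0],
    "supplier_dependency": [0.1, 0.9, 0.2],
    "mitigation_plan": [0.0, 0.2, 0.9],
}
hits = search(toy_index, [1.0, 0.2, 0.0], top_k=2)
print(hits)
```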


In [None]:
from semantica.embeddings import EmbeddingGenerator
from semantica.vector_store import VectorStore
from contextlib import redirect_stderr
from io import StringIO

embedding_gen = EmbeddingGenerator(
    model_name=EMBEDDING_MODEL,
    dimension=EMBEDDING_DIMENSION
)

# Generate embeddings for chunks, keeping each chunk paired with its embedding
# so that a failed chunk cannot shift the alignment between text and vector
embedded_chunks = []
for chunk in chunked_docs[:20]:  # Limit for demo
    try:
        with redirect_stderr(StringIO()):
            embedded_chunks.append((chunk, embedding_gen.generate(chunk)))
    except Exception:
        pass

embeddings = [embedding for _, embedding in embedded_chunks]

# Create vector store
vector_store = VectorStore(backend="faiss", dimension=EMBEDDING_DIMENSION)

# Add embeddings to vector store
for i, (chunk, embedding) in enumerate(embedded_chunks):
    try:
        vector_store.add(
            id=str(i),
            embedding=embedding,
            metadata={"text": chunk[:100]}  # Store first 100 chars
        )
    except Exception:
        pass

print(f"Generated {len(embeddings)} embeddings and stored in vector database")


---

## Dependency Analysis

Analyze supply chain dependencies using reasoning to identify dependency patterns. This is unique to this notebook and critical for risk management.
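
Rule-based dependency reasoning can be pictured as forward chaining over triples: apply the rules repeatedly until no new fact appears. The sketch below is a plain-Python illustration under assumed names (the `exposed_to` predicate, the `infer_exposure` helper, and "Manufacturer M" are hypothetical), not the `Reasoner` implementation.

```python
def infer_exposure(triples):
    """Forward-chain two rules until a fixpoint is reached.

    Rule 1: IF s depends_on o AND o has Risk        THEN s exposed_to o
    Rule 2: IF s depends_on o AND o exposed_to r    THEN s exposed_to r
    Returns only the newly inferred facts.
    """
    facts = set(triples)
    while True:
        new_facts = set()
        for (s, p, o) in facts:
            if p != "depends_on":
                continue
            if (o, "has", "Risk") in facts:
                new_facts.add((s, "exposed_to", o))
            for (s2, p2, o2) in facts:
                if s2 == o and p2 == "exposed_to":
                    new_facts.add((s, "exposed_to", o2))
        if new_facts <= facts:
            return facts - set(triples)
        facts |= new_facts


base = [
    ("Supplier A", "depends_on", "Region R1"),
    ("Region R1", "has", "Risk"),
    ("Manufacturer M", "depends_on", "Supplier A"),  # hypothetical downstream node
]
inferred = infer_exposure(base)
print(sorted(inferred))
```

The second rule is what lets exposure propagate along a chain of dependencies, so a node two hops away from the risky region is still flagged.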


In [None]:
from semantica.reasoning import Reasoner
from contextlib import redirect_stderr
from io import StringIO

reasoner = Reasoner(kg)

try:
    with redirect_stderr(StringIO()):
        # Add rules for dependency analysis
        rules = [
            "IF Supplier depends_on Region AND Region has Risk THEN Dependency creates_risk",
            "IF Dependency depends_on Region AND Region has Disruption THEN Dependency causes_impact",
            "IF Supplier depends_on Dependency AND Dependency has Risk THEN Supplier has_risk"
        ]
        
        for rule in rules:
            reasoner.add_rule(rule)
        
        # Find dependency patterns
        dependency_patterns = reasoner.find_patterns(pattern_type="dependency")
        print(f"Detected {len(dependency_patterns)} dependency patterns")
        
        # Infer dependency risks
        inferred_dependencies = reasoner.infer_facts()
        print(f"Inferred {len(inferred_dependencies)} dependency relationships")
except Exception as exc:
    # Report the failure instead of claiming the analysis completed
    print(f"Dependency analysis failed: {exc}")


---

## Risk Pattern Detection

Detect risk patterns in the supply chain using reasoning. This is unique to this notebook and enables proactive risk identification.
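
One simple risk pattern is concentration: several suppliers depending on the same region. The sketch below counts dependents per target; the `find_concentration_risks` helper, the `min_dependents` parameter, and "Supplier C" are assumptions added for illustration.

```python
from collections import defaultdict


def find_concentration_risks(edges, min_dependents=2):
    """Return targets that at least min_dependents distinct sources depend on."""
    dependents = defaultdict(set)
    for source, relation, target in edges:
        if relation == "depends_on":
            dependents[target].add(source)
    return {
        target: sorted(sources)
        for target, sources in dependents.items()
        if len(sources) >= min_dependents
    }


edges = [
    ("Supplier A", "depends_on", "Region R1"),
    ("Supplier B", "depends_on", "Region R1"),
    ("Supplier C", "depends_on", "Region R2"),  # hypothetical third supplier
]
hotspots = find_concentration_risks(edges)
print(hotspots)
```

Region R1 is flagged because a single disruption there would affect both suppliers at once, which matches the dependency risk described in the sample data.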


In [None]:
from semantica.reasoning import Reasoner
from contextlib import redirect_stderr
from io import StringIO

try:
    with redirect_stderr(StringIO()):
        # Add rules for risk pattern detection
        risk_rules = [
            "IF Region has Disruption AND Supplier depends_on Region THEN Risk impacts Supplier",
            "IF Disruption causes Impact AND Impact affects Supplier THEN Risk requires Mitigation",
            "IF Risk located_in Region AND Region has Disruption THEN Risk severity increases"
        ]
        
        for rule in risk_rules:
            reasoner.add_rule(rule)
        
        # Find risk patterns
        risk_patterns = reasoner.find_patterns(pattern_type="risk")
        print(f"Detected {len(risk_patterns)} risk patterns")
        
        # Identify high-risk dependencies
        high_risk = [e for e in all_entities if e.get("type") == "Risk" and "high" in str(e).lower()]
        print(f"Identified {len(high_risk)} high-risk items")
except Exception as exc:
    # Report the failure instead of claiming the detection completed
    print(f"Risk pattern detection failed: {exc}")


---

## Risk Impact Analysis

Analyze risk impact using graph analytics. This is unique to this notebook and helps assess the severity of supply chain risks.
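
Tracing how a risk reaches an impact amounts to a bounded path search. The following sketch enumerates simple paths of at most `max_length` edges in a small adjacency dictionary; the graph contents and this `find_paths` function are illustrative and independent of `GraphAnalyzer`.

```python
def find_paths(adjacency, source, target, max_length=3):
    """Return all simple paths from source to target with at most max_length edges."""
    paths = []
    stack = [[source]]
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == target:
            paths.append(path)
            continue
        if len(path) - 1 >= max_length:
            continue  # edge budget exhausted on this branch
        for neighbor in adjacency.get(node, []):
            if neighbor not in path:  # keep paths simple (no revisits)
                stack.append(path + [neighbor])
    return paths


# Hypothetical directed graph built from the sample scenario.
graph = {
    "Weather alert": ["Region R1"],
    "Region R1": ["Supplier A", "Supplier B"],
    "Supplier A": ["Supply disruption"],
    "Supplier B": ["Supply disruption"],
}
demo_paths = find_paths(graph, "Weather alert", "Supply disruption", max_length=3)
for path in sorted(demo_paths):
    print(" -> ".join(path))
```

The number of distinct paths is a rough indicator of how many routes a disruption can take to reach the same impact. Lowering `max_length` to 2 returns no paths here, since every route needs three edges.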


In [None]:
from semantica.kg import GraphAnalyzer
from contextlib import redirect_stderr
from io import StringIO

graph_analyzer = GraphAnalyzer(kg)

try:
    with redirect_stderr(StringIO()):
        # Analyze graph structure for risk impact
        stats = graph_analyzer.get_statistics()
        print(f"Graph statistics: {stats.get('num_nodes', 0)} nodes, {stats.get('num_edges', 0)} edges")
        
        # Find paths between risks and impacts
        if all_entities:
            risk_entities = [e for e in all_entities if e.get("type") == "Risk"]
            impact_entities = [e for e in all_entities if e.get("type") == "Impact"]
            if risk_entities and impact_entities:
                source = risk_entities[0].get("name", "")
                target = impact_entities[0].get("name", "")
                if source and target:
                    impact_paths = graph_analyzer.find_paths(source=source, target=target, max_length=3)
                    print(f"Found {len(impact_paths)} paths between risk and impact")
        
        # Count the impact entities available for risk propagation analysis
        impacts = [e for e in all_entities if e.get("type") == "Impact"]
        print(f"Found {len(impacts)} impact entities")
except Exception as exc:
    # Report the failure instead of claiming the analysis completed
    print(f"Risk impact analysis failed: {exc}")


---

## Temporal Risk Queries

Query the knowledge graph to track risk evolution over time and analyze temporal risk patterns.
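
Tracking risk evolution at day granularity can be thought of as bucketing timestamped events. The sketch below does this with the standard library; the event records and dates are invented for the example, and treating `None` bounds as "unbounded" is an assumption about how an open time range would behave.

```python
from collections import Counter
from datetime import datetime


def risk_counts_by_day(events, start=None, end=None):
    """Count events per calendar day, optionally restricted to [start, end]."""
    counts = Counter()
    for event in events:
        ts = datetime.fromisoformat(event["timestamp"])
        if start is not None and ts < start:
            continue
        if end is not None and ts > end:
            continue
        counts[ts.date().isoformat()] += 1
    return dict(sorted(counts.items()))


# Hypothetical events; timestamps are illustrative only.
events = [
    {"risk": "Weather alert in Region R1", "timestamp": "2024-03-01T08:00:00"},
    {"risk": "Logistics delay in Region R1", "timestamp": "2024-03-01T17:30:00"},
    {"risk": "Supplier A shortage", "timestamp": "2024-03-02T09:15:00"},
]
timeline = risk_counts_by_day(events)
print(timeline)
```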


In [None]:
from semantica.kg import TemporalGraphQuery
from contextlib import redirect_stderr
from io import StringIO

temporal_query = TemporalGraphQuery(kg)

try:
    with redirect_stderr(StringIO()):
        # Query risk evolution over time
        if all_entities:
            risk_entities = [e for e in all_entities if e.get("type") == "Risk"]
            if risk_entities:
                risk_id = risk_entities[0].get("name", "")
                if risk_id:
                    history = temporal_query.query_temporal_paths(
                        source=risk_id,
                        time_range=(None, None)
                    )
                    print(f"Retrieved temporal history for risk: {risk_id}")
        
        # Query evolution of risks over time
        evolution = temporal_query.query_evolution(
            entity_type="Risk",
            time_granularity="day"
        )
        print("Analyzed risk evolution over time")
except Exception as exc:
    # Report the failure instead of claiming the queries completed
    print(f"Temporal risk queries failed: {exc}")


---

## GraphRAG Queries

Use hybrid retrieval combining vector search and graph traversal to answer complex risk management questions.
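
The hybrid idea can be sketched in two stages: rank text chunks against the query, then expand the best hits with their graph neighbors. In the example below, token overlap is a deliberately crude stand-in for vector similarity, and the chunk records, graph, and `hybrid_retrieve` helper are all hypothetical.

```python
def hybrid_retrieve(query, chunks, graph, top_k=1):
    """Rank chunks by token overlap with the query, then add 1-hop graph neighbors."""
    query_tokens = set(query.lower().split())
    ranked = sorted(
        chunks.items(),
        key=lambda item: len(query_tokens & set(item[1]["text"].lower().split())),
        reverse=True,
    )
    results = []
    for chunk_id, chunk in ranked[:top_k]:
        # Graph step: collect entities connected to those mentioned in the chunk.
        related = sorted({n for entity in chunk["entities"] for n in graph.get(entity, [])})
        results.append({"chunk": chunk_id, "related_entities": related})
    return results


chunks = {
    "c1": {"text": "Weather alert in Region R1 may disrupt logistics",
           "entities": ["Region R1"]},
    "c2": {"text": "Identify alternative suppliers in Region R2",
           "entities": ["Region R2"]},
}
graph = {"Region R1": ["Supplier A", "Supplier B"], "Region R2": []}

results = hybrid_retrieve(
    "Which suppliers are affected by the weather alert in Region R1",
    chunks,
    graph,
)
print(results)
```

The retrieved chunk never mentions a supplier, yet the graph step surfaces Supplier A and Supplier B. Connections like these are what graph traversal adds on top of text similarity.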


In [None]:
from semantica.context import AgentContext
from contextlib import redirect_stderr
from io import StringIO

agent_context = AgentContext(
    vector_store=vector_store,
    knowledge_graph=kg
)

queries = [
    "What are the high-risk dependencies in the supply chain?",
    "Which regions have supply chain disruptions?",
    "What mitigation strategies are available for Region R1 risks?",
    "What impacts do disruptions in Region R1 have on suppliers?"
]

for query in queries:
    try:
        with redirect_stderr(StringIO()):
            results = agent_context.query(
                query=query,
                top_k=5
            )
            print(f"Query: {query}")
            print(f"Found {len(results.get('results', []))} relevant results")
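    except Exception as exc:
        # Report the failed query instead of skipping it silently
        print(f"Query failed: {query} ({exc})")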


---

## Visualization

Visualize the supply chain risk knowledge graph to explore dependencies, risks, and mitigation strategies.


In [None]:
from semantica.visualization import KGVisualizer
from contextlib import redirect_stderr
from io import StringIO

visualizer = KGVisualizer()

try:
    with redirect_stderr(StringIO()):
        visualizer.visualize(
            kg,
            output_path="supply_chain_risk_kg.html",
            layout="force_directed"
        )
        print("Knowledge graph visualization saved to supply_chain_risk_kg.html")
except Exception as exc:
    # Report the failure instead of claiming the visualization completed
    print(f"Visualization failed: {exc}")


---

## Export

Export the knowledge graph in multiple formats for risk management reports and further analysis.
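
For a sense of what the JSON and CSV outputs might contain, the sketch below serializes a toy graph with the standard library. The `entities` and `relationships` keys mirror how the notebook reads the graph, but the `source`/`type`/`target` fields and both helpers are assumptions, and the files written by `GraphExporter` may be laid out differently.

```python
import csv
import io
import json


def export_json(kg):
    """Serialize the whole graph dictionary as pretty-printed JSON."""
    return json.dumps(kg, indent=2, sort_keys=True)


def export_edges_csv(kg):
    """Write one CSV row per relationship: source, type, target."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["source", "type", "target"])
    for rel in kg["relationships"]:
        writer.writerow([rel["source"], rel["type"], rel["target"]])
    return buffer.getvalue()


toy_kg = {
    "entities": [
        {"name": "Supplier A", "type": "Supplier"},
        {"name": "Region R1", "type": "Region"},
    ],
    "relationships": [
        {"source": "Supplier A", "type": "depends_on", "target": "Region R1"},
    ],
}
json_text = export_json(toy_kg)
csv_text = export_edges_csv(toy_kg)
print(csv_text)
```

An edge list in CSV is convenient for spreadsheets and reports, while the JSON form keeps entity attributes that a flat table would lose.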


In [None]:
from semantica.export import GraphExporter
from contextlib import redirect_stderr
from io import StringIO

exporter = GraphExporter()

try:
    with redirect_stderr(StringIO()):
        # Export as JSON
        exporter.export(kg, format="json", output_path="supply_chain_risk_kg.json")
        
        # Export as GraphML
        exporter.export(kg, format="graphml", output_path="supply_chain_risk_kg.graphml")
        
        # Export as CSV (for risk management reports)
        exporter.export(kg, format="csv", output_path="supply_chain_risk_kg.csv")
        
        print("Exported knowledge graph in JSON, GraphML, and CSV formats")
except Exception as exc:
    # Report the failure instead of claiming the export completed
    print(f"Export failed: {exc}")
