# AITF — AI Telemetry Framework

## Interactive Demo on Google Colab

**AITF** is a security-first telemetry framework for AI systems built on [OpenTelemetry](https://opentelemetry.io/) and [OCSF](https://ocsf.io/). It provides:

| Capability | Description |
|---|---|
| **OCSF Category 7** | 8 AI event classes (7001–7008) for SIEM/XDR integration |
| **12 Instrumentors** | LLM, Agent, MCP, RAG, Skills, ModelOps, Identity, and more |
| **3 Exporters** | OCSF JSON, Immutable Audit Logs, CEF Syslog |
| **4 Security Processors** | OWASP LLM Top 10, PII Redaction, Cost Tracking, Memory Monitoring |
| **Vendor Mapping** | Declarative JSON mappings for LangChain, CrewAI, and custom frameworks |
| **8 Compliance Frameworks** | NIST AI RMF, EU AI Act, MITRE ATLAS, ISO 42001, SOC 2, GDPR, CCPA, CSA AICM |
| **AI-BOM** | AI Bill of Materials generation in AITF, CycloneDX, and SPDX formats |

This notebook walks through each capability interactively.

---

## Table of Contents

1. [Setup & Installation](#setup)
2. [OCSF Schema Explorer](#schema)
3. [LLM Inference Tracing](#llm)
4. [Agent Session Tracing](#agent)
5. [Vendor Mapping — LangChain & CrewAI](#vendor)
6. [Compliance Framework Mapping](#compliance)
7. [Agentic Log Entries](#agentic-log)
8. [AI-BOM Generation](#aibom)
9. [Full Pipeline — End to End](#pipeline)
10. [Export & Visualization](#viz)

---

## 1. Setup & Installation <a id="setup"></a>

In [None]:
# Install AITF and dependencies
# On Colab, this installs from the repo; locally you can use: pip install aitf
import subprocess, sys

# Clone the repo (Colab) or use local install
try:
    import google.colab
    IN_COLAB = True
    print("Running in Google Colab — installing from GitHub...")
    subprocess.check_call([sys.executable, "-m", "pip", "install", "-q",
                           "opentelemetry-api", "opentelemetry-sdk", "pydantic>=2.0"])
    subprocess.check_call(["git", "clone", "-q", "https://github.com/girdav01/AITF.git", "/content/AITF"])
    sys.path.insert(0, "/content/AITF/sdk/python/src")
except ImportError:
    IN_COLAB = False
    print("Running locally.")

# Verify imports
import aitf
print(f"\nAITF version: {aitf.__version__}")
print("Setup complete!")

In [None]:
# Common imports used throughout the notebook
import json
from unittest.mock import MagicMock
from datetime import datetime, timezone

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor, ConsoleSpanExporter

# Pretty-print helper
def pp(obj, title=None):
    """Pretty-print a dict or Pydantic model."""
    if title:
        print(f"\n{'='*60}")
        print(f"  {title}")
        print(f"{'='*60}")
    if hasattr(obj, 'model_dump'):
        obj = obj.model_dump(exclude_none=True)
    print(json.dumps(obj, indent=2, default=str))

# Mock span helper (simulates what OpenTelemetry produces)
def make_span(name, attributes=None):
    """Create a mock ReadableSpan for demonstration."""
    span = MagicMock()
    span.name = name
    span.attributes = attributes or {}
    span.start_time = int(datetime.now(timezone.utc).timestamp() * 1e9)
    return span

print("Common imports loaded.")

---

## 2. OCSF Schema Explorer <a id="schema"></a>

AITF defines **Category 7: AI System Activity** in the OCSF schema with eight event classes.

In [None]:
from aitf.ocsf.schema import (
    AIClassUID, AIBaseEvent, AIModelInfo, AITokenUsage,
    AILatencyMetrics, AICostInfo, OCSFMetadata, OCSFSeverity,
    ComplianceMetadata, AISecurityFinding,
)
from aitf.ocsf.event_classes import (
    AIModelInferenceEvent, AIAgentActivityEvent, AIToolExecutionEvent,
    AIDataRetrievalEvent, AISecurityFindingEvent, AISupplyChainEvent,
    AIGovernanceEvent, AIIdentityEvent,
)

# Display all 8 OCSF Category 7 class UIDs
print("OCSF Category 7 — AI System Activity")
print("=" * 50)
class_map = {
    7001: ("AI Model Inference", AIModelInferenceEvent),
    7002: ("AI Agent Activity", AIAgentActivityEvent),
    7003: ("AI Tool Execution", AIToolExecutionEvent),
    7004: ("AI Data Retrieval", AIDataRetrievalEvent),
    7005: ("AI Security Finding", AISecurityFindingEvent),
    7006: ("AI Supply Chain", AISupplyChainEvent),
    7007: ("AI Governance", AIGovernanceEvent),
    7008: ("AI Identity", AIIdentityEvent),
}
for uid, (name, cls) in class_map.items():
    fields = [f for f in cls.model_fields if f not in AIBaseEvent.model_fields]
    print(f"\n  {uid}: {name}")
    print(f"         Fields: {', '.join(fields[:6])}{'...' if len(fields) > 6 else ''}")

In [None]:
# Create an OCSF event directly from Pydantic models
event = AIModelInferenceEvent(
    activity_id=1,  # chat
    model=AIModelInfo(
        model_id="claude-sonnet-4-5-20250929",
        provider="anthropic",
        type="llm",
    ),
    token_usage=AITokenUsage(input_tokens=500, output_tokens=200),
    latency=AILatencyMetrics(total_ms=1250.0, tokens_per_second=160.0),
    cost=AICostInfo(input_cost_usd=0.0015, output_cost_usd=0.003, total_cost_usd=0.0045),
    finish_reason="end_turn",
    streaming=True,
    message="chat claude-sonnet-4-5-20250929",
)

pp(event, "OCSF 7001: AI Model Inference Event")

print(f"\n  class_uid:  {event.class_uid}")
print(f"  type_uid:   {event.type_uid}  (class_uid * 100 + activity_id)")
print(f"  category:   {event.category_uid}  (AI System Activity)")
print(f"  tokens:     {event.token_usage.total_tokens}  ({event.token_usage.input_tokens} in + {event.token_usage.output_tokens} out)")
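
The `type_uid` rule printed above can be checked by hand. The following is a minimal sketch; `ocsf_type_uid` and `split_type_uid` are illustrative helper names and are not part of the AITF SDK.

```python
def ocsf_type_uid(class_uid: int, activity_id: int) -> int:
    """Combine a class UID and an activity ID into an OCSF type UID."""
    return class_uid * 100 + activity_id


def split_type_uid(type_uid: int) -> tuple:
    """Recover (class_uid, activity_id) from a type UID."""
    return divmod(type_uid, 100)


# Class 7001 (AI Model Inference) with activity 1 (chat)
chat_type_uid = ocsf_type_uid(7001, 1)
print(chat_type_uid)
print(split_type_uid(chat_type_uid))
```

Because the activity ID occupies the last two decimal digits, a SIEM query can filter on a whole class with integer division and on a single activity with the remainder.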

---

## 3. LLM Inference Tracing <a id="llm"></a>

Trace LLM calls with the `OCSFMapper` — it converts OpenTelemetry spans to OCSF events.

**Scenario:** A customer-support chatbot answers user questions.  Each LLM call
(chat completion, embedding) produces a span that AITF maps to OCSF 7001.

```python
# What your chatbot code looks like (simplified):
import openai  # openai>=1.0 exposes a module-level client

messages = [
    {"role": "system", "content": "You are the Acme support assistant."},
    {"role": "user",   "content": "How do I reset my password?"},
]
response = openai.chat.completions.create(model="gpt-4o", messages=messages)
#                                          ↑ AITF captures this call automatically
```

In [None]:
from aitf.ocsf.mapper import OCSFMapper

mapper = OCSFMapper()

# ── Scenario: Customer asks "How do I reset my password?" ──
# The chatbot sends the conversation to GPT-4o and gets a response.
# AITF captures the span with all attributes:

chat_span = make_span("chat gpt-4o", {
    # OpenTelemetry GenAI semantic conventions (what your app emits)
    "gen_ai.system": "openai",
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.operation.name": "chat",
    "gen_ai.request.temperature": 0.4,
    "gen_ai.request.max_tokens": 1024,
    "gen_ai.usage.input_tokens": 350,   # system prompt + user question
    "gen_ai.usage.output_tokens": 180,  # "To reset your password, visit..."
    "gen_ai.response.finish_reasons": ["stop"],
    # AITF-enriched attributes (added by processors)
    "aitf.latency.total_ms": 920.5,
    "aitf.latency.tokens_per_second": 195.6,
    "aitf.cost.total_cost": 0.002675,  # input_cost + output_cost
    "aitf.cost.input_cost": 0.000875,
    "aitf.cost.output_cost": 0.0018,
})

event = mapper.map_span(chat_span)
pp(event, "Customer Support Chat → OCSF 7001")

# ── Scenario: Embed the question for RAG retrieval ──
# Before answering, the chatbot embeds the question to search the knowledge base.

embed_span = make_span("embeddings text-embedding-3-small", {
    "gen_ai.system": "openai",
    "gen_ai.request.model": "text-embedding-3-small",
    "gen_ai.operation.name": "embeddings",
    "gen_ai.usage.input_tokens": 42,  # "How do I reset my password?"
})

event2 = mapper.map_span(embed_span)
print(f"\nEmbedding for KB search:  class_uid={event2.class_uid}, "
      f"activity_id={event2.activity_id}, model_type={event2.model.type}")
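
The cost attributes on the chat span follow from the token counts. Below is a rough sketch of that arithmetic. The per-million-token rates are an assumption inferred from the example figures (0.000875 USD for 350 input tokens, 0.0018 USD for 180 output tokens), and `estimate_cost` is a hypothetical helper, not the AITF cost processor.

```python
# Assumed rates, back-calculated from the example span attributes.
INPUT_USD_PER_MILLION = 2.50
OUTPUT_USD_PER_MILLION = 10.00


def estimate_cost(input_tokens: int, output_tokens: int) -> dict:
    """Return cost attributes for one LLM call, keyed like the span above."""
    input_cost = input_tokens * INPUT_USD_PER_MILLION / 1_000_000
    output_cost = output_tokens * OUTPUT_USD_PER_MILLION / 1_000_000
    return {
        "aitf.cost.input_cost": input_cost,
        "aitf.cost.output_cost": output_cost,
        "aitf.cost.total_cost": input_cost + output_cost,
    }


cost = estimate_cost(input_tokens=350, output_tokens=180)
for key, value in cost.items():
    print(f"{key}: {value:.6f}")
```

The total is always the sum of the input and output components, which gives a quick consistency check on any span that carries all three attributes.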

---

## 4. Agent Session Tracing <a id="agent"></a>

Map agent lifecycle spans (session start, step execute, delegation, memory access) to OCSF 7002.

**Scenario:** A travel-booking agent plans a trip, searches for flights, reasons
about the best option, books it, and stores the confirmation in memory.  Then a
multi-agent team delegates research and writing tasks.

```python
# What your agent code looks like (simplified):
agent = TravelAgent(model="gpt-4o", tools=[search_flights, book_flight])
result = agent.run("Book me a nonstop SFO→JFK flight on March 15")
#                   ↑ AITF traces: planning → tool_use → reasoning → booking
```

In [None]:
# ── Scenario: Travel agent plans approach and searches for flights ──
# The agent thinks: "User wants a nonstop SFO→JFK flight.  Let me
# fetch their preferences, search flights, and pick the best one."

agent_span = make_span("agent.step.planning", {
    "aitf.agent.name": "travel-booking-agent",
    "aitf.agent.id": "agent-travel-001",
    "aitf.agent.session_id": "sess-abc123",
    "aitf.agent.type": "autonomous",
    "aitf.agent.framework": "custom",
    "aitf.agent.step.type": "planning",
    "aitf.agent.step.index": 1,
    "aitf.agent.step.thought": (
        "User wants nonstop SFO→JFK on March 15.  Steps: "
        "1) fetch preferences, 2) search flights, 3) filter nonstop + preferred airlines, "
        "4) book the cheapest."
    ),
    "aitf.agent.step.action": "call_tool:flight-search",
})

event = mapper.map_span(agent_span)
pp(event, "Travel Agent Planning Step → OCSF 7002")

# ── Scenario: Manager delegates to researcher in a multi-agent team ──
# CrewAI-style: the manager assigns a research task to a specialist agent.

delegation_span = make_span("agent.delegation", {
    "aitf.agent.name": "manager",
    "aitf.agent.id": "agent-mgr",
    "aitf.agent.session_id": "sess-abc123",
    "aitf.agent.delegation.target_agent": "researcher",
    # In a real app: manager.delegate(to="researcher", task="Research AI supply-chain risks")
})

event2 = mapper.map_span(delegation_span)
print(f"\nDelegation: activity_id={event2.activity_id} (4=Delegation)")
print(f"From: {event2.agent_name} → To: {event2.delegation_target}")

In [None]:
# ── Scenario: Agent calls an MCP tool to search the knowledge base ──
# The support agent invokes the "search_knowledge_base" tool via MCP
# to find relevant articles before answering the customer's question.

tool_span = make_span("mcp.tool.search_knowledge_base", {
    "aitf.mcp.tool.name": "search_knowledge_base",
    "aitf.mcp.tool.server": "kb-server",
    "aitf.mcp.server.transport": "stdio",
    "aitf.mcp.tool.input": '{"query": "How do I reset my password?"}',
    "aitf.mcp.tool.output": '[{"title": "Password Reset Guide", "score": 0.95}, '
                            '{"title": "MFA Setup", "score": 0.82}]',
    "aitf.mcp.tool.duration_ms": 245.3,
    "aitf.mcp.tool.is_error": False,
    "aitf.mcp.tool.approval_required": True,
    "aitf.mcp.tool.approved": True,
})

event = mapper.map_span(tool_span)
pp(event, "MCP Tool: KB Search → OCSF 7003")

# ── Scenario: RAG retrieval — vector search for similar support tickets ──
# The chatbot retrieves past tickets to provide context-aware answers.

rag_span = make_span("rag.retrieve", {
    "aitf.rag.retrieve.database": "pinecone",
    "aitf.rag.query": "password reset not working after MFA change",
    "aitf.rag.retrieve.top_k": 10,
    "aitf.rag.retrieve.results_count": 8,
    "aitf.rag.retrieve.min_score": 0.72,
    "aitf.rag.retrieve.max_score": 0.98,
    "aitf.rag.pipeline.name": "support-ticket-qa",
    "aitf.rag.pipeline.stage": "retrieve",
    "aitf.rag.query.embedding_model": "text-embedding-3-large",
})

event2 = mapper.map_span(rag_span)
pp(event2, "RAG: Ticket Retrieval → OCSF 7004")
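
The spans in Sections 3 and 4 land in four different OCSF classes. How `OCSFMapper` chooses a class internally is not shown in this notebook; the sketch below assumes a simple span-name prefix lookup, purely to make the routing concrete. `PREFIX_TO_CLASS_UID` and `classify_span` are hypothetical names.

```python
from typing import Optional

# Illustrative routing table: span-name prefix -> OCSF class UID.
# The prefixes are taken from the example span names used in this notebook.
PREFIX_TO_CLASS_UID = [
    ("chat ", 7001),        # AI Model Inference
    ("embeddings ", 7001),  # AI Model Inference
    ("agent.", 7002),       # AI Agent Activity
    ("mcp.tool.", 7003),    # AI Tool Execution
    ("rag.", 7004),         # AI Data Retrieval
]


def classify_span(name: str) -> Optional[int]:
    """Return the class UID for a span name, or None if no prefix matches."""
    for prefix, class_uid in PREFIX_TO_CLASS_UID:
        if name.startswith(prefix):
            return class_uid
    return None


for span_name in ["chat gpt-4o", "agent.delegation",
                  "mcp.tool.search_knowledge_base", "rag.retrieve"]:
    print(f"{span_name:35s} -> {classify_span(span_name)}")
```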

---

## 5. Vendor Mapping — LangChain & CrewAI <a id="vendor"></a>

Vendors supply **JSON mapping files** that translate their native telemetry to AITF conventions. No custom code is needed: the `VendorMapper` loads the JSON, matches each span to a vendor and event type, and renames its attributes to AITF keys.

**Scenario:** Your app uses LangChain for a customer-support chatbot (ChatOpenAI + VectorStoreRetriever) and CrewAI for a security-audit crew (agents + tools + delegation).  Each framework emits its own attributes — the VendorMapper normalizes them all.

```
LangChain ChatOpenAI  ─┐
LangChain Retriever   ─┤  VendorMapper  ──>  OCSFMapper  ──>  SIEM
CrewAI Agent          ─┤  (JSON-driven)      (standard)       (OCSF)
CrewAI Tool           ─┘
```

In [None]:
from aitf.ocsf.vendor_mapper import VendorMapper

vendor_mapper = VendorMapper()

print("Loaded Vendor Mappings")
print("=" * 60)
for info in vendor_mapper.list_supported_vendors():
    print(f"\n  Vendor:      {info['vendor']}")
    print(f"  Version:     {info['version']}")
    print(f"  Event Types: {info['event_types']}")
    print(f"  Description: {info['description'][:70]}...")

In [None]:
# ── Scenario: LangChain customer-support chatbot ──
# The chatbot uses ChatOpenAI for generation and a VectorStoreRetriever
# for RAG against the support knowledge base.

# Step 1: LangChain ChatOpenAI answers "How do I reset my password?"
lc_inference = make_span("ChatOpenAI", {
    # LangChain-native attributes (what LangSmith would emit)
    "ls_provider": "openai",
    "ls_model_name": "gpt-4o",
    "ls_temperature": 0.4,
    "llm.token_count.prompt": 380,   # system prompt + KB context + user question
    "llm.token_count.completion": 120,  # "Visit sso.acme.corp/reset..."
    "llm.token_count.total": 500,
    "langchain.run.response_id": "chatcmpl-abc123",
    "langchain.run.finish_reason": "stop",
})

result = vendor_mapper.normalize_span(lc_inference)
vendor, event_type, aitf_attrs = result

print("LangChain ChatOpenAI (Support Bot) → AITF")
print("=" * 60)
print(f"  Detected Vendor: {vendor}")
print(f"  Event Type:      {event_type}")
print(f"  OCSF Class UID:  {vendor_mapper.get_ocsf_class_uid(vendor, event_type)}")
print(f"\n  Attribute Translation (LangChain → AITF):")
print(f"    ls_provider           → gen_ai.system         = {aitf_attrs.get('gen_ai.system')}")
print(f"    ls_model_name         → gen_ai.request.model  = {aitf_attrs.get('gen_ai.request.model')}")
print(f"    ls_temperature        → gen_ai.request.temp   = {aitf_attrs.get('gen_ai.request.temperature')}")
print(f"    llm.token_count.*     → gen_ai.usage.*        = {aitf_attrs.get('gen_ai.usage.input_tokens')} in / {aitf_attrs.get('gen_ai.usage.output_tokens')} out")

# Step 2: LangChain VectorStoreRetriever searches the support KB
lc_rag = make_span("VectorStoreRetriever", {
    "langchain.retriever.name": "support-kb-pinecone",
    "langchain.retriever.type": "pinecone",
    "langchain.retriever.k": 8,
    "langchain.retriever.query": "How do I reset my password?",
    "langchain.retriever.documents": 6,
    "retriever.embeddings.model": "text-embedding-3-small",
})

result = vendor_mapper.normalize_span(lc_rag)
print(f"\n\nLangChain Retriever (KB Search) → {result[1]} (OCSF {vendor_mapper.get_ocsf_class_uid(*result[:2])})")
print("  Translated attributes:")
for k, v in sorted(result[2].items()):
    print(f"    {k}: {v}")

In [None]:
# ── Scenario: CrewAI security audit crew ──
# A manager agent coordinates a vuln-scanner and report-writer.
# CrewAI emits crewai.* attributes that VendorMapper normalizes.

# Agent execution: security-researcher analyzes threat intel
crew_agent = make_span("Agent Execution", {
    "crewai.agent.role": "security-researcher",
    "crewai.agent.goal": "Analyze threat intelligence reports for CVE-2024-46946",
    "crewai.agent.backstory": "15 years in cybersecurity, specializing in supply-chain attacks",
    "crewai.agent.id": "agent-sec-001",
    "crewai.agent.llm": "claude-sonnet-4-5-20250929",
    "crewai.agent.tools": "web_search,cve_lookup,code_scan",
    "crewai.crew.name": "security-audit-crew",
    "crewai.crew.process": "hierarchical",
})

result = vendor_mapper.normalize_span(crew_agent)
vendor, event_type, aitf_attrs = result

print("CrewAI Security Researcher → AITF")
print("=" * 60)
print(f"  Vendor: {vendor}, Event: {event_type}, OCSF: {vendor_mapper.get_ocsf_class_uid(vendor, event_type)}")
print(f"\n  CrewAI → AITF translations:")
print(f"    crewai.agent.role    → aitf.agent.name          = {aitf_attrs.get('aitf.agent.name')}")
print(f"    crewai.crew.name     → aitf.agent.team.name     = {aitf_attrs.get('aitf.agent.team.name')}")
print(f"    crewai.crew.process  → aitf.agent.team.topology = {aitf_attrs.get('aitf.agent.team.topology')}")
print(f"    (default)            → aitf.agent.framework     = {aitf_attrs.get('aitf.agent.framework')}")

# Task delegation: manager assigns report writing to specialist
crew_delegation = make_span("Task Delegation", {
    "crewai.delegation.from_agent": "security-manager",
    "crewai.delegation.to_agent": "report-writer",
    "crewai.delegation.task": "Write executive summary of CVE-2024-46946 impact",
    "crewai.delegation.reason": "Requires professional report writing skills",
    "crewai.delegation.strategy": "capability",
})

result = vendor_mapper.normalize_span(crew_delegation)
print(f"\n\nCrewAI Delegation → {result[1]} (OCSF {vendor_mapper.get_ocsf_class_uid(*result[:2])})")
for k, v in sorted(result[2].items()):
    print(f"    {k}: {v}")

In [None]:
# --- Load a custom vendor mapping at runtime ---

import tempfile
from pathlib import Path

autogen_mapping = {
    "vendor": "autogen",
    "version": "0.4",
    "description": "Maps Microsoft AutoGen telemetry to AITF conventions",
    "homepage": "https://microsoft.github.io/autogen/",
    "span_name_patterns": {
        "inference": ["^AutoGen\\.LLM", "^AutoGen\\.ChatCompletion"],
        "agent": ["^AutoGen\\.Agent", "^AutoGen\\.GroupChat"],
    },
    "attribute_mappings": {
        "inference": {
            "vendor_to_aitf": {
                "autogen.llm.model": "gen_ai.request.model",
                "autogen.llm.provider": "gen_ai.system",
                "autogen.llm.input_tokens": "gen_ai.usage.input_tokens",
                "autogen.llm.output_tokens": "gen_ai.usage.output_tokens",
            },
            "ocsf_class_uid": 7001,
            "ocsf_activity_id_map": {"chat": 1, "default": 1},
            "defaults": {"gen_ai.operation.name": "chat"},
        },
        "agent": {
            "vendor_to_aitf": {
                "autogen.agent.name": "aitf.agent.name",
                "autogen.agent.type": "aitf.agent.type",
                "autogen.group.name": "aitf.agent.team.name",
            },
            "ocsf_class_uid": 7002,
            "ocsf_activity_id_map": {"default": 3},
            "defaults": {"aitf.agent.framework": "autogen"},
        },
    },
    "provider_detection": {
        "attribute_keys": ["autogen.llm.provider"],
        "model_prefix_to_provider": {"gpt-": "openai", "claude-": "anthropic"},
    },
    "severity_mapping": {},
    "metadata": {
        "ocsf_product": {"name": "AutoGen", "vendor_name": "Microsoft", "version": "0.4"},
    },
}

# Write to temp file and load
with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
    json.dump(autogen_mapping, f)
    tmp_path = f.name

vendor_mapper.load_file(tmp_path)
Path(tmp_path).unlink(missing_ok=True)  # remove the temp file once the mapping is loaded

print("Custom Vendor Mapping Loaded")
print("=" * 60)
print(f"  Total vendors: {len(vendor_mapper.vendors)}")
print(f"  Vendors: {', '.join(vendor_mapper.vendors)}")

# Test AutoGen span
autogen_span = make_span("AutoGen.Agent.execute", {
    "autogen.agent.name": "code-reviewer",
    "autogen.agent.type": "assistant",
    "autogen.group.name": "dev-team",
})

result = vendor_mapper.normalize_span(autogen_span)
print(f"\n  AutoGen span detected: {result[0]}/{result[1]}")
print(f"  Translated attributes:")
for k, v in sorted(result[2].items()):
    print(f"    {k}: {v}")
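
To make the mapping format less of a black box, here is a minimal sketch of how a file shaped like `autogen_mapping` could be interpreted: match the span name against `span_name_patterns`, start from `defaults`, then rename keys through `vendor_to_aitf`. The `normalize` function is a hypothetical stand-in written for this example and is not the `VendorMapper` implementation.

```python
import re


def normalize(mapping: dict, span_name: str, attributes: dict):
    """Return (vendor, event_type, aitf_attributes), or None if nothing matches."""
    for event_type, patterns in mapping["span_name_patterns"].items():
        if any(re.search(pattern, span_name) for pattern in patterns):
            rules = mapping["attribute_mappings"][event_type]
            translated = dict(rules.get("defaults", {}))
            for vendor_key, aitf_key in rules["vendor_to_aitf"].items():
                if vendor_key in attributes:
                    translated[aitf_key] = attributes[vendor_key]
            return mapping["vendor"], event_type, translated
    return None


# A trimmed copy of the AutoGen mapping, kept small for the example.
demo_mapping = {
    "vendor": "autogen",
    "span_name_patterns": {"agent": [r"^AutoGen\.Agent", r"^AutoGen\.GroupChat"]},
    "attribute_mappings": {
        "agent": {
            "vendor_to_aitf": {
                "autogen.agent.name": "aitf.agent.name",
                "autogen.group.name": "aitf.agent.team.name",
            },
            "defaults": {"aitf.agent.framework": "autogen"},
        },
    },
}

result = normalize(demo_mapping, "AutoGen.Agent.execute", {
    "autogen.agent.name": "code-reviewer",
    "autogen.group.name": "dev-team",
})
print(result)
```

Keeping the translation rules in data rather than code means a new framework can be supported by shipping a JSON file, with no change to the mapper itself.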

---

## 6. Compliance Framework Mapping <a id="compliance"></a>

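AITF automatically maps each AI event type to controls from **8 compliance frameworks**, a mix of regulations (EU AI Act, GDPR, CCPA) and voluntary frameworks (NIST AI RMF, MITRE ATLAS, ISO 42001, SOC 2, CSA AICM).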

In [None]:
from aitf.ocsf.compliance_mapper import ComplianceMapper

compliance_mapper = ComplianceMapper()

# Map model_inference to all 8 frameworks
compliance = compliance_mapper.map_event("model_inference")

print("Compliance Mapping: model_inference")
print("=" * 60)

frameworks = [
    ("NIST AI RMF", compliance.nist_ai_rmf),
    ("MITRE ATLAS", compliance.mitre_atlas),
    ("ISO 42001", compliance.iso_42001),
    ("EU AI Act", compliance.eu_ai_act),
    ("SOC 2", compliance.soc2),
    ("GDPR", compliance.gdpr),
    ("CCPA", compliance.ccpa),
    ("CSA AICM", compliance.csa_aicm),
]

for name, data in frameworks:
    if data:
        # Get the primary control list
        controls = data.get("controls") or data.get("techniques") or data.get("articles") or data.get("sections") or []
        display = controls[:5]
        suffix = f" (+{len(controls)-5} more)" if len(controls) > 5 else ""
        print(f"  {name:15s} {', '.join(str(c) for c in display)}{suffix}")

In [None]:
# Coverage matrix — which frameworks apply to each event type
matrix = compliance_mapper.get_coverage_matrix()

print("Compliance Coverage Matrix")
print("=" * 80)

header_fw = ["nist", "atlas", "iso", "eu_ai", "soc2", "gdpr", "ccpa", "aicm"]
fw_keys = ["nist_ai_rmf", "mitre_atlas", "iso_42001", "eu_ai_act", "soc2", "gdpr", "ccpa", "csa_aicm"]

print(f"  {'Event Type':22s} {' '.join(f'{h:>6s}' for h in header_fw)}  Total")
print(f"  {'-'*22} {' '.join(['------'] * 8)}  -----")

for event_type, fw_map in matrix.items():
    counts = []
    total = 0
    for fk in fw_keys:
        n = len(fw_map.get(fk, []))
        counts.append(n)
        total += n
    print(f"  {event_type:22s} {' '.join(f'{c:6d}' for c in counts)}  {total:5d}")
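
The loop above assumes a matrix shaped as event type, then framework key, then a list of controls. The sketch below reproduces the counting step on a tiny stand-in matrix so the data shape is visible without the SDK. The control IDs are placeholders invented for the example, not real framework controls, and `coverage_counts` is a hypothetical helper.

```python
# Placeholder control IDs; real identifiers come from ComplianceMapper.
demo_matrix = {
    "model_inference": {"nist_ai_rmf": ["N-1", "N-2"], "eu_ai_act": ["E-1"]},
    "agent_activity": {"nist_ai_rmf": ["N-3"], "gdpr": ["G-1", "G-2"]},
}


def coverage_counts(matrix: dict, framework_keys: list) -> dict:
    """Map each event type to (per-framework control counts, total)."""
    rows = {}
    for event_type, fw_map in matrix.items():
        # A framework missing from fw_map contributes zero controls.
        counts = [len(fw_map.get(key, [])) for key in framework_keys]
        rows[event_type] = (counts, sum(counts))
    return rows


rows = coverage_counts(demo_matrix, ["nist_ai_rmf", "eu_ai_act", "gdpr"])
for event_type, (counts, total) in rows.items():
    print(f"{event_type:18s} {counts}  total={total}")
```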

In [None]:
# Enrich an OCSF event with compliance metadata
event = AIModelInferenceEvent(
    activity_id=1,
    model=AIModelInfo(model_id="gpt-4o", provider="openai"),
    token_usage=AITokenUsage(input_tokens=100, output_tokens=50),
    finish_reason="stop",
)

enriched = compliance_mapper.enrich_event(event, "model_inference")

print("Enriched OCSF Event (with compliance)")
print("=" * 60)
print(f"  Event class: {enriched.class_uid} ({enriched.message or 'model_inference'})")
print(f"  Has compliance: {enriched.compliance is not None}")
print(f"  NIST controls: {enriched.compliance.nist_ai_rmf}")
print(f"  EU AI Act:     {enriched.compliance.eu_ai_act}")

# Generate audit record
audit = compliance_mapper.generate_audit_record(
    event_type="model_inference",
    actor="analyst@example.com",
    model="gpt-4o",
)
pp(audit, "Audit Record")

---

## 7. Agentic Log Entries <a id="agentic-log"></a>

Structured log entries that capture security-relevant context for every AI agent action, based on the **Table 10.1 minimal fields** specification.

**Scenario:** A DevOps incident-response agent handles a production alert for high CPU on the payments service.  It checks health, queries logs, runs EXPLAIN on a slow query, and applies a hotfix — each action logged with goal, tool, outcome, confidence, anomaly score, and policy evaluation.

```python
# What your incident-response agent does:
health = k8s.check_health("payments-service")       # CPU 94%, degraded
logs = datadog.query_logs("payments-service", "15m") # QueryTimeout errors
explain = postgres.explain(slow_query)               # Seq Scan on 2.4M rows
postgres.kill_slow_queries(min_duration="30s")       # Fix: kill + add index
# ↑ Each action is recorded as an agentic log entry, including its policy evaluation
```

In [None]:
from aitf.instrumentation.agentic_log import AgenticLogInstrumentor

provider = TracerProvider()
provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

agentic_log = AgenticLogInstrumentor(tracer_provider=provider)

# ── Scenario: DevOps agent checks service health ──
# First action in an incident-response workflow.
print("Agentic Log: DevOps Incident Response")
print("=" * 60)
print("  ALERT: payments-service CPU at 94%\n")

with agentic_log.log_action(
    agent_id="agent-devops-responder-prod-001",
    session_id="sess-inc-4421",
    event_id="e-health-check-001",
) as entry:
    entry.set_goal_id("goal-resolve-payments-cpu-alert")
    entry.set_sub_task_id("task-check-health")
    entry.set_tool_used("k8s.health_check")
    entry.set_tool_parameters({"service": "payments-service", "namespace": "production"})
    entry.set_outcome("SUCCESS")
    entry.set_confidence_score(0.95)
    entry.set_anomaly_score(0.10)
    entry.set_policy_evaluation({
        "policy": "read_only_monitoring",
        "result": "PASS",
    })

print(f"  Event ID:      {entry.event_id}")
print(f"  Timestamp:     {entry.timestamp}")
print(f"  Agent:         agent-devops-responder-prod-001")
print(f"  Goal:          goal-resolve-payments-cpu-alert")
print(f"  Tool:          k8s.health_check")
print(f"  Outcome:       SUCCESS")
print(f"  Confidence:    0.95")
print(f"  Anomaly:       0.10  (low = normal read-only operation)")
print(f"  Policy:        PASS  (read_only_monitoring)")

In [None]:
# ── Scenario: Agent tries a destructive action → DENIED by policy ──
# The agent wants to DROP a cache table, but the policy engine blocks it.
# High anomaly score (0.92) would trigger a SIEM alert.

print("Agentic Log: Blocked Destructive Action")
print("=" * 60)
print("  Agent attempts: DROP TABLE query_cache CASCADE\n")

with agentic_log.log_action(
    agent_id="agent-devops-responder-prod-001",
    session_id="sess-inc-4421",
    goal_id="goal-resolve-payments-cpu-alert",
    sub_task_id="task-drop-cache-table",
    tool_used="postgres.drop_table",
    tool_parameters={"table": "query_cache", "cascade": True},
) as entry:
    entry.set_outcome("DENIED")
    entry.set_confidence_score(0.40)
    entry.set_anomaly_score(0.92)
    entry.set_policy_evaluation({
        "policy": "destructive_operations",
        "result": "FAIL",
        "reason": "DROP TABLE is permanently destructive — requires human approval",
        "escalated_to": "oncall@acme.corp",
    })

print(f"  Outcome:       DENIED")
print(f"  Confidence:    0.40  (agent was unsure this was the right fix)")
print(f"  Anomaly:       0.92  (HIGH — triggers SIEM alert)")
print(f"  Policy:        FAIL  (destructive_operations)")
print(f"  Escalation:    oncall@acme.corp")
print(f"\n  In production, this generates an OCSF security finding event")
print(f"  and pages the on-call engineer for manual review.")
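
The `with ... as entry` pattern used in both cells can be sketched with the standard library alone. This is an illustration of the pattern under stated assumptions, not the `AgenticLogInstrumentor` implementation: `LogEntry`, `log_action`, and the `ANOMALY_ALERT_THRESHOLD` value of 0.8 are all hypothetical.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional

ANOMALY_ALERT_THRESHOLD = 0.8  # assumed cut-off for this sketch, not an AITF constant


@dataclass
class LogEntry:
    agent_id: str
    session_id: str
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )
    outcome: Optional[str] = None
    anomaly_score: Optional[float] = None
    policy_evaluation: Dict[str, Any] = field(default_factory=dict)

    def set_outcome(self, outcome: str) -> None:
        self.outcome = outcome

    def set_anomaly_score(self, score: float) -> None:
        self.anomaly_score = score

    def set_policy_evaluation(self, evaluation: Dict[str, Any]) -> None:
        self.policy_evaluation = evaluation

    @property
    def needs_alert(self) -> bool:
        """True when the anomaly score reaches the assumed alert threshold."""
        return (self.anomaly_score or 0.0) >= ANOMALY_ALERT_THRESHOLD


@contextmanager
def log_action(sink: List[LogEntry], agent_id: str, session_id: str):
    entry = LogEntry(agent_id=agent_id, session_id=session_id)
    try:
        yield entry
    finally:
        sink.append(entry)  # emit the entry even if the action raised


emitted: List[LogEntry] = []
with log_action(emitted, "agent-devops-responder-prod-001", "sess-inc-4421") as entry:
    entry.set_outcome("DENIED")
    entry.set_anomaly_score(0.92)
    entry.set_policy_evaluation({"policy": "destructive_operations", "result": "FAIL"})

print(emitted[0].outcome, emitted[0].needs_alert)
```

Emitting in the `finally` branch matters for security logging: a denied or failed action is exactly the case where the record must not be lost.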

---

## 8. AI-BOM Generation <a id="aibom"></a>

Generate an **AI Bill of Materials** from telemetry. The generator catalogs the models, tools, and frameworks in use.

**Scenario:** A fraud-detection platform uses Claude Sonnet for transaction scoring, GPT-4o for report generation, Pinecone for similar-case retrieval, and LangChain for agent orchestration.  The AI-BOM catalogs every AI component and flags known vulnerabilities (e.g., CVE-2024-46946 in LangChain).

In [None]:
from aitf.generators.ai_bom import AIBOMGenerator

bom = AIBOMGenerator(
    system_name="fraud-detection-platform",
    system_version="4.2.0",
)

# ── Register components discovered in the fraud-detection platform ──
# In production, models and tools are auto-discovered from OTel spans.
# Here we register them manually to show the AI-BOM structure.

# Models used for transaction scoring and report generation
bom.add_component("model", "claude-sonnet-4-5-20250929", "20250929", provider="Anthropic", license="Commercial")
bom.add_component("model", "gpt-4o", "2025-01-01", provider="OpenAI", license="Commercial")
bom.add_component("model", "text-embedding-3-large", "2024-01-25", provider="OpenAI", license="Commercial")

# Frameworks and infrastructure
bom.add_component("framework", "langchain", "0.3.12", provider="LangChain Inc.", license="MIT")
bom.add_component("vector_store", "pinecone", "3.0.0", provider="Pinecone")
bom.add_component("tool", "merchant-risk-mcp", "2.3.0", provider="internal")

# Guardrails and prompt templates
bom.add_component("guardrail", "pii-redaction-filter", "2.4", provider="internal",
                   description="Redacts card numbers and SSNs before model input")
bom.add_component("prompt_template", "fraud-scoring-v3", "3.1", provider="internal",
                   properties={"category": "fraud_detection", "reviewed": True})

# Known vulnerabilities in dependencies
bom.add_vulnerability("framework", "langchain", "CVE-2024-46946",
    severity="high", description="Arbitrary code execution via sympy.sympify in langchain-experimental's LLMSymbolicMathChain")
bom.add_vulnerability("framework", "langchain", "CVE-2024-28088",
    severity="medium", description="Directory traversal via the path argument of load_chain")

print("Fraud Detection Platform — AI-BOM Components")
print("=" * 60)
summary = bom.get_component_summary()
pp(summary)

In [None]:
# Generate AI-BOM in multiple formats
doc = bom.generate(bom_id="bom-fraud-platform-2026-001")

print(f"AI-BOM: Fraud Detection Platform")
print(f"  ID:              {doc.bom_id}")
print(f"  System:          {doc.system_name} v{doc.system_version}")
print(f"  Components:      {doc.component_count}")
print(f"  Types:           {doc.component_types}")
print(f"  Vulnerabilities: {len(doc.vulnerabilities)}")

# CycloneDX — the format used by most vulnerability scanners
cdx = doc.to_cyclonedx()
print(f"\nCycloneDX Export (for dependency scanners)")
print(f"  bomFormat:       {cdx['bomFormat']}")
print(f"  specVersion:     {cdx['specVersion']}")
print(f"  components:      {len(cdx['components'])}")
print(f"  vulnerabilities: {len(cdx.get('vulnerabilities', []))}")

# SPDX — the format used for license compliance
spdx = doc.to_spdx()
print(f"\nSPDX Export (for license compliance)")
print(f"  spdxVersion:     {spdx['spdxVersion']}")
print(f"  packages:        {len(spdx['packages'])}")

# Show a sample component
print(f"\nSample Component (CycloneDX):")
pp(cdx['components'][0])
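
For readers without the SDK at hand, the shape of a CycloneDX document can be sketched in a few lines. This is an assumption-laden illustration, not the output of `doc.to_cyclonedx()`: the `CDX_TYPE` mapping from this notebook's component kinds to CycloneDX component types and the choice of `specVersion` 1.6 are both made up for the example.

```python
import json

# Assumed mapping from the notebook's component kinds to CycloneDX component types.
# Kinds without an entry fall back to "library".
CDX_TYPE = {
    "model": "machine-learning-model",
    "framework": "framework",
}


def to_cyclonedx(components: list) -> dict:
    """Build a minimal CycloneDX-shaped dict from (kind, name, version) tuples."""
    return {
        "bomFormat": "CycloneDX",
        "specVersion": "1.6",
        "version": 1,
        "components": [
            {"type": CDX_TYPE.get(kind, "library"), "name": name, "version": version}
            for kind, name, version in components
        ],
    }


demo_bom = to_cyclonedx([
    ("model", "claude-sonnet-4-5-20250929", "20250929"),
    ("framework", "langchain", "0.3.12"),
    ("vector_store", "pinecone", "3.0.0"),
])
print(json.dumps(demo_bom, indent=2))
```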

---

## 9. Full Pipeline — End to End <a id="pipeline"></a>

Demonstrates the complete flow from framework-native telemetry to SIEM-ready events:

```
Your App Code (LangChain/CrewAI)
    ↓ emits vendor-native spans
VendorMapper (JSON-driven normalization)
    ↓ AITF semantic conventions
OCSFMapper (span → OCSF event)
    ↓ OCSF Category 7 events
ComplianceMapper (regulatory enrichment)
    ↓ NIST AI RMF + EU AI Act + ...
SIEM / XDR (Splunk, QRadar, Sentinel)
```

**Scenario:** A LangChain `ChatAnthropic` span from the customer-support bot flows through the entire pipeline, emerging as a compliance-enriched OCSF event.

In [None]:
from aitf.ocsf.vendor_mapper import VendorMapper
from aitf.ocsf.mapper import OCSFMapper
from aitf.ocsf.compliance_mapper import ComplianceMapper

# Initialize the pipeline
vendor_mapper = VendorMapper()
ocsf_mapper = OCSFMapper()
compliance_mapper = ComplianceMapper(frameworks=["nist_ai_rmf", "eu_ai_act", "csa_aicm"])

# Simulate a LangChain inference span with vendor-native attributes
raw_span = make_span("ChatAnthropic", {
    "ls_provider": "anthropic",
    "ls_model_name": "claude-sonnet-4-5-20250929",
    "ls_temperature": 0.5,
    "llm.token_count.prompt": 500,
    "llm.token_count.completion": 300,
    "llm.token_count.total": 800,
})

print("Full Pipeline Execution")
print("=" * 70)

# Step 1: Vendor normalization
norm = vendor_mapper.normalize_span(raw_span)
vendor, event_type, aitf_attrs = norm
print(f"\n  Step 1 — Vendor Detection")
print(f"    Vendor:     {vendor}")
print(f"    Event Type: {event_type}")
print(f"    Attributes: {len(aitf_attrs)} AITF keys")

# Step 2: Create a normalized span and map to OCSF
normalized_span = make_span(f"chat {aitf_attrs.get('gen_ai.request.model', 'unknown')}", aitf_attrs)
ocsf_event = ocsf_mapper.map_span(normalized_span)

print(f"\n  Step 2 — OCSF Mapping")
print(f"    Class UID:   {ocsf_event.class_uid}  (AI Model Inference)")
print(f"    Type UID:    {ocsf_event.type_uid}")
print(f"    Activity:    {ocsf_event.activity_id}  (chat)")
print(f"    Model:       {ocsf_event.model.model_id}")
print(f"    Provider:    {ocsf_event.model.provider}")
print(f"    Tokens:      {ocsf_event.token_usage.total_tokens}")

# Step 3: Compliance enrichment
enriched = compliance_mapper.enrich_event(ocsf_event, "model_inference")

print(f"\n  Step 3 — Compliance Enrichment")
print(f"    NIST AI RMF: {enriched.compliance.nist_ai_rmf['controls']}")
print(f"    EU AI Act:   {enriched.compliance.eu_ai_act['articles']}")
csa = enriched.compliance.csa_aicm
print(f"    CSA AICM:    {len(csa['controls'])} controls across {csa['domains']}")

# Final serialized event
final = enriched.model_dump(exclude_none=True)
print(f"\n  Final OCSF event: {len(json.dumps(final, default=str))} bytes")
print(f"  Top-level keys: {list(final.keys())}")

In [None]:
# ── Full pipeline with a CrewAI security audit crew ──
# Simulates a real CrewAI workflow: manager coordinates vulnerability
# scanning and report writing across multiple agents.

spans = [
    # Manager agent kicks off the security audit
    ("Crew Execution", {
        "crewai.agent.role": "security-manager",
        "crewai.agent.id": "agent-mgr-001",
        "crewai.agent.goal": "Coordinate security audit of payments service",
        "crewai.crew.name": "security-audit-crew",
        "crewai.crew.process": "hierarchical",
    }),
    # Claude Sonnet analyzes vulnerabilities
    ("LLM Call claude-sonnet-4-5-20250929", {
        "crewai.llm.model": "claude-sonnet-4-5-20250929",
        "crewai.llm.provider": "anthropic",
        "crewai.llm.input_tokens": 1500,
        "crewai.llm.output_tokens": 600,
        "crewai.llm.response_time_ms": 2800.0,
    }),
    # Tool: CVE lookup for langchain vulnerabilities
    ("Tool Execution cve_lookup", {
        "crewai.tool.name": "cve_lookup",
        "crewai.tool.input": '{"package": "langchain", "min_severity": "high"}',
        "crewai.tool.output": '[{"id": "CVE-2024-46946", "severity": "high"}]',
        "crewai.tool.agent": "vuln-scanner",
        "crewai.tool.duration_ms": 1850.0,
    }),
    # Manager delegates report writing
    ("Task Delegation", {
        "crewai.delegation.from_agent": "security-manager",
        "crewai.delegation.to_agent": "report-writer",
        "crewai.delegation.task": "Write executive summary of security findings",
        "crewai.delegation.reason": "Requires professional report writing skills",
    }),
    # GPT-4o generates the executive report
    ("LLM Call gpt-4o", {
        "crewai.llm.model": "gpt-4o",
        "crewai.llm.provider": "openai",
        "crewai.llm.input_tokens": 2200,
        "crewai.llm.output_tokens": 900,
        "crewai.llm.response_time_ms": 3200.0,
    }),
]

print("CrewAI Security Audit → Full Pipeline")
print("=" * 70)

events_collected = []
for span_name, attrs in spans:
    span = make_span(span_name, attrs)
    result = vendor_mapper.normalize_span(span)
    if result:
        vendor, etype, aitf_a = result
        class_uid = vendor_mapper.get_ocsf_class_uid(vendor, etype)
        events_collected.append({
            "span": span_name,
            "vendor": vendor,
            "event_type": etype,
            "ocsf_class": class_uid,
            "aitf_keys": len(aitf_a),
        })
        print(f"\n  {span_name}")
        print(f"    → {vendor}/{etype} → OCSF {class_uid} | {len(aitf_a)} AITF attrs")

print(f"\n  Total events: {len(events_collected)}")
print(f"  Event types: {set(e['event_type'] for e in events_collected)}")
print("  All events are normalized and mapped to OCSF classes, ready for compliance enrichment.")
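
The two LLM-call spans above carry token counts under `crewai.llm.*` keys. As a quick sanity check before events reach the SIEM, those counts can be totalled per provider. The sketch below works on plain dictionaries copied from the cell above and does not call any AITF API; the `token_totals` helper is a hypothetical name introduced only for illustration.

```python
from collections import defaultdict

# Illustrative copies of the two LLM-call attribute sets from the cell above.
llm_calls = [
    {"crewai.llm.model": "claude-sonnet-4-5-20250929", "crewai.llm.provider": "anthropic",
     "crewai.llm.input_tokens": 1500, "crewai.llm.output_tokens": 600},
    {"crewai.llm.model": "gpt-4o", "crewai.llm.provider": "openai",
     "crewai.llm.input_tokens": 2200, "crewai.llm.output_tokens": 900},
]

def token_totals(attr_sets):
    """Sum input/output tokens per provider (hypothetical helper, not part of AITF)."""
    totals = defaultdict(lambda: {"input": 0, "output": 0})
    for attrs in attr_sets:
        provider = attrs.get("crewai.llm.provider")
        if provider is None:
            continue  # not an LLM-call span, nothing to count
        totals[provider]["input"] += attrs.get("crewai.llm.input_tokens", 0)
        totals[provider]["output"] += attrs.get("crewai.llm.output_tokens", 0)
    return dict(totals)

totals = token_totals(llm_calls)
for provider, t in sorted(totals.items()):
    combined = t["input"] + t["output"]
    print(f"{provider:<10s} in={t['input']:>5d} out={t['output']:>5d} total={combined:>5d}")
```

Spans without a provider key (crew execution, tool calls, delegations) are skipped, so the same function can be fed the full attribute list from the cell above.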

---

## 10. Export & Visualization <a id="viz"></a>

Visualize the OCSF events as a formatted table and export them as JSONL.

**Scenario:** A production environment runs both LangChain and CrewAI workloads.
We collect all spans, normalize them through the pipeline, and export as JSONL
for ingestion into your SIEM.

In [None]:
# ── Collect OCSF events from a mixed LangChain + CrewAI production workload ──

all_spans = [
    # LangChain: customer support chatbot
    ("ChatOpenAI", {"ls_provider": "openai", "ls_model_name": "gpt-4o",
                    "llm.token_count.prompt": 380, "llm.token_count.completion": 120}),
    ("ChatAnthropic", {"ls_provider": "anthropic", "ls_model_name": "claude-sonnet-4-5-20250929",
                       "llm.token_count.prompt": 500, "llm.token_count.completion": 300}),
    ("AgentExecutor", {"langchain.agent.name": "support-agent"}),
    ("VectorStoreRetriever", {"langchain.retriever.name": "support-kb-pinecone", "langchain.retriever.k": 8}),
    # CrewAI: security audit crew
    ("Crew Execution", {"crewai.agent.role": "security-manager", "crewai.crew.name": "security-audit-crew"}),
    ("LLM Call claude-sonnet-4-5-20250929", {"crewai.llm.model": "claude-sonnet-4-5-20250929", "crewai.llm.input_tokens": 1500}),
    ("Tool Execution cve_lookup", {"crewai.tool.name": "cve_lookup", "crewai.tool.duration_ms": 1850.0}),
    ("Task Delegation", {"crewai.delegation.from_agent": "security-manager", "crewai.delegation.to_agent": "report-writer"}),
]

print(f"{'Span Name':<40s} {'Vendor':<12s} {'Event Type':<14s} {'OCSF':>6s}")
print(f"{'-'*40} {'-'*12} {'-'*14} {'-'*6}")

ocsf_events = []
for name, attrs in all_spans:
    span = make_span(name, attrs)
    result = vendor_mapper.normalize_span(span)
    if result:
        v, et, aa = result
        uid = vendor_mapper.get_ocsf_class_uid(v, et) or "?"
        print(f"{name:<40s} {v:<12s} {et:<14s} {uid:>6}")

        norm_span = make_span(f"{et} {name}", aa)
        ocsf_ev = ocsf_mapper.map_span(norm_span)
        if ocsf_ev:
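            # The vendor mapper reports "inference"; enrich_event is called with "model_inference" elsewhere.
            # An exact match avoids rewriting names that merely contain "inference".
            compliance_type = "model_inference" if et == "inference" else et
            enriched = compliance_mapper.enrich_event(ocsf_ev, compliance_type)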
            ocsf_events.append(enriched)

print(f"\nTotal OCSF events generated: {len(ocsf_events)}")

In [None]:
# Export events as JSONL (the format used by OCSF Exporter)
import io

jsonl_buffer = io.StringIO()
for ev in ocsf_events:
    line = json.dumps(ev.model_dump(exclude_none=True), default=str)
    jsonl_buffer.write(line + "\n")

jsonl_content = jsonl_buffer.getvalue()

print("OCSF JSONL Export")
print("=" * 60)
print(f"  Events:    {len(ocsf_events)}")
print(f"  Size:      {len(jsonl_content):,} bytes")
print(f"  Format:    JSONL (one OCSF event per line)")
print(f"\nFirst event (preview):")
first_event = json.loads(jsonl_content.splitlines()[0]) if ocsf_events else {}
preview_keys = {"class_uid", "type_uid", "activity_id", "category_uid",
                "severity_id", "status_id", "message", "time"}
preview = {k: v for k, v in first_event.items() if k in preview_keys}
pp(preview)
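
The cell above keeps the export in an in-memory buffer. For SIEM ingestion the lines usually need to land in a file, so here is a minimal sketch of writing JSONL to disk and reading it back to confirm every line parses. It assumes events are already plain dictionaries (what `model_dump(exclude_none=True)` returns); the sample events, the file name, and the `write_jsonl`/`read_jsonl` helpers are placeholders, not AITF APIs.

```python
import json
import tempfile
from pathlib import Path

# Placeholder events standing in for `ev.model_dump(exclude_none=True)` output.
sample_events = [
    {"class_uid": 7001, "activity_id": 1, "message": "chat gpt-4o"},
    {"class_uid": 7001, "activity_id": 1, "message": "chat claude-sonnet-4-5-20250929"},
]

def write_jsonl(events, path):
    """Write one JSON object per line; default=str covers datetimes and similar values."""
    with open(path, "w", encoding="utf-8") as fh:
        for event in events:
            fh.write(json.dumps(event, default=str) + "\n")

def read_jsonl(path):
    """Parse a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]

out_path = Path(tempfile.gettempdir()) / "aitf_ocsf_events.jsonl"  # hypothetical file name
write_jsonl(sample_events, out_path)
round_tripped = read_jsonl(out_path)
print(f"Wrote {len(round_tripped)} events to {out_path}")
```

On Colab, the resulting file can then be fetched with `files.download(str(out_path))` from `google.colab`.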

In [None]:
# Event statistics
from collections import Counter

class_counts = Counter()
vendor_counts = Counter()

for name, attrs in all_spans:
    span = make_span(name, attrs)
    result = vendor_mapper.normalize_span(span)
    if result:
        v, et, _ = result
        uid = vendor_mapper.get_ocsf_class_uid(v, et)
        class_name = class_map.get(uid, (f"Class {uid}",))[0] if uid else "Unknown"
        class_counts[class_name] += 1
        vendor_counts[v] += 1

print("Event Distribution by OCSF Class")
print("=" * 50)
for cls, count in class_counts.most_common():
    bar = '#' * (count * 5)
    print(f"  {cls:<25s} {bar} {count}")

print(f"\nEvent Distribution by Vendor")
print("=" * 50)
for vendor, count in vendor_counts.most_common():
    bar = '#' * (count * 5)
    print(f"  {vendor:<12s} {bar} {count}")

print(f"\n  Total spans processed: {sum(vendor_counts.values())}")
print(f"  Unique OCSF classes:   {len(class_counts)}")
print(f"  Unique vendors:        {len(vendor_counts)}")

---

## Summary

This notebook demonstrated the full AITF pipeline:

| Step | Component | What it Does |
|------|-----------|-------------|
| 1 | **VendorMapper** | Normalizes LangChain/CrewAI/custom telemetry to AITF conventions |
| 2 | **OCSFMapper** | Converts OTel spans to OCSF Category 7 events (7001–7008) |
| 3 | **ComplianceMapper** | Enriches events with controls from 8 regulatory frameworks |
| 4 | **AgenticLogInstrumentor** | Emits structured security-context log entries per Table 10.1 |
| 5 | **AIBOMGenerator** | Generates an AI Bill of Materials in AITF/CycloneDX/SPDX formats |
| 6 | **Exporters** | Ship events to the SIEM as OCSF JSONL, Immutable Audit Logs, or CEF Syslog |

### Next Steps

- **Production deployment**: See `docs/deployment-guide.md` for 10 complete deployment examples
- **Custom vendor mapping**: Create a JSON file for your agentic framework
- **SIEM integration**: Use the CEF Syslog exporter for Splunk/QRadar/ArcSight
- **Detection rules**: See `examples/detection-rules/` for Sigma and Splunk queries

---

*AITF — Security-first telemetry for AI systems*