# OAI RAN Bug Root-Cause Analysis Workflow

This notebook implements a structured workflow for investigating bugs in the OpenAirInterface5G (OAI) RAN project. It follows the process defined in `QAprompt.json`, guiding the user from raw bug reports to root-cause analysis by cross-referencing specifications and code.

### Agent Instruction

If you are an AI coding agent, you have access to the entire codebase in the current folder. You also have access to `spec_ingestor.py`, which helps you cite the latest 3GPP standard correctly. Note that it does not use AI; treat it as a grep tool with structured output. If you make any claim about a 3GPP standard, you *MUST* verify it with this tool. Invoke it strictly following this pattern: "C:/Users/woaiw/cmd/envs/CS211/python.exe p:/openairinterface5g/spec_ingestor.py F1AP.docx 'keyword'" for F1AP and "C:/Users/woaiw/cmd/envs/CS211/python.exe p:/openairinterface5g/spec_ingestor.py NGAP.docx 'keyword'" for NGAP.
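The invocation pattern above can be wrapped so that every lookup is built the same way. The following is a minimal sketch, not part of the project: `build_spec_query` and the `SPEC_DOCS` mapping are hypothetical names, and the paths are simply copied from the instruction above.

```python
from typing import List

# Paths copied from the instruction above; adjust them for another machine.
PYTHON_EXE = "C:/Users/woaiw/cmd/envs/CS211/python.exe"
INGESTOR = "p:/openairinterface5g/spec_ingestor.py"

# Only these two spec documents are named in the instruction above.
SPEC_DOCS = {"F1AP": "F1AP.docx", "NGAP": "NGAP.docx"}


def build_spec_query(interface: str, keyword: str) -> List[str]:
    """Return the argv list for one spec_ingestor.py lookup (hypothetical helper)."""
    if interface not in SPEC_DOCS:
        raise ValueError(f"no spec document configured for {interface!r}")
    if not keyword.strip():
        raise ValueError("keyword must not be empty")
    # A list (rather than one shell string) keeps keywords with spaces intact.
    return [PYTHON_EXE, INGESTOR, SPEC_DOCS[interface], keyword]
```

The returned list can be passed to `subprocess.run(argv, capture_output=True, text=True)`. The tool's output format is not described here, so the sketch deliberately does not parse it.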

### Global Definitions
The workflow adheres to the following definitions and enumerations:

```json
  "$defs": {
    "RepoEnum": {
      "type": "string",
      "enum": ["https://gitlab.eurecom.fr/oai/openairinterface5g/-/tree/develop/openair2/F1AP?ref_type=heads"]
    },
    "InterfaceEnum": {
      "type": "string",
      "enum": ["F1AP","E1AP","NGAP","XnAP","RRC","PDCP","RLC","MAC","GTP-U","SCTP"]
    },
    "VerdictEnum": {
      "type": "string",
      "enum": ["spec-nonconformance", "implementation-bug", "ambiguous", "needs-more-data"]
    },
    "ComponentEnum": {
      "type": "string",
      "enum": ["gNB-CU", "gNB-DU", "UE", "AMF", "UPF"]
    }
  },
```
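When step outputs are post-processed in Python, the enumerations above can be mirrored as `Enum` classes so that an invalid value fails early. This is an illustrative sketch: the class and member names are assumptions, and `RepoEnum` is omitted because it holds a single URL.

```python
from enum import Enum


class Interface(str, Enum):
    F1AP = "F1AP"
    E1AP = "E1AP"
    NGAP = "NGAP"
    XNAP = "XnAP"
    RRC = "RRC"
    PDCP = "PDCP"
    RLC = "RLC"
    MAC = "MAC"
    GTP_U = "GTP-U"
    SCTP = "SCTP"


class Verdict(str, Enum):
    SPEC_NONCONFORMANCE = "spec-nonconformance"
    IMPLEMENTATION_BUG = "implementation-bug"
    AMBIGUOUS = "ambiguous"
    NEEDS_MORE_DATA = "needs-more-data"


class Component(str, Enum):
    GNB_CU = "gNB-CU"
    GNB_DU = "gNB-DU"
    UE = "UE"
    AMF = "AMF"
    UPF = "UPF"


def parse_verdict(value: str) -> Verdict:
    """Look up a verdict by its JSON value; raises ValueError if it is not in VerdictEnum."""
    return Verdict(value)
```

Because the classes also inherit from `str`, members compare equal to their JSON string values, which keeps them usable with `json.dumps`.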

The workflow consists of sequential steps, each broken down into:
1. **Requirements:** Definition of the step's goal, inputs, and expected outputs based on `QAprompt.json`.
2. **Results:** Placeholders for the execution results.
3. **Check:** Verification criteria to ensure the step was completed successfully.

## Step 1: Bug Ingest

### 1.1 Requirements

**Goal:** Normalize a freeform bug report into a structured bug card with interface/procedure guesses and key identifiers.

**Input:**
- `bug_text`: Bug: CU registers UE and relies on RRC Setup Success when receiving initialRRCTransfer with mutated DU ID and Transaction ID before F1 Setup
Repro Steps:
1) Attacker sends an incorrect UE Context Setup Request to the gNB-CU.
2) gNB-CU rejects the request as expected.
3) Attacker immediately sends a UE Context Setup Success.
4) In parallel, attacker sends an initialRRCTransfer with mutated DU ID and Transaction ID.
5) Attacker follows with an RRC Setup Success message.

**Role & Instructions:**
Act as **BugCardBuilder**. Extract and normalize from `bug_text`: likely interface(s), procedure(s), component roles, and key IDs (transaction IDs, CU/DU UE F1AP IDs, DU ID, RNTI, PCI, served-cell list). Summarize observed vs expected (if implied).

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["bug_card"],
        "properties": {
          "bug_card": {
            "type": "object",
            "required": ["observed_behavior", "interface_guess", "procedure_guess", "components_involved"],
            "properties": {
              "observed_behavior": { "type": "string" },
              "interface_guess": { "type": "array", "items": {"$ref":"#/$defs/InterfaceEnum" } },
              "procedure_guess": { "type": "array", "items": { "type": "string" } },
              "components_involved": { "type": "array", "items": {"$ref":"#/$defs/ComponentEnum" },"minItems":1 },
              "key_ids": {
                "type": "object",
                "properties": {
                  "transaction_id": { "type": "string" },
                  "cu_ue_f1ap_id": { "type": "string" },
                  "du_ue_f1ap_id": { "type": "string" },
                  "du_id": { "type": "string" },
                  "rnti": { "type": "string" },
                  "pci": { "type": "string" },
                  "served_cell_list": { "type": "array", "items": { "type": "string" } }
                }
              },
              "signals_or_timers": { "type": "array", "items": { "type": "string" } }
            }
          }
        }
      }
```

### 1.2 Results

```json
{
  "bug_card": {
    "observed_behavior": "CU registers UE and processes RRC Setup Success despite receiving initialRRCTransfer with mutated DU ID and Transaction ID after a rejected UE Context Setup Request and a fake UE Context Setup Success.",
    "interface_guess": ["F1AP", "RRC"],
    "procedure_guess": ["UE Context Setup", "Initial UL RRC Message Transfer", "RRC Setup"],
    "components_involved": ["gNB-CU", "gNB-DU", "UE"],
    "key_ids": {
      "transaction_id": "mutated",
      "du_id": "mutated",
      "cu_ue_f1ap_id": "implied",
      "du_ue_f1ap_id": "implied",
      "rnti": "implied"
    },
    "signals_or_timers": ["UE Context Setup Request", "UE Context Setup Response", "Initial UL RRC Message Transfer", "RRC Setup Complete"]
  }
}
```

### 1.3 Check

**Verification:** Check that the output matches the JSON schema, specifically that `interface_guess` and `procedure_guess` are not empty, and no freeform prose is included.
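The check above can be approximated without a full JSON Schema validator. The sketch below is an assumption about how one might automate it: `check_bug_card` is a hypothetical helper that parses the raw model output (so surrounding prose is rejected) and then tests the required fields and enum values from the schema.

```python
import json
from typing import List

INTERFACES = {"F1AP", "E1AP", "NGAP", "XnAP", "RRC", "PDCP", "RLC", "MAC", "GTP-U", "SCTP"}
COMPONENTS = {"gNB-CU", "gNB-DU", "UE", "AMF", "UPF"}
REQUIRED = ("observed_behavior", "interface_guess", "procedure_guess", "components_involved")


def check_bug_card(raw: str) -> List[str]:
    """Return a list of problems; an empty list means the Step 1 check passes."""
    try:
        doc = json.loads(raw)  # fails if prose surrounds the JSON
    except json.JSONDecodeError as exc:
        return [f"output is not pure JSON: {exc.msg}"]
    card = doc.get("bug_card") if isinstance(doc, dict) else None
    if not isinstance(card, dict):
        return ["top-level 'bug_card' object is missing"]
    errors = [f"missing required field: {k}" for k in REQUIRED if k not in card]
    for key in ("interface_guess", "procedure_guess"):
        if key in card and not card[key]:
            errors.append(f"{key} must not be empty")
    for value in card.get("interface_guess", []):
        if value not in INTERFACES:
            errors.append(f"unknown interface: {value}")
    for value in card.get("components_involved", []):
        if value not in COMPONENTS:
            errors.append(f"unknown component: {value}")
    return errors
```

This covers only the parts of the schema named in the check; a complete validation would still need a schema validator run against `QAprompt.json`.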

## Step 2: Spec Section Fetcher

### 2.1 Requirements

**Goal:** Propose spec sections (procedures, timers, message formats) that likely govern the bug.

**Input:**
- `bug_card`: Structured description from Step 1.
- `spec_toc`: Table of contents with ids/titles/anchors.

**Role & Instructions:**
Act as **SpecIndexer**. Using `bug_card` and the attached spec slices, identify the 3–6 most relevant spec sections. Produce keywords and titles for each, and summarize the `expected_behaviour`.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["candidate_spec_sections","expected_behaviour"],
        "properties": {
          "candidate_spec_sections": {
            "type": "array",
            "minItems": 3,
            "maxItems": 6,
            "items": {
              "type": "object",
              "required": ["Keywords", "title"],
              "properties": { "Keywords": { "type": "array", "items": { "type": "string" }}, "title": { "type": "array", "items": { "type": "string" }} }
            }
          },
          "expected_behaviour": {"type": "string"}
        }
      }
```

### 2.2 Results

```json
{
  "candidate_spec_sections": [
    {
      "Keywords": ["Initial UL RRC Message Transfer", "gNB-DU UE F1AP ID", "Transaction ID"],
      "title": ["TS 38.473 8.4.1 Initial UL RRC Message Transfer", "TS 38.473 9.3.1.5 gNB-DU UE F1AP ID", "TS 38.473 9.3.1.23 Transaction ID"]
    },
    {
      "Keywords": ["UE Context Setup"],
      "title": ["TS 38.473 8.3.1 UE Context Setup"]
    },
    {
      "Keywords": ["RRC Setup"],
      "title": ["TS 38.331 6.2.2 RRC Setup"]
    }
  ],
  "expected_behaviour": "The gNB-CU should reject UE Context Setup Response if no corresponding Request is pending. It should also ensure consistency of gNB-DU UE F1AP ID during the setup phase. Initial UL RRC Message Transfer should initiate the logical F1 connection, and subsequent messages should respect the established IDs."
}
```

### 2.3 Check

**Verification:** Check that 3-6 relevant sections are identified and `expected_behaviour` is summarized.
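A possible way to automate this check is sketched below. `check_spec_sections` is a hypothetical helper; it tests the section count and the non-empty fields, but it cannot judge relevance, which still needs a reviewer or a `spec_ingestor.py` lookup.

```python
from typing import List


def check_spec_sections(doc: dict) -> List[str]:
    """Return a list of problems; an empty list means the Step 2 check passes."""
    errors = []
    sections = doc.get("candidate_spec_sections", [])
    if not 3 <= len(sections) <= 6:
        errors.append(f"expected 3-6 sections, got {len(sections)}")
    for i, section in enumerate(sections):
        for key in ("Keywords", "title"):
            if not section.get(key):
                errors.append(f"section {i}: {key} missing or empty")
    if not str(doc.get("expected_behaviour", "")).strip():
        errors.append("expected_behaviour missing or empty")
    return errors
```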

## Step 3: Code Fetcher (metadata-driven)

### 3.1 Requirements

**Goal:** Select likely source files/functions using metadata (paths, interface names, keywords, ASN.1 types, timers).

**Input:**
- `bug_card`: From Step 1.
- `candidate_spec_sections`: From Step 2.
- `repo_metadata`: Path, interface names, and known keywords.

**Role & Instructions:**
Act as **CodeLocator**. Build a prioritized list of files/functions to fetch using interface names, procedures/timers, and keywords. Plan a follow-up (recursive) fetch if the initial fetch seems insufficient.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["candidate_code"],
        "properties": {
          "candidate_code": {
            "type": "array",
            "items": {
              "type": "object",
              "required": ["path", "function_name", "reason"],
              "properties": { "path": { "type": "string" }, "function_name": { "type": "string" }, "reason": { "type": "string" } }
            }
          }
        }
      }
```

### 3.2 Results

```json
{
  "candidate_code": [
    {
      "path": "openair2/F1AP/f1ap_cu_rrc_message_transfer.c",
      "function_name": "CU_handle_INITIAL_UL_RRC_MESSAGE_TRANSFER",
      "reason": "Handles the Initial UL RRC Message Transfer where the mutated DU ID is received."
    },
    {
      "path": "openair2/F1AP/f1ap_cu_ue_context_management.c",
      "function_name": "CU_handle_UE_CONTEXT_SETUP_RESPONSE",
      "reason": "Handles the fake UE Context Setup Success sent by the attacker."
    },
    {
      "path": "openair2/RRC/NR/rrc_gNB.c",
      "function_name": "rrc_gNB_process_initial_ul_rrc_message",
      "reason": "Processes the decoded CCCH message from Initial UL RRC Transfer."
    }
  ]
}
```

### 3.3 Check

**Verification:** Check that `candidate_code` is not empty and includes paths and function names.
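Since the agent has the codebase in the current folder, this check can go one step further and confirm that each path exists and mentions the named function. The sketch below is an assumption about how to do that: `check_candidate_code` is a hypothetical helper, and the plain substring search is a crude stand-in for a real symbol lookup.

```python
from pathlib import Path
from typing import List


def check_candidate_code(candidates: list, repo_root: str) -> List[str]:
    """Return a list of problems; an empty list means every entry was found."""
    if not candidates:
        return ["candidate_code is empty"]
    problems = []
    root = Path(repo_root)
    for i, entry in enumerate(candidates):
        rel_path = entry.get("path")
        func = entry.get("function_name")
        if not rel_path or not func:
            problems.append(f"entry {i}: path or function_name missing")
            continue
        path = root / rel_path
        if not path.is_file():
            problems.append(f"{rel_path}: file not found")
            continue
        # Substring match only; it does not distinguish a definition from a call.
        source = path.read_text(encoding="utf-8", errors="replace")
        if func not in source:
            problems.append(f"{rel_path}: {func} not found")
    return problems
```

A failed lookup is a hint that the path or function name was guessed rather than read from the repository, and is a natural trigger for the follow-up fetch mentioned in 3.1.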

## Step 4: Lightweight Retriever

### 4.1 Requirements

**Goal:** Retrieve only high-signal snippets from the shortlisted specs and code.

**Input:**
- `candidate_spec_sections`: From Step 2.
- `candidate_code`: From Step 3.

**Role & Instructions:**
Act as **SnippetRetriever**. For each query, return top snippets with exact locations (spec id/anchor, file:line-range). Keep each snippet ≤ 120 words.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["snippets"],
        "properties": {
          "snippets": {
            "type": "array",
            "items": {
              "type": "object",
              "required": ["source", "location", "text", "kind"],
              "properties": {
                "kind": { "type": "string", "enum": ["spec", "code"] },
                "source": { "type": "string" },
                "location": { "type": "string" },
                "text": { "type": "string" }
              }
            }
          }
        }
      }
```

### 4.2 Results

```json
{
  "snippets": [
    {
      "kind": "code",
      "source": "openair2/F1AP/f1ap_cu_rrc_message_transfer.c",
      "location": "48-64",
      "text": "int CU_handle_INITIAL_UL_RRC_MESSAGE_TRANSFER(...) { ... decode_initial_ul_rrc_message_transfer(pdu, &msg); ... itti_send_msg_to_task(TASK_RRC_GNB, instance, message_p); ... }"
    },
    {
      "kind": "code",
      "source": "openair2/F1AP/f1ap_cu_ue_context_management.c",
      "location": "70-84",
      "text": "int CU_handle_UE_CONTEXT_SETUP_RESPONSE(...) { ... decode_ue_context_setup_resp(pdu, &resp); ... itti_send_msg_to_task(TASK_RRC_GNB, instance, msg_p); ... }"
    },
    {
      "kind": "spec",
      "source": "TS 38.473 8.4.1.2",
      "location": "8.4.1.2",
      "text": "The gNB-DU initiates the procedure by sending an INITIAL UL RRC MESSAGE TRANSFER. The establishment of the UE-associated logical F1-connection shall be initiated as part of the procedure."
    }
  ]
}
```

### 4.3 Check

**Verification:** Check that snippets are retrieved from both spec and code, with exact locations.
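The snippet constraints (both kinds present, at most 120 words each, a line range for code) lend themselves to a mechanical check. The sketch below assumes the layout used in 4.2, where a code snippet's `location` is a bare `start-end` line range; `check_snippets` is a hypothetical helper.

```python
import re
from typing import List


def check_snippets(snippets: list, max_words: int = 120) -> List[str]:
    """Return a list of problems; an empty list means the Step 4 check passes."""
    errors = []
    kinds = {s.get("kind") for s in snippets}
    for kind in ("spec", "code"):
        if kind not in kinds:
            errors.append(f"no {kind} snippet retrieved")
    for i, snippet in enumerate(snippets):
        words = len(snippet.get("text", "").split())
        if words > max_words:
            errors.append(f"snippet {i}: {words} words exceeds {max_words}")
        # Code locations are expected as 'start-end', e.g. '48-64'.
        if snippet.get("kind") == "code" and not re.fullmatch(r"\d+-\d+", snippet.get("location", "")):
            errors.append(f"snippet {i}: code location is not a line range")
    return errors
```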

## Step 5: Event/State Sketcher (draft)

### 5.1 Requirements

**Goal:** Draft the expected message sequence and a 5–8 state machine sketch from the snippets.

**Input:**
- `bug_card`: From Step 1.
- `snippets`: From Step 4.

**Role & Instructions:**
Act as **SketchMaker**. Build (1) an expected message sequence as ordered events and (2) a compact state machine (5–8 states, transitions with guards). Note assumptions and open questions.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["sequence_diagram", "state_machine", "assumptions", "questions", "review_request"],
        "properties": {
          "sequence_diagram": {
            "type": "array",
            "items": { "type": "object", "required": ["from", "to", "message"], "properties": { "from": { "type": "string" }, "to": { "type": "string" }, "message": { "type": "string" }, "precond": { "type": "string" }, "postcond": { "type": "string" } } }
          },
          "state_machine": {
            "type": "object",
            "required": ["states", "transitions"],
            "properties": {
              "states": { "type": "array", "items": { "type": "string" }, "minItems": 5, "maxItems": 8 },
              "transitions": {
                "type": "array",
                "items": { "type": "object", "required": ["from", "to", "on"], "properties": { "from": { "type": "string" }, "to": { "type": "string" }, "on": { "type": "string" }, "guard": { "type": "string" } } }
              }
            }
          },
          "assumptions": { "type": "array", "items": { "type": "string" } },
          "questions": { "type": "array", "items": { "type": "string" } },
          "review_request": { "type": "string", "enum": ["HumanInLoop_B_required"] }
        }
      }
```

### 5.2 Results

```json
{
  "sequence_diagram": [
    {"from": "Attacker", "to": "gNB-CU", "message": "UE Context Setup Request (Invalid)", "precond": "None", "postcond": "Rejected"},
    {"from": "gNB-CU", "to": "Attacker", "message": "UE Context Setup Failure/Reject", "precond": "Received Invalid Request", "postcond": "None"},
    {"from": "Attacker", "to": "gNB-CU", "message": "UE Context Setup Response (Fake)", "precond": "None", "postcond": "CU Context Active"},
    {"from": "Attacker", "to": "gNB-CU", "message": "Initial UL RRC Message Transfer (Mutated IDs)", "precond": "Parallel", "postcond": "RRC Setup Request Processed"},
    {"from": "Attacker", "to": "gNB-CU", "message": "RRC Setup Complete", "precond": "RRC Setup Received", "postcond": "Connected"}
  ],
  "state_machine": {
    "states": ["IDLE", "WAIT_RRC_SETUP", "CTX_ESTABLISHED", "CONNECTED", "ERROR"],
    "transitions": [
      {"from": "IDLE", "to": "ERROR", "on": "UE Context Setup Request (from DU)", "guard": "Invalid Direction"},
      {"from": "IDLE", "to": "CTX_ESTABLISHED", "on": "UE Context Setup Response", "guard": "No Pending Request (Bug)"},
      {"from": "CTX_ESTABLISHED", "to": "WAIT_RRC_SETUP", "on": "Initial UL RRC Message Transfer", "guard": "Mutated IDs accepted"},
      {"from": "WAIT_RRC_SETUP", "to": "CONNECTED", "on": "RRC Setup Complete", "guard": "None"}
    ]
  },
  "assumptions": ["Attacker can mimic F1AP messages.", "CU does not validate Transaction ID for UE Context Setup Response against a pending Request."],
  "questions": ["Does CU validate Transaction ID for UE Context Setup Response?", "Does Initial UL RRC Transfer overwrite existing F1AP IDs in CU?"],
  "review_request": "HumanInLoop_B_required"
}
```

### 5.3 Check

**Verification:** Check for a sequence diagram, a state machine with 5-8 states, and any assumptions.
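Beyond counting states, it is worth confirming that every transition refers to a declared state. The sketch below shows one way to do both; `check_state_machine` is a hypothetical helper that operates on the `state_machine` object from the schema above.

```python
from typing import List


def check_state_machine(state_machine: dict) -> List[str]:
    """Return a list of problems; an empty list means the sketch is well-formed."""
    errors = []
    states = state_machine.get("states", [])
    if not 5 <= len(states) <= 8:
        errors.append(f"expected 5-8 states, got {len(states)}")
    known = set(states)
    for i, transition in enumerate(state_machine.get("transitions", [])):
        for end in ("from", "to"):
            if transition.get(end) not in known:
                errors.append(f"transition {i}: unknown '{end}' state {transition.get(end)!r}")
    return errors
```

Applied to the state machine in 5.2, this returns an empty list: there are five states and all four transitions use declared state names.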

## Step 6: Conformance Checker (simple invariants)

### 6.1 Requirements

**Goal:** Check core invariants to propose a preliminary verdict.

**Input:**
- `bug_card`: From Step 1.
- `sequence_diagram`: From Step 5.
- `state_machine`: From Step 5.
- `snippets`: From Step 4.

**Role & Instructions:**
Act as **ConformanceChecker**. Using snippets as authority, test invariants: directionality, req/resp pairing, ID binding, transaction scoping, timer preconditions. For each invariant, report PASS, FAIL, or UNKNOWN with a citation. Propose a preliminary verdict.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["invariant_checks", "preliminary_verdict", "rationale"],
        "properties": {
          "invariant_checks": {
            "type": "array",
            "items": { "type": "object", "required": ["name", "result", "evidence"], "properties": { "name": { "type": "string" }, "result": { "type": "string", "enum": ["PASS", "FAIL", "UNKNOWN"] }, "evidence": { "type": "array", "items": { "type": "string" } } } }
          },
          "preliminary_verdict": { "type": "string", "enum": ["spec-nonconformance", "implementation-bug", "ambiguous", "needs-more-data"] },
          "rationale": { "type": "string" }
        }
      }
```

### 6.2 Results

```json
{
  "invariant_checks": [
    {
      "name": "Transaction ID Matching",
      "result": "FAIL",
      "evidence": ["CU accepts UE Context Setup Response without pending Request (inferred from bug report)."]
    },
    {
      "name": "Directionality Check",
      "result": "PASS",
      "evidence": ["CU rejects UE Context Setup Request from Attacker (DU) as expected."]
    },
    {
      "name": "ID Consistency",
      "result": "FAIL",
      "evidence": ["CU accepts Initial UL RRC Transfer with mutated DU ID after Context Setup."]
    }
  ],
  "preliminary_verdict": "implementation-bug",
  "rationale": "The CU implementation fails to enforce strict state checks for F1AP procedures. It accepts a UE Context Setup Response without a preceding valid Request transaction, and subsequently allows an Initial UL RRC Message Transfer with inconsistent IDs to proceed/corrupt the state."
}
```

### 6.3 Check

**Verification:** Check that invariants are tested and a preliminary verdict is proposed.
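The request/response pairing and directionality invariants can be illustrated on an ordered event trace. The sketch below is a simplified model, not OAI code: the event fields, the `ue_id` pairing key, and `unsolicited_responses` are assumptions, and which identifier the pairing should actually key on must be verified against TS 38.473 with `spec_ingestor.py`.

```python
from typing import List

REQUEST = "UE Context Setup Request"
RESPONSES = ("UE Context Setup Response", "UE Context Setup Failure")


def unsolicited_responses(events: list) -> List[int]:
    """Return indices of responses that have no pending request from the gNB-CU.

    Each event is a dict with 'sender', 'message', and 'ue_id' (assumed pairing key).
    """
    pending = set()
    violations = []
    for index, event in enumerate(events):
        key = event["ue_id"]
        if event["message"] == REQUEST and event["sender"] == "gNB-CU":
            pending.add(key)  # only a CU-originated request opens a transaction
        elif event["message"] in RESPONSES and event["sender"] == "gNB-DU":
            if key in pending:
                pending.discard(key)  # paired: the transaction is closed
            else:
                violations.append(index)  # nothing pending: unsolicited
    return violations
```

In the reported trace, the request arrives from the DU side, so it never opens a pending entry, and the following response is flagged as unsolicited. That matches the FAIL recorded for "Transaction ID Matching" in 6.2.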

## Step 7: Evidence Linker

### 7.1 Requirements

**Goal:** Bundle a compact evidence pack linking spec clauses and code lines with an expected-vs-observed chain.

**Input:**
- `bug_card`: From Step 1.
- `invariant_checks`: From Step 6.
- `snippets`: From Step 4.

**Role & Instructions:**
Act as **EvidenceBundler**. Assemble a minimal yet complete pack: spec refs (id/anchor), code refs (file:lines), and a single paragraph contrasting expected vs observed.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["evidence_pack"],
        "properties": {
          "evidence_pack": {
            "type": "object",
            "required": ["spec_citations", "code_citations", "expected_vs_observed"],
            "properties": {
              "spec_citations": { "type": "array", "items": { "type": "string" } },
              "code_citations": { "type": "array", "items": { "type": "string" } },
              "expected_vs_observed": { "type": "string" }
            }
          }
        }
      }
```

### 7.2 Results

```json
{
  "evidence_pack": {
    "spec_citations": ["TS 38.473 8.4.1.2", "TS 38.473 9.3.1.23"],
    "code_citations": ["openair2/F1AP/f1ap_cu_ue_context_management.c:70-84", "openair2/F1AP/f1ap_cu_rrc_message_transfer.c:48-64"],
    "expected_vs_observed": "Expected: CU should drop unsolicited UE Context Setup Response and reject Initial UL RRC Message Transfer if IDs don't match established context. Observed: CU processes both, leading to invalid state and successful registration."
  }
}
```

### 7.3 Check

**Verification:** Check that an evidence pack with spec and code citations and a comparison paragraph is created.
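A stricter variant of this check confirms that every citation in the evidence pack is backed by a snippet from Step 4. The sketch below assumes the conventions used in 4.2 and 7.2 (code citations as `source:location`, spec citations equal to the snippet `source`); `unbacked_citations` is a hypothetical helper.

```python
from typing import List


def unbacked_citations(evidence_pack: dict, snippets: list) -> List[str]:
    """Return citations in the evidence pack that no Step 4 snippet supports."""
    backed = set()
    for snippet in snippets:
        if snippet["kind"] == "code":
            backed.add(f"{snippet['source']}:{snippet['location']}")
        else:
            backed.add(snippet["source"])
    cited = evidence_pack["spec_citations"] + evidence_pack["code_citations"]
    return [citation for citation in cited if citation not in backed]
```

Run against the results in this notebook, it returns `TS 38.473 9.3.1.23`, which is cited in 7.2 but has no matching snippet in 4.2. Such a citation should be confirmed with `spec_ingestor.py` before sign-off.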

## Step 8: Report Generator

### 8.1 Requirements

**Goal:** Emit a one-page final report and request human sign-off on verdict.

**Input:**
- `bug_card`: From Step 1.
- `invariant_checks`: From Step 6.
- `evidence_pack`: From Step 7.

**Role & Instructions:**
Act as **ReportWriter**. Produce the final JSON report with {verdict, rationale, citations, suggested_repro, risks, next_steps}. Be concise and actionable.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["report", "review_request"],
        "properties": {
          "report": {
            "type": "object",
            "required": ["verdict", "rationale", "citations", "suggested_repro", "next_steps"],
            "properties": {
              "verdict": { "type": "string", "$ref": "#/$defs/VerdictEnum" },
              "rationale": { "type": "string" },
              "citations": {
                "type": "object",
                "properties": {
                  "spec": { "type": "array", "items": { "type": "string" } },
                  "code": { "type": "array", "items": { "type": "string" } }
                }
              },
              "suggested_repro": { "type": "array", "items": { "type": "string" } },
              "risks": { "type": "array", "items": { "type": "string" } },
              "next_steps": { "type": "array", "items": { "type": "string" } }
            }
          },
          "review_request": { "type": "string", "enum": ["HumanInLoop_C_required"] }
        }
      }
```

### 8.2 Results

```json
{
  "report": {
    "verdict": "implementation-bug",
    "rationale": "The CU incorrectly processes unsolicited UE Context Setup Response messages and fails to validate consistency of F1AP IDs across parallel/out-of-order messages, allowing an attacker to bypass proper setup flows.",
    "citations": {
      "spec": ["TS 38.473 8.4.1", "TS 38.473 9.3.1.23"],
      "code": ["openair2/F1AP/f1ap_cu_ue_context_management.c", "openair2/F1AP/f1ap_cu_rrc_message_transfer.c"]
    },
    "suggested_repro": [
      "Send UE Context Setup Request (to be rejected).",
      "Send UE Context Setup Response (fake success).",
      "Send Initial UL RRC Message Transfer with different DU ID/Transaction ID.",
      "Send RRC Setup Complete."
    ],
    "risks": ["Security bypass", "State corruption", "Denial of Service"],
    "next_steps": [
      "Implement Transaction ID validation for UE Context Setup Response.",
      "Ensure Initial UL RRC Message Transfer checks against existing Context/IDs."
    ]
  },
  "review_request": "HumanInLoop_C_required"
}
```

### 8.3 Check

**Verification:** Check that the final report contains a verdict, rationale, citations, and next steps.
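One way to make this check pass by construction is to assemble the report from the earlier step outputs instead of writing it by hand. The following is a minimal sketch under that assumption; `build_report` and its parameters are hypothetical, and the rationale, repro steps, and next steps still come from the analyst or the model.

```python
from typing import List, Optional

VERDICTS = {"spec-nonconformance", "implementation-bug", "ambiguous", "needs-more-data"}


def build_report(verdict: str, rationale: str, evidence_pack: dict,
                 suggested_repro: List[str], next_steps: List[str],
                 risks: Optional[List[str]] = None) -> dict:
    """Assemble the Step 8 output; citations are copied from the Step 7 evidence pack."""
    if verdict not in VERDICTS:
        raise ValueError(f"verdict {verdict!r} is not in VerdictEnum")
    report = {
        "verdict": verdict,
        "rationale": rationale,
        "citations": {
            "spec": list(evidence_pack["spec_citations"]),
            "code": list(evidence_pack["code_citations"]),
        },
        "suggested_repro": list(suggested_repro),
        "next_steps": list(next_steps),
    }
    if risks:  # 'risks' is optional in the schema
        report["risks"] = list(risks)
    # The schema fixes this value: a human must sign off on the verdict.
    return {"report": report, "review_request": "HumanInLoop_C_required"}
```

Copying citations from the evidence pack keeps the report and Step 7 in agreement, including the line ranges on code citations.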