# OAI RAN Bug Root-Cause Analysis Workflow

This notebook implements a structured workflow for investigating bugs in the OpenAirInterface5G (OAI) RAN project. It follows the process defined in `QAprompt.json`, guiding the user from raw bug reports to root-cause analysis by cross-referencing specifications and code.

### Agent Instruction

If you are an AI coding agent, you have access to the entire codebase in the current folder. You also have access to `spec_ingestor.py`, which helps you cite the latest 3GPP standard correctly. The tool does not use AI; treat it as a grep tool with structured output. If you make any claim about a 3GPP standard, you *MUST* verify it with this tool. Invoke it strictly according to these patterns: `C:/Users/woaiw/cmd/envs/CS211/python.exe p:/openairinterface5g/spec_ingestor.py F1AP.docx 'keyword'` for F1AP and `C:/Users/woaiw/cmd/envs/CS211/python.exe p:/openairinterface5g/spec_ingestor.py NGAP.docx 'keyword'` for NGAP.
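
The invocation pattern above can be wrapped in a small helper so that the interpreter path, script path, and spec file name are not retyped for every lookup. This is only a convenience sketch: `build_spec_query` and `run_spec_query` are hypothetical names that do not exist in the repository, and because the tool's output format is not described here, the output is returned as raw text.

```python
import subprocess

# Paths copied from the instruction above; adjust them for your environment.
PYTHON_EXE = "C:/Users/woaiw/cmd/envs/CS211/python.exe"
INGESTOR = "p:/openairinterface5g/spec_ingestor.py"
SPEC_FILES = {"F1AP": "F1AP.docx", "NGAP": "NGAP.docx"}


def build_spec_query(interface: str, keyword: str) -> list[str]:
    """Return the argv list for one spec_ingestor.py lookup (hypothetical helper)."""
    if interface not in SPEC_FILES:
        raise ValueError(f"no spec document configured for {interface!r}")
    if not keyword.strip():
        raise ValueError("keyword must not be empty")
    return [PYTHON_EXE, INGESTOR, SPEC_FILES[interface], keyword]


def run_spec_query(interface: str, keyword: str) -> str:
    """Run the lookup and return stdout as text; raises if the tool exits non-zero."""
    result = subprocess.run(
        build_spec_query(interface, keyword),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Passing the command as a list avoids shell quoting problems when a keyword contains spaces, for example `run_spec_query("F1AP", "F1 SETUP REQUEST")`.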

### Global Definitions
The workflow adheres to the following definitions and enumerations:

```json
  "$defs": {
    "RepoEnum": {
      "type": "string",
      "enum": ["https://gitlab.eurecom.fr/oai/openairinterface5g/-/tree/develop/openair2/F1AP?ref_type=heads"]
    },
    "InterfaceEnum": {
      "type": "string",
      "enum": ["F1AP","E1AP","NGAP","XnAP","RRC","PDCP","RLC","MAC","GTP-U","SCTP"]
    },
    "VerdictEnum": {
      "type": "string",
      "enum": ["spec-nonconformance", "implementation-bug", "ambiguous", "needs-more-data"]
    },
    "ComponentEnum": {
      "type": "string",
      "enum": ["gNB-CU", "gNB-DU", "UE", "AMF", "UPF"]
    }
  },
```
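
When step outputs are checked by hand or by script, it can help to mirror these enumerations in code. The following is a minimal sketch under that assumption; the constant names and the `invalid_values` helper are illustrative and are not defined in `QAprompt.json`.

```python
# Values copied from the "$defs" block above.
INTERFACES = frozenset(
    {"F1AP", "E1AP", "NGAP", "XnAP", "RRC", "PDCP", "RLC", "MAC", "GTP-U", "SCTP"}
)
VERDICTS = frozenset(
    {"spec-nonconformance", "implementation-bug", "ambiguous", "needs-more-data"}
)
COMPONENTS = frozenset({"gNB-CU", "gNB-DU", "UE", "AMF", "UPF"})


def invalid_values(values, allowed):
    """Return the entries of `values` that are not in `allowed`, preserving order."""
    return [v for v in values if v not in allowed]


print(invalid_values(["F1AP", "RRC"], INTERFACES))    # []
print(invalid_values(["gNB-CU", "eNB"], COMPONENTS))  # ['eNB']
```

The comparison is case-sensitive, so `"f1ap"` would be reported as invalid, which matches how a JSON Schema `enum` behaves.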

The workflow consists of sequential steps, each broken down into:
1. **Requirements:** Definition of the step's goal, inputs, and expected outputs based on `QAprompt.json`.
2. **Results:** Placeholders for the execution results.
3. **Check:** Verification criteria to ensure the step was completed successfully.

## Step 1: Bug Ingest

### 1.1 Requirements

**Goal:** Normalize a freeform bug report into a structured bug card with interface/procedure guesses and key identifiers.

**Input:**
- `bug_text`:

  Bug: gNB-CU hangs when receiving F1AP Setup Request with mutated DU-Served-Cell-List.

  Repro Steps:

  1) The attacker intercepts a legitimate UE Context Modification request.
  2) Modify the RNTI in the IE to an incorrect value and forward it to the gNB-CU.
  3) For gNB-DUs, the attacker waits for UE release and immediately sends a DLRRCMessage Transfer.

**Role & Instructions:**
Act as **BugCardBuilder**. Extract and normalize from `bug_text`: likely interface(s), procedure(s), component roles, and key IDs (transaction IDs, CU/DU UE F1AP IDs, DU ID, RNTI, PCI, served-cell list). Summarize observed vs expected (if implied).

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["bug_card"],
        "properties": {
          "bug_card": {
            "type": "object",
            "required": ["observed_behavior", "interface_guess", "procedure_guess", "components_involved"],
            "properties": {
              "observed_behavior": { "type": "string" },
              "interface_guess": { "type": "array", "items": {"$ref":"#/$defs/InterfaceEnum" } },
              "procedure_guess": { "type": "array", "items": { "type": "string" } },
              "components_involved": { "type": "array", "items": {"$ref":"#/$defs/ComponentEnum" },"minItems":1 },
              "key_ids": {
                "type": "object",
                "properties": {
                  "transaction_id": { "type": "string" },
                  "cu_ue_f1ap_id": { "type": "string" },
                  "du_ue_f1ap_id": { "type": "string" },
                  "du_id": { "type": "string" },
                  "rnti": { "type": "string" },
                  "pci": { "type": "string" },
                  "served_cell_list": { "type": "array", "items": { "type": "string" } }
                }
              },
              "signals_or_timers": { "type": "array", "items": { "type": "string" } }
            }
          }
        }
      }
```

### 1.2 Results

{
  "bug_card": {
    "observed_behavior": "gNB-CU hangs when receiving F1AP Setup Request with mutated DU-Served-Cell-List, possibly triggered by a sequence involving UE Context Modification and DLRRCMessage Transfer.",
    "interface_guess": ["F1AP", "RRC"],
    "procedure_guess": ["F1 Setup", "UE Context Modification", "DL RRC Message Transfer"],
    "components_involved": ["gNB-CU", "gNB-DU"],
    "key_ids": {
      "rnti": "modified",
      "served_cell_list": ["mutated"]
    },
    "signals_or_timers": ["F1AP_SETUP_REQUEST", "UE_CONTEXT_MODIFICATION_REQUEST", "DL_RRC_MESSAGE_TRANSFER"]
  }
}

### 1.3 Check

**Verification:** Check that the output matches the JSON schema, specifically that `interface_guess` and `procedure_guess` are not empty, and no freeform prose is included.
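
The verification above could be automated roughly as follows. This is a sketch using only the standard library, not a full JSON Schema validator, and `check_bug_card` is a hypothetical helper name. Parsing the raw text with `json.loads` doubles as the "no freeform prose" test, since any text around the JSON object makes parsing fail.

```python
import json

REQUIRED = ("observed_behavior", "interface_guess", "procedure_guess", "components_involved")
NON_EMPTY_LISTS = ("interface_guess", "procedure_guess", "components_involved")


def check_bug_card(raw: str) -> list[str]:
    """Return a list of problems; an empty list means the Step 1 check passes."""
    try:
        doc = json.loads(raw)  # raises if any prose surrounds the JSON
    except json.JSONDecodeError as exc:
        return [f"output is not pure JSON: {exc.msg}"]
    card = doc.get("bug_card") if isinstance(doc, dict) else None
    if not isinstance(card, dict):
        return ["top-level 'bug_card' object is missing"]
    problems = [f"missing required field: {key}" for key in REQUIRED if key not in card]
    for key in NON_EMPTY_LISTS:
        if key in card and not (isinstance(card[key], list) and card[key]):
            problems.append(f"{key} must be a non-empty list")
    return problems
```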

## Step 2: Spec Section Fetcher

### 2.1 Requirements

**Goal:** Propose spec sections (procedures, timers, message formats) that likely govern the bug.

**Input:**
- `bug_card`: Structured description from Step 1.
- `spec_toc`: Table of contents with ids/titles/anchors.

**Role & Instructions:**
Act as **SpecIndexer**. Using `bug_card` and the attached spec slices, identify the 3–6 most relevant spec sections. Produce keywords and titles for each, and summarize the `expected_behaviour`.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["candidate_spec_sections","expected_behaviour"],
        "properties": {
          "candidate_spec_sections": {
            "type": "array",
            "minItems": 3,
            "maxItems": 6,
            "items": {
              "type": "object",
              "required": ["Keywords", "title"],
              "properties": { "Keywords": { "type": "array", "items": { "type": "string" }}, "title": { "type": "array", "items": { "type": "string" }} }
            }
          },
          "expected_behaviour": {"type": "string"}
        }
      }
```

### 2.2 Results

{
  "candidate_spec_sections": [
    {
      "Keywords": ["F1 Setup Request", "DU Served Cell List"],
      "title": ["9.2.1.4", "F1 SETUP REQUEST"]
    },
    {
      "Keywords": ["UE Context Modification"],
      "title": ["9.2.2.8", "UE CONTEXT MODIFICATION RESPONSE"]
    },
    {
      "Keywords": ["DL RRC Message Transfer"],
      "title": ["9.2.3.2", "DL RRC MESSAGE TRANSFER"]
    }
  ],
  "expected_behaviour": "F1 Setup Request should contain a valid DU Served Cell List (e.g., proper length, valid IEs). If invalid, CU should reject with F1 Setup Failure. UE Context Modification should proceed with valid IDs. DL RRC Message Transfer is typically CU-to-DU; if involved in a repro sequence, its direction and content must be validated."
}

### 2.3 Check

**Verification:** Check that 3-6 relevant sections are identified and `expected_behaviour` is summarized.
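
One possible way to script this check is shown below. The function name `check_spec_sections` is hypothetical; the field names and the 3 to 6 bound come from the output schema in 2.1.

```python
def check_spec_sections(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the Step 2 check passes."""
    problems = []
    sections = doc.get("candidate_spec_sections", [])
    if not 3 <= len(sections) <= 6:
        problems.append(f"expected 3-6 sections, got {len(sections)}")
    for i, section in enumerate(sections):
        for key in ("Keywords", "title"):
            # Both fields are arrays of strings in the schema and must not be empty.
            if not section.get(key):
                problems.append(f"section {i}: '{key}' is missing or empty")
    if not str(doc.get("expected_behaviour", "")).strip():
        problems.append("expected_behaviour is missing or empty")
    return problems
```

The helper only confirms that the structure is complete. Whether a section is actually relevant still has to be confirmed against the spec text, for example with `spec_ingestor.py`.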

## Step 3: Code Fetcher (metadata-driven)

### 3.1 Requirements

**Goal:** Select likely source files/functions using metadata (paths, interface names, keywords, ASN.1 types, timers).

**Input:**
- `bug_card`: From Step 1.
- `candidate_spec_sections`: From Step 2.
- `repo_metadata`: Path, interface names, and known keywords.

**Role & Instructions:**
Act as **CodeLocator**. Build a prioritized list of files/functions to fetch using interface names, procedure/timer, and keywords. Plan recursion if initial fetch seems insufficient.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["candidate_code"],
        "properties": {
          "candidate_code": {
            "type": "array",
            "items": {
              "type": "object",
              "required": ["path", "function_name", "reason"],
              "properties": { "path": { "type": "string" }, "function_name": { "type": "string" }, "reason": { "type": "string" } }
            }
          }
        }
      }
```

### 3.2 Results

{
  "candidate_code": [
    {
      "path": "openair2/RRC/NR/rrc_gNB_du.c",
      "function_name": "rrc_gNB_process_f1_setup_req",
      "reason": "Handles F1 Setup Request logic in RRC, including cell capacity and info validation."
    },
    {
      "path": "openair2/F1AP/lib/f1ap_interface_management.c",
      "function_name": "decode_f1ap_setup_request",
      "reason": "Decodes F1 Setup Request and parses the DU Served Cell List IE."
    },
    {
      "path": "openair2/F1AP/f1ap_cu_interface_management.c",
      "function_name": "CU_handle_F1_SETUP_REQUEST",
      "reason": "Entry point for F1 Setup Request handling in CU F1AP task."
    }
  ]
}

### 3.3 Check

**Verification:** Check that `candidate_code` is not empty and includes paths and function names.
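
A sketch of this check follows. Beyond what the verification asks for, it can optionally confirm that each listed path exists under a local checkout; that extra step, the `repo_root` parameter, and the name `check_candidate_code` are assumptions made for illustration.

```python
from pathlib import Path


def check_candidate_code(doc: dict, repo_root=None) -> list[str]:
    """Return a list of problems; an empty list means the Step 3 check passes."""
    problems = []
    entries = doc.get("candidate_code", [])
    if not entries:
        problems.append("candidate_code is empty")
    for i, entry in enumerate(entries):
        for key in ("path", "function_name", "reason"):
            if not entry.get(key):
                problems.append(f"entry {i}: '{key}' is missing or empty")
        # Optional: confirm the file exists in a local checkout of the repository.
        if repo_root is not None and entry.get("path"):
            if not (Path(repo_root) / entry["path"]).is_file():
                problems.append(f"entry {i}: file not found: {entry['path']}")
    return problems
```

The existence test catches paths that were guessed rather than looked up. It does not confirm that the named function is defined in that file.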

## Step 4: Lightweight Retriever

### 4.1 Requirements

**Goal:** Retrieve high-signal snippets only from shortlisted specs and code.

**Input:**
- `candidate_spec_sections`: From Step 2.
- `candidate_code`: From Step 3.

**Role & Instructions:**
Act as **SnippetRetriever**. For each query, return top snippets with exact locations (spec id/anchor, file:line-range). Keep each snippet ≤ 120 words.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["snippets"],
        "properties": {
          "snippets": {
            "type": "array",
            "items": {
              "type": "object",
              "required": ["source", "location", "text", "kind"],
              "properties": {
                "kind": { "type": "string", "enum": ["spec", "code"] },
                "source": { "type": "string" },
                "location": { "type": "string" },
                "text": { "type": "string" }
              }
            }
          }
        }
      }
```

### 4.2 Results

{
  "snippets": [
    {
      "kind": "code",
      "source": "openair2/RRC/NR/rrc_gNB_du.c",
      "location": "307-312",
      "text": "if (req->num_cells_available != 1) { ... fail.cause = F1AP_CauseRadioNetwork_gNB_CU_Cell_Capacity_Exceeded; rrc->mac_rrc.f1_setup_failure(assoc_id, &fail); return; }"
    },
    {
      "kind": "code",
      "source": "openair2/F1AP/lib/f1ap_interface_management.c",
      "location": "844",
      "text": "AssertError(out->num_cells_available > 0, return false, \"at least 1 cell must be present\");"
    },
    {
      "kind": "spec",
      "source": "TS 38.473",
      "location": "9.2.1.4",
      "text": "F1 SETUP REQUEST ... gNB-DU Served Cells List ... This IE contains the information of the cells served by the gNB-DU."
    }
  ]
}

### 4.3 Check

**Verification:** Check that snippets are retrieved from both spec and code, with exact locations.
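
This check, together with the 120-word limit from 4.1, might be scripted as below. `check_snippets` is a hypothetical helper, and counting words by splitting on whitespace is a simplifying assumption.

```python
def check_snippets(doc: dict, max_words: int = 120) -> list[str]:
    """Return a list of problems; an empty list means the Step 4 check passes."""
    problems = []
    snippets = doc.get("snippets", [])
    kinds = {snippet.get("kind") for snippet in snippets}
    for needed in ("spec", "code"):
        if needed not in kinds:
            problems.append(f"no snippet of kind '{needed}'")
    for i, snippet in enumerate(snippets):
        if not snippet.get("location"):
            problems.append(f"snippet {i}: location is missing")
        words = len(snippet.get("text", "").split())
        if words > max_words:
            problems.append(f"snippet {i}: {words} words exceeds limit of {max_words}")
    return problems
```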

## Step 5: Event/State Sketcher (draft)

### 5.1 Requirements

**Goal:** Draft the expected message sequence and a 5–8 state sketch from the snippets.

**Input:**
- `bug_card`: From Step 1.
- `snippets`: From Step 4.

**Role & Instructions:**
Act as **SketchMaker**. Build (1) an expected message sequence as ordered events and (2) a compact state machine (5–8 states, transitions with guards). Note assumptions and open questions.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["sequence_diagram", "state_machine", "assumptions", "questions", "review_request"],
        "properties": {
          "sequence_diagram": {
            "type": "array",
            "items": { "type": "object", "required": ["from", "to", "message"], "properties": { "from": { "type": "string" }, "to": { "type": "string" }, "message": { "type": "string" }, "precond": { "type": "string" }, "postcond": { "type": "string" } } }
          },
          "state_machine": {
            "type": "object",
            "required": ["states", "transitions"],
            "properties": {
              "states": { "type": "array", "items": { "type": "string" }, "minItems": 5, "maxItems": 8 },
              "transitions": {
                "type": "array",
                "items": { "type": "object", "required": ["from", "to", "on"], "properties": { "from": { "type": "string" }, "to": { "type": "string" }, "on": { "type": "string" }, "guard": { "type": "string" } } }
              }
            }
          },
          "assumptions": { "type": "array", "items": { "type": "string" } },
          "questions": { "type": "array", "items": { "type": "string" } },
          "review_request": { "type": "string", "enum": ["HumanInLoop_B_required"] }
        }
      }
```

### 5.2 Results

{
  "sequence_diagram": [
    {
      "from": "Attacker",
      "to": "gNB-CU",
      "message": "UE Context Modification (Tampered RNTI)"
    },
    {
      "from": "Attacker/DU",
      "to": "gNB-CU",
      "message": "DL RRC Message Transfer (Unexpected)"
    },
    {
      "from": "Attacker/DU",
      "to": "gNB-CU",
      "message": "F1 Setup Request (Mutated Served Cell List)"
    },
    {
      "from": "gNB-CU",
      "to": "Attacker",
      "message": "No Response (Hang)"
    }
  ],
  "state_machine": {
    "states": [
      "IDLE",
      "CONNECTED",
      "F1_SETUP_PROCESSING",
      "VALIDATING_CELLS",
      "HANG",
      "REJECT"
    ],
    "transitions": [
      {
        "from": "IDLE",
        "to": "F1_SETUP_PROCESSING",
        "on": "F1 Setup Request"
      },
      {
        "from": "F1_SETUP_PROCESSING",
        "to": "VALIDATING_CELLS",
        "on": "Decode Success"
      },
      {
        "from": "VALIDATING_CELLS",
        "to": "HANG",
        "on": "Mutated List Traversal",
        "guard": "Missing Bounds Check"
      }
    ]
  },
  "assumptions": [
    "Attacker can spoof DU messages and maintain SCTP association.",
    "CU is in a state where it accepts F1 Setup (e.g., initial or re-setup)."
  ],
  "questions": [
    "Does the mutation involve recursive structures or large lists?",
    "Is the hang a CPU spin or a deadlock?"
  ],
  "review_request": "HumanInLoop_B_required"
}

### 5.3 Check

**Verification:** Check for a sequence diagram, a state machine with 5-8 states, and any assumptions.
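
The state-machine part of this check could be sketched as follows. In addition to the 5 to 8 state bound, it flags transitions that name undeclared states and states that no transition touches; those two extra checks and the name `check_state_machine` are assumptions that go beyond the stated verification.

```python
def check_state_machine(sm: dict) -> list[str]:
    """Return a list of problems found in a Step 5 `state_machine` object."""
    problems = []
    states = sm.get("states", [])
    if not 5 <= len(states) <= 8:
        problems.append(f"expected 5-8 states, got {len(states)}")
    declared = set(states)
    used = set()
    for i, transition in enumerate(sm.get("transitions", [])):
        for end in ("from", "to"):
            name = transition.get(end)
            if name not in declared:
                problems.append(f"transition {i}: unknown state {name!r} in '{end}'")
            used.add(name)
        if not transition.get("on"):
            problems.append(f"transition {i}: missing 'on' event")
    # A declared state that appears in no transition is probably an unfinished sketch.
    for name in states:
        if name not in used:
            problems.append(f"state {name!r} has no transitions")
    return problems
```

Applied to the draft in 5.2, this would report `CONNECTED` and `REJECT` as states without transitions, which is a useful prompt for the human review that `review_request` asks for.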

## Step 6: Conformance Checker (simple invariants)

### 6.1 Requirements

**Goal:** Check core invariants and propose a preliminary verdict.

**Input:**
- `bug_card`: From Step 1.
- `sequence_diagram`: From Step 5.
- `state_machine`: From Step 5.
- `snippets`: From Step 4.

**Role & Instructions:**
Act as **ConformanceChecker**. Using snippets as authority, test invariants: directionality, req/resp pairing, ID binding, transaction scoping, timer preconditions. For each invariant: PASS/FAIL with cite. Propose preliminary verdict.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["invariant_checks", "preliminary_verdict", "rationale"],
        "properties": {
          "invariant_checks": {
            "type": "array",
            "items": { "type": "object", "required": ["name", "result", "evidence"], "properties": { "name": { "type": "string" }, "result": { "type": "string", "enum": ["PASS", "FAIL", "UNKNOWN"] }, "evidence": { "type": "array", "items": { "type": "string" } } } }
          },
          "preliminary_verdict": { "type": "string", "enum": ["spec-nonconformance", "implementation-bug", "ambiguous", "needs-more-data"] },
          "rationale": { "type": "string" }
        }
      }
```

### 6.2 Results

{
  "invariant_checks": [
    {
      "name": "F1 Setup Cell Count Limit",
      "result": "PASS",
      "evidence": [
        "rrc_gNB_du.c:307 explicitly checks num_cells_available != 1 and sends failure."
      ]
    },
    {
      "name": "Input Robustness (Mutated Data)",
      "result": "FAIL",
      "evidence": [
        "Observed behavior indicates a hang/stall when processing mutated DU-Served-Cell-List.",
        "Code relies on ASN.1 decoder (uper_decode_complete) and subsequent struct access; specific mutation likely bypasses checks.",
        "CRITICAL: decode_f1ap_setup_request lacks upper bound check for num_cells_available against F1AP_MAX_NB_CELLS (2), allowing buffer overflow."
      ]
    }
  ],
  "preliminary_verdict": "implementation-bug",
  "rationale": "The gNB-CU fails to gracefully handle invalid/mutated input, resulting in a system hang. This violates robustness requirements, even if specific logical checks (like cell count) exist. The lack of bounds check on num_cells_available in decode_f1ap_setup_request is the primary cause."
}

### 6.3 Check

**Verification:** Check that invariants are tested and a preliminary verdict is proposed.
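
A minimal sketch of this check is given below. It also enforces the "PASS/FAIL with cite" instruction from 6.1 by requiring evidence for every decided invariant. The helper name `check_conformance` is hypothetical.

```python
RESULTS = {"PASS", "FAIL", "UNKNOWN"}
VERDICTS = {"spec-nonconformance", "implementation-bug", "ambiguous", "needs-more-data"}


def check_conformance(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the Step 6 check passes."""
    problems = []
    checks = doc.get("invariant_checks", [])
    if not checks:
        problems.append("no invariants were tested")
    for i, check in enumerate(checks):
        result = check.get("result")
        if result not in RESULTS:
            problems.append(f"check {i}: result must be PASS, FAIL, or UNKNOWN")
        elif result != "UNKNOWN" and not check.get("evidence"):
            # A decided invariant must cite at least one snippet.
            problems.append(f"check {i}: {result} needs at least one citation")
    if doc.get("preliminary_verdict") not in VERDICTS:
        problems.append("preliminary_verdict is missing or not in VerdictEnum")
    if not str(doc.get("rationale", "")).strip():
        problems.append("rationale is missing or empty")
    return problems
```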

## Step 7: Evidence Linker

### 7.1 Requirements

**Goal:** Bundle a compact evidence pack linking spec clauses and code lines with an expected-vs-observed chain.

**Input:**
- `bug_card`: From Step 1.
- `invariant_checks`: From Step 6.
- `snippets`: From Step 4.

**Role & Instructions:**
Act as **EvidenceBundler**. Assemble a minimal yet complete pack: spec refs (id/anchor), code refs (file:lines), and a single paragraph contrasting expected vs observed.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["evidence_pack"],
        "properties": {
          "evidence_pack": {
            "type": "object",
            "required": ["spec_citations", "code_citations", "expected_vs_observed"],
            "properties": {
              "spec_citations": { "type": "array", "items": { "type": "string" } },
              "code_citations": { "type": "array", "items": { "type": "string" } },
              "expected_vs_observed": { "type": "string" }
            }
          }
        }
      }
```

### 7.2 Results

{
  "evidence_pack": {
    "spec_citations": [
      "TS 38.473 9.2.1.4 F1 SETUP REQUEST",
      "TS 38.473 9.3.1.5 gNB-DU Served Cells List"
    ],
    "code_citations": [
      "openair2/RRC/NR/rrc_gNB_du.c:292 (rrc_gNB_process_f1_setup_req)",
      "openair2/F1AP/lib/f1ap_interface_management.c:841 (decode_f1ap_setup_request)"
    ],
    "expected_vs_observed": "Expected the gNB-CU to validate the DU Served Cell List and reject any malformed or mutated content with an F1 Setup Failure message, or drop the message if decoding fails. Observed that the gNB-CU hangs upon receiving the mutated list, due to a stack buffer overflow in decode_f1ap_setup_request where num_cells_available (unchecked) exceeds the fixed array size of 2."
  }
}

### 7.3 Check

**Verification:** Check that an evidence pack with spec and code citations and a comparison paragraph is created.
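
Since 7.1 asks for code refs as `file:lines`, the code citations can be parsed mechanically before a human follows them. The sketch below assumes repository-relative paths in the form `path:line` or `path:start-end`, optionally followed by a function name in parentheses, as in the 7.2 results. The regular expression and `parse_code_citation` are illustrative, not part of the workflow definition.

```python
import re

# path:line or path:start-end, optionally followed by " (function_name)".
CITATION = re.compile(
    r"^(?P<path>[^:\s]+):(?P<start>\d+)(?:-(?P<end>\d+))?(?:\s+\((?P<function>\w+)\))?$"
)


def parse_code_citation(text: str):
    """Return the parsed citation as a dict, or None if the format is not recognized."""
    match = CITATION.match(text.strip())
    if match is None:
        return None
    start = int(match["start"])
    end = int(match["end"]) if match["end"] else start
    if end < start:
        return None
    return {"path": match["path"], "start": start, "end": end, "function": match["function"]}
```

A citation that carries only a file path, with no line number, is returned as `None`, which makes imprecise references easy to spot. Paths with a drive letter such as `p:/...` are also rejected because of the extra colon.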

## Step 8: Report Generator

### 8.1 Requirements

**Goal:** Emit a one-page final report and request human sign-off on verdict.

**Input:**
- `bug_card`: From Step 1.
- `invariant_checks`: From Step 6.
- `evidence_pack`: From Step 7.

**Role & Instructions:**
Act as **ReportWriter**. Produce the final JSON report with {verdict, rationale, citations, suggested_repro, risks, next_steps}. Be concise and actionable.

**Output Schema:**
Return JSON only conforming to the following structure:

```json
      "output_schema": {
        "type": "object",
        "required": ["report", "review_request"],
        "properties": {
          "report": {
            "type": "object",
            "required": ["verdict", "rationale", "citations", "suggested_repro", "next_steps"],
            "properties": {
              "verdict": { "type": "string", "$ref": "#/$defs/VerdictEnum" },
              "rationale": { "type": "string" },
              "citations": {
                "type": "object",
                "properties": {
                  "spec": { "type": "array", "items": { "type": "string" } },
                  "code": { "type": "array", "items": { "type": "string" } }
                }
              },
              "suggested_repro": { "type": "array", "items": { "type": "string" } },
              "risks": { "type": "array", "items": { "type": "string" } },
              "next_steps": { "type": "array", "items": { "type": "string" } }
            }
          },
          "review_request": { "type": "string", "enum": ["HumanInLoop_C_required"] }
        }
      }
```

### 8.2 Results

{
  "report": {
    "verdict": "implementation-bug",
    "rationale": "The gNB-CU fails to handle mutated DU-Served-Cell-List in F1 Setup Request gracefully, leading to a hang. This is likely due to improper validation of recursive or complex IEs in the Served Cell List during RRC processing or ASN.1 decoding. Specifically, `decode_f1ap_setup_request` does not check if `num_cells_available` exceeds `F1AP_MAX_NB_CELLS` (2), causing a buffer overflow.",
    "citations": {
      "spec": [
        "TS 38.473 9.2.1.4 F1 Setup Request"
      ],
      "code": [
        "openair2/RRC/NR/rrc_gNB_du.c",
        "openair2/F1AP/lib/f1ap_interface_management.c"
      ]
    },
    "suggested_repro": [
      "Send F1 Setup Request with mutated/malformed DU Served Cell List (e.g. invalid lengths, recursive structures, or >2 cells)."
    ],
    "risks": [
      "DoS attack against gNB-CU.",
      "Potential RCE due to buffer overflow."
    ],
    "next_steps": [
      "Debug the specific mutation that causes the hang.",
      "Add bounds checks in F1AP decoding and RRC processing (ensure num_cells_available <= 2)."
    ]
  },
  "review_request": "HumanInLoop_C_required"
}

### 8.3 Check

**Verification:** Check that the final report contains a verdict, rationale, citations, and next steps.
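
As with the earlier steps, this final check can be approximated in a few lines. The sketch below follows the required fields of the schema in 8.1 and additionally confirms that the human sign-off marker is present; `check_report` is a hypothetical name.

```python
VERDICTS = {"spec-nonconformance", "implementation-bug", "ambiguous", "needs-more-data"}
REQUIRED = ("verdict", "rationale", "citations", "suggested_repro", "next_steps")


def check_report(doc: dict) -> list[str]:
    """Return a list of problems; an empty list means the Step 8 check passes."""
    report = doc.get("report")
    if not isinstance(report, dict):
        return ["top-level 'report' object is missing"]
    problems = [f"missing or empty field: {key}" for key in REQUIRED if not report.get(key)]
    if report.get("verdict") and report["verdict"] not in VERDICTS:
        problems.append(f"verdict {report['verdict']!r} is not in VerdictEnum")
    citations = report.get("citations")
    if isinstance(citations, dict):
        for kind in ("spec", "code"):
            if not citations.get(kind):
                problems.append(f"citations.{kind} is empty")
    if doc.get("review_request") != "HumanInLoop_C_required":
        problems.append("review_request must be 'HumanInLoop_C_required'")
    return problems
```

A passing result only means the report is structurally complete. The verdict itself still requires the human sign-off named in the goal of this step.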