# Plan Phase: Investigation Plan Generation

## Overview

The Plan Phase turns the Knowledge Graph built in Phase0 into an Investigation Plan that Phase1 executes. The plan is an ordered list of steps for investigating volume I/O issues, and each step names a tool, its arguments, and the expected result.

### Key Components

- **InvestigationPlanner**: Orchestrates the plan generation process
- **Rule-based Plan Generator**: Creates initial investigation steps based on predefined rules
- **Static Plan Steps**: Incorporates mandatory steps from `static_plan_step.json`
- **LLM Plan Generator**: Refines and enhances the plan using an LLM without tool invocation

### Three-Step Process

1. **Rule-based preliminary steps**: Generate critical initial investigation steps
2. **Static plan steps integration**: Add mandatory steps from `static_plan_step.json`
3. **LLM refinement**: Refine and supplement the plan using an LLM without tool invocation
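The three stages above can be sketched as a small pipeline. This is an illustrative outline only, not the actual `InvestigationPlanner` code: the function names (`rule_based_steps`, `add_static_steps`, `refine_with_llm`, `build_plan`), the de-duplication rule, and the final renumbering are all assumptions made for the example.

```python
import json
from typing import Any, Callable, Dict, List, Optional

Step = Dict[str, Any]


def rule_based_steps(pod_name: str, namespace: str) -> List[Step]:
    """Stage 1 (hypothetical rule): always start by inspecting the target pod."""
    return [{
        "step": "1",
        "description": "Get pod details",
        "tool": "kg_get_entity_info",
        "arguments": {"entity_type": "Pod", "id": f"gnode:Pod:{namespace}/{pod_name}"},
        "expected": "Pod configuration and status",
    }]


def _call_key(step: Step) -> str:
    """Identify a step by its tool name and arguments."""
    return step["tool"] + json.dumps(step["arguments"], sort_keys=True)


def add_static_steps(steps: List[Step], static_steps: List[Step]) -> List[Step]:
    """Stage 2: append mandatory steps, skipping tool calls already in the plan."""
    merged = list(steps)
    seen = {_call_key(step) for step in merged}
    for step in static_steps:
        if _call_key(step) not in seen:
            merged.append(dict(step))
            seen.add(_call_key(step))
    return merged


def refine_with_llm(
    steps: List[Step],
    llm: Optional[Callable[[List[Step]], List[Step]]] = None,
) -> List[Step]:
    """Stage 3: let an LLM callable rewrite the draft; it only returns steps
    and invokes no tools. With no LLM, the draft is returned unchanged."""
    return llm(steps) if llm is not None else steps


def build_plan(pod_name: str, namespace: str, static_steps: List[Step],
               llm: Optional[Callable[[List[Step]], List[Step]]] = None) -> List[Step]:
    draft = rule_based_steps(pod_name, namespace)
    draft = add_static_steps(draft, static_steps)
    final = refine_with_llm(draft, llm)
    # Renumber so the merged plan reads 1, 2, 3, ... (an assumption of this sketch)
    return [{**step, "step": str(number)} for number, step in enumerate(final, start=1)]


static_steps = [{
    "step": "S1",
    "description": "Check for primary issues",
    "tool": "kg_get_all_issues",
    "arguments": {"severity": "primary"},
    "expected": "Primary issues in the system",
}]

for step in build_plan("test-pod", "default", static_steps):
    print(step["step"], step["tool"])
```

Keeping each stage as a plain function makes it easy to run the pipeline without an LLM, which is what the mock later in this notebook effectively does.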

In [1]:
# Import necessary libraries
import json
import os
from typing import Dict, List, Any, Optional

# Import mock data for demonstration
import sys
sys.path.append('../')
from tests.mock_knowledge_graph import create_mock_knowledge_graph

## Mock Static Plan Steps

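First, let's create a mock version of the static plan steps that would normally be loaded from `static_plan_step.json`. The same cell also defines a mock fallback step, which carries an extra `trigger` field naming the failure it responds to.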

In [2]:
# Mock static plan steps
MOCK_STATIC_PLAN_STEPS = [
    {
        "step": "S1",
        "description": "Check for primary issues",
        "tool": "kg_get_all_issues",
        "arguments": {"severity": "primary"},
        "expected": "Primary issues in the system"
    },
    {
        "step": "S2",
        "description": "Analyze issues",
        "tool": "kg_analyze_issues",
        "arguments": {},
        "expected": "Root cause analysis and patterns"
    }
]

# Mock fallback steps
MOCK_FALLBACK_STEPS = [
    {
        "step": "F1",
        "description": "Print Knowledge Graph",
        "tool": "kg_print_graph",
        "arguments": {"include_details": True, "include_issues": True},
        "expected": "Complete system visualization",
        "trigger": "kg_get_entity_info_failed"
    }
]
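Since every step in these lists shares the same fields, a small validator can catch malformed entries before they reach the planner. The sketch below assumes the five keys seen in the mock data are required and that fallback steps additionally need `trigger`; the real schema of `static_plan_step.json` may differ, and `validate_steps` is a hypothetical helper.

```python
from typing import Any, Dict, List

# Keys observed in the mock steps; assumed (not confirmed) to be mandatory
REQUIRED_KEYS = {"step", "description", "tool", "arguments", "expected"}


def validate_steps(steps: List[Dict[str, Any]], fallback: bool = False) -> List[str]:
    """Return a list of human-readable problems; an empty list means all steps look valid."""
    required = REQUIRED_KEYS | ({"trigger"} if fallback else set())
    problems = []
    for index, step in enumerate(steps):
        label = step.get("step", f"#{index}")
        missing = sorted(required - step.keys())
        if missing:
            problems.append(f"step {label}: missing {', '.join(missing)}")
        if not isinstance(step.get("arguments", {}), dict):
            problems.append(f"step {label}: 'arguments' must be a dict")
    return problems


good_step = {
    "step": "S2",
    "description": "Analyze issues",
    "tool": "kg_analyze_issues",
    "arguments": {},
    "expected": "Root cause analysis and patterns",
}

print(validate_steps([good_step]))                 # no problems for a main step
print(validate_steps([good_step], fallback=True))  # a fallback step also needs 'trigger'
```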

In [3]:
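class MockInvestigationPlanner:
    """Mock Investigation Planner that returns a fixed two-step plan.

    The steps are hard-coded; the static plan steps and the LLM refinement
    are not applied in this mock.
    """

    def __init__(self, knowledge_graph, config_data=None):
        self.knowledge_graph = knowledge_graph
        self.config_data = config_data or {}
        print("Initializing Investigation Planner...")

    def generate_investigation_plan(self, pod_name, namespace, volume_path, message_list=None):
        """Return (formatted_plan, message_list); message_list is passed through unchanged."""
        print(f"Generating investigation plan for pod {namespace}/{pod_name}")

        # Hard-coded steps standing in for the rule-based generator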
        steps = [
            {
                "step": "1",
                "description": "Get pod details",
                "tool": "kg_get_entity_info",
                "arguments": {"entity_type": "Pod", "id": f"gnode:Pod:{namespace}/{pod_name}"},
                "expected": "Pod configuration and status"
            },
            {
                "step": "2",
                "description": "Check related PVC",
                "tool": "kg_find_path",
                "arguments": {
                    "source_entity_type": "Pod",
                    "source_id": f"gnode:Pod:{namespace}/{pod_name}",
                    "target_entity_type": "PVC",
                    "target_id": "*"
                },
                "expected": "Path from Pod to PVC"
            }
        ]
        
        # Format the plan
        formatted_plan = f"Investigation Plan:\nTarget: Pod {namespace}/{pod_name}, Volume Path: {volume_path}\nGenerated Steps: {len(steps)} steps\n\n"
        
        for step in steps:
            formatted_plan += f"Step {step['step']}: {step['description']} | Tool: {step['tool']} | Expected: {step['expected']}\n"
        
        return formatted_plan, message_list
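The mock prints only the tool name for each step, whereas the comprehensive plan shown later in this notebook renders each tool together with its arguments, for example `kg_analyze_issues()`. One way to produce that call-style rendering is sketched below; `format_tool_call` and `format_step` are hypothetical helpers, not part of the planner.

```python
from typing import Any, Dict


def format_tool_call(tool: str, arguments: Dict[str, Any]) -> str:
    """Render a tool and its arguments as tool(name=value, ...) using repr() for values."""
    rendered = ", ".join(f"{name}={value!r}" for name, value in arguments.items())
    return f"{tool}({rendered})"


def format_step(step: Dict[str, Any]) -> str:
    """Render one plan line; a 'trigger' field, if present, is appended at the end."""
    line = (
        f"Step {step['step']}: {step['description']} | "
        f"Tool: {format_tool_call(step['tool'], step['arguments'])} | "
        f"Expected: {step['expected']}"
    )
    if "trigger" in step:
        line += f" | Trigger: {step['trigger']}"
    return line


fallback_step = {
    "step": "F1",
    "description": "Print Knowledge Graph",
    "tool": "kg_print_graph",
    "arguments": {"include_details": True, "include_issues": True},
    "expected": "Complete system visualization",
    "trigger": "kg_get_entity_info_failed",
}

print(format_tool_call("kg_analyze_issues", {}))  # kg_analyze_issues()
print(format_step(fallback_step))
```

Using `repr()` for values keeps string arguments quoted and booleans unquoted, which matches how the arguments appear in the sample plan.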

## Running the Plan Phase

Now let's run the Plan Phase with our mock implementation.

In [4]:
# Create a mock knowledge graph
knowledge_graph = create_mock_knowledge_graph()

# Define the target pod, namespace, and volume path
target_pod = "test-pod"
target_namespace = "default"
target_volume_path = "/var/lib/kubelet/pods/pod-123-456/volumes/kubernetes.io~csi/test-pv/mount"

# Define configuration data
config_data = {
    "plan_phase": {
        "save_plan": True
    }
}

# Initialize the investigation planner
planner = MockInvestigationPlanner(knowledge_graph, config_data)

# Generate the investigation plan
investigation_plan, _ = planner.generate_investigation_plan(target_pod, target_namespace, target_volume_path)

Initializing Investigation Planner...
Generating investigation plan for pod default/test-pod


In [5]:
# Display the investigation plan
print(investigation_plan)

Investigation Plan:
Target: Pod default/test-pod, Volume Path: /var/lib/kubelet/pods/pod-123-456/volumes/kubernetes.io~csi/test-pv/mount
Generated Steps: 2 steps

Step 1: Get pod details | Tool: kg_get_entity_info | Expected: Pod configuration and status
Step 2: Check related PVC | Tool: kg_find_path | Expected: Path from Pod to PVC



## Sample Investigation Plan

A complete investigation plan would contain more main steps as well as fallback steps. Here is an example of a more comprehensive plan:

In [6]:
sample_plan = """
Investigation Plan:
Target: Pod default/example-pod, Volume Path: /var/lib/kubelet/pods/123/volumes/kubernetes.io~csi/pvc-abc/mount
Generated Steps: 8 steps

Step 1: Get pod details | Tool: kg_get_entity_info(entity_type='Pod', id='gnode:Pod:default/example-pod') | Expected: Pod configuration and status
Step 2: Check related PVC | Tool: kg_find_path(source_entity_type='Pod', source_id='gnode:Pod:default/example-pod', target_entity_type='PVC', target_id='*') | Expected: Path from Pod to PVC
Step 3: Get PVC details | Tool: kg_get_entity_info(entity_type='PVC', id='gnode:PVC:default/example-pvc') | Expected: PVC configuration and status
Step 4: Check related PV | Tool: kg_find_path(source_entity_type='PVC', source_id='gnode:PVC:default/example-pvc', target_entity_type='PV', target_id='*') | Expected: Path from PVC to PV
Step 5: Get PV details | Tool: kg_get_entity_info(entity_type='PV', id='gnode:PV:pv-example') | Expected: PV configuration and status
Step 6: Check node status | Tool: kg_get_entity_info(entity_type='Node', id='gnode:Node:worker-1') | Expected: Node status and conditions
Step 7: Check for issues | Tool: kg_get_all_issues(severity='primary') | Expected: Primary issues in the system
Step 8: Analyze issues | Tool: kg_analyze_issues() | Expected: Root cause analysis and patterns

Fallback Steps (if main steps fail):
Step F1: Print Knowledge Graph | Tool: kg_print_graph(include_details=True, include_issues=True) | Expected: Complete system visualization | Trigger: kg_get_entity_info_failed
Step F2: Check system logs | Tool: kubectl_logs(pod_name='example-pod', namespace='default') | Expected: Pod logs for error messages | Trigger: kg_get_all_issues_failed
"""

print(sample_plan)



## Summary

The Plan Phase is responsible for generating an Investigation Plan that guides the troubleshooting process in Phase1. It follows a three-step process:

1. Rule-based preliminary steps: Generate critical initial investigation steps
2. Static plan steps integration: Add mandatory steps from `static_plan_step.json`
3. LLM refinement: Refine and supplement the plan using an LLM without tool invocation

The output of the Plan Phase is a structured Investigation Plan that includes:

- Main investigation steps with tools, arguments, and expected outcomes
- Fallback steps that can be triggered if main steps fail
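As a rough illustration of how fallback steps might be matched to a failure, the sketch below assumes a trigger is the failed tool's name followed by `_failed`, a convention inferred from the `kg_get_entity_info_failed` and `kg_get_all_issues_failed` triggers in the sample plan. How Phase1 actually evaluates triggers is not shown in this notebook, and `select_fallbacks` is a hypothetical helper.

```python
from typing import Any, Dict, List

Step = Dict[str, Any]


def select_fallbacks(failed_tool: str, fallback_steps: List[Step]) -> List[Step]:
    """Return the fallback steps whose trigger matches the tool that failed."""
    # Assumed convention: trigger == "<tool name>_failed"
    trigger = f"{failed_tool}_failed"
    return [step for step in fallback_steps if step.get("trigger") == trigger]


# Fallback steps taken from the sample plan
fallback_steps = [
    {
        "step": "F1",
        "description": "Print Knowledge Graph",
        "tool": "kg_print_graph",
        "arguments": {"include_details": True, "include_issues": True},
        "expected": "Complete system visualization",
        "trigger": "kg_get_entity_info_failed",
    },
    {
        "step": "F2",
        "description": "Check system logs",
        "tool": "kubectl_logs",
        "arguments": {"pod_name": "example-pod", "namespace": "default"},
        "expected": "Pod logs for error messages",
        "trigger": "kg_get_all_issues_failed",
    },
]

for step in select_fallbacks("kg_get_entity_info", fallback_steps):
    print(f"Fallback {step['step']}: {step['description']}")
```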

This Investigation Plan serves as the roadmap for Phase1, which will execute the plan to identify the root cause of the volume I/O issues.