# Assignment: Test Fail-Safe Handling and Memory Debugging in LangGraph


---

### Objective:
This assignment focuses on two critical aspects of robust LangGraph application development:
1.  **Fail-Safe Handling**: Designing a LangGraph workflow that can gracefully handle errors or unexpected outcomes in one of its nodes, preventing a complete crash and allowing for alternative execution paths.
2.  **Memory Debugging**: Understanding and inspecting the state (memory) of the LangGraph workflow at each step of its execution to effectively debug and trace data flow.

---

### Instructions:
1.  **Environment Setup**: Install the necessary Python libraries: `pip install langgraph langchain_core`.
2.  **Jupyter Notebook**: All your code, outputs, observations, and analysis must be documented in this Jupyter Notebook.
3.  **Workflow Scenario**: You will build a three-node workflow for a simulated data processing task:
    * **Data Fetcher**: Simulates fetching data, which can *sometimes fail*.
    * **Data Processor**: Processes the fetched data.
    * **Report Generator**: Generates a final report.
4.  **Fail-Safe Implementation**: If the `Data Fetcher` node fails, the graph should skip the `Data Processor` and directly transition to the `Report Generator`, which will then generate an "error report" or a report indicating data unavailability.
5.  **Memory Debugging**: Use LangGraph's streaming capabilities (`app.stream()`) to observe and print the state of the graph after each node's execution. Add print statements within nodes to show intermediate data and state updates.
6.  **Analysis**: Critically evaluate your implementation of fail-safe mechanisms and demonstrate your ability to debug the graph's memory.

---

## Part 1: Setup and Workflow State Definition
Begin by setting up your environment and defining the state your LangGraph workflow will manage.

### Task 1.1: Install Libraries
Ensure `langgraph` and `langchain_core` are installed.

In [None]:
# Install necessary libraries (if not already installed)
# !pip install langgraph langchain_core --quiet

from typing import TypedDict, Optional
from langgraph.graph import StateGraph, END
import random # For simulating random failures

print("Libraries imported!")

### Task 1.2: Define Workflow State
Define a `TypedDict` to represent the state of your LangGraph workflow. This state will hold all necessary information as it passes through the nodes.

* `query`: The initial request or topic for data fetching.
* `raw_data`: The data fetched by the `Data Fetcher` node.
* `processed_data`: The data after being processed by the `Data Processor` node.
* `report`: The final report generated.
* `fetch_success`: A boolean flag indicating if data fetching was successful (crucial for fail-safe).
* `error_message`: A string to store any error messages during data fetching.

In [None]:
class DataWorkflowState(TypedDict):
    query: str
    raw_data: Optional[str]
    processed_data: Optional[str]
    report: Optional[str]
    fetch_success: bool
    error_message: Optional[str]

print("DataWorkflowState defined!")
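
The nodes in this assignment return only the keys they change, not the whole state. As a rough mental model (an assumption for illustration, not LangGraph's actual implementation), a key with no custom reducer is overwritten by the returned value and every other key carries over. The hypothetical helper `apply_update` below mimics that behaviour with plain dictionaries:

```python
from typing import Any, Dict


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return a new state with the update's keys overwritten."""
    merged = dict(state)   # copy, so the previous snapshot stays intact
    merged.update(update)  # keys present in the update replace the old values
    return merged


state = {"query": "Q2 sales", "raw_data": None, "fetch_success": False, "error_message": None}
update = {"raw_data": "rows...", "fetch_success": True}

new_state = apply_update(state, update)
print(new_state)
```

Keeping the old snapshot unchanged makes it easy to compare the state before and after a node when debugging.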

---

## Part 2: Define Nodes with Fail-Safe Handling
Create the nodes, incorporating error handling and state updates for graceful failure.

### Task 2.1: `fetch_data_node` (with controlled failure)
This node simulates fetching data. Implement a mechanism where it *can fail* based on a probability or a specific input. If it fails, update the state accordingly (set `fetch_success` to `False` and populate `error_message`). Use `try-except` blocks.

In [None]:
def fetch_data_node(state: DataWorkflowState) -> DataWorkflowState:
    print(f"\n--- Node: fetch_data_node (Query: {state['query']}) ---")
    query = state["query"]

    try:
        # Simulate a failure condition: a 30% random chance of failure, or a
        # guaranteed failure when the query contains 'fail_data_fetch'.
        if random.random() < 0.3 or "fail_data_fetch" in query.lower():
            raise ConnectionError("simulated network error")

        print("Simulating data fetch SUCCESS.")
        simulated_data = f"Successfully fetched sales data for {query}. Key metrics: Revenue up 10%, Customers up 5%, Satisfaction 85%."
        return {
            "raw_data": simulated_data,
            "fetch_success": True,
            "error_message": None
        }
    except ConnectionError as exc:
        # Record the failure in the state instead of letting the exception crash the graph
        print("Simulating data fetch FAILURE!")
        return {
            "raw_data": None,
            "fetch_success": False,
            "error_message": f"Failed to fetch data for '{query}' due to a {exc}."
        }

print("fetch_data_node defined!")

### Task 2.2: `process_data_node`
This node processes the fetched data. It should only proceed if `fetch_success` is `True` in the state. Otherwise, it should skip processing and leave `processed_data` as `None`.

In [None]:
def process_data_node(state: DataWorkflowState) -> DataWorkflowState:
    print(f"\n--- Node: process_data_node (Fetch Success: {state['fetch_success']}) ---")
    if not state["fetch_success"]:
        print("Skipping data processing due to previous fetch failure.")
        return {"processed_data": None}
    else:
        raw_data = state["raw_data"]
        # Simulate data processing
        processed_data = f"Processed data: Derived key insights from '{raw_data}'. Identified growth opportunities in Q3 and risk factors related to customer churn."
        print("Data successfully processed.")
        return {"processed_data": processed_data}

print("process_data_node defined!")

### Task 2.3: `generate_report_node`
This node generates the final report. Its content should depend on whether data fetching and processing were successful. If `fetch_success` is `False`, generate an "Error Report"; otherwise, generate a report based on `processed_data`.

In [None]:
def generate_report_node(state: DataWorkflowState) -> DataWorkflowState:
    print(f"\n--- Node: generate_report_node (Fetch Success: {state['fetch_success']}) ---")
    if not state["fetch_success"]:
        report = f"Error Report for Query: {state['query']}\n\nData Fetching Status: Failed\nReason: {state['error_message']}\n\nCould not generate a full report due to data acquisition failure. Please investigate the data source or retry with a different query."
        print("Generated error report.")
    else:
        processed_data = state["processed_data"]
        report = f"Comprehensive Report for Query: {state['query']}\n\n-- Raw Data Summary --\n{state['raw_data']}\n\n-- Processed Insights --\n{processed_data}\n\n-- Conclusion --\nBased on the analysis, significant opportunities for growth are present, but require strategic investment in identified areas."
        print("Generated full report.")
    return {"report": report}

print("generate_report_node defined!")

---

## Part 3: Construct and Execute the LangGraph Workflow
Combine your nodes into a `StateGraph`, define conditional edges for fail-safe handling, and execute the workflow while observing its state.

### Task 3.1: Define Conditional Edge Logic
Create a function that will determine the next node based on the `fetch_success` flag in the state. This is your routing logic for fail-safe handling.

In [None]:
def route_after_fetch(state: DataWorkflowState) -> str:
    print(f"\n--- Router: route_after_fetch (Current fetch_success: {state['fetch_success']}) ---")
    if state["fetch_success"]:
        print("Routing to 'process_data' node.")
        return "process_data"
    else:
        print("Routing directly to 'generate_report' due to fetch failure.")
        return "generate_report"

print("route_after_fetch router defined!")
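
Because the router is a plain function of the state, both branches can be checked without building or running the graph. The sketch below uses a hypothetical helper, `check_router`, and a stand-in router with the same decision rule so that the snippet runs on its own; in the notebook you would pass `route_after_fetch` instead.

```python
from typing import Any, Callable, Dict


def check_router(router: Callable[[Dict[str, Any]], str]) -> None:
    """Hypothetical helper: assert that a router picks the expected branch."""
    assert router({"fetch_success": True}) == "process_data"
    assert router({"fetch_success": False}) == "generate_report"


# Stand-in with the same decision rule as route_after_fetch (illustration only).
def stand_in_router(state: Dict[str, Any]) -> str:
    return "process_data" if state["fetch_success"] else "generate_report"


check_router(stand_in_router)
print("Router checks passed.")
```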

### Task 3.2: Build and Compile the Graph
Assemble the nodes and define the edges, including the conditional edge for fail-safe handling.

* **Nodes**: Add `fetch_data_node`, `process_data_node`, `generate_report_node`.
* **Entry Point**: Start at `fetch_data_node`.
* **Conditional Edge**: From `fetch_data_node`, use `add_conditional_edges` with `route_after_fetch`.
* **Standard Edges**: From `process_data_node` to `generate_report_node`.
* **End Point**: From `generate_report_node` to `END`.

In [None]:
# Build the graph
workflow = StateGraph(DataWorkflowState)

# Add nodes
workflow.add_node("fetch_data", fetch_data_node)
workflow.add_node("process_data", process_data_node)
workflow.add_node("generate_report", generate_report_node)

# Set entry point
workflow.set_entry_point("fetch_data")

# Define conditional edge from fetch_data
workflow.add_conditional_edges(
    "fetch_data",
    route_after_fetch,
    {
        "process_data": "process_data",
        "generate_report": "generate_report" # Directly to report if fetch failed
    }
)

# Define direct edge from process_data to generate_report
workflow.add_edge("process_data", "generate_report")

# Define exit point
workflow.add_edge("generate_report", END)

# Compile the graph
app = workflow.compile()

print("LangGraph workflow compiled!")

### Task 3.3: Execute Workflow (Success Scenario)
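Run the compiled `app` with an input that is likely to succeed (no `fail_data_fetch` keyword). Because `fetch_data_node` also fails at random about 30% of the time, rerun the cell if this run takes the failure path. Observe the full state at each step to see how data flows and is transformed.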

In [None]:
print("\n\n=========== Running Workflow (Success Scenario) ===========")
initial_state_success = {
    "query": "Q2 2024 Market Trends",
    "raw_data": None, "processed_data": None, "report": None,
    "fetch_success": False, "error_message": None
}

final_state_success = None
# stream_mode="values" yields the full state after each step (the first
# snapshot is the input itself), so the last snapshot is the final state.
for state_snapshot in app.stream(initial_state_success, stream_mode="values"):
    print("\n[LangGraph State Snapshot]:")
    for key, value in state_snapshot.items():
        print(f"  {key}: {value}")
    final_state_success = state_snapshot # Keep track of the last state

print("\n--- Workflow Execution Complete (Success) --- ")
print("Final Report:\n", final_state_success.get('report', 'N/A'))

### Task 3.4: Execute Workflow (Failure Scenario)
Run the compiled `app` with an input that *will guarantee failure* in the `fetch_data_node` (e.g., by including the `fail_data_fetch` keyword in the query). Observe how the graph handles the failure and the resulting report.

In [None]:
print("\n\n=========== Running Workflow (Failure Scenario) ===========")
initial_state_failure = {
    "query": "Recent Stock Market Data (fail_data_fetch)", # Trigger failure
    "raw_data": None, "processed_data": None, "report": None,
    "fetch_success": False, "error_message": None
}

final_state_failure = None
# stream_mode="values" yields the full state after each step (the first
# snapshot is the input itself), so the last snapshot is the final state.
for state_snapshot in app.stream(initial_state_failure, stream_mode="values"):
    print("\n[LangGraph State Snapshot]:")
    for key, value in state_snapshot.items():
        print(f"  {key}: {value}")
    final_state_failure = state_snapshot # Keep track of the last state

print("\n--- Workflow Execution Complete (Failure) --- ")
print("Final Report:\n", final_state_failure.get('report', 'N/A'))
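
When every snapshot prints the whole state, it can be hard to spot what a step actually changed. One option is to diff consecutive snapshots. The sketch below is a minimal illustration: `diff_states` is a hypothetical helper, and the `snapshots` list is hand-written sample data standing in for full-state snapshots such as those yielded by `app.stream(..., stream_mode="values")`.

```python
from typing import Any, Dict, List


def diff_states(before: Dict[str, Any], after: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: return keys that are new or changed, with their new values."""
    return {k: v for k, v in after.items() if k not in before or before[k] != v}


# Hand-written sample snapshots (not real workflow output).
snapshots: List[Dict[str, Any]] = [
    {"query": "demo", "raw_data": None, "fetch_success": False, "report": None},
    {"query": "demo", "raw_data": "rows", "fetch_success": True, "report": None},
    {"query": "demo", "raw_data": "rows", "fetch_success": True, "report": "done"},
]

# Compare each snapshot with the one before it.
changes = [diff_states(prev, curr) for prev, curr in zip(snapshots, snapshots[1:])]
for step, change in enumerate(changes, start=1):
    print(f"Step {step} changed: {change}")
```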

---

## Part 4: Analysis and Reflection
Provide a comprehensive summary of your findings and reflections based on this assignment.

### Task 4.1: Fail-Safe Handling Analysis
* **How was fail-safe handled?**: Describe the mechanism you implemented to handle the failure in `fetch_data_node`. How did the graph's execution path change?
* **Effectiveness**: Was the fail-safe mechanism effective in preventing a full workflow crash? Did it produce a meaningful output in the failure scenario?
* **Alternative Strategies**: What other strategies could be used for fail-safe handling in LangGraph (e.g., retries, human-in-the-loop, more granular error types)? When would you use them?
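
As one illustration of the retry strategy mentioned above, a node function can be wrapped so that it runs again when it reports a failure. This is a minimal sketch, not a LangGraph feature: `with_retries` and `flaky_fetch` are hypothetical names, and the wrapper assumes the node signals failure through the `fetch_success` key, as in this assignment.

```python
import time
from typing import Any, Callable, Dict

Node = Callable[[Dict[str, Any]], Dict[str, Any]]


def with_retries(node: Node, max_attempts: int = 3, delay_seconds: float = 0.0) -> Node:
    """Hypothetical wrapper: re-run a node until it reports fetch_success."""
    def wrapped(state: Dict[str, Any]) -> Dict[str, Any]:
        update: Dict[str, Any] = {}
        for attempt in range(1, max_attempts + 1):
            update = node(state)
            if update.get("fetch_success"):
                return update
            if attempt < max_attempts:
                time.sleep(delay_seconds)  # wait before the next attempt
        # Every attempt failed: return the last failure so routing can still
        # take the error path.
        return update
    return wrapped


# Demo node that fails twice and then succeeds (illustration only).
calls = {"count": 0}


def flaky_fetch(state: Dict[str, Any]) -> Dict[str, Any]:
    calls["count"] += 1
    if calls["count"] < 3:
        return {"raw_data": None, "fetch_success": False, "error_message": "simulated error"}
    return {"raw_data": "rows", "fetch_success": True, "error_message": None}


result = with_retries(flaky_fetch, max_attempts=3)({"query": "demo"})
print(calls["count"], result["fetch_success"])  # prints: 3 True
```

Retries suit transient faults such as network timeouts. For failures that will not resolve on their own, the conditional edge to the error report remains the fallback.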

### Task 4.2: Memory Debugging Analysis
* **Observing State**: How did `app.stream()` help you understand the state of the graph at each step? What specific information was visible, and how did it change?
* **Debugging Insights**: Provide an example from your `app.stream()` output (either success or failure scenario) that illustrates how you used the state updates to debug or confirm the data flow.
* **Importance of State**: Why is maintaining and being able to inspect the graph's state crucial for debugging complex multi-step LLM applications?
* **Challenges in Debugging**: What might be challenging when debugging memory in larger, more complex LangGraph applications? How could you mitigate these challenges?
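
One possible mitigation for noisy output in larger graphs is to print a compact view of each state rather than the raw values. The sketch below assumes a hypothetical helper, `summarize_state`, that truncates each value's `repr`; the sample state is hand-written.

```python
from typing import Any, Dict


def summarize_state(state: Dict[str, Any], max_chars: int = 40) -> Dict[str, str]:
    """Hypothetical helper: shorten each value's repr so a snapshot stays readable."""
    summary: Dict[str, str] = {}
    for key, value in state.items():
        text = repr(value)
        if len(text) > max_chars:
            # Keep the start of the value and note its full length.
            text = text[:max_chars] + f"... ({len(text)} chars)"
        summary[key] = text
    return summary


# Hand-written sample state with one very long value.
state = {"query": "demo", "raw_data": "x" * 500, "fetch_success": True}
compact = summarize_state(state)
for key, text in compact.items():
    print(f"{key}: {text}")
```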

---

### Submission:
* Ensure all code cells have been executed and their outputs are visible.
* Write all analysis and reflections clearly in markdown cells.
* Save your Jupyter Notebook as `[YourName]_LangGraph_FailSafe_Memory_Assignment.ipynb`.