## Human-in-the-loop review when parsing a document or image with an LLM
### - This example shows
#### - Parsing a sample driving license into structured output and then saving it to a file (using an agent tool call)
#### - A human intervening when the agent tool is called, to inspect the parsing output
#### - The human can either correct the parsing output and submit it to the tool for completion (saving to a file in this case) or reject the output altogether
#### - LangChain is used for agent configuration and an OpenAI model as the LLM

### 1) Install packages, set the OpenAI API key, and import packages

In [None]:
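# langchain-openai is needed so the 'gpt-5-nano' model string can resolve to an OpenAI chat model
#!pip install -U langchain langchain-openai
#!pip install pydantic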

In [None]:
import os
os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"

In [None]:
from langchain.agents import create_agent
from langchain.messages import HumanMessage
from pydantic import BaseModel
from langchain.tools import tool
from langgraph.checkpoint.memory import InMemorySaver
from langchain.agents.middleware import HumanInTheLoopMiddleware
from langgraph.types import Command
import base64

### 2) Read a sample driving license image and encode it as base64

In [None]:
with open("dl.jpg", "rb") as image_file:
    img_b64 = base64.b64encode(image_file.read()).decode("utf-8")
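The MIME type sent with the image later in the notebook should match the file being read. As a minimal sketch, assuming a hypothetical helper named `encode_image` (not part of the original notebook), the standard-library `mimetypes` module can derive it from the file name while the bytes are encoded:

```python
import base64
import mimetypes
from pathlib import Path


def encode_image(path: str) -> tuple[str, str]:
    """Return (base64_text, mime_type) for an image file.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    # Guess the MIME type from the file extension, e.g. ".jpg" -> "image/jpeg".
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path!r}")
    data = Path(path).read_bytes()
    return base64.b64encode(data).decode("utf-8"), mime_type
```

With this helper, `img_b64, mime_type = encode_image("dl.jpg")` would give both values needed for the multimodal message.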

### 3) Define a class whose fields hold the driving license details

In [None]:
class DrivingLicenseInfo(BaseModel):
    driver_name: str
    driving_license_number: str
    expiry_date: str

### 4) Define an agent tool that takes the parsed license details as input and saves them to a local text file

In [None]:
@tool
def save_license_data(driver_name: str, license_number: str, expiry_date: str) -> str:
    """Saves license data in a text file
    Args:
        driver_name: name of the driver
        license_number: driving license number
        expiry_date: license expiry date
    """
    with open("driving_license_file.txt", "w") as f:
        f.write(f"driver {driver_name} has driving license {license_number} with expiry date {expiry_date}")
    return "License saved successfully."

### 5) Create an agent that uses the LLM to parse the document details and call the tool
#### - Note the response_format configuration for structured output
#### - Note the memory configuration (InMemorySaver), which holds the conversational state
#### - Note the HumanInTheLoopMiddleware configuration, which lets the user intercept tool execution to approve or reject it

In [None]:
agent = create_agent(
    model='gpt-5-nano',
    system_prompt="You are a driving license image reader which reads information from driving license image and saves into a text file.",
    response_format=DrivingLicenseInfo,
    tools=[save_license_data],
    checkpointer=InMemorySaver(),
    middleware=[
        HumanInTheLoopMiddleware(
            interrupt_on={
                 "save_license_data": True,
            },
            description_prefix="saving the license data needs approval",
        ),
    ]
)

### 6) Invoke the agent, passing the base64-encoded image as multimodal input
#### - A thread ID is passed because conversational memory is configured for the agent

In [None]:
multimodal_question = HumanMessage(content=[
    # "image/jpeg" is the registered MIME type for .jpg files
    {"type": "image", "base64": img_b64, "mime_type": "image/jpeg"}
])

config = {"configurable": {"thread_id": "1"}}

response = agent.invoke(
    {"messages": [multimodal_question]},
    config=config
)
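The thread ID identifies the checkpointed conversation, so reusing `"1"` for a different license would continue the same thread. As a minimal sketch, assuming a hypothetical helper named `new_thread_config` (not part of the original notebook), a fresh ID could be generated for each document:

```python
import uuid


def new_thread_config() -> dict:
    """Build an invoke config with a fresh, unique thread ID.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    return {"configurable": {"thread_id": str(uuid.uuid4())}}


# A separate variable is used so the notebook's `config` is left untouched.
config_for_next_license = new_thread_config()
print(config_for_next_license)
```

Whichever config starts a run must be reused when that run is resumed after the interrupt, since the pending tool call is stored under that thread ID.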


### 7) Check whether an interrupt was raised for human-in-the-loop intervention
#### - If an interrupt was raised, fetch the tool name and the arguments passed to the tool (in this case, the LLM's parsed driving license details)

In [None]:
## Interrupt flag: should print True because the human-in-the-loop interrupt has been raised
print('__interrupt__' in response)

In [None]:
## Fetch the name of the tool for which the interrupt was raised
tool_name = response['__interrupt__'][0].value["action_requests"][0]["name"]

In [None]:
## Fetch the arguments the agent wants to pass to the tool (the parsed license details)
dl_parsing_details = response['__interrupt__'][0].value["action_requests"][0]["args"]
print(dl_parsing_details)
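The cells above index straight into the first interrupt and its first action request, which fails with a `KeyError` if no interrupt was raised. As a minimal sketch, assuming a hypothetical helper named `pending_actions` and the response shape used above, every pending request can be collected defensively. The `fake_response` object is a stand-in built only to show that shape.

```python
from types import SimpleNamespace


def pending_actions(response: dict) -> list[tuple[str, dict]]:
    """Return (tool_name, args) pairs for every pending action request.

    Hypothetical helper for illustration; not part of the original notebook.
    Returns an empty list when the response carries no interrupt.
    """
    actions = []
    for interrupt in response.get("__interrupt__", []):
        for request in interrupt.value["action_requests"]:
            actions.append((request["name"], request["args"]))
    return actions


# Stand-in for an agent response, mirroring the structure accessed above.
fake_response = {
    "__interrupt__": [
        SimpleNamespace(value={"action_requests": [
            {"name": "save_license_data", "args": {"driver_name": "John Doe"}}
        ]})
    ]
}
print(pending_actions(fake_response))
```

An empty result signals that the agent finished without asking for approval, so there is nothing to resume.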

### 8) Let the human update the tool arguments. In this case, the driver name is updated.

In [None]:
dl_parsing_details = {'driver_name': 'John K Doe', 'license_number': '123456789', 'expiry_date': '07/11/2025'}
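The cell above retypes every field by hand, even though only the driver name changes. As a minimal sketch, assuming a hypothetical helper named `apply_corrections` (not part of the original notebook), only the corrected fields are overridden and misspelled argument names are caught before the tool is resumed:

```python
def apply_corrections(parsed: dict, corrections: dict) -> dict:
    """Return a copy of `parsed` with the human's corrections applied.

    Hypothetical helper for illustration; not part of the original notebook.
    Raises KeyError if a correction names a field the tool call did not have.
    """
    unknown = set(corrections) - set(parsed)
    if unknown:
        raise KeyError(f"Unknown field(s): {sorted(unknown)}")
    # Build a new dict so the original parsed arguments stay unchanged.
    return {**parsed, **corrections}


# Illustrative values only; in the notebook `parsed` would be the
# arguments taken from the interrupt.
parsed = {
    "driver_name": "John Doe",
    "license_number": "123456789",
    "expiry_date": "07/11/2025",
}
corrected = apply_corrections(parsed, {"driver_name": "John K Doe"})
print(corrected)
```

Returning a copy keeps the LLM's original output available for comparison with what the human approved.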

### 9) The human approves the tool execution with the updated parsing details

In [None]:
response = agent.invoke(
    Command(
        resume={
            "decisions": [
                {
                    # Approve with edit
                    "type": "edit",
                    "edited_action": {
                        # Tool name to call
                        "name": tool_name,
                        # Updated arguments
                        "args": dl_parsing_details
                    }
                }
            ]
        }
    ),
    config=config  # Same thread ID
)

### 10) Alternate flow - the human rejects the tool execution
#### - Run this cell instead of step 9, since both resume the same pending interrupt

In [None]:
response = agent.invoke(
    Command(
        resume={
            "decisions": [
                {
                    "type": "reject",
                    # Explanation for the rejection
                    "message": "I do not approve the information fetched"
                }
            ]
        }
    ),
    config=config  # Same thread ID
)
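The edit and reject payloads above share the same `decisions` envelope, so they can be built by small functions. The following is a minimal sketch with hypothetical helper names (`approve`, `edit`, `reject`). The `"approve"` decision type does not appear in this notebook; it is an assumption based on the middleware's documented decision types and should be checked against the installed LangChain version.

```python
def approve() -> dict:
    """Resume payload that runs the tool with the arguments unchanged.

    Assumption: "approve" is a supported decision type (not shown in the notebook).
    """
    return {"decisions": [{"type": "approve"}]}


def edit(tool_name: str, args: dict) -> dict:
    """Resume payload that runs the tool with human-corrected arguments."""
    return {
        "decisions": [
            {"type": "edit", "edited_action": {"name": tool_name, "args": args}}
        ]
    }


def reject(message: str) -> dict:
    """Resume payload that cancels the tool call and gives the reason."""
    return {"decisions": [{"type": "reject", "message": message}]}


print(reject("I do not approve the information fetched"))
```

Each payload would be passed as `Command(resume=...)` with the same `config` as the interrupted run, for example `agent.invoke(Command(resume=edit(tool_name, dl_parsing_details)), config=config)`.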