
# Incorporating human-in-the-loop in agentic logic via LangGraph 

## Prerequisites

To run this notebook, you need to [follow the steps from here](https://python.langchain.com/docs/integrations/text_embedding/nvidia_ai_endpoints#setup) and generate an API key from [NVIDIA API Catalog](https://build.nvidia.com/).

Please ensure you have the following dependencies installed:

- langchain
- jupyterlab==4.0.8
- langchain-core
- langchain-nvidia-ai-endpoints==0.2.0
- markdown
- colorama
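
Before running the rest of the notebook, it can help to confirm that these packages are importable. The following is a minimal sketch using only the standard library; the helper name `missing_modules` and the list `REQUIRED_MODULES` are illustrative, and the import names are assumed to be the pip names with hyphens replaced by underscores (`jupyterlab` is left out because the notebook itself does not import it).

```python
import importlib.util

# Import names assumed from the pip package names listed above.
REQUIRED_MODULES = [
    "langchain",
    "langchain_core",
    "langchain_nvidia_ai_endpoints",
    "markdown",
    "colorama",
]


def missing_modules(module_names):
    """Return the module names that cannot be found in this environment."""
    return [
        name for name in module_names
        if importlib.util.find_spec(name) is None
    ]


print("missing:", missing_modules(REQUIRED_MODULES))
```

An empty list means every listed package can be imported; anything else names what still needs a `pip install`.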

You will also need `langgraph`, which the first code cell below installs (uncomment the `pip install` line to run it).



This notebook walks you through how to incorporate **human-in-the-loop** into a **multi-agent** pipeline using a minimal example.

The cognitive agentic architecture looks like this:

![agent architecture](./data/imgs/HumanInTheLoopLangGraph.png) 


We will first construct the 2 agents in the middle:

- Both agents use **meta/llama-3.1-405b-instruct**, and each is created with an [LCEL expression](https://python.langchain.com/v0.1/docs/expression_language/).

- We then give each agent one tool to use to complete its task.

The task at hand is creating promotion assets, with text and an image, for social media promotion.
We are aiming for something similar to the image below.


![agent architecture](./data/imgs/finish_social_post.png)


Just as in the real world, a human in charge of the task delegates work to specialists: a writer for the promotion text and a digital artist for the artwork.

In this scenario, we let the human assign an agent (either **ContentCreator** or **DigitalArtist**), as in the flow depicted above.
    


Note: Because we call the NVIDIA API Catalog as a hosted API, no GPU is required to run this notebook.





In [None]:
## install a few python packages we will need
#!pip install colorama markdown langgraph

In [None]:
import getpass
import os


## API Key can be found by going to NVIDIA NGC -> AI Foundation Models -> (some model) -> Get API Code or similar.
## 10K free queries to any endpoint (which is a lot actually).

# del os.environ['NVIDIA_API_KEY']  ## delete key and reset
if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
    # keep nvapi_key defined in this branch too; generate_image() reads it later
    nvapi_key = os.environ["NVIDIA_API_KEY"]
else:
    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
    assert nvapi_key.startswith("nvapi-"), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key
    

## We will prepare the 2 agents, each built from an [LCEL expression](https://python.langchain.com/v0.1/docs/expression_language/)

For simplicity, each agent is given one tool to use.

- a **content_creator** agent, which creates a promotion message from the input **_product_desc_**
- a **digital_artist** agent, which creates a visually appealing image from the promotion title



---

## Step 1 : construct the **content_creator** agent 

To construct the **content_creator** agent we need the following:

- a system prompt that anchors the task for the agent

- a seed product description (**product_desc**)

- a powerful LLM, [llama3.1-405b from NVIDIA NIM](https://build.nvidia.com/meta/llama-3_1-405b-instruct)

- **with_structured_output** to format the output
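
The system prompt carries a `{product_desc}` placeholder, and the key passed at invoke time has to match it exactly. The sketch below is a standard-library check for that, assuming the template uses Python format-style placeholders; `template_fields` and `check_inputs` are hypothetical helper names, not part of LangChain.

```python
from string import Formatter


def template_fields(template: str) -> set:
    """Collect the {placeholder} names used in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def check_inputs(template: str, inputs: dict) -> list:
    """Return placeholder names that the inputs dict does not supply."""
    return sorted(template_fields(template) - set(inputs))


demo_template = "Product Description:\n{product_desc}\n"
print(check_inputs(demo_template, {"product_desc": "..."}))  # []
print(check_inputs(demo_template, {"produce_desc": "..."}))  # ['product_desc']
```

A one-letter slip such as `produce_desc` is easy to miss by eye, so a check like this is cheap insurance before calling the model.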



In [None]:

# test run to confirm that you can generate a response successfully 
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain import prompts, chat_models, hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate
from langchain_core.pydantic_v1 import BaseModel, Field, validator
from typing import Optional, List

## construct the system prompt 
prompt_template = """
### [INST]

You are an expert social media content creator.
Your task is to create a different promotion message with the following 
Product Description :
------
{product_desc}
------

The output promotion message MUST use the following format :

'''
Title: a powerful, short message that depicts what this product is about 
Message: be creative with the promotion message, but keep it short and ready for social media feeds.
Tags: the hashtags people normally use on social media
'''

Begin!

[/INST]
 """
prompt = PromptTemplate(
    # must match the {product_desc} placeholder used in prompt_template
    input_variables=['product_desc'],
    template=prompt_template,
)



## provide the product_desc
product_desc="Explore the latest community-built AI models with an API optimized and accelerated by NVIDIA, then deploy anywhere with NVIDIA NIM™ inference microservices."

## structural output using LMFE 
class StructureOutput(BaseModel):     
    Title: str = Field(description="Title of the promotion message")
    Message : str = Field(description="The actual promotion message")
    Tags: List[str] = Field(description="Hash tags for social media, usually starts with #")

llm_with_output_structure=ChatNVIDIA(model="meta/llama-3.1-405b-instruct").with_structured_output(StructureOutput)     

## construct the content_creator agent
content_creator = ( prompt | llm_with_output_structure )
out=content_creator.invoke({"product_desc":product_desc})


In [None]:
out.Title


In [None]:
out.Message

In [None]:
out.Tags

## Step 2 : we will now create **digital_artist** agent 

We will equip the **digital_artist** with the following:

- a text-to-image model, [SDXL-Turbo from NVIDIA NIM](https://build.nvidia.com/explore/visual-design?snippet_tab=Python#sdxl-turbo)
- that tool bound to the LLM with llm.bind_tools
- an LCEL expression that ties the two together into the **digital_artist** agent

## A text-to-image model: [SDXL-Turbo from NVIDIA NIM](https://build.nvidia.com/explore/visual-design?snippet_tab=Python#sdxl-turbo)
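
The image endpoint used in this notebook returns JSON with the image as a base64 string under `artifacts[0]["base64"]`. The sketch below isolates the decode-and-save step so it can be tried offline; `save_first_artifact` is a hypothetical helper, and the payload is a stand-in string rather than a real image.

```python
import base64
import tempfile
from pathlib import Path


def save_first_artifact(response_body: dict, path: str) -> int:
    """Decode the first base64 artifact, write it to path, return bytes written."""
    artifacts = response_body.get("artifacts") or []
    if not artifacts or "base64" not in artifacts[0]:
        raise ValueError("response has no base64 artifact to save")
    image_bytes = base64.b64decode(artifacts[0]["base64"])
    return Path(path).write_bytes(image_bytes)


# Stand-in payload (not a real image) so the sketch runs without the API.
fake_body = {"artifacts": [{"base64": base64.b64encode(b"not-a-real-jpeg").decode()}]}
with tempfile.TemporaryDirectory() as tmp:
    written = save_first_artifact(fake_body, str(Path(tmp) / "output.jpg"))
    print("bytes written:", written)
```

Checking for the `artifacts` key before indexing gives a clearer error than a bare `KeyError` when the service returns an unexpected body.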

In [None]:
# test run to confirm that you can generate a response successfully 
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain import prompts, chat_models, hub
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate, PromptTemplate

def llm_rewrite_to_image_prompts(user_query):
    prompt = prompts.ChatPromptTemplate.from_messages(
        [
            (
                "system",
                "Summarize the following user query into a very short, one-sentence theme for image generation, MUST follow this format : A iconic, futuristic image of , no text, no amputation, no face, bright, vibrant",
            ),
            ("user", "{input}"),
        ]
    )
    model = ChatNVIDIA(model="mistralai/mixtral-8x7b-instruct-v0.1")
    chain = ( prompt    | model   | StrOutputParser() )
    out= chain.invoke({"input":user_query})
    #print(type(out))
    return out


In [None]:
import base64
import requests
from PIL import Image
def generate_image(prompt :str) -> str :
    """
    generate image from text
    Args:
        prompt: input text
    """
    ## re-writing the input promotion title in to appropriate image_gen prompt 
    gen_prompt=llm_rewrite_to_image_prompts(prompt)
    print("start generating image with llm re-write prompt:", gen_prompt)
    invoke_url = "https://ai.api.nvidia.com/v1/genai/stabilityai/sdxl-turbo"
    
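    headers = {
        # read the key from the environment so this also works when the key was set before the notebook started
        "Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}",
        "Accept": "application/json",
    }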
    
    payload = {
        "text_prompts": [{"text": gen_prompt}],
        "seed": 0,
        "sampler": "K_EULER_ANCESTRAL",
        "steps": 2
    }
    
    response = requests.post(invoke_url, headers=headers, json=payload)
    
    response.raise_for_status()
    response_body = response.json()
    ## decode the base64 image payload returned by the API (the artifact keys are printed for inspection)
    print(response_body['artifacts'][0].keys())
    imgdata = base64.b64decode(response_body["artifacts"][0]["base64"])
    filename = 'output.jpg'
    with open(filename, 'wb') as f:
        f.write(imgdata)   
    im = Image.open(filename)  
    img_location=f"the output of the generated image will be stored in this path : {filename}"
    return img_location


In [None]:
out=generate_image("NVIDIA NeMo is a powerful SDK for all your GenAI needs")
out

## Wrap the tool into llm with **llm.bind_tools**
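
Once a tool is bound, the model's reply carries `tool_calls`, a list of dictionaries with keys such as `name`, `args`, and `id`. The sketch below shows one way to route such a dictionary to a Python function; `dispatch_tool_call` and `fake_generate_image` are hypothetical stand-ins so the example runs without any API call.

```python
def dispatch_tool_call(tool_call: dict, registry: dict):
    """Look up the requested tool by name and call it with the supplied args."""
    name = tool_call.get("name")
    if name not in registry:
        raise KeyError(f"no tool registered under {name!r}")
    return registry[name](**tool_call.get("args", {}))


def fake_generate_image(prompt: str) -> str:
    """Stand-in for the real image tool; returns a description only."""
    return f"(pretend image for: {prompt})"


# Hand-written example in the same shape as an entry of out.tool_calls.
example_call = {
    "name": "generate_image",
    "args": {"prompt": "NVIDIA power GenAI workflow"},
    "id": "call_0",
}
print(dispatch_tool_call(example_call, {"generate_image": fake_generate_image}))
```

Looking the tool up by name keeps the dispatch code unchanged if a second tool is bound later.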

In [None]:
llm=ChatNVIDIA(model="meta/llama-3.1-405b-instruct")
llm_with_img_gen_tool=llm.bind_tools([generate_image],tool_choice="generate_image")

In [None]:
out=llm_with_img_gen_tool.invoke("NVIDIA power GenAI workflow")
out.tool_calls

In [None]:
def output_to_invoke_tools(out):
    tool_calls=out.tool_calls
    ## check there are indeed tool_calls in the output
    if len(tool_calls) > 0 :
        ## assert the args attribute exists 
        if 'args' in tool_calls[0] :            
            
            prompt=tool_calls[0]['args']['prompt']
            output=generate_image(prompt)
        else:
            print("### out.tool_calls", out.tool_calls[0].keys() )
            output="cannot find input prompt from llm output, please rerun again"
    else:
        print("------------" , out)
        print("### out.tool_calls", out.tool_calls )
        output="agent did not find generate_image tool, please check the tool binding is successful"
    return output
            

## creating **digital_artist** using LCEL chain 
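
In an LCEL chain, the `|` operator feeds the output of the left-hand step into the right-hand step, and plain Python functions can take part as steps. The toy class below imitates that behaviour to make the data flow visible; `Step` is an illustration only and is not how LangChain implements its runnables.

```python
class Step:
    """Toy stand-in for a runnable: wraps a function and supports `|`."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # Accept either another Step or a plain callable on the right-hand side.
        nxt = other if isinstance(other, Step) else Step(other)
        return Step(lambda value: nxt.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


chain = Step(str.strip) | str.upper | (lambda s: s + "!")
print(chain.invoke("  hello "))  # HELLO!
```

Read left to right, `llm_with_img_gen_tool | output_to_invoke_tools` follows the same pattern: the model's reply becomes the argument of the function that runs the tool.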

In [None]:
digital_artist = (
    llm_with_img_gen_tool
    | output_to_invoke_tools
)
    

In [None]:
digital_artist.invoke("NVIDIA power GenAI workflow")

---
## Step 3 - Embed Human-in-the-loop agentic logic with LangGraph

- construct a **get_human_input** function for the first LangGraph node, so that a human decides which agent to use
- define a **State** to keep track of the internal state
- create the functions that serve as graph nodes in LangGraph
- compose the agentic logic in LangGraph by connecting the nodes with edges


## construct a **get_human_input** function to integrate into the first node of LangGraph 

This puts a human in the loop to decide which agent handles the task.
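
A selection step like this is easier to test when the source of input can be swapped out. The sketch below is one possible shape, not the notebook's own function: `choose_agent`, `AGENTS`, and the `max_attempts` limit are hypothetical, and a scripted reader stands in for the keyboard.

```python
AGENTS = {"1": "ContentCreator", "2": "DigitalArtist"}


def choose_agent(read_line=input, max_attempts=3):
    """Ask for 1 or 2 until a valid choice arrives or attempts run out."""
    for _ in range(max_attempts):
        answer = read_line().strip()
        if answer in AGENTS:
            return AGENTS[answer]
        print("Please enter 1 or 2")
    raise ValueError("no valid agent selected")


# Scripted replies replace the keyboard so the sketch runs unattended.
scripted = iter(["x", "2"])
print(choose_agent(read_line=lambda: next(scripted)))  # DigitalArtist
```

Passing `read_line` as a parameter keeps the default interactive behaviour while letting the same logic run in automated checks.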

In [None]:
from langchain_community.tools import HumanInputRun


def get_human_input() -> str:
    """Put a human in the decision seat: the human picks which agent is best for the task."""
    
    print("You have been given 2 agents. Please select exactly _ONE_ agent to help you with the task, enter 'y' to confirm your choice.")
    print("""Available agents are : \n
            1 ContentCreator  \n
            2 DigitalArtist \n          
            Enter 1 or 2""")
    tool = ""  # last agent selected; stays empty until the user enters 1 or 2
    while True:
        try:
            line = input().strip()
        except EOFError:
            break
        if line == '1':
            tool = "ContentCreator"
        elif line == '2':
            tool = "DigitalArtist"
        elif line == 'y' and tool:
            # only accept the confirmation once an agent has been picked
            print(f"tool selected : {tool} ")
            break
        
    # return exactly one agent name so the downstream comparison can match it
    return tool


# wrap the selection function in the HumanInputRun tool

ask_human = HumanInputRun(input_func=get_human_input)


## establish **State** to keep track of the internal states
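
Each graph node receives the current state and returns only the keys it wants to change; for keys without a custom reducer, the returned values overwrite the old ones. The sketch below mimics that merge with plain dictionaries. `DemoState` and `apply_update` are illustrative names, and the real merging is done by LangGraph.

```python
from typing import TypedDict


class DemoState(TypedDict, total=False):
    input: str
    input_to_agent: str
    agent_choice: str
    agent_use_tool_respond: str


def apply_update(state: DemoState, update: dict) -> DemoState:
    """Return a new state where the node's partial update overwrites old keys."""
    return {**state, **update}


state: DemoState = {"input": "write a post", "input_to_agent": "NIM microservices"}
state = apply_update(state, {"agent_choice": "ContentCreator"})
print(state["agent_choice"])
print(state["input"])
```

Because untouched keys are carried forward, the second node can still read `input_to_agent` even though the first node returned only `agent_choice`.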

In [None]:
## first we define the graph State
from typing import TypedDict

class State(TypedDict):
    # The input string
    input: str
    input_to_agent : str
    agent_choice : str
    agent_use_tool_respond : str


## create functions as graph nodes for LangGraph 

In [None]:
from langgraph.graph import END, StateGraph
from colorama import Fore, Style
# Define the functions needed 
def human_assign_to_agent(state):
    # ensure using original prompt 
    inputs = state["input"]
    input_to_agent = state["input_to_agent"]

    concatenate_str = Fore.BLUE+inputs+ ' : '+Fore.CYAN+input_to_agent + Fore.RESET
    print(concatenate_str)
    print("---"*10)
    
    agent_choice=ask_human.invoke(concatenate_str)
    print(Fore.CYAN+ "chosen_agent : " + agent_choice + Fore.RESET)
    return {"agent_choice": agent_choice }

def agent_execute_task(state):    
    inputs= state["input"]
    input_to_agent = state["input_to_agent"]
    print(Fore.CYAN+input_to_agent + Fore.RESET)
    # choosen agent will execute the task
    choosen_agent = state['agent_choice']
    if choosen_agent=='ContentCreator':
        structured_respond=content_creator.invoke({"product_desc":input_to_agent})
        respond='\n'.join([structured_respond.Title,structured_respond.Message,''.join(structured_respond.Tags)])       
    elif choosen_agent=="DigitalArtist":
        respond=digital_artist.invoke(input_to_agent)
    else:
        respond="please reselect the agent, there are only 2 agents available: 1.ContentCreator or 2.DigitalArtist"
        
    
    print(Fore.CYAN+ "agent_output: \n" + respond + Fore.RESET)

    return {"agent_use_tool_respond": respond}





## compose the agentic cognitive logic in langGraph by connecting the nodes and edges
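
Conceptually, the compiled graph starts at the entry node, runs it, follows the outgoing edge, and stops at the end marker. The plain-Python walker below is a rough mental model of that flow for a linear graph; `run_graph`, the `END` string, and the two stub nodes are hypothetical and do not reflect LangGraph internals.

```python
END = "__end__"  # stand-in sentinel for this sketch only


def run_graph(nodes: dict, edges: dict, entry: str, state: dict) -> dict:
    """Walk a linear graph: run each node, merge its update, follow the edge."""
    current = entry
    while current != END:
        state = {**state, **nodes[current](state)}
        current = edges[current]
    return state


def pick_agent(state: dict) -> dict:
    # Stub for the human decision node.
    return {"agent_choice": "ContentCreator"}


def run_agent(state: dict) -> dict:
    # Stub for the node that executes the chosen agent.
    return {"agent_use_tool_respond": f"{state['agent_choice']} handled: {state['input_to_agent']}"}


final = run_graph(
    nodes={"start": pick_agent, "end": run_agent},
    edges={"start": "end", "end": END},
    entry="start",
    state={"input": "demo", "input_to_agent": "NIM microservices"},
)
print(final["agent_use_tool_respond"])
```

The two-node layout mirrors the notebook's graph: one node for the human decision, one for the agent that carries it out.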

In [None]:

from langgraph.graph import END, StateGraph

# Define a new graph
workflow = StateGraph(State)

# Define the two nodes 
workflow.add_node("start", human_assign_to_agent)
workflow.add_node("end", agent_execute_task)

# This means that this node is the first one called
workflow.set_entry_point("start")
workflow.add_edge("start", "end")
workflow.add_edge("end", END)

# Finally, we compile it!
# This compiles it into a LangChain Runnable,
# meaning you can use it as you would any other runnable
app = workflow.compile()

---
## time to test this out 

In [None]:
my_query="create a good promotion message for social promotion events using the following inputs"
product_desc="NVIDIA NIM microservices power GenAI workflow"
respond=app.invoke({"input":my_query, "input_to_agent":product_desc})

#### Now we will use the output from the **ContentCreator** agent for a second round, to generate an image for this promotion 
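
The agent's text response has the title on the first line, the message after it, and the hashtags on the last line. The sketch below is one way to split it that keeps any colons inside the message intact; `parse_promotion` is a hypothetical helper and it assumes that three-part layout.

```python
def parse_promotion(text: str) -> dict:
    """Split a 'title / message / tags' response into its three parts."""
    lines = [ln.strip() for ln in text.split("\n") if ln.strip()]
    if len(lines) < 3:
        raise ValueError("expected title, message and tags on separate lines")

    def strip_label(line: str, label: str) -> str:
        # Drop a leading 'Label:' if the model included one; keep other colons.
        prefix = label + ":"
        if line.lower().startswith(prefix.lower()):
            return line[len(prefix):].strip()
        return line

    title = strip_label(lines[0], "Title")
    message = strip_label(" ".join(lines[1:-1]), "Message")
    tags = ["#" + t.strip() for t in lines[-1].split("#")[1:] if t.strip()]
    return {"title": title, "message": message, "tags": tags}


demo = "Title: Build with NIM\nDeploy anywhere: fast and simple.\n#AI#NIM"
print(parse_promotion(demo))
```

Only a leading label is removed, so a message such as "Deploy anywhere: fast and simple." survives unchanged.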

In [None]:
prompt_for_image=respond['agent_use_tool_respond'].split('\n')[0].split(':')[-1].strip()
prompt_for_image

In [None]:
input_query="generate an image for me from the below promotion message"
respond2=app.invoke({"input":input_query, "input_to_agent":prompt_for_image})

In [None]:
im = Image.open('output.jpg')  
im.show()

---

## let's try to print this out using markdown 

In [None]:
title = respond['agent_use_tool_respond'].split('\n')[0].split(':')[-1].strip()
promotion_msg = respond['agent_use_tool_respond'].split('\n')[1].split(':')[-1].strip()
hash_tags =  ['#'+s for s in respond['agent_use_tool_respond'].split('\n')[-1].split(':')[-1].split('#') if s!=""]



In [None]:
hash_tag_in_md=[]
for hash_tag in hash_tags:
    
    temp=f"""<span>{hash_tag}</span>"""
    hash_tag_in_md.append(temp)

hashtags_in_md = '<br>' + ' '.join(hash_tag_in_md)  # <br> is a void element, so no closing tag

In [None]:
from IPython.display import Markdown, display

import markdown
markdown_str = markdown.markdown(f'''
<img src="output.jpg" width=600 height=480 class=center/>


#### {title}

{promotion_msg}

{hashtags_in_md}

''')

def printmd(markdown_str):
    display(Markdown(markdown_str))
printmd(markdown_str)