# LangChain Decorators ✨ - output parsing

In this notebook we explore the details of output parsing with decorators.
 

In [1]:

# install langchain_decorators (this also installs langchain and promptwatch)
!pip install langchain_decorators

#########################################
# you need to set up your OpenAI API key
#########################################
%env OPENAI_API_KEY=sk-********************************


import os
# .get() avoids a KeyError when the variable is not set at all
if not os.environ.get("OPENAI_API_KEY"):
    raise Exception("You need to set up your OpenAI API key")

from langchain_decorators import GlobalSettings

import logging

# let's define our settings, just to make it more verbose for demonstration
GlobalSettings.define_settings(
    #default_llm=ChatOpenAI(temperature=0.0), this is default... can change it here globally
    #default_streaming_llm=ChatOpenAI(temperature=0.0,streaming=True), this is default... can change it here for all ... will be used for streaming
    logging_level=logging.INFO, 
    print_prompt=True, 
    print_prompt_name=True)

env: OPENAI_API_KEY=sk-********************************


## Basic concepts

- By default, the decorators try to determine the expected format from the return type annotation.

- `auto` output parsing follows these basic rules:
  
  - output type: `str` | `None` -> no output parsing is applied
  - output type: `dict` -> `JsonOutputParser` will be used.
    > Extracts any JSON from the output. Only **one** JSON object is expected in the output. The JSON can be wrapped in a code block, but this is not required. 
  - output type: `list` or `List[str]` -> `ListOutputParser`
    > Parses **bullet lists** and **numbered lists** from the output (it doesn't matter which one). This is tailored to the native behavior of most LLMs: when asked for a list of something, they usually generate a numbered or bulleted list. You therefore usually don't need to add any instructions when using this (it works great with ChatGPT).
  - output type: subclass of pydantic `BaseModel` -> [PydanticOutputParser](#PydanticOutputParser)
    > Expects JSON that can be parsed into the desired `BaseModel` subclass. Several tricks are used to make sure this 

- On top of this, two more interesting parsers are available:
  - [CheckListParser](#CheckListParser) - similar to the list parser, but expects the list items in this pattern: `- {key}: {value}`
  - [MarkdownStructureParser](#MarkdownStructureParser) - a powerful parser for extracting complex structures with long texts. Among other things, it allows you to define a specific format for specific sections. Great for combining long reasoning with a tool and its arguments!
  
- You can also use any LangChain output parser (`from langchain.output_parsers`), but by default the decorators use the customized output parsers available in `langchain_decorators.output_parsers`. 
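To make the `auto` rules above concrete, here is a minimal sketch of how a decorator could map a return annotation to a parser. It uses only the standard library and is not the code `langchain_decorators` actually runs: `pick_parser_name` is a hypothetical helper, the pydantic check is duck-typed so that the sketch has no dependencies, and raising `TypeError` for unsupported types is an assumption.

```python
from typing import List, Optional, get_origin, get_type_hints


def pick_parser_name(func) -> Optional[str]:
    """Return the parser name the rules above would select, or None for raw text.

    Hypothetical helper for illustration only.
    """
    return_type = get_type_hints(func).get("return")
    # no annotation, `-> None` or `-> str`: no output parsing
    if return_type is None or return_type is type(None) or return_type is str:
        return None
    if return_type is dict:
        return "JsonOutputParser"
    # plain `list` as well as parametrized forms such as List[str]
    if return_type is list or get_origin(return_type) is list:
        return "ListOutputParser"
    # duck-typed check for a pydantic model class (assumption: v1 exposes
    # `__fields__`, v2 exposes `model_fields`)
    if isinstance(return_type, type) and (
        hasattr(return_type, "model_fields") or hasattr(return_type, "__fields__")
    ):
        return "PydanticOutputParser"
    raise TypeError(f"no automatic parser for return type {return_type!r}")


def as_text(question: str) -> str: ...
def as_dict(question: str) -> dict: ...
def as_list(question: str) -> List[str]: ...


for func in (as_text, as_dict, as_list):
    print(func.__name__, "->", pick_parser_name(func))
```

Reading the annotation through `get_type_hints` keeps the prompt function itself free of parser configuration, which is the convenience the rules above describe.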

## JsonOutputParser
- extracts any JSON from the output. Only **one** JSON object is expected in the output. The JSON can be wrapped in a code block, but this is not required. 
- returns `dict`
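
The behavior described above (one JSON object, optionally inside a code block) can be approximated in a few lines of standard-library Python. This is only a sketch of the idea, not the library's `JsonOutputParser`; `extract_json` is a hypothetical name.

```python
import json
import re

FENCE = "`" * 3  # a markdown code fence, built here to keep this example readable
_FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:json)?\s*(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_json(text: str) -> dict:
    """Return the first JSON object found in `text`, fenced or not."""
    # prefer the contents of a fenced code block if one is present
    fenced = _FENCED_BLOCK.search(text)
    candidate = fenced.group(1) if fenced else text
    start = candidate.find("{")
    if start == -1:
        raise ValueError("no JSON object found in the output")
    # raw_decode parses one JSON value and ignores whatever text follows it
    obj, _end = json.JSONDecoder().raw_decode(candidate[start:])
    return obj


print(extract_json('Here you go: {"name": "Cookie Haven", "employees": 5} Enjoy!'))
```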

## ListOutputParser
- parses **bullet lists** and **numbered lists** from the output (it doesn't matter which one). This is tailored to the native behavior of most LLMs: when asked for a list of something, they usually generate a numbered or bulleted list. You therefore usually don't need to add any instructions when using this (it works great with ChatGPT).
- returns `List[str]` 
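
As a rough illustration of the list parsing described above, the following sketch accepts both bullet and numbered items with a single regular expression. It is not the library's `ListOutputParser`; `parse_list` and the exact set of accepted markers are assumptions.

```python
import re
from typing import List

# a list item starts with a bullet (-, *, •) or a number followed by "." or ")"
_ITEM = re.compile(r"^\s*(?:[-*•]|\d+[.)])\s+(.*\S)\s*$")


def parse_list(text: str) -> List[str]:
    """Collect the text of every bullet or numbered line, skipping other lines."""
    items = []
    for line in text.splitlines():
        match = _ITEM.match(line)
        if match:
            items.append(match.group(1))
    return items


llm_output = """Here are some suggestions:
1. Sweet Cravings
2. Cookie Haven
- Crumbly Delights
"""
print(parse_list(llm_output))
```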



## Initialization

> All the cells are designed to run on their own, except this initialization cell, which is required to install the packages and to override some default settings with more verbose logging for demonstration purposes. 

In [None]:
# install langchain_decorators (this also installs langchain and promptwatch)
!pip install langchain_decorators

#########################################
# you need to set up your OpenAI API key
#########################################
%env OPENAI_API_KEY=sk-******************************** 


import os
# .get() avoids a KeyError when the variable is not set at all
if not os.environ.get("OPENAI_API_KEY"):
    raise Exception("You need to set up your OpenAI API key")

from langchain_decorators import GlobalSettings

import logging

# let's define our settings, just to make it more verbose for demonstration
GlobalSettings.define_settings(
    #default_llm=ChatOpenAI(temperature=0.0), this is default... can change it here globally
    #default_streaming_llm=ChatOpenAI(temperature=0.0,streaming=True), this is default... can change it here for all ... will be used for streaming
    logging_level=logging.INFO, 
    print_prompt=True, 
    print_prompt_name=True)


# Running simple prompt

In [2]:
# this code example is complete and should run as it is

from langchain_decorators import llm_prompt

@llm_prompt
def give_me_straight_answer(question:str)->bool:
    """
    Question: {question}
    {FORMAT_INSTRUCTIONS}
    """
    pass

give_me_straight_answer(question="Is the earth flat?")

ValidationError: 1 validation error for PromptDecoratorTemplate
output_parser
  value is not a valid dict (type=type_error.dict)

In [None]:
from langchain_decorators import llm_prompt

# Simple script with async and streaming

If we want to leverage streaming:
 - we need to define the prompt as an async function 
 - turn on streaming on the decorator, or define a PromptType with streaming turned on
 - capture the stream using StreamingContext
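
The callback pattern from the steps above can be tried without any LLM. The sketch below uses plain `asyncio`; `fake_token_stream` is a hypothetical stand-in for a streaming model and `run_with_callback` plays the role that `StreamingContext` plays in the next cell. Neither is part of `langchain_decorators`.

```python
import asyncio
from typing import Callable, List


async def fake_token_stream(text: str):
    """Hypothetical stand-in for a streaming LLM: yields one word at a time."""
    for token in text.split(" "):
        await asyncio.sleep(0)  # hand control back to the event loop
        yield token


async def run_with_callback(text: str, on_token: Callable[[str], None]) -> str:
    """Forward every token to the callback and return the full result at the end."""
    collected = []
    async for token in fake_token_stream(text):
        on_token(token)
        collected.append(token)
    return " ".join(collected)


tokens: List[str] = []
result = asyncio.run(run_with_callback("real magic at your fingertips", tokens.append))
print(len(tokens), "tokens captured:", result)
```

In a notebook cell, where an event loop is already running, `await run_with_callback(...)` would replace the `asyncio.run(...)` call.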

In [8]:
# this code example is complete and should run as it is

from langchain_decorators import StreamingContext, llm_prompt

@llm_prompt(capture_stream=True) # this marks the prompt for streaming (useful if we want to stream just some prompts)
async def write_me_short_post(topic:str, platform:str="twitter", audience:str = "developers"):
    """
    Write me a short header for my post about {topic} for {platform} platform. 
    It should be for {audience} audience.
    (Max 15 words)
    """
    pass

async def run_prompt():
    return await write_me_short_post(topic="Releasing a new App that can do real magic!")

tokens=[]
def capture_stream_func(new_token:str):
    tokens.append(new_token)


with StreamingContext(stream_to_stdout=True, callback=capture_stream_func):
    result = await run_prompt()
    print("Stream finished ... we can distinguish tokens thanks to alternating colors")



print("\nWe've captured",len(tokens),"tokens🎉\n")
print("Here is the result:")
print(result)

Prompt template name: write_me_short_post
Prompt:
Write me a short header for my post about Releasing a new App that can do real magic! for twitter platform. 
It should be for developers audience.
(Max 15 words)
•"Introducing the Revolutionary Twitter App for Developers: Real Magic at Your Fingertips!"•
Stream finished ... we can distinguish tokens thanks to alternating colors

We've captured 21 tokens🎉

Here is the result:
"Introducing the Revolutionary Twitter App for Developers: Real Magic at Your Fingertips!"


# Prompt declarations

 - with additional (non 'executable') documentation

In [10]:
# this code example is complete and should run as it is

from langchain_decorators import llm_prompt

@llm_prompt
def write_me_short_post(topic:str, platform:str="twitter", audience:str = "developers"):
    """
    Here is a good way to write a prompt as part of a function docstring, with additional documentation for devs.

    It needs to be a code block, marked as a `<prompt>` language
    ```<prompt>
    Write me a short header for my post about {topic} for {platform} platform. 
    It should be for {audience} audience.
    (Max 15 words)
    ```

    Now only the code block above will be used as a prompt, and the rest of the docstring will be used as a description for developers.
    (It also has the nice benefit that an IDE (like VS Code) will display the prompt properly (not trying to parse it as markdown, and thus not showing new lines properly))
    """
    pass

print("Note that the prompt is only the inside of the code block")
write_me_short_post(topic="Cookies", platform="facebook", audience="my mom")

Note that the prompt is only the inside of the code block
Prompt template name: write_me_short_post
Prompt:
Write me a short header for my post about Cookies for facebook platform. 
It should be for my mom audience.
(Max 15 words)


'"Indulge in Sweet Treats: Easy Cookie Recipes for Busy Moms"'

## Prompt with messages

We can use the same technique to annotate the different chat message templates ...

In [5]:
# this code example is complete and should run as it is

from langchain_decorators import llm_prompt

@llm_prompt
def simulate_conversation(human_input:str, agent_role:str="a pirate"):
    """
    ## System message
     - note the `:system` suffix inside the <prompt:_role_> tag
     

    ```<prompt:system>
    You are a {agent_role} hacker. You mus act like one.
    You reply always in code, using python or javascript code block...
    for example:
    
    ... do not reply with anything else.. just with code - respecting your role.
    ```

    # human message 
    (we are using the real roles that are enforced by the LLM - GPT supports system, assistant, user)
    ``` <prompt:user>
    Helo, who are you
    ```
    a reply:
    

    ``` <prompt:assistant>
    \``` python
    def hello():
        print("Argh... hello you pesky pirate")
    \```
    ```
    - note the \ escape chars inside the prompt to allow us to pass a code block example inside the prompt.


    we can also add some history using placeholder
    ```<prompt:placeholder>
    {history}
    ```
    ```<prompt:user>
    {human_input}
    ```

    Now only the code blocks above will be used as a prompt, and the rest of the docstring will be used as a description for developers.
    (It also has the nice benefit that an IDE (like VS Code) will display the prompt properly (not trying to parse it as markdown, and thus not showing new lines properly))
    """
    pass


# the history is optional; the placeholder is simply ignored if it is not provided
response = simulate_conversation(human_input="What is your purpose?", history=None)
print(response)

Prompt template name: simulate_conversation
Prompt:
system: You are a a pirate hacker. You mus act like one.
You reply always in code, using python or javascript code block...
for example:

... do not reply with anything else.. just with code - respecting your role.
user: Helo, who are you
assistant: ``` python
def hello():
    print("Argh... hello you pesky prirate")
```
user: What is your purpose?
``` javascript
const purpose = "My purpose is to plunder and hack, arrrr!";
```


# Optional sections
- you can mark whole sections of your prompt as optional
- if any input in the section is missing, the whole section won't be rendered

The syntax for this is as follows:

```
  """
  this text will be rendered always, but

  {? anything inside this block will be rendered only if all the {value}s parameters are not empty (None | "")   ?}

  you can also place it in between the words
  this too will be rendered{? , but
    this  block will be rendered only if {this_value} and {this_value}
    is not empty?} !
  """
```
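
To see the rule in action without calling an LLM, here is a minimal sketch of how such optional sections could be rendered with two regular expressions. It only illustrates the rule stated above (drop the section if any value inside it is `None` or `""`); `render` is a hypothetical helper, not the templating code used by `langchain_decorators`.

```python
import re

_OPTIONAL = re.compile(r"\{\?(.*?)\?\}", re.DOTALL)  # {? ... ?} sections
_PLACEHOLDER = re.compile(r"\{(\w+)\}")  # {name} placeholders


def render(template: str, **values) -> str:
    """Render `template`, dropping optional sections that have an empty input."""

    def render_optional(match):
        section = match.group(1)
        names = _PLACEHOLDER.findall(section)
        # a single missing or empty value removes the whole section
        if any(values.get(name) in (None, "") for name in names):
            return ""
        return section

    without_markers = _OPTIONAL.sub(render_optional, template)
    return without_markers.format(**values)


template = "Write a header about {topic}.{? It has to be {max_words} words long.?}"
print(render(template, topic="Cookies", max_words=5))
print(render(template, topic="Cookies", max_words=None))
```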



In [3]:
# this code example is complete and should run as it is

from langchain_decorators import llm_prompt

@llm_prompt
def write_me_short_post(topic:str, platform:str="twitter",  max_words:int=None):
    """ Write a  header about {topic} for {platform} platform. {?
    It has to be {max_words} words long.
    ?}
    """
    pass


# with max_words set to 5, the prompt will be:
print("Prompt with max words set to 5")
print(write_me_short_post(topic="Cookies", platform="facebook", max_words=5))


# without the max words, the part of the prompt wrapped in {? ?} will be ignored
print("\n\nPrompt with max words left as default=None")
print(write_me_short_post(topic="Cookies", platform="facebook"))

Prompt with max words set to 5
Prompt template name: write_me_short_post
Prompt:
Write a  header about Cookies for facebook platform. 
It has to be 5 words long.
"Craving Cookies? Learn More!"


Prompt with max words left as default=None
Prompt template name: write_me_short_post
Prompt:
Write a  header about Cookies for facebook platform.
"Indulge in the Sweet World of Cookies on Facebook: Discover Delicious Recipes, Tips, and More!"


# Output parsers
- the `llm_prompt` decorator tries to detect the best output parser from the return type annotation (if none is set, it returns the raw string)
- `list`, `dict` and pydantic outputs are also supported

In [2]:
# this code example is complete and should run as it is

from langchain_decorators import llm_prompt

@llm_prompt
def write_name_suggestions(company_business:str, count:int)->list:
    """ Write me {count} good name suggestions for company that {company_business}
    """
    pass

write_name_suggestions(company_business="sells cookies", count=5)

['Sweet Cravings',
 'Cookie Haven',
 'Crumbly Delights',
 'Sugar Rush Cookies',
 'The Cookie Jar Co.']

## Pydantic parser

(Note that by default we use a different `pydantic` parser than the standard one from LangChain. It supports re-prompting the LLM to reformat the output, and it generates shorter and more informative format instructions.)

In [2]:
# this code example is complete and should run as it is

from langchain_decorators import llm_prompt
from pydantic import BaseModel, Field


class TheOutputStructureWeExpect(BaseModel):
    name:str = Field(description="The name of the company")
    headline:str = Field(description="The description of the company (for landing page)")
    employees:list[str] = Field(description="5-8 fake employee names with their positions")

@llm_prompt()
def fake_company_generator(company_business:str)->TheOutputStructureWeExpect:
    """ Generate a fake company that {company_business}
    {FORMAT_INSTRUCTIONS}
    """
    return

company = fake_company_generator(company_business="sells cookies")

# print the result nicely formatted
print("Company name: ",company.name)
print("company headline: ",company.headline)
print("company employees: ",company.employees)


Result:
{
"name": "Sweet Delights",
"headline": "Indulge in our heavenly cookies!",
"employees": [
    "Emily Johnson - Head Baker",
    "David Lee - Marketing Manager",
    "Sarah Chen - Sales Representative",
    "Michael Rodriguez - Operations Manager",
    "Avery Thompson - Customer Service Representative"
    ]
}

Company name:  Sweet Delights
company headline:  Indulge in our heavenly cookies!
company employees:  ['Emily Johnson - Head Baker', 'David Lee - Marketing Manager', 'Sarah Chen - Sales Representative', 'Michael Rodriguez - Operations Manager', 'Avery Thompson - Customer Service Representative']


# Passing parameters as object
- you can also pass your inputs as a single positional (non-keyword) object argument; its fields are then available to the prompt

###  Example:



In [3]:
# this code example is complete and should run as it is

from pydantic import BaseModel
from langchain_decorators import llm_prompt

class AssistantPersonality(BaseModel):
    assistant_name:str
    assistant_role:str



@llm_prompt
def introduce_your_self(obj:AssistantPersonality)->str:
    """
    ``` <prompt:system>
    You are an assistant named {assistant_name}. 
    Your role is to act as {assistant_role}
    ```
    ```<prompt:user>
    Introduce your self (in less than 20 words)
    ```
    """

personality = AssistantPersonality(assistant_name="John", assistant_role="a pirate")

print(introduce_your_self(personality))

Prompt template name: introduce_your_self
Prompt:
system: You are an assistant named John. 
Your role is to act as a pirate
user: Introduce your self (in less than 20 words)
Ahoy mateys! I be John, yer trusty pirate assistant.
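
The idea behind the cell above, an object's fields feeding the template variables, can be sketched in plain Python. This is an illustration under assumptions, not how `langchain_decorators` resolves arguments: `Personality` is a dataclass stand-in for the pydantic `AssistantPersonality`, and `render_with_object` is a hypothetical helper.

```python
from dataclasses import asdict, dataclass


@dataclass
class Personality:  # hypothetical stand-in for AssistantPersonality
    assistant_name: str
    assistant_role: str


def render_with_object(template: str, obj) -> str:
    """Fill the template placeholders from the fields of a dataclass instance."""
    return template.format(**asdict(obj))


template = "You are an assistant named {assistant_name}. Your role is to act as {assistant_role}"
print(render_with_object(template, Personality("John", "a pirate")))
```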


# More complex stuff

An even **more interesting use case** is to pack a bunch of functions into a single object, so that multiple prompts share some inputs and we can take a more object-oriented approach.

Here is an example of a complete ReAct agent reimplemented with langchain decorators✨.

*Hint: To make it async, just turn all the functions async* 
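
The agent below relies on the `markdown` output parser handing back a dict keyed by section heading (see `reasoning.get("Final answer")` and `reasoning.get("Tool")` in `run`). As a rough mental model only, and not the library's `MarkdownStructureParser`, such a split could look like this; `split_markdown_sections` is a hypothetical name.

```python
import re
from typing import Dict, Optional

_HEADING = re.compile(r"^#+\s+(.*\S)\s*$")


def split_markdown_sections(text: str) -> Dict[str, str]:
    """Map every markdown heading to the text that follows it."""
    sections: Dict[str, str] = {}
    current: Optional[str] = None
    for line in text.splitlines():
        heading = _HEADING.match(line)
        if heading:
            current = heading.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    # trim the blank lines that surround each section body
    return {name: body.strip() for name, body in sections.items()}


sample = "# Reasoning\nI need the formula first.\n\n# Final answer\n42"
print(split_markdown_sections(sample))
```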

In [5]:
from typing import List
from langchain_decorators import llm_prompt
from langchain.agents import load_tools
from langchain.tools.base import BaseTool
from textwrap import dedent
from langchain_decorators import PromptTypes
from langchain_decorators.output_parsers import JsonOutputParser
import json

tools = load_tools([ "llm-math"], llm=GlobalSettings.get_current_settings().default_llm)

# you may, or may not use pydantic as your base class... totally up to you
class MultilingualAgent:

    def __init__(self,  tools:List[BaseTool],result_language:str=None) -> None:
        self.tools = tools

        # we can refer to our fields in all our prompts
        self.result_language = result_language
        
        self.agent_scratchpad = "" # we initialize our scratchpad
        self.feedback = "" # we initialize our feedback if we get some error

        # other settings
        self.iterations=10
        self.agent_format_instructions = dedent("""\
            # Reasoning
            ... write your reasoning here ...

            # Tool
            ```json
                {{
                    "tool": name of the tool to use,
                    "tool_input": the input for the tool
                }}
            ```

            # Observation
            output from the tool

            ... repeat this # Reasoning, # Tool, # Observation sequence multiple times until you know the final answer, when you write:

            # Final answer
            ... write the final answer 
            """)

    @property
    def tools_description(self)->str:  # we can refer to properties in our prompts too
        return "\n".join([f" - {tool.name}: {tool.description}" for tool in self.tools])

    # we defined prompt type here, which will make 
    @llm_prompt(prompt_type=PromptTypes.AGENT_REASONING, output_parser="markdown", stop_tokens=["Observation"], verbose=True)
    def reason(self)->dict:
        """
        The system prompt:
        ``` <prompt:system>
        You are an assistant that uses reasoning and tools to help user. You use tools for the task the tool is designed to. 
        Before answering the question and/or using the tool, you should write down the explanation. 
        
        Here is the list of tools available:
        {tools_description}
        
        Use this format:
        
        {agent_format_instructions}{? in {result_language}?} here ...{?
        Make sure to write the final answer in in {result_language}!?} 
        
        ```
        User question:
        ```<prompt:user>
        {question}
        ```
        Scratchpad:
        ```<prompt:assistant>
        {agent_scratchpad}
        ```
        ```<prompt:user>
        {feedback}
        ```
        """
        return
    
    
    def act(self, tool_name:str, tool_input:str)->str:
        # look the tool up by name (case-insensitive); the None default avoids StopIteration when nothing matches
        tool = next((tool for tool in self.tools if tool.name.lower()==tool_name.lower()), None)
        if tool is None:
            self.feedback = f"Tool {tool_name} is not available. Available tools are: {self.tools_description}"
            return
        else:
            try:
                result = tool.run(tool_input)
            except Exception as e:
                if self.feedback:
                    # we've already experienced an error, so we are not going to try forever... let's raise this one
                    raise e
                self.feedback = f"Tool {tool_name} failed with error: {e}.\nLet's fix it and try again."
                return  # there is no result to record; the next reasoning step will see the feedback
            tool_instructions = json.dumps({"tool":tool.name, "tool_input":tool_input})
            self.agent_scratchpad += f"# Tool\n```json\n{tool_instructions}\n```\n# Observation\n\nResult from tool {tool_name}:\n\t{result}\n"



    def run(self, question):
        for i in range(self.iterations):
            reasoning = self.reason(question=question)
            if reasoning.get("Final answer") is not None:
                return reasoning.get("Final answer")
            else:
                tool_info = reasoning.get("Tool")
                tool_name, tool_input = (None, None)
                if tool_info:
                    tool_info_parsed = JsonOutputParser().parse(tool_info)
                    tool_name = tool_info_parsed.get("tool")
                    tool_input = tool_info_parsed.get("tool_input")

                if tool_name is None or tool_input is None:
                    self.feedback = "Your response was not in the expected format. Please make sure to response in correct format:\n" + self.agent_format_instructions 
                    continue
                self.act(tool_name, tool_input)
        raise Exception(f"Failed to answer the question after {self.iterations} iterations. Last result: {reasoning}")

        
agent = MultilingualAgent(tools=tools, result_language="German" )

result = agent.run("What is the surface of a sphere with radius with diameter of 100km?")

print("\n\nHere is the agent's answer:", result)

> Entering reason prompt decorator chain

Prompt:
system: You are an assistant that uses reasoning and tools to help user. You use tools for the task the tool is designed to. 
Before answering the question and/or using the tool, you should write down the explanation. 

Here is the list of tools available:
 - Calculator: Useful for when you need to answer questions about math.

Use this format:

# Reasoning
... write your reasoning here ...

# Tool
```json
    {{
        "tool": name of the tool to use,
        "tool_input": the input for the tool
    }}
```

# Observation
output from the tool

... repeat this # Reasoning, # Tool, # Observation sequence multiple times until you know the final answer, when you write:

# Final answer
... write the final answer 
 in German here ...
Make sure to write the final answer in in German!
user: What is the surface of a sphere with radius with diameter of 100km?

Result:
# Reasoning
The surface area of a sphere can be calcula