# Getting Structured Output from an LLM

Most popular LLMs respond conversationally, producing unstructured natural-language text. Sometimes, though, you want the response to a user query in a structured form, such as JSON or an object with specific attributes (for example, a Pydantic or dataclass object).

We saw one such example in our "langchain_tools" notebook, where an LLM responds to the user query by selecting tools from a list of available tools and returning a "ToolMessage" object.

Keep in mind that not all LLMs can respond in a structured way. LLMs with function- or tool-calling capabilities are best suited to producing structured output. You can find a list of chat models that support structured output [here](https://python.langchain.com/v0.2/docs/integrations/chat/).

In this notebook, we look at a few of the methods LangChain offers for requesting structured output from an LLM.

## Using with_structured_output method

LangChain implements this method for all chat models that support returning structured output (through JSON mode or tool/function calling).

This method takes a JSON schema or a Pydantic class as input and returns a new runnable. Calling ".invoke()" on that runnable returns a dictionary (for a JSON schema) or a Pydantic object (for a Pydantic class).

Let's see an example of each, using an OpenAI chat model.

### Using Pydantic Class

In [None]:
#! pip install langchain-core langchain-openai
import os, getpass
from langchain_openai import ChatOpenAI
from langchain_core.pydantic_v1 import BaseModel, Field

os.environ["OPENAI_API_KEY"] = getpass.getpass()
llm = ChatOpenAI(model='gpt-4o')

class NewRouter(BaseModel):
    """A hostname, a private IP address and a login banner for a new router."""

    hostname: str = Field(..., description="A hostname based on the location of the device")
    ip_address: str = Field(
        ...,
        description="An IPv4 address from private address space, easy to remember",
    )
    login_banner: str = Field(..., description="A login banner for a router as a motivational quote")

# Wrap the model so that it returns a NewRouter instance instead of free text.
llm_with_structure = llm.with_structured_output(NewRouter)

llm_with_structure.invoke(
    "provide a router hostname, an IPv4 address and "
    "a login banner for a router located in san francisco"
)

Note that defining the Pydantic class is only part of the job: the class name, the docstring and the attribute descriptions all matter, because this metadata is passed to the LLM as context.
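To see what that metadata looks like, the sketch below prints the JSON schema that Pydantic derives from the class. It is an illustration only: it imports from plain `pydantic` rather than `langchain_core.pydantic_v1` (an assumption made so it runs without LangChain installed), and the exact payload LangChain sends to the model may differ in shape.

```python
from pydantic import BaseModel, Field


class NewRouter(BaseModel):
    """A hostname, a private IP address and a login banner for a new router."""

    hostname: str = Field(..., description="A hostname based on the location of the device")
    ip_address: str = Field(..., description="An IPv4 address from private address space, easy to remember")
    login_banner: str = Field(..., description="A login banner for a router as a motivational quote")


# Pydantic v2 exposes model_json_schema(); Pydantic v1 exposes schema().
if hasattr(NewRouter, "model_json_schema"):
    schema = NewRouter.model_json_schema()
else:
    schema = NewRouter.schema()

# The class name, docstring and field descriptions all end up in the schema.
print(schema["title"])
print(schema["description"])
for name, spec in schema["properties"].items():
    print(f"{name}: {spec['description']}")
```

Renaming the class or rewording a description changes this schema, which is why those choices can influence what the model returns.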

### Using JSON Schema

In [6]:
json_schema = {
    "title": "new_router",
    "description": "A hostname, an IPv4 address and a login banner "
                   "to provision a new router",
    "type": "object",
    "properties": {
        "hostname": {
            "type": "string",
            "description": "A hostname based on the location of the device"
        },
        "ip_address": {
            "type": "string",
            "description": "An IPv4 address from private address space, "
                           "easy to remember"
        },
        "login_banner": {
            "type": "string",
            "description": "A login banner for a router as a motivational quote"
        }
    },
    "required": ["hostname", "ip_address"]
}

llm_with_json_output = llm.with_structured_output(json_schema)

llm_with_json_output.invoke(
    "provide a router hostname, an IPv4 address and "
    "a login banner for a router located in san francisco"
)

{'hostname': 'sf-router',
 'ip_address': '192.168.1.1',
 'login_banner': 'Welcome! The future depends on what you do today.'}
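With a JSON schema the result is a plain dictionary, so it can be worth sanity-checking it before using it to provision a device. The following is a minimal sketch using only the standard library; `check_router` is a hypothetical helper written for this notebook, not part of LangChain, and the sample dictionary is copied from the output above.

```python
import ipaddress

# Sample result copied from the output shown above.
result = {
    "hostname": "sf-router",
    "ip_address": "192.168.1.1",
    "login_banner": "Welcome! The future depends on what you do today.",
}


def check_router(result, required=("hostname", "ip_address")):
    """Return a list of problems found in the dict; an empty list means it passed."""
    problems = [f"missing key: {key}" for key in required if key not in result]
    if "ip_address" in result:
        try:
            address = ipaddress.ip_address(result["ip_address"])
        except ValueError:
            problems.append(f"not an IP address: {result['ip_address']!r}")
        else:
            # is_private is a loose check: it also accepts ranges such as loopback.
            if address.version != 4 or not address.is_private:
                problems.append(f"not a private IPv4 address: {address}")
    return problems


print(check_router(result))
```

The default `required` tuple mirrors the "required" list in the schema above, which names only "hostname" and "ip_address".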

## Getting Structured Output Directly from the Model

Not all models support "with_structured_output()", because not all models support tool calling or JSON mode. For such models, LangChain provides output parsers that extract a structured response from the raw model output.

We covered this method in the first notebook of this series, "Langchain Quick Introduction".
For completeness, let's see an example of using one of LangChain's output parsers.

In [None]:
from langchain_core.output_parsers import PydanticOutputParser
from langchain_core.pydantic_v1 import BaseModel, Field
from langchain_core.prompts import PromptTemplate

class WeatherForecast(BaseModel):
    query: str = Field(description="question asked to the LLM")
    response: str = Field(description="response to the query from the LLM")

output_parser = PydanticOutputParser(pydantic_object=WeatherForecast)
prompt = PromptTemplate(
    template=(
        "{format_instructions}\n"
        "Describe the weather forecast for today at {location} "
        "in no more than two short sentences"
    ),
    input_variables=["location"],
    partial_variables={"format_instructions": output_parser.get_format_instructions()},
)

# Pipe the prompt into the model, then parse the raw reply into a WeatherForecast.
chain = prompt | llm | output_parser
chain.invoke({"location": "San Francisco"})
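Conceptually, an output parser takes the raw text of the reply, finds the structured part and turns it into an object. The sketch below imitates that idea with the standard library only. It is not how `PydanticOutputParser` is implemented; `parse_forecast` and the sample reply are hypothetical, made up for illustration.

```python
import json
import re
from dataclasses import dataclass


@dataclass
class WeatherForecast:
    query: str
    response: str


def parse_forecast(raw_text: str) -> WeatherForecast:
    """Extract the JSON object embedded in raw model text and build a WeatherForecast."""
    # Greedy match from the first "{" to the last "}" in the reply.
    match = re.search(r"\{.*\}", raw_text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in model output")
    data = json.loads(match.group(0))
    return WeatherForecast(query=str(data["query"]), response=str(data["response"]))


# Hypothetical raw reply: conversational text wrapped around a JSON object.
raw_reply = (
    "Sure, here is the forecast:\n"
    '{"query": "Weather in San Francisco today", '
    '"response": "Mild and foggy in the morning, clearing by noon."}'
)

forecast = parse_forecast(raw_reply)
print(forecast.response)
```

A real parser also has to tell the model which format to produce, which is the role of `get_format_instructions()` in the prompt above.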
