# 📘 Topic: Deep Dive into LangChain - Structured Output



## 🎯 Objective
### Understanding Structured Output and Output Parsers


### 🔶 What is Structured Output?

* By default, the output we get from an LLM in chat is plain text, meaning it is not stored in any specific format.

* Structured output is LLM output in a well-defined data format, such as JSON.

* This helps when we send the output to another LLM to perform a related task. In agentic AI, the output of one LLM can be the input of another LLM, which is a very good example of why we need output in a specific structured format.

#### Structured output allows agents to return data in a specific, predictable format. Instead of parsing natural language responses, you get structured data in the form of JSON objects, Pydantic models, or dataclasses that your application can directly use. - LangChain docs
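To make the contrast concrete, here is a minimal sketch using only Python's standard library. Both reply strings are made-up examples, not real model output: the free-text reply needs a pattern that guesses at the wording, while the structured reply loads in one step.

```python
import json
import re

# Hypothetical free-text reply: we have to search the sentence for the field we want.
free_text = "The candidate is John Doe and his email is johndoe.dev@gmail.com."
match = re.search(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+", free_text)
email_from_text = match.group(0) if match else None

# Hypothetical structured reply: the same facts as JSON, loaded directly into a dict.
structured_text = '{"name": "John Doe", "email": "johndoe.dev@gmail.com"}'
candidate = json.loads(structured_text)

print(email_from_text)
print(candidate["email"])
```

The regular expression only works as long as the sentence keeps a recognisable email in it; the JSON version keeps working as long as the keys stay the same.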

### 🔶 Use Cases:

1. Data extraction - when we need to store the LLM output in a database, such as a candidate's information from a resume.

2. Knowledge graph or scene graph creation - structured output helps connect nodes with edges.

3. Multi-agent communication

4. Function or tool calling
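As a small illustration of the data extraction use case, the sketch below stores one structured result in a database. It assumes the LLM has already returned the dictionary shown (the values are hand-written here), and the table name and columns are made up for this example.

```python
import json
import sqlite3

# Hypothetical structured output, shaped like the resume examples in these notes.
candidate = {
    "name": "John Doe",
    "email": "johndoe.dev@gmail.com",
    "skills": ["Python", "LangChain", "Docker"],
}

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE candidates (name TEXT, email TEXT, skills TEXT)")
conn.execute(
    "INSERT INTO candidates VALUES (?, ?, ?)",
    # The skills list is serialised to a JSON string so it fits in one TEXT column.
    (candidate["name"], candidate["email"], json.dumps(candidate["skills"])),
)
conn.commit()

row = conn.execute("SELECT name, email, skills FROM candidates").fetchone()
stored_skills = json.loads(row[2])
print(row[0], row[1], stored_skills)
```

Because the keys are predictable, the insert needs no text parsing at all.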

#### 🔵 Some LLM providers can respond in a structured format, and some cannot.

📌 The LangChain docs focus specifically on `create_agent`, which takes a `response_format` parameter. I will look at this from the perspective of both standalone LLMs and agents. Both follow the same strategy; for agents, we just pass that strategy as `response_format=strategy`.

#### When a schema type is provided directly, LangChain automatically chooses `ProviderStrategy` for models supporting native structured output (e.g. OpenAI, Grok) and `ToolStrategy` for all other models.

## Provider Strategy

### 1. TypedDict - Typed dictionary classes

In [None]:
# We want the model to extract a candidate's information from a resume. (Example from YouTube: CampusX)

from typing import TypedDict
from langchain.chat_models import init_chat_model
from dotenv import load_dotenv
from langchain.messages import HumanMessage

load_dotenv()

class candidate_info(TypedDict):
    name: str
    email: str
    skills: list[str]
    github_url: str
    linkedin_url: str
    phone: str

chat_model = init_chat_model("openai:gpt-4o-mini")

message = HumanMessage("""John Doe
Bengaluru, India
Email: johndoe.dev@gmail.com
Phone: +91 98765 43210
LinkedIn: linkedin.com/in/johndoe-dev
GitHub: github.com/johndoe-dev

Objective:
Motivated and detail-oriented developer with a passion for building intelligent systems and automation tools. Seeking opportunities to apply skills in machine learning, AI-driven systems, and DevOps pipelines to real-world problems.

Education:
M.Tech in Artificial Intelligence & Data Science, National Institute of Technology, Trichy (Aug 2023 – May 2025)
B.E. in Computer Science & Engineering, Visvesvaraya Technological University (Aug 2019 – Jun 2023)

Skills:
Programming: Python, JavaScript, Bash, C++
Frameworks: LangChain, FastAPI, Streamlit, PyTorch, TensorFlow
DevOps: Docker, GitHub Actions, AWS (EC2, S3, Lambda), Jenkins
Tools: Git, VS Code, Linux, Postman
Other: Prompt Engineering, API Integration, Data Visualization (Matplotlib, Seaborn)

Projects:

CellSense: Multi-Agent System for Cell Growth Analysis

Built a LangChain-based multi-agent system analyzing biomaterial properties from multimodal data (text, images, tabular).

Agents collaborate to summarize research papers, interpret microscope images, and recommend optimal biomaterials.

Technologies: Python, LangChain, OpenAI GPT-4, Streamlit.
GitHub: github.com/johndoe-dev/cellsense

Personalized Learning Assistant

Developed a Streamlit app using LangChain that serves as an interactive mentor for learning AI frameworks.

Integrated memory and dynamic prompting to simulate adaptive teaching.

Deployed on Streamlit Cloud with OpenAI API and Hugging Face integration.
GitHub: github.com/johndoe-dev/langchain-learning-assistant

DevOps Automation Pipeline

Automated CI/CD workflow for a Flask web app using GitHub Actions and Docker.

Deployed on AWS EC2 with versioned updates triggered by Git commits.

Implemented monitoring using Prometheus and Grafana.
GitHub: github.com/johndoe-dev/devops-pipeline

Achievements:

AWS Certified Solutions Architect - Associate (2025)

Published paper on Multi-Agent Collaboration in Scientific Data Interpretation at IEEE ICMLA 2024

Won 2nd place in Smart India Hackathon 2023 for AI-driven traffic safety solution

Languages:
English (Fluent), Hindi (Native)
                       """)
# with_structured_output works only with LLM providers that can generate structured output for the given schema.

structure_model = chat_model.with_structured_output(candidate_info)

response = structure_model.invoke([message])

print(response)

{'name': 'John Doe', 'email': 'johndoe.dev@gmail.com', 'phone': '+91 98765 43210', 'linkedin_url': 'linkedin.com/in/johndoe-dev', 'github_url': 'github.com/johndoe-dev', 'skills': ['Python', 'JavaScript', 'Bash', 'C++', 'LangChain', 'FastAPI', 'Streamlit', 'PyTorch', 'TensorFlow', 'Docker', 'GitHub Actions', 'AWS (EC2, S3, Lambda)', 'Jenkins', 'Git', 'VS Code', 'Linux', 'Postman', 'Prompt Engineering', 'API Integration', 'Data Visualization (Matplotlib, Seaborn)']}


In [8]:
print(response['name'])
print(response['email'])
print(response['phone'])
print(response['skills'])
print(response['github_url'])
print(response['linkedin_url'])

John Doe
johndoe.dev@gmail.com
+91 98765 43210
['Python', 'JavaScript', 'Bash', 'C++', 'LangChain', 'FastAPI', 'Streamlit', 'PyTorch', 'TensorFlow', 'Docker', 'GitHub Actions', 'AWS (EC2, S3, Lambda)', 'Jenkins', 'Git', 'VS Code', 'Linux', 'Postman', 'Prompt Engineering', 'API Integration', 'Data Visualization (Matplotlib, Seaborn)']
github.com/johndoe-dev
linkedin.com/in/johndoe-dev


#### Here the schema is a TypedDict: a plain dictionary with fixed keys, each annotated with a type. For example, `phone: str` tells the model to return the phone number as a string. Note that Python itself does not validate TypedDict annotations at runtime.
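A quick standard-library check of what a TypedDict actually is at runtime. The class name `CandidateInfo` and the two fields are a trimmed-down stand-in for the schema above, used only for illustration.

```python
from typing import TypedDict, get_type_hints


class CandidateInfo(TypedDict):
    name: str
    phone: str


record: CandidateInfo = {"name": "John Doe", "phone": "+91 98765 43210"}

# At runtime the record is an ordinary dict.
is_plain_dict = type(record) is dict

# The annotations are available as metadata that other tools can read.
hints = get_type_hints(CandidateInfo)

# Python does not enforce the annotations: an int for phone raises no error here.
unchecked = CandidateInfo(name="John Doe", phone=9876543210)

print(is_plain_dict, hints, type(unchecked["phone"]).__name__)
```

So the types act as a description of the expected shape; a static type checker would flag the last line, but the interpreter will not.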

In [None]:
## From YouTube video: CampusX - GenAI with LangChain - Structured Outputs

# Enhanced version with Annotated and Optional fields.

# Annotated attaches a description to each field. It gives the LLM extra context, so it works like a prompt hint about what each field should contain.


from typing import TypedDict, Annotated, Optional
from langchain.chat_models import init_chat_model
from dotenv import load_dotenv
from langchain.messages import HumanMessage

load_dotenv()

class candidate_info(TypedDict):
    name: Annotated[str, "Full name of the candidate"]
    email: Annotated[str, "Email address of the candidate"]
    skills: Annotated[Optional[list[str]], "List of technical skills possessed by the candidate"]
    github_url: Annotated[Optional[str], "GitHub profile URL of the candidate"]
    linkedin_url: Annotated[Optional[str], "LinkedIn profile URL of the candidate"]
    phone: Annotated[Optional[str], "Phone number of the candidate"]
    summary: Annotated[str, "A brief summary of the candidate's profile"]

chat_model = init_chat_model("openai:gpt-4o-mini")

message = HumanMessage("""John Doe
Bengaluru, India
Email: johndoe.dev@gmail.com
Phone: +91 98765 43210
LinkedIn: linkedin.com/in/johndoe-dev
GitHub: github.com/johndoe-dev

Objective:
Motivated and detail-oriented developer with a passion for building intelligent systems and automation tools. Seeking opportunities to apply skills in machine learning, AI-driven systems, and DevOps pipelines to real-world problems.

Education:
M.Tech in Artificial Intelligence & Data Science, National Institute of Technology, Trichy (Aug 2023 – May 2025)
B.E. in Computer Science & Engineering, Visvesvaraya Technological University (Aug 2019 – Jun 2023)

Skills:
Programming: Python, JavaScript, Bash, C++
Frameworks: LangChain, FastAPI, Streamlit, PyTorch, TensorFlow
DevOps: Docker, GitHub Actions, AWS (EC2, S3, Lambda), Jenkins
Tools: Git, VS Code, Linux, Postman
Other: Prompt Engineering, API Integration, Data Visualization (Matplotlib, Seaborn)

Projects:

CellSense: Multi-Agent System for Cell Growth Analysis

Built a LangChain-based multi-agent system analyzing biomaterial properties from multimodal data (text, images, tabular).

Agents collaborate to summarize research papers, interpret microscope images, and recommend optimal biomaterials.

Technologies: Python, LangChain, OpenAI GPT-4, Streamlit.
GitHub: github.com/johndoe-dev/cellsense

Personalized Learning Assistant

Developed a Streamlit app using LangChain that serves as an interactive mentor for learning AI frameworks.

Integrated memory and dynamic prompting to simulate adaptive teaching.

Deployed on Streamlit Cloud with OpenAI API and Hugging Face integration.
GitHub: github.com/johndoe-dev/langchain-learning-assistant

DevOps Automation Pipeline

Automated CI/CD workflow for a Flask web app using GitHub Actions and Docker.

Deployed on AWS EC2 with versioned updates triggered by Git commits.

Implemented monitoring using Prometheus and Grafana.
GitHub: github.com/johndoe-dev/devops-pipeline

Achievements:

AWS Certified Solutions Architect - Associate (2025)

Published paper on Multi-Agent Collaboration in Scientific Data Interpretation at IEEE ICMLA 2024

Won 2nd place in Smart India Hackathon 2023 for AI-driven traffic safety solution

Languages:
English (Fluent), Hindi (Native)
                       """)
# with_structured_output works only with LLM providers that can generate structured output for the given schema.

structure_model = chat_model.with_structured_output(candidate_info)

response = structure_model.invoke([message])

print(response)


{'name': 'John Doe', 'email': 'johndoe.dev@gmail.com', 'phone': '+91 98765 43210', 'linkedin_url': 'linkedin.com/in/johndoe-dev', 'github_url': 'github.com/johndoe-dev', 'summary': 'Motivated and detail-oriented developer with a passion for building intelligent systems and automation tools. Seeking opportunities to apply skills in machine learning, AI-driven systems, and DevOps pipelines to real-world problems.', 'skills': ['Python', 'JavaScript', 'Bash', 'C++', 'LangChain', 'FastAPI', 'Streamlit', 'PyTorch', 'TensorFlow', 'Docker', 'GitHub Actions', 'AWS (EC2, S3, Lambda)', 'Jenkins', 'Git', 'VS Code', 'Linux', 'Postman', 'Prompt Engineering', 'API Integration', 'Data Visualization (Matplotlib, Seaborn)']}


In [2]:
response['summary']

'Motivated and detail-oriented developer with a passion for building intelligent systems and automation tools. Seeking opportunities to apply skills in machine learning, AI-driven systems, and DevOps pipelines to real-world problems.'
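The descriptions passed to `Annotated` are stored on the type itself, which is how a library can read them and hand them to the model. A minimal standard-library sketch of reading them back (the two-field `CandidateInfo` is a shortened stand-in for the schema above; how LangChain consumes this metadata internally is not shown here):

```python
from typing import Annotated, Optional, TypedDict, get_type_hints


class CandidateInfo(TypedDict):
    name: Annotated[str, "Full name of the candidate"]
    phone: Annotated[Optional[str], "Phone number of the candidate"]


# include_extras=True keeps the Annotated wrapper instead of stripping it.
hints = get_type_hints(CandidateInfo, include_extras=True)

# Each Annotated type carries its extra arguments in __metadata__.
descriptions = {field: hint.__metadata__[0] for field, hint in hints.items()}

print(descriptions)
```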

## Dataclass - a decorator

In [None]:
## A dataclass (a class decorator) works the same way as TypedDict here; it is just simpler to use.

from dataclasses import dataclass
from langchain.chat_models import init_chat_model
from dotenv import load_dotenv
from langchain.messages import HumanMessage

load_dotenv()

@dataclass
class candidate_info:
    name: str
    email: str
    skills: list[str]
    github_url: str
    linkedin_url: str
    phone: str

chat_model = init_chat_model("openai:gpt-4o-mini")

message = HumanMessage("""John Doe
Bengaluru, India
Email: johndoe.dev@gmail.com
Phone: +91 98765 43210
LinkedIn: linkedin.com/in/johndoe-dev
GitHub: github.com/johndoe-dev

Objective:
Motivated and detail-oriented developer with a passion for building intelligent systems and automation tools. Seeking opportunities to apply skills in machine learning, AI-driven systems, and DevOps pipelines to real-world problems.

Education:
M.Tech in Artificial Intelligence & Data Science, National Institute of Technology, Trichy (Aug 2023 – May 2025)
B.E. in Computer Science & Engineering, Visvesvaraya Technological University (Aug 2019 – Jun 2023)

Skills:
Programming: Python, JavaScript, Bash, C++
Frameworks: LangChain, FastAPI, Streamlit, PyTorch, TensorFlow
DevOps: Docker, GitHub Actions, AWS (EC2, S3, Lambda), Jenkins
Tools: Git, VS Code, Linux, Postman
Other: Prompt Engineering, API Integration, Data Visualization (Matplotlib, Seaborn)

Projects:

CellSense: Multi-Agent System for Cell Growth Analysis

Built a LangChain-based multi-agent system analyzing biomaterial properties from multimodal data (text, images, tabular).

Agents collaborate to summarize research papers, interpret microscope images, and recommend optimal biomaterials.

Technologies: Python, LangChain, OpenAI GPT-4, Streamlit.
GitHub: github.com/johndoe-dev/cellsense

Personalized Learning Assistant

Developed a Streamlit app using LangChain that serves as an interactive mentor for learning AI frameworks.

Integrated memory and dynamic prompting to simulate adaptive teaching.

Deployed on Streamlit Cloud with OpenAI API and Hugging Face integration.
GitHub: github.com/johndoe-dev/langchain-learning-assistant

DevOps Automation Pipeline

Automated CI/CD workflow for a Flask web app using GitHub Actions and Docker.

Deployed on AWS EC2 with versioned updates triggered by Git commits.

Implemented monitoring using Prometheus and Grafana.
GitHub: github.com/johndoe-dev/devops-pipeline

Achievements:

AWS Certified Solutions Architect - Associate (2025)

Published paper on Multi-Agent Collaboration in Scientific Data Interpretation at IEEE ICMLA 2024

Won 2nd place in Smart India Hackathon 2023 for AI-driven traffic safety solution

Languages:
English (Fluent), Hindi (Native)
                       """)
# with_structured_output works only with LLM providers that can generate structured output for the given schema.

structure_model = chat_model.with_structured_output(candidate_info)

response = structure_model.invoke([message])

print(response)


{'name': 'John Doe', 'email': 'johndoe.dev@gmail.com', 'skills': ['Python', 'JavaScript', 'Bash', 'C++', 'LangChain', 'FastAPI', 'Streamlit', 'PyTorch', 'TensorFlow', 'Docker', 'GitHub Actions', 'AWS (EC2, S3, Lambda)', 'Jenkins', 'Git', 'VS Code', 'Linux', 'Postman', 'Prompt Engineering', 'API Integration', 'Data Visualization (Matplotlib, Seaborn)'], 'github_url': 'github.com/johndoe-dev', 'linkedin_url': 'linkedin.com/in/johndoe-dev', 'phone': '+91 98765 43210'}


In [4]:
print(response['skills'])

['Python', 'JavaScript', 'Bash', 'C++', 'LangChain', 'FastAPI', 'Streamlit', 'PyTorch', 'TensorFlow', 'Docker', 'GitHub Actions', 'AWS (EC2, S3, Lambda)', 'Jenkins', 'Git', 'VS Code', 'Linux', 'Postman', 'Prompt Engineering', 'API Integration', 'Data Visualization (Matplotlib, Seaborn)']
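For reference, this is what the dataclass decorator gives us on its own, independent of any LLM. The sketch uses only the standard library; the shortened `CandidateInfo` and its values are hand-written for illustration.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class CandidateInfo:
    name: str
    email: str
    skills: list[str]


candidate = CandidateInfo(
    name="John Doe",
    email="johndoe.dev@gmail.com",
    skills=["Python", "LangChain"],
)

# The decorator records the declared fields, which is the schema information.
field_names = [f.name for f in fields(CandidateInfo)]

# asdict converts an instance into a plain dictionary.
as_dict = asdict(candidate)

print(field_names)
print(as_dict)
```

Like TypedDict, a plain dataclass does not validate types at runtime; it only declares them.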


## Pydantic - a data validation and data parsing library for Python. It ensures the data you work with is correct, structured, and type-safe.

In [4]:
from pydantic import BaseModel, Field

# If the data is not of the correct type, Pydantic raises an error.

class candidate_info(BaseModel):
    name: str
    age: int


IM = {"name": "John Doe", "age": "Twenty Five"}

df = candidate_info(**IM)  # Raises a ValidationError because "age" cannot be parsed as an integer.

ValidationError: 1 validation error for candidate_info
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='Twenty Five', input_type=str]
    For further information visit https://errors.pydantic.dev/2.12/v/int_parsing
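The error above is about parsing, not only about types. A small sketch of the difference, assuming Pydantic's default (non-strict) settings; `CandidateInfo` is a renamed copy of the two-field model above:

```python
from pydantic import BaseModel, ValidationError


class CandidateInfo(BaseModel):
    name: str
    age: int


# A numeric string can be parsed, so by default it is converted to an int.
ok = CandidateInfo(name="John Doe", age="25")

# A non-numeric string cannot be parsed, so validation fails.
try:
    CandidateInfo(name="John Doe", age="Twenty Five")
    error_fields = []
except ValidationError as exc:
    # Each error records the location of the offending field.
    error_fields = [err["loc"] for err in exc.errors()]

print(ok.age, error_fields)
```

Catching `ValidationError` like this is one way to handle an LLM response that does not match the schema instead of letting the program crash.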

In [15]:
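## From YouTube video: CampusX - GenAI with LangChain - Structured Outputs
# Here, Field(description=...) plays the same role that Annotated played for TypedDict.
# EmailStr needs the optional email-validator package (pip install "pydantic[email]").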
from typing import Optional
from langchain.chat_models import init_chat_model
from dotenv import load_dotenv
from langchain.messages import HumanMessage
from pydantic import BaseModel, Field, EmailStr

load_dotenv()

class candidate_info(BaseModel):
    name: str = Field(description="Full name of the candidate")
    email: EmailStr = Field(description="Email address of the candidate")
    skills: Optional[list[str]] = Field(description="List of technical skills possessed by the candidate")
    github_url: Optional[str] = Field(description="GitHub profile URL of the candidate")
    linkedin_url: Optional[str] = Field(description="LinkedIn profile URL")
    summary: str = Field(description="A brief summary of the candidate's profile")
    Achievements: Optional[list[str]] = Field(description="List of notable achievements of the candidate")

chat_model = init_chat_model("openai:gpt-4o-mini")

message = HumanMessage("""John Doe
Bengaluru, India
Email: johndoe.dev@gmail.com
Phone: +91 98765 43210
LinkedIn: linkedin.com/in/johndoe-dev
GitHub: github.com/johndoe-dev

Objective:
Motivated and detail-oriented developer with a passion for building intelligent systems and automation tools. Seeking opportunities to apply skills in machine learning, AI-driven systems, and DevOps pipelines to real-world problems.

Education:
M.Tech in Artificial Intelligence & Data Science, National Institute of Technology, Trichy (Aug 2023 – May 2025)
B.E. in Computer Science & Engineering, Visvesvaraya Technological University (Aug 2019 – Jun 2023)

Skills:
Programming: Python, JavaScript, Bash, C++
Frameworks: LangChain, FastAPI, Streamlit, PyTorch, TensorFlow
DevOps: Docker, GitHub Actions, AWS (EC2, S3, Lambda), Jenkins
Tools: Git, VS Code, Linux, Postman
Other: Prompt Engineering, API Integration, Data Visualization (Matplotlib, Seaborn)

Achievements:

AWS Certified Solutions Architect - Associate (2025)

Published paper on Multi-Agent Collaboration in Scientific Data Interpretation at IEEE ICMLA 2024

Won 2nd place in Smart India Hackathon 2023 for AI-driven traffic safety solution

Languages:
English (Fluent), Hindi (Native)
                       """)
# with_structured_output works only with LLM providers that can generate structured output for the given schema.

structure_model = chat_model.with_structured_output(candidate_info)

response = structure_model.invoke([message])

print(response.Achievements)

['AWS Certified Solutions Architect - Associate (2025)', 'Published paper on Multi-Agent Collaboration in Scientific Data Interpretation at IEEE ICMLA 2024', 'Won 2nd place in Smart India Hackathon 2023 for AI-driven traffic safety solution']
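The `Field` descriptions end up in the JSON Schema that Pydantic generates from the model, which is presumably where the per-field hints for the LLM come from. A minimal sketch, assuming Pydantic v2 (`model_json_schema` is the v2 method name) and using a shortened `CandidateInfo`:

```python
from typing import Optional

from pydantic import BaseModel, Field


class CandidateInfo(BaseModel):
    name: str = Field(description="Full name of the candidate")
    skills: Optional[list[str]] = Field(
        description="List of technical skills possessed by the candidate"
    )


schema = CandidateInfo.model_json_schema()

# Each property carries the description given to Field.
name_description = schema["properties"]["name"]["description"]

# Fields without a default are listed as required, even when typed Optional.
required_fields = schema["required"]

print(name_description)
print(required_fields)
```

Note that `Optional[...]` only allows `None` as a value; the field is still required unless it is given a default.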


## JSON Schema

In [20]:
## From YouTube video: CampusX - GenAI with LangChain - Structured Outputs


from langchain.chat_models import init_chat_model
from dotenv import load_dotenv
from langchain.messages import HumanMessage


load_dotenv()

json_schema = {
    "title": "Candidate_Information",
    "description": "Extract key candidate information from a resume text.",
    "type": "object",
    "properties": {
        "name": {
            "type": "string",
            "description": "Full name of the candidate"
        },
        "email": {
            "type": "string",
            "description": "Email address of the candidate"
        },
        "skills": {
            "type": ["array", "null"],
            "items": {"type": "string"},
            "description": "List of technical skills possessed by the candidate"
        },
        "achievements": {
            "type": ["array", "null"],
            "items": {"type": "string"},
            "description": "List of notable achievements of the candidate"
        }
    },
    "required": ["name", "email"]
}


chat_model = init_chat_model("openai:gpt-4o-mini")

message = HumanMessage("""John Doe
Bengaluru, India
Email: johndoe.dev@gmail.com
Phone: +91 98765 43210
LinkedIn: linkedin.com/in/johndoe-dev
GitHub: github.com/johndoe-dev

Objective:
Motivated and detail-oriented developer with a passion for building intelligent systems and automation tools. Seeking opportunities to apply skills in machine learning, AI-driven systems, and DevOps pipelines to real-world problems.

Education:
M.Tech in Artificial Intelligence & Data Science, National Institute of Technology, Trichy (Aug 2023 – May 2025)
B.E. in Computer Science & Engineering, Visvesvaraya Technological University (Aug 2019 – Jun 2023)

Skills:
Programming: Python, JavaScript, Bash, C++
Frameworks: LangChain, FastAPI, Streamlit, PyTorch, TensorFlow
DevOps: Docker, GitHub Actions, AWS (EC2, S3, Lambda), Jenkins
Tools: Git, VS Code, Linux, Postman
Other: Prompt Engineering, API Integration, Data Visualization (Matplotlib, Seaborn)

Achievements:

AWS Certified Solutions Architect - Associate (2025)

Published paper on Multi-Agent Collaboration in Scientific Data Interpretation at IEEE ICMLA 2024

Won 2nd place in Smart India Hackathon 2023 for AI-driven traffic safety solution

Languages:
English (Fluent), Hindi (Native)
                       """)
# with_structured_output works only with LLM providers that can generate structured output for the given schema.

structure_model = chat_model.with_structured_output(json_schema)

response = structure_model.invoke([message])

print(response)

{'name': 'John Doe', 'email': 'johndoe.dev@gmail.com', 'skills': ['Python', 'JavaScript', 'Bash', 'C++', 'LangChain', 'FastAPI', 'Streamlit', 'PyTorch', 'TensorFlow', 'Docker', 'GitHub Actions', 'AWS (EC2, S3, Lambda)', 'Jenkins', 'Git', 'VS Code', 'Linux', 'Postman', 'Prompt Engineering', 'API Integration', 'Data Visualization (Matplotlib, Seaborn)'], 'achievements': ['AWS Certified Solutions Architect - Associate (2025)', 'Published paper on Multi-Agent Collaboration in Scientific Data Interpretation at IEEE ICMLA 2024', 'Won 2nd place in Smart India Hackathon 2023 for AI-driven traffic safety solution']}
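With a JSON Schema the result is a plain dictionary, so nothing on the Python side checks it for us. Below is a small hand-rolled check of the `required` list only. It is a sketch, not a full JSON Schema validator; the helper name `missing_required` and the trimmed schema are made up for this example.

```python
# Trimmed copy of the schema above: only the parts this check uses.
json_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "email": {"type": "string"},
        "skills": {"type": ["array", "null"]},
    },
    "required": ["name", "email"],
}


def missing_required(schema: dict, data: dict) -> list[str]:
    """Return the required keys that are absent from data."""
    return [key for key in schema["required"] if key not in data]


complete = {"name": "John Doe", "email": "johndoe.dev@gmail.com", "skills": None}
incomplete = {"name": "John Doe"}

print(missing_required(json_schema, complete))
print(missing_required(json_schema, incomplete))
```

For real validation (types, nested objects, and so on) a dedicated JSON Schema library would be the better choice.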


# Output Parsers

Output parsers help convert raw LLM responses (text) into structured formats.


### 1. StrOutputParser - converts an LLM response into a plain string.

In [None]:
from langchain.chat_models import init_chat_model
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

chat_model = init_chat_model("openai:gpt-3.5-turbo")


template1 = PromptTemplate(template="Write a detailed report on {topic}", input_variables=['topic'])
template2 = PromptTemplate(template="Write a 5 line summary on the following text.\n{text}", input_variables=['text'])

parser = StrOutputParser()

# The parser after the first model call turns the message into plain text, so template2 receives only the report text.
chain = template1 | chat_model | parser | template2 | chat_model | parser     # Chaining..
response = chain.invoke({'topic' : 'black_hole'})

print(response)




Black holes are regions of space with intense gravity where even light cannot escape. This report explores their formation, types (stellar, supermassive, intermediate), and key discoveries like Hawking radiation and gravitational waves. Hawking radiation suggests black holes emit radiation, while gravitational waves were detected in 2015 from a black hole merger by LIGO. Despite their mystery, ongoing research aims to unravel the secrets of these enigmatic cosmic structures.


In [None]:
# Trying the open-source model Mistral-7B, which doesn't support structured output

from langchain.chat_models import init_chat_model
from langchain_huggingface import HuggingFaceEndpoint,ChatHuggingFace
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

llm = HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3',
                          task='text-generation')

prompt = PromptTemplate(template="Hey! Tell me about recent {topic} updates.")

model = ChatHuggingFace(llm=llm)

chain = prompt | model    # Gradually getting an idea of how chains work

response = chain.invoke({'topic': "AI"})

print(response)


content=" Hello! I'd be happy to share some recent updates in the AI world. Please note that this list is not exhaustive, and the AI field evolves rapidly:\n\n1. **ChatGPT (December 2022)**: Developed by OpenAI, ChatGPT is a model that interacts in a conversational way. It provides responses to a wide range of prompts with a text-based interface. It has gained significant attention due to its ability to answer questions, write essays, summarize books, and more.\n\n2. **Google's Bard (February 2023)**: Google's response to ChatGPT, Bard, is a conversational AI model that aims to provide detailed and high-quality responses to a wide range of questions. It's still in the early stages of development and testing.\n\n3. **Stable Diffusion Models (SDMs) (2022)**: SDMs are a class of generative models that have shown promising results in image synthesis and text-to-image generation. They are an alternative to Generative Adversarial Networks (GANs) and have the potential to improve the quality 

#### üë©‚Äçüè´ key points:

* I use chains(haven't learned yet) by watching Youtube video, saying output parsers works better with chains. Also, it is giving me soe idea about how chains work.

* Here, when we use get repsonse, we get result along with meta info. we specifically write `response.content`

* Now, trying with string output parser


In [7]:
from langchain.chat_models import init_chat_model
from langchain_huggingface import HuggingFaceEndpoint,ChatHuggingFace
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

llm = HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3',
                          task='text-generation')

prompt = PromptTemplate(template="Hey! Tell me about recent {topic} updates.")

model = ChatHuggingFace(llm=llm)

parser = StrOutputParser()

chain = prompt | model | parser    # Gradually getting an idea of how chains work

response = chain.invoke({'topic': "AI"})

print(response)


 Hello! I'd be happy to share some recent updates in the AI world. Please note that this list is not exhaustive, and the AI field evolves rapidly:

1. **ChatGPT (December 2022)**: Developed by OpenAI, ChatGPT is a model that interacts in a conversational way. It provides responses to a wide range of prompts with a text-based interface. It has gained significant attention due to its ability to answer questions, write essays, summarize books, and more.

2. **Google's Bard (February 2023)**: Google's response to ChatGPT, Bard, is a conversational AI model that aims to provide detailed and high-quality responses to a wide range of questions. It's still in the early stages of development and testing.

3. **Stable Diffusion Models (SDMs) (2022)**: SDMs are a class of generative models that have shown promising results in image synthesis and text-to-image generation. They are an alternative to Generative Adversarial Networks (GANs) and have the potential to improve the quality and stability o

#### üë©‚Äçüè´ key points:

* So, we don't need to write `reponse.content` here.

* But, that cannot be the only case, when we want to call llm twice and send the first output as a prompt to the llm,  we can use the parser output directly using chains, without breaking the chain to extract textual content from the response.

In [None]:
from langchain.chat_models import init_chat_model
from langchain_huggingface import HuggingFaceEndpoint,ChatHuggingFace
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()

llm = HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3',
                          task='text-generation')


model = ChatHuggingFace(llm=llm)


prompt1 = PromptTemplate(template="Hey! Tell me about recent {topic} updates.", input_variables=['topic'])

prompt2 = PromptTemplate(template="Summarise the following text with clarity.\n{text}", input_variables=["text"])

parser = StrOutputParser()

chain = prompt1 | model | parser | prompt2 | model | parser   # Gradually getting an idea of how chains work

response = chain.invoke({'topic': "AI"})

print(response)


 In summary, recent advancements in AI include:

1. ChatGPT (December 2022) - a conversational AI model developed by OpenAI that interacts with a wide range of prompts.
2. Google's Bard (February 2023) - a conversational AI model in development by Google, similar to ChatGPT.
3. Stable Diffusion Models (SDMs) (2022) - a class of generative models showing promise in image synthesis and text-to-image generation, serving as an alternative to Generative Adversarial Networks (GANs).
4. Hugging Face's Model Hub (2021) - a platform for sharing and discovering pre-trained AI models, beneficial for developers.
5. Ethical AI Guidelines - ongoing efforts by organizations like the European Union to ensure AI systems are transparent, accountable, and respect user privacy and rights.
6. AI in Healthcare - AI is being utilized for tasks such as disease diagnosis, patient outcome prediction, and personalized treatment plans. Google's DeepMind Health is collaborating with the UK's National Health Servic

### 2. JsonOutputParser - parses the model's response as JSON and provides format instructions that ask the model to reply in JSON.

In [10]:
from langchain.chat_models import init_chat_model
from langchain_huggingface import HuggingFaceEndpoint,ChatHuggingFace
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser

load_dotenv()

llm = HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3',
                          task='text-generation')


model = ChatHuggingFace(llm=llm)

parser = JsonOutputParser()

template = PromptTemplate(template="Give me the name, age and city of a fictional person.\n{format_instruction}",
                          input_variables=[],
                          partial_variables={'format_instruction': parser.get_format_instructions()})


prompt = template.format()

print(prompt)



Give me the name, age and city of a fictional person.
Return a JSON object.


#### üë©‚Äçüè´ key points:

* When we create prompt, when we add partial variables(that runs before runtime), we tell the model via prompt that we want output in specific format.

* In above case, I am using `JsonOutputParser` so, the method `get_format_instruction()` get the information about what format we want our ouput to be in.

In [11]:
from langchain.chat_models import init_chat_model
from langchain_huggingface import HuggingFaceEndpoint,ChatHuggingFace
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser

load_dotenv()

llm = HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3',
                          task='text-generation')


model = ChatHuggingFace(llm=llm)

parser = JsonOutputParser()

template = PromptTemplate(template="Give me the name, age and city of a fictional person.\n{format_instruction}",
                          input_variables=[],
                          partial_variables={'format_instruction': parser.get_format_instructions()})


prompt = template.format() # No arguments needed: the prompt has no runtime variables.

response = model.invoke(prompt)

response.content

' {\n  "name": "Alexander Blackwood",\n  "age": 35,\n  "city": "New York City"\n}\n\nPlease note that this is a fictional character and the information provided is for the purpose of this response only.'
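The raw response above is not pure JSON: the object is followed by an extra note. As a rough idea of the work a JSON parser has to do on such text, here is a standard-library sketch that decodes the first JSON object and leaves the rest. This is not how `JsonOutputParser` is implemented, only an illustration of the problem it addresses.

```python
import json

# The raw response from the cell above: a JSON object followed by an extra note.
raw = (
    ' {\n  "name": "Alexander Blackwood",\n  "age": 35,\n  "city": "New York City"\n}\n\n'
    "Please note that this is a fictional character and the information provided "
    "is for the purpose of this response only."
)

start = raw.index("{")  # skip anything before the first opening brace

# raw_decode parses one JSON value and reports where it stopped.
parsed, end = json.JSONDecoder().raw_decode(raw, start)
trailing_note = raw[end:].strip()

print(parsed)
print(trailing_note)
```

Calling `json.loads(raw)` directly would fail here because of the trailing text, which is why a plain `response.content` is not enough.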

In [13]:
from langchain.chat_models import init_chat_model
from langchain_huggingface import HuggingFaceEndpoint,ChatHuggingFace
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser

load_dotenv()

llm = HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3',
                          task='text-generation')


model = ChatHuggingFace(llm=llm)

parser = JsonOutputParser()

template = PromptTemplate(template="Give me the name, age and city of a fictional person.\n{format_instruction}",
                          input_variables=[],
                          partial_variables={'format_instruction': parser.get_format_instructions()})


prompt = template.format() # No arguments needed: the prompt has no runtime variables.

response = model.invoke(prompt)

parsed = parser.parse(response.content)

parsed

{'name': 'Alexander Blackwood', 'age': 35, 'city': 'New York City'}

In [None]:
# The same thing can also be written as a chain

from langchain.chat_models import init_chat_model
from langchain_huggingface import HuggingFaceEndpoint,ChatHuggingFace
from dotenv import load_dotenv
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser, JsonOutputParser

load_dotenv()

llm = HuggingFaceEndpoint(repo_id='mistralai/Mistral-7B-Instruct-v0.3',
                          task='text-generation')


model = ChatHuggingFace(llm=llm)

parser = JsonOutputParser()

template = PromptTemplate(template="Give me the name, age and city of a fictional person.\n{format_instruction}",
                          input_variables=[],
                          partial_variables={'format_instruction': parser.get_format_instructions()})


prompt = template.format() # No arguments needed: the prompt has no runtime variables.

chain = template | model | parser
response = chain.invoke({})
print(response)


{'name': 'Alexander Blackwood', 'age': 35, 'city': 'New York City'}
