
## CSV Parser

In [24]:
!pip install openai --quiet
!pip install langchain --quiet
!pip install langchain-cohere --quiet
!pip install langchain-community --quiet



In [25]:
from langchain.llms import Cohere
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

In [26]:
import os
from google.colab import userdata
os.environ['COHERE_API_KEY'] = userdata.get("COHERE_API_KEY")

In [27]:
from langchain.output_parsers import CommaSeparatedListOutputParser

In [28]:
output_parser = CommaSeparatedListOutputParser()
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

Your response should be a list of comma separated values, eg: `foo, bar, baz` or `foo,bar,baz`


In [36]:
model = Cohere(temperature=0)

In [37]:
template = """
List down the top 10 concepts to learn from the following topic.
Topic name is: {topic_name}
{format_instructions}
"""

prompt = PromptTemplate(
    template=template,
    input_variables=["topic_name","format_instructions"]
)

chain = LLMChain(llm=model, prompt=prompt)

In [38]:
result = chain.invoke({"topic_name":"Acting","format_instructions":format_instructions})
print(result['text'])

 Acting, Theater, Stage, Film, Drama, Character, Emotion, Story, Script, Audition. 


In [39]:
result = chain.invoke({"topic_name":"Chess","format_instructions":format_instructions})
print(result['text'])

 chess game, rules of chess, how to play chess, chess board, chess pieces, queen, king, bishop, knight, rook, pawn
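The cells above print the raw model text, so the list is still a single string. In the notebook, the natural next step would be `output_parser.parse(result['text'])`. The sketch below only approximates that step with the standard library so it runs without an API key; the helper name `parse_comma_separated` is hypothetical, and the exact behavior of LangChain's parser may differ between versions.

```python
import csv
from io import StringIO


def parse_comma_separated(text: str) -> list[str]:
    """Split a comma-separated model response into a list of trimmed items.

    Hypothetical stand-in for CommaSeparatedListOutputParser.parse;
    it is not LangChain's implementation.
    """
    reader = csv.reader(StringIO(text.strip()), skipinitialspace=True)
    return [item.strip() for row in reader for item in row if item.strip()]


# Raw text copied from the "Acting" output above.
raw = " Acting, Theater, Stage, Film, Drama, Character, Emotion, Story, Script, Audition. "
concepts = parse_comma_separated(raw)
print(concepts)
```

Note that the trailing period stays attached to the last item (`'Audition.'`), so some light post-processing may still be needed after parsing.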


## DateTime Parser

In [40]:
from langchain.output_parsers import DatetimeOutputParser

output_parser = DatetimeOutputParser()
format_instructions = output_parser.get_format_instructions()
print(format_instructions)

Write a datetime string that matches the following pattern: '%Y-%m-%dT%H:%M:%S.%fZ'.

Examples: 936-03-25T06:18:04.824192Z, 275-11-29T07:37:58.397671Z, 414-01-15T16:40:29.512745Z

Return ONLY this string, no other words!


In [41]:
Server_Logs = [
    "[2024-04-01 13:48:11] ERROR: Failed to connect to database. Retrying in 60 seconds.",
    "[2023-08-04 12:01:00 AM - Warning: The system is running low on disk space.",
    "[04-01-2024 13:55:39] CRITICAL: System temperature exceeds safe threshold. Initiating shutdown",
    "[Monday, April 01, 2024 01:55:39 PM] DEBUG: User query executed in 0.45 seconds.",
    "[13:55:39 on 2024-04-01] ERROR: Unable to send email notification. SMTP server not responding."
]

In [43]:
template = """
Read the server log text and extract the date and time.
log text is: {log_text}
{format_instructions}
"""

prompt = PromptTemplate(
    template=template,
    input_variables=["log_text","format_instructions"]
)

chain = LLMChain(llm=model, prompt=prompt)

In [44]:
for log_message in Server_Logs:
  result = chain.invoke({"log_text":log_message,"format_instructions":format_instructions})
  print(result['text'])

 2024-04-01T13:48:11.000000Z
 2023-08-04T12:01:00.000000Z
 2024-04-01T13:55:39.000Z
 2024-04-01T13:55:39.000Z
 2024-04-01T13:55:39.000Z
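The loop above prints strings, not `datetime` objects. In the notebook, `output_parser.parse(result['text'])` would be the way to convert them. As a rough, standard-library-only sketch of that conversion, the strings can be passed to `datetime.strptime` with the pattern shown in the format instructions. The list `raw_outputs` is a hand-copied subset of the output above.

```python
from datetime import datetime

# Pattern taken from the format instructions printed earlier.
PATTERN = "%Y-%m-%dT%H:%M:%S.%fZ"

# A few of the raw model outputs from above, including the leading space.
raw_outputs = [
    " 2024-04-01T13:48:11.000000Z",
    " 2023-08-04T12:01:00.000000Z",
    " 2024-04-01T13:55:39.000Z",
]

# %f accepts one to six fractional digits, so both ".000000" and ".000" parse.
parsed = [datetime.strptime(text.strip(), PATTERN) for text in raw_outputs]
for value in parsed:
    print(value.isoformat())
```

A string that parses is not necessarily correct: the second log line reads `12:01:00 AM`, which is `00:01:00` in 24-hour time, yet the model returned hour 12. Extracted values are worth spot-checking against the source logs.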


## Custom Parser using Pydantic

Pydantic is a data-validation library for Python. You declare the fields a model accepts and their types (such as `int`, `str`, or dates) as class annotations. If the incoming data matches the declared types, or can be converted to them, you get a model instance. If not, Pydantic raises a `ValidationError` that names the offending field, which makes bad input easier to catch early.

In [45]:
from pydantic import BaseModel, ValidationError

In [46]:
class User(BaseModel):
  name: str
  age: int

In [47]:
user = User(name="Saddar U Din",age=21)
print(user)

name='Saddar U Din' age=21


In [49]:
# Passing a value of the wrong type (here, a non-numeric string for age) raises a ValidationError
User(name="SDB",age='twenty one')

ValidationError: 1 validation error for User
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='twenty one', input_type=str]
    For further information visit https://errors.pydantic.dev/2.11/v/int_parsing

In [50]:
from langchain.output_parsers import PydanticOutputParser
from pydantic import BaseModel, Field

In [51]:
class Scientist(BaseModel):
  name: str = Field(description="Name of the Scientist")
  dob: str = Field(description="Date of Birth of the Scientist")
  bio: str = Field(description="Biography of the Scientist")

In [52]:
custom_output_parser = PydanticOutputParser(pydantic_object=Scientist)
print(custom_output_parser.get_format_instructions())

The output should be formatted as a JSON instance that conforms to the JSON schema below.

As an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}
the object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.

Here is the output schema:
```
{"properties": {"name": {"description": "Name of the Scientist", "title": "Name", "type": "string"}, "dob": {"description": "Date of Birth of the Scientist", "title": "Dob", "type": "string"}, "bio": {"description": "Biography of the Scientist", "title": "Bio", "type": "string"}}, "required": ["name", "dob", "bio"]}
```


In [53]:
template = """
Take the name of the Scientist {name} and try to fill the rest of the details.
{format_instructions}
"""

prompt = PromptTemplate(
    template=template,
    input_variables=["name","format_instructions"]
)

chain = LLMChain(llm=model,prompt=prompt)

In [55]:
result = chain.invoke({"name":"Schrodinger","format_instructions":custom_output_parser.get_format_instructions()})
print(result['text'])

 {
    "properties": {
        "name": "Schrodinger",
        "dob": "1887-09-26",
        "bio": "Born in Vienna, Austria, in 1887, Erwin Schrodinger is most famous for his involvement in the development of quantum mechanics, winning a Nobel Prize for his contributions in 1933. His work on the wave function of hydrogen led to groundbreaking insights into the behavior of subatomic particles. Schrodinger was also known for his controversial interpretations of quantum mechanics and his criticism of the prevailing paradigm. He made contributions to many areas of physics and philosophy, making him one of the most influential scientists of the 20th century."
    },
    "required": ["name", "dob", "bio"]
}


In [56]:
result = chain.invoke({"name":"Hisenburgh","format_instructions":custom_output_parser.get_format_instructions()})
print(result['text'])

 {
    "properties": {
        "name": "Hisenburgh",
        "dob": "January 16, 1670",
        "bio": "Hisenburgh was born in Amsterdam in the Netherlands in 1670. He was a renowned scientist primarily interested in horticulture. He is most famous for his work with the Dutch East India Company. His discoveries and collections are now owned by the University of Amsterdam."
},
        "required": ["name", "dob", "bio"]
}
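Both responses above nest the fields under `"properties"`, which is the shape the format instructions describe as not well-formatted. Passing this text to `custom_output_parser.parse(...)` would therefore most likely fail validation, since `name`, `dob`, and `bio` are missing at the top level. The sketch below uses only the `json` module to check for the required keys and, as one possible repair, unwraps the extra nesting. The helper `extract_scientist_fields` and the abbreviated `raw_text` are illustrative assumptions, not part of LangChain or Pydantic.

```python
import json

REQUIRED_FIELDS = ("name", "dob", "bio")


def extract_scientist_fields(raw_text: str) -> dict:
    """Return the required fields from a JSON response (hypothetical helper).

    If the fields are missing at the top level but a "properties" object
    exists, unwrap it; otherwise raise ValueError listing what is missing.
    """
    data = json.loads(raw_text)
    has_all = all(field in data for field in REQUIRED_FIELDS)
    if not has_all and isinstance(data.get("properties"), dict):
        data = data["properties"]
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return {field: data[field] for field in REQUIRED_FIELDS}


# Abbreviated copy of the first model response above; the values are taken
# as-is from the model output and have not been fact-checked.
raw_text = """
{
    "properties": {
        "name": "Schrodinger",
        "dob": "1887-09-26",
        "bio": "Born in Vienna, Austria, in 1887."
    },
    "required": ["name", "dob", "bio"]
}
"""

fields = extract_scientist_fields(raw_text)
print(sorted(fields))
```

The resulting dictionary could then be passed to `Scientist(**fields)` for type validation. A well-formed JSON object does not guarantee accurate content: the second response invents a biography for the misspelled name "Hisenburgh", so the model's values still need checking.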
