<a href="https://colab.research.google.com/github/Sweta-Das/LangChain-HuggingFace-LLM/blob/main/2_model_prompts_parsing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [118]:
%%capture
%pip install langchain
%pip install huggingface_hub
%pip install transformers
%pip install accelerate
%pip install bitsandbytes

In [119]:
import os
from langchain import HuggingFaceHub
from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate

**PromptTemplate Class**: Creates reusable string templates for building LLM prompts, similar to an f-string in Python.<br>
**LLMChain Class**: The most basic building block for interacting with LLMs. It uses a `PromptTemplate` to define the structure of the prompt, fills it in with the supplied inputs, and sends the formatted prompt to the underlying LLM connected to LangChain.
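To make the f-string comparison concrete, here is a minimal sketch that uses only Python's built-in `str.format`. It does not call LangChain: `fill_template` is a hypothetical helper introduced for illustration, and it mimics only the placeholder-substitution part of what a prompt template does.

```python
# Illustration only: plain str.format standing in for a prompt template.
# `fill_template` is a hypothetical helper, not a LangChain function.

TEMPLATE = "Question: {question}\nAnswer: Let's think step by step.\n"


def fill_template(template: str, **variables: str) -> str:
    """Substitute named placeholders, much as an f-string would."""
    return template.format(**variables)


filled = fill_template(TEMPLATE, question="What is the capital of France?")
print(filled)
```

The template string is written once and reused with different values for `question`, which is the main reason to prefer a template over an inline f-string.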

### Using HuggingFace API Key

In [120]:
# Replace the placeholder string with your own Hugging Face access token
os.environ['HUGGINGFACEHUB_API_TOKEN'] = 'HUGGINGFACEHUB_API_TOKEN'
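As an alternative to writing the token into the notebook, the value can be read from the environment and checked before any request is made. The sketch below is one possible approach, not part of LangChain: `get_hf_token` is a hypothetical helper, and `hf_example` is a made-up value used only to exercise it.

```python
import os
from typing import Mapping

# Hypothetical helper for illustration; only the environment variable name
# comes from the notebook itself.


def get_hf_token(env: Mapping[str, str] = os.environ) -> str:
    """Return the token, rejecting a missing value or the literal placeholder."""
    token = env.get("HUGGINGFACEHUB_API_TOKEN", "")
    if not token or token == "HUGGINGFACEHUB_API_TOKEN":
        raise RuntimeError("Set HUGGINGFACEHUB_API_TOKEN to a real token first.")
    return token


# Demonstration with an in-memory mapping and a made-up token value.
demo_token = get_hf_token({"HUGGINGFACEHUB_API_TOKEN": "hf_example"})
print(demo_token)
```

Failing early with a clear message tends to be easier to debug than an authentication error raised later by the remote endpoint.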

In [121]:
template = """Question: {question}
Answer: Let's think step by step.
"""

prompt = PromptTemplate(template=template, input_variables=["question"])

Using a **Chain**, which has two components:<br>
  - prompt: built from the prompt template <br>
  - llm: the model that generates the answer => [mistralai/Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1)<br>
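The two-component structure can be sketched without LangChain or a real model. In the example below, `SimpleChain` and `echo_llm` are hypothetical names introduced for illustration. The stub stands in for the hosted model, so the point is only the order of operations: format the prompt, then hand it to the model.

```python
from typing import Callable

# Illustration only: a hand-rolled two-step chain.
# `SimpleChain` and `echo_llm` are hypothetical names, not LangChain APIs.


class SimpleChain:
    def __init__(self, template: str, llm: Callable[[str], str]) -> None:
        self.template = template
        self.llm = llm

    def run(self, question: str) -> str:
        prompt = self.template.format(question=question)  # step 1: build the prompt
        return self.llm(prompt)  # step 2: pass it to the model


def echo_llm(prompt: str) -> str:
    """Stub model that only reports how long the prompt was."""
    return f"(stub reply to a {len(prompt)}-character prompt)"


chain = SimpleChain("Question: {question}\nAnswer:", echo_llm)
result = chain.run("What is the capital of France?")
print(result)
```

Because the model is just a callable here, swapping a hosted model for a local one changes only the `llm` argument, which mirrors how the notebook later reuses the same prompt with a local pipeline.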

In [122]:
llm_chain = LLMChain(
    prompt = prompt,
    llm = HuggingFaceHub(
        repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
        model_kwargs={
            "temperature": 0.1,
            "max_length": 100
        },
    )
)

In [123]:
question = "What is the capital of France?"
print(llm_chain.run(question))

Question: What is the capital of France?
Answer: Let's think step by step.

The capital of a country is the city where the government resides.

In France, the government resides in Paris.

Therefore, Paris is the capital of France.


In [124]:
question = "Who is Abraham Lincoln? What is he famous for?"
print(llm_chain.run(question))

Question: Who is Abraham Lincoln? What is he famous for?
Answer: Let's think step by step.

Abraham Lincoln was the 16th President of the United States, serving from March 1861 until his assassination in April 1865. He is famous for leading the nation through its greatest internal crisis, the American Civil War, and abolishing slavery.

He was born on February 12, 1809, in a log cabin in Hardin County, Kentucky. He grew up in a poor family and had little formal education.


### Using HuggingFace Model Locally

In [125]:
%%capture
%pip install accelerate
%pip install -i https://pypi.org/simple/ bitsandbytes

In [126]:
import torch
import accelerate
from langchain.llms import HuggingFacePipeline
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, AutoModelForSeq2SeqLM

In [127]:
model_name = 'google/flan-t5-small'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name,
                                              device_map='auto')

The `google/flan-t5-small` model uses `T5TokenizerFast` as its tokenizer.

In [128]:
# Use a distinct variable name so the imported `pipeline` function is not shadowed
pipe = pipeline(
    "text2text-generation",
    model=model,
    tokenizer=tokenizer,
    max_length=128
)
local_llm = HuggingFacePipeline(pipeline=pipe)

In [129]:
# Sending prompt directly
local_llm.invoke('What is the capital of China?')

'shanghai'

In [130]:
# Sending prompt through LLMChain
llm_chain = LLMChain(
    prompt = prompt,
    llm = local_llm
)
question = "What is the capital of England?"
print(llm_chain.run(question))

England is the capital of England. So, the answer is England.


### Chat Prompt Template

In [131]:
template = """
Interpret the text and evaluate the text.
sentiment: is the text in a positive, neutral or negative sentiment?
subject: What subject is the text about? Use exactly one word.

Format the output as JSON with the following keys:
sentiment
subject

text: {question}
"""

In [132]:
from langchain.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_template(template=template)
chain = LLMChain(
    prompt=prompt_template,
    llm=local_llm)

chain.predict(question="I ordered Pizza Salami and it was awesome!")

'positive'

### Real World Example with ResponseSchema, Templates, Chains, OutputParsers

In [133]:
from langchain.prompts import ChatPromptTemplate

template = """
Interprete the text and evaluate the text.
sentiment: is the text in a positive, neutral or negative sentiment?
subject: What subject is the text about? Use exactly one word.

Just return the JSON, do not add ANYTHING, NO INTERPRETATION!

text: {input}

{format_instructions}

"""

In [141]:
chat = LLMChain(
    prompt = ChatPromptTemplate.from_template(template=template),
    llm = HuggingFaceHub(
        repo_id="mistralai/Mixtral-8x7B-Instruct-v0.1",
        model_kwargs={
            "temperature": 0.1,
            "max_length": 100
        },
    )
)

In [142]:
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser

sentiment_schema = ResponseSchema(
    name="sentiment",
    description="Is the text positive, neutral or negative? Only provide these words",
)
subject_schema = ResponseSchema(
    name="subject", description="What subject is the text about? Use exactly one word."
)
price_schema = ResponseSchema(
    name="price",
    description="How expensive was the product? Use None if no price was provided in the text",
)

response_schemas = [sentiment_schema, subject_schema, price_schema]

In [136]:
parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = parser.get_format_instructions()
format_instructions

'The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":\n\n```json\n{\n\t"sentiment": string  // Is the text positive, neutral or negative? Only provide these words\n\t"subject": string  // What subject is the text about? Use exactly one word.\n\t"price": string  // How expensive was the product? Use None if no price was provided in the text\n}\n```'

In [143]:
format_instructions = parser.get_format_instructions()

messages = {
    "input": "I ordered Pizza Salami for 9.99$ and it was awesome!",
    "format_instructions": format_instructions,
}

response = chat.invoke(messages)
response

{'input': 'I ordered Pizza Salami for 9.99$ and it was awesome!',
 'format_instructions': 'The output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":\n\n```json\n{\n\t"sentiment": string  // Is the text positive, neutral or negative? Only provide these words\n\t"subject": string  // What subject is the text about? Use exactly one word.\n\t"price": string  // How expensive was the product? Use None if no price was provided in the text\n}\n```',
 'text': 'Human: \nInterprete the text and evaluate the text.\nsentiment: is the text in a positive, neutral or negative sentiment?\nsubject: What subject is the text about? Use exactly one word.\n\nJust return the JSON, do not add ANYTHING, NO INTERPRETATION!\n\ntext: I ordered Pizza Salami for 9.99$ and it was awesome!\n\nThe output should be a markdown code snippet formatted in the following schema, including the leading and trailing "```json" and "```":\n\n```json\n{

In [None]:
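# chat.invoke returns a dict; the model output is stored under the "text" key.
# The Hub endpoint echoes the prompt, so keep only what follows the format instructions.
generated = response["text"].split(format_instructions)[-1]
output_dict = parser.parse(generated)
output_dict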

{"sentiment": "positive", "subject": "pizza", "price": "9.99$"}
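To show roughly what happens during parsing, here is a simplified stand-in written with the standard library only. It is not LangChain's implementation: `parse_json_block` is a hypothetical helper that looks for a fenced `json` block, loads it, and checks that the expected keys are present. It reads the last block in the text on the assumption that any schema echoed from the prompt comes before the model's answer.

```python
import json
import re

# Simplified stand-in for a structured output parser; NOT LangChain's code.
# `parse_json_block` is a hypothetical helper introduced for illustration.

FENCE = "`" * 3  # three backticks, built this way so the example nests cleanly
JSON_BLOCK = re.compile(FENCE + r"json\s*(.*?)" + FENCE, re.DOTALL)


def parse_json_block(text: str, expected_keys: list) -> dict:
    """Extract the last fenced json block and verify its keys."""
    matches = JSON_BLOCK.findall(text)
    if not matches:
        raise ValueError("no json code block found")
    data = json.loads(matches[-1])  # last block: skips a schema echoed earlier
    missing = [key for key in expected_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Sample text shaped like a model reply that follows the format instructions.
sample = (
    "Some echoed prompt text\n"
    + FENCE + "json\n"
    + '{"sentiment": "positive", "subject": "pizza", "price": "9.99$"}\n'
    + FENCE
)
parsed = parse_json_block(sample, ["sentiment", "subject", "price"])
print(parsed)
```

Checking for missing keys matters because a model may return valid JSON that still omits a requested field, and a parse step is the natural place to catch that.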