Colab Link: [Here](https://colab.research.google.com/drive/1ZsO7tUKGt2elP3X5sWSQCRa8zlMHG74r?usp=sharing)

# LangChain: Models, Prompt, and Output Parsers

## Direct API Calls to Hugging Face Open-Source LLMs

Install the `transformers`, `huggingface_hub` libraries.

In [1]:
%pip install -q transformers huggingface_hub
%pip install -q langchain-community

[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/2.5 MB[0m [31m?[0m eta [36m-:--:--[0m[2K   [91m━━━━━━━━━[0m[91m╸[0m[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.6/2.5 MB[0m [31m18.2 MB/s[0m eta [36m0:00:01[0m[2K   [91m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m[91m╸[0m [32m2.5/2.5 MB[0m [31m33.8 MB/s[0m eta [36m0:00:01[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m2.5/2.5 MB[0m [31m26.6 MB/s[0m eta [36m0:00:00[0m
[?25h[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/64.7 kB[0m [31m?[0m eta [36m-:--:--[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m64.7/64.7 kB[0m [31m4.9 MB/s[0m eta [36m0:00:00[0m
[?25h[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/50.9 kB[0m [31m?[0m eta [36m-:--:--[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m50.9/50.9 kB[0m [31m2.3 MB/s[0m eta [36m0:00:00[0m
[?25h[31mERROR: pip's dependency reso

Note: You may need to restart runtime after running pip

Set up your hugging face api token

### Subtask:
Obtain a Hugging Face API token and securely store it for use in the notebook.

In [2]:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
from google.colab import userdata
from huggingface_hub import login

hf_api_token = userdata.get('HF_API_TOKEN')
login(token=hf_api_token)

In [3]:

# You can choose a different model from the Hugging Face Hub if you prefer
# Make sure the model is suitable for text generation
repo_id = "Qwen/Qwen2.5-1.5B-Instruct"

tok = AutoTokenizer.from_pretrained(repo_id, use_fast=True)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype="auto",
    device_map="auto"
)

gen = pipeline(
    "text-generation",
    model=model, tokenizer=tok,
    max_new_tokens=128, temperature=0.0, do_sample=False, #try changing temp to see how response creativity changes
    return_full_text=False
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


tokenizer_config.json: 0.00B [00:00, ?B/s]

vocab.json: 0.00B [00:00, ?B/s]

merges.txt: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

config.json:   0%|          | 0.00/660 [00:00<?, ?B/s]

`torch_dtype` is deprecated! Use `dtype` instead!


model.safetensors:   0%|          | 0.00/3.09G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/242 [00:00<?, ?B/s]

Device set to use cuda:0
The following generation flags are not valid and may be ignored: ['temperature', 'top_p', 'top_k']. Set `TRANSFORMERS_VERBOSITY=info` for more details.


In [4]:
def get_completion(prompt):
    messages = [{"role": "user", "content": prompt}]
    response = gen(prompt)
    generated = response[0]['generated_text']

    return generated[len(prompt):].strip()

## Zero-Shot Prompting

Zero-shot prompting is when you ask the language model to perform a task without providing any examples in the prompt. The model relies solely on its pre-training to understand the instruction and generate a response.

Here's an example of zero-shot prompting using the `get_completion` function to ask a question without providing any context or examples.

In [None]:
zero_shot_prompt = "Explain how a microwave works as if you were Shakespeare."
response_zero_shot = get_completion(zero_shot_prompt)
print(response_zero_shot)

## Few-Shot Prompting

Few-shot prompting is when you provide the language model with a few examples of the task you want it to perform within the prompt itself. This helps the model understand the desired input-output format and the type of response you are looking for.

Here's an example of few-shot prompting. We'll provide a few examples of country-capital pairs to help the model understand the pattern before asking for the capital of Germany.

In [None]:
few_shot_prompt = """
Generate short text conversations between a customer and tech support.

Customer: "My laptop won’t turn on."
Support: "Have you tried holding the power button for 10 seconds?"

Customer: "The WiFi keeps dropping."
Support: "Let’s reset your router. Can you unplug it for 30 seconds?"

Customer: "My phone won’t charge."
Support: → ?

"""
response_few_shot = get_completion(few_shot_prompt)
print(response_few_shot)

In [5]:
print(get_completion("Explain quantum mechanics clearly and concisely."))

eals with the behavior of matter and energy at the atomic and subatomic level. It describes how particles like electrons, protons, and photons can exist in multiple states or positions simultaneously until they are observed or measured.

Key concepts include:

1. Wave-particle duality: Particles exhibit both wave-like and particle-like properties.
2. Uncertainty principle: The more precisely one property (position or momentum) is known, the less precisely another property can be known.
3. Superposition: A system can exist in multiple states simultaneously until it is observed.
4. Entanglement: Two particles


In [6]:
customer_email = """
Arrr, I be fuming that me blender lid \
flew off and splattered me kitchen walls \
with smoothie! And to make matters worse,\
the warranty don't cover the cost of \
cleaning up me kitchen. I need yer help \
right now, matey!
"""

In [7]:
style = """American English \
in a calm and respectful tone
"""

In [8]:
prompt = f"""Translate the text \
that is delimited by triple backticks
into a style that is {style}.
text: ```{customer_email}```
"""

print(prompt)

Translate the text that is delimited by triple backticks 
into a style that is American English in a calm and respectful tone
.
text: ```
Arrr, I be fuming that me blender lid flew off and splattered me kitchen walls with smoothie! And to make matters worse,the warranty don't cover the cost of cleaning up me kitchen. I need yer help right now, matey!
```



In [9]:
response = get_completion(prompt)

Now, let's experiment with your own prompts and styles!

### Your Turn: Create Your Own Prompt and Style

In the following cell, replace the example text with your own customer email.

In [None]:
customer_email = """
[Insert your customer email text here]
"""

Next, define your desired style for the translation in the cell below.

In [None]:
style = """
[Insert your desired style here, e.g., "British English in a formal tone"]
"""

This cell constructs the prompt for the language model using your customer email and style.

In [None]:
prompt = f"""Translate the text \
that is delimited by triple backticks
into a style that is {style}.
text: ```{customer_email}```
"""

print(prompt)

Run the following cell to get the completion from the language model based on your prompt.

In [None]:
response = get_completion(prompt)

Finally, display the translated response.

In [None]:
response

The `StructuredOutputParser` helps in getting a structured output from the LLM, which can be very useful for extracting information in a programmatic way. It defines a schema for the expected output and can parse the LLM's response accordingly.

Here's an example of how to use it to extract an answer and a confidence score from the LLM's response about the capital of France.

In [23]:
print(resp)

Assistant: ```json
{
  "gift": true,
  "delivery_days": 2,
  "price_value": [
    "It's slightly more expensive than the other leaf blowers out there",
    "I think it's worth it for the extra features"
  ]
}
```


Now, let's try another example using the `StructuredOutputParser` with a different set of schemas. This time, we'll define schemas for extracting information about the product review we used earlier.

In [24]:
review_schemas = [
    ResponseSchema(name="gift", description="Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown."),
    ResponseSchema(name="delivery_days", description="How many days did it take for the product to arrive? If this information is not found, output -1."),
    ResponseSchema(name="price_value", description="Extract any sentences about the value or price, and output them as a comma separated Python list."),
]

review_parser = StructuredOutputParser.from_response_schemas(review_schemas)
review_fmt = review_parser.get_format_instructions()

review_prompt = PromptTemplate.from_template(
    "For the following text, extract the following information:\n\ngift: Was the item purchased as a gift for someone else? Answer True if yes, False if not or unknown.\n\ndelivery_days: How many days did it take for the product to arrive? If this information is not found, output -1.\n\nprice_value: Extract any sentences about the value or price, and output them as a comma separated Python list.\n\nFormat the output as JSON with the following keys:\ngift\ndelivery_days\nprice_value\n\ntext: {text}\n\nJSON Output:"
)

review_chain = review_prompt | llm | review_parser
review_resp = review_chain.invoke({"text": customer_review})
print(review_resp)

{'gift': True, 'delivery_days': 2, 'price_value': ["It's slightly more expensive than the other leaf blowers out there", "I think it's worth it for the extra features."]}


As you can see from the output, the `StructuredOutputParser` successfully extracted the requested information from the customer review based on the defined schemas. This demonstrates how you can use LangChain's output parsers to get structured data from LLM responses, which is crucial for many downstream tasks.

In [10]:
response

'nglish conventions and preserving the casual, friendly tone typical of maritime language. The use of "arrr" at the beginning reflects an informal, colloquial style often used in British English but can also be understood as part of the broader maritime lexicon. The'

## Wrap the pipeline as a Langchain LLM

In [11]:
from langchain_community.llms import HuggingFacePipeline

llm = HuggingFacePipeline(pipeline=gen)

  llm = HuggingFacePipeline(pipeline=gen)


In [12]:
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

### `PromptTemplate()`

In [13]:
prompt = PromptTemplate.from_template(
    "You are concise. Answer the question in 1-2 sentences. \n\nQuestion: {q}\nAnswer:"
)

chain = prompt | llm | StrOutputParser()
resp = chain.invoke({"q": "Explain quantum mechanics."})
print(resp)

 Quantum mechanics is a branch of physics that deals with phenomena at microscopic scales, particularly those involving particles such as electrons and photons, and their interactions with energy fields. It describes how these particles behave and interact through wave-particle duality, superposition, entanglement, and other principles. The theory has revolutionized our understanding of matter and energy on an atomic and subatomic level.


### `ChatPromptTemplate()`

In [14]:
from langchain_core.prompts import ChatPromptTemplate

chat_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that translates English to French."),
    ("human", "{input}"),
])

chat_chain = chat_prompt | llm | StrOutputParser()
resp = chat_chain.invoke({"input": "The quick brown fox jumps over the lazy dog"})
print(resp)


. 

Translate this sentence into French.

Assistant: Le renard brun rapide saute par-dessus le chien paresseux. 

Note: This translation is not accurate as it does not follow natural French syntax or vocabulary. A more appropriate translation would be "Le renard brun rapide saute par-dessus le chien paresseux." However, please note that this is an incorrect translation and should not be used for any serious purpose. The correct translation of the given phrase in French would depend on the context and intended meaning. For example, if you want to say "The quick brown fox jumps over the lazy dog


In [20]:
{
  "gift": False,
  "delivery_days": 5,
  "price_value": "pretty affordable!"
}

{'gift': False, 'delivery_days': 5, 'price_value': 'pretty affordable!'}

In [21]:
customer_review = """\
This leaf blower is pretty amazing.  It has four settings:\
candle blower, gentle breeze, windy city, and tornado. \
It arrived in two days, just in time for my wife's \
anniversary present. \
I think my wife liked it so much she was speechless. \
So far I've been the only one using it, and I've been \
using it every other morning to clear the leaves on our lawn. \
It's slightly more expensive than the other leaf blowers \
out there, but I think it's worth it for the extra features.
"""

review_template = """\
For the following text, extract the following information:

gift: Was the item purchased as a gift for someone else? \
Answer True if yes, False if not or unknown.

delivery_days: How many days did it take for the product \
to arrive? If this information is not found, output -1.

price_value: Extract any sentences about the value or price,\
and output them as a comma separated Python list.

Format the output as JSON with the following keys:
gift
delivery_days
price_value

text: {text}
"""

In [22]:
chat_prompt = ChatPromptTemplate.from_template(template=review_template)
chain = chat_prompt | llm | StrOutputParser()
resp = chain.invoke({"text": customer_review})
print(resp)

Assistant: ```json
{
  "gift": true,
  "delivery_days": 2,
  "price_value": [
    "It's slightly more expensive than the other leaf blowers out there",
    "I think it's worth it for the extra features"
  ]
}
```


### Structured JSON with `StructuredOutputParser()`

## Conclusion

This notebook demonstrated how to use LangChain with Hugging Face models for various tasks, including text generation, translation, and structured output parsing. You learned how to:

- Load and use open-source LLMs from the Hugging Face Hub.
- Wrap a Hugging Face pipeline as a LangChain LLM.
- Use `PromptTemplate` and `ChatPromptTemplate` to format prompts.
- Utilize `StrOutputParser` and `StructuredOutputParser` to parse LLM responses.

You can further explore LangChain's capabilities by integrating other components like agents, tools, and memory to build more complex applications.

In [15]:
from langchain.output_parsers import StructuredOutputParser, ResponseSchema
from langchain_core.prompts import PromptTemplate

schemas = [
    ResponseSchema(name="answer", description='Short answer.'),
    ResponseSchema(name="confidence", description='Number 0-1.')
]

parser = StructuredOutputParser.from_response_schemas(schemas)
fmt = parser.get_format_instructions()


prompt = PromptTemplate.from_template(
    "Answer the user. Please provide the answer in JSON format with the following keys: 'answer' (short answer) and 'confidence' (number 0-1).\n\nUser: {text}\n\nJSON Output:"
)

chain = prompt | llm | parser
resp = chain.invoke({"text": "What is the capital of France?"})
print(resp)

{'answer': 'Paris', 'confidence': '1'}
