[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/jeljov/NAP2025/blob/main/LangChain_Intro.ipynb)

# Introduction to LangChain

[LangChain](https://www.langchain.com/) is a popular framework for quickly building apps and pipelines around **L**arge **L**anguage **M**odels (LLMs). It can be used for chatbots, **G**enerative **Q**uestion-**A**nswering (GQA), summarization, and much more.

The core idea of the framework is to _"chain"_ together different components to support advanced use cases around LLMs (see an illustration [here](https://miro.medium.com/v2/resize:fit:1400/1*jJE0uZTBadEYqe0hEXlWuQ.png)). Chains may consist of multiple components from several modules, the main ones being:

* **Models**: Modern LLMs are typically accessed through a chat model interface that takes a list of messages as input and returns a message as output. The latest generation of chat models natively supports **tool calling APIs**, which enable LLMs to interact with external services, APIs, and databases.

* **Prompt templates**: A prompt template is an object that takes user input and combines it with a template for a particular prompt type to produce the final string or message (the prompt). There are different types of prompt templates, suitable for different kinds of models and different tasks.

* **Output parsers**: These are responsible for taking in the output of a model (strings or a message) and transforming it into a more usable form, such as JSON that matches a given schema.
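
To make the chaining idea concrete before any real model is involved, here is a minimal sketch in plain Python. It does not use LangChain: the `Step` class and the fake model are hypothetical stand-ins invented for illustration, and only mimic how a prompt template, a model, and an output parser can be composed with the `|` operator.

```python
from typing import Any, Callable


class Step:
    """A tiny stand-in for a chainable component (hypothetical, not a LangChain class)."""

    def __init__(self, func: Callable[[Any], Any]):
        self.func = func

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that feeds the output of `a` into `b`.
        return Step(lambda value: other.func(self.func(value)))

    def invoke(self, value: Any) -> Any:
        return self.func(value)


# Prompt template: fills the user input into a fixed template.
prompt = Step(lambda inputs: f"Suggest a name for a company that sells {inputs['product']}.")

# Fake "model": upper-cases the prompt instead of calling an LLM.
fake_model = Step(lambda text: {"content": text.upper()})

# Output parser: extracts the text from the model's response.
parser = Step(lambda response: response["content"])

chain = prompt | fake_model | parser
result = chain.invoke({"product": "cookies"})
print(result)
```

The real components used in the rest of this notebook follow the same pattern: each one takes the previous component's output as its input.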


In [None]:
!pip -q install langchain

# Using large language models in LangChain

LangChain supports several large language model (LLM) providers, including both those that offer open and free models (such as Hugging Face) and those that offer proprietary models that require payment (e.g., OpenAI). Using these different models in LangChain is almost identical: one only needs to instantiate a different class (e.g., `ChatGroq` vs `ChatOpenAI`).

We will explore LangChain with open LLMs; everything we do can be easily transferred to proprietary models.

### Access to open models via Groq

We will use the [Groq](https://groq.com/) API to access a state-of-the-art open model, namely Meta's **Llama 3.1 8B** model. It is a small model in the Llama 3.1 family and thus not as powerful as its larger 'cousins' (with 70B or 405B parameters), but it should be fine for the introductory examples. Model details can be found [here](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md).

Groq is a company specializing in accelerating inference for pre-trained AI models, such as transformer-based models, using specialized hardware and software. It provides a platform for running AI models quickly and efficiently, and it offers a certain number of calls free of charge. The exact limits are prone to change, so they are not stated here; see [this page](https://console.groq.com/docs/rate-limits) for the current numbers. You can sign up [here](https://console.groq.com/keys) for a (free) API key, which is required to use Groq's API.

My Groq API key is stored in the **Colab Secrets**, which is a recommended way of securely storing access tokens and API keys. To learn how to store keys as Secrets and how to access them afterwards, see, for example, [this short article](https://labs.thinktecture.com/secrets-in-google-colab-the-new-way-to-protect-api-keys/).

In [None]:
import os
from google.colab import userdata

os.environ["GROQ_API_KEY"] = userdata.get('GROQ_API_KEY')
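
If the notebook is run outside Colab, `google.colab.userdata` is not available. The sketch below shows one possible fallback, assuming the key is either already set as an environment variable or can be typed in interactively. The helper name `load_api_key` and its `ask` parameter are made up for this example.

```python
import os
from getpass import getpass


def load_api_key(name: str, ask=getpass) -> str:
    """Return the key from the environment, asking the user only if it is missing."""
    key = os.environ.get(name)
    if not key:
        # getpass hides the typed value, so the key does not show up in the cell output.
        key = ask(f"Enter {name}: ")
        os.environ[name] = key
    return key
```

Calling `load_api_key("GROQ_API_KEY")` would then either reuse the existing environment variable or prompt for the key once and store it for the rest of the session.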

The Groq API can be accessed via LangChain, which allows for easy integration of an LLM into an overall application workflow. To make use of this integration, we first need to install the `langchain_groq` library:

In [None]:
!pip -q install langchain_groq

### Set up a Llama model and create simple chains

Next, we create a Llama 3.1 based inference engine, via Groq, by instantiating the `ChatGroq` class with a model supported by Groq (see the full list [here](https://console.groq.com/docs/models)):

In [None]:
from langchain_groq import ChatGroq

creative_llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.9, # setting high temperature to "foster" creativity of the model
    )

See the [ChatGroq API](https://reference.langchain.com/python/integrations/langchain_groq/?h=chatgroq#langchain_groq.ChatGroq) for details on the constructor call, the available methods, and the response formats.

Note that we are using a Llama model that was fine-tuned for chat, which means that the interaction with the model is typically structured as follows:
```
System message : You are a helpful and kind assistant that helps users make their travel plans.
Your suggestions are concise and to the point.

Human: I like big bustling cities, where should I go?

AI: You should go to Sydney, Australia

Human: Sounds good. What should I do when I'm there?

AI:
```
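
The same conversation can be written down as a list of (role, content) pairs, which is the shape the prompt templates later in this notebook are built from. The sketch below is plain Python for illustration only; the `render` helper and its role labels are assumptions made for this example, not part of any library.

```python
# The conversation above as a list of (role, content) pairs.
conversation = [
    ("system", "You are a helpful and kind assistant that helps users make their travel plans. "
               "Your suggestions are concise and to the point."),
    ("human", "I like big bustling cities, where should I go?"),
    ("ai", "You should go to Sydney, Australia"),
    ("human", "Sounds good. What should I do when I'm there?"),
]


def render(messages):
    """Render the pairs as a transcript that ends where the model is expected to continue."""
    labels = {"system": "System message", "human": "Human", "ai": "AI"}
    lines = [f"{labels[role]}: {content}" for role, content in messages]
    lines.append("AI:")  # the model's next turn starts here
    return "\n\n".join(lines)


print(render(conversation))
```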

In [None]:
DEFAULT_SYSTEM_PROMPT = """
You are a helpful, respectful and honest assistant. Always answer as helpfully as possible, while never including any harmful, unethical, or illegal content.
If a question does not make any sense, or is not factually coherent, explain why instead of giving an incorrect answer.
If you don't know the answer to a question, just say "I do not know".
"""

In [None]:
from langchain_core.prompts import ChatPromptTemplate

prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", DEFAULT_SYSTEM_PROMPT),
        ("human", "{question}"),
    ]
)

In [None]:
cookie_name_question = "What would be a good name for a company that makes and sells hand-made healthy cookies?"

print(prompt_template.format(question=cookie_name_question))

In [None]:
creative_question_chain = prompt_template | creative_llm

response = creative_question_chain.invoke({"question": cookie_name_question})

In [None]:
type(response)

Let's explore the response in more detail:

In [None]:
print(response.content)

In [None]:
from pprint import pprint

pprint(response.response_metadata)

While not all of these metadata elements are relevant to us now, it is useful to keep track of the tokens exchanged, since token counts determine the cost of a call or, in our case, how much of the free quota is used.

In [None]:
def get_tokens_used(groq_response):
  token_usage = groq_response.response_metadata['token_usage']
  return {
      'prompt_tokens':token_usage['prompt_tokens'],
      'completion_tokens':token_usage['completion_tokens'],
      'total_tokens':token_usage['total_tokens']
  }

get_tokens_used(response)
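
Since the quota is consumed over many calls, it can help to add up the counts across several responses. The following is a minimal sketch, assuming each response exposes `response_metadata['token_usage']` with the three keys used above. The `fake_responses` objects and their numbers are made up so that the example runs without an API call.

```python
from collections import Counter
from types import SimpleNamespace


def total_tokens_used(responses):
    """Sum the token counts over several responses."""
    totals = Counter()
    for response in responses:
        usage = response.response_metadata.get("token_usage", {})
        for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
            totals[key] += usage.get(key, 0)
    return dict(totals)


# Stand-in responses with made-up token counts (not real API output).
fake_responses = [
    SimpleNamespace(response_metadata={"token_usage": {
        "prompt_tokens": 90, "completion_tokens": 40, "total_tokens": 130}}),
    SimpleNamespace(response_metadata={"token_usage": {
        "prompt_tokens": 60, "completion_tokens": 25, "total_tokens": 85}}),
]

print(total_tokens_used(fake_responses))
```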

To make the interaction with an LLM more specific to the task at hand, we can alter the system prompt and provide more task-specific guidance in the user message:

In [None]:
branding_system_msg = """
You are a highly creative assistant with plenty of original ideas. You especially excel at inventing catchy brand and product promotional messages.
"""

branding_question = """
What would be a good name for a company that makes and sells {product}? Suggest three distinct names.
Structure the results as a python dictionary with the company name as the key and an explanation for the suggested name as the value.
Return only this dictionary and nothing else.
"""

branding_prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", branding_system_msg),
        ("human", branding_question),
    ]
)

Build a simple chain using the prompt template and LLM created above:

In [None]:
branding_chain = branding_prompt_template | creative_llm

branding_response = branding_chain.invoke({'product': 'colorful running gear'})

In [None]:
print(branding_response.content)

In [None]:
get_tokens_used(branding_response)
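
The model's reply is still a plain string, even when it looks like a Python dictionary. One way to turn such a string into an actual `dict` is `ast.literal_eval`, which only evaluates literals and does not execute arbitrary code. The sketch below assumes the reply contains exactly one dictionary, possibly surrounded by extra text. The helper name and the sample reply are invented for illustration.

```python
import ast


def parse_dict_reply(text: str) -> dict:
    """Turn a reply containing a Python dictionary literal into a dict."""
    # Keep only the part between the first "{" and the last "}", which drops
    # code fences or stray sentences the model may add around the dictionary.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("No dictionary found in the reply")
    parsed = ast.literal_eval(text[start:end + 1])
    if not isinstance(parsed, dict):
        raise ValueError("The reply did not contain a dictionary")
    return parsed


# Made-up reply, shaped the way the prompt above asks for.
sample_reply = 'Here you go:\n{"HueStride": "Combines hue with stride."}'
print(parse_dict_reply(sample_reply))
```

Because LLM output is not guaranteed to follow the requested format, the call is best wrapped in a `try`/`except` when used on real responses.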

We can also ask the LLM to generate its output in JSON format and then use LangChain's `JsonOutputParser` to parse the content:

In [None]:
from langchain_core.output_parsers import JsonOutputParser

branding_question_json = """
What would be a good name for a company that makes and sells {product}? Suggest three distinct names.
Structure the results as a JSON list of dictionaries, where each dictionary has the company name as its key and an explanation for the suggested name as its value.
Return only this JSON list and nothing else.
"""

branding_prompt_template_json = ChatPromptTemplate.from_messages(
    [
        ("system", branding_system_msg),
        ("human", branding_question_json),
    ]
)

branding_chain_json = branding_prompt_template_json | creative_llm | JsonOutputParser()

In [None]:
branding_json = branding_chain_json.invoke({'product': 'colorful running gear'})

In [None]:
from pprint import pprint

pprint(branding_json)

In [None]:
type(branding_json)
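
Once parsed, the result is an ordinary Python object and can be processed like any other. As a small sketch, assuming the model followed the requested shape (a list of single-entry dictionaries), the suggestions can be merged into one name-to-explanation dictionary. The sample data below is made up, and `flatten_suggestions` is a hypothetical helper.

```python
def flatten_suggestions(parsed):
    """Merge a list of single-entry dictionaries into one name -> explanation dict."""
    if not isinstance(parsed, list):
        raise TypeError("Expected a list of dictionaries")
    merged = {}
    for entry in parsed:
        if not isinstance(entry, dict):
            raise TypeError("Expected each list item to be a dictionary")
        merged.update(entry)
    return merged


# Hypothetical parser output, shaped as the prompt above requests.
sample_branding = [
    {"HueStride": "Pairs colour (hue) with running (stride)."},
    {"PrismPace": "Suggests a spectrum of colours at speed."},
    {"VividMiles": "Bright gear for long distances."},
]

names = flatten_suggestions(sample_branding)
for name, why in names.items():
    print(f"{name}: {why}")
```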

### Simple chain sequencing

We can also explore a simple sequential chain:

In [None]:
from langchain_core.output_parsers import StrOutputParser

# Chain 1
chain1_prompt = ChatPromptTemplate.from_messages([
    ("system", branding_system_msg),
    ("human", "Please suggest a catchy name for a company that makes and sells {product}. Return just the company name and nothing else.")
])

chain1 = chain1_prompt | creative_llm | StrOutputParser()

In [None]:
# Chain 2
chain2_prompt = ChatPromptTemplate.from_messages([
    ("system", branding_system_msg),
    ("human", "Please write a short promotional message for the following company: {company_name}.")
])

chain_seq = {"company_name": chain1} | chain2_prompt | creative_llm | StrOutputParser()

company_desc = chain_seq.invoke({"product":"hand-made healthy cookies"})

Note that in the `chain_seq` code above, the dictionary `{"company_name": chain1}` means that `chain1` is run first and its output is passed to the next step under the key `company_name`, which fills the `{company_name}` placeholder in `chain2_prompt`.
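
The role of that dictionary can be mimicked in plain Python. In the sketch below, every function is a made-up stand-in (no LLM is called): `run_parallel` runs each value of the dictionary on the same input and collects the results under the corresponding keys, which is roughly what happens to `{"company_name": chain1}` inside the chain.

```python
def run_parallel(steps: dict, inputs: dict) -> dict:
    """Run every callable in `steps` on the same inputs and collect the results by key."""
    return {key: step(inputs) for key, step in steps.items()}


def fake_chain1(inputs: dict) -> str:
    # Stand-in for chain1: returns a "company name" without calling a model.
    return f"Best {inputs['product']}"


def fake_chain2_prompt(values: dict) -> str:
    # Stand-in for chain2_prompt: fills the {company_name} placeholder.
    return ("Please write a short promotional message for the following company: "
            f"{values['company_name']}.")


values = run_parallel({"company_name": fake_chain1}, {"product": "hand-made healthy cookies"})
prompt_text = fake_chain2_prompt(values)
print(values)
print(prompt_text)
```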

In [None]:
print(company_desc)

### Answering factual questions

We can also ask the LLM some factual questions. For that, we will create a new model instance with the temperature set to zero, since a lower temperature makes the output less random (less "playful") and more repeatable. The code below also switches to the larger `llama-3.3-70b-versatile` model.


In [None]:
factual_llm = ChatGroq(
    model="llama-3.3-70b-versatile", #"llama-3.1-8b-instant"
    temperature=0
)
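
To see why temperature matters, recall that a model picks the next token from a probability distribution, and temperature is commonly described as a divisor applied to the token scores (logits) before the softmax. The sketch below illustrates that textbook formulation with made-up scores. Treating a temperature of zero as "always pick the top token" is an assumption about the limit case, not a statement about how any particular provider implements it.

```python
import math


def token_probabilities(logits, temperature):
    """Softmax over logits scaled by 1 / temperature."""
    if temperature == 0:
        # Limit case: all probability mass goes to the highest-scoring token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtracting the maximum keeps exp() numerically stable
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    return [w / total for w in weights]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
for t in (0.9, 0.2, 0):
    print(t, [round(p, 3) for p in token_probabilities(logits, t)])
```

The lower the temperature, the more the distribution concentrates on the top-scoring token, which is why low temperatures give more repeatable answers.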

In [None]:
questions = [
    "What is the capital of Serbia?",
    "What is the largest city in Europe in terms of population?",
    "What movie got the highest number of the U.S. Academy Awards (aka Oscars)?",
    "Where will the next Winter Olympic Games be held?"
]

In [None]:
factual_questions_system_msg = """
Please answer the user's questions by giving just a direct, factual response and include a short explanation for each response.
If you do not know the answer, respond with 'I don't know'.
"""

questions_str = "\n".join([f"{i+1}) {q}" for i, q in enumerate(questions)])

In [None]:
factual_questions_prompt = ChatPromptTemplate.from_messages([
    ("system", factual_questions_system_msg),
    ("human", "QUESTIONS:\n{questions}")
])

print(factual_questions_prompt.format(questions=questions_str))

In [None]:
factual_questions_chain = factual_questions_prompt | factual_llm

factual_responses = factual_questions_chain.invoke({'questions': questions_str})

In [None]:
print(factual_responses.content)

In [None]:
get_tokens_used(factual_responses)

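We will now challenge the LLM with questions it should not be able to answer reliably, since they concern current facts that may have changed after the model's training data was collected: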

In [None]:
factual_responses2 = factual_questions_chain.invoke({'questions': "Who is the current president of the USA?\nWho is currently the best tennis player according to the ATP list?"})
print(factual_responses2.content)

### Instantiating and using proprietary LLMs: an OpenAI example

Let's try the same tasks with OpenAI's models. LangChain offers extensive support for working with OpenAI models, available through the `langchain_openai` package. For an overview, check [this page of the LangChain documentation](https://python.langchain.com/docs/integrations/providers/openai/).

We'll start by installing additional libraries and setting up a few additional prerequisites:

In [None]:
!pip install -q langchain-core openai langchain-openai

To use OpenAI's generative models, we need an API key, which can be obtained by signing up for an account on [OpenAI's page for developers](https://platform.openai.com/docs/overview).

**Note**: Obtaining an OpenAI API key is free of charge. It is the running of their models that is charged, so you can open an account and obtain the API key for free. However, you won't be able to run a model (or any LangChain component that integrates an OpenAI model) until you deposit some money in your OpenAI account. You may also want to check their [pricing scheme](https://platform.openai.com/docs/pricing) to get a better understanding of how they charge for API calls.
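
As a rough illustration of how token counts translate into cost, the sketch below assumes prices are quoted per one million tokens, with separate rates for input (prompt) and output (completion) tokens. The prices used are placeholders, not OpenAI's actual rates; always take the current values from the pricing page.

```python
def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the cost of one call from its token counts and per-million-token prices."""
    return (prompt_tokens * input_price_per_million
            + completion_tokens * output_price_per_million) / 1_000_000


# Placeholder prices (NOT real ones): 1.0 per million input tokens, 2.0 per million output tokens.
cost = estimate_cost(1200, 300, input_price_per_million=1.0, output_price_per_million=2.0)
print(f"Estimated cost: {cost:.4f}")
```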

Once you have your OpenAI API key, you can add it to the Colab Secrets or enter it in some other privacy-protecting way. Here, we use Colab Secrets to retrieve the API key:

In [None]:
import os
from google.colab import userdata

os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')

As above, we will access OpenAI's models via LangChain's ChatModel abstraction, since chat (= back and forth interaction with the user) is the interaction mode typical for the latest generation of LLMs. More precisely, this means that we will be using the [`ChatOpenAI` class](https://python.langchain.com/docs/integrations/chat/openai/) from the `langchain_openai` module.

As for the specific OpenAI model, we will use **gpt-4o-mini**, since we will be working with text only and that model offers a favourable price / performance ratio (as can be seen [here](https://platform.openai.com/docs/pricing)).

In [None]:
from langchain_openai import ChatOpenAI

openai_creative = ChatOpenAI(model='gpt-4o-mini', temperature=0.9)

We'll use the same simple questions as in the Llama 3.1 example above:

In [None]:
branding_system_msg = """You are a highly creative assistant with plenty of original ideas. You especially excel at inventing catchy brand and product promotional messages."""
branding_question = """What would be a good name for a company that makes and sells {product}? Suggest three distinct names."""

messages = [
    ("system", branding_system_msg),
    ("human", branding_question)
]


branding_prompt_template = ChatPromptTemplate.from_messages(messages)

branding_openai_chain = branding_prompt_template | openai_creative

branding_openai_response = branding_openai_chain.invoke({'product':'colorful running gear'})

In [None]:
type(branding_openai_response)

The response from OpenAI's GPT-4o-mini model has the same structure as the one we saw above for the Llama 3.1 model, that is, it includes the `content` and `response_metadata` components. The difference is in the metadata component, which is more complex in the case of GPT-4o-mini as it can work with multiple modalities.

In [None]:
print(branding_openai_response.content)

In [None]:
from pprint import pprint

pprint(branding_openai_response.response_metadata)

Let's now see how GPT-4o-mini responds to factual questions.
For this task, we will set its temperature to zero, which reduces randomness in the output, decreasing the chance of hallucinations and making the results easier to replicate.

In [None]:
openai_factual = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.0
)

In [None]:
questions = [
    "What is the capital of Serbia?",
    "What is the largest city in Europe in terms of population?",
    "What movie got the highest number of the U.S. Academy Awards (aka Oscars)?",
    "Where will the next Winter Olympic Games be held?",
    "What is the longest commercial flight and between which cities?"
]

questions_str = "\n".join([f"{i+1}) {q}" for i, q in enumerate(questions)])

factual_system_msg = """
Please answer the questions given below, by giving a direct response followed by a short explanation. If you do not know the answer, respond with 'I don't know'.
"""

In [None]:
factual_questions_prompt = ChatPromptTemplate.from_messages([
    ("system", factual_system_msg),
    ("human", "QUESTIONS:\n{questions}")
])

factual_openai_chain = factual_questions_prompt | openai_factual | StrOutputParser()

factual_openai_responses = factual_openai_chain.invoke({"questions":questions_str})

In [None]:
print(factual_openai_responses)

In [None]:
difficult_factual = factual_openai_chain.invoke({'questions': "Who is the current president of the USA?\nWho is currently the best tennis player according to the ATP list?"})
print(difficult_factual)

### Choosing an LLM

The pace of large language model development and deprecation is extremely fast. Furthermore, there are many different LLMs, both open and proprietary. So, a question that naturally arises is how to choose a model for a particular LLM-based application.

While there is no easy and straightforward answer to that question, what one can do is to:
* Consult the [Chatbot Arena](https://lmarena.ai/leaderboard/), a well-known leaderboard that includes both open and proprietary models
* Consult the leaderboards maintained by Vellum.ai ([a few leaderboards](https://www.vellum.ai/llm-leaderboard)): in addition to the main one, there is one for open models, one for coding models, as well as an option for direct comparison of two selected models
* Keep track of models offered by [OpenAI](https://platform.openai.com/docs/models/overview) and [Google](https://deepmind.google/models/), as the major providers of proprietary models
* Read online articles (from credible sources) that compare current LLMs