# LangChain: The LLM Application Framework

LangChain is an open-source library that provides a standardized, structured interface for building and integrating the components of an LLM application. It is model-agnostic, so it works with models from multiple LLM providers, including OpenAI, Hugging Face, and others.

With LangChain, we can build reusable components and link them together ("like a chain") into complex, multi-step LLM-based applications clearly and succinctly.

You can learn about the different [LangChain components here](https://python.langchain.com/v0.2/docs/concepts/#components).

This tutorial focuses on a few LangChain components and on `chaining`, one of its most powerful features.

## [Prompt templates](https://python.langchain.com/v0.2/docs/concepts/#prompt-templates)

Prompt templates are reusable templates for the prompts fed as input to an LLM.
They let us design parameterized prompts that take multiple inputs.

Below is an example of how to use a prompt template.

In [1]:
import langchain

In [2]:
from langchain_core.prompts import PromptTemplate
from llama_cpp import Llama
from ssec_tutorials import OLMO_MODEL
from ssec_tutorials.scipy_conf import *  # Contains helper methods for tutorial

### [String PromptTemplates](https://python.langchain.com/v0.2/docs/concepts/#string-prompttemplates)

String prompt templates format a single string.

In [3]:
prompt_template = PromptTemplate.from_template(
    "{planet_name} in the solar system is the "
)

prompt_template.format(planet_name="Mars")

'Mars in the solar system is the '
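
To make "parameterized" concrete, here is a minimal, library-free sketch of the same idea using only Python's standard library. The helper names `template_variables` and `render` are hypothetical and are not part of LangChain; the sketch only illustrates how a template can discover its placeholders and fill them in.

```python
from string import Formatter


def template_variables(template: str) -> list[str]:
    """Return the placeholder names found in a str.format-style template."""
    return [name for _, name, _, _ in Formatter().parse(template) if name]


def render(template: str, **values: str) -> str:
    """Fill the template, failing loudly if a placeholder has no value."""
    missing = [name for name in template_variables(template) if name not in values]
    if missing:
        raise KeyError(f"missing template variables: {missing}")
    return template.format(**values)


template = "{planet_name} in the solar system is the "
variables = template_variables(template)      # placeholders the template expects
rendered = render(template, planet_name="Mars")
print(variables)
print(rendered)
```

The same template string can be rendered again with a different `planet_name`, which is what makes it reusable.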

In [4]:
olmo = Llama(model_path=str(OLMO_MODEL), verbose=False)

In [5]:
model_response = olmo(
    prompt=prompt_template.format(planet_name="Mars"),
    temperature=0.2,
    max_tokens=8,
    echo=True,
)  # Generate a completion, can also call olmo.create_completion

In [6]:
print(parse_text_generation_response(model_response))

Mars in the solar system is the 
* fourth planet from the Sun.


In [7]:
# Another example
prompt_template = PromptTemplate.from_template(
    "{entity_1} of the planet {entity_2} is "
)
prompt_template.format(entity_1="Size", entity_2="Earth")

'Size of the planet Earth is '

In [8]:
model_response = olmo(
    prompt=prompt_template.format(entity_1="Size", entity_2="Earth"),
    temperature=0.2,
    echo=True,
)

In [9]:
print(parse_text_generation_response(model_response))

Size of the planet Earth is 
5,147 km in diameter. The diameter of the Moon is approximately 1


## LLM Interface

LangChain provides a standardized interface for loading LLMs. With this interface, we can call the same methods (such as `invoke`) on models from different providers, which makes our code reusable.
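
The benefit of a shared interface can be sketched without any library. In the example below, `TextModel`, `EchoModel`, and `ShoutModel` are hypothetical stand-ins for models from different providers, not LangChain classes; the point is only that calling code written against one method works unchanged with any of them.

```python
from typing import Protocol


class TextModel(Protocol):
    """Anything with an `invoke(prompt) -> str` method counts as a model here."""

    def invoke(self, prompt: str) -> str: ...


class EchoModel:
    """Hypothetical stand-in for a model from one provider."""

    def invoke(self, prompt: str) -> str:
        return f"echo: {prompt}"


class ShoutModel:
    """Hypothetical stand-in for a model from another provider."""

    def invoke(self, prompt: str) -> str:
        return prompt.upper()


def ask(model: TextModel, question: str) -> str:
    # The caller depends only on the shared method, not on the concrete class.
    return model.invoke(question)


answers = [ask(model, "hello") for model in (EchoModel(), ShoutModel())]
print(answers)
```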

Loading the model via [LangChain's LlamaCpp](https://python.langchain.com/v0.1/docs/integrations/llms/llamacpp/) abstraction enables us to use the `chaining` feature.

In [10]:
from langchain_community.llms import LlamaCpp

In [11]:
olmo = LlamaCpp(
    model_path=str(OLMO_MODEL),
    temperature=0.8,
    # stop=["."],
    verbose=False,
)

In [12]:
# Create a prompt template using OLMo's tokenizer chat template we saw in module 1.
prompt_template = PromptTemplate.from_template(
    template=olmo.client.metadata["tokenizer.chat_template"], template_format="jinja2"
)

In [13]:
# This is how you can format the prompt message
prompt_template.format(
    messages=[
        {
            "role": "user",
            "content": "You are a helpful assistant. Tell me a joke about cats",
        }
    ],
    add_generation_prompt=True,
    eos_token="<|endoftext|>",
)

'<|endoftext|>\n\n<|user|>\nYou are a helpful assistant. Tell me a joke about cats\n\n\n<|assistant|>\n\n'

## Chain in LangChain

Chaining allows us to combine multiple components, as described above, in series or parallel to develop a multi-step LLM pipeline.
As shown in the image below, any number of components can be linked together to form a chain.

![LangChain Chain](../../images/langchain-chain.webp)


Image Source: [www.analyticsvidhya.com](https://www.analyticsvidhya.com/blog/2023/10/a-comprehensive-guide-to-using-chains-in-langchain/)

Internally, a chain like the one in the image works as follows:

STEP 1: A dictionary is passed as input to the prompt template.  
STEP 2: The prompt template fills in its variables to produce the prompt text, for example "What are stars and moon?"  
STEP 3: The prompt is given as input to the LLM.  
STEP 4: The LLM produces its output.  
STEP 5: The output passes through `StrOutputParser`, which converts it to a string and returns the result.  
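
The five steps can be traced by hand with plain functions. This is only a sketch under stated assumptions: `format_prompt`, `fake_llm`, and `parse_to_string` are hypothetical stand-ins, and `fake_llm` returns a canned dictionary instead of calling a real model.

```python
def format_prompt(inputs: dict) -> str:
    # STEPS 1-2: the dictionary fills the template variables.
    return "What are {first} and {second}?".format(**inputs)


def fake_llm(prompt: str) -> dict:
    # STEPS 3-4: a stand-in model that wraps its "answer" in a raw response.
    return {"text": f"(model answer to: {prompt})"}


def parse_to_string(raw: dict) -> str:
    # STEP 5: the parser extracts a plain string from the raw response.
    return raw["text"]


inputs = {"first": "stars", "second": "moon"}
prompt = format_prompt(inputs)
raw = fake_llm(prompt)
result = parse_to_string(raw)
print(prompt)
print(result)
```

Each step consumes exactly what the previous step produced, which is what allows the steps to be linked into a chain.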

We can build a chain with the pipe operator ("|"), which is part of [LCEL (LangChain Expression Language)](https://python.langchain.com/v0.2/docs/concepts/#langchain-expression-language-lcel). The pipe operator arranges the components in sequence, as in the image above.
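
In Python, `|` can be given custom behavior by defining `__or__`. The sketch below shows the general technique with a hypothetical `Step` class; it is a simplified illustration of pipe-style composition, not LangChain's actual implementation, which is considerably richer.

```python
from typing import Any, Callable


class Step:
    """Wrap a function so that steps can be composed with the | operator."""

    def __init__(self, func: Callable[[Any], Any]) -> None:
        self.func = func

    def invoke(self, value: Any) -> Any:
        return self.func(value)

    def __or__(self, other: "Step") -> "Step":
        # `a | b` builds a new step that runs a first, then feeds its output to b.
        return Step(lambda value: other.invoke(self.invoke(value)))


make_prompt = Step(lambda inputs: "Tell me about {topic}".format(**inputs))
fake_model = Step(lambda prompt: f"[reply to '{prompt}']")  # stand-in for an LLM
strip_brackets = Step(lambda text: text.strip("[]"))        # stand-in for a parser

chain = make_prompt | fake_model | strip_brackets
output = chain.invoke({"topic": "cats"})
print(output)
```

Because `a | b` returns another `Step`, chains of any length can be built the same way.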

In [14]:
llm_chain = prompt_template | olmo

In [15]:
# Construct the prompt as expected by OLMo
llm_chain.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": "You are a helpful assistant. Tell me a joke about cats",
            }
        ],
        "add_generation_prompt": True,
        "eos_token": "<|endoftext|>",
    }
)

" Sure, here's a classic cat-themed joke for you:\n\n\n\nWhy don't cats play poker in the jungle? Too many cheetahs! 😜🐱😂🦁 #joke #funnycats #cheetahpokerface 👉🎢 [Link to an article about funny cat pictures instead, if that's more appropriate based on the current context.]"

Instead of passing `add_generation_prompt` and `eos_token` on every call to `llm_chain`, we can fix their values once in our `prompt_template`.

In [16]:
# Create a prompt template using OLMo's tokenizer chat template we saw in module 1, but this time use partial variables.
prompt_template = PromptTemplate.from_template(
    template=olmo.client.metadata["tokenizer.chat_template"],
    template_format="jinja2",
    partial_variables={"add_generation_prompt": True, "eos_token": "<|endoftext|>"},
)
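
Partial variables play a role similar to `functools.partial` in plain Python: some arguments are fixed once, so later calls supply only the rest. The sketch below assumes a hypothetical `render_chat` function that is a simplified stand-in for a chat template, not OLMo's actual template.

```python
from functools import partial


def render_chat(messages: list, add_generation_prompt: bool, eos_token: str) -> str:
    """Simplified stand-in for a chat template (not OLMo's real one)."""
    text = eos_token
    for message in messages:
        text += f"<|{message['role']}|>\n{message['content']}\n"
    if add_generation_prompt:
        text += "<|assistant|>\n"
    return text


# Fix the two arguments that never change between calls.
render_olmo_chat = partial(
    render_chat, add_generation_prompt=True, eos_token="<|endoftext|>"
)

# Callers now only need to supply the messages.
prompt = render_olmo_chat(
    messages=[{"role": "user", "content": "Tell me a joke about cats"}]
)
print(repr(prompt))
```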

In [17]:
llm_chain = prompt_template | olmo

Let's stream the output as it is generated instead of waiting for OLMo to finish before displaying the text. We can use [Callbacks](https://python.langchain.com/v0.2/docs/concepts/#callbacks) to subscribe to various events in our LLM application pipeline. See [this page](https://python.langchain.com/v0.1/docs/modules/callbacks/#callback-handlers) for a list of events.
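
The callback idea can be sketched without any library: the generator notifies every registered handler each time a new token is available. The names `TokenHandler`, `on_new_token`, and `fake_generate` are hypothetical and only approximate the pattern; LangChain's own handlers expose events such as `on_llm_new_token`.

```python
from typing import Iterable, List


class TokenHandler:
    """Hypothetical base class: subclasses react to each new token."""

    def on_new_token(self, token: str) -> None:
        pass


class PrintHandler(TokenHandler):
    """Print each token as soon as it arrives (streaming to stdout)."""

    def on_new_token(self, token: str) -> None:
        print(token, end="", flush=True)


class CollectHandler(TokenHandler):
    """Store tokens, for example for logging."""

    def __init__(self) -> None:
        self.tokens: List[str] = []

    def on_new_token(self, token: str) -> None:
        self.tokens.append(token)


def fake_generate(tokens: Iterable[str], callbacks: List[TokenHandler]) -> str:
    # Stand-in for a model: emits canned tokens and notifies every handler.
    pieces = []
    for token in tokens:
        for handler in callbacks:
            handler.on_new_token(token)
        pieces.append(token)
    return "".join(pieces)


collector = CollectHandler()
full_text = fake_generate(["Why ", "don't ", "cats..."], [PrintHandler(), collector])
```

In this sketch the text is printed token by token while it is generated and is also returned in full at the end, so in a notebook a streamed response typically appears twice: once as streamed output and once as the return value.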

In [18]:
from langchain_core.callbacks import StreamingStdOutCallbackHandler

In [19]:
llm_chain.invoke(
    {
        "messages": [
            {
                "role": "user",
                "content": "You are a helpful assistant. Tell me a joke about cats",
            }
        ]
    },
    config={"callbacks": [StreamingStdOutCallbackHandler()]},
)

 Why don't cats play poker in the forest? Too many squirrels!

I hope that made you smile. Do you need any more information about cats or jokes? I'm here to help.

" Why don't cats play poker in the forest? Too many squirrels!\n\nI hope that made you smile. Do you need any more information about cats or jokes? I'm here to help."

We will cover more LangChain concepts in upcoming notebooks. 