# LangChain
LangChain is a widely used LLM orchestration library for building systems that have one or more LLM components. The library is, for better or worse, extremely popular and changes rapidly with new developments in the field. As a result, somebody can have a lot of experience with some parts of LangChain while having little to no familiarity with others, either because there are so many features or because an area is new and its features were only recently implemented.

This experiment will be using the **LangChain Expression Language (LCEL)** to ramp up from basic chain specification to more advanced dialog management practices, so hopefully the journey will be enjoyable and even seasoned LangChain developers might learn something new!

> <img src="https://dli-lms.s3.amazonaws.com/assets/s-fx-15-v1/imgs/langchain-diagram.png" width=400px/>

### **Environment Setup:**

In [8]:
## Necessary for Colab, not necessary for course environment
%pip install -q langchain langchain-nvidia-ai-endpoints gradio

import os
os.environ["NVIDIA_API_KEY"] = "nvapi-..."  ## Replace with your own API key; never commit a real key


## Chains and Runnables

When exploring a new library, it's important to identify its core abstractions and how they are used.

In LangChain, the main building block *used to be* the classic **Chain**: a small module of functionality that does something specific and can be linked up with other chains to make a system. For all intents and purposes, it is a "building-block system" abstraction where the building blocks are easy to create, have consistent methods (`invoke`, `generate`, `stream`, etc.), and can be linked up to work together as a system. Some example legacy chains include `LLMChain`, `ConversationChain`, `TransformChain`, `SequentialChain`, etc.

More recently, a new recommended specification has emerged that is significantly easier to work with and extremely compact: the **LangChain Expression Language (LCEL)**. This format relies on a different kind of primitive, the **Runnable**, which is essentially an object that wraps a function. Add two conveniences, namely that dictionaries are implicitly converted to Runnables and that a **pipe `|`** operator creates a Runnable that passes data from left to right (i.e. `fn1 | fn2` is itself a Runnable), and you have a simple way to specify complex logic!
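To build intuition for these two ideas, here is a toy sketch in plain Python. It is not LangChain's implementation: `ToyRunnable` and `coerce` are hypothetical names invented for this illustration, and the real `Runnable` classes do much more (streaming, batching, async execution, configuration).

```python
class ToyRunnable:
    """Hypothetical stand-in for a Runnable: wraps a function and supports `|`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, x):
        return self.fn(x)

    def __or__(self, other):
        # `self | other`: run self first, then feed its output into other
        other = coerce(other)
        return ToyRunnable(lambda x: other.invoke(self.invoke(x)))

    def __ror__(self, other):
        # `other | self` where `other` is a plain dict or function
        return coerce(other) | self


def coerce(obj):
    """Turn dicts and plain callables into ToyRunnables (illustrative only)."""
    if isinstance(obj, ToyRunnable):
        return obj
    if isinstance(obj, dict):
        # A dict becomes a runnable that sends the same input to every branch
        branches = {key: coerce(value) for key, value in obj.items()}
        return ToyRunnable(lambda x: {k: r.invoke(x) for k, r in branches.items()})
    if callable(obj):
        return ToyRunnable(obj)
    raise TypeError(f"Cannot coerce {type(obj).__name__} to ToyRunnable")


add_one = ToyRunnable(lambda x: x + 1)
double = ToyRunnable(lambda x: x * 2)

# (3 + 1) * 2 = 8, then the dict fans the value 8 out to both branches
toy_chain = add_one | double | {"value": lambda x: x, "squared": lambda x: x * x}
toy_result = toy_chain.invoke(3)
print(toy_result)  # {'value': 8, 'squared': 64}
```

The point of the sketch is only that `|` builds a new object of the same kind, which is why pipelines of any length can themselves be piped further.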

Here are a few representative Runnables, created via the `RunnableLambda` class:

In [10]:
from langchain_core.runnables import RunnableLambda, RunnablePassthrough
from functools import partial

## Very simple "take input and return it"
identity = RunnableLambda(lambda x: x)  ## RunnablePassthrough() would work too

## Given an arbitrary function, you can make a runnable with it
def print_and_return(x, preface=""):
    print(f"{preface}{x}")
    return x

rprint0 = RunnableLambda(print_and_return)

## You can also pre-fill some of the arguments using functools.partial
rprint1 = RunnableLambda(partial(print_and_return, preface="1: "))

## And you can use the same idea to make your own custom Runnable generator
def RPrint(preface=""):
    return RunnableLambda(partial(print_and_return, preface=preface))


## Chaining two runnables
chain1 = identity | rprint0
chain1.invoke("Hello!")
print()

## Chaining chain1 together with further runnables
output = (
    chain1           ## Prints "Welcome Home!" & passes "Welcome Home!" onward
    | rprint1        ## Prints "1: Welcome Home!" & passes "Welcome Home!" onward
    | RPrint("2: ")  ## Prints "2: Welcome Home!" & passes "Welcome Home!" onward
).invoke("Welcome Home!")

## Final Output Is Preserved As "Welcome Home!"
print("\nOutput:", output)


Hello!

Welcome Home!
1: Welcome Home!
2: Welcome Home!

Output: Welcome Home!


## Dictionary Pipelines with Chat Models

There's a lot you can do with runnables, but it's important to formalize some best practices. At the moment, it's easiest to use *dictionaries* as our default variable containers for a few key reasons:

**Passing dictionaries helps us keep track of our variables by name.**

Since dictionaries let us propagate named variables (values referenced by keys), they make it explicit what each chain component expects as input and produces as output.

**LangChain prompts expect dictionaries of values.**

It's quite intuitive to specify an LLM chain in LCEL that takes in a dictionary and produces a string, and equally easy to wrap that string back up into a dictionary. This is very intentional and is partially due to the above reason.

<br>

### **Example 1:** A Simple LLM Chain

One of the most fundamental components of classical LangChain is the `LLMChain` that accepts a **prompt** and an **LLM**:

- A prompt, usually retrieved from a call like `PromptTemplate.from_template("string with {key1} and {key2}")`, specifies a template for creating a string as output. A dictionary `{"key1" : 1, "key2" : 2}` could be passed in to get the output `"string with 1 and 2"`.
    - For chat models like `ChatNVIDIA`, you would use `ChatPromptTemplate.from_messages` instead.
- An LLM takes in a string and returns a generated string.
    - Chat models like `ChatNVIDIA` work with messages instead, but it's the same idea! Using a **`StrOutputParser`** at the end will extract the content from the message.

The following is a lightweight example of a simple chat chain as described above. All it does is take in an input dictionary and use it to fill in the prompt: a system message specifies the overall meta-objective, and a user message carries the query for the model.

In [11]:
from langchain_nvidia_ai_endpoints import ChatNVIDIA
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate

## Simple Chat Pipeline
chat_llm = ChatNVIDIA(model="meta/llama3-8b-instruct")

prompt = ChatPromptTemplate.from_messages([
    ("system", "Only respond in rhymes"),
    ("user", "{input}")
])

rhyme_chain = prompt | chat_llm | StrOutputParser()

print(rhyme_chain.invoke({"input" : "Tell me about dogs!"}))

Dogs are sweet, with tails so neat,
They bring joy to our feet, and a trick to repeat.
With fur so soft and eyes so bright,
They're loving friends, day and night.

Their barks are loud, their snuggles tight,
They chase their tails with furry delight.
From big to small, from old to new,
Dogs bring love, and that's what they do!


Beyond calling the chain directly in code, we can also use a [**Gradio interface**](https://www.gradio.app/guides/creating-a-chatbot-fast) to play around with our model. Gradio is a popular tool that provides simple building blocks for creating custom generative AI interfaces! The example below shows how to build a simple Gradio chat interface around this particular chain:

In [None]:
import gradio as gr

## Streaming Interface

def rhyme_chat_stream(message, history):
    ## This is a generator function, where each call will yield the next entry
    buffer = ""
    for token in rhyme_chain.stream({"input" : message}):
        buffer += token
        yield buffer

## Launch the interface (comment these lines out if you are not ready to try it)
demo = gr.ChatInterface(rhyme_chat_stream).queue()
window_kwargs = {} # or {"server_name": "0.0.0.0", "root_path": "/7860/"}
demo.launch(share=True, debug=True, **window_kwargs)

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
Running on public URL: https://3cf816ad6c28d8925e.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from Terminal to deploy to Spaces (https://huggingface.co/spaces)
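The handler above yields the running `buffer` rather than each token on its own. As far as the Gradio chatbot guide describes, a generator passed to `gr.ChatInterface` should yield the full message so far at each step, and the interface redraws the reply with every yielded value. The plain-Python sketch below isolates that accumulation pattern. `fake_token_stream` is a hypothetical stand-in for `rhyme_chain.stream(...)`, so no model or Gradio install is needed to run it.

```python
def fake_token_stream(text):
    """Hypothetical stand-in for rhyme_chain.stream(...): yields word-sized chunks."""
    for word in text.split(" "):
        yield word + " "


def accumulate(chunks):
    """Yield the growing message after each chunk, as rhyme_chat_stream does."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        yield buffer


partials = list(accumulate(fake_token_stream("Dogs are sweet")))
for partial in partials:
    print(repr(partial))
## 'Dogs '
## 'Dogs are '
## 'Dogs are sweet '
```

Yielding only the latest chunk instead would leave the chat window showing one fragment at a time rather than a message that grows.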
