<a href="https://colab.research.google.com/github/yassineselmi/langchain-workshop/blob/main/lab/01_langchain_intro.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Intro to LangChain

LangChain is a popular framework that allows users to quickly build apps and pipelines around **L**arge **L**anguage **M**odels. It can be used for chatbots, **G**enerative **Q**uestion-**A**nswering (GQA), summarization, and much more.

The core idea of the library is that we can _"chain"_ together different components to create more advanced use-cases around LLMs. Chains may consist of multiple components from several modules:

* **Prompt templates**: Prompt templates are, well, templates for different types of prompts, such as "chatbot"-style templates, ELI5 question-answering, etc.

* **LLMs**: Large language models like GPT-3, BLOOM, etc.

* **Agents**: Agents use LLMs to decide what actions should be taken. They can call tools like web search or calculators, and all of this is packaged into a logical loop of operations.

* **Memory**: Short-term memory and long-term memory.
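To make the "chaining" idea concrete, here is a minimal plain-Python sketch that does not use LangChain itself. The names `fill_template`, `fake_llm`, and `run_chain` are hypothetical stand-ins invented for illustration. The sketch only shows the rough shape of a simple chain, where a prompt-template step feeds its output into a model step.

```python
from typing import Callable


def fill_template(template: str, **variables: str) -> str:
    """Prompt-template step: substitute variables into the template."""
    return template.format(**variables)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; it only echoes the prompt."""
    return f"[model output for: {prompt}]"


def run_chain(template: str, llm: Callable[[str], str], **variables: str) -> str:
    """Chain the two steps: build the prompt, then pass it to the model."""
    prompt = fill_template(template, **variables)
    return llm(prompt)


result = run_chain(
    "Answer briefly: {question}",
    fake_llm,
    question="What is LangChain?",
)
print(result)
```

Swapping `fake_llm` for a function that calls a real model would leave `run_chain` unchanged, which is the appeal of composing components this way.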

In [None]:
!pip install -qU langchain

# Using LLMs in LangChain

LangChain supports several LLM providers, like Hugging Face and OpenAI.

Let's start our exploration of LangChain by learning how to use a few of these different LLM integrations.

## Hugging Face

We first need to install additional prerequisite libraries:

In [None]:
!pip install -qU huggingface_hub

For Hugging Face models we need a Hugging Face Hub API token. To get one, create an account at [HuggingFace.co](https://huggingface.co/), click your profile in the top-right corner, then click *Settings* > *Access Tokens* > *New Token*, set *Role* to *write*, click *Generate*, and copy and paste the token below:

In [None]:
import os
from getpass import getpass

HUGGINGFACEHUB_API_TOKEN = getpass("Huggingface API Token: ")

os.environ['HUGGINGFACEHUB_API_TOKEN'] = HUGGINGFACEHUB_API_TOKEN
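Since the token is handed over through an environment variable, it can help to fail early when that variable is missing or blank. The helper below is a small sketch using only the standard library. `require_env` and the variable name `DEMO_API_TOKEN` are hypothetical and are not part of LangChain or Hugging Face Hub.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# Demonstration with a throwaway variable (hypothetical name and value)
os.environ["DEMO_API_TOKEN"] = "hf_example"
print(require_env("DEMO_API_TOKEN"))
```

In this notebook, the same check could be run as `require_env("HUGGINGFACEHUB_API_TOKEN")` right after the cell above.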

We can then generate text with a Hugging Face Hub model (we'll use `NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO`) through the Inference API built into Hugging Face Hub.

_(The default Inference API doesn't use specialized hardware and so can be slow and cannot run larger models like `bigscience/bloom-560m` or `google/flan-t5-xxl`)_

In [None]:
# import from the submodules; the top-level `from langchain import ...` shortcuts are deprecated
from langchain.chains import LLMChain
from langchain_core.prompts import PromptTemplate
from langchain_community.llms import HuggingFaceHub

# initialize HF LLM
llm = HuggingFaceHub(
    repo_id="NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO",
    model_kwargs={"temperature": 1, "max_length": 4096}
)


# build prompt template for simple question-answering
template = """
{question}
"""

prompt = PromptTemplate(template=template, input_variables=["question"])

llm_chain = LLMChain(
    prompt=prompt,
    llm=llm
)

question = "Write a poem about open-source projects."
answer = llm_chain.run(question)
print(answer)
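It can be useful to see exactly what string the model receives. The sketch below uses plain `str.format` rather than `PromptTemplate`, on the assumption that LangChain's default f-string style formatting behaves the same way for a single simple placeholder like `{question}`.

```python
# Same template and question as in the cell above
template = """
{question}
"""

question = "Write a poem about open-source projects."

# Fill the {question} placeholder with plain string formatting
rendered = template.format(question=question)

# repr() makes the leading and trailing newlines visible
print(repr(rendered))
```

Note that the triple-quoted template adds a newline before and after the question, so the prompt sent to the model is not just the bare question text.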

## OpenAI

Start by installing additional prerequisites:

In [None]:
!pip install -qU langchain-openai

We can also use OpenAI's generative models. The process is similar: we need to
provide our API key, which can be retrieved by signing up for an account on the
[OpenAI website](https://openai.com/api/) (see top-right of page). We then pass the API key below:

In [None]:
from getpass import getpass
OPENAI_API_KEY = getpass("OpenAI API key: ")


Then we decide which model we'd like to use. There are several options, but we will go with `gpt-3.5-turbo`:

In [None]:
from langchain_openai import ChatOpenAI

gpt35_llm = ChatOpenAI(
    openai_api_key=OPENAI_API_KEY,
    temperature=0,
    model='gpt-3.5-turbo'
)

We'll use the same simple question-answer prompt template as in the Hugging Face example. The only change is that we now pass our OpenAI LLM `gpt35_llm`:

In [None]:
llm_chain = LLMChain(
    prompt=prompt,
    llm=gpt35_llm
)

print(llm_chain.invoke(question))
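Unlike `run`, which returns a string, `invoke` on an `LLMChain` is expected to return a dictionary containing the inputs plus the generated text, so the `print` above shows the whole dictionary. The sketch below assumes the generated text sits under the key `text`. The `example_result` dictionary is hand-written to mimic that shape and is not real model output, and `extract_text` is a hypothetical helper.

```python
from typing import Any, Dict, Union


def extract_text(result: Union[str, Dict[str, Any]], key: str = "text") -> str:
    """Return the generated text from either a plain string or a result dict."""
    if isinstance(result, str):
        return result
    return str(result[key])


# Hand-written example mimicking the assumed shape of an invoke() result
example_result = {
    "question": "Write a poem about open-source projects.",
    "text": "Code shared in the open...",
}
print(extract_text(example_result))
```

With a real chain, `extract_text(llm_chain.invoke(question))` would print only the answer, provided the output key matches the assumption above.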

## Ollama

[Ollama](https://ollama.com/) allows you to run open-source large language models, such as Llama 2 and Mistral, locally.

Ollama bundles model weights, configuration, and data into a single package, defined by a Modelfile.

It optimizes setup and configuration details, including GPU usage.

For a complete list of supported models and model variants, see the [Ollama model library](https://ollama.com/library).

Let's see how we can use a model served by Ollama.

In [None]:
from langchain_community.llms import Ollama

ollama_base_url = input("Ollama base URL: ")

ollama_llm = Ollama(model="mistral", base_url=ollama_base_url)
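Because the base URL is typed in by hand, a little normalization can avoid avoidable connection errors. The helper below is a sketch using only the standard library. `normalize_base_url` is a hypothetical name, and the fallback `http://localhost:11434` is Ollama's default local address, which is an assumption about your setup if the server runs elsewhere.

```python
DEFAULT_OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def normalize_base_url(raw: str, default: str = DEFAULT_OLLAMA_URL) -> str:
    """Clean up a user-supplied base URL, falling back to a default if it is empty."""
    url = raw.strip()
    if not url:
        return default
    # Add a scheme if the user typed only host:port
    if "://" not in url:
        url = "http://" + url
    # Drop trailing slashes so paths can be appended consistently
    return url.rstrip("/")


print(normalize_base_url(""))
print(normalize_base_url(" localhost:11434/ "))
```

The cleaned value could then be passed as `base_url` when constructing the `Ollama` object.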

We will now use `ollama_llm` in a new `LLMChain`, with the same prompt as before.

In [None]:
llm_chain = LLMChain(
    prompt=prompt,
    llm=ollama_llm
)

print(llm_chain.invoke(question))

---