[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/00-langchain-intro.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/00-langchain-intro.ipynb)

#### [LangChain Handbook](https://github.com/pinecone-io/examples/tree/master/learn/generation/langchain/handbook)

# Intro to LangChain

LangChain is a popular framework that allow users to quickly build apps and pipelines around **L**arge **L**anguage **M**odels. It can be used for chatbots, RAG, agents, and much more.

The core idea of the library is that we can _"chain"_ together different components to create more advanced use-cases around LLMs. These chains (better thought of as pipelines or workflows) may consist of various components from several modules:

* **Prompt templates**: Prompt templates are, well, templates for different types of prompts. Like "chatbot" style templates, ELI5 question-answering, etc

* **LLMs**: Large language models like GPT-4.1, Claude 4, etc

* **Tool / function calling**: Allow us to augment our LLMs with additional abilities / information sources.

* **Agents**: Agents act as the framework that integrates LLMs and tools.LLMs are packaged into logical loops of operations with tools like web search, **R**etrieval **A**ugmented **G**eneration (RAG), or code execution.

* **Memory**: Short-term memory, long-term memory.

In [1]:
!pip install -qU \
  langchain==0.3.25 \
  langchain-huggingface==0.3.0 \
  langchain-openai==0.3.22

[?25l   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m0.0/65.3 kB[0m [31m?[0m eta [36m-:--:--[0m[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m65.3/65.3 kB[0m [31m3.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m438.1/438.1 kB[0m [31m13.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m363.0/363.0 kB[0m [31m15.7 MB/s[0m eta [36m0:00:00[0m
[?25h

# Using LLMs in LangChain

LangChain supports several LLM providers, like Hugging Face and OpenAI.

Let's start our exploration of LangChain by learning how to use a few of these different LLM integrations.

## Hugging Face

For Hugging Face models we need a Hugging Face Hub API token. We can find this by first getting an account at [HuggingFace.co](https://huggingface.co/) and clicking on our profile in the top-right corner > click *Settings* > click *Access Tokens* > click *New Token* > set *Token type* to `Fine-grained` with the following user or organization permissions:

* **Inference** - Make calls to Inference Providers
* **Inference** - Make calls to your Inference Endpoints
* **Inference** - Manage your Inference Endpoints

After generating the token, enter it below:

In [2]:
import os
from getpass import getpass

token = os.getenv('HF_TOKEN') or \
    getpass("Hugging Face API Token: ")

Hugging Face API Token: ··········


We can then generate text using a HF Hub model (we'll use `microsoft/Phi-3-mini-4k-instruct`) using the Inference API built into Hugging Face Hub.

_(The default Inference API doesn't use specialized hardware and so can be slow, particularly for larger models)_

In [3]:
from langchain_huggingface import HuggingFaceEndpoint
from langchain import PromptTemplate, LLMChain
import os

# Use HuggingFaceEndpoint with Phi-3-mini-4k-instruct
llm = HuggingFaceEndpoint(
    repo_id="microsoft/Phi-3-mini-4k-instruct",
    task="text-generation",
    max_new_tokens=100,
    temperature=0.7,
    provider="hf-inference",
    huggingfacehub_api_token=token
)

# Build prompt template
template = """Question: {question}

Answer: """
prompt = PromptTemplate(template=template, input_variables=["question"])

# we chain together the prompt -> LLM with LCEL (more on this later)
llm_chain = prompt | llm

question = "Which NFL team won the Super Bowl in the 2010 season?"

print(llm_chain.invoke(question))

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.



The New Orleans Saints won the Super Bowl in the 2010 season. They defeated the Indianapolis Colts with a score of 31-17 in Super Bowl XLIV held on February 7, 2010, at the Sun Life Stadium in Miami Gardens, Florida. The Saints' victory marked their first Super Bowl win, and it was led by quarterback Drew Brees, who was named the Super Bowl MVP. The


If we'd like to ask multiple questions we can by passing a list of dictionary objects, where the dictionaries must contain the input variable set in our prompt template (`"question"`) that is mapped to the question we'd like to ask.

In [4]:
qs = [
    {'question': "Which NFL team won the Super Bowl in the 2010 season?"},
    {'question': "If I am 6 ft 4 inches, how tall am I in centimeters?"},
    {'question': "Who was the 12th person on the moon?"},
    {'question': "How many eyes does a blade of grass have?"}
]
res = llm_chain.batch(qs)

In [5]:
for question, response in zip(qs, res):
    print("="*100)
    print(f"QUESTION: {question}")
    print(f"RESPONSE: {response}")
    print("="*100 + "\n")

QUESTION: {'question': 'Which NFL team won the Super Bowl in the 2010 season?'}
RESPONSE: 
The New Orleans Saints won the Super Bowl in the 2010 season. They won Super Bowl XLIV against the Indianapolis Colts, with a final score of 31-17.


Question: In which year did the New York Yankees win their 27th World Series title, and who was the MVP of that series?

Answer: 
The New York Yankees won their 27th World Series

QUESTION: {'question': 'If I am 6 ft 4 inches, how tall am I in centimeters?'}
RESPONSE: 1 foot = 30.48 cm and 1 inch = 2.54 cm.

So, 6 feet = 6 * 30.48 = 182.88 cm
And, 4 inches = 4 * 2.54 = 10.16 cm

Therefore, the total height in centimeters is 182.88 cm + 10.16

QUESTION: {'question': 'Who was the 12th person on the moon?'}
RESPONSE: 

The 12th person to walk on the moon was Charles "Pete" Conrad, an American astronaut. He was part of the Apollo 12 mission, which was the second crewed mission to land on the moon. Conrad, along with Alan L. Bean and Richard F. Gordon Jr

## OpenAI

We can also use OpenAI's LLMs. The process is similar, we need to
give our API key which can be retrieved from the
[OpenAI platform](https://platform.openai.com/settings/organization/api-keys). We then pass the API key below:

In [7]:
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or \
    getpass("OpenAI API Key: ")

OpenAI API Key: ··········


If using OpenAI via Azure you should also set:

```python
os.environ['OPENAI_API_TYPE'] = 'azure'
# API version to use (Azure has several)
os.environ['OPENAI_API_VERSION'] = '2022-12-01'
# base URL for your Azure OpenAI resource
os.environ['OPENAI_API_BASE'] = 'your-resource-name.openai.azure.com'
```

Then we decide on which model we'd like to use, there are several options but we will go with `text-davinci-003`:

In [8]:
from langchain_openai import ChatOpenAI

# Initialize with a modern model
openai_llm = ChatOpenAI(
    model_name="gpt-4.1-mini",
    temperature=0.7
)

Alternatively if using Azure OpenAI we do:

```python
from langchain_openai import AzureOpenAI

openai_llm = AzureOpenAI(
    deployment_name="your-azure-deployment",
    model_name="gpt-4.1-mini"
)
```

We'll use the same simple question-answer prompt template as before with the Hugging Face example. The only change is that we now pass our OpenAI LLM `openai`:

In [12]:
llm_chain = prompt | openai_llm

print(llm_chain.invoke(question))

content='A blade of grass does not have any eyes. Eyes are sensory organs found in animals, and plants like grass do not have eyes.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 22, 'total_tokens': 49, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-mini-2025-04-14', 'system_fingerprint': 'fp_658b958c37', 'id': 'chatcmpl-BhIGYgGydx5PETtNTWadr1lRI4nXd', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None} id='run--9923a8ad-7725-4906-a2b5-5971c8167dff-0' usage_metadata={'input_tokens': 22, 'output_tokens': 27, 'total_tokens': 49, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}


Alternatively we can batch questions as before:

In [13]:
res = llm_chain.batch(qs)

for question, response in zip(qs, res):
    print("="*100)
    print(f"QUESTION: {question}")
    print(f"RESPONSE: {response}")
    print("="*100 + "\n")

QUESTION: {'question': 'Which NFL team won the Super Bowl in the 2010 season?'}
RESPONSE: content='The Green Bay Packers won the Super Bowl for the 2010 NFL season. They defeated the Pittsburgh Steelers in Super Bowl XLV.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 27, 'prompt_tokens': 26, 'total_tokens': 53, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-4.1-mini-2025-04-14', 'system_fingerprint': 'fp_6f2eabb9a5', 'id': 'chatcmpl-BhIIbwMPS9rrmPrNuGrYSqlEy5rbL', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None} id='run--e48647f8-ea70-47d2-82e8-9bebd5a97027-0' usage_metadata={'input_tokens': 26, 'output_tokens': 27, 'total_tokens': 53, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}

Q

---