The model component in LangChain provides the interface for interacting with language models and embedding models.

It abstracts away the differences between LLMs, chat models, and embedding models from various providers and exposes a uniform interface to all of them. This makes it easier to build applications that rely on AI-generated text, text embeddings for similarity search, and retrieval-augmented generation (RAG).

Models
1. Language Models (Input: text -> Output: text)
2. Embedding Models (Input: text -> Output: vectors/embeddings)
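
To make the "uniform interface" idea concrete, here is a toy sketch in plain Python. The classes `UppercaseModel` and `ReverseModel` and the `TextModel` protocol are hypothetical stand-ins, not LangChain classes; the only thing they share with the real wrappers used later in this notebook is an `invoke` method that takes a prompt.

```python
from typing import Protocol


class TextModel(Protocol):
    """Anything with an invoke(prompt) -> str method (hypothetical interface)."""

    def invoke(self, prompt: str) -> str: ...


class UppercaseModel:
    """Toy stand-in for one provider's model."""

    def invoke(self, prompt: str) -> str:
        return prompt.upper()


class ReverseModel:
    """Toy stand-in for a different provider's model."""

    def invoke(self, prompt: str) -> str:
        return prompt[::-1]


def ask(model: TextModel, prompt: str) -> str:
    # The caller never needs to know which backend it is talking to.
    return model.invoke(prompt)


answers = [ask(m, "hello") for m in (UppercaseModel(), ReverseModel())]
print(answers)
```

Because both toy classes expose the same method, swapping one for the other requires no change in `ask`. That is the property the real model wrappers aim for.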

## Language Models

Language Models are AI systems designed to process, generate and understand natural language text.

There are two types of language models:

1. LLMs: general-purpose models that take plain text as input and return plain text. They can be used for any NLP application (e.g. summarization, question answering). This is the older interface and is used less often now.
2. Chat Models: LLMs tuned specifically for conversational tasks. They take a sequence of messages as input and return a chat message as output (as opposed to plain text). These are the newer models and are used far more often than plain LLMs.

| Feature        | LLMs (Base Models) | Chat Models (Instruction-Tuned)      |
|-------------|-----|-------------|
| Purpose      | Free-form text generation  | Optimized for multi-turn conversation   |
| Training Data      | General text corpora (books, articles)  | Fine-tuned on chat datasets (dialogues, user-assistant conversations) |
| Memory & Context     | No built-in memory | Supports structured conversation history     |
| Role Awareness       | No understanding of user and assistant roles | Understands 'system', 'user', and 'assistant' roles  |
| Example Models       | GPT-3, Llama-2-7B, Mistral-7B  | GPT-4, GPT-3.5-turbo, Llama-2-Chat, Mistral-Instruct, Claude      |
| Use Cases     | Text generation, summarization, translation, creative writing, code generation  | Conversational AI, chatbots, virtual assistants, customer support, AI tutors       |
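
The difference in input shape can be illustrated without calling any API. In this sketch a base LLM receives one plain string, while a chat model receives a list of role-tagged messages. `render_as_prompt` is a hypothetical helper, and the `Role: text` layout it produces is only an illustration, not the template any particular model uses.

```python
# A base LLM takes a single string.
llm_input = "Translate 'good morning' to French."

# A chat model takes a sequence of messages, each tagged with a role.
chat_input = [
    {"role": "system", "content": "You are a concise translator."},
    {"role": "user", "content": "Translate 'good morning' to French."},
]


def render_as_prompt(messages):
    """Flatten role-tagged messages into one string (illustrative layout only)."""
    lines = [f"{m['role'].capitalize()}: {m['content']}" for m in messages]
    lines.append("Assistant:")  # cue for the model to continue as the assistant
    return "\n".join(lines)


print(render_as_prompt(chat_input))
```

The role tags are what give chat models their "role awareness": the system message sets behaviour, and the user message carries the actual request.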







In [4]:
!pip install -q google-generativeai
!pip install -q langchain-huggingface
!pip install -q transformers
!pip install -q huggingface-hub
!pip install -q numpy
!pip install -q scikit-learn

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain 0.3.27 requires langchain-core<1.0.0,>=0.3.72, but you have langchain-core 1.0.1 which is incompatible.

In [5]:
import langchain
langchain.__version__

'0.3.27'

In [8]:
!pip install langchain_openai

Collecting langchain_openai
  Downloading langchain_openai-1.0.1-py3-none-any.whl.metadata (1.8 kB)
Downloading langchain_openai-1.0.1-py3-none-any.whl (81 kB)
Installing collected packages: langchain_openai
Successfully installed langchain_openai-1.0.1


In [6]:
import os
from getpass import getpass

os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

Enter your OpenAI API key: ··········


In [9]:
from langchain_openai import OpenAI

In [15]:
llm = OpenAI(model='gpt-3.5-turbo-instruct')
result = llm.invoke('Hey! How are you?')
print(result)



I am doing well, thank you. How about you?


## Chat Models

In [18]:
from langchain_openai import ChatOpenAI

In [19]:
model = ChatOpenAI(model='gpt-4')
result = model.invoke("What is the capital of India?")
print(result)


content='The capital of India is New Delhi.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 14, 'total_tokens': 22, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4-0613', 'system_fingerprint': None, 'id': 'chatcmpl-CUSASiDzlaoYPvgtE7bTfLOaeEuuG', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None} id='lc_run--ef5a5d8f-e40e-47c2-b28b-20ca34e87e3c-0' usage_metadata={'input_tokens': 14, 'output_tokens': 8, 'total_tokens': 22, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}


Notice that the output of a chat model is structured differently from that of a plain LLM. The response text is in `content`, and the object also carries a lot of metadata (token usage, model name, finish reason, and so on).

In [20]:
print(result.content)

The capital of India is New Delhi.


**Temperature**: A parameter that controls the randomness of a language model's output. It affects how creative or deterministic the responses are.



*   Lower Values (0.0 - 0.3) -> More deterministic and predictable
*   Higher Values (0.7 - 1.5) -> More random, creative and diverse


| Use case        | Recommended Temperature |
|:------------|:---:|
| Factual answers (math, code, facts)   | 0.0-0.3 |
| Balanced response (general QA, explanations)       | 0.5-0.7  |
| Creative Writing, storytelling, jokes      | 0.9-1.2 |
| Maximum Randomness - brainstorming     | 1.5+|
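
One common way to implement temperature is to divide the model's raw scores (logits) by the temperature before applying softmax. The sketch below uses made-up logits for three candidate tokens to show how the distribution sharpens or flattens. It assumes this standard formulation and says nothing about how any specific provider implements sampling.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by 1/temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens

for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all of the probability goes to the top-scoring token, while at a high temperature the three tokens end up much closer together. A temperature of exactly 0 cannot be plugged into this formula; it is usually treated as always picking the highest-scoring token.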


In [24]:
model = ChatOpenAI(model='gpt-4', temperature=0)
result = model.invoke("Write a sonnet on the topic 'Indian Politicians' ")
print(result.content)

Upon the stage of India's vast expanse,
In robes of power, politicians stand,
Their words like arrows, in the dance,
Of democracy, they command.

They promise progress, peace, and prosperity,
In speeches woven with eloquence and grace,
Yet often, their actions lack sincerity,
As corruption's shadow mars their face.

Yet, among them, some true leaders rise,
With hearts aflame for their nation's cause,
Their vision clear, they cut through lies,
To uphold the constitution and its laws.

Indian politicians, a varied lot,
In the crucible of power, their character is wrought.


In [25]:
model = ChatOpenAI(model='gpt-4', temperature=1.2)
result = model.invoke("Write a sonnet on the topic 'Indian Politicians' ")
print(result.content)

From Ganges' plains to Himalayan heights,
Across varied cultures, hues and sights,
Lies an arena flashed in the nation’s limelight,
Aiye, let me carve in verse, our leader's hypothetical scout.

Indian politicians, bearers of our colorful outspread,
Blending measures in saffron, white, green thread.
They vow in parties’ garb and in people’s name,
Host the stage, in a blaring democracy's game.

A tough joust of words by day's tumultuous reign,
Yet hushed bills and pacts in the night's quiet lane.
Charisma draped in promises ever so hollow,
People reach for change, for a better tomorrow.

Yet, in every verdict lies a dormant hope anew,
In this myriad realm, they remain our chosen few.


The output becomes noticeably more varied and creative as the temperature increases.

**Max Completion Tokens**: Sets an upper limit on the number of output tokens.
Since billing is per token, setting this upper limit helps control cost.

In [26]:
model = ChatOpenAI(model='gpt-4', temperature=1.2, max_completion_tokens=50)
result = model.invoke("Write a sonnet on the topic 'Indian Politicians' ")
print(result.content)

In chambers of power their markers do rest,
Indian politicians in peacock grand vest.
A country of colors so fervently traced,
Their choices eternal, in history encased.

Phantom of ambition awash in folklore,
Playing the game known even to


Because the token limit was reached, the model stopped before completing the sonnet.
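
A truncated reply can be detected programmatically. OpenAI chat responses report a `finish_reason`, which appears in `response_metadata` in the GPT-4 output shown earlier (`'stop'` for a normal ending); a value of `'length'` indicates the token limit was hit. The helper below is a hypothetical convenience that operates on that dictionary, so it can be tried without an API call.

```python
def was_truncated(response_metadata):
    """Return True when generation stopped because the token limit was reached."""
    return response_metadata.get("finish_reason") == "length"


# Metadata fragments in the shape shown in the earlier output.
complete = {"finish_reason": "stop", "model_name": "gpt-4-0613"}
cut_off = {"finish_reason": "length", "model_name": "gpt-4-0613"}

print(was_truncated(complete), was_truncated(cut_off))
```

With a real response, the check would be `was_truncated(result.response_metadata)`.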

## Claude

In [27]:
ANTHROPIC_API_KEY = ''


In [30]:
!pip install -q langchain_anthropic

In [31]:
from langchain_anthropic import ChatAnthropic

In [37]:
# No Anthropic credentials are configured here, so this call fails with the error below.
model_claude = ChatAnthropic(model='claude-3-5-sonnet-20241022')
result = model_claude.invoke('Whats the capital of France?')
print(result.content)


TypeError: "Could not resolve authentication method. Expected either api_key or auth_token to be set. Or for one of the `X-Api-Key` or `Authorization` headers to be explicitly omitted"
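
The error above comes from the missing credentials: assigning an empty string to a Python variable named `ANTHROPIC_API_KEY` does not configure the client. A sketch of one way to handle this, mirroring the `getpass` pattern used for OpenAI earlier, is shown below. `ensure_api_key` is a hypothetical helper; it assumes the client reads the key from the `ANTHROPIC_API_KEY` environment variable.

```python
import os
from getpass import getpass


def ensure_api_key(env_var):
    """Return the key stored in env_var, prompting for it only if it is unset."""
    if not os.environ.get(env_var):
        os.environ[env_var] = getpass(f"Enter value for {env_var}: ")
    return os.environ[env_var]


# Usage (prompts interactively when the variable is not already set):
# ensure_api_key("ANTHROPIC_API_KEY")
# model_claude = ChatAnthropic(model='claude-3-5-sonnet-20241022')
```

Prompting only when the variable is unset means the cell can be re-run without typing the key again.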

## Gemini

In [39]:
import os
from getpass import getpass

os.environ["GOOGLE_API_KEY"] = getpass("Enter your Google API key: ")


Enter your Google API key: ··········


In [41]:
!pip install langchain_google_genai

Collecting langchain_google_genai
  Downloading langchain_google_genai-3.0.0-py3-none-any.whl.metadata (7.1 kB)
Collecting google-ai-generativelanguage<1.0.0,>=0.7.0 (from langchain_google_genai)
  Downloading google_ai_generativelanguage-0.9.0-py3-none-any.whl.metadata (10 kB)
Collecting filetype<2.0.0,>=1.2.0 (from langchain_google_genai)
  Downloading filetype-1.2.0-py2.py3-none-any.whl.metadata (6.5 kB)
Downloading langchain_google_genai-3.0.0-py3-none-any.whl (57 kB)
Downloading filetype-1.2.0-py2.py3-none-any.whl (19 kB)
Downloading google_ai_generativelanguage-0.9.0-py3-none-any.whl (1.4 MB)
Installing collected packages: filetype, google-ai-generativelanguage, langchain_google_genai
  Attempting uninstall: google-ai-generativelanguage


In [44]:
from langchain_google_genai import ChatGoogleGenerativeAI
model_gemini = ChatGoogleGenerativeAI(model='gemini-1.5-pro')
result = model_gemini.invoke('What is the capital of France?')
print(result.content)

The capital of France is Paris.


## Open Source Models

Open-source language models are freely available AI models that can be downloaded, modified, fine-tuned, and deployed without depending on a central provider (subject to each model's license). Unlike closed-source models such as OpenAI's GPT-4, Anthropic's Claude, or Google's Gemini, open-source models allow full control and customization.

| Feature      | Open Source Models | Closed Source Models     |
|:------------|:---:|------------:|
| Cost     | Free to use (no API cost) | Paid API Usage  |
| Control     | Can modify, fine-tune, and deploy anywhere  | Locked to provider's infrastructure   |
| Data Privacy  | Runs locally (no data sent to external servers)  | Sends queries to provider's servers     |
| Customization   | Can fine-tune on specific datasets  | No access to fine-tuning in most cases      |
| Deployment     | Can be deployed on on-premise servers or cloud | Must use vendor's API   |


| Model       | Developer | Parameters       | Best Use Case     |
|:------------|:---:|------------:|------------:|
| LLaMA-2-7B/13B/70B     | Meta AI  | 7B-70B | General-purpose text generation |
| Mixtral-8x7B      | Mistral AI  | 8x7B (MoE)      | Efficient and fast responses |



Hugging Face: the largest repository of open-source models.

There are two ways to use it:
1. Using the HF Inference API
2. Running the model locally

Disadvantages of open-source models

| Disadvantage       | Details |
|:------------|:---:|
| High Hardware Requirements     | Running large models (e.g. LLaMA-2-70B) requires expensive GPUs |
| Setup Complexity     | Requires installation of dependencies like PyTorch, CUDA, Transformers |
| Lack of RLHF   | Many open-source models are not fine-tuned with human feedback, making them weaker at instruction following |
| Limited Multimodal Abilities  | Open models have limited support for images, audio, or video |


In [51]:

os.environ["HUGGINGFACEHUB_API_TOKEN"] = getpass("Enter your Hugging Face API key: ")

Enter your Hugging Face API key: ··········


In [48]:
!pip install langchain-huggingface



In [52]:
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint


In [55]:
# Accessing the model via API endpoint
llm = HuggingFaceEndpoint(
    repo_id='mistralai/Mistral-7B-Instruct-v0.3',
    task = 'text-generation'
)

model = ChatHuggingFace(llm=llm)
result = model.invoke('What is the capital of France?')
print(result.content)

 The capital of France is Paris. It is one of the most famous cities in the world, known for its rich history, art, culture, and landmarks such as the Eiffel Tower, Louvre Museum, and Notre-Dame Cathedral. Paris is also the political, economic, and cultural center of France.


### Downloading the model and using it locally

In [56]:
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline


In [57]:
llm = HuggingFacePipeline.from_model_id(
    model_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    task = 'text-generation',
    pipeline_kwargs= dict(
        temperature = 0.5,
        max_new_tokens = 100
    )

)
model = ChatHuggingFace(llm=llm)


Device set to use cpu


In [58]:
result = model.invoke("What is the capital of France?")
print(result.content)

<|user|>
What is the capital of France?</s>
<|assistant|>
The capital of France is Paris.
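
The local output above echoes the prompt along with the chat-template markers (`<|user|>`, `</s>`, `<|assistant|>`). If only the answer is wanted, one option is to post-process the string. The helper below is a hypothetical sketch that assumes the TinyLlama-style `<|assistant|>` marker seen in this output; other models use different templates.

```python
ASSISTANT_MARKER = "<|assistant|>"  # marker observed in the output above


def extract_answer(raw_text, marker=ASSISTANT_MARKER):
    """Return the text after the last assistant marker, or the input if absent."""
    _, found, tail = raw_text.rpartition(marker)
    answer = tail if found else raw_text
    return answer.replace("</s>", "").strip()


raw = (
    "<|user|>\n"
    "What is the capital of France?</s>\n"
    "<|assistant|>\n"
    "The capital of France is Paris."
)
print(extract_answer(raw))
```

Splitting on the last marker keeps the helper usable when the prompt itself contains earlier assistant turns.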


## Embedding Models

In [59]:
from langchain_openai import OpenAIEmbeddings

In [61]:
# The default dimension for the large model is 3072, but we keep it at 32 here to save time and cost
embedding_obj = OpenAIEmbeddings(model='text-embedding-3-large', dimensions=32)
embedding = embedding_obj.embed_query("Paris is the capital of France.")
print(str(embedding))



[-0.06186187267303467, 0.4578055441379547, -0.07168781012296677, 0.3645283579826355, -0.22530454397201538, -0.09362316876649857, 0.13265012204647064, -0.1238621398806572, -0.12746037542819977, -0.20648303627967834, -0.03819659352302551, 0.10144239664077759, 0.038404181599617004, -0.03316253051161766, -0.0022272695787250996, -0.09659863263368607, -0.09389995783567429, 0.14088453352451324, -0.06847015768289566, 0.15527744591236115, 0.005808200221508741, -0.14780420064926147, -0.4298500716686249, -0.13085101544857025, -0.048783693462610245, 0.19596512615680695, 0.09583746641874313, 0.08801823854446411, 0.01526652742177248, 0.17977309226989746, 0.31775137782096863, -0.0330241359770298]


Let's now generate embeddings for a list of documents, and not just a single query.

In [62]:
embedding_obj = OpenAIEmbeddings(model='text-embedding-3-large', dimensions=32)
documents = [
    "Delhi is the capital of India.",
    "Ranchi is the capital of Jharkhand.",
    "Patna. is the capital of Bihar"
]

result = embedding_obj.embed_documents(documents)
print(str(result))

[[-0.07267331331968307, 0.19324959814548492, -0.012327593751251698, 0.43522101640701294, -0.1720617413520813, 0.05143427848815918, -0.14166177809238434, 0.05793393403291702, -0.1444254070520401, -0.1371580809354782, -0.11832442879676819, 0.1397169977426529, 0.00717137148603797, -0.21024081110954285, -0.22252362966537476, -0.10614397376775742, -0.19273780286312103, 0.18833646178245544, 0.0856214389204979, -0.2296885997056961, -0.0029779423493891954, -0.03459658846259117, -0.08188541978597641, -0.04396223649382591, -0.16336141526699066, 0.4794391393661499, 0.2843471169471741, -0.050026874989271164, -0.021891554817557335, 0.021686840802431107, 0.21515394747257233, 0.06484301388263702], [0.062414295971393585, 0.14284509420394897, -0.07086489349603653, 0.33596479892730713, -0.21585480868816376, 0.06219981238245964, -0.04675710201263428, 0.0035845323000103235, -0.2889502942562103, 0.1706419736146927, -0.13821227848529816, 0.031593214720487595, -0.2141389399766922, -0.04133070260286331, -0.27

In [64]:
from langchain_huggingface import HuggingFaceEmbeddings
embedding_obj = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')

text = 'Paris is the capital of France.'
embedding = embedding_obj.embed_query(text)
print(str(embedding))

[0.10070480406284332, 0.037152860313653946, 0.02822548896074295, -0.041667670011520386, 0.0671916976571083, -0.047735545784235, 0.009368831291794777, -0.010811394080519676, 0.021422477439045906, 0.02683991938829422, -0.014043694362044334, -0.07074397057294846, 0.006867417600005865, -0.06559625267982483, -0.039673712104558945, -0.12240415811538696, -0.009188590571284294, -0.019201550632715225, 0.04009631276130676, 0.017608467489480972, -0.0028437036089599133, -0.034690938889980316, 0.07244998961687088, 0.01213846355676651, -0.04890365153551102, -0.0017426727572456002, -0.06909015029668808, -0.018630987033247948, -0.019689982756972313, 0.010831773281097412, 0.06691782176494598, -0.01983049511909485, -0.06030932813882828, -0.001621779752895236, -0.04283401370048523, -0.03296345844864845, -0.0006109338719397783, -0.02035326510667801, 0.026338139548897743, 0.04921884089708328, 0.002372274873778224, -0.04640386998653412, -0.05580700933933258, -0.02362196147441864, -0.025786254554986954, 0.02

In [65]:
documents = [
    "Delhi is the capital of India.",
    "Ranchi is the capital of Jharkhand.",
    "Patna. is the capital of Bihar"
]

embedding = embedding_obj.embed_documents(documents)
print(str(embedding))

[[0.03895364701747894, 0.026190299540758133, -0.04022999480366707, 0.030604945495724678, -0.023605724796652794, -0.060136809945106506, 0.07395178824663162, 0.016435787081718445, -0.01170290820300579, -0.02273545227944851, 0.012308388948440552, -0.11706360429525375, 0.06992068141698837, -0.06936796009540558, 0.02924441732466221, -0.08110898733139038, 0.03378975763916969, 0.07373359054327011, 0.07176914811134338, -0.07642310112714767, -0.013756537809967995, 0.03879550099372864, 0.0038667942862957716, -0.03243941441178322, 0.0009917138377204537, 0.055515874177217484, 0.005611722823232412, -0.02097156085073948, -0.002045021392405033, 0.0007507732952944934, 0.06470665335655212, 0.018523888662457466, -0.029603630304336548, 0.005121386144310236, -0.05871567502617836, 0.018705857917666435, -0.1311699002981186, 0.06285741180181503, 0.17376883327960968, -0.11450103670358658, 0.04580598697066307, -0.0022412140388041735, 0.07444492727518082, -0.026802586391568184, 0.032853081822395325, -0.02097082
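
Embeddings become useful once they are compared. A common measure is cosine similarity, sketched below in plain Python. The three-dimensional vectors are made-up toy values, not real model output; with real embeddings the query vector would come from `embed_query` and the document vectors from `embed_documents`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (hypothetical values).
query_vec = [0.9, 0.1, 0.0]
doc_vecs = {
    "doc_a": [0.8, 0.2, 0.1],
    "doc_b": [0.0, 0.1, 0.9],
}

scores = {name: cosine_similarity(query_vec, vec) for name, vec in doc_vecs.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

The document with the highest score is the one whose vector points in the direction closest to the query's, which is the basis of similarity search.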