The model component in LangChain is a crucial part of the framework, designed to facilitate interactions with various language models and embedding models.

It abstracts the complexity of working directly with different LLMs, chat models, and embedding models, providing a uniform interface to communicate with them. This makes it easier to build applications that rely on AI-generated text, text embeddings for similarity search, and retrieval-augmented generation (RAG).

Models
1. Language Models (Input (Text) -> Output (Text))
2. Embedding Models (Input (Text) -> Output (Vectors/Embeddings))

## Language Models

Language Models are AI systems designed to process, generate and understand natural language text.

There are 2 types of Language Models:

1. LLMs: general-purpose models that take plain text as input and return plain text as output. They can be used for any NLP application (e.g. summarization, question answering). These are the older style of model and are not used much now.
2. Chat Models: LLMs specifically designed for conversational tasks. They take a sequence of messages as input and return chat messages as output (as opposed to plain text). These are the newer style of model and are used far more often than plain LLMs.

| Feature        | LLM (Base Models) | Chat Models (Instruction-Tuned)      |
|-------------|-----|-------------|
| Purpose      | Free-form text generation  | Optimized for multi-turn conversation   |
| Training Data      | General text corpora (books, articles)  | Fine-tuned on chat datasets (dialogues, user-assistant conversations) |
| Memory & Context     | No built-in memory | Supports structured conversation history     |
| Role Awareness       | No understanding of user and assistant roles | Understands 'system', 'user', and 'assistant' roles  |
| Example Models       | GPT-3, Llama-2-7B, Mistral-7B  | GPT-4, GPT-3.5-turbo, Llama-2-Chat, Mistral-Instruct, Claude      |
| Use Cases     | Text generation, summarization, translation, creative writing, code generation  | Conversational AI, chatbots, virtual assistants, customer support, AI tutors       |
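
The difference in input shape between the two model types can be sketched with plain Python data. This is only an illustration: the `role`/`content` dictionaries and the `flatten_to_prompt` helper are hypothetical names chosen for this sketch, not LangChain APIs.

```python
# A base LLM receives one plain string.
llm_input = "Translate to French: Good morning"

# A chat model receives an ordered list of role-tagged messages.
chat_input = [
    {"role": "system", "content": "You are a helpful translator."},
    {"role": "user", "content": "Translate to French: Good morning"},
]


def flatten_to_prompt(messages):
    """Join role-tagged messages into one string (hypothetical helper)."""
    return "\n".join(f"{m['role']}: {m['content']}" for m in messages)


# Flattening loses the structure that a chat model can use directly.
print(flatten_to_prompt(chat_input))
```

Keeping the roles separate is what lets a chat model tell instructions, user turns, and its own earlier replies apart.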







In [5]:
%%capture output
!pip install -q google-generativeai
!pip install -q langchain-huggingface
!pip install -q transformers
!pip install -q huggingface-hub
!pip install -q numpy
!pip install -q scikit-learn
!pip install langchain_openai

In [6]:
import os
from getpass import getpass

os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API key: ")

Enter your OpenAI API key: ··········


In [7]:
from langchain_openai import OpenAI

In [8]:
llm = OpenAI(model='gpt-3.5-turbo-instruct')
result = llm.invoke('Hey! How are you?')
print(result)



I'm doing well, thanks for asking. How about you? 



## Chat Models

In [9]:
from langchain_openai import ChatOpenAI

In [10]:
model = ChatOpenAI(model='gpt-4')
result = model.invoke("What is the capital of India?")
print(result)


content='The capital of India is New Delhi.' additional_kwargs={'refusal': None} response_metadata={'token_usage': {'completion_tokens': 8, 'prompt_tokens': 14, 'total_tokens': 22, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_provider': 'openai', 'model_name': 'gpt-4-0613', 'system_fingerprint': None, 'id': 'chatcmpl-CUutsvrXTUilVn2TyoXYE7MVsOIdJ', 'service_tier': 'default', 'finish_reason': 'stop', 'logprobs': None} id='lc_run--8e9d7957-e736-4532-9fb7-4cd622c769fa-0' usage_metadata={'input_tokens': 14, 'output_tokens': 8, 'total_tokens': 22, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}}


Notice that the output of a chat model is structured differently from that of a regular LLM. The response text is in the `content` field, and the object also carries a lot of metadata (token usage, model name, finish reason, and so on).

In [11]:
print(result.content)

The capital of India is New Delhi.
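
Besides `content`, the metadata printed earlier can be read field by field. The sketch below works on a plain dictionary copied from the `usage_metadata` shown in that output, so it runs without an API key; in the notebook the same values would presumably come from `result.usage_metadata`. The `summarize_usage` helper is a hypothetical name used only here.

```python
# Values copied from the usage_metadata field in the gpt-4 output above.
usage_metadata = {
    "input_tokens": 14,
    "output_tokens": 8,
    "total_tokens": 22,
}


def summarize_usage(usage):
    """Return a one-line summary of token usage (hypothetical helper)."""
    return (
        f"prompt={usage['input_tokens']} "
        f"completion={usage['output_tokens']} "
        f"total={usage['total_tokens']}"
    )


print(summarize_usage(usage_metadata))  # prompt=14 completion=8 total=22
```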


**Temperature**: It's a parameter that controls the randomness of a language model's output. It affects how creative or deterministic the responses are.



*   Lower Values (0.0 - 0.3) -> More deterministic and predictable
*   Higher Values (0.7 - 1.5) -> More random, creative and diverse


| Use case        | Recommended Temperature |
|:------------|:---:|
| Factual answers (math, code, facts)   | 0.0-0.3 |
| Balanced response (general QA, explanations)       | 0.5-0.7  |
| Creative Writing, storytelling, jokes      | 0.9-1.2 |
| Maximum Randomness - brainstorming     | 1.5+|
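
To see why temperature changes randomness, here is a minimal sketch of temperature-scaled softmax, a common way the parameter is applied to a model's raw token scores. The scores below are made up, and real providers may implement sampling differently (for example, a temperature of 0 is usually handled as a special case rather than a division).

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    largest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - largest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.2, 1.0, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At a low temperature almost all probability goes to the top-scoring token, so sampling is nearly deterministic; at a higher temperature the distribution flattens and less likely tokens are picked more often.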


In [12]:
model = ChatOpenAI(model='gpt-4', temperature=0)
result = model.invoke("Write a sonnet on the topic 'Indian Politicians' ")
print(result.content)

Upon the stage of India's vast expanse,
In robes of power, politicians stand,
Their words like arrows, in the air they lance,
To rule the heart and mind of this great land.

They promise change, they vow to break the chains,
Of poverty, corruption, and despair,
Yet often, their ambition only stains,
The sacred trust, the people choose to bear.

Yet, some among them, true to their oath,
Serve with honor, integrity, and grace,
Their vision clear, their purpose firm and both,
In service of the people, find their place.

Indian politicians, in their varied hue,
Shape the nation's course, to old and new.


In [13]:
model = ChatOpenAI(model='gpt-4', temperature=1.2)
result = model.invoke("Write a sonnet on the topic 'Indian Politicians' ")
print(result.content)

In the land where Ganges steepl grey,
There thrive elites in opulence and play.
Politicians chief among this brass array,
Speak in tongues, oblique they shipshape say. 

Their promises as diverse as the Indian sphere,
Sweet whispers breathed in every fearful ear. 
Yet the poor man's hopes are still unclear,
Whilst stalwarts revel in fame, un-sincere.

From early morn till the midnight hour, 
They mess with powers veiled, subdued and sour.
Morality quivers, ethics bow in tones dour,
Yet in Indian heart a silent uproar.

Awake, ye shining daughters, sons of this lot,
Outshine they must- these lords in vanity nought. 

Churchill once scarred, with a tossmate thought,
On Indian politicians, words from battles fought.
"Crook in their politics, rogueish in their sport,"
Radiate we must, leaders of a different sort. 

From every alley, each village, each old-known fort,
Springs a young India, unaware they report.
"You polished lot wear cloaks not in our sport,
True Indians we, in reconciled r

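The output becomes much more varied and creative as we increase the temperature, though, as the garbled words above show, it also becomes less coherent.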

**Max Completion Tokens**: Sets the maximum number of output tokens the model may generate.
Since billing is per token, setting this upper limit helps control cost.
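
As a rough sketch of how the limit bounds spending, the worst-case output cost of one call is the token limit times the price per output token. The price below is a placeholder, not a real rate, and `max_output_cost` is a hypothetical helper; input tokens are billed separately and are not covered by this bound.

```python
def max_output_cost(max_completion_tokens, price_per_1k_output_tokens):
    """Upper bound on the output cost of one call, in the price's currency."""
    return max_completion_tokens / 1000 * price_per_1k_output_tokens


# PLACEHOLDER price for illustration only; check the provider's pricing page.
hypothetical_price_per_1k = 0.06

print(max_output_cost(50, hypothetical_price_per_1k))
```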

In [14]:
model = ChatOpenAI(model='gpt-4', temperature=1.2, max_completion_tokens=50)
result = model.invoke("Write a sonnet on the topic 'Indian Politicians' ")
print(result.content)

In hues of saffron, white and green, they rise,
Across the vast expanse of India’s plains,
Upon their promises, a nation ties,
Hope or despair, within their reign remains.

Every caste and creed holds their alliance,
From


Because the output is capped at 50 tokens, the model stopped before completing the sonnet.
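
A truncated response can also be detected programmatically. The earlier gpt-4 output shows `'finish_reason': 'stop'` in its response metadata; the OpenAI API reports `'length'` instead when the token limit cuts generation short. The sketch below checks plain dictionaries so it runs offline; in the notebook the dictionary would presumably be `result.response_metadata`. `was_truncated` is a hypothetical helper name.

```python
def was_truncated(response_metadata):
    """True when generation stopped because the token limit was reached."""
    return response_metadata.get("finish_reason") == "length"


# 'stop' means the model finished on its own; 'length' means it was cut off.
print(was_truncated({"finish_reason": "stop"}))    # False
print(was_truncated({"finish_reason": "length"}))  # True
```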

## Claude

In [17]:
os.environ["ANTHROPIC_API_KEY"] = getpass("Enter your ANTHROPIC API key: ")


Enter your ANTHROPIC API key: ··········


In [15]:
%%capture output
!pip install -q langchain_anthropic

In [18]:
from langchain_anthropic import ChatAnthropic

In [19]:
# Requires valid Anthropic credentials (ANTHROPIC_API_KEY); the call fails without them.
model_claude = ChatAnthropic(model='claude-3-5-sonnet-20241022')
result = model_claude.invoke('What is the capital of France?')
print(result.content)


The capital of France is Paris.


## Gemini

In [20]:
import os
from getpass import getpass

os.environ["GOOGLE_API_KEY"] = getpass("Enter your Google API key: ")


Enter your Google API key: ··········


In [None]:
%%capture output
!pip install langchain_google_genai

In [23]:
from langchain_google_genai import ChatGoogleGenerativeAI
model_gemini = ChatGoogleGenerativeAI(model='gemini-1.5-pro')
result = model_gemini.invoke('What is the capital of France?')
print(result.content)

The capital of France is Paris.


## Open Source Models

Open source language models are freely available AI models that can be downloaded, modified, fine-tuned, and deployed without restrictions from a central provider. Unlike closed source models such as OpenAI's GPT-4, Anthropic's Claude, or Google's Gemini, open source models allow full control and customization.

| Feature      | Open Source Models | Closed Source Models     |
|:------------|:---:|------------:|
| Cost     | Free to use (no API cost) | Paid API Usage  |
| Control     | Can modify, fine-tune, and deploy anywhere  | Locked to provider's infrastructure   |
| Data Privacy  | Runs locally (no data sent to external servers)  | Send queries to provider's servers     |
| Customization   | Can fine-tune on specific datasets  | No access to finetuning in most cases      |
| Deployment     | Can be deployed on on-premise servers or cloud | Must use vendor's API   |


| Model       | Developer | Parameters       | Best Use case     |
|:------------|:---:|------------:|------------:|
| LLaMA-2-7B/13B/70B     | Meta AI  | 7B-70B | General purpose text generation |
| Mixtral-8x7B      | Mistral AI  | 8x7B (MoE)      | Efficient & fast response |



Hugging Face is the largest repository of open-source models. There are two ways to use it:

1. Using the HF Inference API
2. Running locally

Disadvantages

| Disadvantage       | Details |
|:------------|:---:|
| High Hardware Requirements     | Running large models (e.g. LLaMA-2-70B) requires expensive GPUs |
| Setup Complexity     | Requires installation of dependencies like PyTorch, CUDA, Transformers |
| Lack of RLHF   | Many open-source models are not fine-tuned with human feedback, making them weaker at instruction following |
| Limited Multimodal Abilities  | Open models have limited support for images, audio, or video |


In [24]:

os.environ["HUGGINGFACEHUB_API_TOKEN"] = getpass("Enter your Hugging Face API key: ")

Enter your Hugging Face API key: ··········


In [None]:
%%capture output
!pip install langchain-huggingface

In [None]:
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint


In [None]:
# Accessing the model via API endpoint
llm = HuggingFaceEndpoint(
    repo_id='mistralai/Mistral-7B-Instruct-v0.3',
    task = 'text-generation'
)

model = ChatHuggingFace(llm=llm)
result = model.invoke('What is the capital of France?')
print(result.content)

 The capital of France is Paris. It is one of the most famous cities in the world, known for its rich history, art, culture, and landmarks such as the Eiffel Tower, Louvre Museum, and Notre-Dame Cathedral. Paris is also the political, economic, and cultural center of France.


# Downloading the model and using it locally

In [29]:
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline


In [30]:
llm = HuggingFacePipeline.from_model_id(
    model_id="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    task = 'text-generation',
    pipeline_kwargs= dict(
        temperature = 0.5,
        max_new_tokens = 100
    )

)
model = ChatHuggingFace(llm=llm)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Device set to use cpu


In [31]:
result = model.invoke("What is the capital of France?")
print(result.content)

<|user|>
What is the capital of France?</s>
<|assistant|>
The capital of France is Paris.
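
The local pipeline output above echoes the chat template (the `<|user|>` and `<|assistant|>` markers) along with the answer. One way to keep only the reply is to take the text after the last assistant marker. This is a sketch that assumes the marker format seen in the output above; `extract_assistant_reply` is a hypothetical helper, not part of LangChain or Transformers.

```python
def extract_assistant_reply(raw_text, marker="<|assistant|>"):
    """Return the text after the last assistant marker, or the input unchanged."""
    _, found, reply = raw_text.rpartition(marker)
    return reply.strip() if found else raw_text.strip()


# Text copied from the local model's output above.
raw = (
    "<|user|>\n"
    "What is the capital of France?</s>\n"
    "<|assistant|>\n"
    "The capital of France is Paris."
)

print(extract_assistant_reply(raw))  # The capital of France is Paris.
```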


## Embedding Models

In [None]:
from langchain_openai import OpenAIEmbeddings

In [None]:
# text-embedding-3-large defaults to 3072 dimensions; we request 32 here to save time and cost
embedding_obj = OpenAIEmbeddings(model='text-embedding-3-large', dimensions=32)
embedding = embedding_obj.embed_query("Paris is the capital of France.")
print(str(embedding))



Let's now generate embeddings for a list of documents, not just a single query.

In [34]:
embedding_obj = OpenAIEmbeddings(model='text-embedding-3-large', dimensions=32)
documents = [
    "Delhi is the capital of India.",
    "Ranchi is the capital of Jharkhand.",
    "Patna. is the capital of Bihar"
]

result = embedding_obj.embed_documents(documents)
print(str(result))

[[-0.07238976657390594, 0.193175807595253, -0.01223335973918438, 0.43464556336402893, -0.1725076287984848, 0.051440220326185226, -0.1413007378578186, 0.05796297639608383, -0.14437027275562286, -0.1366964429616928, -0.11827925592660904, 0.14007292687892914, 0.006772152613848448, -0.2103651762008667, -0.22264330089092255, -0.10610345005989075, -0.1924595832824707, 0.18805992603302002, 0.08543527871370316, -0.2293962687253952, -0.003131880657747388, -0.03419968858361244, -0.082570381462574, -0.044124506413936615, -0.16391295194625854, 0.48007461428642273, 0.284443199634552, -0.050110090523958206, -0.022138992324471474, 0.021448347717523575, 0.21507179737091064, 0.06435783207416534], [0.06283925473690033, 0.1430075615644455, -0.07094616442918777, 0.3357717990875244, -0.21584104001522064, 0.06241031736135483, -0.04653965309262276, 0.0035816230811178684, -0.2891034483909607, 0.1705453097820282, -0.13811767101287842, 0.031376734375953674, -0.21446844935417175, -0.04098492115736008, -0.2714312

In [35]:
from langchain_huggingface import HuggingFaceEmbeddings
embedding_obj = HuggingFaceEmbeddings(model_name='sentence-transformers/all-MiniLM-L6-v2')

text = 'Paris is the capital of France.'
embedding = embedding_obj.embed_query(text)
print(str(embedding))

[0.10070480406284332, 0.037152860313653946, 0.02822548896074295, -0.041667670011520386, 0.0671916976571083, -0.047735545784235, 0.009368831291794777, -0.010811394080519676, 0.021422477439045906, 0.02683991938829422, -0.014043694362044334, -0.07074397057294846, 0.006867417600005865, -0.06559625267982483, -0.039673712104558945, -0.12240415811538696, -0.009188590571284294, -0.019201550632715225, 0.04009631276130676, 0.017608467489480972, -0.0028437036089599133, -0.034690938889980316, 0.07244998961687088, 0.01213846355676651, -0.04890365153551102, -0.0017426727572456002, -0.06909015029668808, -0.018630987033247948, -0.019689982756972313, 0.010831773281097412, 0.06691782176494598, -0.01983049511909485, -0.06030932813882828, -0.001621779752895236, -0.04283401370048523, -0.03296345844864845, -0.0006109338719397783, -0.02035326510667801, 0.026338139548897743, 0.04921884089708328, 0.002372274873778224, -0.04640386998653412, -0.05580700933933258, -0.02362196147441864, -0.025786254554986954, 0.02

In [36]:
documents = [
    "Delhi is the capital of India.",
    "Ranchi is the capital of Jharkhand.",
    "Patna. is the capital of Bihar"
]

embedding = embedding_obj.embed_documents(documents)
print(str(embedding))

[[0.03895364701747894, 0.026190299540758133, -0.04022999480366707, 0.030604945495724678, -0.023605724796652794, -0.060136809945106506, 0.07395178824663162, 0.016435787081718445, -0.01170290820300579, -0.02273545227944851, 0.012308388948440552, -0.11706360429525375, 0.06992068141698837, -0.06936796009540558, 0.02924441732466221, -0.08110898733139038, 0.03378975763916969, 0.07373359054327011, 0.07176914811134338, -0.07642310112714767, -0.013756537809967995, 0.03879550099372864, 0.0038667942862957716, -0.03243941441178322, 0.0009917138377204537, 0.055515874177217484, 0.005611722823232412, -0.02097156085073948, -0.002045021392405033, 0.0007507732952944934, 0.06470665335655212, 0.018523888662457466, -0.029603630304336548, 0.005121386144310236, -0.05871567502617836, 0.018705857917666435, -0.1311699002981186, 0.06285741180181503, 0.17376883327960968, -0.11450103670358658, 0.04580598697066307, -0.0022412140388041735, 0.07444492727518082, -0.026802586391568184, 0.032853081822395325, -0.02097082
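
Embeddings like these are typically compared with cosine similarity to find the document closest to a query, which is the basis of similarity search. Below is a minimal sketch using only the standard library. The 3-dimensional vectors are toy values made up for illustration, not real embeddings of these sentences; in practice they would come from `embed_documents` and `embed_query`.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy vectors, NOT real embeddings; real ones have hundreds of dimensions.
doc_vectors = {
    "Delhi is the capital of India.": [0.9, 0.1, 0.0],
    "Ranchi is the capital of Jharkhand.": [0.1, 0.9, 0.0],
}
query_vector = [0.8, 0.2, 0.1]

# Pick the document whose vector points in the most similar direction.
best = max(
    doc_vectors,
    key=lambda doc: cosine_similarity(query_vector, doc_vectors[doc]),
)
print(best)  # Delhi is the capital of India.
```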