# **LangChain Models**

In LangChain, **models** refer to the core components that handle natural language processing tasks. They can be categorized into three main types:

1. **Language Models (LLMs)**  
   These are foundational models that generate text from a prompt. They range from proprietary models accessed through APIs (such as GPT-4 or Claude) to open-source models that you host yourself.

2. **Chat Models**  
   Unlike general LLMs, chat models are designed specifically for conversational AI, following structured turn-based interactions. They are optimized for multi-turn dialogue generation.

3. **Text Embedding Models**  
   These models convert text into numerical vector representations, which are useful for tasks like similarity search, retrieval-augmented generation (RAG), and document retrieval.

LangChain supports various **providers** (e.g., OpenAI, Hugging Face, Cohere) and allows users to customize models for specific applications.
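
To make the embedding idea concrete, here is a minimal sketch of similarity search over text vectors. The `toy_embed` function is a hypothetical stand-in that only counts vocabulary words; a real embedding model returns dense learned vectors, but the cosine-similarity comparison works the same way.

```python
import math
from collections import Counter


def toy_embed(text: str, vocabulary: list[str]) -> list[float]:
    """Hypothetical stand-in for an embedding model: counts vocabulary words."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query: str, documents: list[str], vocabulary: list[str]) -> str:
    """Return the document whose vector is closest to the query vector."""
    query_vec = toy_embed(query, vocabulary)
    return max(
        documents,
        key=lambda doc: cosine_similarity(query_vec, toy_embed(doc, vocabulary)),
    )


# Illustrative data only; the vocabulary and documents are made up.
vocabulary = ["cat", "dog", "stock", "market", "pet"]
documents = [
    "the cat is a popular pet",
    "the stock market fell today",
]
best = most_similar("which pet is better cat or dog", documents, vocabulary)
print(best)
```

In a retrieval setup, the query vector and the document vectors would come from the same embedding model, and the top-scoring documents would be passed to the language model as context.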

## **Language Models**

The primary difference between **LLMs (Large Language Models)** and **Chat Models** in LangChain lies in their **structure, interaction style, and optimization for specific tasks**.

### **1. LLMs (Large Language Models)**
General-purpose models used for raw text generation. They take a string (plain text) as input and return a string (plain text). This interface is associated with older, completion-style models and is used much less today.

- **Purpose**: General-purpose text generation.
- **Input/Output**: Accepts raw text prompts and generates responses.
- **Structure**: Not optimized for structured multi-turn conversations.
- **Use Cases**: Content generation, text summarization, code generation, and language translation.

### **2. Chat Models**
Language models that are specialized for conversational tasks. They take a sequence of messages as input and return a chat message as output (as opposed to plain text). Most newer models are chat models, and they are used far more often than plain LLMs.

- **Purpose**: Designed for interactive, turn-based conversations.
- **Input/Output**: Accepts structured messages (e.g., system, user, and assistant messages) and returns an assistant message; context across turns comes from the message history sent with each call.
- **Structure**: Optimized for dialogue-based interactions, typically combined with chat history or memory managed by the application.
- **Use Cases**: Conversational agents, customer support bots, and interactive AI assistants.
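
The difference between the two interfaces can be sketched with toy stand-ins. Everything here (`Message`, `toy_llm`, `toy_chat_model`) is hypothetical and only mimics the input and output shapes; LangChain itself provides message classes such as `SystemMessage`, `HumanMessage`, and `AIMessage` for the chat case.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical minimal chat message: a role plus text content."""
    role: str      # "system", "user", or "assistant"
    content: str


def toy_llm(prompt: str) -> str:
    """Stand-in for an LLM: plain string in, plain string out."""
    return f"[completion of: {prompt}]"


def toy_chat_model(messages: list[Message]) -> Message:
    """Stand-in for a chat model: list of messages in, one assistant message out."""
    last_user = next(m for m in reversed(messages) if m.role == "user")
    return Message(role="assistant", content=f"[reply to: {last_user.content}]")


# LLM style: a single string prompt.
text_out = toy_llm("Translate 'hello' to French.")

# Chat style: the caller keeps the history and resends it on every turn.
history = [
    Message("system", "You are a helpful assistant."),
    Message("user", "Translate 'hello' to French."),
]
reply = toy_chat_model(history)
history.append(reply)  # context is carried by the growing message list
history.append(Message("user", "And to Spanish?"))
second_reply = toy_chat_model(history)

print(type(text_out).__name__, "|", reply.role, "|", len(history))
```

The point of the sketch is that the chat model does not remember anything between calls: the second call only "knows" about the first turn because the full history is passed in again.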

#### **Key Differences**
| Feature      | LLMs | Chat Models |
|-------------|------|------------|
| **Purpose** | Free-form text generation | Optimized for multi-turn conversation |
| **Training Data** | General text corpora (books, articles) | Fine-tuned on chat datasets (dialogues, user-assistant conversations) |
| **Memory & Context** | No built-in memory | Supports structured conversation history |
| **Role awareness** | No understanding of "user" and "assistant" roles | Understands "system", "user", and "assistant" roles |
| **Example Models** | GPT-3, Llama-2-7B, Mistral-7B, OPT-1.3B | GPT-4, GPT-3.5 Turbo, Llama-2-Chat, Mistral-Instruct, Claude |
| **Use Cases** | Text generation, summarization, translation, creative writing, code generation | Conversational AI, chatbots, virtual assistants, customer support, AI tutors |

In LangChain, **LLMs** can still be used for conversations, but **Chat Models** provide a more structured approach for better user interactions.
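
One way to use a completion-style LLM for conversation is to flatten the turns into a single prompt string. The sketch below assumes a simple `Role: text` layout; that layout and the `format_as_prompt` helper are illustrative assumptions, not a LangChain format.

```python
def format_as_prompt(turns: list[tuple[str, str]]) -> str:
    """Flatten (role, text) turns into one prompt string for a completion-style LLM.

    The "Role: text" layout is an assumed convention, not a LangChain format.
    """
    lines = [f"{role.capitalize()}: {text}" for role, text in turns]
    lines.append("Assistant:")  # cue the model to continue as the assistant
    return "\n".join(lines)


turns = [
    ("system", "You are a concise assistant."),
    ("user", "What is LangChain?"),
]
prompt = format_as_prompt(turns)
print(prompt)
```

A chat model removes the need for this manual formatting, because the roles are part of its input structure rather than text conventions the model has to infer.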

### **Open Source Models**
Open-source language models are models whose weights are publicly available, so they can be downloaded, modified, fine-tuned, and deployed on your own infrastructure, subject to each model's license. Unlike closed-source models such as OpenAI's GPT-4, Anthropic's Claude, or Google Gemini, open-source models allow full control and customization.

| Feature      | Open-Source Models | Closed-Source Models |
|-------------|------|------------|
| **Cost** | No per-call API fees (you pay for your own compute) | Paid API usage |
| **Control** | Can modify, fine-tune, and deploy anywhere | Locked to provider's infrastructure |
| **Data Privacy** | Can run locally (no data sent to external servers) | Sends queries to the provider's servers |
| **Customization** | Can fine-tune on specific datasets | Fine-tuning limited to what the provider offers, if any |
| **Deployment** | Can be deployed on **on-premise** servers or cloud | Must use vendor's API |

**Some Famous Open-Source Models**
| Model Name       | Organization  | Parameters       | License       | Description |
|-----------------|--------------|----------------|--------------|-------------|
| Llama 2        | Meta         | 7B, 13B, 70B  | Llama 2 Community License | A family of open-weight LLMs optimized for efficiency. |
| Mistral 7B     | Mistral AI   | 7B            | Apache 2.0    | A dense model with high efficiency and strong reasoning abilities. |
| Mixtral 8x7B   | Mistral AI   | ~47B total, ~13B active per token (2 of 8 experts) | Apache 2.0 | A MoE (Mixture of Experts) model with improved performance. |
| Falcon         | TII          | 7B, 40B       | Apache 2.0    | A highly optimized transformer model trained on a diverse dataset. |
| BLOOM          | BigScience   | 560M-176B     | BigScience RAIL 1.0 | A multilingual open-weight model supporting 46 natural languages. |
| Gemma         | Google DeepMind | 2B, 7B      | Gemma Terms of Use | Lightweight models designed for efficiency and accessibility. |
| GPT-NeoX       | EleutherAI   | 20B           | Apache 2.0    | A large-scale autoregressive transformer model. |
| Pythia         | EleutherAI   | 70M-12B       | Apache 2.0    | A suite of models for transparency and interpretability research. |
| OpenLLaMA      | OpenLM Research | 3B, 7B, 13B | Apache 2.0    | An open reproduction of LLaMA models trained on openly available data. |
| RedPajama-INCITE | Together AI | 3B, 7B       | Apache 2.0    | Models trained on the RedPajama dataset, which replicates the LLaMA pretraining data. |
| DeepSeek LLM   | DeepSeek    | 7B, 67B       | DeepSeek Model License (code: MIT) | A family of open LLMs trained on large-scale English and Chinese data. |
| DeepSeek MoE   | DeepSeek    | 16.4B total, ~2.8B active per token | DeepSeek Model License (code: MIT) | A sparse Mixture of Experts model for optimized efficiency. |

**Where to find them?**
Hugging Face - the largest repository of open-source LLMs.

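**Ways to use Open-Source Models**
- **Using the Hugging Face Inference API**: send requests to a hosted model without managing any hardware yourself.
- **Running Locally**: download the model weights and run inference on your own machine.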
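
As a rough sketch of the "running locally" option, the function below uses the `pipeline` helper from the Hugging Face `transformers` library directly rather than a LangChain wrapper. It assumes `transformers` and PyTorch are installed; the `generate_locally` name and the `gpt2` default are illustrative choices (the model is picked only because it is small), and the first call downloads the weights.

```python
def generate_locally(prompt: str, model_id: str = "gpt2", max_new_tokens: int = 30) -> str:
    """Generate text with a locally downloaded model.

    Assumes `pip install transformers torch` has been run beforehand.
    """
    # Imported lazily so this file can be loaded without the heavy dependencies.
    from transformers import pipeline

    # Downloads the weights on first use, then runs inference on this machine.
    generator = pipeline("text-generation", model=model_id)
    outputs = generator(prompt, max_new_tokens=max_new_tokens)
    # The text-generation pipeline returns a list of dicts, one per generated sequence.
    return outputs[0]["generated_text"]
```

Calling `generate_locally("Open-source language models are")` returns the prompt followed by the model's continuation. Larger models follow the same pattern but need correspondingly more memory.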

**Disadvantages**
| Disadvantage  | Details   |
|---------------|-----------|
| High Hardware Requirements | Running large models (e.g., Llama-2-70B) requires expensive GPUs. |
| Setup Complexity | Requires installing dependencies such as PyTorch, CUDA, and transformers. |
| Lack of RLHF | Many open-source base models are not fine-tuned with human feedback, which makes them weaker at instruction-following; chat or instruct variants (e.g., Llama-2-Chat) narrow this gap. |
| Limited Multimodal Abilities | Many open models are text-only and do not handle images, audio, or video the way GPT-4V does. |
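
To give a feel for the hardware requirement, the weights alone need roughly (number of parameters) x (bytes per parameter) of memory. The sketch below is a back-of-the-envelope estimate under that assumption; it ignores activations, the KV cache, and framework overhead, so real usage is higher. The `weight_memory_gb` helper and the precision table are illustrative names, not part of any library.

```python
# Approximate storage cost of one parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params_billions: float, precision: str = "fp16") -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    # billions of parameters * bytes per parameter = billions of bytes = GB
    return num_params_billions * BYTES_PER_PARAM[precision]


for precision in ("fp16", "int8", "int4"):
    print(f"70B @ {precision}: ~{weight_memory_gb(70, precision):.0f} GB")
```

By this estimate a 70B model needs about 140 GB at 16-bit precision, which exceeds the memory of any single consumer GPU and explains why quantization or multi-GPU setups are common for large open models.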
