## 1. Models
In LangChain, "Models" are standardized interfaces for interacting with different LLM providers.

*   **LLMs vs Chat Models**:
    *   **LLMs**: Take a text string as input and return a text string.
    *   **Chat Models**: Take a list of messages (with system, human, and AI roles) and return an AI message.
*   **Integrations**: LangChain supports many providers, such as Hugging Face, OpenAI, Groq, and Ollama, each through its own integration package (for example `langchain_groq`).
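
To make the two call shapes above concrete, here is a minimal plain-Python sketch. `Message`, `toy_llm`, and `toy_chat_model` are hypothetical stand-ins written for illustration only, not LangChain classes; a real model would return generated text rather than these fixed strings.

```python
from dataclasses import dataclass


@dataclass
class Message:
    """Hypothetical stand-in for a chat message (not a LangChain class)."""
    role: str      # "system", "human", or "ai"
    content: str


def toy_llm(prompt: str) -> str:
    """LLM shape: one string in, one string out."""
    return f"completion for: {prompt}"


def toy_chat_model(messages: list[Message]) -> Message:
    """Chat-model shape: a list of messages in, a single AI message out."""
    # Answer the most recent human message in the conversation.
    last_human = next(m for m in reversed(messages) if m.role == "human")
    return Message(role="ai", content=f"reply to: {last_human.content}")


text = toy_llm("Name one LLM provider.")
reply = toy_chat_model([
    Message("system", "You are a helpful assistant."),
    Message("human", "Name one LLM provider."),
])

print(type(text).__name__)             # the LLM shape returns a plain string
print(reply.role, "|", reply.content)  # the chat shape returns a message object
```

In LangChain itself, the corresponding message types are `SystemMessage`, `HumanMessage`, and `AIMessage` from `langchain_core.messages`.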

### Model Initialization Examples
The following code demonstrates how to initialize different types of models:

*   **HuggingFaceEndpoint**: Accesses models hosted on the Hugging Face Hub (requires an API token).
*   **ChatHuggingFace**: Wraps the endpoint so that it behaves like a chat model.
*   **ChatGroq**: Interface for Groq's fast inference API.
*   **ChatOllama**: Interface for running local models via Ollama.

In [None]:
# Models
from dotenv import load_dotenv
from langchain_groq import ChatGroq
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint
from langchain_ollama import ChatOllama

# Load API keys (for example, for Hugging Face and Groq) from a local .env file
load_dotenv()

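# Hosted Hugging Face model, accessed through an inference endpoint
llm = HuggingFaceEndpoint(
    repo_id="meta-llama/Llama-3.2-1B-Instruct",
    task="text-generation",
    temperature=0.5
)

# Wrap the endpoint so it accepts and returns chat messages
model1 = ChatHuggingFace(llm=llm)

# A second Hugging Face endpoint, wrapped the same way
llm1 = HuggingFaceEndpoint(
    repo_id="MiniMaxAI/MiniMax-M2.1",
    task="text-generation"
)

model = ChatHuggingFace(llm=llm1)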

# Groq-hosted model
model2 = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.5
)

# Local Ollama model: run `ollama pull phi3` in a terminal first
# and keep Ollama running while this notebook uses it
model3 = ChatOllama(
    model='phi3',
    temperature=0.5
)
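
The `load_dotenv()` call in the cell above reads variables from a `.env` file, and the hosted providers expect their credentials there. A minimal sketch for failing early when a key is missing follows. The variable names `HUGGINGFACEHUB_API_TOKEN` and `GROQ_API_KEY` are assumptions based on the conventional names for these integrations, and `missing_keys` is a hypothetical helper, so check the documentation for your installed versions.

```python
import os

# Assumed variable names; adjust them to match your installed integrations.
REQUIRED_KEYS = ("HUGGINGFACEHUB_API_TOKEN", "GROQ_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing credentials:", ", ".join(missing))
else:
    print("All credentials found.")
```

Running this right after `load_dotenv()` surfaces a missing key before the first model call, which tends to give a clearer message than a provider's authentication error. Ollama runs locally and is not covered by this check.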

### Running Models Locally (HuggingFace Pipeline)

This example runs a model entirely on your local machine through the `transformers` library pipeline.

*   **HuggingFacePipeline**: Loads the model weights into local memory (RAM/VRAM).
*   **Pros**: Privacy, no API costs, offline access.
*   **Cons**: Needs capable hardware (a GPU is recommended) and a large initial download of the model weights.

In [None]:
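import os

# Set the cache directory before any Hugging Face import (in a fresh kernel)
# so that it is picked up reliably
os.environ['HF_HOME'] = 'D:/huggingface_cache'

from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

# Download the weights on first run and build a local text-generation pipeline
llm = HuggingFacePipeline.from_model_id(
    model_id='TinyLlama/TinyLlama-1.1B-Chat-v1.0',
    task='text-generation',
    pipeline_kwargs=dict(
        do_sample=True,  # sampling must be enabled for temperature to have an effect
        temperature=0.5,
        max_new_tokens=100
    )
)
# Wrap the pipeline so it can be used as a chat model
model4 = ChatHuggingFace(llm=llm)

result = model4.invoke("What is the capital of India?")

print(result.content)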
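
The `invoke` call above is the same for every chat model in this section, so one helper can serve all of them. The sketch below passes messages as `(role, content)` tuples, a shorthand that LangChain chat models accept. `ask` is a hypothetical helper, and `EchoChatModel` is a hypothetical stand-in that only exists so the block runs without a provider or API key.

```python
from dataclasses import dataclass


def ask(model, question, system_prompt="You are a concise assistant."):
    """Send a system + human message pair to a chat model and return the reply text."""
    messages = [
        ("system", system_prompt),
        ("human", question),
    ]
    reply = model.invoke(messages)
    return reply.content


@dataclass
class _Reply:
    """Hypothetical reply object exposing `.content`, like a chat model's result."""
    content: str


class EchoChatModel:
    """Hypothetical stand-in with the same `invoke` shape as a chat model."""

    def invoke(self, messages):
        # Echo back the text of the last message in the list.
        _role, text = messages[-1]
        return _Reply(content=f"echo: {text}")


print(ask(EchoChatModel(), "What is the capital of India?"))
```

To use a real model, pass `model2`, `model3`, or `model4` in place of `EchoChatModel()`. How a system message is handled can vary with the underlying model's chat template.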