# Using Open Source LLMs with LangChain

This notebook briefly shows how to use open-source LLMs with LangChain in three ways:

- Locally, through a Hugging Face pipeline
- Remotely, through the Hugging Face Inference API
- Remotely, through the Groq API

In [15]:
from dotenv import load_dotenv
import os

load_dotenv()

True

## Use LLMs locally with LangChain and Hugging Face

In [16]:
from langchain_huggingface import ChatHuggingFace, HuggingFacePipeline

model_id = "moonshotai/kimi-k2-instruct-0905"

llm = HuggingFacePipeline.from_model_id(
    model_id=model_id,
    task="text-generation",
    pipeline_kwargs=dict(
        max_new_tokens=1000,
        do_sample=False,  # greedy decoding; temperature is ignored in this mode, so it is not set
        return_full_text=False,
    ),
    device_map="auto",  # let Accelerate place the model on the available device(s); needs the accelerate package
)
llm.pipeline.tokenizer.pad_token_id = llm.pipeline.tokenizer.eos_token_id
chat_llama = ChatHuggingFace(llm=llm)

A new version of the following files was downloaded from https://huggingface.co/moonshotai/kimi-k2-instruct-0905:
- configuration_deepseek.py
. Make sure to double-check they do not contain any added malicious code. To avoid downloading new versions of the code file, you can pin a revision.
You are using a model of type kimi_k2 to instantiate a model of type deepseek_v3. This is not supported for all configurations of models and can yield errors.


RuntimeError: No GPU or XPU found. A GPU or XPU is needed for FP8 quantization.
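
The `RuntimeError` above says that this checkpoint's FP8 quantization needs a GPU or XPU, so the model does not load on a CPU-only machine. One possible workaround, sketched here under assumptions, is to fall back to a much smaller checkpoint when no accelerator is detected. `Qwen/Qwen2.5-0.5B-Instruct` is only an illustrative choice that the original notebook does not use, and `accelerator_available` and `choose_model_id` are hypothetical helpers.

```python
import importlib.util


def accelerator_available() -> bool:
    """Return True if PyTorch is installed and reports a CUDA or XPU device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    xpu = getattr(torch, "xpu", None)  # torch.xpu is absent in older PyTorch builds
    return torch.cuda.is_available() or bool(xpu is not None and xpu.is_available())


def choose_model_id(
    has_accelerator: bool,
    large_id: str = "moonshotai/kimi-k2-instruct-0905",
    small_id: str = "Qwen/Qwen2.5-0.5B-Instruct",  # assumption: illustrative small model
) -> str:
    """Pick the large checkpoint only when an accelerator is present."""
    return large_id if has_accelerator else small_id


model_id = choose_model_id(accelerator_available())
print(model_id)
```

The resulting `model_id` can then be passed to `HuggingFacePipeline.from_model_id`. Detecting an accelerator does not guarantee that a very large checkpoint fits in memory, so this check is a first filter rather than a complete answer.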

In [None]:
from langchain_core.prompts import ChatPromptTemplate

PROMPT = "Explain {topic} in 2 bullets"
prompt = ChatPromptTemplate.from_template(PROMPT)

# Pipe the formatted prompt into the chat model
chain = prompt | chat_llama

response = chain.invoke({"topic": "AI"})
print(response.content)

Here are 2 bullets that explain AI:

• **Artificial Intelligence (AI) is a computer system that enables machines to perform tasks that typically require human intelligence, such as learning, problem-solving, and decision-making.**

• **AI systems use algorithms and data to analyze and process information, allowing them to make predictions, classify objects, and generate insights that can be used in various applications, including image recognition, natural language processing, and predictive analytics.**
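
To make the `prompt | chat_llama` composition less opaque, here is a toy sketch of how a pipe operator can chain two steps so that the output of the first becomes the input of the second. It is a simplified illustration in plain Python, not LangChain's actual implementation. The `Step` class, `fill_prompt`, and `fake_model` are hypothetical names, and no real model is called.

```python
class Step:
    """A minimal pipeable unit: wraps a function and supports `a | b`."""

    def __init__(self, fn):
        self.fn = fn

    def invoke(self, value):
        return self.fn(value)

    def __or__(self, other):
        # Compose: run this step first, then feed its result to `other`
        return Step(lambda value: other.invoke(self.invoke(value)))


# Stand-in for the prompt template: fills {topic} from the input dict
fill_prompt = Step(lambda inputs: "Explain {topic} in 2 bullets".format(**inputs))

# Stand-in for the chat model: echoes the prompt instead of generating text
fake_model = Step(lambda text: f"[model reply to: {text}]")

toy_chain = fill_prompt | fake_model
print(toy_chain.invoke({"topic": "AI"}))  # [model reply to: Explain AI in 2 bullets]
```

The real chain follows the same calling pattern: `invoke` receives a dict whose keys match the placeholders in the prompt template.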


## Use LLMs with LangChain and Hugging Face Inference APIs

In [None]:
from langchain_huggingface import ChatHuggingFace, HuggingFaceEndpoint

model_id = "moonshotai/kimi-k2-instruct-0905"

llm_api = HuggingFaceEndpoint(
    repo_id=model_id,
    task="text-generation",
    max_new_tokens=1000,
    do_sample=False,
    temperature=0,
)

chat_llama = ChatHuggingFace(llm=llm_api)

In [7]:
from langchain_core.prompts import ChatPromptTemplate

PROMPT = "Explain {topic} in 2 bullets"
prompt = ChatPromptTemplate.from_template(PROMPT)

chain = prompt | chat_llama

response = chain.invoke({"topic": "AI"})
print(response.content)

HfHubHTTPError: 401 Client Error: Unauthorized for url: https://router.huggingface.co/novita/v3/openai/chat/completions (Request ID: Root=1-697cacb3-6f4a3ca0633847822ee5ded2;032c8a60-d54b-4b9a-85fc-c6a938cd6448)

Invalid username or password.
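
A 401 response like the one above usually means that no valid Hugging Face access token reached the API. The sketch below fails early with a clearer message when the token is missing. It assumes the token is stored in the `.env` file under `HUGGINGFACEHUB_API_TOKEN` or `HF_TOKEN`; the original notebook does not show which variable name it uses, and `resolve_hf_token` is a hypothetical helper.

```python
import os

# Assumed variable names, checked in order of preference
TOKEN_VARS = ("HUGGINGFACEHUB_API_TOKEN", "HF_TOKEN")


def resolve_hf_token(env=os.environ) -> str:
    """Return the first non-empty token found, or raise a descriptive error."""
    for name in TOKEN_VARS:
        token = env.get(name, "").strip()
        if token:
            return token
    raise RuntimeError(
        "No Hugging Face token found; set one of: " + ", ".join(TOKEN_VARS)
    )


# Example with an explicit mapping instead of the real environment
print(resolve_hf_token({"HF_TOKEN": "hf_example"}))  # hf_example
```

The returned value can be passed explicitly as `huggingfacehub_api_token=...` when constructing `HuggingFaceEndpoint`. A token that is present but expired or lacking permissions would still produce a 401, so this check only rules out the missing-token case.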

## Load Groq API Credentials


In [8]:
from dotenv import load_dotenv
import os

load_dotenv()

True

## Use LLMs with LangChain and Groq API

In [9]:
from langchain_groq import ChatGroq

chat_llama = ChatGroq(
    model="llama-3.2-3b-preview",
    temperature=0,
    max_tokens=1000,
)

In [10]:
from langchain_core.prompts import ChatPromptTemplate

PROMPT = "Explain {topic} in 2 bullets"
prompt = ChatPromptTemplate.from_template(PROMPT)

chain = prompt | chat_llama

response = chain.invoke({"topic": "AI"})
print(response.content)

BadRequestError: Error code: 400 - {'error': {'message': 'The model `llama-3.2-3b-preview` has been decommissioned and is no longer supported. Please refer to https://console.groq.com/docs/deprecations for a recommendation on which model to use instead.', 'type': 'invalid_request_error', 'code': 'model_decommissioned'}}
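
The error above reports that `llama-3.2-3b-preview` has been decommissioned, so the request needs a currently supported model name. A minimal sketch of one way to handle this is to keep a small mapping from retired names to replacements. The replacement `llama-3.1-8b-instant` is an assumption that does not come from the original notebook and should be checked against the deprecations page linked in the error message. `REPLACEMENTS` and `resolve_model` are hypothetical names.

```python
# Assumed mapping: verify against https://console.groq.com/docs/deprecations
REPLACEMENTS = {
    "llama-3.2-3b-preview": "llama-3.1-8b-instant",
}


def resolve_model(name: str, replacements: dict = REPLACEMENTS) -> str:
    """Return the replacement for a retired model name, or the name unchanged."""
    return replacements.get(name, name)


model_name = resolve_model("llama-3.2-3b-preview")
print(model_name)  # llama-3.1-8b-instant
```

The resolved name can then be passed as `model=model_name` when constructing `ChatGroq`, with the rest of the chain left unchanged.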