# RunPod LLM

This guide covers how to use the LangChain `RunPod` LLM class to interact with text generation models hosted on [RunPod Serverless](https://www.runpod.io/serverless-gpu).

## Setup

1. **Install the package:**
   ```bash
   pip install -qU langchain-runpod
   ```
2. **Deploy an LLM Endpoint:** Follow the setup steps in the [RunPod Provider Guide](/docs/integrations/providers/runpod#setup) to deploy a compatible text generation endpoint on RunPod Serverless and get its Endpoint ID.
3. **Set Environment Variables:** Make sure `RUNPOD_API_KEY` and `RUNPOD_ENDPOINT_ID` are set.

```python
import getpass
import os

# Make sure environment variables are set (or pass them directly to RunPod)
if "RUNPOD_API_KEY" not in os.environ:
    os.environ["RUNPOD_API_KEY"] = getpass.getpass("Enter your RunPod API Key: ")
if "RUNPOD_ENDPOINT_ID" not in os.environ:
    os.environ["RUNPOD_ENDPOINT_ID"] = input("Enter your RunPod Endpoint ID: ")
```

## Instantiation

Initialize the `RunPod` class. Model-specific generation parameters go in `model_kwargs`; polling behavior is configured with the optional `poll_interval` and `max_polling_attempts` arguments.

```python
from langchain_runpod import RunPod

llm = RunPod(
    # runpod_endpoint_id can be passed here if not set in env
    model_kwargs={
        "max_new_tokens": 256,
        "temperature": 0.6,
        "top_k": 50,
        # Add other parameters supported by your endpoint handler
    },
    # Optional: Adjust polling
    # poll_interval=0.3,
    # max_polling_attempts=100
)
```
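When tuning the two polling options, it can help to estimate how long a call may wait before giving up. The sketch below is only a rough calculation, not part of the `langchain-runpod` API: `max_polling_wait` is a hypothetical helper, and it assumes that `poll_interval` is in seconds and that polls are spaced at a fixed interval with no backoff. Check the package source for the actual behavior.

```python
def max_polling_wait(poll_interval: float, max_polling_attempts: int) -> float:
    """Approximate upper bound, in seconds, on time spent waiting between polls.

    Assumption: a fixed interval between polls and no backoff. Network
    latency of each status request is ignored.
    """
    return poll_interval * max_polling_attempts


# Values taken from the commented-out options in the instantiation example.
budget = max_polling_wait(poll_interval=0.3, max_polling_attempts=100)
print(f"Approximate polling budget: {budget:.1f} s")
```

Under these assumptions the example values give a budget of about 30 seconds, so long generations or cold-started workers may need a larger `max_polling_attempts`.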

## Basic Usage

Use the standard LangChain `invoke`, `ainvoke`, `stream`, and `astream` methods.

```python
prompt = "Write a tagline for an ice cream shop on the moon."

# Invoke (Sync)
try:
    response = llm.invoke(prompt)
    print("--- Sync Invoke Response ---")
    print(response)
except Exception as e:
    print(
        f"Error invoking LLM: {e}. Ensure endpoint ID/API key are correct "
        "and endpoint is active/compatible."
    )
```

```python
# Stream (Sync, simulated via polling /stream)
print("\n--- Sync Stream Response ---")
try:
    for chunk in llm.stream(prompt):
        print(chunk, end="", flush=True)
    print()  # Newline
except Exception as e:
    print(f"\nError streaming LLM: {e}. Ensure endpoint handler supports streaming output format.")
```
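The async methods follow the same pattern as the sync ones. The sketch below assumes `llm` is the `RunPod` instance created earlier and relies only on the standard LangChain `ainvoke` and `astream` interface. `run_async` is a hypothetical helper name introduced for illustration, and how the endpoint actually behaves under async calls depends on your handler.

```python
import asyncio  # used when running from a script, see the note below


async def run_async(llm, prompt):
    """Call the LLM once with ainvoke, then once more with astream."""
    # ainvoke returns the complete generation in one piece.
    full_text = await llm.ainvoke(prompt)

    # astream yields text chunks; print each one and keep a copy.
    chunks = []
    async for chunk in llm.astream(prompt):
        print(chunk, end="", flush=True)
        chunks.append(chunk)
    print()  # Newline after the streamed output

    return full_text, "".join(chunks)
```

In a notebook cell you can write `full_text, streamed = await run_async(llm, prompt)` directly. In a plain script, wrap the call as `asyncio.run(run_async(llm, prompt))`. As with the sync examples, consider wrapping the call in `try`/`except` so that a misconfigured endpoint produces a readable error.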