# How to configure runtime chain internals

# 1 Configurable Fields

configurable_fields lets you expose parameters of a runnable (like the temperature of an LLM) so that they can be set at runtime.

# Concepts
Configurable Field: A configurable field exposes a parameter of a runnable (such as a model's temperature) so it can be changed at runtime instead of being hardcoded. This is useful when the same chain needs to run with different settings.

Temperature: In language models, temperature controls the randomness of the output:

* A lower temperature (e.g., 0) makes the model's output more deterministic (predictable).
* A higher temperature (e.g., 0.9) makes the output more random and creative.

with_config: This method returns a new runnable with the given configuration bound to it, so you can override parameters such as the temperature at runtime.
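The effect of temperature described above can be illustrated with a small standard-library sketch. It uses the common textbook formulation (softmax over scores divided by the temperature) and treats a temperature of 0 as greedy selection; both are assumptions for illustration, and the logits are made-up values, not anything a specific provider exposes.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, scaled by temperature."""
    if temperature <= 0:
        # Assumption: treat temperature 0 as greedy decoding (all mass on the top score).
        top = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == top else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.1)
warm = softmax_with_temperature(logits, 0.9)
greedy = softmax_with_temperature(logits, 0)

print("T=0.1:", [round(p, 3) for p in cold])
print("T=0.9:", [round(p, 3) for p in warm])
print("T=0  :", greedy)
```

At the lower temperature almost all probability sits on the top-scoring candidate, while at 0.9 the other candidates keep a meaningful share, which is why repeated calls vary more.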

In [8]:
%pip install python-dotenv


In [9]:
import os
from dotenv import load_dotenv
from getpass import getpass

# Load environment variables from the .env file
load_dotenv()

# Check if the OPENAI_API_KEY is in the environment variables
if "OPENAI_API_KEY" not in os.environ:
    os.environ["OPENAI_API_KEY"] = getpass("Enter your OpenAI API Key: ")

# Now, you can use the API key as needed
openai_api_key = os.environ.get("OPENAI_API_KEY")


In [1]:
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import ConfigurableField
from langchain_openai import ChatOpenAI

model = ChatOpenAI(temperature=0).configurable_fields(
    temperature=ConfigurableField(
        id="llm_temperature",
        name="LLM Temperature",
        description="The temperature of the LLM",
    )
)

model.invoke("pick a random number")

AIMessage(content='27', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 1, 'prompt_tokens': 11, 'total_tokens': 12, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-b45cea8a-c462-43ce-8033-3a873b20c127-0', usage_metadata={'input_tokens': 11, 'output_tokens': 1, 'total_tokens': 12, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

* Above, we defined temperature as a ConfigurableField that we can set at runtime. To do so, we use the with_config method like this:

In [2]:
model.with_config(configurable={"llm_temperature": 0.9}).invoke("pick a random number")

AIMessage(content='13', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 1, 'prompt_tokens': 11, 'total_tokens': 12, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-9fc2235d-a56a-4ff6-aaf6-1cccc7438dfa-0', usage_metadata={'input_tokens': 11, 'output_tokens': 1, 'total_tokens': 12, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

* Note that the key passed in the configurable dict (llm_temperature) must match the id of the ConfigurableField, not the name of the model parameter.
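To make the relationship between the field id and the underlying parameter concrete, here is a toy sketch in plain Python. ToyModel and its methods are hypothetical stand-ins that only mimic the shape of the calls above; this is not how LangChain implements configurable fields.

```python
from dataclasses import dataclass, field, replace


@dataclass
class ToyModel:
    """Hypothetical stand-in for a model; not a LangChain class."""

    temperature: float = 0.0
    # Maps a public field id to the attribute it controls.
    field_ids: dict = field(default_factory=dict)

    def configurable_fields(self, **fields):
        # fields maps attribute name -> public id, e.g. temperature="llm_temperature"
        ids = {public_id: attr for attr, public_id in fields.items()}
        return replace(self, field_ids={**self.field_ids, **ids})

    def with_config(self, configurable):
        # Translate public ids back to attribute names; unknown keys are ignored in this toy.
        overrides = {
            self.field_ids[key]: value
            for key, value in configurable.items()
            if key in self.field_ids
        }
        return replace(self, **overrides)


toy = ToyModel(temperature=0.0).configurable_fields(temperature="llm_temperature")
hot = toy.with_config({"llm_temperature": 0.9})
unchanged = toy.with_config({"temperature": 0.9})  # attribute name, not the id

print(toy.temperature, hot.temperature, unchanged.temperature)
```

In this sketch the override only takes effect when the key equals the declared id, and the original object keeps its default, mirroring how with_config leaves the base model untouched.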

* We can also do this to affect just one step that's part of a chain:

In [3]:
prompt = PromptTemplate.from_template("Pick a random number above {x}")
chain = prompt | model

chain.invoke({"x": 0})

AIMessage(content='27', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 1, 'prompt_tokens': 14, 'total_tokens': 15, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-37844074-3d1a-4974-98dc-f0cc01840681-0', usage_metadata={'input_tokens': 14, 'output_tokens': 1, 'total_tokens': 15, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})

In [4]:
chain.with_config(configurable={"llm_temperature": 0.9}).invoke({"x": 0})



AIMessage(content='77', additional_kwargs={'refusal': None}, response_metadata={'token_usage': {'completion_tokens': 1, 'prompt_tokens': 14, 'total_tokens': 15, 'completion_tokens_details': {'accepted_prediction_tokens': 0, 'audio_tokens': 0, 'reasoning_tokens': 0, 'rejected_prediction_tokens': 0}, 'prompt_tokens_details': {'audio_tokens': 0, 'cached_tokens': 0}}, 'model_name': 'gpt-3.5-turbo-0125', 'system_fingerprint': None, 'finish_reason': 'stop', 'logprobs': None}, id='run-e5973b6c-08b3-4f45-a459-396c259d90ce-0', usage_metadata={'input_tokens': 14, 'output_tokens': 1, 'total_tokens': 15, 'input_token_details': {'audio': 0, 'cache_read': 0}, 'output_token_details': {'audio': 0, 'reasoning': 0}})
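The two chain calls above differ only in the configuration attached to the chain, yet only the model step changes behavior. A rough mental model, sketched below with hypothetical plain functions rather than LangChain runnables, is that the same configurable dict is handed to every step and each step reads only the ids it has declared.

```python
def format_prompt(inputs, configurable):
    # The prompt step declares no configurable ids, so it ignores the config.
    return f"Pick a random number above {inputs['x']}"


def fake_model(prompt, configurable):
    # Only this step reads llm_temperature; 0 is its default in this toy.
    temperature = configurable.get("llm_temperature", 0)
    return {"prompt": prompt, "temperature": temperature}


def run_chain(steps, inputs, configurable=None):
    """Run steps in order, passing the same config to each one."""
    configurable = configurable or {}
    value = inputs
    for step in steps:
        value = step(value, configurable)
    return value


steps = [format_prompt, fake_model]
default_run = run_chain(steps, {"x": 0})
hot_run = run_chain(steps, {"x": 0}, {"llm_temperature": 0.9})

print(default_run)
print(hot_run)
```

The prompt text is identical in both runs; only the temperature seen by the model step differs.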

# With HubRunnables
Configurable fields also work with HubRunnable, which is useful for switching between prompts at runtime.

In [5]:
from langchain.runnables.hub import HubRunnable

prompt = HubRunnable("rlm/rag-prompt").configurable_fields(
    owner_repo_commit=ConfigurableField(
        id="hub_commit",
        name="Hub Commit",
        description="The Hub commit to pull from",
    )
)

prompt.invoke({"question": "foo", "context": "bar"})

ChatPromptValue(messages=[HumanMessage(content="You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.\nQuestion: foo \nContext: bar \nAnswer:", additional_kwargs={}, response_metadata={})])

* HubRunnable: A LangChain class that wraps a runnable stored in the LangChain Hub (not Hugging Face's Model Hub). In this example the stored object is a prompt.


* HubRunnable("rlm/rag-prompt"):

    * "rlm/rag-prompt" is the owner/repo identifier of a prompt template published on the LangChain Hub.
    * Constructing the HubRunnable pulls that prompt from the Hub for use.

* .configurable_fields(...):

    * Adds runtime configuration to your HubRunnable. In this case, the owner_repo_commit field, which identifies the Hub resource to pull (owner/repo, optionally followed by :commit), becomes configurable.

* ConfigurableField:

     * Defines metadata for a configurable parameter. Here:
          * id="hub_commit": The internal identifier for this field.
          * name="Hub Commit": The human-readable name for the parameter.
          * description="The Hub commit to pull from": Explains the purpose of the parameter.

* invoke(...):
   * This method executes the HubRunnable with the given input. It passes a dictionary of parameters ({"question": "foo", "context": "bar"}) into the loaded prompt.
   * For "rlm/rag-prompt", question and context fill the template's variables. The result is a formatted ChatPromptValue, not a model answer.
* Inputs Explained:
   * "question": "foo": Represents the question you want to ask (e.g., "What is the capital of France?").
   * "context": "bar": Supplies relevant context for the question, which might be required in a Retrieval-Augmented Generation (RAG) system.

In [8]:
prompt.with_config(configurable={"hub_commit": "rlm/rag-prompt-llama"}).invoke(
    {"question": "foo", "context": "bar"}
)

ChatPromptValue(messages=[HumanMessage(content="[INST]<<SYS>> You are an assistant for question-answering tasks. Use the following pieces of retrieved context to answer the question. If you don't know the answer, just say that you don't know. Use three sentences maximum and keep the answer concise.<</SYS>> \nQuestion: foo \nContext: bar \nAnswer: [/INST]", additional_kwargs={}, response_metadata={})])

1. Dynamic Configuration with with_config
   * By using with_config, you can change which prompt the HubRunnable resolves to without re-instantiating it.
   * For instance, in this case:
       * The base HubRunnable is rlm/rag-prompt.
       * The configuration sets hub_commit to rlm/rag-prompt-llama, so a different prompt is pulled instead (a Llama-formatted variant, as the [INST] and <<SYS>> markers in the output show).
2. RAG and Context
   * Retrieval-Augmented Generation (RAG) workflows often require structured inputs:
       * question: A user query.
       * context: Background information to ground the generation.
3. Modularity
   * By separating configuration (with_config) from execution (invoke), you can easily adjust and reuse the same HubRunnable with different settings for various use cases.
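The prompt-switching pattern summarized above can be sketched without network access. Everything below is hypothetical: TOY_HUB is an in-memory dict standing in for the Hub, the demo/... identifiers and template texts are invented, and ToyHubPrompt only imitates the with_config and invoke calls rather than reproducing HubRunnable.

```python
# Hypothetical in-memory "hub": identifier -> template text.
TOY_HUB = {
    "demo/qa-prompt": "Question: {question}\nContext: {context}\nAnswer:",
    "demo/qa-prompt-llama": "[INST] Question: {question}\nContext: {context}\nAnswer: [/INST]",
}


class ToyHubPrompt:
    """Toy stand-in for a hub-backed prompt; not a LangChain class."""

    def __init__(self, owner_repo_commit, hub=TOY_HUB):
        self.owner_repo_commit = owner_repo_commit
        self.hub = hub

    def with_config(self, configurable):
        # "hub_commit" plays the role of the ConfigurableField id.
        target = configurable.get("hub_commit", self.owner_repo_commit)
        return ToyHubPrompt(target, self.hub)  # the original object is left unchanged

    def invoke(self, inputs):
        # Look up the template at call time and fill in its variables.
        return self.hub[self.owner_repo_commit].format(**inputs)


toy_prompt = ToyHubPrompt("demo/qa-prompt")
inputs = {"question": "foo", "context": "bar"}

base_text = toy_prompt.invoke(inputs)
llama_text = toy_prompt.with_config({"hub_commit": "demo/qa-prompt-llama"}).invoke(inputs)

print(base_text)
print(llama_text)
```

The same inputs produce differently formatted prompts depending on the configuration, while toy_prompt itself still points at the base template afterwards.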
