diff --git a/src/docs.json b/src/docs.json
index c102d67fc3..341be676ba 100644
--- a/src/docs.json
+++ b/src/docs.json
@@ -970,6 +970,7 @@
"langsmith/sample-traces"
]
},
+ "langsmith/cost-tracking",
{
"group": "Advanced tracing techniques",
"pages": [
@@ -1005,8 +1006,7 @@
"langsmith/compare-traces",
"langsmith/share-trace",
"langsmith/platform-logs",
- "langsmith/data-export",
- "langsmith/calculate-token-based-costs"
+ "langsmith/data-export"
]
},
{
diff --git a/src/langsmith/calculate-token-based-costs.mdx b/src/langsmith/calculate-token-based-costs.mdx
deleted file mode 100644
index 55a1987613..0000000000
--- a/src/langsmith/calculate-token-based-costs.mdx
+++ /dev/null
@@ -1,109 +0,0 @@
----
-title: Calculate token-based costs for traces
-sidebarTitle: Calculate token-based costs for traces
----
-
-
-* [Providing token counts for LLM runs (spans)](/langsmith/log-llm-trace#provide-token-and-cost-information)
-
-
-LangSmith allows you to track token-based costs for LLM runs. The costs are rolled up to the trace and project level.
-
-There are two ways costs can be tracked:
-
-1. Derived from token counts and model prices
-2. Directly specified as part of the run data
-
-In most cases it is easier to include token counts in the run data and specify model pricing in LangSmith. LangSmith assumes that costs are linear in token counts, broken down by token type. For the few models with non-linear pricing (e.g., above X input tokens the per-token price changes), we recommend computing costs client-side and sending them as part of the run data.
-
-## Send token counts
-
-For LangSmith to accurately derive costs for an LLM run, you need to provide token counts:
-
-* If you are using the LangSmith Python or TS/JS SDK with OpenAI or Anthropic models, the [built-in wrappers](/langsmith/annotate-code#wrap-the-openai-client) will automatically send up token counts, model provider and model name data to LangSmith.
-* If you are using the LangSmith SDK's with other model providers, you should carefully read through [this guide](/langsmith/log-llm-trace#provide-token-and-cost-information).
-* If you are using LangChain Python or TS/JS, token counts, model provider, and model name are automatically sent up to LangSmith for most chat model integrations. If there is a chat model integration that is missing token counts and for which the underlying API includes token counts in the model response, please open a GitHub issue in the [LangChain repo](https://github.com/langchain-ai/langchain).
-
-Token counts must be explicitly provided for accurate cost tracking. See the guide on [providing token and cost information](/langsmith/log-llm-trace#provide-token-and-cost-information) for details on how to include token counts.
-
-## Specify model name
-
-LangSmith reads the LLM model name from the `ls_model_name` field in [run metadata](/langsmith/add-metadata-tags). The [SDK built-in wrappers](/langsmith/annotate-code#wrap-the-openai-client) and any LangChain integrations will automatically handle specifying this metadata for you.
-
-## Set model prices
-
-To compute costs from token counts and model names, we need to know the per-token prices for the model you're using. LangSmith has a [model pricing table](https://smith.langchain.com/settings/workspaces/models) for this. The table comes with pricing information for most OpenAI, Anthropic, and Gemini models. You can add prices for other models, or overwrite pricing for default models.
-
-You can specify prices for prompt (input) and completion (output) tokens. If needed you can provide a more detailed breakdown of prices. For example, some model providers have different pricing for multimodal or cached tokens.
-
-
-
-Hovering over the `...` next to the prompt/completion prices shows you the price breakdown by token type. You can see, for example, if `audio` and `image` prompt tokens have different prices versus default text prompt tokens.
-
-To create a *new entry* in the model pricing map, click on the `Add new model` button in the top right corner.
-
-
-
-Here, you can specify the following fields:
-
-* **Model Name**: The human-readable name of the model.
-* **Match Pattern**: A regex pattern to match the model name. This is used to match the value for `ls_model_name` in the run metadata.
-* **Prompt (Input) Price**: The cost per 1M input tokens for the model. This number is multiplied by the number of tokens in the prompt to calculate the prompt cost.
-* **Completion (Output) Price**: The cost per 1M output tokens for the model. This number is multiplied by the number of tokens in the completion to calculate the completion cost.
-* **Prompt (Input) Price Breakdown** (Optional): The breakdown of price for each different type of prompt token, e.g. `cache_read`, `video`, `audio`, etc.
-* **Completion (Output) Price Breakdown** (Optional): The breakdown of price for each different type of completion token, e.g. `reasoning`, `image`, etc.
-* **Model Activation Date** (Optional): The date from which the pricing is applicable. Only runs after this date will apply this model price.
-* **Provider** (Optional): The provider of the model. If specified, this is matched against `ls_provider` in the run metadata.
-
-Once you have set up the model pricing map, LangSmith will automatically calculate and aggregate the token-based costs for traces based on the token counts provided in the LLM invocations.
-
-
-Please note that updates to the model pricing map are will not be reflected in the costs for traces already logged. We do not currently support backfilling model pricing changes.
-
-
-For specifying pricing breakdowns, here are the detailed token count types used by LangChain chat model integrations and LangSmith SDK wrappers:
-
-```python
-# Standardized
-cache_read
-cache_creation
-reasoning
-audio
-image
-video
-# Anthropic-only
-ephemeral_1h_input_tokens
-ephemeral_5m_input_tokens
-```
-
-## Cost formula
-
-The cost for a run is computed greedily from most-to-lease specific token type. Suppose we set a price of \$2 per 1M prompt tokens with a detailed price of \$1 per 1M `cache_read` prompt tokens, and \$3 per 1M completion tokens. If we uploaded the following usage metadata:
-
-```python
-{
- "input_tokens": 20,
- "input_token_details": {"cache_read": 5},
- "output_tokens": 10,
- "output_token_details": {},
- "total_tokens": 30,
-}
-```
-
-then we'd compute the token costs as follows:
-
-```python
-# A.K.A. prompt_cost
-# Notice that we compute the cache_read cost and then for any
-# remaining input_tokens we apply the default input price.
-input_cost = 5 * 1e-6 + (20 - 5) * 2e-6 # 3.5e-5
-# A.K.A. completion_cost
-output_cost = 10 * 3e-6 # 3e-5
-total_cost = input_cost + output_cost # 6.5e-5
-```
-
-## Send costs directly
-
-If you are tracing an LLM call that returns token cost information, are tracing an API with a non-token based pricing scheme, or otherwise have accurate information around costs at runtime, you may instead populate a `usage_metadata` dict while tracing rather than relying on LangSmith's built-in cost calculations.
-
-See [this guide](/langsmith/log-llm-trace#provide-token-and-cost-information) to learn how to manually provide cost information for a run.
diff --git a/src/langsmith/cost-tracking.mdx b/src/langsmith/cost-tracking.mdx
new file mode 100644
index 0000000000..7f5aaee39a
--- /dev/null
+++ b/src/langsmith/cost-tracking.mdx
@@ -0,0 +1,671 @@
+---
+title: Cost tracking
+sidebarTitle: Cost tracking
+---
+
+Building agents at scale introduces non-trivial, usage-based costs that can be difficult to track. LangSmith automatically records LLM token usage and costs for major providers, and also allows you to submit custom cost data for any additional components.
+
+This gives you a single, unified view of costs across your entire application, which makes it easy to monitor, understand, and debug your spend.
+
+This guide covers:
+- [Viewing costs in the LangSmith UI](#viewing-costs-in-the-langsmith-ui)
+- [How cost tracking works](#cost-tracking)
+- [How to send custom cost data](#send-custom-cost-data)
+
+
+## Viewing costs in the LangSmith UI
+In the [LangSmith UI](https://smith.langchain.com), you can explore usage and spend in three main ways: first by understanding how tokens and costs are broken down, then by viewing those details within individual traces, and finally by inspecting aggregated metrics in project stats and dashboards.
+
+### Token and cost breakdowns
+
+Token usage and costs are broken down into three categories:
+- **Input**: Tokens in the prompt sent to the model. Subtypes include cache reads, text tokens, image tokens, etc.
+- **Output**: Tokens generated in the response from the model. Subtypes include reasoning tokens, text tokens, image tokens, etc.
+- **Other**: Costs from tool calls, retrieval steps, or any custom runs.
+
+You can view detailed breakdowns by hovering over cost sections in the UI. When available, each section is further categorized by subtype.
+
+
+
+
+You can inspect these breakdowns throughout the LangSmith UI, described in the following section.
+
+### Where to view token and cost breakdowns
+
+
+ The trace tree shows the most detailed view of token usage and cost for a single trace. It displays the total usage for the entire trace, aggregated values for each parent run, and token and cost breakdowns for each child run.
+
+ Open any run inside a tracing project to view its trace tree.
+
+
+
+
+
+ The project stats panel shows the total token usage and cost for all traces in a project.
+
+
+
+
+
+ Dashboards help you explore cost and token usage trends over time. The [prebuilt dashboard](/langsmith/dashboards/#prebuilt-dashboards) for a tracing project shows total costs and a cost breakdown by input and output tokens.
+
+ You may also configure custom cost tracking charts in [custom dashboards](/langsmith/dashboards#custom-dashboards).
+
+
+
+
+
+
+
+
+## Cost tracking
+
+You can track costs in two ways:
+
+1. Costs for LLM calls can be **automatically derived from token counts and model prices**
+2. Cost for LLM calls or any other run type can be **manually specified as part of the run data**
+
+The approach you use will depend on what you're tracking and how your model pricing is structured:
+
+
+| Method | Run type: LLM | Run type: Other |
+|--------|---------------|-----------------|
+| **Automatically** | • Calling LLMs with [LangChain](/oss/python/langchain/overview)<br/>• Tracing LLM calls to OpenAI, Anthropic, or models that follow an OpenAI-compliant format with `@traceable`<br/>• Using LangSmith wrappers for [OpenAI](/langsmith/trace-openai) or [Anthropic](/langsmith/trace-anthropic)<br/>• For other model providers, read the [token and cost information guide](/langsmith/log-llm-trace#provide-token-and-cost-information) | Not applicable. |
+| **Manually** | If LLM call costs are non-linear (e.g., they follow a custom cost function) | Send costs for any run type, e.g., tool calls, retrieval steps |
+
+
+### LLM calls: Automatically track costs based on token counts
+
+To compute cost automatically from token usage, you need to provide **token counts**, the **model and provider**, and the **model price**.
+
+
+
+Follow the instructions below if you’re using model providers whose responses don’t follow the same format as OpenAI’s or Anthropic’s.
+
+These steps are **only required** if you are *not*:
+- Calling LLMs with [LangChain](/oss/python/langchain/overview)
+- Using `@traceable` to trace LLM calls to OpenAI, Anthropic, or models that follow an OpenAI-compliant format
+- Using LangSmith wrappers for [OpenAI](/langsmith/trace-openai) or [Anthropic](/langsmith/trace-anthropic).
+
+
+**1. Send token counts**
+
+Many models include token counts as part of the response. You must extract this information and include it in your run using one of the following methods:
+
+
+
+Set a `usage_metadata` field on the run's metadata. The advantage of this approach is that you do not need to change your traced function’s runtime outputs.
+
+
+
+```python Python
+from langsmith import traceable, get_current_run_tree
+
+inputs = [
+ {"role": "system", "content": "You are a helpful assistant."},
+ {"role": "user", "content": "I'd like to book a table for two."},
+]
+
+@traceable(
+ run_type="llm",
+ metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
+)
+def chat_model(messages: list):
+ # Imagine this is the real model output format your application expects
+ assistant_message = {
+ "role": "assistant",
+ "content": "Sure, what time would you like to book the table for?"
+ }
+
+ # Token usage you compute or receive from the provider
+ token_usage = {
+ "input_tokens": 27,
+ "output_tokens": 13,
+ "total_tokens": 40,
+ "input_token_details": {"cache_read": 10}
+ }
+
+ # Attach token usage to the LangSmith run
+ run = get_current_run_tree()
+ run.set(usage_metadata=token_usage)
+
+ return assistant_message
+
+chat_model(inputs)
+```
+
+```typescript TypeScript
+import { traceable, getCurrentRunTree } from "langsmith/traceable";
+
+const inputs = [
+ { role: "system", content: "You are a helpful assistant." },
+ { role: "user", content: "I'd like to book a table for two." },
+];
+
+const chatModel = traceable(
+ async ({ messages }) => {
+ // The output your application expects
+ const assistantMessage = {
+ role: "assistant",
+ content: "Sure, what time would you like to book the table for?",
+ };
+
+ // Token usage you compute or receive from the provider
+ const tokenUsage = {
+ input_tokens: 27,
+ output_tokens: 13,
+ total_tokens: 40,
+ input_token_details: { cache_read: 10 },
+ };
+
+ // Attach usage to the LangSmith run
+ const runTree = getCurrentRunTree();
+ runTree.metadata.usage_metadata = tokenUsage;
+
+ return assistantMessage;
+ },
+ {
+ run_type: "llm",
+ name: "chat_model",
+ metadata: {
+ ls_provider: "my_provider",
+ ls_model_name: "my_model",
+ },
+ }
+);
+
+await chatModel({ messages: inputs });
+```
+
+
+
+
+
+Include the `usage_metadata` key directly within the object returned by your traced function. LangSmith will extract it from the output.
+
+
+
+```python Python
+from langsmith import traceable
+
+inputs = [
+ {"role": "system", "content": "You are a helpful assistant."},
+ {"role": "user", "content": "I'd like to book a table for two."},
+]
+output = {
+ "choices": [
+ {
+ "message": {
+ "role": "assistant",
+ "content": "Sure, what time would you like to book the table for?"
+ }
+ }
+ ],
+ "usage_metadata": {
+ "input_tokens": 27,
+ "output_tokens": 13,
+ "total_tokens": 40,
+ "input_token_details": {"cache_read": 10}
+ },
+}
+
+@traceable(
+ run_type="llm",
+ metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
+)
+def chat_model(messages: list):
+ return output
+
+chat_model(inputs)
+```
+
+```typescript TypeScript
+import { traceable } from "langsmith/traceable";
+
+const messages = [
+ { role: "system", content: "You are a helpful assistant." },
+ { role: "user", content: "I'd like to book a table for two." }
+];
+const output = {
+ choices: [
+ {
+ message: {
+ role: "assistant",
+ content: "Sure, what time would you like to book the table for?",
+ },
+ },
+ ],
+ usage_metadata: {
+ input_tokens: 27,
+ output_tokens: 13,
+ total_tokens: 40,
+ },
+};
+
+const chatModel = traceable(
+ async ({
+ messages,
+ }: {
+ messages: { role: string; content: string }[];
+ }) => {
+ return output;
+ },
+ {
+ run_type: "llm",
+ name: "chat_model",
+ metadata: {
+ ls_provider: "my_provider",
+ ls_model_name: "my_model"
+ }
+ }
+);
+
+await chatModel({ messages });
+```
+
+
+
+
+In either case, the usage metadata should contain a subset of the following LangSmith-recognized fields:
+
+
+
+The following fields in the `usage_metadata` dict are recognized by LangSmith. You can view the full [Python types](https://github.com/langchain-ai/langsmith-sdk/blob/e705fbd362be69dd70229f94bc09651ef8056a61/python/langsmith/schemas.py#L1196-L1227) or [TypeScript interfaces](https://github.com/langchain-ai/langsmith-sdk/blob/e705fbd362be69dd70229f94bc09651ef8056a61/js/src/schemas.ts#L637-L689) directly.
+
+
+
+Number of tokens used in the model input. Sum of all input token types.
+
+
+
+Number of tokens used in the model response. Sum of all output token types.
+
+
+
+Number of tokens used in the input and output. Optional, can be inferred. Sum of `input_tokens` + `output_tokens`.
+
+
+
+Breakdown of input token types. Keys are token-type strings, values are counts. Example `{"cache_read": 5}`.
+
+Known fields include: `audio`, `text`, `image`, `cache_read`, `cache_creation`. Additional fields are possible depending on the model or provider.
+
+
+
+Breakdown of output token types. Keys are token-type strings, values are counts. Example `{"reasoning": 5}`.
+
+Known fields include: `audio`, `text`, `image`, `reasoning`. Additional fields are possible depending on the model or provider.
+
+
+
+Cost of the input tokens.
+
+
+
+Cost of the output tokens.
+
+
+
+Cost of the tokens. Optional, can be inferred. Sum of `input_cost` + `output_cost`.
+
+
+
+Details of the input cost. Keys are token-type strings, values are cost amounts.
+
+
+
+Details of the output cost. Keys are token-type strings, values are cost amounts.
+
+
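+Taken together, a `usage_metadata` payload that provides token counts and (optionally) explicit costs might look like the following sketch. The values are illustrative and reuse the numbers from the examples in this guide:
+
+```python
+usage_metadata = {
+    # Token counts
+    "input_tokens": 27,
+    "output_tokens": 13,
+    "total_tokens": 40,  # optional; can be inferred
+    "input_token_details": {"cache_read": 10},
+    "output_token_details": {"reasoning": 5},
+    # Optional explicit costs (in dollars), if you do not want to rely on
+    # LangSmith's token-based cost calculation
+    "input_cost": 1.1e-6,
+    "input_cost_details": {"cache_read": 2.3e-7},
+    "output_cost": 5.0e-6,
+}
+```
+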
+**Cost Calculations**
+
+The cost for a run is computed greedily from most-to-least specific token type. Suppose you set a price of \$2 per 1M input tokens with a detailed price of \$1 per 1M `cache_read` input tokens, and \$3 per 1M output tokens. If you uploaded the following usage metadata:
+
+```python
+{
+ "input_tokens": 20,
+ "input_token_details": {"cache_read": 5},
+ "output_tokens": 10,
+ "total_tokens": 30,
+}
+```
+
+Then, the token costs would be computed as follows:
+
+```python
+# Notice that LangSmith computes the cache_read cost and then for any
+# remaining input_tokens, the default input price is applied.
+input_cost = 5 * 1e-6 + (20 - 5) * 2e-6 # 3.5e-5
+output_cost = 10 * 3e-6 # 3e-5
+total_cost = input_cost + output_cost # 6.5e-5
+```
+
+
+
+
+**2. Specify model name**
+
+When using a custom model, the following fields need to be specified in a [run's metadata](/langsmith/add-metadata-tags) in order to associate token counts with costs. It's also helpful to provide these metadata fields to identify the model when viewing traces and when filtering.
+
+- `ls_provider`: The provider of the model, e.g., “openai”, “anthropic”
+- `ls_model_name`: The name of the model, e.g., “gpt-4o-mini”, “claude-3-opus-20240229”
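+
+For example, you can attach these fields through the `@traceable` decorator's `metadata` argument, as the code examples in this guide do. A minimal sketch (the provider and model names here are placeholders):
+
+```python
+from langsmith import traceable
+
+@traceable(
+    run_type="llm",
+    # ls_provider and ls_model_name are matched against the model pricing table
+    metadata={"ls_provider": "openai", "ls_model_name": "gpt-4o-mini"},
+)
+def chat_model(messages: list):
+    ...  # call your model provider here
+```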
+
+
+**3. Set model prices**
+
+To compute costs from token counts, LangSmith maps model names to their per-token prices using the [model pricing table](https://smith.langchain.com/settings/workspaces/models).
+
+
+The table comes with pricing information for most OpenAI, Anthropic, and Gemini models. You can [add prices for other models](/langsmith/cost-tracking#create-a-new-model-price-entry), or [overwrite pricing for default models](/langsmith/cost-tracking#update-an-existing-model-price-entry) if you have custom pricing.
+
+
+For models that have different pricing for different token types (e.g., multimodal or cached tokens), you can specify a breakdown of prices for each token type. Hovering over the `...` next to the input/output prices shows you the price breakdown by token type.
+
+
+
+
+
+Updates to the model pricing map are not reflected in the costs for traces already logged. We do not currently support backfilling model pricing changes.
+
+
+
+ To modify the default model prices, create a new entry with the same model, provider, and match pattern as the default entry.
+
+ To create a *new entry* in the model pricing map, click on the `+ Model` button in the top right corner.
+
+
+
+
+ Here, you can specify the following fields:
+
+ * **Model Name**: The human-readable name of the model.
+ * **Input Price**: The cost per 1M input tokens for the model. This number is multiplied by the number of tokens in the prompt to calculate the prompt cost.
+ * **Input Price Breakdown** (Optional): The breakdown of price for each different type of input token, e.g. `cache_read`, `video`, `audio`.
+ * **Output Price**: The cost per 1M output tokens for the model. This number is multiplied by the number of tokens in the completion to calculate the completion cost.
+ * **Output Price Breakdown** (Optional): The breakdown of price for each different type of output token, e.g. `reasoning`, `image`, etc.
+ * **Model Activation Date** (Optional): The date from which the pricing is applicable. Only runs after this date will apply this model price.
+ * **Match Pattern**: A regex pattern to match the model name. This is used to match the value for `ls_model_name` in the run metadata.
+ * **Provider** (Optional): The provider of the model. If specified, this is matched against `ls_provider` in the run metadata.
+
+ Once you have set up the model pricing map, LangSmith will automatically calculate and aggregate the token-based costs for traces based on the token counts provided in the LLM invocations.
+
+
+### LLM calls: Sending costs directly
+
+If your model follows a non-linear pricing scheme, we recommend calculating costs client-side and sending them to LangSmith as `usage_metadata`.
+
+
+Gemini 3 Pro Preview and Gemini 2.5 Pro follow a pricing scheme with a stepwise cost function, which LangSmith supports by default. For any other model with non-linear pricing, calculate costs client-side and send them as shown below.
+
+
+
+
+
+```python Python
+from langsmith import traceable, get_current_run_tree
+
+inputs = [
+ {"role": "system", "content": "You are a helpful assistant."},
+ {"role": "user", "content": "I'd like to book a table for two."},
+]
+
+@traceable(
+ run_type="llm",
+ metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
+)
+def chat_model(messages: list):
+ llm_output = {
+ "choices": [
+ {
+ "message": {
+ "role": "assistant",
+ "content": "Sure, what time would you like to book the table for?"
+ }
+ }
+ ],
+ "usage_metadata": {
+ # Specify cost (in dollars) for the inputs and outputs
+ "input_cost": 1.1e-6,
+ "input_cost_details": {"cache_read": 2.3e-7},
+ "output_cost": 5.0e-6,
+ },
+ }
+ run = get_current_run_tree()
+ run.set(usage_metadata=llm_output["usage_metadata"])
+ return llm_output["choices"][0]["message"]
+
+chat_model(inputs)
+```
+
+```typescript TypeScript
+import { traceable, getCurrentRunTree } from "langsmith/traceable";
+
+const messages = [
+ { role: "system", content: "You are a helpful assistant." },
+ { role: "user", content: "I'd like to book a table for two." }
+];
+
+const chatModel = traceable(
+ async (messages: { role: string; content: string }[]) => {
+ const llmOutput = {
+ choices: [
+ {
+ message: {
+ role: "assistant",
+ content: "Sure, what time would you like to book the table for?",
+ },
+ },
+ ],
+ // Specify cost (in dollars) for the inputs and outputs
+ usage_metadata: {
+ input_cost: 1.1e-6,
+ input_cost_details: { cache_read: 2.3e-7 },
+ output_cost: 5.0e-6,
+ },
+ };
+
+ // Attach usage metadata to the run
+ const runTree = getCurrentRunTree();
+ runTree.metadata.usage_metadata = llmOutput.usage_metadata;
+
+ // Return only the assistant message
+ return llmOutput.choices[0].message;
+ },
+ {
+ run_type: "llm",
+ name: "chat_model",
+ metadata: {
+ ls_provider: "my_provider",
+ ls_model_name: "my_model",
+ },
+ }
+);
+
+await chatModel(messages);
+```
+
+
+
+
+### Other runs: Sending costs
+
+You can also send cost information for any non-LLM run, such as tool calls. The cost must be specified in the `total_cost` field under the run's `usage_metadata`.
+
+
+
+Set a `total_cost` field on the run’s `usage_metadata`. The advantage of this approach is that you do not need to change your traced function’s runtime outputs.
+
+
+
+```python Python
+from langsmith import traceable, get_current_run_tree
+
+# Example tool: get_weather
+@traceable(run_type="tool", name="get_weather")
+def get_weather(city: str):
+ # Your tool logic goes here
+ result = {
+ "temperature_f": 68,
+ "condition": "sunny",
+ "city": city,
+ }
+
+ # Cost for this tool call (computed however you like)
+ tool_cost = 0.0015
+
+ # Attach usage metadata to the LangSmith run
+ run = get_current_run_tree()
+ run.set(usage_metadata={"total_cost": tool_cost})
+
+ # Return only the actual tool result (no usage info)
+ return result
+
+tool_response = get_weather("San Francisco")
+```
+
+```typescript TypeScript
+import { traceable, getCurrentRunTree } from "langsmith/traceable";
+
+// Example tool: get_weather
+const getWeather = traceable(
+ async ({ city }) => {
+ // Your tool logic goes here
+ const result = {
+ temperature_f: 68,
+ condition: "sunny",
+ city,
+ };
+
+ // Cost for this tool call (computed however you like)
+ const toolCost = 0.0015;
+
+ // Attach usage metadata to the LangSmith run
+ const runTree = getCurrentRunTree();
+ runTree.metadata.usage_metadata = {
+ total_cost: toolCost,
+ };
+
+ // Return only the actual tool result (no usage info)
+ return result;
+ },
+ {
+ run_type: "tool",
+ name: "get_weather",
+ }
+);
+
+const toolResponse = await getWeather({ city: "San Francisco" });
+```
+
+
+
+
+
+Include the `usage_metadata` key directly within the object returned by your traced function. LangSmith will extract it from the output.
+
+
+
+```python Python
+from langsmith import traceable
+
+# Example tool: get_weather
+@traceable(run_type="tool", name="get_weather")
+def get_weather(city: str):
+ # Your tool logic goes here
+ result = {
+ "temperature_f": 68,
+ "condition": "sunny",
+ "city": city,
+ }
+
+ # Attach tool call costs here
+ return {
+ **result,
+ "usage_metadata": {
+ "total_cost": 0.0015, # <-- cost for this tool call
+ },
+ }
+
+tool_response = get_weather("San Francisco")
+```
+
+```typescript TypeScript
+import { traceable } from "langsmith/traceable";
+
+// Example tool: get_weather
+const getWeather = traceable(
+ async ({ city }) => {
+ // Your tool logic goes here
+ const result = {
+ temperature_f: 68,
+ condition: "sunny",
+ city,
+ };
+
+ // Attach tool call costs here
+ return {
+ ...result,
+ usage_metadata: {
+ total_cost: 0.0015, // <-- cost for this tool call
+ },
+ };
+ },
+ {
+ run_type: "tool",
+ name: "get_weather",
+ }
+);
+
+const toolResponse = await getWeather({ city: "San Francisco" });
+```
+
+
diff --git a/src/langsmith/images/cost-tooltip-dark.png b/src/langsmith/images/cost-tooltip-dark.png
new file mode 100644
index 0000000000..b2e8d68acf
Binary files /dev/null and b/src/langsmith/images/cost-tooltip-dark.png differ
diff --git a/src/langsmith/images/cost-tooltip-light.png b/src/langsmith/images/cost-tooltip-light.png
new file mode 100644
index 0000000000..388fa3747a
Binary files /dev/null and b/src/langsmith/images/cost-tooltip-light.png differ
diff --git a/src/langsmith/images/cost-tracking-chart-dark.png b/src/langsmith/images/cost-tracking-chart-dark.png
new file mode 100644
index 0000000000..e48f41f560
Binary files /dev/null and b/src/langsmith/images/cost-tracking-chart-dark.png differ
diff --git a/src/langsmith/images/cost-tracking-chart-light.png b/src/langsmith/images/cost-tracking-chart-light.png
new file mode 100644
index 0000000000..3be18a0996
Binary files /dev/null and b/src/langsmith/images/cost-tracking-chart-light.png differ
diff --git a/src/langsmith/images/model-price-map-dark.png b/src/langsmith/images/model-price-map-dark.png
new file mode 100644
index 0000000000..54d1afc0d2
Binary files /dev/null and b/src/langsmith/images/model-price-map-dark.png differ
diff --git a/src/langsmith/images/model-price-map-light.png b/src/langsmith/images/model-price-map-light.png
new file mode 100644
index 0000000000..9fd6cec1e8
Binary files /dev/null and b/src/langsmith/images/model-price-map-light.png differ
diff --git a/src/langsmith/images/model-price-map.png b/src/langsmith/images/model-price-map.png
deleted file mode 100644
index d64bca90c5..0000000000
Binary files a/src/langsmith/images/model-price-map.png and /dev/null differ
diff --git a/src/langsmith/images/new-price-map-entry-light.png b/src/langsmith/images/new-price-map-entry-light.png
new file mode 100644
index 0000000000..c232833e21
Binary files /dev/null and b/src/langsmith/images/new-price-map-entry-light.png differ
diff --git a/src/langsmith/images/stats-pane-cost-tracking-dark.png b/src/langsmith/images/stats-pane-cost-tracking-dark.png
new file mode 100644
index 0000000000..d7ccc9b7bd
Binary files /dev/null and b/src/langsmith/images/stats-pane-cost-tracking-dark.png differ
diff --git a/src/langsmith/images/stats-pane-cost-tracking-light.png b/src/langsmith/images/stats-pane-cost-tracking-light.png
new file mode 100644
index 0000000000..c01217e3b3
Binary files /dev/null and b/src/langsmith/images/stats-pane-cost-tracking-light.png differ
diff --git a/src/langsmith/images/trace-tree-costs-dark.png b/src/langsmith/images/trace-tree-costs-dark.png
new file mode 100644
index 0000000000..7e0a722baf
Binary files /dev/null and b/src/langsmith/images/trace-tree-costs-dark.png differ
diff --git a/src/langsmith/images/trace-tree-costs-light.png b/src/langsmith/images/trace-tree-costs-light.png
new file mode 100644
index 0000000000..81ca4ad4b2
Binary files /dev/null and b/src/langsmith/images/trace-tree-costs-light.png differ
diff --git a/src/langsmith/log-llm-trace.mdx b/src/langsmith/log-llm-trace.mdx
index bde1282e39..35ba49d891 100644
--- a/src/langsmith/log-llm-trace.mdx
+++ b/src/langsmith/log-llm-trace.mdx
@@ -549,234 +549,8 @@ To learn more about how to use the `metadata` fields, refer to the [Add metadata
## Provide token and cost information
-LangSmith calculates costs automatically by using the [model pricing table](https://smith.langchain.com/settings/workspaces/models) when token counts are provided. To learn how LangSmith calculates token-based costs, see [this guide](/langsmith/calculate-token-based-costs).
+LangSmith automatically calculates costs derived from token counts and model prices. Learn [how to provide token counts and/or costs in a run](/langsmith/cost-tracking#cost-tracking) and [how to view costs in the LangSmith UI](/langsmith/cost-tracking#viewing-costs-in-the-langsmith-ui).
-Many models include token counts as part of the response. You can provide token counts to LangSmith in one of two ways:
-
-1. Extract usage within your traced function and set a `usage_metadata` field on the run's metadata.
-2. Return a `usage_metadata` field in your traced function outputs.
-
-In both cases, the usage metadata you send should contain a subset of the following LangSmith-recognized fields:
-
-
-You cannot set any fields other than the ones listed below. You do not need to include all fields.
-
-
-```python
-class UsageMetadata(TypedDict, total=False):
- input_tokens: int
- """The number of tokens used for the prompt."""
- output_tokens: int
- """The number of tokens generated as output."""
- total_tokens: int
- """The total number of tokens used."""
- input_token_details: dict[str, float]
- """The details of the input tokens."""
- output_token_details: dict[str, float]
- """The details of the output tokens."""
- input_cost: float
- """The cost of the input tokens."""
- output_cost: float
- """The cost of the output tokens."""
- total_cost: float
- """The total cost of the tokens."""
- input_cost_details: dict[str, float]
- """The cost details of the input tokens."""
- output_cost_details: dict[str, float]
- """The cost details of the output tokens."""
-```
-
-Note that the usage data can also include cost information, in case you do not want to rely on LangSmith's token-based cost formula. This is useful for models with pricing that is not linear by token type.
-
-### Setting run metadata
-
-You can [modify the current run's metadata](/langsmith/add-metadata-tags) with usage information within your traced function. The advantage of this approach is that you do not need to change your traced function's runtime outputs. Here's an example:
-
-
-Requires `langsmith>=0.3.43` (Python) and `langsmith>=0.3.30` (JS/TS).
-
-
-
-
-```python Python
-from langsmith import traceable, get_current_run_tree
-
-inputs = [
- {"role": "system", "content": "You are a helpful assistant."},
- {"role": "user", "content": "I'd like to book a table for two."},
-]
-
-@traceable(
- run_type="llm",
- metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
-)
-def chat_model(messages: list):
- llm_output = {
- "choices": [
- {
- "message": {
- "role": "assistant",
- "content": "Sure, what time would you like to book the table for?"
- }
- }
- ],
- "usage_metadata": {
- "input_tokens": 27,
- "output_tokens": 13,
- "total_tokens": 40,
- "input_token_details": {"cache_read": 10},
- # If you wanted to specify costs:
- # "input_cost": 1.1e-6,
- # "input_cost_details": {"cache_read": 2.3e-7},
- # "output_cost": 5.0e-6,
- },
- }
- run = get_current_run_tree()
- run.set(usage_metadata=llm_output["usage_metadata"])
- return llm_output["choices"][0]["message"]
-
-chat_model(inputs)
-```
-
-```typescript TypeScript
-import { traceable, getCurrentRunTree } from "langsmith/traceable";
-
-const messages = [
- { role: "system", content: "You are a helpful assistant." },
- { role: "user", content: "I'd like to book a table for two." }
-];
-
-const chatModel = traceable(
- async ({
- messages,
- }: {
- messages: { role: string; content: string }[];
- model: string;
- }) => {
- const llmOutput = {
- choices: [
- {
- message: {
- role: "assistant",
- content: "Sure, what time would you like to book the table for?",
- },
- },
- ],
- usage_metadata: {
- input_tokens: 27,
- output_tokens: 13,
- total_tokens: 40,
- },
- };
- const runTree = getCurrentRunTree();
- runTree.metadata.usage_metadata = llmOutput.usage_metadata;
- return llmOutput.choices[0].message;
- },
- {
- run_type: "llm",
- name: "chat_model",
- metadata: {
- ls_provider: "my_provider",
- ls_model_name: "my_model"
- }
- }
-);
-
-await chatModel({ messages });
-```
-
-
-
-### Setting run outputs
-
-You can add a `usage_metadata` key to the function's response to set manual token counts and costs.
-
-
-
-```python Python
-from langsmith import traceable
-
-inputs = [
- {"role": "system", "content": "You are a helpful assistant."},
- {"role": "user", "content": "I'd like to book a table for two."},
-]
-output = {
- "choices": [
- {
- "message": {
- "role": "assistant",
- "content": "Sure, what time would you like to book the table for?"
- }
- }
- ],
- "usage_metadata": {
- "input_tokens": 27,
- "output_tokens": 13,
- "total_tokens": 40,
- "input_token_details": {"cache_read": 10},
- # If you wanted to specify costs:
- # "input_cost": 1.1e-6,
- # "input_cost_details": {"cache_read": 2.3e-7},
- # "output_cost": 5.0e-6,
- },
-}
-
-@traceable(
- run_type="llm",
- metadata={"ls_provider": "my_provider", "ls_model_name": "my_model"}
-)
-def chat_model(messages: list):
- return output
-
-chat_model(inputs)
-```
-
-```typescript TypeScript
-import { traceable } from "langsmith/traceable";
-
-const messages = [
- { role: "system", content: "You are a helpful assistant." },
- { role: "user", content: "I'd like to book a table for two." }
-];
-const output = {
- choices: [
- {
- message: {
- role: "assistant",
- content: "Sure, what time would you like to book the table for?",
- },
- },
- ],
- usage_metadata: {
- input_tokens: 27,
- output_tokens: 13,
- total_tokens: 40,
- },
-};
-
-const chatModel = traceable(
- async ({
- messages,
- }: {
- messages: { role: string; content: string }[];
- model: string;
- }) => {
- return output;
- },
- {
- run_type: "llm",
- name: "chat_model",
- metadata: {
- ls_provider: "my_provider",
- ls_model_name: "my_model"
- }
- }
-);
-
-await chatModel({ messages });
-```
-
-
## Time-to-first-token