add sample rate docs (langfuse#758)

maxdeichmann · web-flow · commit 18fa459b77ed · 2024-08-05T15:13:36.000Z
diff --git a/cookbook/python_sdk_low_level.ipynb b/cookbook/python_sdk_low_level.ipynb
@@ -121,7 +121,8 @@
         "| `LANGFUSE_DEBUG`, `debug` | Optional. Prints debug logs to the console | `False`\n",
         "| `LANGFUSE_THREADS`, `threads` | Specifies the number of consumer threads to execute network requests to the Langfuse server. Helps scaling the SDK for high load. Only increase this if you run into scaling issues. | 1\n",
         "| `LANGFUSE_MAX_RETRIES`, `max_retries` | Specifies the number of times the SDK should retry network requests for tracing. | 3\n",
-        "| `LANGFUSE_TIMEOUT`, `timeout` | Timeout in seonds for network requests | 20"
+        "| `LANGFUSE_TIMEOUT`, `timeout` | Timeout in seconds for network requests | 20\n",
+        "| `LANGFUSE_SAMPLE_RATE`, `sample_rate` | [Sample rate](/docs/tracing-features/sampling) for tracing. | 1.0\n"
       ]
     },
     {
diff --git a/pages/docs/integrations/langchain/tracing.mdx b/pages/docs/integrations/langchain/tracing.mdx
@@ -112,6 +112,7 @@ When initializing the Langfuse handler, you can pass the following **optional**
 | `version`    | `version`   | string  | The version of your application. See [experimentation docs](/docs/experimentation) for details. |
 | `trace_name` |             | string  | Customize the name of the created traces. Defaults to name of chain.                            |
 | `enabled`    | `enabled`   | boolean | Enable or disable the Langfuse integration. Defaults to `true`.                                 |
+| `sample_rate`    | `-`   | float | [Sample rate](/docs/tracing-features/sampling) for tracing.                                       |
 
 ### Interoperability with Langfuse SDKs [#interoperability]
 
diff --git a/pages/docs/integrations/llama-index/get-started.mdx b/pages/docs/integrations/llama-index/get-started.mdx
@@ -95,6 +95,7 @@ You can update trace parameters at any time to add additional context to a trace
 | `tags`       | [Tags](/docs/tracing-features/tags) to categorize and filter traces.      |
 | `version`    | The specified version to trace [experiments](/docs/experimentation).      |
 | `release`    | The specified release to trace [experiments](/docs/experimentation).      |
+| `sample_rate`| [Sample rate](/docs/tracing-features/sampling) for tracing.             |
 
 ```python {11-15}
 from llama_index.core import Settings
diff --git a/pages/docs/integrations/openai/python/get-started.mdx b/pages/docs/integrations/openai/python/get-started.mdx
@@ -172,6 +172,7 @@ You can add the following properties to the openai method, e.g. `openai.chat.com
 | `tags`                  | Set [tags](/docs/tracing-features/tags) to categorize and filter traces.     |
 | `trace_id`              | See "Interoperability with Langfuse Python SDK" (below) for more details.    |
 | `parent_observation_id` | See "Interoperability with Langfuse Python SDK" (below) for more details.    |
+| `sample_rate` | [Sample rate](/docs/tracing-features/sampling) for tracing. |
 
 Example:
 
diff --git a/pages/docs/sdk/python/decorators.mdx b/pages/docs/sdk/python/decorators.mdx
@@ -505,6 +505,10 @@ To avoid this, ensure that the `langfuse_context.flush()` method is called befor
 
 Enable debug mode to get verbose logs. Set the debug mode via the environment variable `LANGFUSE_DEBUG=True`.
 
+### Sampling
+
+Sampling can be controlled via the `LANGFUSE_SAMPLE_RATE` environment variable. See the [sampling documentation](/docs/tracing-features/sampling) for more details.
+
 ### Authentication check
 
 Use `langfuse_context.auth_check()` to verify that your host and API credentials are valid. This operation is blocking and is not recommended for production use.
diff --git a/pages/docs/sdk/python/low-level-sdk.md b/pages/docs/sdk/python/low-level-sdk.md
@@ -61,6 +61,8 @@ langfuse = Langfuse()
 | `LANGFUSE_THREADS`, `threads` | Specifies the number of consumer threads to execute network requests to the Langfuse server. Helps scaling the SDK for high load. Only increase this if you run into scaling issues. | 1
 | `LANGFUSE_MAX_RETRIES`, `max_retries` | Specifies the number of times the SDK should retry network requests for tracing. | 3
 | `LANGFUSE_TIMEOUT`, `timeout` | Timeout in seconds for network requests | 20
+| `LANGFUSE_SAMPLE_RATE`, `sample_rate` | [Sample rate](/docs/tracing-features/sampling) for tracing. | 1.0
+
 
 ## Tracing
 
diff --git a/pages/docs/tracing-features/sampling.mdx b/pages/docs/tracing-features/sampling.mdx
@@ -0,0 +1,101 @@
+---
+description: Configure sampling to control the volume of traces collected by the Langfuse server.
+---
+
+# Sampling
+
+Sampling can be used to control the volume of traces collected by the Langfuse server. 
+
+You can configure the sample rate by setting the `LANGFUSE_SAMPLE_RATE` environment variable or by passing the `sample_rate` parameter to the constructors of the Python SDK. The value must be between 0 and 1. The default value is 1, meaning that all traces are collected. A value of 0.5 means that roughly 50% of traces are collected. The SDK samples at the trace level: if a trace is sampled, all observations and scores within that trace are collected as well.
+
+Support for the JS SDK is coming soon.
+
+<Tabs items={["Python", "OpenAI (Python)",  "Langchain (Python)", "LlamaIndex"]}>
+<Tab>
+
+When using the [`@observe()` decorator](/docs/sdk/python/decorators):
+
+```python
+import os
+from langfuse.decorators import langfuse_context, observe
+os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
+
+@observe()
+def fn():
+    pass
+
+fn()
+```
+
+When using the [low-level SDK](/docs/sdk/python/low-level-sdk):
+
+```python
+import os
+from langfuse import Langfuse
+# Either set the environment variable or the constructor parameter. The latter takes precedence.
+os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
+langfuse = Langfuse(sample_rate=0.5)
+
+trace = langfuse.trace(
+  name="Rap Battle",
+)
+```
+
+</Tab>
+<Tab>
+
+When using the [OpenAI SDK Integration](/docs/integrations/openai):
+
+```python
+import os
+# Either set the environment variable or configure the openai import. The latter takes precedence.
+os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
+from langfuse.openai import openai
+openai.langfuse_sample_rate = 0.5
+
+completion = openai.chat.completions.create(
+  name="test-chat",
+  model="gpt-3.5-turbo",
+  messages=[
+    {"role": "system", "content": "You are a calculator."},
+    {"role": "user", "content": "1 + 1 = "}],
+)
+```
+
+</Tab>
+<Tab>
+
+When using the [CallbackHandler](/docs/integrations/langchain/tracing):
+
+```python
+import os
+from langfuse.callback import CallbackHandler
+# Either set the environment variable or the constructor parameter. The latter takes precedence.
+os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
+handler = CallbackHandler(
+  sample_rate=0.5
+)
+```
+</Tab>
+
+<Tab>
+
+When using the [LlamaIndex Integration](/docs/integrations/llama-index):
+
+```python
+import os
+from llama_index.core import Settings
+from llama_index.core.callbacks import CallbackManager
+from langfuse.llama_index import LlamaIndexCallbackHandler
+# Either set the environment variable or the constructor parameter. The latter takes precedence.
+os.environ["LANGFUSE_SAMPLE_RATE"] = '0.5'
+langfuse_callback_handler = LlamaIndexCallbackHandler(sample_rate=0.5)
+
+Settings.callback_manager = CallbackManager([langfuse_callback_handler])
+
+```
+
+</Tab>
+
+
+</Tabs>
diff --git a/pages/guides/cookbook/python_sdk_low_level.md b/pages/guides/cookbook/python_sdk_low_level.md
@@ -61,6 +61,8 @@ langfuse = Langfuse()
 | `LANGFUSE_THREADS`, `threads` | Specifies the number of consumer threads to execute network requests to the Langfuse server. Helps scaling the SDK for high load. Only increase this if you run into scaling issues. | 1
 | `LANGFUSE_MAX_RETRIES`, `max_retries` | Specifies the number of times the SDK should retry network requests for tracing. | 3
 | `LANGFUSE_TIMEOUT`, `timeout` | Timeout in seconds for network requests | 20
+| `LANGFUSE_SAMPLE_RATE`, `sample_rate` | [Sample rate](/docs/tracing-features/sampling) for tracing. | 1.0
+
 
 ## Tracing
 
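Note on trace-level sampling: the new sampling page states that the SDK samples on the trace level, so all observations and scores of a sampled trace are kept together. The sketch below illustrates one way such a decision could be made consistently per trace. It is not taken from the Langfuse SDK; the helper `is_sampled`, the hashing scheme, and the example events are all assumptions made for illustration.

```python
import hashlib


def is_sampled(trace_id: str, sample_rate: float) -> bool:
    """Hypothetical helper: decide once per trace id whether to keep the trace.

    Assumption: the decision is derived from a hash of the trace id, so every
    observation and score that shares the id gets the same outcome.
    """
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0 and 1")
    # Map the trace id to a stable number in [0, 1).
    digest = hashlib.sha256(trace_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) / 0x100000000
    return bucket < sample_rate


# Illustrative events; the field names are made up for this sketch.
events = [
    {"trace_id": "trace-a", "type": "span"},
    {"trace_id": "trace-a", "type": "score"},
    {"trace_id": "trace-b", "type": "generation"},
]

# Events of the same trace are either all kept or all dropped.
kept = [event for event in events if is_sampled(event["trace_id"], 0.5)]
```

With a rate of 1.0 every trace is kept and with 0.0 none is, which matches the documented default of collecting all traces. Hashing the id rather than drawing a random number per event is a design choice of this sketch: it keeps the decision stable across all events of one trace.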
