
Commit cb102c4

docs: add decorator blog post (langfuse#545)
--------- Co-authored-by: Hassieb Pakzad <hassieb.pakzad@gmail.com>
1 parent 7e9ea15 commit cb102c4

File tree

3 files changed: +216 -1 lines changed

+214
@@ -0,0 +1,214 @@
---
title: "Trace complex LLM applications with the Langfuse decorator (Python)"
date: 2024/04/24
description: When building RAG or agents, many LLM calls and non-LLM inputs feed into the final output. The Langfuse decorator allows you to trace and evaluate them holistically.
ogImage: /images/changelog/2024-03-24-python-decorator.png
tag: integration
author: Marc
---

import { BlogHeader } from "@/components/blog/BlogHeader";

<BlogHeader
  title="Trace complex LLM applications with the Langfuse decorator (Python)"
  description="When building RAG or agents, many LLM calls and non-LLM inputs feed into the final output. The Langfuse decorator allows you to trace and evaluate them holistically."
  date="April 24, 2024"
  authors={["marcklingen", "hassiebpakzad"]}
/>

When we initially built complex agents for web scraping and code generation while in Y Combinator, we quickly recognized the need for new LLM-focused observability to truly understand how our applications produced their outputs. It wasn't just about the LLM calls, but also the retrieval steps, chaining of separate calls, function calling, obtaining the right JSON output, and various API calls that the agents were supposed to perform behind the scenes. In the end, all these steps caused the agents to break frequently for our initial users.

This insight prompted us to start working on Langfuse. As many teams were experimenting, we prioritized easy integrations with frameworks like LangChain and LlamaIndex, as well as wrapping the OpenAI SDK to log individual LLM calls. As applications become increasingly complex and agents are deployed in production, the ability to trace complex applications has become even more crucial.

[Traces](/docs/tracing) are at the core of the Langfuse platform. Until now, generating a trace was straightforward if you used a framework. However, if you didn't, you had to use the low-level SDKs to manually create and nest trace objects, which added verbose instrumentation code to a project. Inspired by the developer experience of tools we admire, such as Sentry and Modal, we created the `@observe()` decorator for Langfuse to make tracing your Python code as simple as possible. Note: we plan to revamp our JS/TS tracing as well.

This post is a deep dive into the `@observe()` decorator, our design objectives, challenges we faced when implementing it, and how it can help you trace complex LLM applications.

## Goals

When we started working on the decorator, we wanted to simplify everything that is complex about manually instrumenting your code. Consider this real-world trace of a complex agent, where maintaining the nesting hierarchy manually using the low-level SDK used to come with additional complexity and boilerplate code.

_Example of a nested trace with multiple LLM calls, non-LLM calls, and a LangChain chain_

<Video
  src="/images/blog/python-decorator/complex-trace.mp4"
  gifStyle
  aspectRatio={16 / 9.93}
/>

The goal of the `@observe()` decorator is to abstract away the complexity of creating and nesting traces and spans, and to make it easy to trace complex applications.

The decorator should:

- Trace all function calls and their outputs in a single trace while maintaining the nesting hierarchy
- Be easy to use and require minimal changes to your code
- Automatically capture the function name, arguments, return value, streamed completions, exceptions, execution time, and nesting of functions
- Be fully compatible with the native [Langfuse integrations](/docs/integrations) for LangChain, LlamaIndex, and the OpenAI SDK
- Encourage reusable abstractions across an LLM-based application without needing to consider how to pass trace objects around
- Support async environments for our many users who run performance-optimized LLM apps (see the async sketch below)
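
As a rough sketch of what async support looks like in practice, decorated coroutines can run concurrently and still be attributed to the same trace. The function names and the sleep-based retrieval step below are placeholders:

```python
import asyncio

from langfuse.decorators import observe

@observe()
async def fetch_context(query: str) -> str:
    # Placeholder for an async retrieval step (e.g. a vector store query)
    await asyncio.sleep(0.1)
    return f"context for: {query}"

@observe()
async def main(query: str):
    # Both child observations are attributed to the trace created by `main`
    return await asyncio.gather(
        fetch_context(query),
        fetch_context(query + " (rephrased)"),
    )

asyncio.run(main("What is Langfuse?"))
```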

## Design decisions

1. The decorator reuses the low-level SDK to create traces and asynchronously batch them to the Langfuse API. This implementation was derived from the PostHog SDKs and is [tested](/guides/cookbook/langfuse_sdk_performance_test) to have little to no impact on the performance of your application.
2. The decorator maintains a call stack internally that keeps track of nested function calls to reflect the observation hierarchy in the trace (see the sketch after this list).
3. To be async-safe, the decorator leverages [Python `contextvars`](https://docs.python.org/3/library/contextvars.html) for managing its state.
4. The same `observe()` decorator is used to create a trace (outermost decorated function) and to add spans to the trace (inner decorated functions). This way, functions can be used in multiple traces without needing to be strictly a "trace" or a "span" function.
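
To illustrate decisions 2-4, here is a heavily simplified sketch of how a decorator can keep a nesting stack in a `contextvars.ContextVar`. It is illustrative only and omits everything the real SDK does on top (capturing inputs, outputs, timings, and exceptions, and batching observations to the Langfuse API):

```python
import contextvars
import functools
import uuid

# Stack of observation ids for the current execution context
_observation_stack = contextvars.ContextVar("observation_stack", default=())

def observe_sketch(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        stack = _observation_stack.get()
        observation_id = str(uuid.uuid4())
        # An empty stack means this is the outermost call, i.e. the trace;
        # otherwise this call becomes a span nested under stack[-1].
        token = _observation_stack.set(stack + (observation_id,))
        try:
            return fn(*args, **kwargs)
        finally:
            # Restore the previous stack when the function returns or raises
            _observation_stack.reset(token)
    return wrapper
```

Because `ContextVar.set` returns a token that is reset in `finally`, concurrent tasks each see their own stack, which is what makes this approach async-safe.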

## Limitation: Python contextvars and ThreadPoolExecutors

The power of observability is most visible in complex applications with production workloads. Async and concurrent environments are common in these applications, and the decorator should work seamlessly in them. The decorator uses Python's `contextvars` to store the current trace context and to ensure that observations are correctly associated with the current execution context. This allows you to use the decorator reliably in async functions.

However, an important exception is Python's ThreadPoolExecutor and ProcessPoolExecutor. The decorator will not work correctly in these environments out of the box, as the `contextvars` are not copied to the new threads or processes. There is an [existing issue](https://github.com/python/cpython/pull/9688#issuecomment-544304996) in Python's standard library and a [great explanation](https://github.com/tiangolo/fastapi/issues/2776#issuecomment-776659392) in the FastAPI repo that discuss this limitation.

In short, the decorator will work correctly in async environments, but not in ThreadPoolExecutors or ProcessPoolExecutors.
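
If you do need to hand work to a thread pool, one workaround (also discussed in the linked FastAPI thread) is to copy the current context in the submitting thread and run the task inside that copy. Treat the following as a sketch rather than official guidance; `process_item` is a placeholder for your own decorated function:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

from langfuse.decorators import observe

@observe()
def process_item(item: str) -> str:
    # Placeholder for work that should show up as a span in the trace
    return item.upper()

@observe()
def main(items):
    with ThreadPoolExecutor(max_workers=4) as executor:
        futures = []
        for item in items:
            # Copy the caller's context (which holds the trace state) and
            # run the task inside that copy in the worker thread
            ctx = contextvars.copy_context()
            futures.append(executor.submit(ctx.run, process_item, item))
        return [f.result() for f in futures]

main(["a", "b", "c"])
```

Changes made inside the copied context do not flow back to the caller; the point is only that the worker sees the caller's trace state so its observations can be nested correctly.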

## Before and after

### Status quo: Low-level SDK

The low-level SDK is very flexible, but it is also _very_ verbose and requires passing Langfuse objects through your code.

```python /langfuse = Langfuse()/ /span = trace.span(name="story")/ /trace_id=trace.id/ /parent_observation_id=span.id/ /span.end(output=output)/ /trace = langfuse.trace(name="main")/ /trace/
from langfuse import Langfuse
from langfuse.openai import openai # OpenAI integration

langfuse = Langfuse()

def story(trace):
    span = trace.span(name="story")
    output = openai.chat.completions.create(
        model="gpt-3.5-turbo",
        max_tokens=100,
        messages=[
            {"role": "system", "content": "You are a great storyteller."},
            {"role": "user", "content": "Once upon a time in a galaxy far, far away..."}
        ],
        trace_id=trace.id,
        parent_observation_id=span.id
    ).choices[0].message.content
    span.end(output=output)
    return output

def main():
    trace = langfuse.trace(name="main")
    return story(trace)
```

### `@observe()` decorator to the rescue

All complexity is abstracted away and you can focus on your business logic. The OpenAI SDK wrapper is aware that it is run within a decorated function and automatically adds its logs to the trace.

```python /@observe()/
from langfuse.decorators import observe
from langfuse.openai import openai # OpenAI integration

@observe()
def story():
    return openai.chat.completions.create(
        model="gpt-3.5-turbo",
        max_tokens=100,
        messages=[
            {"role": "system", "content": "You are a great storyteller."},
            {"role": "user", "content": "Once upon a time in a galaxy far, far away..."}
        ],
    ).choices[0].message.content

@observe()
def main():
    return story()

main()
```

<Frame fullWidth>
  ![Simple OpenAI decorator trace](/images/docs/python-decorator-simple-trace.png)
</Frame>

## Interoperability

The decorator completely replaces the need to use the low-level SDK. It allows for the creation and manipulation of traces, and you can add custom scores and evaluations to these traces as well. Have a look at the extensive [documentation](/docs/sdk/python/decorators) for more details.
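
For example, scores can be attached from within a decorated function via `langfuse_context`. The sketch below uses the `score_current_trace` helper from the Python SDK; the score name, value, and the surrounding business logic are placeholders:

```python
from langfuse.decorators import langfuse_context, observe

@observe()
def handle_request(user_input: str) -> str:
    # ... your business logic and LLM calls ...
    answer = "some answer"

    # Attach a score to the trace created by this decorated function
    langfuse_context.score_current_trace(
        name="user_feedback",
        value=1,
        comment="thumbs up",
    )
    return answer
```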

Langfuse is natively integrated with LangChain, LlamaIndex, and the OpenAI SDK, and the decorator is [fully compatible](/docs/sdk/python/example#interoperability-with-other-integrations) with these integrations. As a result, you can, in a single trace, use the decorator on the outermost function, decorate non-LLM function calls and API calls, and use the native instrumentation for the OpenAI SDK, LangChain, and LlamaIndex.

Example ([cookbook](/guides/cookbook/python_decorators#interoperability-with-other-integrations)):

```python
from langfuse.openai import openai
from langfuse.decorators import langfuse_context, observe

# LlamaIndex imports used below
from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager

# doc1, doc2 (LlamaIndex documents) and chain (LangChain runnable)
# are assumed to be defined elsewhere

@observe()
def openai_fn(calc: str):
    res = openai.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": "You are a very accurate calculator. You output only the result of the calculation."},
            {"role": "user", "content": calc}],
    )
    return res.choices[0].message.content

@observe()
def llama_index_fn(question: str):
    # Set callback manager for LlamaIndex, will apply to all LlamaIndex executions in this function
    langfuse_handler = langfuse_context.get_current_llama_index_handler()
    Settings.callback_manager = CallbackManager([langfuse_handler])

    # Run application
    index = VectorStoreIndex.from_documents([doc1, doc2])
    response = index.as_query_engine().query(question)
    return response

@observe()
def langchain_fn(person: str):
    # Get Langchain Callback Handler scoped to the current trace context
    langfuse_handler = langfuse_context.get_current_langchain_handler()

    # Pass handler to invoke method of chain/agent
    chain.invoke({"person": person}, config={"callbacks": [langfuse_handler]})

@observe()
def main():
    output_openai = openai_fn("5+7")
    output_llamaindex = llama_index_fn("What did he do growing up?")
    output_langchain = langchain_fn("Feynman")

    return output_openai, output_llamaindex, output_langchain

main()
```

## Outlook

The decorator drives open tracing for teams that don't want to commit to a single application framework or ecosystem, but want to easily switch between frameworks while relying on Langfuse as a single [platform](/docs) for all of their experimentation, observability, and evaluation needs.

Roadmap: The decorator is currently only available for Python. We will add a similar implementation for JS/TS.

## Add-on

If you want to build complex applications while being able to easily switch between models, we strongly recommend using this stack:

- Langfuse Decorator for tracing
- Langfuse OpenAI SDK Wrapper for automatic instrumentation of OpenAI calls
- LiteLLM Proxy for standardization of 100+ models on the OpenAI API

Have a look at [this cookbook](/guides/cookbook/integration_litellm_proxy) for an end-to-end example. We really think you'll like this stack, and many teams in the Langfuse community have built on top of it.
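
To make the combination concrete, here is a rough sketch of what this stack can look like. The proxy URL, API key, and model name are placeholders for your own LiteLLM Proxy setup; the cookbook has a complete, tested example:

```python
from langfuse.decorators import observe
from langfuse.openai import openai  # Langfuse OpenAI SDK wrapper

# Point the traced OpenAI client at a locally running LiteLLM Proxy
# (base URL, API key, and model name below are placeholders)
client = openai.OpenAI(base_url="http://localhost:4000", api_key="anything")

@observe()
def summarize(text: str) -> str:
    res = client.chat.completions.create(
        model="claude-3-haiku",  # any model routed by the LiteLLM Proxy
        messages=[{"role": "user", "content": f"Summarize: {text}"}],
    )
    return res.choices[0].message.content
```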

## Thank you

Thank you to everyone who tested the decorator during the beta phase and provided feedback. Before the decorator was released as an official integration, we received several Gists from community members showcasing their own decorator implementations built on top of Langfuse. We're excited to see what you create with it!

## Get Started

Check out the decorator documentation or run the end-to-end example notebook to get started.

import { FileCode, BookOpen } from "lucide-react";

<Cards num={3}>
  <Card title="Docs" href="/docs/sdk/python/decorators" icon={<BookOpen />} />
  <Card
    title="Example notebook"
    href="/docs/sdk/python/example"
    icon={<FileCode />}
  />
</Cards>
Binary file not shown.

theme.config.tsx

+2-1
@@ -25,7 +25,7 @@ import {
   AvailabilityBanner,
   AvailabilitySidebar,
 } from "./components/availability";
-import { CloudflareVideo } from "./components/Video";
+import { CloudflareVideo, Video } from "./components/Video";
 
 const config: DocsThemeConfig = {
   logo: <Logo />,
@@ -227,6 +227,7 @@ const config: DocsThemeConfig = {
     AvailabilityBanner,
     Callout,
     CloudflareVideo,
+    Video,
   },
   banner: {
     key: "launch-week",
