langfuse-go

Go SDK for Langfuse. The core module has zero dependencies beyond the standard library.

Why this library?

There's an existing Go SDK, but it wraps LLM calls behind its own API, manages implicit state (current trace, current span), and requires calling methods in a specific order. That works for simple scripts but gets in the way when you need concurrency or want to keep using provider SDKs directly.

This library does things differently:

  • You keep using the official OpenAI, Anthropic, Gemini, or xAI Go SDKs as-is. Langfuse hooks in through each SDK's native middleware/transport mechanism.
  • Every Trace, Span, and Generation is a value you hold and pass around. No global state, no implicit "current span" stack.
  • Traces propagate through context.Context like any other Go library.
  • The core module only imports the standard library. Each middleware is a separate module so you only pull in the provider SDK you actually use.

Install

Core library:

go get github.com/oiime/langfuse-go

Provider middlewares (install only what you need):

go get github.com/oiime/langfuse-go/middlewares/openai
go get github.com/oiime/langfuse-go/middlewares/anthropic
go get github.com/oiime/langfuse-go/middlewares/gemini
go get github.com/oiime/langfuse-go/middlewares/xai

Quick start

Hook Langfuse into the OpenAI Go SDK -- every call is automatically traced:

package main

import (
    "context"
    "fmt"

    "github.com/oiime/langfuse-go"
    lfopenai "github.com/oiime/langfuse-go/middlewares/openai"
    "github.com/openai/openai-go/v3"
    "github.com/openai/openai-go/v3/option"
)

func main() {
    ctx := context.Background()

    lf := langfuse.New(langfuse.Config{})  // reads LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY from env
    defer lf.Shutdown(ctx)

    client := openai.NewClient(
        option.WithMiddleware(lfopenai.Middleware(lf)),
    )

    resp, err := client.Chat.Completions.New(ctx, openai.ChatCompletionNewParams{
        Model:    openai.ChatModelGPT4o,
        Messages: []openai.ChatCompletionMessageParamUnion{
            openai.UserMessage("How much wood could a woodchuck chuck if a woodchuck could chuck wood"),
        },
    })
    if err != nil {
        panic(err)
    }
    fmt.Println(resp.Choices[0].Message.Content)
}

Middlewares

Each middleware is a separate Go module:

Provider     Module                                               Mechanism
OpenAI       github.com/oiime/langfuse-go/middlewares/openai      option.WithMiddleware
Anthropic    github.com/oiime/langfuse-go/middlewares/anthropic   option.WithMiddleware
Gemini       github.com/oiime/langfuse-go/middlewares/gemini      http.Client{Transport: ...}
xAI (Grok)   github.com/oiime/langfuse-go/middlewares/xai         http.Client{Transport: ...}

See the examples directory for runnable code for each provider, including streaming.

Grouping calls under a trace

By default each LLM call creates its own trace. To group calls, attach a trace to the context:

trace := lf.Trace(langfuse.TraceParams{
    Name:      "my-pipeline",
    UserID:    "user-123",
    SessionID: "session-456",
})
ctx = langfuse.ContextWithTrace(ctx, trace)

// both calls end up under the same trace
resp1, _ := client.Chat.Completions.New(ctx, params1)
resp2, _ := client.Chat.Completions.New(ctx, params2)

// nest under a span
span := trace.Span(langfuse.SpanParams{Name: "retrieval"})
ctx = langfuse.ContextWithSpan(ctx, span)
resp3, _ := client.Chat.Completions.New(ctx, params3)
span.End()

Streaming

Streaming works transparently. The generation is recorded when the stream closes:

stream := client.Chat.Completions.NewStreaming(ctx, params)
for stream.Next() {
    chunk := stream.Current()
    fmt.Print(chunk.Choices[0].Delta.Content)
}
if err := stream.Err(); err != nil {
    panic(err)
}
stream.Close()  // generation recorded here

Manual tracing

You can build trace trees directly without any middleware:

lf := langfuse.New(langfuse.Config{})
defer lf.Shutdown(ctx)

trace := lf.Trace(langfuse.TraceParams{
    Name:   "document-qa",
    UserID: "user-123",
    Input:  map[string]any{"question": "What is RAG?"},
})

span := trace.Span(langfuse.SpanParams{Name: "vector-search"})
// ... do retrieval ...
span.End(langfuse.SpanUpdate{Output: docs})

gen := trace.Generation(langfuse.GenerationParams{
    Name:  "answer",
    Model: "gpt-4o",
    Input: messages,
})
// ... call LLM ...
gen.End(langfuse.GenerationUpdate{
    Output: response,
    Usage:  &langfuse.Usage{Input: 150, Output: 50, Total: 200, Unit: "TOKENS"},
})

trace.Score(langfuse.ScoreParams{Name: "relevance", Value: 0.95})
trace.Update(langfuse.TraceParams{Output: answer})

Spans and generations can nest arbitrarily:

span := trace.Span(langfuse.SpanParams{Name: "pipeline"})

  child := span.Span(langfuse.SpanParams{Name: "step-1"})
  child.End()

  gen := span.Generation(langfuse.GenerationParams{Name: "llm", Model: "gpt-4o"})
  gen.End(langfuse.GenerationUpdate{Output: "result"})

  span.Event(langfuse.EventParams{Name: "cache-hit"})

span.End()

Read API

The client wraps the Langfuse read API:

trace, err := lf.GetTrace(ctx, "trace-id-123")
fmt.Println(trace.Name, trace.Latency, len(trace.Observations))

resp, err := lf.ListTraces(ctx, langfuse.TracesListParams{
    Name:   "document-qa",
    UserID: "user-123",
    Limit:  10,
})
for _, t := range resp.Data {
    fmt.Println(t.ID, t.Name, t.TotalCost)
}

obs, _ := lf.GetObservation(ctx, "obs-id")
scores, _ := lf.ListScores(ctx, langfuse.ScoresListParams{TraceID: "trace-id-123"})
session, _ := lf.GetSession(ctx, "session-456")
dataset, _ := lf.GetDataset(ctx, "my-eval-set")
prompt, _ := lf.GetPrompt(ctx, "summarize-v2", 0, "production")

Custom middleware

All built-in middlewares use the same public API. The pattern for any provider:

package myprovider

import (
    "net/http"
    "time"

    "github.com/oiime/langfuse-go"
)

func Transport(lf *langfuse.Client, base http.RoundTripper) http.RoundTripper {
    if base == nil {
        base = http.DefaultTransport
    }
    return &transport{lf: lf, base: base}
}

type transport struct {
    lf   *langfuse.Client
    base http.RoundTripper
}

func (t *transport) RoundTrip(req *http.Request) (*http.Response, error) {
    start := time.Now()

    // 1. Read and parse the request body (model, input, params)
    // 2. Call the real API
    resp, err := t.base.RoundTrip(req)
    if err != nil {
        return resp, err
    }
    // 3. Read the response (output, usage)

    // Resolve trace from context, or auto-create one
    trace, span := langfuse.TraceFromContext(req.Context())
    if trace == nil {
        name := langfuse.TraceNameFromContext(req.Context())
        if name == "" {
            name = "my-provider"
        }
        trace = t.lf.Trace(langfuse.TraceParams{Name: name})
    }

    // model and input come from parsing the request in step 1;
    // output and usage come from reading the response in step 3.
    params := langfuse.GenerationParams{
        Name:      "chat",
        Model:     model,
        Input:     input,
        StartTime: &start,
    }
    var gen *langfuse.Generation
    if span != nil {
        gen = span.Generation(params)
    } else {
        gen = trace.Generation(params)
    }
    gen.End(langfuse.GenerationUpdate{Output: output, Usage: usage})

    return resp, nil
}

Useful context helpers for middleware:

  • langfuse.TraceFromContext(ctx) -- get current Trace and Span
  • langfuse.TraceNameFromContext(ctx) -- get trace name hint
  • langfuse.ContextWithTrace(ctx, trace) -- attach a Trace
  • langfuse.ContextWithSpan(ctx, span) -- attach a Span

Configuration

Field                 Env var               Default                     Description
PublicKey             LANGFUSE_PUBLIC_KEY   --                          Langfuse public key (required)
SecretKey             LANGFUSE_SECRET_KEY   --                          Langfuse secret key (required)
Host                  LANGFUSE_BASE_URL     https://cloud.langfuse.com  API host
FlushInterval         --                    500ms                       Time between flushes
FlushBatch            --                    100                         Events per batch
MaxBufferSize         --                    10000                       Max buffered events before dropping oldest
MaxConcurrentFlushes  --                    4                           Max concurrent HTTP flush calls

Config values take precedence over env vars.
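Explicit configuration mirrors the table above. The field names come from the table; the field types (time.Duration for FlushInterval, int for the counts) are assumptions based on the listed defaults:

```go
import "time"

lf := langfuse.New(langfuse.Config{
    PublicKey:            "pk-...",  // or set LANGFUSE_PUBLIC_KEY
    SecretKey:            "sk-...",  // or set LANGFUSE_SECRET_KEY
    Host:                 "https://cloud.langfuse.com",
    FlushInterval:        500 * time.Millisecond, // type assumed from the "500ms" default
    FlushBatch:           100,
    MaxBufferSize:        10000,
    MaxConcurrentFlushes: 4,
})
```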

Design

  • Events are buffered and flushed in the background
  • Trace/Span/Generation/Event/Score calls never block
  • Transient failures (5xx, 429) are retried with backoff
  • Memory is bounded; oldest events are dropped if the buffer fills
  • Calling End() multiple times on a Span or Generation is safe

TODO

  • Expand test coverage for Gemini and xAI middlewares
  • Configurable error handler for ingestion failures
  • Write API for prompts and datasets
  • Configurable logger interface (replace log.Printf)

See CONTRIBUTING.md if you want to help.

AI usage

An LLM was used to generate API implementation types, fix grammar, and write code comments.

License

MIT
