Need help understanding the role of blob storage in Langfuse #10219
Hello. I'm self-hosting Langfuse, and our org requires that we retain data in our S3 buckets for only 30 days. I was wondering whether it's safe to expire the S3 blob storage items, since blob storage is only used as storage for the worker. Once the worker has upserted the data into ClickHouse, the blob storage item should technically no longer be needed.
Replies: 2 comments
Based on the documentation, you're correct that Langfuse uses S3/blob storage to store raw events before the worker processes them and upserts them into ClickHouse(1)(2). The blob storage serves as a scalable and durable storage solution for large files with strong read-after-write guarantees(1).

Regarding your 30-day retention policy: the documentation recommends keeping events for as long as you want to access them within the UI or update them(3). For most customers, a default of 30 days is suggested as a good choice(3). You can implement lifecycle rules to automatically remove old files from your blob storage(3). A 30-day retention policy should be safe for the event upload bucket once the data has been processed by the worker and upserted into ClickHouse(3).

Important exception: this does not apply to the media bucket used for storing uploaded media files(3). Setting a retention policy on the media bucket is not recommended. Instead, use the Langfuse data-retention feature to manage media files properly and avoid broken references across the product(3).

Configuration note: the documentation also gives recommendations for the case where blob storage is used as a disk for ClickHouse.
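
To illustrate the lifecycle rule for the event upload bucket described in this reply, here is a minimal Python sketch, not an official Langfuse recipe. It builds a configuration in the shape that S3's `put_bucket_lifecycle_configuration` call expects. The bucket name `my-langfuse-event-bucket` and the `events/` prefix are placeholder assumptions; substitute whatever your deployment uses. The `apply_rule` helper needs the third-party `boto3` package and AWS credentials, so it is left commented out in the usage section.

```python
import json


def build_expiration_rule(prefix: str, days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under
    `prefix` once they are `days` days old."""
    if days <= 0:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-after-{days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_rule(bucket: str, config: dict) -> None:
    """Apply the configuration to a bucket.

    Note: this call replaces the bucket's entire existing lifecycle
    configuration, so merge in any rules you already have first.
    """
    import boto3  # third-party; imported lazily so the builder runs without it

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=config,
    )


if __name__ == "__main__":
    # "events/" is an assumed prefix for the event upload bucket.
    config = build_expiration_rule(prefix="events/", days=30)
    print(json.dumps(config, indent=2))
    # Hypothetical bucket name; uncomment only against the event upload
    # bucket, never the media bucket.
    # apply_rule("my-langfuse-event-bucket", config)
```

Scoping the rule with a prefix filter keeps it away from other objects in the same bucket, which matters if the event and media uploads share a bucket under different prefixes.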
@dahnny012 Please see our scaling docs around blob storage usage: https://langfuse.com/self-hosting/configuration/scaling#s3--blob-storage-disk-usage. Having a lifecycle rule is safe as long as there are no event updates happening that exceed the lifecycle duration.
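
The condition above can be pictured with a small date check. This is only a sketch of the reasoning, assuming the risk is an update arriving after the original event objects have already been expired from the bucket. The function name and timestamps are hypothetical, and real S3 expiration is not exact to the second, so treat the boundary as approximate.

```python
from datetime import datetime, timedelta, timezone


def update_is_within_retention(
    first_event_at: datetime,
    update_at: datetime,
    retention_days: int,
) -> bool:
    """Return True if an update arrives while the original event objects
    are still expected to exist under a lifecycle rule of `retention_days`."""
    return update_at - first_event_at <= timedelta(days=retention_days)


if __name__ == "__main__":
    created = datetime(2025, 1, 1, tzinfo=timezone.utc)
    early_update = datetime(2025, 1, 20, tzinfo=timezone.utc)  # 19 days later
    late_update = datetime(2025, 2, 15, tzinfo=timezone.utc)   # 45 days later

    print(update_is_within_retention(created, early_update, 30))
    print(update_is_within_retention(created, late_update, 30))
```

If your workloads update traces or observations weeks after creating them, pick a lifecycle duration longer than the longest such gap.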