
Explain how to configure ClickHouse on S3, with some potential caveats. #1385

Open
alexey-milovidov opened this issue Jul 26, 2023 · 1 comment

Comments

@alexey-milovidov
Member

alexey-milovidov commented Jul 26, 2023

Set up TTL for incomplete multipart uploads.
Do not set up a lifecycle policy for object deletion.
Do not set up object versioning.
Do not set up bucket replication.
Do not use FUSE over S3.
Do not point multiple shards to the same prefix in S3.
Do not delete objects on S3 manually.
Use separate buckets for ClickHouse and for other workloads.
How to properly configure the S3 bucket policy to allow the needed operations.
How to properly configure the instance IAM profiles.
How KMS affects performance.
Multi-region vs. single region buckets in GCS.
Choosing the proper region for a bucket.
Caveats of using R2 and B2.
Caveats of using Minio.
The need for a VPC gateway for S3.
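
The first two items above (expire incomplete multipart uploads, but never expire objects) can be sketched as an S3 lifecycle configuration. This is a minimal sketch, not a vetted policy: the 7-day window, the rule ID, and the helper names multipart_abort_lifecycle and apply_lifecycle are assumptions made for illustration. The dictionary follows the shape accepted by boto3's put_bucket_lifecycle_configuration.

```python
import json


def multipart_abort_lifecycle(days: int = 7, prefix: str = "") -> dict:
    """Build a lifecycle configuration that only aborts stale multipart uploads.

    The rule deliberately contains no "Expiration" or "Transition" action,
    so completed objects written by ClickHouse are never deleted or moved.
    The default of 7 days is an assumed value, not a recommendation from the issue.
    """
    if days < 1:
        raise ValueError("days must be at least 1")
    return {
        "Rules": [
            {
                "ID": "abort-incomplete-multipart-uploads",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},  # empty prefix applies to the whole bucket
                "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
            }
        ]
    }


def apply_lifecycle(bucket: str, config: dict) -> None:
    """Apply the configuration to a bucket (requires boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without boto3

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket=bucket, LifecycleConfiguration=config
    )


if __name__ == "__main__":
    # Print the configuration for review instead of applying it.
    print(json.dumps(multipart_abort_lifecycle(), indent=2))
```

Printing the configuration first makes it easy to confirm that the rule carries no object-expiration action before it is applied to a real bucket.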

@alifirat

alifirat commented Oct 3, 2023

I'd also recommend adding a daily check that compares your bucket size with the amount of data referenced by ClickHouse.

On AWS, you can use CloudWatch to monitor your bucket size (the metric is published once a day) and compare it to the system.parts table.
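
A rough sketch of such a check is below. It assumes the ClickHouse disk is named 's3', that the bucket uses the StandardStorage class, and that a 10% gap is worth an alert; all three are assumptions, as are the helper names. The CloudWatch call uses the AWS/S3 BucketSizeBytes metric. Note that with several replicas or zero-copy replication, the per-server totals from system.parts would have to be combined differently.

```python
from datetime import datetime, timedelta, timezone

# Query to run on the ClickHouse side; the disk name 's3' is an assumption.
# Inactive parts are included on purpose because they still occupy bucket space.
PARTS_SIZE_QUERY = (
    "SELECT sum(bytes_on_disk) FROM system.parts WHERE disk_name = 's3'"
)


def bucket_size_bytes(bucket: str, storage_type: str = "StandardStorage") -> float:
    """Fetch the latest daily BucketSizeBytes datapoint (requires boto3)."""
    import boto3  # imported lazily so the comparison helpers work without boto3

    now = datetime.now(timezone.utc)
    response = boto3.client("cloudwatch").get_metric_statistics(
        Namespace="AWS/S3",
        MetricName="BucketSizeBytes",
        Dimensions=[
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": storage_type},
        ],
        StartTime=now - timedelta(days=2),  # the metric is daily, so look back 2 days
        EndTime=now,
        Period=86400,
        Statistics=["Average"],
    )
    datapoints = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    if not datapoints:
        raise RuntimeError("no BucketSizeBytes datapoint found")
    return datapoints[-1]["Average"]


def unreferenced_fraction(bucket_bytes: float, referenced_bytes: float) -> float:
    """Share of the bucket that ClickHouse does not account for (0.0 to 1.0)."""
    if bucket_bytes <= 0:
        return 0.0
    return max(bucket_bytes - referenced_bytes, 0.0) / bucket_bytes


def needs_attention(
    bucket_bytes: float, referenced_bytes: float, threshold: float = 0.10
) -> bool:
    """True when the unreferenced share exceeds the (assumed) threshold."""
    return unreferenced_fraction(bucket_bytes, referenced_bytes) > threshold
```

For example, a bucket of 1000 bytes with 800 bytes referenced in system.parts gives an unreferenced fraction of 0.2, which would trip the assumed 10% threshold.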

@johnnymatthews changed the title from "Knowledge Base: how to configure ClickHouse on S3, potential caveats" to "Explain how to configure ClickHouse on S3, with some potential caveats." on Nov 29, 2023