Documentation: lack of important documentation in multiple areas #21

Open
zamazan4ik opened this issue Mar 28, 2023 · 1 comment
Labels
documentation Improvements or additions to documentation

Comments

@zamazan4ik
Contributor

After reading the documentation, I noticed that several important areas are not covered:

  • Does it offer any kind of multi-cluster support? E.g. cross-cluster replication, standby clusters.
  • Supported hardware architecture/OS combinations. From the Install page, it's not clear which hardware architectures (like x86-64, ARM, etc.) are supported on which operating systems. If there are special instruction set requirements (like SSE or AVX), please document them too. This information should be available in the official documentation, not only in https://github.com/ytsaurus/ytsaurus/blob/main/BUILD.md
  • Are there any recommendations regarding setup on cloud environments (like AWS/Azure/GCP/Yandex Cloud)? E.g. reference architectures, recommended hardware (e.g. recommended AWS EC2 machine type, disks, etc.), maybe even ready-to-use Terraform scripts? What about reference deployment architectures for on-premise installations?
  • How can I install a highly available (HA) cluster? Are there any restrictions/recommendations regarding network latency between nodes? Recommendations for clusters spanning multiple availability zones would also be useful.
  • How do I upgrade or downgrade YTsaurus? Does it support zero-downtime upgrades and downgrades? What about backward/forward compatibility between releases: what is the current policy?
  • How do I back up and restore YTsaurus? Are there any built-in integrity checks for backups?
  • How do I monitor YTsaurus? Does it support any kind of integrated monitoring (like a Prometheus endpoint, statsd integration, etc.)? If yes, how is it configured, and which metrics are supported?
  • It would be great if you could publish and maintain a public roadmap for the product.
  • Is there a built-in benchmark utility like the one in YDB (https://ydb.tech/en/docs/development/load-actors-overview)? It would be useful for benchmarking, choosing the proper cluster size, and performing profile-guided optimization (PGO).
  • Publicly available benchmarks (like https://benchmark.clickhouse.com/) would also be nice to have.
  • Do you perform Jepsen-like tests? :)

I think this list could be turned into a documentation epic and resolved step by step.

Thanks in advance!

@gritukan
Member

gritukan commented Mar 29, 2023

Hello. First of all, thank you very much for such detailed feedback.

Does it offer any kind of multi-cluster support? E.g. cross-cluster replication, standby clusters.

We support automatic replication of dynamic tables (link). For static tables, there is a remote copy operation that copies a table from one cluster to another (link).

Supported hardware architecture/OS combinations.

BUILD.md states that we support only x86_64 and Linux. I am not aware of any special instruction set requirements, but I need to check with colleagues. If there are any, we will add them to BUILD.md.

Are there any recommendations regarding setup on cloud environments
How can I install a highly available (HA) cluster?
How do I back up and restore YTsaurus?
How do I monitor YTsaurus?

These are very good questions for the administrator guide, which is not complete yet.

cc: @psushin, who is working on it.

It would be great if you could publish and maintain a public roadmap for the product.

Yes, this is important. We will think about it.

Is there a built-in benchmark utility like the one in YDB

Unfortunately, no.

Publicly available benchmarks (like https://benchmark.clickhouse.com/) would also be nice to have.

Right now we are working on running the TPC-DS benchmark over YTsaurus and will probably publish the results in a few months. After that, we plan to run TPC-C and YCSB to test the performance of dynamic tables.

Do you perform Jepsen-like tests? :)

We have not run them so far, but I would really like to. We have a stress test for Hydra (our RSM library), but Jepsen tests offer much richer functionality.
