Introduce docs about BlobStorage performance metrics #2509

serbel324 · 2024-03-06T12:57:19Z

...

github-actions · 2024-03-06T13:02:21Z

✅ Documentation build

Revision built successfully
Revision preview link

github-actions · 2024-03-07T15:07:04Z

❌ Documentation build

Revision build failed

Build logs

Errors (1)

❌ Link is unreachable: ../deploy/configuration/config.md#blob-storage-config in /ru/maintenance/manual/performance_metrics.md

github-actions · 2024-03-07T15:15:44Z

✅ Documentation build

Revision built successfully
Revision preview link

github-actions · 2024-03-07T15:25:23Z

✅ Documentation build

Revision built successfully
Revision preview link

github-actions · 2024-03-07T15:38:30Z

✅ Documentation build

Revision built successfully
Revision preview link

github-actions · 2024-03-12T10:16:36Z

✅ Documentation build

Revision built successfully
Revision preview link

github-actions · 2024-03-12T11:00:03Z

✅ Documentation build

Revision built successfully
Revision preview link

ydb/docs/ru/core/maintenance/manual/performance_metrics.md

nsofya · 2024-03-27T14:16:13Z

ydb/docs/ru/core/maintenance/manual/performance_metrics.md

+
+### Configuring metrics
+
+Since the coefficients for the cost formula were measured on specific physical devices, and the performance of other devices may differ, the metrics may need to be scaled before they can be used as a source of BlobStorage guarantees. To do this, set the `DiskTimeAvailableScale` parameter in the [BlobStorage configuration](../../deploy/configuration/config.md#blob-storage-config) to the ratio of the performance of the cluster's devices to that of the reference device. For example, if your system uses NVMe devices and delivers 10% higher performance than the reference, set the following configuration:


What's missing here is information about which devices were used. I'm no expert, of course, but I'd assume that for disks identical to the reference, the coefficient should be 1.
That is, right now it looks as if selecting and investigating the coefficients has been shifted onto the user and has made their life considerably harder. I would assume the usage scenario is as follows: you take a typical disk of a given type, set up the storage, and then use the metrics to check that it works as expected.

If something is wrong with the disk, however, additional calibration and measurements are needed. It may well be that the disks were fine at launch and their performance degraded later. Only then does it make sense to recalculate and re-measure anything.

> for disks identical to the reference, the coefficient should be 1.

Yes, exactly; the default there is 1. I'll add this to the documentation.

> I would assume the usage scenario is as follows: you take a typical disk of a given type, set up the storage, and then use the metrics to check that it works as expected.

In theory you could do that, but then you need several disks rather than one, so that storage groups can be brought up on them.

> That is, right now it looks as if selecting and investigating the coefficients has been shifted onto the user and has made their life considerably harder.

Well, we can't know which devices a user's cluster will have; we can only assume they are roughly like ours. It makes sense to adjust the coefficients if the available disk time at the moment overloads begin does not match the cost estimate, that is, based on practical operating experience.

> If something is wrong with the disk, however, additional calibration and measurements are needed. It may well be that the disks were fine at launch and their performance degraded later. Only then does it make sense to recalculate and re-measure anything.

That can happen, but it signals a hardware problem. Slow hardware needs to be replaced, because a couple of slow disks degrade write latency for the whole group.

github-actions · 2024-05-27T10:35:25Z

❌ Documentation build

Revision build failed

Build logs

Errors (1)

❌ Link is unreachable: ../../development/load-actors-storage.md in /ru/maintenance/manual/performance_metrics.md

github-actions · 2024-05-27T10:53:12Z

❌ Documentation build

Revision build failed

Build logs

Errors (1)

❌ Link is unreachable: ../../development/load-actors-storage.md in /ru/maintenance/manual/performance_metrics.md

github-actions · 2024-05-27T11:02:53Z

❌ Documentation build

Revision build failed

Build logs

Errors (1)

❌ Link is unreachable: ../../development/load-actors-storage.md in /ru/maintenance/manual/performance_metrics.md

github-actions · 2024-05-27T11:27:54Z

✅ Documentation build

Revision built successfully
Revision preview link

github-actions · 2024-05-31T11:39:51Z

✅ Documentation build

Revision built successfully
Revision preview link

ydb/docs/ru/core/maintenance/manual/performance_metrics.md

github-actions · 2024-07-08T13:40:55Z

❌ Documentation build

Revision build failed

Build logs

Errors (1)

❌ No such file or has no access to /ru/maintenance/manual/performance_metrics.md

github-actions · 2024-07-08T13:48:15Z

❌ Documentation build

Revision build failed

Build logs

Errors (1)

❌ No such file or has no access to /ru/maintenance/manual/performance_metrics.md

github-actions · 2024-07-08T14:12:35Z

❌ Documentation build

Revision build failed

Build logs

Errors (2)

❌ Link is unreachable: ../../maintenance/manual/dynamic-config.md in /ru/reference/observability/metrics/performance.md

❌ No such file or has no access to /ru/maintenance/manual/performance_metrics.md

github-actions · 2024-07-08T14:23:40Z

❌ Documentation build

Revision build failed

Build logs

Errors (1)

❌ Link is unreachable: ../../maintenance/manual/dynamic-config.md in /ru/reference/observability/metrics/performance.md

github-actions · 2024-07-08T15:14:40Z

✅ Documentation build

Revision built successfully
Revision preview link

github-actions · 2024-07-10T12:58:00Z

✅ Documentation build

Revision built successfully
Revision preview link

ydb/docs/en/core/reference/observability/metrics/toc_p.yaml

ydb/docs/en/core/reference/observability/metrics/performance.md

github-actions · 2024-07-16T14:29:52Z

✅ Documentation build

Revision built successfully
Revision preview link

github-actions · 2024-07-16T15:51:38Z

✅ Documentation build

Revision built successfully
Revision preview link

ydb/docs/en/core/reference/observability/metrics/distributed-storage-performance.md

…torage-performance.md

github-actions · 2024-07-17T10:58:51Z

✅ Documentation build

Revision built successfully
Revision preview link

…4/ydb into YDBDOCS-615-perforamnce-metrics

github-actions · 2024-07-17T13:19:12Z

✅ Documentation build

Revision built successfully
Revision preview link

lgtm

fomichev3000 · 2024-09-19T08:27:49Z

ydb/docs/en/core/reference/observability/metrics/distributed-storage-performance.md

+
+### Available disk time {#diskTimeAvailable}
+
+The PDisk scheduler manages the requests execution order from its client VDisks. PDisk fairly divides the device's time among its VDisks, ensuring that each of the $n$ VDisks is guaranteed $1/n$ seconds of the physical device's working time each second. Based on the information about the number of neighboring VDisks for each VDisk, denoted as $N$, and the configurable parameter `DiskTimeAvailableScale`, the available disk time estimate, referred to as `DiskTimeAvailable`, is calculated by the formula:


The topic of `DiskTimeAvailableScale` is not covered. It is unclear what this parameter is.

fomichev3000 · 2024-09-19T08:28:27Z

ydb/docs/en/core/reference/observability/metrics/distributed-storage-performance.md

+
+### Available disk time {#diskTimeAvailable}
+
+The PDisk scheduler manages the requests execution order from its client VDisks. PDisk fairly divides the device's time among its VDisks, ensuring that each of the $n$ VDisks is guaranteed $1/n$ seconds of the physical device's working time each second. Based on the information about the number of neighboring VDisks for each VDisk, denoted as $N$, and the configurable parameter `DiskTimeAvailableScale`, the available disk time estimate, referred to as `DiskTimeAvailable`, is calculated by the formula:


Are $n$ and $N$ essentially the same thing in this context? If so, let's use just one of them so as not to confuse the reader.
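For reference, the `DiskTimeAvailable` formula that this paragraph introduces (it is quoted in the Russian-language hunk of this PR) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name is hypothetical, the constant `1000000000` is read as nanoseconds per second, and `DiskTimeAvailableScale` is read as being expressed in thousandths because the formula divides it by 1000. Under that reading, a device 10% faster than the reference would correspond to a scale of 1100.

```python
NS_PER_SECOND = 1_000_000_000  # assumption: the 1000000000 in the formula is ns per second


def disk_time_available(neighbor_vdisks: int, scale: int = 1000) -> float:
    """Evaluate DiskTimeAvailable = (1e9 / N) * (DiskTimeAvailableScale / 1000).

    neighbor_vdisks is N from the formula; scale is DiskTimeAvailableScale.
    The default of 1000 is an assumption that makes the scaling factor 1.
    """
    if neighbor_vdisks <= 0:
        raise ValueError("N must be a positive number of VDisks")
    return NS_PER_SECOND / neighbor_vdisks * scale / 1000


# Four VDisks sharing one PDisk, unscaled and scaled up by 10%.
baseline = disk_time_available(4)
scaled = disk_time_available(4, scale=1100)
print(baseline, scaled)
```

With four VDisks per PDisk, each one is estimated to get a quarter of a second of device time per second, and the scale parameter stretches or shrinks that estimate proportionally.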

fomichev3000 · 2024-09-19T08:30:26Z

ydb/docs/ru/core/reference/observability/metrics/distributed-storage-performance.md

+    DiskTimeAvailable = \dfrac{1000000000}{N} \cdot \dfrac{DiskTimeAvailableScale}{1000} 
+$$
+
+### Load burst detector {#burstDetector}


The section is titled "Load burst detector", but it only defines a burst, not the detector.

fomichev3000 · 2024-09-19T08:31:37Z

ydb/docs/ru/core/reference/observability/metrics/distributed-storage-performance.md

+A burst is a sharp, short-lived increase in load on a VDisk that can degrade operation response times. Sensor values from cluster nodes are collected at fixed intervals, for example once every 15 seconds, which makes it impossible to reliably detect short-lived events using only the request cost and available disk time metrics. To solve this problem, a modified [Token Bucket algorithm](https://ru.wikipedia.org/wiki/%D0%90%D0%BB%D0%B3%D0%BE%D1%80%D0%B8%D1%82%D0%BC_%D1%82%D0%B5%D0%BA%D1%83%D1%89%D0%B5%D0%B3%D0%BE_%D0%B2%D0%B5%D0%B4%D1%80%D0%B0) is used: in our modification the bucket can hold a negative number of tokens, and we call this state underflow. Each VDisk has its own Token Bucket object. The minimum expected response time at which a load increase is considered a burst is defined by the configurable parameter `BurstThresholdNs`. The bucket goes into the underflow state if the estimated duration of processing a burst of requests, in nanoseconds, exceeds `BurstThresholdNs`.
+
+### Performance metrics
+Performance metrics are computed from the following VDisk sensors:


As a user, I have no idea where to get and view these metrics. Where should I look for them? In the logs, on the monitoring page (could you add a link?), and so on.
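To make the burst detector paragraph quoted above more concrete, here is a minimal sketch of a token bucket that is allowed to go negative. It is not the YDB implementation: the class and method names are hypothetical, and it assumes that the bucket capacity equals `BurstThresholdNs`, that each request consumes tokens equal to its estimated cost in nanoseconds, and that tokens refill in proportion to the VDisk's share of device time.

```python
class BurstBucket:
    """Token bucket whose balance may drop below zero (underflow)."""

    def __init__(self, burst_threshold_ns: int, disk_share: float) -> None:
        # Assumption: capacity is BurstThresholdNs and the bucket starts full.
        self.capacity = float(burst_threshold_ns)
        # Assumption: tokens refilled per elapsed nanosecond, e.g. 1/N for N VDisks.
        self.disk_share = disk_share
        self.tokens = self.capacity
        self.last_ns = 0

    def on_request(self, now_ns: int, cost_ns: int) -> bool:
        """Account for one request; return True if the bucket is in underflow."""
        elapsed = max(0, now_ns - self.last_ns)
        self.last_ns = max(self.last_ns, now_ns)
        # Refill for the elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.disk_share)
        # Unlike a classic token bucket, the request is never rejected:
        # the balance is simply allowed to become negative.
        self.tokens -= cost_ns
        return self.tokens < 0


bucket = BurstBucket(burst_threshold_ns=200_000_000, disk_share=0.25)
print(bucket.on_request(0, 150_000_000))   # within the threshold
print(bucket.on_request(0, 100_000_000))   # backlog now exceeds the threshold
print(bucket.on_request(400_000_000, 0))   # idle time refills the bucket
```

Because the state lives next to the VDisk and is updated on every request, an underflow is recorded even if the burst ends well before the next sensor collection.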

serbel324 requested a review from a team as a code owner March 6, 2024 12:57

serbel324 force-pushed the YDBDOCS-615-perforamnce-metrics branch 2 times, most recently from 04e790c to 2abf9a6 Compare March 7, 2024 15:04

serbel324 force-pushed the YDBDOCS-615-perforamnce-metrics branch from 2abf9a6 to 48ce279 Compare March 7, 2024 15:12

serbel324 force-pushed the YDBDOCS-615-perforamnce-metrics branch from 48ce279 to 06ac6a4 Compare March 7, 2024 15:21

serbel324 force-pushed the YDBDOCS-615-perforamnce-metrics branch from 06ac6a4 to 766dfab Compare March 7, 2024 15:32

azevaykin reviewed Mar 26, 2024

nsofya reviewed Mar 27, 2024

blinkov requested changes Jun 14, 2024

serbel324 added 9 commits July 3, 2024 11:17

Add docs about performance metrics

bc9e240

Intermediate

5b83a9a

Add information about fine-tuning

99f8bc3

Address comments

e97ccf1

Address more cthulhu comments

fdba9b2

address comments

532c7af

Update performance_metrics.md

f497462

Update performance_metrics.md

b24a8e7

Update performance_metrics.md

2913993

serbel324 added 2 commits July 3, 2024 11:17

Update performance_metrics.md

3c970df

Add docs about blobstorage performance metrics

0df425e

serbel324 force-pushed the YDBDOCS-615-perforamnce-metrics branch from 72558e2 to 0df425e Compare July 8, 2024 13:37

Rename BlobStorage -> Distributed Storage

81e190b

serbel324 force-pushed the YDBDOCS-615-perforamnce-metrics branch from 73ab3a5 to 81e190b Compare July 8, 2024 13:44

serbel324 force-pushed the YDBDOCS-615-perforamnce-metrics branch from df82567 to 1a0c384 Compare July 8, 2024 14:18

Add pages to index

41d42b8

serbel324 force-pushed the YDBDOCS-615-perforamnce-metrics branch from 1a0c384 to 41d42b8 Compare July 8, 2024 14:19

Fix paths

0dde757

Fix typos

9a3e494

blinkov previously requested changes Jul 15, 2024

Address more comments

09dd67b

Update distributed-storage-performance.md

579a102

blinkov reviewed Jul 17, 2024

ydb/docs/en/core/reference/observability/metrics/distributed-storage-performance.md (outdated)

Update ydb/docs/en/core/reference/observability/metrics/distributed-s…

7b2e58e

…torage-performance.md

serbel324 added 2 commits July 17, 2024 13:13

Address more comments

171dc65

Merge branch 'YDBDOCS-615-perforamnce-metrics' of github.com:serbel32…

22b9c2f

…4/ydb into YDBDOCS-615-perforamnce-metrics

blinkov enabled auto-merge (squash) September 18, 2024 03:43

fomichev3000 requested changes Sep 19, 2024
