[IOTDB-2144][Metric] Collect IoTDB Runtime Metrics #4573

Merged
merged 14 commits into from Dec 22, 2021

Conversation

xinzhongtianxia
Contributor

Description

Thanks to the metric framework, we can now collect core IoTDB runtime metrics. These give users a clear view of their IoTDB instance's status and help R&D and SRE teams resolve problems or build their own monitoring systems.

  1. These metrics cover the current core modules, such as cluster, flush, thrift, compaction, cache, JVM, and logback, among others.
  2. All metrics are provided both through standard JMX beans and in Prometheus format via HTTP APIs.

jira:
https://issues.apache.org/jira/browse/IOTDB-2144
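
For readers who want to consume the Prometheus-format output mentioned in point 2, here is a minimal sketch of parsing that text format. The sample payload and its metric names are hypothetical (they are not taken from IoTDB), and the parser only handles the simple `name{labels} value` form; it does not handle escaped characters inside label values.

```python
import re

# Hypothetical sample in Prometheus text exposition format. The metric names
# are invented for illustration and are NOT the names IoTDB actually exports.
SAMPLE = """\
# HELP example_flush_total Hypothetical counter of flush operations.
# TYPE example_flush_total counter
example_flush_total{module="flush"} 42
example_cache_hit_ratio{name="chunk"} 0.93
example_threads 17
"""

# metric name, optional {label="value",...} block, value, optional timestamp
LINE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'
    r'(?:\{(?P<labels>[^}]*)\})?'
    r'\s+(?P<value>\S+)'
    r'(?:\s+-?\d+)?$'
)
LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        match = LINE.match(line)
        if match is None:
            continue  # ignore lines this simplified parser cannot read
        labels = dict(LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_metrics(SAMPLE):
        print(name, labels, value)
```

In practice the text would come from the HTTP endpoint of a running server instead of the `SAMPLE` string; the endpoint address depends on the deployment's configuration and is not assumed here.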

Contributor

@SpriCoder SpriCoder left a comment

Excellent job! From my perspective, there are some problems that need to be fixed.

@SpriCoder
Contributor

@LebronAl This PR contains some metrics in the cluster module, PTAL!

@OneSizeFitsQuorum OneSizeFitsQuorum added the Module - Cluster PRs for the cluster module label Dec 16, 2021
Contributor

@OneSizeFitsQuorum OneSizeFitsQuorum left a comment

LGTM for the cluster-related metrics.

@SteveYurongSu
Member

SteveYurongSu commented Dec 17, 2021

Thanks to @xinzhongtianxia's work, we now have a performance report.

The report shows that the current monitoring points have no negative impact on performance in this benchmark.

Metrics disabled:

----------------------Main Configurations----------------------
DB_SWITCH: IoTDB-012-SESSION_BY_RECORDS
OPERATION_PROPORTION: 10:1:1:1:1:1:1:1:1:1:1
ENABLE_THRIFT_COMPRESSION: false
INSERT_DATATYPE_PROPORTION: 1:1:1:1:1:1
IS_CLIENT_BIND: true
CLIENT_NUMBER: 15
GROUP_NUMBER: 20
DEVICE_NUMBER: 100
SENSOR_NUMBER: 10
BATCH_SIZE_PER_WRITE: 10
LOOP: 100000
POINT_STEP: 5000
QUERY_INTERVAL: 250000
IS_OUT_OF_ORDER: false
OUT_OF_ORDER_MODE: 0
OUT_OF_ORDER_RATIO: 0.5
---------------------------------------------------------------
main measurements:
Create schema cost 0.04 second
Test elapsed time (not include schema creation): 684.90 second
----------------------------------------------------------Result Matrix----------------------------------------------------------
Operation           okOperation         okPoint             failOperation       failPoint           throughput(point/s)
INGESTION           4998403             499840300           0                   0                   729799.97
PRECISE_POINT       75067               19                  0                   0                   0.03
TIME_RANGE          74887               3743985             0                   0                   5466.47
VALUE_RANGE         75087               3754246             0                   0                   5481.45
AGG_RANGE           75113               75113               0                   0                   109.67
AGG_VALUE           75425               75425               0                   0                   110.13
AGG_RANGE_VALUE     74690               74690               0                   0                   109.05
GROUP_BY            74918               973934              0                   0                   1422.01
LATEST_POINT        75172               75172               0                   0                   109.76
RANGE_QUERY_DESC    74839               3741459             0                   0                   5462.78
VALUE_RANGE_QUERY_DESC 74979              3748794             0                   0                   5473.49
---------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------Latency (ms) Matrix--------------------------------------------------------------------------
Operation           AVG         MIN         P10         P25         MEDIAN      P75         P90         P95         P99         P999        MAX         SLOWEST_THREAD
INGESTION           0.30        0.14        0.16        0.16        0.18        0.22        0.26        0.32        1.48        5.72        60007.18    143014.31
PRECISE_POINT       3.42        0.25        2.17        2.23        2.34        2.84        3.26        3.38        4.49        17.16       59801.91    72709.95
TIME_RANGE          2.82        0.48        2.33        2.41        2.52        3.06        3.53        3.65        5.58        17.76       192.43      14380.86
VALUE_RANGE         5.25        0.50        2.41        2.48        2.43        2.86        3.63        3.79        6.11        17.72       59921.04    74458.15
AGG_RANGE           2.70        0.42        2.30        2.36        2.45        2.68        3.43        3.58        4.77        17.36       316.38      13796.32
AGG_VALUE           80.84       0.64        16.50       40.06       78.96       116.41      138.65      148.63      190.91      218.47      60021.10    455553.88
AGG_RANGE_VALUE     4.12        0.63        2.72        2.85        3.03        3.35        4.15        4.44        7.75        19.05       59772.80    76746.97
GROUP_BY            3.33        0.43        2.16        2.22        2.30        2.54        3.19        3.33        4.62        16.99       59468.04    72096.31
LATEST_POINT        0.25        0.14        0.20        0.21        0.23        0.25        0.28        0.29        0.35        2.71        18.46       1288.20
RANGE_QUERY_DESC    3.64        0.48        2.35        2.41        2.53        3.07        3.53        3.66        5.72        17.76       59943.76    74318.56
VALUE_RANGE_QUERY_DESC 3.67       0.52        2.42        2.49        2.59        2.86        3.63        3.79        5.84        17.66       60011.17    74054.30

Metrics enabled:

----------------------Main Configurations----------------------
DB_SWITCH: IoTDB-012-SESSION_BY_RECORDS
OPERATION_PROPORTION: 10:1:1:1:1:1:1:1:1:1:1
ENABLE_THRIFT_COMPRESSION: false
INSERT_DATATYPE_PROPORTION: 1:1:1:1:1:1
IS_CLIENT_BIND: true
CLIENT_NUMBER: 15
GROUP_NUMBER: 20
DEVICE_NUMBER: 100
SENSOR_NUMBER: 10
BATCH_SIZE_PER_WRITE: 10
LOOP: 100000
POINT_STEP: 5000
QUERY_INTERVAL: 250000
IS_OUT_OF_ORDER: false
OUT_OF_ORDER_MODE: 0
OUT_OF_ORDER_RATIO: 0.5
---------------------------------------------------------------
main measurements:
Create schema cost 0.08 second
Test elapsed time (not include schema creation): 614.42 second
----------------------------------------------------------Result Matrix----------------------------------------------------------
Operation           okOperation         okPoint             failOperation       failPoint           throughput(point/s)
INGESTION           4998403             499840300           0                   0                   813509.32
PRECISE_POINT       75067               15                  0                   0                   0.02
TIME_RANGE          74887               3743969             0                   0                   6093.45
VALUE_RANGE         75087               3754251             0                   0                   6110.19
AGG_RANGE           75113               75113               0                   0                   122.25
AGG_VALUE           75425               75425               0                   0                   122.76
AGG_RANGE_VALUE     74690               74690               0                   0                   121.56
GROUP_BY            74918               973934              0                   0                   1585.12
LATEST_POINT        75172               75172               0                   0                   122.35
RANGE_QUERY_DESC    74839               3741417             0                   0                   6089.30
VALUE_RANGE_QUERY_DESC 74979              3748741             0                   0                   6101.22
---------------------------------------------------------------------------------------------------------------------------------
--------------------------------------------------------------------------Latency (ms) Matrix--------------------------------------------------------------------------
Operation           AVG         MIN         P10         P25         MEDIAN      P75         P90         P95         P99         P999        MAX         SLOWEST_THREAD
INGESTION           0.27        0.16        0.17        0.18        0.20        0.25        0.29        0.37        1.55        6.98        272.78      93945.68
PRECISE_POINT       2.57        0.36        2.13        2.19        2.29        2.69        3.22        3.34        4.20        16.95       220.03      13104.44
TIME_RANGE          2.76        0.83        2.28        2.35        2.45        2.90        3.47        3.61        5.57        17.78       171.88      14154.52
VALUE_RANGE         2.84        0.85        2.38        2.44        2.54        2.83        3.59        3.74        5.69        18.55       199.86      14478.47
AGG_RANGE           2.67        0.75        2.25        2.31        2.40        2.64        3.41        3.56        4.67        17.73       220.22      13721.33
AGG_VALUE           76.64       1.11        15.64       38.45       75.59       112.95      134.03      144.54      182.34      211.64      331.16      399059.57
AGG_RANGE_VALUE     3.33        0.98        2.69        2.82        3.02        3.36        4.16        4.47        8.24        19.43       248.50      16953.70
GROUP_BY            2.55        0.80        2.12        2.18        2.27        2.52        3.18        3.32        4.74        17.91       220.14      13092.05
LATEST_POINT        0.26        0.15        0.20        0.22        0.24        0.26        0.29        0.31        0.48        2.89        81.12       1391.35
RANGE_QUERY_DESC    2.78        0.84        2.29        2.35        2.46        2.90        3.48        3.61        5.57        17.93       245.16      14249.38
VALUE_RANGE_QUERY_DESC 2.84       0.82        2.38        2.45        2.55        2.84        3.60        3.77        5.80        18.58       170.85      14732.33
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
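
To make the two runs easier to compare, the relative change can be computed from the figures reported above. This is a small sketch: the values are copied from the two result matrices, and the helper names `percent_change` and `throughput_changes` are made up for illustration. A single pair of runs cannot separate metric overhead from run-to-run variance, so the output should be read as indicative only.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Throughput (point/s) copied from the result matrices of the two runs.
THROUGHPUT_DISABLED = {
    "INGESTION": 729799.97,
    "TIME_RANGE": 5466.47,
    "GROUP_BY": 1422.01,
}
THROUGHPUT_ENABLED = {
    "INGESTION": 813509.32,
    "TIME_RANGE": 6093.45,
    "GROUP_BY": 1585.12,
}

# Test elapsed time in seconds (schema creation excluded), from both runs.
ELAPSED_DISABLED_S = 684.90
ELAPSED_ENABLED_S = 614.42


def throughput_changes():
    """Map each operation to its throughput change (enabled vs. disabled)."""
    return {
        op: percent_change(THROUGHPUT_DISABLED[op], THROUGHPUT_ENABLED[op])
        for op in THROUGHPUT_DISABLED
    }


if __name__ == "__main__":
    for op, change in throughput_changes().items():
        print(f"{op:<12} throughput change: {change:+.2f}%")
    elapsed = percent_change(ELAPSED_DISABLED_S, ELAPSED_ENABLED_S)
    print(f"elapsed time change: {elapsed:+.2f}%")
```

A positive throughput change means the run with metrics enabled was faster; a negative elapsed-time change means it finished sooner.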

Member

@SteveYurongSu SteveYurongSu left a comment

Pretty nice!

@SteveYurongSu SteveYurongSu changed the title [IOTDB-2144][metric]Collect IoTDB Runtime Metrics [IOTDB-2144][Metric] Collect IoTDB Runtime Metrics Dec 22, 2021
@SteveYurongSu SteveYurongSu merged commit fcd7824 into apache:master Dec 22, 2021
Labels
Module - Cluster PRs for the cluster module
4 participants