Unexpectedly high memory consumption when using CassandraBatchTimeSeries workload #1079

Closed
bmatican opened this issue Mar 26, 2019 · 0 comments
Labels
area/docdb YugabyteDB core features

@bmatican
Contributor

Running an in-house test of our batch workload on YCQL, we are seeing high memory pressure and hitting the soft memory limit.

From the /mem-trackers endpoint on one of the tablet servers (TS):

Id | Current Consumption | Peak consumption | Limit
-- | -- | -- | --
root | 12.57G | 12.57G | 12.35G
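
To relate the table to the soft-limit error in the log, here is a minimal sketch that converts the human-readable sizes to bytes and reports consumption as a percentage of the limit. It assumes the suffixes are binary multiples (G = 1024^3 bytes); `parse_size` and `percent_of_limit` are helper names made up for this illustration and are not part of YugabyteDB.

```python
import re

# Assumption: size suffixes are binary multiples (K = 1024, M = 1024**2, ...).
_UNITS = {"": 1, "B": 1, "K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_size(text: str) -> float:
    """Convert a string such as '12.57G' into a number of bytes."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*([A-Za-z]?)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    value, suffix = match.groups()
    suffix = suffix.upper()
    if suffix not in _UNITS:
        raise ValueError(f"unknown unit suffix: {suffix!r}")
    return float(value) * _UNITS[suffix]


def percent_of_limit(current: str, limit: str) -> float:
    """Return current consumption as a percentage of the limit."""
    return 100.0 * parse_size(current) / parse_size(limit)


# Values from the root row of the table above.
print(f"{percent_of_limit('12.57G', '12.35G'):.2f}%")  # prints 101.78%
```

The result is in the same range as the 101.64% reported in the log below, which was presumably sampled at a slightly different moment.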

Corresponding log messages from peers:

E0326 16:41:54.524134 14680 process_context.cc:180] SQL Error: Execution Error. Write(tablet: 2a0fc6961b134f47a1da85a84d46d0ea, num_ops: 500, num_attempts: 89, txn: 00000000-0000-0000-0000-000000000000) passed its deadline 57676.014s (now: 57689.459s): Remote error (yb/rpc/outbound_call.cc:386): Service unavailable (yb/tserver/tablet_service.cc:239): Soft memory limit exceeded (at 101.64% of capacity)
INSERT INTO batch_ts_metrics_raw (metric_id, ts, value) VALUES (:metric_id, :ts, :value);
       ^^^^
W0326 16:41:54.525184 15093 cql_rpc.cc:271] CQL Call from xx.xx.xx.20:57000 took 73464ms. Details:
W0326 16:41:54.526329 15093 cql_rpc.cc:274] cql_details {
  type: "BATCH"
  call_details {
    sql_id: "a35b1d1d999509e2ab20a7e50d8fb5b3"
    sql_string: "INSERT INTO batch_ts_metrics_raw (metric_id, ts, value) VALUES (:metric_id, :ts, :value);"
  }
  call_details {
    sql_id: "a35b1d1d999509e2ab20a7e50d8fb5b3"
    sql_string: "INSERT INTO batch_ts_metrics_raw (metric_id, ts, value) VALUES (:metric_id, :ts, :value);"
  }
...

cc @kmuthukk

@bmatican bmatican added this to To Do in YBase features via automation Mar 26, 2019
@kmuthukk kmuthukk added the area/docdb YugabyteDB core features label Mar 26, 2019
yugabyte-ci pushed a commit that referenced this issue Apr 3, 2019
Summary:
When one of the followers is missing many log entries, the following situation can occur.
The leader tries to send a large update request, which often times out.
The leader then retries the same request every 3 seconds.

Each of those requests holds the data in memory twice:
1) Request protobuf.
2) Serialized protobuf.

So memory consumption can grow very large very quickly.
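
The growth described above can be approximated with a back-of-the-envelope model. This is only a sketch under stated assumptions: a new attempt starts every `retry_interval_s` seconds, each attempt holds two copies of the request (protobuf plus serialized form), and both copies stay alive for `hold_s` seconds after the attempt starts. The function name, the hold time, and the 100 MiB request size are hypothetical and are not taken from the YugabyteDB code or from this issue.

```python
MIB = 1024 * 1024


def retained_bytes(request_bytes: int, retry_interval_s: int,
                   hold_s: int, elapsed_s: int) -> int:
    """Bytes still held by retry attempts at time elapsed_s.

    Assumed model: attempts start at t = 0, retry_interval_s,
    2 * retry_interval_s, ... and each one keeps two copies of the
    request for hold_s seconds after it starts.
    """
    attempts_started = elapsed_s // retry_interval_s + 1
    live = 0
    for k in range(attempts_started):
        started_at = k * retry_interval_s
        if elapsed_s - started_at < hold_s:
            live += 1  # this attempt has not released its buffers yet
    return live * 2 * request_bytes


# Hypothetical 100 MiB request, retried every 3 seconds, observed after 60 s.
slow_release = retained_bytes(100 * MIB, 3, hold_s=60, elapsed_s=60)
fast_release = retained_bytes(100 * MIB, 3, hold_s=3, elapsed_s=60)
print(slow_release // MIB, "MiB held when buffers live for 60 s")  # 4000 MiB
print(fast_release // MIB, "MiB held when buffers live for 3 s")   # 200 MiB
```

Under this model, the retained memory scales with how long each attempt keeps its buffers, which is why releasing them early matters.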

The following changes are made in this diff:
1) Added mem trackers for serialized data that is being sent or queued.
2) Release the consensus update request protobuf as soon as possible.
3) Release serialized protobufs as soon as they are sent or time out.
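
The idea behind items 1 and 3 can be illustrated with a small sketch. This is not the actual C++ change from the diff: `MemTracker` and `OutboundCall` here are simplified, hypothetical Python stand-ins that only show the pattern of accounting for a serialized buffer while it is queued and dropping it as soon as the call is sent or times out.

```python
class MemTracker:
    """Hypothetical tracker that counts bytes currently held and the peak."""

    def __init__(self, name: str):
        self.name = name
        self.current = 0
        self.peak = 0

    def consume(self, num_bytes: int) -> None:
        self.current += num_bytes
        self.peak = max(self.peak, self.current)

    def release(self, num_bytes: int) -> None:
        self.current -= num_bytes


class OutboundCall:
    """Hypothetical call that owns a serialized request buffer."""

    def __init__(self, serialized: bytes, tracker: MemTracker):
        self._buffer = serialized
        self._tracker = tracker
        tracker.consume(len(serialized))  # account for the queued buffer

    def _release_buffer(self) -> None:
        # Idempotent: a call may be marked sent and later time out.
        if self._buffer is not None:
            self._tracker.release(len(self._buffer))
            self._buffer = None

    def on_sent(self) -> None:
        self._release_buffer()

    def on_timeout(self) -> None:
        self._release_buffer()


tracker = MemTracker("outbound-serialized")
calls = [OutboundCall(b"x" * 1000, tracker) for _ in range(5)]
calls[0].on_sent()
for call in calls[1:]:
    call.on_timeout()
print(tracker.current, tracker.peak)  # prints 0 5000
```

Making the release idempotent keeps the accounting correct even when both the sent and timeout paths fire for the same call.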

Test Plan:
Launch a local cluster with:
bin/yb-ctl --rf 3 create --disable_ysql

Launch the workload:
java -jar target/yb-sample-apps.jar --workload CassandraBatchTimeseries --nodes 127.0.0.1:9042 --num_threads_read 2 --num_threads_write 2 --num_unique_keys -1

Stop one of the nodes for 60 seconds, then start it again:
bin/yb-ctl stop_node 3 && sleep 60 && bin/yb-ctl start_node 3

Check that memory consumption does not grow too high.

Reviewers: mikhail, amitanand, bogdan

Reviewed By: bogdan

Subscribers: ybase, bharat

Differential Revision: https://phabricator.dev.yugabyte.com/D6408
@spolitov spolitov closed this as completed Apr 4, 2019
YBase features automation moved this from To Do to Done Apr 4, 2019
mbautin pushed a commit that referenced this issue Jul 11, 2019
…ed to the

earlier commit 33835b0

Original commit message:

[#1079]: Release sending buffers as soon as possible

mbautin pushed a commit to mbautin/yugabyte-db that referenced this issue Jul 16, 2019

Note:
This commit provides additional functionality that is logically related to
the earlier commit yugabyte@33835b0
and supersedes the commit yugabyte@3e89292