
janitor supports incremental execution #2094

Closed
guidao opened this issue Oct 12, 2022 · 19 comments

@guidao

guidao commented Oct 12, 2022

While running Quickwit, we found that Postgres accumulates a lot of transactions in the Lock state. This caused the indexing pipeline to restart because fetching a connection from the pool timed out.
When I stopped the janitor, I noticed there were far fewer waiting transactions. This may be related to the janitor fetching a very large number of splits at once, so perhaps it could run incrementally, processing a fixed number of splits per batch.


@guidao guidao added the enhancement New feature or request label Oct 12, 2022
@guidao
Author

guidao commented Oct 12, 2022

indexer command: quickwit --service indexer --service metastore
There are 50 indexer pods.

index config:

version: 0

index_id: logs

indexing_settings:
  timestamp_field: unix_ms

search_settings:
  default_search_fields: [log]

sources:
  - source_id: kafka
    source_type: kafka
    num_pipelines: 3
    params:
      topic: log.log-container.stdout
      client_log_level: debug
      client_params:
        bootstrap.servers: '${ip}'
        group.id: quickwit
        debug: all
        log_level: 7
        auto.offset.reset: latest

doc_mapping:
  partition_key: app_name
  max_num_partitions: 50
  tag_fields: [app_name]
  field_mappings:
    - name: app_name
      type: text
      tokenizer: raw
      fast: true
    - name: unit_name
      type: text
      tokenizer: raw
      fast: true
    - name: version
      type: text
      tokenizer: raw
      fast: true
    - name: idc
      type: text
      tokenizer: raw
      fast: true
    - name: container_name
      type: text
      tokenizer: raw
      fast: true
    - name: pod
      type: text
      tokenizer: raw
      fast: true
    - name: log
      type: text
    - name: unix_ms
      type: i64
      fast: true

@boraarslan
Member

Interesting, @evanxg852000 can you take a look at this?

@evanxg852000
Contributor

evanxg852000 commented Oct 12, 2022

@guidao I will try to reproduce this. At first sight: the janitor runs GC, the retention policy, and delete tasks. Since you have a single index without a retention policy configured, I think the retention policy can be ruled out.

@guidao
Author

guidao commented Oct 12, 2022

Yes, and we don't have any delete tasks, so I suspect it's the GC task. We have a lot of splits to be deleted, which could also be the cause of the long transactions.

 metastore=> select split_state, count(1) from splits group by split_state;
    split_state    |  count
-------------------+---------
 MarkedForDeletion | 1731338
 Published         |   95184
 Staged            |     400

@evanxg852000
Contributor

evanxg852000 commented Oct 12, 2022

@guilload @fmassot I think using the system correctly (running all the necessary services) should not yield this huge number of deletable splits. However, we should guard against loading a huge number of SplitMetadata objects into memory by limiting how many we load at a time. Assuming a SplitMetadata is about 1 KB, that is a lot to hold in memory and potentially a lot of data for Postgres to transfer.
I am not entirely sure why Postgres holds locks for so long, but the place where we issue a huge write transaction against Postgres is at the end of the garbage collection, when deleting splits by their ids (String).
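
For illustration, a minimal sketch of how that final delete could run in fixed-size batches so that each transaction stays short. The split_id column name and the batch size of 1000 are assumptions, and in practice the ids would come from the GC pass once the split files have been removed from storage:

-- Sketch only: delete marked splits in small batches instead of one huge statement.
-- Repeat until no rows are deleted; each statement commits quickly and holds row locks briefly.
DELETE FROM splits
WHERE split_id IN (
    SELECT split_id
    FROM splits
    WHERE split_state = 'MarkedForDeletion'
    LIMIT 1000
);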

@fmassot
Contributor

fmassot commented Oct 12, 2022

It's not normal at all. An explanation for this high number of MarkedForDeletion splits could be a repeatedly failing garbage collection, or simply no janitor service running for a long time. @guidao, did you see any error logs on the janitor instance? If yes, can you share the errors?

@guidao
Author

guidao commented Oct 12, 2022

It's not normal at all. An explanation for this high number of MarkedForDeletion splits could be a repeatedly failing garbage collection, or simply no janitor service running for a long time. @guidao, did you see any error logs on the janitor instance? If yes, can you share the errors?

What I have observed is that the janitor works fine at first, but after a certain point it starts timing out on database operations and then never recovers. The error logs were lost for testing reasons, but I can re-run it tomorrow after clearing the data.

Our consumption has not been very stable, and I suspect that a large number of splits in the Staged state, or a large number of merges, may be causing this problem.

@guilload guilload assigned guilload and unassigned evanxg852000 Oct 12, 2022
@guilload
Member

A few things:

  • How many indexer nodes run in your cluster? You have set the parameter num_pipelines to 3, so I expect your cluster to stage and publish at least num_indexers * 3 splits per commit_timeout_secs (in your case 60s, the default value). With a merge_factor of 10, merges will stage and publish an additional ~10% of splits. (A rough estimate for this cluster is worked out at the end of this comment.)

  • The next time the janitor is stalled, can you run these queries, BEFORE AND AFTER you kill the janitor:
    Locks (by table and lock mode)

SELECT mode, pg_class.relname, count(*) FROM pg_locks JOIN pg_class ON (pg_locks.relation = pg_class.oid) WHERE pg_locks.mode IS NOT NULL AND pg_class.relname NOT LIKE 'pg_%%' GROUP BY pg_class.relname, mode;

Deadlocks (per database) (9.2+)

SELECT datname, deadlocks FROM pg_stat_database;

Dead rows

SELECT relname, n_dead_tup FROM pg_stat_user_tables;

In the meantime, I'm going to add some log statements and metrics in the GC and metastore.
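
For reference, a rough estimate based on the figures in this thread (50 indexers, num_pipelines: 3, commit_timeout_secs: 60): 50 * 3 = 150 splits staged and published per minute, plus roughly 10% more from merges, i.e. about 165 splits per minute, or on the order of 240,000 per day. A backlog of ~1.7M MarkedForDeletion splits therefore corresponds to roughly a week of output left uncollected.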

@guidao
Author

guidao commented Oct 12, 2022

@guilload We have 50 Indexers in our cluster. I will try it tomorrow.

@guidao
Author

guidao commented Oct 13, 2022

partition_key: app_name
max_num_partitions: 50

Does the number of splits in the "deleted" state have anything to do with this configuration?

@guidao
Author

guidao commented Oct 13, 2022

Janitor logs (postgres statement_timeout: 300000 ms):

2022-10-13T10:44:19.200Z ERROR quickwit_janitor::actors::garbage_collector: Failed to run garbage collection on index. index_id=logs error=Failed to delete splits from metastore: 'DbError { message: "error returned from database: canceling statement due to statement timeout" }'.
2022-10-13T10:44:19.449Z ERROR quickwit_actors::actor_handle: actor-exit-without-success actor="DeleteTaskPlanner-empty-EE06"
2022-10-13T10:44:19.449Z ERROR quickwit_janitor::actors::delete_task_pipeline: Delete task pipeline failure. index_id="logs" healthy_actors=None failed_or_unhealthy_actors=Some(["DeleteTaskPlanner-empty-EE06"]) success_actors=Some(["MergeSplitDownloader-small-pq0R", "MergeExecutor-young-Fujx"])
2022-10-13T10:44:22.451Z ERROR quickwit_actors::actor_handle: actor-exit-without-success actor="DeleteTaskPlanner-empty-EE06"
2022-10-13T10:44:22.451Z ERROR quickwit_janitor::actors::delete_task_pipeline: Delete task pipeline failure. index_id="logs" healthy_actors=None failed_or_unhealthy_actors=Some(["DeleteTaskPlanner-empty-EE06"]) success_actors=Some(["MergeSplitDownloader-small-pq0R", "MergeExecutor-young-Fujx"])
2022-10-13T10:44:25.453Z ERROR quickwit_actors::actor_handle: actor-exit-without-success actor="DeleteTaskPlanner-empty-EE06"
2022-10-13T10:44:25.453Z ERROR quickwit_janitor::actors::delete_task_pipeline: Delete task pipeline failure. index_id="logs" healthy_actors=None failed_or_unhealthy_actors=Some(["DeleteTaskPlanner-empty-EE06"]) success_actors=Some(["MergeSplitDownloader-small-pq0R", "MergeExecutor-young-Fujx"])

metastore=> SELECT mode, pg_class.relname, count(*) FROM pg_locks JOIN pg_class ON (pg_locks.relation = pg_class.oid) WHERE pg_locks.mode IS NOT NULL AND pg_class.relname NOT LIKE 'pg_%%' GROUP BY pg_class.relname, mode;
       mode       |   relname    | count
------------------+--------------+-------
 RowExclusiveLock | splits       |   143
 RowShareLock     | indexes      |   137
 ExclusiveLock    | indexes      |     5
 RowExclusiveLock | indexes_pkey |   143
 AccessShareLock  | indexes_pkey |   137
 RowExclusiveLock | indexes      |   143
 RowExclusiveLock | splits_pkey  |   143

metastore=> SELECT datname, deadlocks FROM pg_stat_database;
  datname  | deadlocks
-----------+-----------
 postgres  |         0
 metastore |         0
 template1 |         0
 template0 |         0
(4 rows)

metastore=> SELECT relname, n_dead_tup FROM pg_stat_user_tables;
     relname      | n_dead_tup
------------------+------------
 spatial_ref_sys  |          0
 splits           |      33461
 delete_tasks     |          0
 indexes          |        120
 _sqlx_migrations |          5
(5 rows)

metastore=> select count(1), split_state from splits group by split_state;
 count  |    split_state
--------+-------------------
 283923 | MarkedForDeletion
  68598 | Published
 164837 | Staged
(3 rows)
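
If it helps to see which statement is hitting the 300 s statement_timeout, a standard Postgres query (nothing Quickwit-specific) to list the longest-running active statements:

SELECT pid, now() - query_start AS runtime, state, left(query, 100) AS query
FROM pg_stat_activity
WHERE state <> 'idle'
ORDER BY runtime DESC;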

@guidao
Author

guidao commented Oct 14, 2022

@fmassot @guilload I've posted some logs with information that I hope will help with this issue.

@ChillFish8
Contributor

Linking this with #2126 so it's easier to follow the paper trail.

@ChillFish8
Contributor

@guidao can you give a general overview of your Postgres server?

  • How many active connections to the database (server sessions) are there while indexing?
  • Is there any monitoring on the server running Postgres for things like iowait, disk speed, etc.?

@guidao
Author

guidao commented Oct 19, 2022

@guidao can you give a general overview of your Postgres server?

  • How many active connections to the database (server sessions) are there while indexing?
  • Is there any monitoring on the server running Postgres for things like iowait, disk speed, etc.?
  1. About 300-400 connections.
  2. We can't see machine- or disk-level monitoring for the database because it is hosted by the cloud service we are using.

@fulmicoton
Contributor

Can you detail where the 300 connections are coming from? I assume we have 50 indexers? The extra 250 are searchers?

@guidao
Author

guidao commented Oct 19, 2022

Can you detail where the 300 connections are coming from? I assume we have 50 indexers? The extra 250 are searchers?

const CONNECTION_POOL_MAX_SIZE: u32 = 10;

It should be related to this configuration.
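
That would be consistent with the numbers above: roughly 50 indexer pods with a pool of up to 10 connections each gives an upper bound of about 500 metastore connections, so 300-400 active connections is plausible from the indexers alone, before counting any searcher or janitor nodes.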

@fulmicoton
Contributor

Ah yes, we have a connection pool! Understood.

@guilload
Member

guilload commented Nov 26, 2022

We pushed a bunch of fixes that have increased the performance of the janitor. I'm going to close this issue. Feel free to reopen if necessary.
