Skip to content

qryn tables replication support

Lorenzo Mangani edited this page Jul 14, 2022 · 2 revisions

Replication of tables for qryn can be done via manual execution of maintenance scripts described in https://github.com/metrico/qryn/blob/master/lib/db/maintain/scripts.js .

The only thing you need is to change the table engines to Replicated* ones.

Replication scripts example

Note! {{DB}} should be replaced by the name of your database.

CREATE TABLE IF NOT EXISTS time_series (
  date Date,
  fingerprint UInt64,
  labels String,
  name String
) ENGINE = ReplicatedReplacingMergeTree(date) PARTITION BY date ORDER BY fingerprint

#---

CREATE TABLE IF NOT EXISTS samples_v3
(
  fingerprint UInt64,
  timestamp_ns Int64 CODEC(DoubleDelta),
  value Float64 CODEC(Gorilla),
  string String
) ENGINE = ReplicatedMergeTree 
PARTITION BY toStartOfDay(toDateTime(timestamp_ns / 1000000000))
ORDER BY (timestamp_ns)

#---

CREATE TABLE IF NOT EXISTS settings (
  fingerprint UInt64, 
  type String, 
  name String, 
  value String, 
  inserted_at DateTime64(9, 'UTC')
) ENGINE = ReplicatedReplacingMergeTree(inserted_at) 
ORDER BY fingerprint

#---

DROP TABLE IF EXISTS samples_read

#---

CREATE TABLE IF NOT EXISTS samples_read
(
  fingerprint UInt64,
  timestamp_ms Int64,
  value Float64,
  string String
) ENGINE=Merge('{{DB}}', '^(samples|samples_v2)$')

#---

CREATE VIEW IF NOT EXISTS samples_read_v2_1 AS 
    SELECT fingerprint, timestamp_ms * 1000000 as timestamp_ns, value, string FROM samples_read

#---

CREATE TABLE IF NOT EXISTS samples_read_v2_2 (
  fingerprint UInt64,
  timestamp_ns Int64,
  value Float64,
  string String
) ENGINE=Merge('{{DB}}', '^(samples_read_v2_1|samples_v3)$')

#---

CREATE TABLE IF NOT EXISTS time_series_gin (
  date Date,
  key String,
  val String,
  fingerprint UInt64
) ENGINE = ReplicatedReplacingMergeTree() PARTITION BY date ORDER BY (key, val, fingerprint)`,

#---

CREATE MATERIALIZED VIEW IF NOT EXISTS time_series_gin_view TO time_series_gin
   AS SELECT date, pairs.1 as key, pairs.2 as val, fingerprint
   FROM time_series ARRAY JOIN JSONExtractKeysAndValues(time_series.labels, 'String') as pairs
   
#---

INSERT INTO settings (fingerprint, type, name, value, inserted_at) VALUES 
  (cityHash64('update_v3_5'), 'update', 'v3_1', toString(toUnixTimestamp(NOW())), NOW())
  
#---

CREATE TABLE IF NOT EXISTS ver (
  k UInt64, 
  ver UInt64
)ENGINE=ReplicatedReplacingMergeTree(ver) 
ORDER BY k

#---

INSERT INTO ver (k, ver) VALUES (1, 10)

Troubleshooting

Here are suggestions of how to fix some problems connected to the replicated tables engine.

  • if rows are not replicated by clickhouse please check if replicated insert works:
set insert_quorum=<number of clickhouse servers in your cluster>
INSERT INTO samples_v3 (fingerprint, timestamp_ns, string) VALUES (1, 1656583893000000000, 'str');
  • all servers should freely communicate through ports 2181 and 9444. Please check if both ports are reachable with telnet
  • if clickhouse keeper doesn't work it may write some errors in /var/log/clickhouse-server/clickhouse-server.log. But that's optional. It may be absent silently.
  • macros and default paths should be provided. For me the next config worked
    <macros>
        <shard>01</shard>
        <replica>example01-01-1</replica>
    </macros>
    <default_replica_path>/clickhouse/tables/{shard}/{database}/{table}</default_replica_path>
    <default_replica_name>{replica}</default_replica_name>

<replica>example01-01-1</replica> part should wary between servers.

  • if you use hostnames instead of IPs like that
    <zookeeper>
        <node>
            <host>clickhouse_1</host>
            <port>2181</port>
            <secure>0</secure>
       </node>
        <node>
            <host>clickhouse_2</host>
            <port>2181</port>
            <secure>0</secure>
        </node>
    </zookeeper>

Then you should provide hostname.hostname alias for all the nodes. I provided clickhouse_1.clickhouse_1 and clickhouse_2.clickhouse_2