Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try setting wsrep_sync_wait for mysql connections #711

Merged
merged 10 commits into from
Apr 5, 2024
3 changes: 1 addition & 2 deletions cmd/icingadb-migrate/misc.go
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,6 @@ import (
"context"
"crypto/sha1"
"github.com/icinga/icingadb/pkg/contracts"
"github.com/icinga/icingadb/pkg/driver"
"github.com/icinga/icingadb/pkg/icingadb"
"github.com/icinga/icingadb/pkg/icingadb/objectpacker"
icingadbTypes "github.com/icinga/icingadb/pkg/types"
Expand Down Expand Up @@ -110,7 +109,7 @@ func sliceIdoHistory[Row any](
args["checkpoint"] = checkpoint
args["bulk"] = 20000

if ht.snapshot.DriverName() != driver.MySQL {
if ht.snapshot.DriverName() != icingadb.MySQL {
query = strings.ReplaceAll(query, " USE INDEX (PRIMARY)", "")
}

Expand Down
7 changes: 7 additions & 0 deletions config.example.yml
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,13 @@ database:
# It is not possible to set this option to a smaller number than "1".
# max_rows_per_transaction: 8192

# Enforce Galera cluster nodes to perform strict cluster-wide causality checks before executing
# specific SQL queries determined by the number you provided.
# Note: You can only set this option to a number "0 - 15".
# Defaults to 7.
# See https://icinga.com/docs/icinga-db/latest/doc/03-Configuration/#galera-cluster
# wsrep_sync_wait: 7

# Connection configuration for the Redis server where Icinga 2 writes its configuration, state and history items.
# This is the same connection as configured in the 'icingadb' feature of the corresponding Icinga 2 node.
# High availability setups require a dedicated Redis server per Icinga 2 node and
Expand Down
47 changes: 33 additions & 14 deletions doc/03-Configuration.md
Original file line number Diff line number Diff line change
Expand Up @@ -29,20 +29,20 @@ This is also the database used in
[Icinga DB Web](https://icinga.com/docs/icinga-db-web) to view and work with the data.
In high availability setups, all Icinga DB instances must write to the same database.

| Option | Description |
|----------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| type | **Optional.** Either `mysql` (default) or `pgsql`. |
| host | **Required.** Database host or absolute Unix socket path. |
| port | **Optional.** Database port. By default, the MySQL or PostgreSQL port, depending on the database type. |
| database | **Required.** Database name. |
| user | **Required.** Database username. |
| password | **Optional.** Database password. |
| tls | **Optional.** Whether to use TLS. |
| cert | **Optional.** Path to TLS client certificate. |
| key | **Optional.** Path to TLS private key. |
| ca | **Optional.** Path to TLS CA certificate. |
| insecure | **Optional.** Whether not to verify the peer. |
| options | **Optional.** List of low-level database options that can be set to influence some Icinga DB internal default behaviours. See [database options](#database-options) for details. |
| Option | Description |
|----------|------------------------------------------------------------------------------------------------------------------------------------------------|
| type | **Optional.** Either `mysql` (default) or `pgsql`. |
| host | **Required.** Database host or absolute Unix socket path. |
| port | **Optional.** Database port. By default, the MySQL or PostgreSQL port, depending on the database type. |
| database | **Required.** Database name. |
| user | **Required.** Database username. |
| password | **Optional.** Database password. |
| tls | **Optional.** Whether to use TLS. |
| cert | **Optional.** Path to TLS client certificate. |
| key | **Optional.** Path to TLS private key. |
| ca | **Optional.** Path to TLS CA certificate. |
| insecure | **Optional.** Whether not to verify the peer. |
| options | **Optional.** List of low-level [database options](#database-options) that can be set to influence some Icinga DB internal default behaviours. |

### Database Options

Expand All @@ -61,6 +61,7 @@ manual adjustments.
| max_connections_per_table | **Optional.** Maximum number of queries Icinga DB is allowed to execute on a single table concurrently. Defaults to `8`. |
| max_placeholders_per_statement | **Optional.** Maximum number of placeholders Icinga DB is allowed to use for a single SQL statement. Defaults to `8192`. |
| max_rows_per_transaction | **Optional.** Maximum number of rows Icinga DB is allowed to `SELECT`,`DELETE`,`UPDATE` or `INSERT` in a single transaction. Defaults to `8192`. |
| wsrep_sync_wait | **Optional.** Enforce [Galera cluster](#galera-cluster) nodes to perform strict cluster-wide causality checks. Defaults to `7`. |

## Logging Configuration

Expand Down Expand Up @@ -112,3 +113,21 @@ allowing to keep this information for longer with a smaller storage footprint.

A duration string is a sequence of decimal numbers and a unit suffix, such as `"20s"`.
Valid units are `"ms"`, `"s"`, `"m"` and `"h"`.

### Galera Cluster

Icinga DB expects a more consistent behaviour from its database than a
[Galera cluster](https://mariadb.com/kb/en/what-is-mariadb-galera-cluster/) provides by default. To accommodate this,
Icinga DB sets the [wsrep_sync_wait](https://mariadb.com/kb/en/galera-cluster-system-variables/#wsrep_sync_wait) system
variable for all its database connections. Consequently, strict cluster-wide causality checks are enforced before
executing specific SQL queries, which are determined by the value set in the `wsrep_sync_wait` system variable.
By default, Icinga DB sets this to `7`, which includes `READ, UPDATE, DELETE, INSERT, REPLACE` query types and is
usually sufficient. Unfortunately, this also has the downside that every single Icinga DB query will be blocked until
the cluster nodes resynchronise their states after each executed query, and may result in degraded performance.

However, this does not necessarily have to be the case if, for instance, Icinga DB is only allowed to connect to a
single cluster node at a time. This is the case when a load balancer does not randomly route connections to all the
nodes evenly, but always to the same node until it fails, or if your database cluster nodes have a virtual IP address
fail over assigned. In such situations, you can set the `wsrep_sync_wait` system variable to `0` in the
`/etc/icingadb/config.yml` file to disable it entirely, as Icinga DB doesn't have to wait for cluster
synchronisation then.
74 changes: 58 additions & 16 deletions pkg/config/database.go
Original file line number Diff line number Diff line change
@@ -1,25 +1,25 @@
package config

import (
"context"
"database/sql"
"database/sql/driver"
"fmt"
"github.com/go-sql-driver/mysql"
"github.com/icinga/icingadb/pkg/driver"
"github.com/icinga/icingadb/pkg/icingadb"
"github.com/icinga/icingadb/pkg/logging"
"github.com/icinga/icingadb/pkg/utils"
"github.com/jmoiron/sqlx"
"github.com/jmoiron/sqlx/reflectx"
"github.com/lib/pq"
"github.com/pkg/errors"
"net"
"net/url"
"strconv"
"strings"
"sync"
"time"
)

var registerDriverOnce sync.Once

// Database defines database client configuration.
type Database struct {
Type string `yaml:"type" default:"mysql"`
Expand All @@ -35,11 +35,7 @@ type Database struct {
// Open prepares the DSN string and driver configuration,
// calls sqlx.Open, but returns *icingadb.DB.
func (d *Database) Open(logger *logging.Logger) (*icingadb.DB, error) {
registerDriverOnce.Do(func() {
driver.Register(logger)
})

var dsn string
var db *sqlx.DB
switch d.Type {
case "mysql":
config := mysql.NewConfig()
Expand Down Expand Up @@ -75,7 +71,19 @@ func (d *Database) Open(logger *logging.Logger) (*icingadb.DB, error) {
}
}

dsn = config.FormatDSN()
_ = mysql.SetLogger(icingadb.MysqlFuncLogger(logger.Debug))

c, err := mysql.NewConnector(config)
if err != nil {
return nil, errors.Wrap(err, "can't open mysql database")
}

wsrepSyncWait := int64(d.Options.WsrepSyncWait)
setWsrepSyncWait := func(ctx context.Context, conn driver.Conn) error {
return setGaleraOpts(ctx, conn, wsrepSyncWait)
}

db = sqlx.NewDb(sql.OpenDB(icingadb.NewConnector(c, logger, setWsrepSyncWait)), icingadb.MySQL)
case "pgsql":
uri := &url.URL{
Scheme: "postgres",
Expand Down Expand Up @@ -123,16 +131,17 @@ func (d *Database) Open(logger *logging.Logger) (*icingadb.DB, error) {
}

uri.RawQuery = query.Encode()
dsn = uri.String()

connector, err := pq.NewConnector(uri.String())
if err != nil {
return nil, errors.Wrap(err, "can't open pgsql database")
}

db = sqlx.NewDb(sql.OpenDB(icingadb.NewConnector(connector, logger, nil)), icingadb.PostgreSQL)
default:
return nil, unknownDbType(d.Type)
}

db, err := sqlx.Open("icingadb-"+d.Type, dsn)
if err != nil {
return nil, errors.Wrap(err, "can't open database")
}

db.SetMaxIdleConns(d.Options.MaxConnections / 3)
db.SetMaxOpenConns(d.Options.MaxConnections)

Expand Down Expand Up @@ -173,3 +182,36 @@ func (d *Database) isUnixAddr() bool {
func unknownDbType(t string) error {
return errors.Errorf(`unknown database type %q, must be one of: "mysql", "pgsql"`, t)
}

// setGaleraOpts sets the "wsrep_sync_wait" variable for each session ensures that causality checks are performed
// before execution and that each statement is executed on a fully synchronized node. Doing so prevents foreign key
// violation when inserting into dependent tables on different MariaDB/MySQL nodes. When using MySQL single nodes,
// the "SET SESSION" command will fail with "Unknown system variable (1193)" and will therefore be silently dropped.
//
// https://mariadb.com/kb/en/galera-cluster-system-variables/#wsrep_sync_wait
func setGaleraOpts(ctx context.Context, conn driver.Conn, wsrepSyncWait int64) error {
const galeraOpts = "SET SESSION wsrep_sync_wait=?"

stmt, err := conn.(driver.ConnPrepareContext).PrepareContext(ctx, galeraOpts)
if err != nil {
if errors.Is(err, &mysql.MySQLError{Number: 1193}) { // Unknown system variable
return nil
}

return errors.Wrap(err, "cannot prepare "+galeraOpts)
}
// This is just for an unexpected exit and any returned error can safely be ignored and in case
// of the normal function exit, the stmt is closed manually, and its error is handled gracefully.
defer func() { _ = stmt.Close() }()

_, err = stmt.(driver.StmtExecContext).ExecContext(ctx, []driver.NamedValue{{Value: wsrepSyncWait}})
if err != nil {
return errors.Wrap(err, "cannot execute "+galeraOpts)
}

if err = stmt.Close(); err != nil {
return errors.Wrap(err, "cannot close prepared statement "+galeraOpts)
}

return nil
}
114 changes: 0 additions & 114 deletions pkg/driver/driver.go

This file was deleted.

22 changes: 0 additions & 22 deletions pkg/driver/pgsql.go

This file was deleted.

5 changes: 2 additions & 3 deletions pkg/icingadb/cleanup.go
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,6 @@ import (
"fmt"
"github.com/icinga/icingadb/internal"
"github.com/icinga/icingadb/pkg/com"
"github.com/icinga/icingadb/pkg/driver"
"github.com/icinga/icingadb/pkg/types"
"time"
)
Expand All @@ -20,10 +19,10 @@ type CleanupStmt struct {
// Build assembles the cleanup statement for the specified database driver with the given limit.
func (stmt *CleanupStmt) Build(driverName string, limit uint64) string {
switch driverName {
case driver.MySQL, "mysql":
case MySQL:
return fmt.Sprintf(`DELETE FROM %[1]s WHERE environment_id = :environment_id AND %[2]s < :time
ORDER BY %[2]s LIMIT %[3]d`, stmt.Table, stmt.Column, limit)
case driver.PostgreSQL, "postgres":
case PostgreSQL:
return fmt.Sprintf(`WITH rows AS (
SELECT %[1]s FROM %[2]s WHERE environment_id = :environment_id AND %[3]s < :time ORDER BY %[3]s LIMIT %[4]d
)
Expand Down