
Bug Report: TabletExternallyReparented during vttablets rolling-update may cause a service interruption #12624

Closed
yoheimuta opened this issue Mar 14, 2023 · 0 comments · Fixed by #12893

yoheimuta commented Mar 14, 2023

Overview of the Issue

We've encountered a problem with TabletExternallyReparented: our vtgates continued to connect to the previous PRIMARY tablet even after reparenting.

The vtgate does not recognize the newly restarted tablet pod (which has a new IP address) until the TopologyWatcher next calls the loadTablets function to fetch the latest tablet records.
As a result, a second TabletExternallyReparented call that lands within tablet_refresh_interval (default: 1 minute) of the SPARE pod restart may cause a service interruption.

The discussion is at https://vitess.slack.com/archives/C0PQY0PTK/p1678243354071319.
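For context, a minimal sketch of the two knobs involved, using a placeholder vtctld address (the flag and command names exist in Vitess; the exact invocations are illustrative, not the operator's actual calls):

# vtgate's TopologyWatcher only re-reads tablet records from the topo server every
# --tablet_refresh_interval (default: 1 minute):
$ vtgate --tablet_refresh_interval 1m ...

# An external failover notifies Vitess of the new primary with, for example:
$ vtctlclient --server <vtctld-host:port> TabletExternallyReparented zone1-4073072872

# If that tablet's pod restarted with a new IP after the last refresh tick, vtgate
# keeps dialing the stale address until the TopologyWatcher calls loadTablets again.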

Reproduction Steps

  1. Set up a minikube cluster (an optional check follows the commands):
# Set the CNI plugin to kindnet so that a pod's IP address changes every time it restarts, similar to the behavior in a GKE environment.
# This matters because pods in the default single-node minikube cluster keep their IP addresses during vttablet rolling updates.
$ minikube start --cni=kindnet --kubernetes-version=v1.19.16 --cpus=4 --memory=8000 --disk-size=40g -p parallel-test

# Make sure that the vitess-operator's image is planetscale/vitess-operator:v2.8.1
$ minikube -p parallel-test kubectl -- apply -f operator.yaml
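An optional check, using the same kubectl wrapper, that the operator came up before moving on:

$ minikube -p parallel-test kubectl -- get pods
# Expect a vitess-operator-... pod in the Running state.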
  2. Deploy the following YAML (an example apply command follows the manifest):
  • Note: With the default tablet_refresh_interval, this issue is hard to hit in just a few attempts; increasing the interval (see the commented-out gateway extraFlags in the manifest) makes it much easier to reproduce.
apiVersion: planetscale.com/v2
kind: VitessCluster
metadata:
  name: example
spec:
  images:
    vtctld: vitess/lite:v15.0.1
    vtadmin: vitess/vtadmin:latest
    vtgate: vitess/lite:v15.0.1
    vttablet: vitess/lite:v15.0.1
    vtbackup: vitess/lite:v15.0.1
    mysqld:
      mysql56Compatible: vitess/lite:v15.0.1
    mysqldExporter: prom/mysqld-exporter:v0.11.0
  cells:
  - name: zone1
    gateway:
      authentication:
        static:
          secret:
            name: example-cluster-config
            key: users.json
      # extraFlags:
      #   tablet_refresh_interval: "5m"
      replicas: 1
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
        limits:
          memory: 256Mi
  vitessDashboard:
    cells:
    - zone1
    extraFlags:
      security_policy: read-only
    replicas: 1
    resources:
      limits:
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 128Mi
  vtadmin:
    rbac:
      name: example-cluster-config
      key: rbac.yaml
    cells:
      - zone1
    apiAddresses:
      - http://localhost:14001
    replicas: 1
    readOnly: false
    apiResources:
      limits:
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 128Mi
    webResources:
      limits:
        memory: 128Mi
      requests:
        cpu: 100m
        memory: 128Mi

  keyspaces:
  - name: main
    durabilityPolicy: none
    turndownPolicy: Immediate
    partitionings:
    - equal:
        parts: 1
        shardTemplate:
          databaseInitScriptSecret:
            name: example-cluster-config
            key: init_db.sql
          replication:
            enforceSemiSync: false
          tabletPools:
          - cell: zone1
            type: externalmaster
            replicas: 2
            vttablet:
              extraFlags:
                db_charset: utf8mb4_unicode_ci
                log_queries_to_file: vt/vtdataroot/queries.log
                queryserver-config-pool-size: "10"
              resources:
                limits:
                  memory: 256Mi
                requests:
                  cpu: 100m
                  memory: 256Mi
            externalDatastore:
              user: root
              host: mysql
              port: 3306
              database: main
              credentialsSecret:
                name: example-cluster-config
                key: ext_db_credentials_secret.json
  updateStrategy:
    type: Immediate
---
apiVersion: v1
kind: Secret
metadata:
  name: example-cluster-config
type: Opaque
stringData:
  users.json: |
    {
      "user": [{
        "UserData": "user",
        "Password": ""
      }]
    }
  init_db.sql: |
    # This file is executed immediately after mysql_install_db,
    # to initialize a fresh data directory.

    ###############################################################################
    # Equivalent of mysql_secure_installation
    ###############################################################################

    # Changes during the init db should not make it to the binlog.
    # They could potentially create errant transactions on replicas.
    SET sql_log_bin = 0;
    # Remove anonymous users.
    DELETE FROM mysql.user WHERE User = '';

    # Disable remote root access (only allow UNIX socket).
    DELETE FROM mysql.user WHERE User = 'root' AND Host != 'localhost';

    # Remove test database.
    DROP DATABASE IF EXISTS test;

    ###############################################################################
    # Vitess defaults
    ###############################################################################

    # Vitess-internal database.
    CREATE DATABASE IF NOT EXISTS _vt;
    # Note that definitions of local_metadata and shard_metadata should be the same
    # as in production which is defined in go/vt/mysqlctl/metadata_tables.go.
    CREATE TABLE IF NOT EXISTS _vt.local_metadata (
      name VARCHAR(255) NOT NULL,
      value VARCHAR(255) NOT NULL,
      db_name VARBINARY(255) NOT NULL,
      PRIMARY KEY (db_name, name)
      ) ENGINE=InnoDB;
    CREATE TABLE IF NOT EXISTS _vt.shard_metadata (
      name VARCHAR(255) NOT NULL,
      value MEDIUMBLOB NOT NULL,
      db_name VARBINARY(255) NOT NULL,
      PRIMARY KEY (db_name, name)
      ) ENGINE=InnoDB;

    # Admin user with all privileges.
    CREATE USER 'vt_dba'@'localhost';
    GRANT ALL ON *.* TO 'vt_dba'@'localhost';
    GRANT GRANT OPTION ON *.* TO 'vt_dba'@'localhost';

    # User for app traffic, with global read-write access.
    CREATE USER 'vt_app'@'localhost';
    GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, PROCESS, FILE,
      REFERENCES, INDEX, ALTER, SHOW DATABASES, CREATE TEMPORARY TABLES,
      LOCK TABLES, EXECUTE, REPLICATION CLIENT, CREATE VIEW,
      SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER
      ON *.* TO 'vt_app'@'localhost';

    # User for app debug traffic, with global read access.
    CREATE USER 'vt_appdebug'@'localhost';
    GRANT SELECT, SHOW DATABASES, PROCESS ON *.* TO 'vt_appdebug'@'localhost';

    # User for administrative operations that need to be executed as non-SUPER.
    # Same permissions as vt_app here.
    CREATE USER 'vt_allprivs'@'localhost';
    GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, PROCESS, FILE,
      REFERENCES, INDEX, ALTER, SHOW DATABASES, CREATE TEMPORARY TABLES,
      LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW,
      SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER
      ON *.* TO 'vt_allprivs'@'localhost';

    # User for slave replication connections.
    # TODO: Should we set a password on this since it allows remote connections?
    CREATE USER 'vt_repl'@'%';
    GRANT REPLICATION SLAVE ON *.* TO 'vt_repl'@'%';

    # User for Vitess filtered replication (binlog player).
    # Same permissions as vt_app.
    CREATE USER 'vt_filtered'@'localhost';
    GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, PROCESS, FILE,
      REFERENCES, INDEX, ALTER, SHOW DATABASES, CREATE TEMPORARY TABLES,
      LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW,
      SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER
      ON *.* TO 'vt_filtered'@'localhost';

    # User for Orchestrator (https://github.com/openark/orchestrator).
    # TODO: Reenable when the password is randomly generated.
    CREATE USER 'orc_client_user'@'%' IDENTIFIED BY 'orc_client_user_password';
    GRANT SUPER, PROCESS, REPLICATION SLAVE, RELOAD
      ON *.* TO 'orc_client_user'@'%';
    GRANT SELECT
      ON _vt.* TO 'orc_client_user'@'%';

    FLUSH PRIVILEGES;

    RESET SLAVE ALL;
    RESET MASTER;
  rbac.yaml: |
    rules:
    - resource: "*"
      actions:
        - "get"
        - "create"
        - "put"
        - "ping"
      subjects: ["*"]
      clusters: ["*"]
    - resource: "Shard"
      actions:
        - "emergency_reparent_shard"
        - "planned_reparent_shard"
      subjects: ["*"]
      clusters:
        - "local"
  orc_config.json: |
    {
      "Debug": true,
      "MySQLTopologyUser": "orc_client_user",
      "MySQLTopologyPassword": "orc_client_user_password",
      "MySQLReplicaUser": "vt_repl",
      "MySQLReplicaPassword": "",
      "RecoveryPeriodBlockSeconds": 5
    }
  ext_db_credentials_secret.json: |
    {
      "root": ["password"]
    }
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: mysql-pv-claim
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 2Gi
---
apiVersion: v1
kind: Service
metadata:
  name: mysql
  labels:
    app: mysql
spec:
  ports:
    - port: 3306
  selector:
    app: mysql
  clusterIP: None
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mysql
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mysql
  template:
    metadata:
      labels:
        app: mysql
    spec:
      containers:
        - name: mysql
          image: mysql:5.7
          env:
            - name: MYSQL_ROOT_PASSWORD
              value: password
          ports:
            - containerPort: 3306
          volumeMounts:
            - name: mysql-persistent-storage
              mountPath: /var/lib/mysql
      volumes:
        - name: mysql-persistent-storage
          persistentVolumeClaim:
            claimName: mysql-pv-claim
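Assuming the manifest above is saved as parallel_cluster.yaml (any file name works), apply it with:

$ minikube -p parallel-test kubectl -- apply -f parallel_cluster.yaml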
  3. Create a table (one way to apply the DDL is shown after the statement):
create table customer(
  customer_id bigint not null auto_increment,
  email varbinary(128),
  primary key(customer_id)
) ENGINE=InnoDB;
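One way to run this DDL, assuming it is saved as create_customer.sql and the vtgate port-forward from the final step is already running (vtctlclient ApplySchema against the main keyspace is another option):

# "main" is the keyspace defined in the manifest above.
$ mysql -h 127.0.0.1 -P 15306 -u user main < create_customer.sql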
  4. Apply the following diff to trigger a rolling update:
$ minikube -p parallel-test kubectl -- diff -f parallel_cluster_vttablet-update.yaml; date;
...
@@ -225,7 +225,7 @@
               extraFlags:
                 db_charset: utf8mb4_unicode_ci
                 log_queries_to_file: vt/vtdataroot/queries.log
                 # My Comment: This change itself doesn't mean anything. Just any change to trigger the rolling update.
-                queryserver-config-pool-size: "10"
+                queryserver-config-pool-size: "15"
               resources:
                 limits:
                   memory: 256Mi
Mon Mar 13 15:58:07 JST 2023
$ minikube -p parallel-test kubectl -- apply -f parallel_cluster_vttablet-update.yaml; date;
vitesscluster.planetscale.com/example configured
secret/example-cluster-config configured
persistentvolumeclaim/mysql-pv-claim unchanged
service/mysql unchanged
deployment.apps/mysql unchanged
Mon Mar 13 15:58:08 JST 2023
  5. View the errors:
$ minikube -p parallel-test kubectl -- port-forward --address localhost service/example-zone1-vtgate-bc6cde92 15306:3306
# Issue the query in whichever way you prefer
# ex. mysql -h 127.0.0.1 -P 15306 -u user --table --execute="select * from customer;"

2023-03-13 15:59:07.25799 +0900
#<Mysql2::Error::TimeoutError: [mysql_127.0.0.1:15306] Timeout waiting for a response from the last query. (waited 3 seconds)>

2023-03-13 15:59:11.282234 +0900
#<Mysql2::Error::TimeoutError: [mysql_127.0.0.1:15306] Timeout waiting for a response from the last query. (waited 3 seconds)>

2023-03-13 15:59:15.304278 +0900
#<Mysql2::Error: target: main.-.primary: vttablet: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.244.0.37:15999: connect: connection refused">

2023-03-13 15:59:18.287796 +0900
#<Mysql2::Error: target: main.-.primary: vttablet: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 10.244.0.37:15999: connect: connection refused">

2023-03-13 15:59:19.316975 +0900
#<Mysql2::Error: target: main.-.primary: vttablet: Connection Closed>
...
2023-03-13 15:59:35.7701 +0900
#<Mysql2::Error: target: main.-.primary: vttablet: Connection Closed>
$ watch -n 1 'mysql -h 127.0.0.1 -P 15306 -u user --table --execute="show vitess_tablets" | tee -a /tmp/vitess_tablets.txt; echo `date` | tee -a /tmp/vitess_tablets.txt'
...
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
| Cell  | Keyspace | Shard | TabletType | State       | Alias            | Hostname    | PrimaryTermStartTime |
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
| zone1 | main     | -     | PRIMARY    | SERVING     | zone1-4073072872 | 10.244.0.37 | 2023-03-13T06:58:09Z |
| zone1 | main     | -     | SPARE      | NOT_SERVING | zone1-1951951717 | 10.244.0.36 |                      |
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
Mon Mar 13 15:59:07 JST 2023

# Start the errors
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
| Cell  | Keyspace | Shard | TabletType | State       | Alias            | Hostname    | PrimaryTermStartTime |
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
| zone1 | main     | -     | SPARE      | NOT_SERVING | zone1-1951951717 | 10.244.0.36 |                      |
| zone1 | main     | -     | SPARE      | NOT_SERVING | zone1-4073072872 | 10.244.0.37 |                      |
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
Mon Mar 13 15:59:17 JST 2023
...
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
| Cell  | Keyspace | Shard | TabletType | State       | Alias            | Hostname    | PrimaryTermStartTime |
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
| zone1 | main     | -     | SPARE      | NOT_SERVING | zone1-1951951717 | 10.244.0.36 |                      |
| zone1 | main     | -     | SPARE      | NOT_SERVING | zone1-4073072872 | 10.244.0.37 |                      |
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
Mon Mar 13 15:59:35 JST 2023

# Finish the errors
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
| Cell  | Keyspace | Shard | TabletType | State       | Alias            | Hostname    | PrimaryTermStartTime |
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
| zone1 | main     | -     | PRIMARY    | SERVING     | zone1-1951951717 | 10.244.0.38 | 2023-03-13T06:59:07Z |
| zone1 | main     | -     | SPARE      | NOT_SERVING | zone1-1951951717 | 10.244.0.36 |                      |
| zone1 | main     | -     | SPARE      | NOT_SERVING | zone1-4073072872 | 10.244.0.39 |                      |
+-------+----------+-------+------------+-------------+------------------+-------------+----------------------+
Mon Mar 13 15:59:36 JST 2023

Binary Version

This issue can be reproduced on both v15.0.1 and v16.0.0.

$ mysql -h 127.0.0.1 -P 15306 -u user --table 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.7.9-vitess-15.0.1 Version: 15.0.1 (Git revision 13ee9c817638d59bebd6bc598f9d673a893c41cd branch 'heads/v15.0.1') built on Tue Nov 29 21:08:39 UTC 2022 by vitess@buildkitsandbox using go1.18.7 linux/amd64

OR

$ mysql -h 127.0.0.1 -P 15306 -u user --table 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 244
Server version: 8.0.30-Vitess Version: 16.0.0 (Git revision bb768df0008fc09f7e6868a4fa571c32cc1cb526 branch 'heads/v16.0.0') built on Tue Feb 28 15:38:00 UTC 2023 by vitess@buildkitsandbox using go1.20.1 linux/amd64

Operating System and Environment details

This issue can be reproduced on both v15.0.1 (with planetscale/vitess-operator:v2.8.1) and v16.0.0 (with planetscale/vitess-operator:v2.9.0).

$ kubectl describe pods/example-zone1-vtgate-bc6cde92-5f479548d-7rzb4 | grep Image
    Image:         vitess/lite:v15.0.1
    Image ID:      docker-pullable://vitess/lite@sha256:9cf89b7948fa288ef43239ff57a74174bd0d981e760dd44316352daea8692dd4

$ kubectl describe pods/vitess-operator-557f797f86-sj9z8 | grep Image
    Image:         planetscale/vitess-operator:v2.8.1
    Image ID:      docker-pullable://planetscale/vitess-operator@sha256:5937ea75815332b1d8545bb49ee9c860f9006b8b1df358aa9be2b9fa3704d335

OR

$ kubectl describe pods/example-zone1-vtgate-bc6cde92-c76674cf8-phz7w | grep Image
    Image:         vitess/lite:v16.0.0
    Image ID:      docker-pullable://vitess/lite@sha256:ac0254e2f1c741536e3fb039a0c2ccb32db2ba82b1313c2c43503add3a7ada1d

$ kubectl describe pods/vitess-operator-84dbfdcd46-vm895 | grep Image
    Image:         planetscale/vitess-operator:v2.9.0
    Image ID:      docker-pullable://planetscale/vitess-operator@sha256:c8002219fe961f2d95e8c56c1af185b14a29b87847c86cfbdb8884dae73c7eaa

Log Fragments

Here is a diagram that illustrates the sequence of events and their flow described above.

[Attached diagram: BugReport_vttablet]

@yoheimuta yoheimuta added Needs Triage This issue needs to be correctly labelled and triaged Type: Bug labels Mar 14, 2023
@vmg vmg added Component: Cluster management Component: Query Serving and removed Needs Triage This issue needs to be correctly labelled and triaged labels Mar 14, 2023