Cluster stress tests. #20

Open
gray380 opened this issue Jan 4, 2023 · 0 comments

Hi,

I'm testing PostgreSQL and two Keycloak instances under Docker Swarm, with Traefik as the load balancer, all running in the same Docker overlay network.

Keycloak 1:

    environment:
      - PROXY_ADDRESS_FORWARDING=true
      - KC_DB=postgres
      - KC_DB_URL_HOST=keycloak-postgres
      - KC_DB_URL_DATABASE=keycloak
      - KC_DB_SCHEMA=clustered_jdbc
      - KC_CACHE_CONFIG_FILE=cache-ispn-jdbc-ping.xml
      - JGROUPS_DISCOVERY_EXTERNAL_IP=keycloak-jdbc1
      - KC_LOG_LEVEL=INFO,org.infinispan:DEBUG,org.jgroups:DEBUG

Keycloak 2:

    environment:
      - PROXY_ADDRESS_FORWARDING=true
      - KC_DB=postgres
      - KC_DB_URL_HOST=keycloak-postgres
      - KC_DB_URL_DATABASE=keycloak
      - KC_DB_SCHEMA=clustered_jdbc
      - KC_CACHE_CONFIG_FILE=cache-ispn-jdbc-ping.xml
      - JGROUPS_DISCOVERY_EXTERNAL_IP=keycloak-jdbc2
      - KC_LOG_LEVEL=INFO,org.infinispan:DEBUG,org.jgroups:DEBUG

JGROUPSPING table

keycloak=# SELECT * FROM clustered_jdbc.JGROUPSPING;
               own_addr               | cluster_name |   bind_addr    |          updated           |                                                ping_data                                                 
--------------------------------------+--------------+----------------+----------------------------+----------------------------------------------------------------------------------------------------------
 b1a481f2-96a9-4cf7-b1d1-c14d0bf38b35 | ISPN         | keycloak-jdbc2 | 2023-01-03 19:05:10.402106 | \x02b1d1c14d0bf38b35b1a481f296a94cf7030100146b6579636c6f616b2d6a646263322d343337353910040a0014b11e78ffff
 c4845a5e-6526-4664-a811-0f90a76b99c1 | ISPN         | keycloak-jdbc2 | 2023-01-03 19:05:10.429296 | \x02a8110f90a76b99c1c4845a5e65264664010100146b6579636c6f616b2d6a646263312d323738393010040a0002ee1e78ffff
(2 rows)
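
To read the rows above more easily, the logical node name can be pulled out of `ping_data`, where it appears as plain ASCII. The following is a minimal Python sketch that assumes only that heuristic. It does not parse the real JGroups serialization, and the function name and the pattern `NAME_RUN` are my own choices for illustration:

```python
import re

# Heuristic (an assumption, NOT the JGroups wire format): a logical node name
# is a run of at least 8 characters drawn from letters, digits, '.', '_', '-'.
NAME_RUN = re.compile(rb"[A-Za-z0-9._-]{8,}")


def names_in_ping_data(hex_value):
    """Return name-like ASCII runs found in a bytea value as printed by psql."""
    if hex_value.startswith("\\x"):
        hex_value = hex_value[2:]  # drop the "\x" prefix psql puts before hex
    raw = bytes.fromhex(hex_value)
    return [match.decode("ascii") for match in NAME_RUN.findall(raw)]


# own_addr -> ping_data, copied from the two rows above
rows = {
    "b1a481f2-96a9-4cf7-b1d1-c14d0bf38b35": r"\x02b1d1c14d0bf38b35b1a481f296a94cf7030100146b6579636c6f616b2d6a646263322d343337353910040a0014b11e78ffff",
    "c4845a5e-6526-4664-a811-0f90a76b99c1": r"\x02a8110f90a76b99c1c4845a5e65264664010100146b6579636c6f616b2d6a646263312d323738393010040a0002ee1e78ffff",
}

for own_addr, ping_data in rows.items():
    print(own_addr, names_in_ping_data(ping_data))
```

For these two rows the sketch finds `keycloak-jdbc2-43759` and `keycloak-jdbc1-27890`. The second row's embedded name is `keycloak-jdbc1-27890` even though its `bind_addr` column reads `keycloak-jdbc2`.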

Now I'm running stress tests.
Everything works when one Keycloak instance leaves the cluster (Traefik sends requests to the surviving one):

docker service scale common_keycloak-jdbc2=0

Some logs:

Expected behavior:

DEBUG [org.jgroups.protocols.JDBC_PING] (Thread-15) Removed b1a481f2-96a9-4cf7-b1d1-c14d0bf38b35 for cluster ISPN from database
DEBUG [org.jgroups.protocols.JDBC_PING] (Thread-4) Removed b1a481f2-96a9-4cf7-b1d1-c14d0bf38b35 for cluster ISPN from database

Unexpected behavior:

DEBUG [org.jgroups.protocols.JDBC_PING] (jgroups-362,keycloak-jdbc1-27890) Removed c4845a5e-6526-4664-a811-0f90a76b99c1 for cluster ISPN from database
DEBUG [org.jgroups.protocols.JDBC_PING] (jgroups-362,keycloak-jdbc1-27890) Inserted c4845a5e-6526-4664-a811-0f90a76b99c1 for cluster ISPN into database
DEBUG [org.jgroups.protocols.JDBC_PING] (jgroups-362,keycloak-jdbc1-27890) Inserted c4845a5e-6526-4664-a811-0f90a76b99c1 for cluster ISPN into database
DEBUG [org.jgroups.protocols.JDBC_PING] (jgroups-362,keycloak-jdbc1-27890) Removed c4845a5e-6526-4664-a811-0f90a76b99c1 for cluster ISPN from database
DEBUG [org.jgroups.protocols.JDBC_PING] (jgroups-362,keycloak-jdbc1-27890) Removed c4845a5e-6526-4664-a811-0f90a76b99c1 for cluster ISPN from database
DEBUG [org.jgroups.protocols.JDBC_PING] (jgroups-362,keycloak-jdbc1-27890) Inserted c4845a5e-6526-4664-a811-0f90a76b99c1 for cluster ISPN into database

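To quantify the remove/insert churn in the log lines above, the `JDBC_PING` DEBUG messages can be tallied per address. This is a rough Python sketch. The regular expression is written against the message wording shown in this issue and is an assumption about the log format, not a documented contract:

```python
import re
from collections import Counter

# Matches the JDBC_PING DEBUG lines as they appear in this issue.
EVENT = re.compile(
    r"\[org\.jgroups\.protocols\.JDBC_PING\].*?"
    r"\b(Removed|Inserted) (\S+) for cluster (\S+)"
)


def tally_jdbc_ping(lines):
    """Count Removed/Inserted events per (cluster, own_addr, action)."""
    counts = Counter()
    for line in lines:
        match = EVENT.search(line)
        if match:
            action, own_addr, cluster = match.groups()
            counts[(cluster, own_addr, action)] += 1
    return counts


PREFIX = "DEBUG [org.jgroups.protocols.JDBC_PING] (jgroups-362,keycloak-jdbc1-27890) "
ADDR = "c4845a5e-6526-4664-a811-0f90a76b99c1"

# The six "unexpected behavior" lines above, in their original order.
sample = [
    PREFIX + f"Removed {ADDR} for cluster ISPN from database",
    PREFIX + f"Inserted {ADDR} for cluster ISPN into database",
    PREFIX + f"Inserted {ADDR} for cluster ISPN into database",
    PREFIX + f"Removed {ADDR} for cluster ISPN from database",
    PREFIX + f"Removed {ADDR} for cluster ISPN from database",
    PREFIX + f"Inserted {ADDR} for cluster ISPN into database",
]

for key, count in sorted(tally_jdbc_ping(sample).items()):
    print(key, count)
```

On the six lines above this counts three `Removed` and three `Inserted` events, all for the surviving node's own address.
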
JGROUPSPING table:

keycloak=# SELECT * FROM clustered_jdbc.JGROUPSPING;
               own_addr               | cluster_name |   bind_addr    |          updated          |                                                ping_data                                                 
--------------------------------------+--------------+----------------+---------------------------+----------------------------------------------------------------------------------------------------------
 c4845a5e-6526-4664-a811-0f90a76b99c1 | ISPN         | keycloak-jdbc1 | 2023-01-04 09:37:22.26674 | \x02a8110f90a76b99c1c4845a5e65264664030100146b6579636c6f616b2d6a646263312d323738393010040a0002ee1e78ffff
(1 row)

But when it comes back, it takes some time to re-form the cluster:

docker service scale common_keycloak-jdbc2=1

Some logs:

DEBUG [org.jgroups.protocols.TCP] (TQ-Bundler-7,keycloak-jdbc2-35303) JGRP000034: keycloak-jdbc2-35303: failure sending message to keycloak-jdbc1-27890: java.net.ConnectException: Connection refused (Connection refused)
DEBUG [org.jgroups.protocols.FD_SOCK] (FD_SOCK pinger-10,keycloak-jdbc2-35303) keycloak-jdbc2-35303: broadcasting suspect(keycloak-jdbc1-27890)
DEBUG [org.jgroups.protocols.FD_SOCK] (jgroups-380,keycloak-jdbc1-27890) keycloak-jdbc1-27890: suspecting [keycloak-jdbc2-35303]
DEBUG [org.jgroups.protocols.FD_SOCK] (jgroups-380,keycloak-jdbc1-27890) keycloak-jdbc1-27890: broadcasting unsuspect(keycloak-jdbc2-35303)
DEBUG [org.jgroups.protocols.FD_SOCK] (jgroups-21,keycloak-jdbc2-35303) keycloak-jdbc2-35303: suspecting [keycloak-jdbc1-27890]
DEBUG [org.jgroups.protocols.FD_SOCK] (jgroups-25,keycloak-jdbc2-35303) keycloak-jdbc2-35303: broadcasting unsuspect(keycloak-jdbc1-27890)
DEBUG [org.jgroups.protocols.FD_SOCK] (jgroups-382,keycloak-jdbc1-27890) keycloak-jdbc1-27890: broadcasting unsuspect(keycloak-jdbc2-35303)
...
(followed by a series of Removed/Inserted messages from both nodes for cluster ISPN)

JGROUPSPING table

keycloak=# SELECT * FROM clustered_jdbc.JGROUPSPING;
               own_addr               | cluster_name |   bind_addr    |          updated           |                                                ping_data                                                 
--------------------------------------+--------------+----------------+----------------------------+----------------------------------------------------------------------------------------------------------
 c4845a5e-6526-4664-a811-0f90a76b99c1 | ISPN         | keycloak-jdbc1 | 2023-01-04 09:48:34.328075 | \x02a8110f90a76b99c1c4845a5e65264664030100146b6579636c6f616b2d6a646263312d323738393010040a0002ee1e78ffff
 262c1029-ab8a-481c-9b51-8e78c1906ef1 | ISPN         | keycloak-jdbc1 | 2023-01-04 09:48:34.347452 | \x029b518e78c1906ef1262c1029ab8a481c010100146b6579636c6f616b2d6a646263322d333533303310040a0014b41e78ffff
(2 rows)

So while all of this connection-refused, suspect/unsuspect, and remove/insert activity is going on, the container is already up and running, and Traefik sends part of the requests to a Keycloak instance that is not ready yet.

And the worst part is that sometimes the cluster fails to re-form at all; in that case I can see two different bind_addr values for the same cluster_name in the JGROUPSPING table.
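
The symptom just described can be checked mechanically. Below is a minimal Python sketch that assumes the rows have already been fetched as `(own_addr, cluster_name, bind_addr)` tuples. The function name and the `node-a`/`node-b` rows are hypothetical and only illustrate the failed case; they are not captured data:

```python
from collections import defaultdict


def clusters_with_mixed_bind_addr(rows):
    """Map each cluster_name to its bind_addr values when there is more than one."""
    seen = defaultdict(set)
    for _own_addr, cluster_name, bind_addr in rows:
        seen[cluster_name].add(bind_addr)
    return {name: sorted(addrs) for name, addrs in seen.items() if len(addrs) > 1}


# Rows from the last table above: both share one bind_addr, so nothing is flagged.
reformed = [
    ("c4845a5e-6526-4664-a811-0f90a76b99c1", "ISPN", "keycloak-jdbc1"),
    ("262c1029-ab8a-481c-9b51-8e78c1906ef1", "ISPN", "keycloak-jdbc1"),
]

# HYPOTHETICAL rows, made up to illustrate the failed case described above.
failed = [
    ("node-a", "ISPN", "keycloak-jdbc1"),
    ("node-b", "ISPN", "keycloak-jdbc2"),
]

print(clusters_with_mixed_bind_addr(reformed))
print(clusters_with_mixed_bind_addr(failed))
```

Treating more than one bind_addr per cluster_name as a sign of a failed re-form is based only on what I observed in this setup, so take the check as a rough indicator.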

best regards,
Serhiy.
