Implement a demo application showing client-side HA with reconnections #568

Merged
merged 21 commits into from
Feb 12, 2021

Conversation

DimCitus
Collaborator

@DimCitus DimCitus commented Jan 7, 2021

Introduces a demo application to pg_auto_failover. The demo application loops over a simple INSERT query that records the time taken to open a connection and the number of retries that were necessary to do so.

pg_autoctl do demo
  run      Run the pg_auto_failover demo application
  uri      Grab the application connection string from the monitor
  ping     Attempt to connect to the application URI
  summary  Display a summary of the previous demo app run

The intent is to be able to run this application from within the docker-compose environment in order to show the proper behaviour of retrying connections when the connection is lost, all without having to edit the connection string upon failover.
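
As an illustration of the retry behaviour described above, here is a minimal sketch in Python. It is not the code from this PR (pg_autoctl is written in C on top of libpq). The function name `connect_with_retry`, its parameters, and the exponential backoff policy are all assumptions made for the example. The `connect` callable stands in for whatever opens a connection to the multi-host connection string.

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    max_wait_s: float = 30.0,
    base_sleep_s: float = 0.1,
    max_sleep_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[T, int, float]:
    """Call connect() until it succeeds.

    Returns (connection, attempts, elapsed_ms). The backoff policy here
    is an assumption for the sketch, not the policy used by pg_autoctl.
    """
    start = clock()
    attempts = 0
    delay = base_sleep_s
    while True:
        attempts += 1
        try:
            conn = connect()
        except ConnectionError:
            # Give up once the overall time budget is spent.
            if clock() - start >= max_wait_s:
                raise
            sleep(delay)
            # Double the delay after each failure, up to a ceiling.
            delay = min(delay * 2, max_sleep_s)
            continue
        elapsed_ms = (clock() - start) * 1000.0
        return conn, attempts, elapsed_ms
```

The caller never changes the connection string: it keeps retrying the same one until a read-write node answers, then records `attempts` and `elapsed_ms`, which correspond to the retries and connection time the demo stores for each loop.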

Here is a sample output from running the demo manually in a make cluster environment:

$ PG_AUTOCTL_DEBUG=1 pg_autoctl do demo run --monitor 'postgres://autoctl_node@localhost:5500/pg_auto_failover?sslmode=require' --clients 4 --duration 25
00:56:37 35288 INFO  Using application connection string "postgres://localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer"
00:56:37 35288 INFO  Using Postgres user PGUSER "dim"                                                                                                                         
00:56:37 35288 INFO  Preparing demo schema: drop schema if exists demo cascade                                                                                                
00:56:37 35288 WARN  NOTICE:  schema "demo" does not exist, skipping                                                                                                          
00:56:37 35288 INFO  Preparing demo schema: create schema demo                                                                                                                
00:56:37 35288 INFO  Preparing demo schema: create table demo.tracking(ts timestamptz default now(), client integer, loop integer, retries integer, us bigint, recovery bool)
00:56:37 35288 INFO  Starting 4 concurrent clients as sub-processes                                                                                                           
00:56:37 35296 INFO  Failover client is started, will failover in 10s and every 20s after that
00:56:37 35297 INFO  Client 1 connected in 108.279 ms in loop 0
00:56:37 35298 INFO  Client 2 connected in 110.023 ms in loop 0
00:56:37 35300 INFO  Client 4 connected in 109.739 ms in loop 0  
00:56:37 35299 INFO  Client 3 connected in 109.909 ms in loop 0      
00:56:47 35296 INFO  pg_autoctl perform failover          
00:56:47 35296 INFO  Listening monitor notifications about state changes in formation "default" and group 0                                                                   
00:56:47 35296 INFO  Following table displays times when notifications are received
    Time |  Name |  Node |      Host:Port |       Current State |      Assigned State
---------+-------+-------+----------------+---------------------+--------------------
00:56:47 | node1 |   0/1 | localhost:5501 |             primary |            draining                                                                                         
00:56:47 | node2 |   0/2 | localhost:5502 |           secondary |   prepare_promotion
00:56:47 | node2 |   0/2 | localhost:5502 |   prepare_promotion |   prepare_promotion 
00:56:47 | node2 |   0/2 | localhost:5502 |   prepare_promotion |    stop_replication
00:56:47 | node1 |   0/1 | localhost:5501 |             primary |      demote_timeout
00:56:47 35297 WARN  Failed to connect to "postgres://localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer", retrying until the server is ready  
00:56:47 35300 WARN  Failed to connect to "postgres://localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer", retrying until the server is ready  
00:56:47 35299 WARN  Failed to connect to "postgres://localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer", retrying until the server is ready  
00:56:47 35298 WARN  Failed to connect to "postgres://localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer", retrying until the server is ready
00:56:47 | node1 |   0/1 | localhost:5501 |            draining |      demote_timeout                                                                                         
00:56:47 | node1 |   0/1 | localhost:5501 |      demote_timeout |      demote_timeout  
00:56:48 | node2 |   0/2 | localhost:5502 |    stop_replication |    stop_replication
00:56:48 | node2 |   0/2 | localhost:5502 |    stop_replication |        wait_primary  
00:56:48 | node1 |   0/1 | localhost:5501 |      demote_timeout |             demoted  
00:56:48 | node2 |   0/2 | localhost:5502 |        wait_primary |        wait_primary  
00:56:48 | node1 |   0/1 | localhost:5501 |             demoted |             demoted  
00:56:48 | node1 |   0/1 | localhost:5501 |             demoted |          catchingup
00:56:48 35297 INFO  Successfully connected to "postgres://localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer" after 6 attempts in 1492 ms.    
00:56:48 35298 INFO  Successfully connected to "postgres://localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer" after 6 attempts in 1493 ms.    
00:56:48 35297 INFO  Client 1 connected after 6 attempts in 1493.217ms
00:56:48 35298 INFO  Client 2 connected after 6 attempts in 1494.184ms                 
00:56:48 35300 INFO  Successfully connected to "postgres://localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer" after 6 attempts in 1494 ms.    
00:56:48 35300 INFO  Client 4 connected after 6 attempts in 1495.348ms                 
00:56:48 35299 INFO  Successfully connected to "postgres://localhost:5502,localhost:5501/demo?target_session_attrs=read-write&sslmode=prefer" after 6 attempts in 1497 ms.    
00:56:48 35299 INFO  Client 3 connected after 6 attempts in 1497.881ms                                                                                                        
00:56:50 | node1 |   0/1 | localhost:5501 |          catchingup |          catchingup                                                                                         
00:56:51 | node1 |   0/1 | localhost:5501 |          catchingup |           secondary  
00:56:51 | node1 |   0/1 | localhost:5501 |           secondary |           secondary                                                                                         
00:56:53 | node2 |   0/2 | localhost:5502 |        wait_primary |             primary  
00:56:53 | node2 |   0/2 | localhost:5502 |             primary |             primary
00:56:57 35299 INFO  Client 3 connected in 63.463 ms in loop 106                       
00:56:57 35300 INFO  Client 4 connected in 65.919 ms in loop 106                       
00:56:57 35297 INFO  Client 1 connected in 63.221 ms in loop 107                       
00:56:57 35298 INFO  Client 2 connected in 62.443 ms in loop 107
00:57:03 35288 INFO  Client 0 (pid 35296) is done now.              
00:57:03 35288 INFO  Client 4 (pid 35300) is done now.
00:57:03 35288 INFO  Client 3 (pid 35299) is done now.     
00:57:03 35288 INFO  Client 2 (pid 35298) is done now.                                 
00:57:03 35288 INFO  Client 1 (pid 35297) is done now.                                                                                                                        
00:57:03 35288 INFO  Summary for the demo app running with 4 clients for 25s                                                                                                  
        Client        | Connections | Retries | Min Connect Time (ms) |   max    |   p95   |   p99                                                     
----------------------+-------------+---------+-----------------------+----------+---------+---------                                                                         
 Client 1             |         140 |       6 |                51.278 | 1493.217 | 122.330 | 129.209                                                                          
 Client 2             |         140 |       6 |                51.560 | 1494.183 | 122.228 | 126.851                                                                          
 Client 3             |         140 |       6 |                53.212 | 1497.881 | 122.561 | 128.061                                                                          
 Client 4             |         140 |       6 |                51.817 | 1495.348 | 120.904 | 127.104                                                                         
 All Clients Combined |         560 |      24 |                51.278 | 1497.881 | 122.361 | 128.279                                                                          
(5 rows)                                                                                                                                                                      
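
For readers who want to reproduce the columns of this summary from their own measurements, here is a small Python sketch. The PR does not say how p95 and p99 are computed, so the nearest-rank method used here is an assumption, and the names `percentile` and `summarize` are hypothetical.

```python
from typing import Dict, List


def percentile(sorted_values: List[float], p: int) -> float:
    """Nearest-rank percentile of an ascending list, p in 1..100."""
    if not sorted_values:
        raise ValueError("no values")
    n = len(sorted_values)
    # Integer ceiling of p * n / 100, avoiding floating-point rounding.
    rank = max(1, (p * n + 99) // 100)
    return sorted_values[rank - 1]


def summarize(connect_ms: List[float], retries: List[int]) -> Dict[str, float]:
    """Aggregate one client's connection times (ms) and retry counts."""
    ordered = sorted(connect_ms)
    return {
        "connections": len(ordered),
        "retries": sum(retries),
        "min": ordered[0],
        "max": ordered[-1],
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }
```

In the table above, each client's max is the single slow reconnection during the failover, while p95 and p99 stay close to the normal connection time.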

@DimCitus DimCitus added the Size:M Effort Estimate: Medium label Jan 7, 2021
@DimCitus DimCitus added this to the Sprint 2020 W52 2021 W1 milestone Jan 7, 2021
@DimCitus DimCitus self-assigned this Jan 7, 2021
@DimCitus DimCitus changed the title Feature/demo app Implement a demo application showing client-side HA with reconnections Jan 7, 2021
@DimCitus
Collaborator Author

Now with a histogram of the connection time distribution:

 Min Connect Time (ms) |   max    | freq |                                                                bar                                                                
-----------------------+----------+------+-----------------------------------------------------------------------------------------------------------------------------------
                78.916 |  195.621 |  746 | ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
               195.699 |  312.459 | 1085 | ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
               312.708 |  428.993 |  305 | ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
               429.711 |  476.829 |  153 | ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
               653.009 |  658.865 |    2 | 
              1703.596 | 1703.596 |    1 | 
              1751.626 | 1786.613 |    4 | 
              1853.557 | 1942.100 |   10 | ▒
              1954.034 | 2058.231 |    7 | ▒
              2076.156 | 2106.264 |    6 | ▒
              2181.002 | 2181.002 |    1 | 
(11 rows)
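
A histogram of this shape can be sketched in Python as follows. This is an assumption about the approach (equal-width buckets over the observed range, empty buckets skipped, bars scaled to the largest bucket), not the query or code used by the demo, and the name `histogram` and its parameters are hypothetical.

```python
from typing import List, Tuple


def histogram(
    values: List[float], buckets: int = 10, bar_width: int = 40
) -> List[Tuple[float, float, int, str]]:
    """Return (min, max, freq, bar) rows for equal-width buckets."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every value is identical.
    width = (hi - lo) / buckets or 1.0
    groups: List[List[float]] = [[] for _ in range(buckets)]
    for v in values:
        # The maximum value would land one past the end, so clamp it.
        idx = min(int((v - lo) / width), buckets - 1)
        groups[idx].append(v)
    peak = max(len(g) for g in groups)
    rows = []
    for g in groups:
        if not g:
            continue  # skip empty buckets, as in the output above
        bar = "▒" * (len(g) * bar_width // peak)
        rows.append((min(g), max(g), len(g), bar))
    return rows
```

Each row reports the smallest and largest value actually observed in the bucket, which is why a bucket holding a single measurement shows the same number in both columns.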

Base automatically changed from feature/open-hba-for-lan to master January 15, 2021 15:47
With this option pg_autoctl edits the pg_hba.conf for Postgres to grant
connection privileges on the detected LAN for the --dbname database and for
the --username user.

The LAN detection is done the same way as with the monitor.
This avoids some retry attempts from the other nodes at first startup.
Rather than adding a hostname that we know to be faulty to the HBA file, we add
one of the IP addresses of the hostname instead. We might want to revisit
this (add all IP addresses maybe? or find the one we want to add by
connecting, like we do for the monitor?), but it allows the docker-compose
demo to just work with a minimum of trouble.
The goal is to show what happens client-side (in the application) at
failover time: handling disconnections properly and reconnecting with the same
database connection string to reach the new primary.
The demo application runs concurrent clients that each connect to Postgres
using a multi-host connection string, and INSERT some statistics about the
time taken to connect and how many connection attempts were made.
Let's better focus on the story we want to share through this demo app.
@DimCitus DimCitus merged commit b965460 into master Feb 12, 2021
@DimCitus DimCitus deleted the feature/demo-app branch February 12, 2021 16:31