Multiple instances can be run if the pid file is specified as a relative path #426

Closed
fluca1978 opened this issue Mar 25, 2024 · 6 comments
Labels
bug Something isn't working

Comments

@fluca1978
Collaborator

On 8a1d641, setting pid_file = relative_file.pid allows pgagroal to be started multiple times from different directories.

Example: running the first instance:

Running the second instance from a different directory:

[luca@rachel]~% pgagroal
DEBUG network.c:648 server: bind: 127.0.0.1:54322 (Address already in use)
DEBUG network.c:648 server: bind: 10.0.2.15:54322 (Address already in use)
DEBUG network.c:648 server: bind: 192.168.222.50:54322 (Address already in use)
DEBUG network.c:648 server: bind: ::1:54322 (Address already in use)
DEBUG network.c:648 server: bind: fe80::a00:27ff:fee4:438d:54322 (Invalid argument)

I think we should either require an absolute pid_file, aborting execution if it is not absolute, or abort execution if the bind fails. In any case, the fact that a bind failure allows execution to continue is suspicious.
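
The first option (rejecting a relative pid_file) could be sketched as below. This is only an illustration: `pid_file_is_absolute` is a hypothetical helper name, not an existing pgagroal function, and it assumes POSIX-style paths where an absolute path starts with `/`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of pgagroal): returns true only when
 * the configured PID file path is absolute, i.e. starts with '/'.
 * NULL and empty strings are treated as not absolute. */
bool
pid_file_is_absolute(const char* pid_file)
{
   return pid_file != NULL && pid_file[0] == '/';
}
```

A configuration validation step could call this and refuse to start when it returns false, so that two instances started from different directories cannot end up with two different PID files.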

@fluca1978 fluca1978 added the bug Something isn't working label Mar 25, 2024
@jesperpedersen
Collaborator

We should def error out if the configuration is the same.

However, it should be possible to run multiple instances on the same server, like one for the primary instance and another for the standby.

@fluca1978
Collaborator Author

> We should def error out if the configuration is the same.

The only "quick" way to find out whether the configuration is (almost) the same is a bind failure, or the management socket already being in use (and possibly the metrics one as well). If any of these is already in use, we should abort.

> However, it should be possible to run multiple instances on the same server, like one for the primary instance and another for the standby.

Good point. However, while a user mistakenly running multiple instances with the same configuration is immediately detected when pid_file is absolute, it becomes harder to detect when the PID file is relative (until we fix the socket problems above).
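
The "is it already in use?" check described above can be illustrated with plain POSIX sockets. This is a minimal sketch, not pgagroal code: `open_listener` and `address_in_use` are hypothetical helper names, only IPv4 TCP is handled, and the probe simply relies on `bind` failing with `EADDRINUSE`.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: open a listening TCP socket on ip:*port.
 * If *port is 0 the kernel picks a free port, which is written back.
 * Returns the socket descriptor, or -1 on failure. */
int
open_listener(const char* ip, int* port)
{
   struct sockaddr_in addr;
   socklen_t len = sizeof(addr);
   int fd = socket(AF_INET, SOCK_STREAM, 0);

   if (fd < 0)
   {
      return -1;
   }

   memset(&addr, 0, sizeof(addr));
   addr.sin_family = AF_INET;
   addr.sin_port = htons((uint16_t)*port);

   if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1 ||
       bind(fd, (struct sockaddr*)&addr, sizeof(addr)) != 0 ||
       listen(fd, 16) != 0 ||
       getsockname(fd, (struct sockaddr*)&addr, &len) != 0)
   {
      close(fd);
      return -1;
   }

   *port = ntohs(addr.sin_port);
   return fd;
}

/* Hypothetical probe: returns 1 if ip:port is already bound,
 * 0 if the bind succeeded (address free), -1 on any other error.
 * The probe socket is always closed before returning. */
int
address_in_use(const char* ip, int port)
{
   struct sockaddr_in addr;
   int rc;
   int saved_errno;
   int fd = socket(AF_INET, SOCK_STREAM, 0);

   if (fd < 0)
   {
      return -1;
   }

   memset(&addr, 0, sizeof(addr));
   addr.sin_family = AF_INET;
   addr.sin_port = htons((uint16_t)port);

   if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
   {
      close(fd);
      return -1;
   }

   rc = bind(fd, (struct sockaddr*)&addr, sizeof(addr));
   saved_errno = errno;
   close(fd);

   if (rc == 0)
   {
      return 0;
   }
   return saved_errno == EADDRINUSE ? 1 : -1;
}
```

A startup check along these lines would let a second instance abort as soon as any of its configured addresses is taken, independently of where the PID file lives.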

@decarv
Contributor

decarv commented Mar 31, 2024

@fluca1978 I couldn't reproduce this, but I may have misunderstood the bug.

What I am trying to do is set the unix_socket_dir to a relative path in config and then run two pgagroal instances from different directories. What I get is a bind error.

If you could give me more details I could work something out.

@fluca1978
Collaborator Author

> @fluca1978 I couldn't reproduce this, but I may have misunderstood the bug.
>
> What I am trying to do is set the unix_socket_dir to a relative path in config and then run two pgagroal instances from different directories. What I get is a bind error.
>
> If you could give me more details I could work something out.

When I launch the second instance, I get a bind error too, but the instance continues to run. Is your second instance aborting? Could that be due to the presence or absence of other network cards?

@decarv
Contributor

decarv commented Apr 3, 2024

> When I launch the second instance, I get a bind error too, but the instance continues to run. Is your second instance aborting?

Yes, mine aborts right after returning from the pgagroal_bind function.

$ ./pgagroal -c pgagroal.conf
2024-04-03 10:54:45 DEBUG configuration.c:2656 PID file automatically set to: [./pgagroal.2345.pid]
2024-04-03 10:54:45 DEBUG network.c:648 server: bind: localhost:2345 (Address already in use)
2024-04-03 10:54:45 FATAL main.c:924 pgagroal: Could not bind to localhost:2345

> Could that be due to the presence or absence of other network cards?

I have looked into this, and it may be a matter of how the OS deals with SO_REUSEADDR, but I need to research more. Do you have details on the address-port pairs that were bound by each process?

@fluca1978
Collaborator Author

Apparently the problem is with the `host` configuration: if set to `localhost`, the second instance aborts as expected:

% pgagroal
-> DEBUG network.c:648 server: bind: localhost:54322 (Address already in use)
-> DEBUG network.c:648 server: bind: localhost:54322 (Address already in use)
-> FATAL main.c:924 pgagroal: Could not bind to localhost:54322

but when set to * the second instance runs.

fluca1978 added a commit to fluca1978/pgagroal that referenced this issue Apr 4, 2024
If the configuration `host` is set to `*`, meaning all the interfaces,
the `pgagroal_bind` function was always returning a true value even if
no available bind addresses were left.
In order to fix this, the commit tests if the `star_length` variable
has been set to something non-zero, otherwise no more sockets are
available (bound) and the function returns a `1` to indicate an error.

Close agroal#426
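
The logic described in the commit message can be sketched roughly as follows. This is not the actual pgagroal code: `bind_all`, `bind_attempt_fn`, and the two stub functions are hypothetical names used only for illustration, and only the `star_length` counter name is borrowed from the commit message. The point is simply that the function reports an error (`1`) when not a single address could be bound.

```c
#include <stddef.h>

/* Hypothetical stand-in for a single bind attempt: returns 0 on success. */
typedef int (*bind_attempt_fn)(const char* address, int port);

/* Stub simulating "Address already in use" on every interface. */
int
stub_bind_in_use(const char* address, int port)
{
   (void)address;
   (void)port;
   return 1;
}

/* Stub simulating a successful bind on every interface. */
int
stub_bind_ok(const char* address, int port)
{
   (void)address;
   (void)port;
   return 0;
}

/* Sketch of the `host = *` case: try every address and count how many
 * sockets were actually bound. Returns 0 if at least one bind succeeded,
 * 1 if none did (so the caller can abort). */
int
bind_all(const char** addresses, size_t count, int port, bind_attempt_fn try_bind)
{
   int star_length = 0;

   for (size_t i = 0; i < count; i++)
   {
      if (try_bind(addresses[i], port) == 0)
      {
         star_length++;
      }
   }

   if (star_length == 0)
   {
      /* Nothing was bound: report an error instead of success. */
      return 1;
   }

   return 0;
}
```

With the always-failing stub, `bind_all` returns `1`, which corresponds to the second instance aborting instead of continuing to run.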
jesperpedersen pushed a commit that referenced this issue Apr 4, 2024
EuGig pushed a commit to EuGig/pgagroal that referenced this issue Apr 6, 2024