PostgreSQL not starting correctly if image already contains data #5359

Open
TomCools opened this issue May 10, 2022 · 2 comments

@TomCools

Hi everyone. Reporting this after my TestContainers talk at Jfokus and talking about this with @kiview.

Description

If the image you are running already contains data in the database (because it was built for that purpose, with built-in test data or with migrations already run), then Testcontainers doesn't correctly recognize that the database has started.

The reason for this is the default WaitStrategy, which waits until "database system is ready" is logged 2 times.

// From PostgreSQLContainer.class
this.waitStrategy = (new LogMessageWaitStrategy())
        .withRegEx(".*database system is ready to accept connections.*\\s")
        .withTimes(2)
        .withStartupTimeout(Duration.of(60L, ChronoUnit.SECONDS));

This waiter is correct when running with an empty database. Starting a new container as shown below results in a log file that contains that log line twice.

@Container
public static PostgreSQLContainer<?> postgreSQLContainer = new PostgreSQLContainer<>("postgres:14.2") // database image without data
        .withDatabaseName("testcontainer")
        .withUsername("sa")
        .withPassword("sa");

This results in the following logs: postgresdb-no-data.txt

However, when you build a custom image which includes some data (like I have done below), then the logs will only contain "database system is ready" a single time.

private static final DockerImageName IMAGE = DockerImageName.parse("tomcools/postgres:dev")
        .asCompatibleSubstituteFor("postgres");

@Container
public static PostgreSQLContainer<?> postgreSQLContainer = new PostgreSQLContainer<>(IMAGE)
        .withDatabaseName("testcontainer")
        .withUsername("sa")
        .withPassword("sa");

This results in the following logs: postgresdb-with-data-test-logs.txt
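
To make the mismatch concrete, here is a minimal sketch that counts how often the "ready" message occurs in a captured log. The class name `ReadyLineCounter` is hypothetical, and the two log strings are abbreviated stand-ins assembled from the messages named in this thread, not real container output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Counts occurrences of the "ready" message in a captured log (illustrative only). */
final class ReadyLineCounter {

    // The message the default wait strategy looks for, without the surrounding ".*".
    private static final Pattern READY =
            Pattern.compile("database system is ready to accept connections");

    static int countReady(String log) {
        Matcher matcher = READY.matcher(log);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumption: abbreviated stand-ins for the two startup sequences.
        String emptyDatabaseLog = String.join("\n",
                "database system is ready to accept connections",
                "PostgreSQL init process complete",
                "database system is ready to accept connections");
        String prePopulatedLog = String.join("\n",
                "PostgreSQL Database directory appears to contain a database",
                "database system is ready to accept connections");

        System.out.println("empty database: " + countReady(emptyDatabaseLog));
        System.out.println("pre-populated:  " + countReady(prePopulatedLog));
    }
}
```

A strategy configured with `withTimes(2)` would be satisfied by the first sequence but would keep waiting on the second until the startup timeout expires.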

Workaround

The way I have worked around this for now is to replace the default waiter with a custom one that only waits until the log line has appeared a single time.

    private static final DockerImageName IMAGE = DockerImageName.parse("tomcools/postgres:dev")
            .asCompatibleSubstituteFor("postgres");

    @Container
    public static PostgreSQLContainer<?> postgreSQLContainer = new PostgreSQLContainer<>(IMAGE)
            // Custom waiter: the "ready" line appears only once when the image already contains data
            .waitingFor(new LogMessageWaitStrategy()
                    .withRegEx(".*database system is ready to accept connections.*\\s")
                    .withTimes(1)
                    .withStartupTimeout(Duration.of(60L, ChronoUnit.SECONDS))
            );

Possible solution directions

The contribution documentation states that, in order for something to become a module, it needs to "add value", where one of the examples given is:

does it add technology-specific wait strategies?

Given that statement, I'd expect the PostgreSQLContainer to be bootstrapped with a waiter that can handle images both with and without data present.

Possible ideas:

  • Create a "withData(boolean)" method that sets up a different waiter, but that doesn't feel intuitive;
  • Allow composing WaitStrategy instances: this might need some AND/OR/NOT logic, and it would allow creating a more complex wait strategy here that checks for either 2x the original log line (empty db), or 1x "data present" + 1x the original log line;
  • Create a JDBC polling wait strategy that tries to establish a JDBC connection (we already have the JDBC connection URL), or potentially even executes a test query;
  • Something else???
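
As a rough illustration of the composition idea, the sketch below models each condition as a `Predicate<String>` over the accumulated log text and combines them with `and`/`or`. Everything here is hypothetical: `LogConditions`, `seen`, and `postgresReady` are not part of the Testcontainers API, and the "data present" message is the one quoted in the commits referenced later in this thread.

```java
import java.util.function.Predicate;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical composable log conditions; not part of the Testcontainers API. */
final class LogConditions {

    static final String READY = "database system is ready to accept connections";
    static final String DATA_PRESENT =
            "PostgreSQL Database directory appears to contain a database";

    /** True once the literal message has appeared at least {@code times} times. */
    static Predicate<String> seen(String message, int times) {
        Pattern pattern = Pattern.compile(Pattern.quote(message));
        return log -> {
            Matcher matcher = pattern.matcher(log);
            int count = 0;
            while (count < times && matcher.find()) {
                count++;
            }
            return count >= times;
        };
    }

    /** Ready when either startup sequence has been observed. */
    static Predicate<String> postgresReady() {
        Predicate<String> emptyDatabase = seen(READY, 2);
        Predicate<String> prePopulated = seen(DATA_PRESENT, 1).and(seen(READY, 1));
        return emptyDatabase.or(prePopulated);
    }
}
```

A real implementation would have to evaluate such a condition against the container's log stream as it grows; this sketch only shows the AND/OR shape of the check.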

I'm willing to help implement this. If no solution is implemented, I'd say we at least document the workaround on the PostgreSQL page.

Kind regards,
Tom

eddumelendez added a commit that referenced this issue May 12, 2022
A more generic wait strategy for jdbc. i.e Postgres has different
output when the container database has data that doesn't match with
the default wait strategy.

See gh-5359
@kiview
Member

kiview commented May 13, 2022

Hey @TomCools, sorry for answering so late, especially considering you did such a great job authoring the issue 🙂

We already have a WaitAllStrategy that allows composing, but it only covers the AND case. It would be interesting to think about this in a more composable way.

The JDBC strategy intuitively sounds like a good idea, but the funny thing is that we used to have it implicitly in the past, and it led to a race condition because of the restart behavior of the official postgres image (the strategy could sometimes succeed before the restart).
See #322 and #5363

Since @eddumelendez started to look into this as well, depending on the image version, we might be able to identify better and more stable log message readiness indicators.

@TomCools
Author

> Hey @TomCools, sorry for answering so late, especially considering you did such a great job authoring the issue 🙂

Busy conference life eh? 🙂


On topic:
I understand, take your time. @eddumelendez, I'm on the Slack, if you want to collaborate a bit on this, we can talk there!

eddumelendez added a commit that referenced this issue May 16, 2022
A more generic wait strategy for postgres. i.e Postgres has different
output when the container database has data that doesn't match with
the default wait strategy.

First evaluation looks for (db doesn't contain data):
* PostgreSQL init process complete
* database system is ready to accept connections

Second evaluation looks for (db contains data):
* PostgreSQL Database directory appears to contain a database
* database system is ready to accept connections

See gh-5359
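
One way to read the two evaluations above is as an order-aware check: in either case a marker line must be followed by the "ready" line. The sketch below expresses that as a single regular expression. It is not the code from the referenced commit; `StartupSequenceMatcher` and `isReady` are hypothetical names, and it assumes that in the empty-database case the first "ready" line is logged before the init-complete marker.

```java
import java.util.regex.Pattern;

/** Hypothetical order-aware readiness check; not the code from the referenced commit. */
final class StartupSequenceMatcher {

    // Either marker must be followed (possibly many lines later) by the "ready" line.
    // DOTALL lets ".*?" span line breaks in the accumulated log.
    private static final Pattern READY_AFTER_MARKER = Pattern.compile(
            "(?:PostgreSQL init process complete"
                    + "|PostgreSQL Database directory appears to contain a database)"
                    + ".*?database system is ready to accept connections",
            Pattern.DOTALL);

    static boolean isReady(String accumulatedLog) {
        return READY_AFTER_MARKER.matcher(accumulatedLog).find();
    }
}
```

Because the "ready" line only counts when it comes after a marker, a "ready" line logged before the init-complete marker would not satisfy this check on its own.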
eddumelendez added a commit that referenced this issue May 17, 2022
eddumelendez added a commit to eddumelendez/testcontainers-java that referenced this issue Feb 18, 2023