Reusable containers #1781

Merged
merged 17 commits into from Oct 26, 2019

Conversation

@bsideup
Member

bsideup commented Aug 24, 2019

This PR implements a first step for the reusability of containers (see #781).

Current state

What is implemented:

  • Hashing
  • JDBC URL support
  • Check for overrides like containerIsCreated (we cannot assume reusability if containerIsCreated is overridden)
  • Per-environment enablement
  • Read testcontainers.reuse.enable only from ~/.testcontainers.properties

TODO list:

  • Hashing of copied files
  • Add TC's version to the hash or add "hashing version" property

What is not implemented but planned as step 2:

  • ℹ️ Locking - two parallel tests with the same hash will reuse the same container
  • ℹ️ TTL - containers won't get destroyed after some time (Docker Compose -like behaviour)
  • ⚠️ Cleanup - if the configuration changes, a new container will be started, but the old one will not be destroyed and has to be terminated manually by the user
  • "Started successfully" marker to avoid a race condition where a container was created & hashed but the startup checks were not executed

Challenges ahead

  • Networks
  • Multi-container setups
  • There must be something else 😅

Prerequisites

Add testcontainers.reuse.enable=true to ~/.testcontainers.properties file on your local machine.

Example usage with container objects

KafkaContainer kafka = new KafkaContainer()
    .withNetwork(null)
    .withReuse(true);
kafka.start();

testKafkaFunctionality(kafka.getBootstrapServers());

Here we unset the network that was implicitly created by KafkaContainer (but not used in this case), because otherwise the network id will always be random and affect the hash.
Kafka is an exceptional case (to be fixed; the behaviour is left in place for backward compatibility), and most other containers do not set the network implicitly.

Output:

19:27:07.019 INFO  🐳 [confluentinc/cp-kafka:5.2.1] - Creating container for image: confluentinc/cp-kafka:5.2.1
19:27:07.109 INFO  🐳 [confluentinc/cp-kafka:5.2.1] - Reusing container with ID: 86fb93cd10118c147f419d485781b999c2340b10d3b956ea5bb1bf5e7eaaae2a and hash: e10c95bab8779258336609315f4efa72d191aa9f
19:27:07.115 INFO  🐳 [confluentinc/cp-kafka:5.2.1] - Container confluentinc/cp-kafka:5.2.1 is starting: 86fb93cd10118c147f419d485781b999c2340b10d3b956ea5bb1bf5e7eaaae2a
19:27:07.269 INFO  🐳 [confluentinc/cp-kafka:5.2.1] - Container confluentinc/cp-kafka:5.2.1 started in PT0.295S
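
The note about the random network id is easier to see with a toy model of configuration hashing. The sketch below is not the hashing code from this PR: the `ReuseHashSketch` class, the `configHash` helper and the map-of-settings representation are assumptions made for illustration, and SHA-1 is chosen only because the hashes in the log above are 40 hex characters long. It shows why any randomly generated value in the configuration (such as an implicitly created network id) yields a different hash on every run and so prevents reuse.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

// Hypothetical illustration only; not the hashing code from this PR.
final class ReuseHashSketch {

    // Hashes a configuration map in a stable (sorted-key) order using SHA-1.
    static String configHash(Map<String, String> config) {
        StringBuilder canonical = new StringBuilder();
        for (Map.Entry<String, String> e : new TreeMap<>(config).entrySet()) {
            canonical.append(e.getKey()).append('=').append(e.getValue()).append('\n');
        }
        try {
            MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
            byte[] digest = sha1.digest(canonical.toString().getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            // SHA-1 must be supported by every Java platform implementation.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> stable = new TreeMap<>();
        stable.put("image", "confluentinc/cp-kafka:5.2.1");
        // Same configuration twice: the hashes match, so a container could be reused.
        System.out.println(configHash(stable).equals(configHash(stable)));

        // A random network id changes the hash on every run, so nothing matches.
        Map<String, String> withRandomNetwork = new TreeMap<>(stable);
        withRandomNetwork.put("network", UUID.randomUUID().toString());
        System.out.println(configHash(stable).equals(configHash(withRandomNetwork)));
    }
}
```

Sorting the keys before hashing keeps the result independent of insertion order, which is the property a reuse lookup would need.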

Example usage with JDBC URLs

Instant startedAt = Instant.now();
Connection connection = DriverManager.getConnection(
    "jdbc:tc:db2:///?TC_REUSABLE=true"
);

connection.createStatement().execute("SELECT 1 FROM SYSIBM.SYSDUMMY1");

System.out.println("Total test time: " + Duration.between(startedAt, Instant.now()));

Output:

19:10:47.273 INFO  🐳 [ibmcom/db2:11.5.0.0a] - Creating container for image: ibmcom/db2:11.5.0.0a
19:10:47.375 INFO  🐳 [ibmcom/db2:11.5.0.0a] - Reusing container with ID: 7b2da71c8c0c3aa901f5a610bd181e3a398ec4f3c8d22dbd491e23f5e1af47e9 and hash: 848385e00a45f9e1eecc42e90e0bcb7e1f2391bd
19:10:47.381 INFO  🐳 [ibmcom/db2:11.5.0.0a] - Container ibmcom/db2:11.5.0.0a is starting: 7b2da71c8c0c3aa901f5a610bd181e3a398ec4f3c8d22dbd491e23f5e1af47e9
19:10:47.382 INFO  🐳 [ibmcom/db2:11.5.0.0a] - Container ibmcom/db2:11.5.0.0a started in PT0.148S

Total test time: PT2.782S
@bsideup bsideup requested a review from testcontainers/core-team Aug 24, 2019
*/
@Setter(AccessLevel.NONE)
- protected DockerClient dockerClient = DockerClientFactory.instance().client();
+ protected DockerClient dockerClient = LazyDockerClient.INSTANCE;

@bsideup

bsideup Aug 24, 2019

Author Member

needed this one for unit tests, plus good improvement in general. Was also reported as #1749

@aguibert

aguibert Aug 28, 2019

Contributor

awesome, thanks for doing this change @bsideup!


logger().info("Starting container with ID: {}", containerId);
dockerClient.startContainerCmd(containerId).exec();
if (containerInfo == null) {

@bsideup

bsideup Aug 24, 2019

Author Member

Important information for the reviewers:
these containerInfo == null branches are important to review carefully

bsideup added 8 commits Aug 25, 2019
@@ -1246,6 +1333,11 @@ public SELF withTmpFs(Map<String, String> mapping) {
return self();
}

public SELF withReuse(boolean reusable) {

@rnorth

rnorth Aug 26, 2019

Member

Re this bit of the API, I'm afraid I still really would like this to not be a boolean, so that we have room for expansion. e.g. to have an enum or interface to allow (pseudo-code-ish):

.withReuse( Reuse.reuseContainers() )

but also allow, for the sake of #1713 (and the hilariously old draft implementation):

.withReuse ( Reuse.reuseImagesAfterInitialization() ) // definitely needs a better name

I think we could get the second style of reuse done very quickly for JDBC-based containers.

With the reduced scope of container reuse in this PR, do you think this buys us enough cognitive space to fit this flexibility in?

@bsideup

bsideup Aug 26, 2019

Author Member

While I understand why we would want to add it, here are my 2c on why I think we should not do it as part of the first step:

  1. A JDBC URL can only carry a boolean (or a lambda reference, but that's tricky)
  2. withReuse(boolean) is easy to extend later with an overloaded withReuse(ReuseStrategy)
  3. Image caching is another great optimization, but I don't feel that it is the same thing as container reuse: we would still need to start a new container every time (which is also CI friendly, unlike this PR), which kinda makes it sugar for the image plus capturing an image after the container has started.
  4. Reuse currently has to be enabled with the property. Having different strategies may make that harder, because I assume enablement would be per-strategy

The implementation details of the reuse also make it not easy to have strategies for it, and we may never have more than one (given №3)

@rnorth

rnorth Oct 20, 2019

Member

OK, I'm convinced 😄

I'll add a 5th reason: while I'd love to do it, we may never get around to image reuse. To be pragmatic, I don't want to hold up this almost-finished feature for a someday-maybe feature.
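
Point 2 in the thread above (a boolean method that can later gain a strategy overload) can be sketched as follows. `ReuseStrategy` and `SketchContainer` are hypothetical names used only for illustration; this PR adds only the boolean form, and nothing here is Testcontainers API.

```java
// Hypothetical sketch: ReuseStrategy and SketchContainer are illustrative names.
enum ReuseStrategy { NONE, REUSE_CONTAINERS }

class SketchContainer {
    private ReuseStrategy reuseStrategy = ReuseStrategy.NONE;

    // The boolean form, mirroring the shape of withReuse(boolean) in this PR.
    SketchContainer withReuse(boolean reusable) {
        return withReuse(reusable ? ReuseStrategy.REUSE_CONTAINERS : ReuseStrategy.NONE);
    }

    // A possible later overload; existing withReuse(true) callers keep compiling.
    SketchContainer withReuse(ReuseStrategy strategy) {
        this.reuseStrategy = strategy;
        return this;
    }

    ReuseStrategy getReuseStrategy() {
        return reuseStrategy;
    }

    public static void main(String[] args) {
        System.out.println(new SketchContainer().withReuse(true).getReuseStrategy());
        System.out.println(new SketchContainer().withReuse(ReuseStrategy.NONE).getReuseStrategy());
    }
}
```

Because a `boolean` argument and an enum argument never compete during overload resolution, adding the second method later would not break source compatibility for callers of the first.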

try {
Method method = type.getDeclaredMethod("containerIsCreated", String.class);
if (method.getDeclaringClass() != GenericContainer.class) {
logger().warn("{} can't be reused because it overrides {}", getClass(), method.getName());

@rnorth

rnorth Aug 26, 2019

Member

Very good idea.
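
For readers who want to see the override check in isolation, here is a minimal reflective sketch. `BaseContainer`, `PlainContainer`, `CustomContainer` and `overridesHook` are stand-in names (the PR itself works against `GenericContainer`), and the loop over superclasses is an assumption about how such a check could be structured, not a copy of the PR's code.

```java
// Hypothetical stand-ins; the real base class in the PR is GenericContainer.
class BaseContainer {
    protected void containerIsCreated(String containerId) {
        // no-op lifecycle hook
    }
}

class PlainContainer extends BaseContainer {
}

class CustomContainer extends BaseContainer {
    @Override
    protected void containerIsCreated(String containerId) {
        // custom post-creation work whose effect cannot be captured in a hash
    }
}

final class OverrideCheckSketch {

    // Returns true if any class from 'type' up to (but excluding) BaseContainer declares the hook.
    static boolean overridesHook(Class<?> type) {
        for (Class<?> c = type; c != null && c != BaseContainer.class; c = c.getSuperclass()) {
            try {
                c.getDeclaredMethod("containerIsCreated", String.class);
                return true; // declared below the base class, so it is an override
            } catch (NoSuchMethodException e) {
                // not declared on this class; keep walking up the hierarchy
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(overridesHook(PlainContainer.class));
        System.out.println(overridesHook(CustomContainer.class));
    }
}
```

A container whose hook is overridden would then be excluded from reuse, matching the warning logged in the diff above.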

bsideup added 3 commits Aug 26, 2019
# Conflicts:
#	core/src/main/java/org/testcontainers/images/builder/ImageFromDockerfile.java
#	core/src/main/java/org/testcontainers/utility/TestcontainersConfiguration.java
createCommand.getLabels().put(HASH_LABEL, hash);
}
} else {
logger().info("Reuse was requested but the environment does not support the reuse of containers");

@aguibert

aguibert Aug 28, 2019

Contributor

IMO we should make this a big/obvious INFO message that includes instructions on how to enable reusable containers. This way users that don't closely follow the new TC features will still hear about it and know how to enable it.
I'm thinking something like:

################################################################
Reuse was requested but the environment does not support the reuse of containers. 
To enable container reuse, add the property 'testcontainers.reuse.enable=true' 
to a file at ~/.testcontainers.properties (you may need to create it).
################################################################

@bsideup

bsideup Aug 29, 2019

Author Member

This warning will only appear if they use withReuse(true), which means that they are aware of the feature already :)

Also, we expect this feature to be pretty well discoverable by anyone who cares about the speed of the tests (will also make sure that the docs explain it well), and I don't feel that we need the banner

@aguibert

aguibert Aug 29, 2019

Contributor

I think for projects with multiple developers it may be less obvious. One developer on a team may know about the feature and add withReuse(true) but other developers may not

@dsyer

dsyer Sep 10, 2019

I just tried this feature for the first time, and I have to say it's completely non-obvious. I see the INFO log, but there is no hint about what I need to do to fix it.

@dsyer

dsyer Sep 10, 2019

BTW the file name in the banner hint above appears to be wrong as well. I think it needs to be ~/.testcontainers.properties. Windows users are probably going to be confused by that as well. Is it the only way to set that property?

@alex-sherwin

alex-sherwin Jan 21, 2020

Was just trying to debug this feature... it is very difficult to discover.

Additionally, it's set up such that it forces you to use ~/.testcontainers.properties to configure it... why not allow it to also be set from the classpath testcontainers.properties?

By forcing it to be under $HOME, you're forcing end users to set up this file on all build servers, every developer workstation, etc. I don't see why you can't let the end user opt in to this feature with the classpath config file; this makes the most sense for enterprise development, where consistency is king.

@aguibert

aguibert Jan 21, 2020

Contributor

The intent of this feature (reusing containers across multiple test runs) was to only have it apply for developer workstations. @bsideup specifically did not want it to be used for CI/build servers because container instances could pile up, bog down remote servers and easily go unnoticed.

That being said, I do think a more prominent INFO/WARN message for this situation would help users figure out how to turn this on in their local dev workstations.

@bsideup

bsideup Jan 24, 2020

Author Member

Thanks @aguibert for providing a good answer :)

So yeah, what Andy said :)

> I do think a more prominent INFO/WARN message for this situation would help users figure out how to turn this on in their local dev workstations.

Yes, next iteration we will definitely improve the UX around it 👍

@checketts

checketts Jan 24, 2020

What if, when a user does check this in to the file on their classpath, you ignore it (as you already do) but also print the explanation that @aguibert mentioned, noting that the setting needs to be set elsewhere?

(Perhaps that is what you meant by improve the UX)

@bsideup

bsideup Jan 24, 2020

Author Member

yes, we will definitely consider reporting environment-specific configuration set in the classpath file! Good suggestion 👍

alex-konn added a commit to Graylog2/graylog2-server that referenced this pull request Nov 22, 2019
Added the new `withReuse(true)` flag from this new testcontainers
feature: testcontainers/testcontainers-java#1781

This allows not only reusing a container between different tests in the
same run, but also keeping the container alive for subsequent runs to
decrease container startup time.
bsideup added a commit that referenced this pull request Nov 22, 2019
To make KafkaContainer work with #1781 without having to nullify the network, we should remove the old, deprecated behaviour where the network was implicitly created in the constructor.
bernd added a commit to Graylog2/graylog2-server that referenced this pull request Nov 25, 2019
…#6843)
@aritzbastida

aritzbastida commented Nov 27, 2019

I just tested this reusable container feature in my local environment for a database container, and got mixed feelings.

I set up this JDBC URL:
jdbc:tc:oracle:thin:@///xe?TC_INITSCRIPT=data/ddl_gstr_tc.sql&TC_REUSABLE=true

I also enabled the reuse flag in my user home settings:
testcontainers.reuse.enable=true

My expectation:

  • The container survives the integration tests, and can be reused when launched again.
  • The init script is executed just once, when the container is initially created (not reused).

The first point appears to be true. That can be verified with docker ps. The same container is reused over and over. However, if TC_REUSABLE is true and the reuse flag is not set (case for CI environment), then the container will still survive the test, but won't be reused. As a result, a new container is created each time, and never killed. This is the situation after 3 tests:

$ docker ps
CONTAINER ID    IMAGE                                    CREATED         STATUS
f6d131d2cf31    oracleinanutshell/oracle-xe-11g:latest   2 minutes ago   Up 2 minutes       
d6dd56817fdd    oracleinanutshell/oracle-xe-11g:latest   4 minutes ago   Up 4 minutes        
367ccd485a8a    oracleinanutshell/oracle-xe-11g:latest   7 hours ago     Up 7 hours

As for the second point, the main use case for an init script is setting up a database schema, something that is only needed once. So it would make more sense not to execute the script when the container is reused.

BTW, should I open a new issue for this?

linuspahl added a commit to Graylog2/graylog2-server that referenced this pull request Nov 27, 2019
…#6843)
@bsideup

Member Author

bsideup commented Nov 27, 2019

@aritzbastida
Thanks for trying!

> However, if TC_REUSABLE is true and the reuse flag is not set (case for CI environment), then the container will still survive the test, but won't be reused. As a result, a new container is created each time, and never killed

There is a bug (see #2051) that will be fixed in the next release.

> So it would make more sense not to execute the script when the container is reused.

Will also be included in the next release, see #2052

@aritzbastida

aritzbastida commented Nov 27, 2019

Thank you very much for the early reply! Glad to hear it will be fixed :)

By the way, will this feature be documented on the testcontainers.org website? This PR was the only place I could find how to set it up.

@bsideup

Member Author

bsideup commented Nov 27, 2019

@aritzbastida
It will be, eventually :)
We have a bit of a capacity problem (and will definitely appreciate a contribution to the docs!)

@aritzbastida

aritzbastida commented Nov 27, 2019

Thanks again! I'd be glad to contribute, though for now I'm a complete newbie in Testcontainers (just used it for the first time).

@aguibert

Contributor

aguibert commented Nov 27, 2019

@bsideup @aritzbastida I've also had a similar expectation with DB containers and init scripts. Looking at #2052, it's not immediately obvious to me how this solves the problem of only running init scripts once?

Perhaps we could add a new API to the JdbcDatabaseContainer class:

/**
 * Runs a SQL initialization script after this container is started
 * @param initScriptPath Path to the initialization script
 * @param runOnReuse True if the initialization script should run again on a container that has already been started and is being reused. False otherwise.
 */
public SELF withInitScript(String initScriptPath, boolean runOnReuse) {
 // ...
}
@aguibert

Contributor

aguibert commented Nov 27, 2019

If you think this change would be a good enhancement I'd be happy to draft up a PR with the change

@aritzbastida

aritzbastida commented Nov 30, 2019

I just tested 1.12.4 and I can confirm that #2051 is now fixed.

However, as @aguibert pointed out, the issue with the init script (#2052) still remains. The init script is executed again when the container is reused, failing like this:

Caused by: oracle.jdbc.OracleDatabaseException: ORA-00955: name is already used by an existing object
@bsideup

Member Author

bsideup commented Dec 1, 2019

@aritzbastida @aguibert
could you please create a separate issue for the init script? There are a few things to discuss before implementing it and I wanted to make sure that we're all on the same page :)

@aguibert

Contributor

aguibert commented Dec 1, 2019

Created separate issue for the init script here: #2127

@jsauvain

jsauvain commented Dec 4, 2019

@bsideup When using 1.12.4 and trying to manually stop the container, the Ryuk container doesn't destroy itself and keeps running indefinitely, even though the test container has already been shut down.

Maybe it's a bug from the "Start Ryuk lazy" PR

@bsideup

Member Author

bsideup commented Dec 4, 2019

@jsauvain please submit an issue (preferably with a reproducer). Thanks

@sirianni

sirianni commented Dec 9, 2019

Thanks for this great library and this useful enhancement!

I must be missing something obvious, but even when setting withReuse(true) my containers are still stopped at the end of each test class (via the test rule). Looking at the diff there are no changes to the GenericContainer.stop() method to prevent stopping of reusable containers. How should I prevent my containers from being stopped?

@aguibert

Contributor

aguibert commented Dec 9, 2019

Hi @sirianni, do you have testcontainers.reuse.enable=true set in your ~/.testcontainers.properties file? That is a required step for reusable containers.

@bsideup

Member Author

bsideup commented Dec 9, 2019

@sirianni
Thanks for the feedback!

Yes, calling .stop() on a container will terminate it; this is intended behaviour. If you want to reuse a container, the easiest way is to use the Singleton Container pattern, so that you only call .start() but never .stop().

Also, make sure that you've done this:

Add testcontainers.reuse.enable=true to ~/.testcontainers.properties file on your local machine.
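
A minimal sketch of the Singleton Container pattern mentioned above, written so that it runs without Docker: `FakeContainer` is a hypothetical stand-in for a real container class, and the other class names are illustrative as well. With a real container, the static block would configure it with `.withReuse(true)` and call `start()` in the same place.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a real container class, so the sketch runs without Docker.
class FakeContainer {
    static final AtomicInteger STARTS = new AtomicInteger();

    void start() {
        STARTS.incrementAndGet();
    }
}

// Shared base class: the container is started once per JVM and never stopped explicitly.
abstract class AbstractIntegrationTestSketch {
    static final FakeContainer CONTAINER;

    static {
        CONTAINER = new FakeContainer();
        CONTAINER.start(); // note: no matching stop() call anywhere
    }
}

class FirstTest extends AbstractIntegrationTestSketch {
}

class SecondTest extends AbstractIntegrationTestSketch {
}

final class SingletonDemo {
    public static void main(String[] args) {
        new FirstTest();
        new SecondTest();
        // Number of times the shared container was started across both test classes.
        System.out.println(FakeContainer.STARTS.get());
    }
}
```

The static initializer runs when the base class is first initialized, so every test class that extends it shares the same started container.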

@sirianni

sirianni commented Dec 9, 2019

Thanks @aguibert and @bsideup - yes I do have the properties configured properly. Thanks for the pointer to the SingletonContainer pattern.

How do I know in my singleton whether or not the container is enabled for reuse? I assume that I only want to skip the stop() call for environments where developers have set testcontainers.reuse.enable=true.

I saw the comment above requesting an isReused() attribute.

@bsideup

Member Author

bsideup commented Dec 9, 2019

@sirianni

> I assume that I only want to skip the stop() call for environments where developers have set testcontainers.reuse.enable=true.

Why? Testcontainers will stop it for you after the tests (in the most reliable way) if reuse is not enabled.

> I saw the comment above requesting an isReused() attribute.

The comment was about the hooks and being able to detect the reuse mode for some cleanups - this is a bit different thing.

@sirianni

sirianni commented Dec 9, 2019

> Why? Testcontainers will stop it for you after the tests (in the most reliable way) if reuse is not enabled.

Ah OK. Guess I missed this part in the docs.

> At the end of the test suite the Ryuk container that is started by Testcontainers core will take care of stopping the singleton container.

So the Ryuk container has the smarts to leave the container running if reuse is enabled?

@bsideup

Member Author

bsideup commented Dec 9, 2019

@sirianni

> So the Ryuk container has the smarts to leave the container running if reuse is enabled?

Yes, that's correct. In the next versions, there will be a separate sidecar to also shut down these reusable containers after some TTL

@vmassol

Contributor

vmassol commented Jan 7, 2020

Interesting (but dangerous) feature!

I can see a use case when there are containers that take a very long time to start the first time (e.g. the Oracle Database 19.3 container takes 20-40 minutes to start the first time). However, I think I'd prefer to docker commit the state into a new image after it's started and reuse that new image.

However, the local dev machine seems a good target for this feature, to save some time during execution (if you can make it deterministic, which is the biggest worry IMO). The debug use case is also great.

Thanks for working on this! :)

@@ -85,12 +86,17 @@ public boolean isDisableChecks() {
return Boolean.parseBoolean((String) environmentProperties.getOrDefault("checks.disable", "false"));
}

@UnstableAPI
public boolean environmentSupportsReuse() {
return Boolean.parseBoolean((String) environmentProperties.getOrDefault("testcontainers.reuse.enable", "false"));

@gvdp

gvdp Jan 24, 2020

Is there a reason this is fetched from the environmentProperties and not just the general properties? I just added this property to my testcontainers.properties file on my classpath and was wondering for a long time why this wasn't working until I saw this line.

@bsideup

bsideup Jan 24, 2020

Author Member

See this thread:
#1781 (comment)

TL;DR: it is a parameter of every environment and cannot be shared / committed to the repo. Also, it SHOULD NOT be used on CI environments.
The PR's description states it pretty clearly and does not mention the classpath file anywhere :)
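
To make the distinction in this thread concrete, here is a small sketch of a check that consults only the per-user file and ignores a classpath setting. The class name, the `load` helper and the two-argument method shape are assumptions for illustration; only the property name `testcontainers.reuse.enable` and the `~/.testcontainers.properties` location come from this PR.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical sketch; not the TestcontainersConfiguration implementation.
final class ReuseEnablementSketch {

    // Loads a properties file if it exists; a missing file yields empty properties.
    static Properties load(Path file) throws IOException {
        Properties props = new Properties();
        if (Files.isRegularFile(file)) {
            try (InputStream in = Files.newInputStream(file)) {
                props.load(in);
            }
        }
        return props;
    }

    // Only the environment (user-home) properties are consulted.
    // The classpath properties are accepted but deliberately ignored.
    static boolean environmentSupportsReuse(Properties environmentProperties, Properties classpathProperties) {
        return Boolean.parseBoolean(
            environmentProperties.getProperty("testcontainers.reuse.enable", "false"));
    }

    public static void main(String[] args) throws IOException {
        Path userFile = Paths.get(System.getProperty("user.home"), ".testcontainers.properties");
        Properties env = load(userFile);

        // Simulates a property committed to a classpath testcontainers.properties file.
        Properties classpath = new Properties();
        classpath.setProperty("testcontainers.reuse.enable", "true");

        System.out.println("reuse enabled: " + environmentSupportsReuse(env, classpath));
    }
}
```

Under this model, committing the property to the repository has no effect, which matches the behaviour gvdp ran into.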

@bsideup

Member Author

bsideup commented Jan 24, 2020

@vmassol

> I'd prefer to docker commit

even if you commit the state, starting a container takes some seconds (we tried :D)

> if you can make it deterministic

That was the biggest challenge when introducing this feature and we always keep it in mind. It is also the reason why it took so long to introduce an alpha version of this feature - we wanted to make it right and ensure that wrong containers are not accidentally reused :)

> Thanks for working on this! :)

Thanks for the feedback! :)
