[Bug]: failed to connect to reaper: dial tcp 127.0.0.1:33267: connect: connection refused: Connecting to Ryuk on localhost:33267 failed #2618

Closed
maxmzkr opened this issue Jul 1, 2024 · 5 comments
Labels
bug An issue with the library


maxmzkr commented Jul 1, 2024

Testcontainers version

0.31.0

Using the latest Testcontainers version?

Yes

Host OS

Linux

Host arch

x86

Go version

1.21.4

Docker version

docker version
Client: Docker Engine - Community
 Version:           26.1.3
 API version:       1.45
 Go version:        go1.21.10
 Git commit:        b72abbb
 Built:             Thu May 16 08:35:10 2024
 OS/Arch:           linux/amd64
 Context:           default

Server: Docker Engine - Community
 Engine:
  Version:          26.1.3
  API version:      1.45 (minimum version 1.24)
  Go version:       go1.21.10
  Git commit:       8e96db1
  Built:            Thu May 16 08:33:26 2024
  OS/Arch:          linux/amd64
  Experimental:     false
 containerd:
  Version:          1.6.32
  GitCommit:        8b3b7ca2e5ce38e8f31a34f35b2b68ceb8470d89
 runc:
  Version:          1.1.12
  GitCommit:        v1.1.12-0-g51d5e94
 docker-init:
  Version:          0.19.0
  GitCommit:        de40ad0

Docker info

docker info
Client: Docker Engine - Community
 Version:    26.1.3
 Context:    default
 Debug Mode: false
 Plugins:
  buildx: Docker Buildx (Docker Inc.)
    Version:  v0.14.0
    Path:     /usr/libexec/docker/cli-plugins/docker-buildx
  compose: Docker Compose (Docker Inc.)
    Version:  v2.27.0
    Path:     /usr/libexec/docker/cli-plugins/docker-compose

Server:
 Containers: 61
  Running: 3
  Paused: 0
  Stopped: 58
 Images: 197
 Server Version: 26.1.3
 Storage Driver: btrfs
  Btrfs: 
 Logging Driver: json-file
 Cgroup Driver: systemd
 Cgroup Version: 2
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 8b3b7ca2e5ce38e8f31a34f35b2b68ceb8470d89
 runc version: v1.1.12-0-g51d5e94
 init version: de40ad0
 Security Options:
  seccomp
   Profile: builtin
  cgroupns
 Kernel Version: 6.8.10-300.fc40.x86_64
 Operating System: Fedora Linux 40 (Workstation Edition)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 31GiB
 Name: worky
 ID: 6SAN:GXXQ:AUGM:AOWJ:GLJW:YHHS:ICMJ:AZP7:VLNA:6YGY:NZKS:5RUK
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

What happened?

We are using the Docker Compose module with this snippet:

		compose, err := tc.NewDockerCompose("../../../docker-compose.yaml")
		require.NoError(t, err)
		
		err = compose.Up(ctx, tc.RunServices("db", "init1", "init2"), tc.Wait(false))
		require.NoError(t, err)

init1 depends on db being started, and init2 depends on init1 completing successfully.

While this runs, the Ryuk container is spun up and then terminates before Up has completed.

Some notes: when a run fails, we see this in the Ryuk logs:

2024/07/01 19:42:02 New client connected: 172.17.0.1:46954
2024/07/01 19:42:02 EOF
2024/07/01 19:42:02 Client disconnected: 172.17.0.1:46954

but when a run succeeds, we don't see these lines. The connection appears to come from the Ryuk wait strategy. I don't know for sure, but it seems possible for the wait strategy to connect and disconnect quickly enough that Ryuk never sees the connection, and that is when our tests work.

Ideally Ryuk would not shut down while Up is running. I know we can raise the reconnection timeout, but it would be nice for this to work out of the box without changing a setting.

Relevant log output

2024/07/01 15:42:02 🐳 Creating container for image testcontainers/ryuk:0.7.0
2024/07/01 15:42:02 ✅ Container created: 6e7cea31cd0a
2024/07/01 15:42:02 🐳 Starting container: 6e7cea31cd0a
2024/07/01 15:42:02 ✅ Container started: 6e7cea31cd0a
2024/07/01 15:42:02 🚧 Waiting for container id 6e7cea31cd0a image: testcontainers/ryuk:0.7.0. Waiting for: &{Port:8080/tcp timeout:<nil> PollInterval:100ms}
2024/07/01 19:42:02 Pinging Docker...
2024/07/01 19:42:02 Docker daemon is available!
2024/07/01 19:42:02 Starting on port 8080...
2024/07/01 19:42:02 Started!
2024/07/01 19:42:02 New client connected: 172.17.0.1:46954
2024/07/01 19:42:02 EOF
2024/07/01 19:42:02 Client disconnected: 172.17.0.1:46954
2024/07/01 15:42:02 🔔 Container is ready: 6e7cea31cd0a
2024/07/01 19:42:12 Timed out waiting for re-connection
2024/07/01 19:42:12 Removed 0 container(s), 0 network(s), 0 volume(s) 0 image(s)


### Additional information

_No response_
@maxmzkr maxmzkr added the bug An issue with the library label Jul 1, 2024
Contributor

bearrito commented Jul 2, 2024

@maxmzkr Can you include your compose file?

Author

maxmzkr commented Jul 2, 2024

yup, here's a quick example

services:
  job2:
    image: ubuntu
    command: echo "completed"
    depends_on:
      job1:
        condition: service_completed_successfully
  job1:
    image: ubuntu
    command: sleep 20
    depends_on:
      - server
  server:
    image: ubuntu
    command: echo started && sleep infinity

package main

import (
	"context"
	"testing"

	"github.com/stretchr/testify/require"
	tc "github.com/testcontainers/testcontainers-go/modules/compose"
)

func Test(t *testing.T) {
	ctx := context.Background()
	compose, err := tc.NewDockerCompose("docker-compose.yaml")
	require.NoError(t, err)

	t.Cleanup(func() {
		require.NoError(t, compose.Down(context.Background(), tc.RemoveOrphans(true), tc.RemoveImagesLocal))
	})

	ctx, cancel := context.WithCancel(ctx)
	t.Cleanup(cancel)

	require.NoError(t, compose.Up(ctx, tc.RunServices("server", "job1", "job2"), tc.Wait(false)))
}

go test -v ./
=== RUN   Test
2024/07/01 21:38:17 github.com/testcontainers/testcontainers-go - Connected to docker:
  Server Version: 24.0.5
  API Version: 1.43
  Operating System: Ubuntu 24.04 LTS
  Total Memory: 15708 MB
  Resolved Docker Host: unix:///var/run/docker.sock
  Resolved Docker Socket Path: /var/run/docker.sock
  Test SessionID: bf31dc8853fffd65a008e2f80fc8116ab9c67be3cba58f980f0980066cb15ce9
  Test ProcessID: 4a4462c7-f208-4f58-95c3-32ea957f4b46
2024/07/01 21:38:17 🐳 Creating container for image testcontainers/ryuk:0.7.0
2024/07/01 21:38:17 ✅ Container created: 6e9a6a49aeda
2024/07/01 21:38:17 🐳 Starting container: 6e9a6a49aeda
2024/07/01 21:38:17 ✅ Container started: 6e9a6a49aeda
2024/07/01 21:38:17 🚧 Waiting for container id 6e9a6a49aeda image: testcontainers/ryuk:0.7.0. Waiting for: &{Port:8080/tcp timeout:<nil> PollInterval:100ms}
2024/07/01 21:38:17 🔔 Container is ready: 6e9a6a49aeda
 Network 1cb60bea-6c13-4541-afaf-5fb3b93c567c_default  Creating
 Network 1cb60bea-6c13-4541-afaf-5fb3b93c567c_default  Created
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-server-1  Creating
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-server-1  Created
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Creating
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Created
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job2-1  Creating
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job2-1  Created
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-server-1  Starting
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-server-1  Started
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Starting
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Started
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Waiting
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Exited
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job2-1  Starting
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job2-1  Started
    main_test.go:23:
                Error Trace:    /home/max/testcontainers-example/main_test.go:23
                Error:          Received unexpected error:
                                failed to connect to reaper: dial tcp 127.0.0.1:32770: connect: connection refused: Connecting to Ryuk on localhost:32770 failed
                Test:           Test
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job2-1  Stopping
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job2-1  Stopped
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job2-1  Removing
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job2-1  Removed
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Stopping
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Stopped
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Removing
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-job1-1  Removed
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-server-1  Stopping
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-server-1  Stopped
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-server-1  Removing
 Container 1cb60bea-6c13-4541-afaf-5fb3b93c567c-server-1  Removed
 Network 1cb60bea-6c13-4541-afaf-5fb3b93c567c_default  Removing
 Network 1cb60bea-6c13-4541-afaf-5fb3b93c567c_default  Removed
--- FAIL: Test (22.18s)
FAIL
FAIL    github.com/maxmzkr/testcontainers-example       22.234s
FAIL

@mdelapenya
Collaborator

I think this issue is related to #2563 (comment)

I'd check whether increasing the Ryuk timeouts with compose fixes it: since we delegate pulling the images to compose, Ryuk may need more time, depending on the number and type of services in your compose file.

@mdelapenya
Collaborator

In fact, this issue seems to be a duplicate of #2563.

I'm closing this in favor of the other one 🙏

Please reopen it if you think it is different, thank you!

@mdelapenya mdelapenya closed this as not planned Jul 2, 2024
Author

maxmzkr commented Jul 2, 2024

I think it's possible they're related. I see an IPv6 address in that one, and I worry it has something to do with odd IPv6 handling. Some Mac users on our project have needed to manually specify an IPv4 address for Ryuk. I'll treat the two issues as the same until we have a better reason to believe otherwise.
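
To illustrate the kind of mismatch I have in mind (not a diagnosis of this issue): a port published only on the IPv4 loopback is refused when dialed over the IPv6 loopback, even though both are "localhost". A rough sketch, with `listenIPv4Only` and `canDial` as made-up helper names:

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// listenIPv4Only opens a listener bound to 127.0.0.1 on a random port and
// returns the listener together with that port.
func listenIPv4Only() (net.Listener, string, error) {
	ln, err := net.Listen("tcp4", "127.0.0.1:0")
	if err != nil {
		return nil, "", err
	}
	_, port, err := net.SplitHostPort(ln.Addr().String())
	if err != nil {
		ln.Close()
		return nil, "", err
	}
	return ln, port, nil
}

// canDial reports whether a TCP connection to host:port succeeds.
func canDial(network, host, port string) bool {
	conn, err := net.DialTimeout(network, net.JoinHostPort(host, port), time.Second)
	if err != nil {
		return false
	}
	conn.Close()
	return true
}

func main() {
	ln, port, err := listenIPv4Only()
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	// What "localhost" resolves to depends on the machine's hosts file.
	addrs, _ := net.LookupHost("localhost")
	fmt.Println("localhost resolves to:", addrs)
	fmt.Println("127.0.0.1 reachable:", canDial("tcp4", "127.0.0.1", port))
	fmt.Println("[::1] reachable:", canDial("tcp6", "::1", port))
}
```

On a machine where "localhost" resolves to `::1` first, a client that does not fall back to IPv4 would see "connection refused" in a setup like this, which may be why pinning an IPv4 address helps some of our Mac users.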
