Flaky RGW CI test #326

Open
UtkarshBhatthere opened this issue Mar 7, 2024 · 1 comment

Issue report

What version of MicroCeph are you using?

Development versions from active PRs.

What are the steps to reproduce this issue?

The failure is intermittent, but I have seen many instances of it.

What happens (observed behaviour)?

shell: /usr/bin/bash -e {0}
+ lxc exec node-wrk0 -- sh -c '/mnt/actionutils.sh testrgw '
  cluster:
    id:     d6402f43-0b46-48ef-91a3-38cedd71aaa2
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum node-wrk0,node-wrk1,node-wrk2 (age 8s)
    mgr: node-wrk0(active, starting, since 11s), standbys: node-wrk1, node-wrk2
    osd: 3 osds: 3 up (since 2s), 3 in (since 76s)
    rgw: 1 daemon active (1 hosts, 1 zones)
 
  data:
    pools:   7 pools, 131 pgs
    objects: 195 objects, 454 KiB
    usage:   82 MiB used, 2.9 GiB / 3 GiB avail
    pgs:     1.527% pgs unknown
             3.053% pgs not active
             125 active+clean
             4   peering
             2   unknown
 
  progress:
    Global Recovery Event (0s)
      [............................] 
 
● snap.microceph.rgw.service - Service for snap application microceph.rgw
     Loaded: loaded (/etc/systemd/system/snap.microceph.rgw.service; enabled; vendor preset: enabled)
     Active: active (running) since Wed 2024-03-06 15:33:08 UTC; 32s ago
   Main PID: 6183 (radosgw)
      Tasks: 52 (limit: 19169)
     Memory: 31.9M
        CPU: 121ms
     CGroup: /system.slice/snap.microceph.rgw.service
             └─6183 radosgw -f --cluster ceph --name client.radosgw.gateway -c /var/snap/microceph/x1/conf/radosgw.conf

Mar 06 15:33:08 node-wrk0 systemd[1]: Started Service for snap application microceph.rgw.
Mar 06 15:33:29 node-wrk0 microceph.rgw[6183]: 2024-03-06T15:33:29.201+0000 7f91bac4a0c0 -1 asok(0x55ec0e1ee000) AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to bind the UNIX domain socket to '/var/snap/microceph/793/run/ceph-client.radosgw.gateway.6183.94472337561760.asok': (13) Permission denied

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    An unexpected error has occurred.
  Please try reproducing the error using
  the latest s3cmd code from the git master
  branch found at:
    https://github.com/s3tools/s3cmd
  and have a look at the known issues list:
    https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutions-(FAQ)
  If the error persists, please report the
  following lines (removing any private
  info as necessary) to:
   s3tools-bugs@lists.sourceforge.net


!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Invoked as: /usr/bin/s3cmd --host localhost --host-bucket=localhost/%(bucket) --access_key=fooAccessKey --secret_key=fooSecretKey --no-ssl mb s3://testbucket
Problem: <class 'ConnectionRefusedError: [Errno 111] Connection refused
S3cmd:   2.2.0
python:   3.10.12 (main, Nov 20 2023, 15:14:05) [GCC 11.4.0]
environment LANG=C.UTF-8

Traceback (most recent call last):
  File "/usr/bin/s3cmd", line 3209, in <module>
    rc = main()
  File "/usr/bin/s3cmd", line 3106, in main
    rc = cmd_func(args)
  File "/usr/bin/s3cmd", line 260, in cmd_bucket_create
    response = s3.bucket_create(uri.bucket(), cfg.bucket_location, cfg.extra_headers)
  File "/usr/lib/python3/dist-packages/S3/S3.py", line 430, in bucket_create
    response = self.send_request(request)
  File "/usr/lib/python3/dist-packages/S3/S3.py", line 1480, in send_request
    conn = ConnMan.get(self.get_hostname(resource['bucket']))
  File "/usr/lib/python3/dist-packages/S3/ConnMan.py", line 284, in get
    conn.c.connect()
  File "/usr/lib/python3.10/http/client.py", line 942, in connect
    self.sock = self._create_connection(
  File "/usr/lib/python3.10/socket.py", line 845, in create_connection
    raise err
  File "/usr/lib/python3.10/socket.py", line 833, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [Errno 111] Connection refused

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
    An unexpected error has occurred.
  Please try reproducing the error using
  the latest s3cmd code from the git master
  branch found at:
    https://github.com/s3tools/s3cmd
  and have a look at the known issues list:
    https://github.com/s3tools/s3cmd/wiki/Common-known-issues-and-their-solutions-(FAQ)
  If the error persists, please report the
  above lines (removing any private
  info as necessary) to:
   s3tools-bugs@lists.sourceforge.net
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

What were you expecting to happen?

The RGW client test to be successful.

sabaini commented Mar 7, 2024

Ack, thanks for reporting -- I've seen this a few times as well. Maybe we need to update the timeouts, or wait longer for RGW to come up?
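
For illustration, waiting longer for RGW could look roughly like the sketch below. This is not the actual actionutils.sh code: `wait_for` and `rgw_is_up` are hypothetical helpers, and the probe assumes RGW answers plain HTTP on localhost port 80, which is what the failing s3cmd invocation (`--host localhost --no-ssl`) implies. The attempt count and delay are placeholder values.

```shell
# Hypothetical helper, not taken from actionutils.sh: run a command
# repeatedly until it succeeds or the attempt budget is used up.
# Usage: wait_for <attempts> <delay-seconds> <command> [args...]
wait_for() {
    local attempts="$1" delay="$2" i=1
    shift 2
    while [ "$i" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    return 1
}

# Assumed probe: curl exits non-zero while the connection is refused and
# zero once RGW returns any HTTP response (no --fail, so 4xx still counts).
rgw_is_up() {
    curl --silent --output /dev/null --max-time 2 http://localhost/
}

# Possible use in the test, before the first s3cmd call (values are guesses):
#   wait_for 30 2 rgw_is_up || { echo "RGW did not come up"; exit 1; }
```

Polling for the listener, rather than relying on the systemd unit being "active (running)", matches the log above: the service had been running for 32s and s3cmd still got Connection refused.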
