
Issues with Ceph Ansible stable-8.0 on Ubuntu 22.04/24.04 and Impact on OpenStack Ansible (OSA) Integration #7496

Closed
tiagonix opened this issue Mar 12, 2024 · 13 comments · Fixed by #7523

Comments

@tiagonix

tiagonix commented Mar 12, 2024

Greetings to all,

As an avid supporter and long-term user of Ceph Ansible, I've had the privilege of deploying Ceph in numerous corporate environments over the years, alongside educating many students on its deployment and maintenance. Our organization extensively utilizes Ubuntu and its Ubuntu Cloud Archive, appreciating the seamless upgrade paths it offers. This allows for the deployment of an LTS distribution and incremental Ceph releases atop the same LTS until the next LTS release. Transitioning between LTS releases while maintaining the same Ceph version is a breeze, fully backed by Canonical's support, leveraging its Debian heritage. Here’s an overview for clarity:

We manage several Ceph Ansible pipelines, facilitating individual Ceph Deployments in a CI/CD-like pipeline, including Ubuntu and Ceph upgrades when necessary. However, we're currently facing challenges with the stable-8.0 branch of Ceph Ansible, particularly with its compatibility with Ubuntu 22.04 (with Ubuntu Cloud Archive Bobcat enabled for Ceph Reef) and Ubuntu 24.04 (default with Ceph Reef). It appears that stable-8.0 exhibits significant issues, impacting not only our operations but also the broader OpenStack Ansible (OSA) community. The OSA project is constrained to the stable-7.0 branch and Ceph Quincy, despite Reef being accessible in Ubuntu repositories.

References:

Ceph Pipelines Configuration

For enabling Ubuntu's UCA repositories, we opt for manual configuration over predefined variables such as ceph_repository: uca and its related settings. Instead, we use:

ceph_origin: distro
ceph_stable_release: "{{ ceph_release_codename }}"

We do not use:

ceph_repository: uca
ceph_stable_repo_uca: "http://ubuntu-cloud.archive.canonical.com/ubuntu"
ceph_stable_openstack_release_uca: yoga
ceph_stable_release_uca: "{{ ansible_facts['distribution_release'] }}-updates/{{ ceph_stable_openstack_release_uca }}"

This approach allows Ceph Ansible to utilize distro, automatically leveraging UCA when appropriate, simplifying the process without additional variables/logic.
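For illustration, a minimal sketch of this manual approach (the Bobcat pocket is an example; pick the UCA pocket matching your target Ceph release):

# On every node: enable the UCA pocket by hand (Bobcat ships Ceph Reef for 22.04)
add-apt-repository -y cloud-archive:bobcat
apt update

# group_vars: Ceph Ansible then installs whatever the distro repositories provide
ceph_origin: distro
ceph_stable_release: reef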

Deployment Scenarios

  1. Ubuntu 20.04 with Ceph Octopus
    • Ubuntu: 20.04
    • UCA: None
    • Ceph: Octopus
    • Ceph Ansible Branch: stable-5.0
    • Ansible Version: 2.9 (via apt install ansible)
    • Deployment Success

Configured Ceph Ansible Variables

ceph_origin: distro
ceph_stable_release: octopus
docker: false
containerized_deployment: false

Ceph Ansible Run

cd ~/ceph-ansible
cp site.yml.sample site.yml
ansible-playbook -i /etc/ceph-ansible-inventory/hosts.ini site.yml

Works!

  2. Ubuntu 20.04 with Ceph Pacific (via UCA Wallaby)
    • Ubuntu: 20.04
    • UCA: add-apt-repository cloud-archive:wallaby
    • Ceph: Pacific
    • Ceph Ansible Branch: stable-6.0
    • Ansible Version: 2.10 (via PPA add-apt-repository ppa:ansible/ansible-2.10)
    • Deployment Success

Configured Ceph Ansible Variables

ceph_origin: distro
ceph_stable_release: pacific
docker: false
containerized_deployment: false

Ceph Ansible Run

cd ~/ceph-ansible
ansible-galaxy install -r requirements.yml
cp site.yml.sample site.yml
ansible-playbook -i /etc/ceph-ansible-inventory/hosts.ini site.yml

Works!

  3. Ubuntu 20.04 with Ceph Quincy (via UCA Yoga)
    • Ubuntu: 20.04
    • UCA: add-apt-repository cloud-archive:yoga
    • Ceph: Quincy
    • Ceph Ansible Branch: stable-7.0
    • Ansible Version: 2.12 (via PPA add-apt-repository ppa:ansible/ansible)
    • Deployment Success

Configured Ceph Ansible Variables

ceph_origin: distro
ceph_stable_release: quincy
docker: false
containerized_deployment: false

Ceph Ansible Run

cd ~/ceph-ansible
ansible-galaxy install -r requirements.yml
cp site.yml.sample site.yml
ansible-playbook -i /etc/ceph-ansible-inventory/hosts.ini site.yml

Works!

  4. Ubuntu 22.04 with Ceph Quincy
    • Ubuntu: 22.04
    • UCA: None
    • Ceph Ansible Branch: stable-7.0
    • Ansible Version: 2.12 (via apt install ansible-core)
    • Deployment Success

Configured Ceph Ansible Variables

ceph_origin: distro
ceph_stable_release: quincy
docker: false
containerized_deployment: false

Ceph Ansible Run

cd ~/ceph-ansible
pip install resolvelib==0.5.4 # Ansible resolvelib issue: https://bugs.launchpad.net/ubuntu/+source/ansible/+bug/1995249
ansible-galaxy install -r requirements.yml
cp site.yml.sample site.yml
ansible-playbook -i /etc/ceph-ansible-inventory/hosts.ini site.yml

Works!

IMPORTANT NOTES

Details about a working Ceph Ansible deployment.

I can deploy Ceph with Ceph Ansible without issues up to the stable-7.0 branch. Here's a good ceph.conf example from such a deployment:

root@ceph-mon-1:~# cat /etc/ceph/ceph.conf
[global]
mon initial members = ceph-mon-1,ceph-mon-2,ceph-mon-3
osd pool default crush rule = -1
fsid = a46d52d3-5a8b-4609-9a1f-e22e06f710f9
mon host = [v2:10.192.0.11:3300,v1:10.192.0.11:6789],[v2:10.192.0.12:3300,v1:10.192.0.12:6789],[v2:10.192.0.13:3300,v1:10.192.0.13:6789]
public network = 10.192.0.0/24
cluster network = 10.192.1.0/24

[client.rgw.ceph-mon-1.rgw0]
host = ceph-mon-1
keyring = /var/lib/ceph/radosgw/ceph-rgw.ceph-mon-1.rgw0/keyring
log file = /var/log/ceph/ceph-rgw-ceph-mon-1.rgw0.log
rgw frontends = beast endpoint=10.192.0.11:8080
rgw thread pool size = 512

So far, so good, ceph status always works.

  5. Ubuntu 22.04 with Ceph Reef (via UCA Bobcat) - Currently Failing
    • Ubuntu: 22.04
    • UCA: add-apt-repository cloud-archive:bobcat
    • Ceph Ansible Branch: stable-8.0
    • Ansible Version: 2.16 (via apt install ansible-core)
    • Deployment Failure: errors during monitor quorum formation (logs below).
    • Workaround: use stable-7.0 with ceph_stable_release: reef for deployment on Ubuntu 22.04 + Bobcat (see the sketch below).
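A sketch of that workaround, for reference (the commands mirror the runs elsewhere in this report; set the release in your group_vars):

git clone -b stable-7.0 https://github.com/ceph/ceph-ansible.git
cd ceph-ansible
# group_vars: ceph_stable_release: reef
cp site.yml.sample site.yml
ansible-playbook -i /etc/ceph-ansible-inventory/hosts.ini site.yml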

Configured Ceph Ansible Variables

ceph_origin: distro
ceph_stable_release: reef
docker: false
containerized_deployment: false

Ceph Ansible Run

cd ~/ceph-ansible
pip install resolvelib==0.5.4 # Ansible resolvelib issue: https://bugs.launchpad.net/ubuntu/+source/ansible/+bug/1995249
ansible-galaxy install -r requirements.yml
cp site.yml.sample site.yml
ansible-playbook -i /etc/ceph-ansible-inventory/hosts.ini site.yml

FAILED!!! Error:

 TASK [ceph-mon : Waiting for the monitor(s) to form the quorum...] ***
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (10 retries left).
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (9 retries left).
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (8 retries left).
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (7 retries left).
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (6 retries left).
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (5 retries left).
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (4 retries left).
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (3 retries left).
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (2 retries left).
 FAILED - RETRYING: [ceph-mon-1]: Waiting for the monitor(s) to form the quorum... (1 retries left).
 fatal: [ceph-mon-1]: FAILED! => changed=false
   attempts: 10
   cmd:
   - ceph
   - --cluster
   - ceph
   - daemon
   - mon.ceph-mon-1
   - mon_status
   - --format
   - json
   delta: '0:00:00.122353'
   end: '2024-03-07 19:53:15.190161'
   msg: non-zero return code
   rc: 22
   start: '2024-03-07 19:53:15.067808'
   stderr: 'admin_socket: exception getting command descriptions: [Errno 2] No such file or directory'
   stderr_lines: <omitted>
   stdout: ''
   stdout_lines: <omitted>
 NO MORE HOSTS LEFT *************************************************************

When I went to Ceph Mon to check its configuration:

root@ceph-mon-1:~# cat /etc/ceph/ceph.conf
[global]
fsid = a46d52d3-5a8b-4609-9a1f-e22e06f710f9
mon host = [v2:10.192.0.11:3300,v1:10.192.0.11:6789],[v2:10.192.0.12:3300,v1:10.192.0.12:6789],[v2:10.192.0.13:3300,v1:10.192.0.13:6789]

Many lines are missing from ceph.conf!
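The stderr above ("[Errno 2] No such file or directory" from the admin socket) means the monitor's admin socket is absent, i.e. the ceph-mon daemon never started. A few standard checks on the mon node to confirm that (paths follow Ceph's defaults):

ls /var/run/ceph/ceph-mon.ceph-mon-1.asok            # the admin socket the playbook queries
systemctl status ceph-mon@ceph-mon-1                 # is the mon service running?
journalctl -u ceph-mon@ceph-mon-1 --no-pager -n 50   # why it failed to start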

  6. Ubuntu 24.04 with Ceph Reef - Currently Failing
    • Ubuntu: 24.04
    • UCA: None
    • Ceph Ansible Branch: stable-8.0 not working for Reef deployment on Ubuntu 24.04.
    • Potential Workaround: stable-7.0 might work (untested), though not ideal for long-term use.

OpenStack Ansible and Ceph Ansible Integration

The synergy between OpenStack Ansible (OSA) and Ceph Ansible is a cornerstone for efficient infrastructure deployment, enabling seamless OpenStack and Ceph installations with a unified Ansible Inventory. This integration simplifies processes, allowing users to deploy Ceph within OSA environments through streamlined commands.

For a traditional Ceph deployment, the process involves:

cd ~/ceph-ansible
cp site.yml.sample site.yml
ansible-playbook -i /etc/ceph-ansible-inventory/hosts.ini site.yml

Conversely, deploying Ceph via OSA is more integrated:

cd /opt/openstack-ansible/playbooks/
openstack-ansible ceph-install.yml

This approach not only simplifies the deployment process but also ensures a more cohesive infrastructure setup with OSA.

However, challenges arise with the stable-8.0 branch of Ceph Ansible, particularly its incompatibility with Ubuntu 22.04 (Bobcat/Reef) and Ubuntu 24.04 (Reef by default), and its adverse impact on OSA integration. Notably, the removal of ceph_conf_overrides disrupts this integration, a change widely discussed within the community for its negative implications.

OSA's preference for LXC containers and its avoidance of Docker/Prometheus further complicate the situation. The push towards cephadm, which mandates a container runtime, conflicts with OSA's (and many users') deployment strategies, particularly within LXC or LXD environments.

A significant point of contention is the amendment made in Ceph Ansible stable-8.0, documented here:

14b4abf#diff-a57eaa1f236c68f6acc319d4f9710af8b513741d044e8dd4ddf544c1c7d09cefL144

This change, among others (9c467e4), has sparked discussions about the need for stable-8.0 to better align with user needs and the operational realities of OSA deployments, urging a reconsideration or adaptation of these changes to facilitate smoother integration and functionality.

Conclusion and Call to Action

The modifications in Ceph Ansible stable-8.0 do not fully account for its longstanding use cases, particularly concerning its integration with the OSA community. These changes, including the deprecation of ceph_conf_overrides and a forced shift towards containerized deployments, disrupt established workflows and compatibility.

I strongly advocate for the Ceph community to consider reverting disruptive changes in the stable-8.0 branch, restoring feature parity with stable-7.0, and providing clear guidance and support for transitioning OpenStack Ansible and other dependents to newer versions. It’s crucial to maintain Ceph Ansible’s utility and accessibility for both individuals and organizations.

@guits
Collaborator

guits commented Mar 19, 2024

hi @tiagonix

Thanks for your feedback,

Conclusion and Call to Action

The modifications in Ceph Ansible stable-8.0 do not fully account for its longstanding use cases, particularly concerning its integration with the OSA community. These changes, including the deprecation of ceph_conf_overrides and a forced shift towards containerized deployments, disrupt established workflows and compatibility.

I strongly advocate for the Ceph community to consider reverting disruptive changes in the stable-8.0 branch, restoring feature parity with stable-7.0, and providing clear guidance and support for transitioning OpenStack Ansible and other dependents to newer versions. It’s crucial to maintain Ceph Ansible’s utility and accessibility for both individuals and organizations.

For context, the fate of ceph-ansible has been uncertain for several releases. It was supposed to be deprecated and left unmaintained after Pacific; this was postponed until after Quincy. In the end, we (@clwluvw and I) decided to keep maintaining it (and to make it very clear, we do so only in our free time), so stable-8.0 has finally received some engineering effort. Although it was announced that stable-8.0 was most likely going to see some breaking changes, I admit that branch has suffered from poor management so far. On the other hand, we took this liberty because activity around ceph-ansible was very quiet. I don't want to come across as rude, but the reality is that people usually show up more often when things are broken than to contribute - no offense intended - I just want to clarify that we were unaware so many people were still relying on ceph-ansible. I personally thought people had massively migrated to cephadm, as it is the official installer for Ceph and has been strongly encouraged for multiple releases now. I hope this gives a bit more context regarding the current situation with stable-8.0.

Regarding ceph_conf_overrides, it has finally been reintroduced in stable-8.0, which has recently received some new commits.
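For anyone who hasn't used it, ceph_conf_overrides is the variable for injecting arbitrary ceph.conf settings, keyed by section; a minimal sketch (section names and values are illustrative):

ceph_conf_overrides:
  global:
    osd_pool_default_size: 3
  mon:
    mon_allow_pool_delete: true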

As for the following statement:

a forced shift towards containerized deployments

I'm not sure I totally understand what you are trying to bring out here, could you elaborate a bit more on this?

@tiagonix
Author

Hi @guits,

Firstly, I want to extend my gratitude for your detailed response. It's greatly appreciated.

I've taken note that ceph_conf_overrides has been reintroduced, which is indeed welcome news.

I'm eager to contribute to the testing of the stable-8.0 branch, particularly to ensure its compatibility with Ubuntu 22.04 (Bobcat) and the upcoming Ubuntu 24.04 LTS cycle, including future UCA repositories. I have an OpenStack Heat template that allows for the rapid deployment of a Ceph cluster on OpenStack using VMs, which is quite beneficial for testing:

openstack stack create -t ceph-basic-stack-ubuntu-5.yaml ceph-lab-5

I'm more than willing to share this template if it's of interest.

To maintain focus and objectivity, I propose we prioritize making the stable-8.0 branch functional on Ubuntu 22.04 with Bobcat (Ceph Reef), particularly addressing the Ansible task failure in deployment example "5: Ubuntu 22.04 with Ceph Reef (via UCA Bobcat) - Currently Failing":

TASK [ceph-mon : Waiting for the monitor(s) to form the quorum...].
...
fatal: [ceph-mon-1]: FAILED! => changed=false

Overcoming this issue could significantly advance integration efforts with OpenStack Ansible and potentially make it work on Ubuntu 24.04 as well.

Regarding the OpenStack components within Ceph Ansible, discussions with the OSA team suggest a consensus for moving the OpenStack-specific functionalities (e.g., pool creation) to OSA playbooks, which seems a reasonable direction.
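For context, that functionality is driven in ceph-ansible by variables along these lines (a sketch based on the group_vars samples; the pool definition is illustrative):

openstack_config: true
openstack_pools:
  - name: volumes
    rule_name: replicated_rule
    application: rbd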

On the topic of containerized deployments, I've delved into the discussion in the Ceph list thread https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/TTTYKRVWJOR7LOQ3UCQAZQR32R7YADVY/ and found it quite enlightening. Our organization is heavily reliant on ceph-ansible for deployments and training. We do not support Docker or Podman in our infrastructure, which is firmly based on Ubuntu and LXD. Hence, introducing Docker/Podman is not viable for us, underscoring our preference for Debian-based deployments for Ceph and other services.

Again, thank you for your engagement and openness to community feedback. I look forward to contributing (mostly with testing) to the stable-8.0 branch's success and facilitating its broader adoption.

Cheers!

@guits
Collaborator

guits commented Mar 19, 2024

@tiagonix

I tested the cutting edge of stable-8.0 on Ubuntu 22.04; the task you mention passed in my environment.
If it's still failing for you, I'd need more details to debug this (Ansible inventory, full log, group_vars/host_vars, etc.).

@tiagonix
Author

tiagonix commented Mar 19, 2024

Thank you for taking the time to test it! Let me show you exactly what I'm doing.

BTW, have you also enabled Ubuntu's UCA Bobcat in your test before running Ceph Ansible (stable-8.0)?

To make it clear, I have to share two OpenStack Heat templates I'm using, one related to "Deployment 2" (Ubuntu 20.04 + Ceph Pacific), which works (stable-6.0), and another related to "Deployment 5," which fails (stable-8.0).

Here are the differences between "Deployment 2" and "Deployment 5":

$ diff -Nru ceph-basic-stack-ubuntu-2.yaml ceph-basic-stack-ubuntu-5.yaml
--- ceph-basic-stack-ubuntu-2.yaml	2024-03-19 16:57:04.925881995 +0100
+++ ceph-basic-stack-ubuntu-5.yaml	2024-03-19 15:18:49.480372431 +0100
@@ -66,8 +66,8 @@
   os_image_1:
     type: string
     label: 'Ubuntu Server - 64-bit'
-    description: 'Ubuntu - Focal Fossa - LTS'
-    default: 'ubuntu-20.04.1-20201201'
+    description: 'Ubuntu - Jammy Jellyfish - LTS'
+    default: 'ubuntu-22.04-20230110'
 
   # Flavors for Ceph
   flavor_ceph_generic:
@@ -529,7 +529,8 @@
         packages:
         - zram-config
         - net-tools
-        - ansible
+        - ansible-core
+        - python3-pip
         - python3-six
         - python3-netaddr
 
@@ -664,10 +665,11 @@
           owner: root
           content: |
             ---
+            yes_i_know: True
             cluster: ceph
             ntp_daemon_type: chronyd
             ceph_origin: distro
-            ceph_stable_release: pacific
+            ceph_stable_release: reef
             generate_fsid: false
             docker: false
             containerized_deployment: false
@@ -919,6 +921,8 @@
           owner: root
           content: |
             #!/bin/bash
+            # Python resolvelib issue: https://bugs.launchpad.net/ubuntu/+source/ansible/+bug/1995249
+            pip install resolvelib==0.5.4
             pushd /home/manager/ceph-ansible
             ansible-galaxy install -r requirements.yml
             cp site.yml.sample site.yml
@@ -950,7 +954,7 @@
 
             ssh-keygen -b 2048 -t rsa -f .ssh/id_rsa -q -C 'manager@cephao-1-ceph-ansible-1' -N ''
 
-            git clone -b stable-6.0 https://github.com/ceph/ceph-ansible.git
+            git clone -b stable-8.0 https://github.com/ceph/ceph-ansible.git
 
             mkdir /home/manager/ansible
 
@@ -965,8 +969,8 @@
         runcmd:
         - [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
         - [ sh, -c, "/usr/local/bin/netcat-tarpipe-send-ssh-pubkey.sh" ]
-        - [ sh, -c, "add-apt-repository ppa:ansible/ansible-2.10"]
-        - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+        - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
+        - [ sh, -c, "add-apt-repository -y ppa:ansible/ansible"]
         - [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
         - [ sh, -c, "snap refresh" ]
         - [ sh, -c, "reboot" ]
@@ -1073,7 +1077,7 @@
         runcmd:
         - [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
         - [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
-        - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+        - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
         - [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
         - [ sh, -c, "snap refresh" ]
         - [ sh, -c, "reboot" ]
@@ -1156,7 +1160,7 @@
         - swapon /swap.img
         - echo '/swap.img none swap sw 0 0' >> /etc/fstab
         - [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
-        - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+        - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
         - [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
         - [ sh, -c, "snap refresh" ]
         - [ sh, -c, "reboot" ]
@@ -1232,7 +1236,7 @@
         - swapon /swap.img
         - echo '/swap.img none swap sw 0 0' >> /etc/fstab
         - [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
-        - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+        - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
         - [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
         - [ sh, -c, "snap refresh" ]
         - [ sh, -c, "reboot" ]
@@ -1307,7 +1311,7 @@
         - swapon /swap.img
         - echo '/swap.img none swap sw 0 0' >> /etc/fstab
         - [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
-        - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+        - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
         - [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
         - [ sh, -c, "snap refresh" ]
         - [ sh, -c, "reboot" ]
@@ -1480,7 +1484,7 @@
         runcmd:
         - [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
         - [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
-        - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+        - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
         - [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
         - [ sh, -c, "snap refresh" ]
         - [ sh, -c, "reboot" ]
@@ -1642,7 +1646,7 @@
         runcmd:
         - [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
         - [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
-        - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+        - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
         - [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
         - [ sh, -c, "snap refresh" ]
         - [ sh, -c, "reboot" ]
@@ -1804,7 +1808,7 @@
         runcmd:
         - [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
         - [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
-        - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+        - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
         - [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
         - [ sh, -c, "snap refresh" ]
         - [ sh, -c, "reboot" ]
@@ -1902,7 +1906,7 @@
         runcmd:
         - [ sh, -c, "/usr/local/sbin/bootstrap-instance.sh" ]
         - [ sh, -c, "/usr/local/sbin/netcat-tarpipe-ssh-pubkey.sh" ]
-        - [ sh, -c, "add-apt-repository -y cloud-archive:wallaby"]
+        - [ sh, -c, "add-apt-repository -y cloud-archive:bobcat"]
         - [ sh, -c, "apt update ; apt upgrade -y ; apt autoremove -y" ]
         - [ sh, -c, "snap refresh" ]
         - [ sh, -c, "reboot" ]

So, "Deployment 2" works as expected, which is based on ubuntu-20.04 Cloud image with UCA Wallaby (with Ceph Pacific), but in "Deployment 5" with Ubuntu 22.04 and its UCA cloud-archive:bobcat containing Ceph Reef, Ceph Ansible stable-8.0 fails. I also enabled Ansible's PPA ppa:ansible/ansible so stable-8.0 sees a supported Ansible version (2.16) on Ubuntu 22.04.

Those two OpenStack Heat templates contain everything I'm using to deploy Ceph on top of vanilla Ubuntu 20.04 and 22.04 Cloud Images, including a complete Ansible Inventory (around line 555) and the Ceph Ansible variables (around line 665). The Bash script /usr/local/bin/ceph-ansible-deploy.sh inside the Heat template also shows how I'm running Ceph Ansible via /etc/rc.local (right after the instances are upgraded and rebooted).

I posted the Ansible error log in my first message! Please let me know if there are other logs worth collecting; I can run those Heat templates over and over, no problem.

@guits
Collaborator

guits commented Mar 19, 2024

@tiagonix by any chance, are you available on ceph-storage.slack.com ?

@tiagonix
Author

tiagonix commented Mar 19, 2024

Let me run our Deployment 2! Check this out, it's kinda cool:

It's created from the ceph-basic-stack-ubuntu-2.yaml template I shared in the previous message.

$ openstack stack create -t ceph-basic-stack-ubuntu-2.yaml cephao-2
+---------------------+---------------------------------------------------------------------+
| Field               | Value                                                               |
+---------------------+---------------------------------------------------------------------+
| id                  | 3437c0a1-91ca-4df6-ba9d-6cb6cdc6b847                                |
| stack_name          | cephao-2                                                            |
| description         |                                                                     |
|                     | HOT template to create standard setup for a Ceph Cluster, with      |
|                     | Security Groups and Floating IPs.                                   |
|                     |                                                                     |
|                     | Total of 8 Instances for a basic Ceph Cluster.                      |
|                     | * 1 Ubuntu as Ceph Ansible                                          |
|                     | * 3 Ubuntu as Ceph Control Plane, for MONs, MGRs, Dashboard and etc |
|                     | * 3 Ubuntu as Ceph OSDs                                             |
|                     | * 1 Ubuntu as Ceph Client (RBD)                                     |
|                     |                                                                     |
|                     | Network Diagram - ASCII                                             |
|                     |                                                                     |
|                     |  Control Network (with Internet access via router_i_1)              |
|                     |                                                                     |
|                     |   ---------|ctrl_subnet|---------------------------------           |
|                     |   |     |     |     |     |    |                        |           |
|                     |   |     |     |     |     |    |                        |           |
|                     |  ---   ---   ---   ---   ---  ---                       |           |
|                     |  | |   | |   | |   | |   | |  | |                       |           |
|                     |  | |   | |   | |   | |   | |  | |   ----|CEPH ANSIBLE|--|           |
|                     |  |-|---|-|ceph_pub_subnet|-|--|-|---|                   |           |
|                     |  | |   | |   | |   | |   | |  | |   |---|CEPH GRAFANA|--|           |
|                     |  |C|   |C|   |C|   |C|   |C|  |C|   |   |& PROMETHEUS|  |           |
|                     |  |E|   |E|   |E|   |E|   |E|  |E|   |                   |           |
|                     |  |P|   |P|   |P|   |P|   |P|  |P|   ----|CEPH CLIENT|----           |
|                     |  |H|   |H|   |H|   |H|   |H|  |H|                                   |
|                     |  | |   | |   | |   | |   | |  | |                                   |
|                     |  |C|   |C|   |C|   |O|   |O|  |O|                                   |
|                     |  |P|   |P|   |P|   |S|   |S|  |S|                                   |
|                     |  | |   | |   | |   |D|   |D|  |D|                                   |
|                     |  ---   ---   ---   ---   ---  ---                                   |
|                     |                     |     |    |                                    |
|                     |                   |ceph_pri_subnet|                                 |
|                     |                                                                     |
| creation_time       | 2024-03-19T15:51:55Z                                                |
| updated_time        | None                                                                |
| stack_status        | CREATE_IN_PROGRESS                                                  |
| stack_status_reason | Stack CREATE started                                                |
+---------------------+---------------------------------------------------------------------+

After about 10 minutes, Ceph Ansible did its thing! Look:

$ openstack server list
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
| ID                                   | Name                    | Status | Networks                                                                                                             | Image                   | Flavor    |
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
| 35abd0bc-51fd-48a3-83da-5647186cb065 | cephao-2-ceph-osd-2     | ACTIVE | cephao-2-ceph-cluster=10.192.1.22; cephao-2-ceph-public=10.192.0.22; cephao-2-control=10.232.243.112, 192.168.192.22 | ubuntu-20.04.1-20201201 | m1.medium |
| 44c87e44-f354-41f8-9f82-e8df6904c22d | cephao-2-ceph-cp-2      | ACTIVE | cephao-2-ceph-public=10.192.0.12; cephao-2-control=10.232.242.241, 192.168.192.12                                    | ubuntu-20.04.1-20201201 | m1.small  |
| 59004ee9-100a-40d7-accb-f014ad6e89cd | cephao-2-ceph-osd-3     | ACTIVE | cephao-2-ceph-cluster=10.192.1.23; cephao-2-ceph-public=10.192.0.23; cephao-2-control=10.232.242.202, 192.168.192.23 | ubuntu-20.04.1-20201201 | m1.medium |
| 78941aac-54ba-4649-b03e-2aad8f478727 | cephao-2-ceph-cp-1      | ACTIVE | cephao-2-ceph-public=10.192.0.11; cephao-2-control=10.232.244.142, 192.168.192.11                                    | ubuntu-20.04.1-20201201 | m1.small  |
| 28e8611c-f26e-45f5-853b-7b1df61d52e9 | cephao-2-ceph-client-1  | ACTIVE | cephao-2-ceph-public=10.192.0.5; cephao-2-control=10.232.241.31, 192.168.192.5                                       | ubuntu-20.04.1-20201201 | m1.small  |
| a2face4f-4e23-4886-85f7-797dda2f748f | cephao-2-ceph-cp-3      | ACTIVE | cephao-2-ceph-public=10.192.0.13; cephao-2-control=10.232.244.150, 192.168.192.13                                    | ubuntu-20.04.1-20201201 | m1.small  |
| cb7ecb6e-2155-4a9d-821a-8e479b2f5a60 | cephao-2-ceph-osd-1     | ACTIVE | cephao-2-ceph-cluster=10.192.1.21; cephao-2-ceph-public=10.192.0.21; cephao-2-control=10.232.242.73, 192.168.192.21  | ubuntu-20.04.1-20201201 | m1.medium |
| 3ad74d3f-6c37-4e1a-8915-79e7583f207d | cephao-2-ceph-dash-1    | ACTIVE | cephao-2-ceph-public=10.192.0.10; cephao-2-control=10.232.244.92, 192.168.192.10                                     | ubuntu-20.04.1-20201201 | m1.small  |
| de717ee1-677e-4ff8-9783-d2166b275436 | cephao-2-ceph-ansible-1 | ACTIVE | cephao-2-ceph-public=10.192.0.4; cephao-2-control=10.232.242.17, 192.168.192.4                                       | ubuntu-20.04.1-20201201 | m1.small  |
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
$ ssh manager@10.232.244.142
manager@cephao-2-ceph-cp-1:~$ sudo ceph status
  cluster:
    id:     a46d52d3-5a8b-4609-9a1f-e22e06f710f9
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
            Degraded data redundancy: 7/688 objects degraded (1.017%), 7 pgs degraded, 256 pgs undersized
 
  services:
    mon: 3 daemons, quorum cephao-2-ceph-cp-1,cephao-2-ceph-cp-2,cephao-2-ceph-cp-3 (age 9m)
    mgr: cephao-2-ceph-cp-1(active, since 111s), standbys: cephao-2-ceph-cp-2, cephao-2-ceph-cp-3
    mds: 1/1 daemons up, 2 standby
    osd: 18 osds: 18 up (since 6m), 18 in (since 6m)
    rgw: 3 daemons active (3 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   24 pools, 737 pgs
    objects: 227 objects, 9.9 KiB
    usage:   5.2 GiB used, 54 TiB / 54 TiB avail
    pgs:     7/688 objects degraded (1.017%)
             481 active+clean
             249 active+undersized
             7   active+undersized+degraded
manager@cephao-2-ceph-cp-1:~$ sudo ceph osd df tree
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP  META     AVAIL    %USE  VAR   PGS  STATUS  TYPE NAME                   
-1         54.00000         -   54 TiB  5.2 GiB   40 MiB   0 B  5.1 GiB   54 TiB  0.01  1.00    -          root default                
-5         18.00000         -   18 TiB  1.7 GiB   13 MiB   0 B  1.7 GiB   18 TiB  0.01  1.00    -              host cephao-2-ceph-osd-1
 2    ssd   3.00000   1.00000  3.0 TiB  292 MiB  2.2 MiB   0 B  290 MiB  3.0 TiB  0.01  0.99  116      up          osd.2               
 3    ssd   3.00000   1.00000  3.0 TiB  296 MiB  2.2 MiB   0 B  294 MiB  3.0 TiB  0.01  1.01  131      up          osd.3               
 6    ssd   3.00000   1.00000  3.0 TiB  296 MiB  2.3 MiB   0 B  294 MiB  3.0 TiB  0.01  1.01  127      up          osd.6               
10    ssd   3.00000   1.00000  3.0 TiB  296 MiB  2.2 MiB   0 B  294 MiB  3.0 TiB  0.01  1.01  120      up          osd.10              
13    ssd   3.00000   1.00000  3.0 TiB  292 MiB  2.2 MiB   0 B  290 MiB  3.0 TiB  0.01  0.99  135      up          osd.13              
15    ssd   3.00000   1.00000  3.0 TiB  292 MiB  2.2 MiB   0 B  290 MiB  3.0 TiB  0.01  0.99  108      up          osd.15              
-3         18.00000         -   18 TiB  1.7 GiB   13 MiB   0 B  1.7 GiB   18 TiB  0.01  1.00    -              host cephao-2-ceph-osd-2
 0    ssd   3.00000   1.00000  3.0 TiB  296 MiB  2.2 MiB   0 B  294 MiB  3.0 TiB  0.01  1.01  123      up          osd.0               
 4    ssd   3.00000   1.00000  3.0 TiB  292 MiB  2.2 MiB   0 B  290 MiB  3.0 TiB  0.01  0.99  122      up          osd.4               
 7    ssd   3.00000   1.00000  3.0 TiB  292 MiB  2.2 MiB   0 B  290 MiB  3.0 TiB  0.01  0.99   96      up          osd.7               
 9    ssd   3.00000   1.00000  3.0 TiB  296 MiB  2.3 MiB   0 B  294 MiB  3.0 TiB  0.01  1.01  127      up          osd.9               
12    ssd   3.00000   1.00000  3.0 TiB  296 MiB  2.2 MiB   0 B  294 MiB  3.0 TiB  0.01  1.01  132      up          osd.12              
16    ssd   3.00000   1.00000  3.0 TiB  296 MiB  2.3 MiB   0 B  294 MiB  3.0 TiB  0.01  1.01  137      up          osd.16              
-7         18.00000         -   18 TiB  1.7 GiB   13 MiB   0 B  1.7 GiB   18 TiB  0.01  1.00    -              host cephao-2-ceph-osd-3
 1    ssd   3.00000   1.00000  3.0 TiB  292 MiB  2.2 MiB   0 B  290 MiB  3.0 TiB  0.01  0.99  114      up          osd.1               
 5    ssd   3.00000   1.00000  3.0 TiB  292 MiB  2.2 MiB   0 B  290 MiB  3.0 TiB  0.01  0.99  114      up          osd.5               
 8    ssd   3.00000   1.00000  3.0 TiB  296 MiB  2.2 MiB   0 B  294 MiB  3.0 TiB  0.01  1.01  128      up          osd.8               
11    ssd   3.00000   1.00000  3.0 TiB  292 MiB  2.2 MiB   0 B  290 MiB  3.0 TiB  0.01  0.99  131      up          osd.11              
14    ssd   3.00000   1.00000  3.0 TiB  292 MiB  2.2 MiB   0 B  290 MiB  3.0 TiB  0.01  0.99  110      up          osd.14              
17    ssd   3.00000   1.00000  3.0 TiB  296 MiB  2.2 MiB   0 B  294 MiB  3.0 TiB  0.01  1.01  140      up          osd.17              
                        TOTAL   54 TiB  5.2 GiB   40 MiB   0 B  5.1 GiB   54 TiB  0.01                                                 
MIN/MAX VAR: 0.99/1.01  STDDEV: 0

Great! Ceph is up and running!

@tiagonix
Author

Hey, @guits! I don't have access to Ceph's Slack space.

@guits
Collaborator

guits commented Mar 19, 2024

Hey, @guits! I don't have access to Ceph's Slack space.

If you share an email, I'll send you an invitation.

@guits
Collaborator

guits commented Mar 19, 2024

or better: https://ceph-storage.slack.com/join/shared_invite/zt-2b91em8b6-NQOKhGYReEIrE28OVncnLQ#/shared-invite/email

@tiagonix
Author

So, for some reason, my Deployments 3 and 4 (stable-7.0) are also failing, due to a wrong Ansible version in my setup (seems an easy fix, though; not sure what happened lol). I'll ignore it for now. That's why I compared "2 with 5" and not "4 with 5" (and also updated my previous message with the diff). I just deployed with stable-6.0 15 minutes ago as a sanity check, and all is good.

My focus is on Ceph Reef on Ubuntu 22.04 with UCA Bobcat (which brings Ceph Reef to 22.04) and Ceph Ansible stable-8.0. I'll run the Heat template (5) again to grab more logs!

@tiagonix
Author

HEY! It's working now! LOL

I just tried my "Deployment 5" with the OpenStack Heat template I just shared, and it worked! The current Ceph Ansible stable-8.0 deploys Ceph Reef on Ubuntu 22.04 with UCA Bobcat, with no issues!

Check this out:

Ceph Ansible final lines of its run:

 TASK [set ceph crash install 'Complete'] ***************************************
 ok: [cephao-1-ceph-cp-1]
 PLAY [mons] ********************************************************************
 TASK [get ceph status from the first monitor] **********************************
 ok: [cephao-1-ceph-cp-1]
 TASK [show ceph status for cluster ceph] ***************************************
 ok: [cephao-1-ceph-cp-1] =>
   msg:
   - '  cluster:'
   - '    id:     a46d52d3-5a8b-4609-9a1f-e22e06f710f9'
   - '    health: HEALTH_WARN'
   - '            mons are allowing insecure global_id reclaim'
   - '            Degraded data redundancy: 2/668 objects degraded (0.299%), 2 pgs degraded, 96 pgs undersized'
   - ' '
   - '  services:'
   - '    mon: 3 daemons, quorum cephao-5-ceph-cp-1,cephao-5-ceph-cp-2,cephao-5-ceph-cp-3 (age 6m)'
   - '    mgr: cephao-5-ceph-cp-2(active, since 3s), standbys: cephao-5-ceph-cp-3, cephao-5-ceph-cp-1'
   - '    mds: 1/1 daemons up, 2 standby'
   - '    osd: 18 osds: 18 up (since 3m), 18 in (since 4m)'
   - '    rgw: 3 daemons active (3 hosts, 1 zones)'
   - ' '
   - '  data:'
   - '    volumes: 1/1 healthy'
   - '    pools:   12 pools, 337 pgs'
   - '    objects: 222 objects, 588 KiB'
   - '    usage:   513 MiB used, 54 TiB / 54 TiB avail'
   - '    pgs:     2/668 objects degraded (0.299%)'
   - '             241 active+clean'
   - '             94  active+undersized'
   - '             2   active+undersized+degraded'
   - ' '
 PLAY RECAP *********************************************************************
 cephao-1-ceph-client-1     : ok=79   changed=12   unreachable=0    failed=0    skipped=205  rescued=0    ignored=0
 cephao-1-ceph-cp-1         : ok=526  changed=66   unreachable=0    failed=0    skipped=532  rescued=0    ignored=0
 cephao-1-ceph-cp-2         : ok=370  changed=48   unreachable=0    failed=0    skipped=465  rescued=0    ignored=0
 cephao-1-ceph-cp-3         : ok=382  changed=51   unreachable=0    failed=0    skipped=465  rescued=0    ignored=0
 cephao-1-ceph-dash-1       : ok=58   changed=25   unreachable=0    failed=0    skipped=30   rescued=0    ignored=0
 cephao-1-ceph-osd-1        : ok=160  changed=25   unreachable=0    failed=0    skipped=259  rescued=0    ignored=0
 cephao-1-ceph-osd-2        : ok=147  changed=24   unreachable=0    failed=0    skipped=242  rescued=0    ignored=0
 cephao-1-ceph-osd-3        : ok=149  changed=25   unreachable=0    failed=0    skipped=240  rescued=0    ignored=0
 localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0
 INSTALLER STATUS ***************************************************************
 Install Ceph Monitor           : Complete (0:02:03)
 Install Ceph Manager           : Complete (0:00:16)
 Install Ceph OSD               : Complete (0:01:05)
 Install Ceph MDS               : Complete (0:00:20)
 Install Ceph RGW               : Complete (0:00:15)
 Install Ceph Client            : Complete (0:00:30)
 Install Ceph RGW LoadBalancer  : Complete (0:00:16)
 Install Ceph Dashboard         : Complete (0:00:26)
 Install Ceph Grafana           : Complete (0:00:21)
 Install Ceph Node Exporter     : Complete (0:01:29)
 Install Ceph Crash             : Complete (0:00:07)
 real        9m44.665s
 user        1m44.966s
 sys        0m52.544s
 ~

YAY!!!

Here's how I'm doing it:

$ openstack stack create -t ceph-basic-stack-ubuntu-5.yaml cephao-5
+---------------------+---------------------------------------------------------------------+
| Field               | Value                                                               |
+---------------------+---------------------------------------------------------------------+
| id                  | b846688a-a2d5-4963-adc9-992bd9cd8cbd                                |
| stack_name          | cephao-5                                                            |
| description         |                                                                     |
|                     | HOT template to create standard setup for a Ceph Cluster, with      |
|                     | Security Groups and Floating IPs.                                   |
|                     |                                                                     |
|                     | Total of 8 Instances for a basic Ceph Cluster.                      |
|                     | * 1 Ubuntu as Ceph Ansible                                          |
|                     | * 3 Ubuntu as Ceph Control Plane, for MONs, MGRs, Dashboard and etc |
|                     | * 3 Ubuntu as Ceph OSDs                                             |
|                     | * 1 Ubuntu as Ceph Client (RBD)                                     |
|                     |                                                                     |
|                     | Network Diagram - ASCII                                             |
|                     |                                                                     |
|                     |  Control Network (with Internet access via router_i_1)              |
|                     |                                                                     |
|                     |   ---------|ctrl_subnet|---------------------------------           |
|                     |   |     |     |     |     |    |                        |           |
|                     |   |     |     |     |     |    |                        |           |
|                     |  ---   ---   ---   ---   ---  ---                       |           |
|                     |  | |   | |   | |   | |   | |  | |                       |           |
|                     |  | |   | |   | |   | |   | |  | |   ----|CEPH ANSIBLE|--|           |
|                     |  |-|---|-|ceph_pub_subnet|-|--|-|---|                   |           |
|                     |  | |   | |   | |   | |   | |  | |   |---|CEPH GRAFANA|--|           |
|                     |  |C|   |C|   |C|   |C|   |C|  |C|   |   |& PROMETHEUS|  |           |
|                     |  |E|   |E|   |E|   |E|   |E|  |E|   |                   |           |
|                     |  |P|   |P|   |P|   |P|   |P|  |P|   ----|CEPH CLIENT|----           |
|                     |  |H|   |H|   |H|   |H|   |H|  |H|                                   |
|                     |  | |   | |   | |   | |   | |  | |                                   |
|                     |  |C|   |C|   |C|   |O|   |O|  |O|                                   |
|                     |  |P|   |P|   |P|   |S|   |S|  |S|                                   |
|                     |  | |   | |   | |   |D|   |D|  |D|                                   |
|                     |  ---   ---   ---   ---   ---  ---                                   |
|                     |                     |     |    |                                    |
|                     |                   |ceph_pri_subnet|                                 |
|                     |                                                                     |
| creation_time       | 2024-03-19T16:35:53Z                                                |
| updated_time        | None                                                                |
| stack_status        | CREATE_IN_PROGRESS                                                  |
| stack_status_reason | Stack CREATE started                                                |
+---------------------+---------------------------------------------------------------------+

After about 10 minutes, Ceph Ansible finally worked! Look:

$ openstack server list
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
| ID                                   | Name                    | Status | Networks                                                                                                             | Image                   | Flavor    |
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
| b3668daf-704c-4613-b976-51e64a10ce1f | cephao-5-ceph-cp-1      | ACTIVE | cephao-5-ceph-public=10.192.0.11; cephao-5-control=10.232.244.42, 192.168.192.11                                     | ubuntu-22.04-20230110   | m1.small  |
| f31ce00f-d72f-423a-8802-fa4c5dd609c8 | cephao-5-ceph-osd-3     | ACTIVE | cephao-5-ceph-cluster=10.192.1.23; cephao-5-ceph-public=10.192.0.23; cephao-5-control=10.232.242.64, 192.168.192.23  | ubuntu-22.04-20230110   | m1.medium |
| f3b3aef7-cd4f-4f0e-a15d-ed1903a33b14 | cephao-5-ceph-osd-1     | ACTIVE | cephao-5-ceph-cluster=10.192.1.21; cephao-5-ceph-public=10.192.0.21; cephao-5-control=10.232.244.87, 192.168.192.21  | ubuntu-22.04-20230110   | m1.medium |
| 382c2d04-59db-436b-9f82-973199dbc6d5 | cephao-5-ceph-cp-3      | ACTIVE | cephao-5-ceph-public=10.192.0.13; cephao-5-control=10.232.241.244, 192.168.192.13                                    | ubuntu-22.04-20230110   | m1.small  |
| 78d089ad-e6ea-42af-8b6b-5c01724d7fe3 | cephao-5-ceph-client-1  | ACTIVE | cephao-5-ceph-public=10.192.0.5; cephao-5-control=10.232.242.113, 192.168.192.5                                      | ubuntu-22.04-20230110   | m1.small  |
| d6df57d2-5adb-4644-8f7a-e82bd0558219 | cephao-5-ceph-cp-2      | ACTIVE | cephao-5-ceph-public=10.192.0.12; cephao-5-control=10.232.244.182, 192.168.192.12                                    | ubuntu-22.04-20230110   | m1.small  |
| 241866c6-364d-4dfb-9b94-e828c32f020c | cephao-5-ceph-osd-2     | ACTIVE | cephao-5-ceph-cluster=10.192.1.22; cephao-5-ceph-public=10.192.0.22; cephao-5-control=10.232.244.105, 192.168.192.22 | ubuntu-22.04-20230110   | m1.medium |
| 82b90435-3e7c-41b3-8643-d3a01bae7583 | cephao-5-ceph-ansible-1 | ACTIVE | cephao-5-ceph-public=10.192.0.4; cephao-5-control=10.232.243.103, 192.168.192.4                                      | ubuntu-22.04-20230110   | m1.small  |
| c0d897e9-8c8d-48c6-ad3e-abf1115d0321 | cephao-5-ceph-dash-1    | ACTIVE | cephao-5-ceph-public=10.192.0.10; cephao-5-control=10.232.242.248, 192.168.192.10                                    | ubuntu-22.04-20230110   | m1.small  |
+--------------------------------------+-------------------------+--------+----------------------------------------------------------------------------------------------------------------------+-------------------------+-----------+
$ ssh manager@10.232.244.42
manager@cephao-5-ceph-cp-1:~$ sudo ceph status
  cluster:
    id:     a46d52d3-5a8b-4609-9a1f-e22e06f710f9
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
            Degraded data redundancy: 2/668 objects degraded (0.299%), 2 pgs degraded, 96 pgs undersized
 
  services:
    mon: 3 daemons, quorum cephao-5-ceph-cp-1,cephao-5-ceph-cp-2,cephao-5-ceph-cp-3 (age 11m)
    mgr: cephao-5-ceph-cp-2(active, since 4m), standbys: cephao-5-ceph-cp-3, cephao-5-ceph-cp-1
    mds: 1/1 daemons up, 2 standby
    osd: 18 osds: 18 up (since 7m), 18 in (since 8m)
    rgw: 3 daemons active (3 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   12 pools, 337 pgs
    objects: 222 objects, 588 KiB
    usage:   513 MiB used, 54 TiB / 54 TiB avail
    pgs:     2/668 objects degraded (0.299%)
             241 active+clean
             94  active+undersized
             2   active+undersized+degraded
$ sudo ceph osd df tree
ID  CLASS  WEIGHT    REWEIGHT  SIZE     RAW USE  DATA     OMAP  META     AVAIL    %USE  VAR   PGS  STATUS  TYPE NAME                   
-1         54.00000         -   54 TiB  513 MiB   23 MiB   0 B  490 MiB   54 TiB     0  1.00    -          root default                
-7         18.00000         -   18 TiB  170 MiB  7.8 MiB   0 B  162 MiB   18 TiB     0  0.99    -              host cephao-5-ceph-osd-1
 2    ssd   3.00000   1.00000  3.0 TiB   27 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.96    0      up          osd.2               
 5    ssd   3.00000   1.00000  3.0 TiB   28 MiB  1.7 MiB   0 B   26 MiB  3.0 TiB     0  0.98   65      up          osd.5               
 8    ssd   3.00000   1.00000  3.0 TiB   32 MiB  1.3 MiB   0 B   30 MiB  3.0 TiB  0.00  1.11   64      up          osd.8               
11    ssd   3.00000   1.00000  3.0 TiB   27 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.96   32      up          osd.11              
14    ssd   3.00000   1.00000  3.0 TiB   28 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.97  144      up          osd.14              
17    ssd   3.00000   1.00000  3.0 TiB   28 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.97   32      up          osd.17              
-3         18.00000         -   18 TiB  174 MiB  7.8 MiB   0 B  166 MiB   18 TiB     0  1.02    -              host cephao-5-ceph-osd-2
 0    ssd   3.00000   1.00000  3.0 TiB   27 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.96   32      up          osd.0               
 3    ssd   3.00000   1.00000  3.0 TiB   28 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.97  112      up          osd.3               
 6    ssd   3.00000   1.00000  3.0 TiB   27 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.96   96      up          osd.6               
 9    ssd   3.00000   1.00000  3.0 TiB   32 MiB  1.7 MiB   0 B   30 MiB  3.0 TiB  0.00  1.12   33      up          osd.9               
12    ssd   3.00000   1.00000  3.0 TiB   27 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.96   32      up          osd.12              
15    ssd   3.00000   1.00000  3.0 TiB   32 MiB  1.3 MiB   0 B   30 MiB  3.0 TiB  0.00  1.11   32      up          osd.15              
-5         18.00000         -   18 TiB  170 MiB  7.8 MiB   0 B  162 MiB   18 TiB     0  0.99    -              host cephao-5-ceph-osd-3
 1    ssd   3.00000   1.00000  3.0 TiB   27 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.96   32      up          osd.1               
 4    ssd   3.00000   1.00000  3.0 TiB   28 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.97   64      up          osd.4               
 7    ssd   3.00000   1.00000  3.0 TiB   32 MiB  1.3 MiB   0 B   30 MiB  3.0 TiB  0.00  1.11   96      up          osd.7               
10    ssd   3.00000   1.00000  3.0 TiB   28 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.97   48      up          osd.10              
13    ssd   3.00000   1.00000  3.0 TiB   28 MiB  1.7 MiB   0 B   26 MiB  3.0 TiB     0  0.98    1      up          osd.13              
16    ssd   3.00000   1.00000  3.0 TiB   27 MiB  1.2 MiB   0 B   26 MiB  3.0 TiB     0  0.96   96      up          osd.16              
                        TOTAL   54 TiB  513 MiB   23 MiB   0 B  490 MiB   54 TiB     0                                                 
MIN/MAX VAR: 0.96/1.12  STDDEV: 0
manager@cephao-5-ceph-cp-1:~$ lsb_release -ra
No LSB modules are available.
Distributor ID:	Ubuntu
Description:	Ubuntu 22.04.4 LTS
Release:	22.04
Codename:	jammy
manager@cephao-5-ceph-cp-1:~$ dpkg -l | grep ceph-common
ii  ceph-common                          18.2.0-0ubuntu3~cloud0                  amd64        common utilities to mount and interact with a ceph storage cluster
ii  python3-ceph-common                  18.2.0-0ubuntu3~cloud0                  all          Python 3 utility libraries for Ceph

AWESOME!!! Ceph Reef is up and running on Ubuntu 22.04!

Let me test on Ubuntu 24.04 next.

@tiagonix
Author

tiagonix commented Mar 19, 2024

I can confirm that Ceph Ansible stable-8.0 works great to deploy Ceph Reef on Ubuntu 22.04/Bobcat and Ubuntu 24.04 too!

Check it out:

root@cephao-6-ceph-cp-1:~# lsb_release -ra
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu Noble Numbat (development branch)
Release:        24.04
Codename:       noble
root@cephao-6-ceph-cp-1:~# dpkg -l | grep ceph-common
ii  ceph-common                          18.2.0-0ubuntu7                         amd64        common utilities to mount and interact with a ceph storage cluster
ii  python3-ceph-common                  18.2.0-0ubuntu7                         all          Python 3 utility libraries for Ceph
root@cephao-6-ceph-cp-1:~# ceph status
  cluster:
    id:     a46d52d3-5a8b-4609-9a1f-e22e06f710f9
    health: HEALTH_WARN
            mons are allowing insecure global_id reclaim
            Degraded data redundancy: 2/662 objects degraded (0.302%), 2 pgs degraded, 70 pgs undersized
            3 mgr modules have recently crashed
 
  services:
    mon: 3 daemons, quorum cephao-6-ceph-cp-1,cephao-6-ceph-cp-2,cephao-6-ceph-cp-3 (age 106m)
    mgr: cephao-6-ceph-cp-3(active, since 104m), standbys: cephao-6-ceph-cp-2, cephao-6-ceph-cp-1
    mds: 1/1 daemons up, 2 standby
    osd: 18 osds: 18 up (since 103m), 18 in (since 103m); 26 remapped pgs
    rgw: 3 daemons active (3 hosts, 1 zones)
 
  data:
    volumes: 1/1 healthy
    pools:   12 pools, 337 pgs
    objects: 220 objects, 586 KiB
    usage:   525 MiB used, 54 TiB / 54 TiB avail
    pgs:     2/662 objects degraded (0.302%)
             241 active+clean
             68  active+undersized
             22  active+clean+remapped
             3   active+clean+remapped+scrubbing
             2   active+undersized+degraded
             1   active+clean+remapped+scrubbing+deep
 
  progress:
    Global Recovery Event (102m)
      [======================......] (remaining: 26m)

Awesome!

It's worth mentioning that if dashboard_enabled: True, then it'll install docker-ce anyway, even if you set:

docker: false
containerized_deployment: false

If you want a "Docker-free Ceph deployment", also set dashboard_enabled: False. But then you lose those features... ¯\_(ツ)_/¯

Also, the dashboard_enabled: True doesn't work on Ubuntu 24.04 right now, because there's no docker-ce for 24.04 yet, and there's a task failing to install docker-ce on 24.04, so I just disabled it for now.

Alternatively (edit: this should actually be the first thing to try, lol), point Ceph Ansible at docker.io, so it'll use Ubuntu 24.04's native package:

container_package_name: docker.io
container_service_name: docker

Then dashboard_enabled: True also works on Ubuntu 24.04!
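For completeness, a sketch of these overrides together (the file location is an assumption; any group_vars file that applies to your hosts works):

# group_vars/all.yml (assumed location)
dashboard_enabled: True
container_package_name: docker.io   # Ubuntu's native Docker package
container_service_name: docker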

It seems that this issue can be closed after all! It was easier than I thought... Let's see how it'll play with OpenStack Ansible! lol

Good opportunity to share ideas and Heat templates! :-D

guits added a commit that referenced this issue Mar 20, 2024
This enforces docker.io and docker respectively for
`container_package_name` and `container_service_name` by default
for Ubuntu distribution.

Fixes: #7496

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
@guits
Collaborator

guits commented Mar 20, 2024

I can confirm that Ceph Ansible stable-8.0 works great to deploy Ceph Reef on Ubuntu 22.04/Bobcat and Ubuntu 24.04 too!

glad to read that.

It's worth mentioning that if dashboard_enabled: True, then it'll install docker-ce anyway, even if you set:

docker: false
containerized_deployment: false

As far as I remember, docker: false is a CI-only thing.
As for containerized_deployment, it won't have any effect on this.
If you want to force the container engine, you have to set the following variables:

container_package_name
container_service_name

If you want a "Docker-free Ceph deployment", also set dashboard_enabled: False. But then you lose those features... ¯\_(ツ)_/¯

The dashboard uses grafana / prometheus / etc. as containerized daemons, whether your Ceph deployment is containerized or not.

Also, the dashboard_enabled: True doesn't work on Ubuntu 24.04 right now, because there's no docker-ce for 24.04 yet, and there's a task failing to install docker-ce on 24.04, so I just disabled it for now.

Alternatively (edit: this should actually be the first thing to try, lol), point Ceph Ansible at docker.io, so it'll use Ubuntu 24.04's native package:

container_package_name: docker.io
container_service_name: docker

In fact, we probably want to update ./roles/ceph-container-engine/vars with new files Ubuntu-22.yml and Ubuntu-24.yml.
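A sketch of what such a vars file could contain (the filename comes from the sentence above; the keys are the two variables already discussed):

# roles/ceph-container-engine/vars/Ubuntu-24.yml (hypothetical)
container_package_name: docker.io
container_service_name: docker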

It seems that this issue can be closed after all! It was easier than I thought... Let's see how it'll play with OpenStack Ansible! lol

See #7523

mergify bot pushed a commit that referenced this issue Mar 20, 2024
This enforces docker.io and docker respectively for
`container_package_name` and `container_service_name` by default
for Ubuntu distribution.

Fixes: #7496

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit ef5a09d)
guits added a commit that referenced this issue Mar 20, 2024
This enforces docker.io and docker respectively for
`container_package_name` and `container_service_name` by default
for Ubuntu distribution.

Fixes: #7496

Signed-off-by: Guillaume Abrioux <gabrioux@ibm.com>
(cherry picked from commit ef5a09d)