AP-ALB v0.8.x adaptations to support RHEL/CentOS 8 family and design changes to allow flexibility even for non-tested OSs #34

Closed
fititnt opened this issue Dec 6, 2019 · 14 comments

Comments

@fititnt
Owner

fititnt commented Dec 6, 2019

Refs #17 (comment).

Since 3 August (about 125 days), AP-ALB has only been tested on Ubuntu/Debian systems. The Ansible roles used in the demo fititnt/ansible-linux-ha-cluster, with the exception of AP-ALB itself, are already compatible with RHEL/CentOS.

At least for roles related to MariaDB/MySQL clusters, it is easier to find well-maintained ones that work on RHEL/CentOS than recent ones for Debian/Ubuntu. That is already a good reason to make AP-ALB compatible with RHEL/CentOS in the very short term.


fititnt commented Dec 6, 2019

CentOS 8 does not have any Python installed by default.

[Screenshot, 2019-12-06 05:22:42]

Not that the idea of forcing users to choose which Python to use is bad; it is just unexpected. I'm just not sure of the best strategy to handle this without making all the other runs slower.


fititnt commented Dec 6, 2019

According to https://ansible-tips-and-tricks.readthedocs.io/en/latest/ansible/commands/#when-all-else-fails, it would only require running this before executing the role:

ansible all -m raw -a "sudo dnf install python3 -y" -i apd.etica.ai,ape.etica.ai,apf.etica.ai,apg.etica.ai

[Screenshot, 2019-12-06 05:54:44]

Maybe I will just document this step instead of automating it.
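
If this step were automated after all, one option is a small play that runs before the role. A minimal sketch, assuming the targets use dnf and the remote user may use sudo as in the command above; the play and the register name `python3_check` are illustrative and not part of AP-ALB:

```yaml
# Hypothetical pre-role play: installs Python only on hosts that ship without it.
- name: Bootstrap Python before running AP-ALB
  hosts: all
  gather_facts: false   # fact gathering itself needs Python on the target
  tasks:
    - name: Check whether python3 is already installed
      raw: command -v python3
      register: python3_check
      changed_when: false
      failed_when: false

    - name: Install python3 only where it is missing
      raw: sudo dnf install python3 -y
      when: python3_check.rc != 0
```

The check is a single cheap `raw` call, so hosts that already have Python should only pay a small cost on each run.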

fititnt added a commit that referenced this issue Dec 6, 2019

fititnt commented Dec 6, 2019

Debian/Ubuntu uses: www-data.
RHEL/CentOS uses: apache or httpd.
FreeBSD uses: www.

Some setups use an nginx user instead of these.

We will have to decide whether to enforce the same user on all distros or to pick one per distro. And yes, I am really interested in getting AP-ALB working as a cluster even with mixed Linux distributions.
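
The "one user per distro" option could be expressed as a lookup keyed by the `ansible_os_family` fact. This is only a sketch: both variable names are hypothetical, and the fallback value is an arbitrary choice for illustration.

```yaml
# Hypothetical role variables; not the names AP-ALB actually uses.
alb_www_user_by_os_family:
  Debian: www-data
  RedHat: apache
  FreeBSD: www

# Falls back to "nginx" when the OS family has no entry in the map above.
alb_www_user: "{{ alb_www_user_by_os_family.get(ansible_os_family, 'nginx') }}"
```

Enforcing a single user everywhere would instead mean dropping the map and setting `alb_www_user` to one fixed value.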

fititnt added a commit that referenced this issue Dec 6, 2019
…cted by OS; semi-implemented rhel-centos installer
fititnt added a commit to fititnt/ansible-linux-ha-cluster that referenced this issue Dec 6, 2019

fititnt commented Dec 6, 2019

Humm. The firewalls.

CentOS/RHEL use firewalld by default as a frontend for iptables/netfilter. The Ansible firewalld module (https://docs.ansible.com/ansible/latest/modules/firewalld_module.html) would require considerable effort to make it compatible with tasks written for UFW (https://docs.ansible.com/ansible/latest/modules/ufw_module.html).

I guess that, for the sake of long-term simplicity (and to make clusters of different OSs easier), if the user decides to use AP-ALB, we install UFW from EPEL instead of maintaining two firewalls.
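
A rough sketch of what "UFW from EPEL" could look like as tasks. It assumes a CentOS host where the `epel-release` package is installable from the default repositories and that EPEL actually carries a `ufw` package for the release in use; neither is verified here, and the opened ports are just an example.

```yaml
- name: "ufw (CentOS/RHEL) | enable EPEL"
  package:
    name: epel-release
    state: present
  when: ansible_os_family == 'RedHat'

- name: "ufw | install UFW"
  package:
    name: ufw
    state: present

- name: "ufw | allow HTTP and HTTPS"
  ufw:
    rule: allow
    port: "{{ item }}"
    proto: tcp
  loop:
    - "80"
    - "443"
```

With this approach the rule tasks stay identical on Debian and RedHat families; only the repository step is OS-specific.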


fititnt commented Dec 6, 2019

About HAProxy: most sources about HAProxy 2.0.x on CentOS/RHEL suggest compiling from source. For now I'm trying the default HAProxy 1.8 to see if it works without changes (even if I enable 2.0.x later, I'm curious whether it would).

I'm also thinking about adding explicit options so the user can decide whether AP-ALB may add repositories. This may be useful if users want to use their own logic, or if new versions of AP-ALB break something along the road and users prefer not to update the component versions (which I guess is very plausible, especially for HAProxy).
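
As an illustration, such options could be plain booleans that the user overrides. The variable names below are the ones listed in the commit that followed this comment; the file location, the values, and the included task file `haproxy-repository.yml` are assumptions for the sketch and are not AP-ALB defaults.

```yaml
# group_vars/all.yml (hypothetical location); example values only
alb_manange_haproxy_repository: false   # keep the distribution's own HAProxy packages
alb_manange_ufw_install: true
alb_manange_ufw_repository: false
---
# Inside the role: run the repository tasks only when the user allows it.
- name: "haproxy | add extra repository (opt-in)"
  include_tasks: haproxy-repository.yml   # hypothetical task file
  when:
    - alb_manange_haproxy_repository is defined
    - alb_manange_haproxy_repository | bool
```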

fititnt added a commit that referenced this issue Dec 6, 2019
… alb_manange_haproxy_repository, alb_manange_ufw_install, alb_manange_ufw_repository and (internal usage) alb_manange_openresty_install, alb_manange_openresty_repository

fititnt commented Dec 6, 2019

TASK [ap-application-load-balancer : openresty (CentOS/RHEL) | Install OpenResty] *****************************************************************************************************************************************************
fatal: [ap_foxtrot]: FAILED! => {"changed": false, "msg": "Failed to synchronize cache for repo 'openresty'", "rc": 1, "results": []}

OpenResty has still not released official repositories for the newer CentOS 8 (released September 24th, 2019).

So this task

- name: "openresty (CentOS/RHEL) | OpenResty Repository"
  yum_repository:
    name: "openresty"
    description: "OpenResty CentOS repo"
    baseurl: 'https://openresty.org/package/centos/openresty.repo'
    state: present
    enabled: yes
  when:
    - is_openresty is failed
    - "ansible_distribution == 'CentOS' or ansible_distribution == 'Red Hat Enterprise Linux'"
    - "alb_manange_openresty_repository is defined and alb_manange_openresty_repository|bool"

deploys this

File /etc/yum.repos.d/openresty.repo

[openresty]
name=Official OpenResty Open Source Repository for CentOS
baseurl=https://openresty.org/package/centos/$releasever/$basearch
skip_if_unavailable=False
gpgcheck=1
repo_gpgcheck=0
gpgkey=https://openresty.org/package/pubkey.gpg
enabled=1
enabled_metadata=1

But https://openresty.org/package/centos/8/x86_64/ returns 404 Not Found, while https://openresty.org/package/centos/7/x86_64/ exists. Maybe I will just do some monkey patching and ping upstream.
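
One possible shape for that monkey patch is to pin the repository to the CentOS 7 path while on version 8. This is a sketch under the assumption that the CentOS 7 packages install and run on CentOS 8, which is not verified here; the URLs are the ones shown above.

```yaml
- name: "openresty (CentOS/RHEL 8) | temporary repository pinned to CentOS 7 packages"
  yum_repository:
    name: openresty
    description: "OpenResty repository (CentOS 7 packages, temporary workaround)"
    # "7" is hardcoded instead of $releasever because the /8/ path returns 404
    baseurl: "https://openresty.org/package/centos/7/$basearch"
    gpgcheck: yes
    gpgkey: "https://openresty.org/package/pubkey.gpg"
    enabled: yes
    state: present
  when:
    - ansible_os_family == 'RedHat'
    - ansible_distribution_major_version == '8'
```

Once upstream publishes a CentOS 8 repository, the task could go back to `$releasever`.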

fititnt added a commit that referenced this issue Dec 7, 2019
… since OpenResty did not yet released a new repository

fititnt commented Dec 7, 2019

I may be wrong, but most of the Ansible tasks in AP-ALB (maybe at least 90-95%) should be compatible with non-Debian systems without any change (except Windows without WSL).

With #33 (trying to use folder conventions) and a single, consistent user (instead of www-data / apache / httpd / www or nginx), and leaving aside maybe the firewall-related tasks (we actually CAN reuse UFW on non-Debian systems if the user really wants to use ALB/UFW), my guess for now is that most of the work to support other distributions would be:

  1. Reliable repositories to get updated versions of the core components (the source repositories)
  2. Work on documentation

The design choices actually matter a lot to reduce rework. For example, some tasks of the playbook, mostly the ones using the apt module, could sometimes just be written with the less powerful but generic package module https://docs.ansible.com/ansible/latest/modules/package_module.html. In the next Ansible roles and tasks I will take care to write them more flexibly from the start.

But the source repositories are likely to be the main thing that really cannot be fully consistent. Even if I eventually take some time just to test FreeBSD support, it is very likely that I would have to implement ways to compile from source. So, by making the playbooks more flexible for other OSs, we may be improving the features even for the main supported OSs.

I know I could rush the basic implementation of this issue for CentOS/RHEL, but I guess I will take some extra hours to make the base AP-ALB generic enough for nearly any OS. Considering the time already spent on this repository, it does not even need much more time.
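
A small before/after sketch of that idea. The package name is only an example; the trade-off is that apt-specific options such as `update_cache` are not available through the generic module.

```yaml
# Debian-only version
- name: "haproxy | install (apt)"
  apt:
    name: haproxy
    state: present
    update_cache: yes

# Generic version: Ansible picks the package manager detected on the host
- name: "haproxy | install (any supported package manager)"
  package:
    name: haproxy
    state: present
```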

Hummm....

fititnt changed the title from "AP-ALB adaptations to support RHEL/CentOS family" to "AP-ALB v0.8.x adaptations to support RHEL/CentOS 8 family and design changes to allow flexibility even for non-tested OSs" Dec 7, 2019
fititnt added a commit to fititnt/ansible-linux-ha-cluster that referenced this issue Dec 8, 2019
fititnt added a commit that referenced this issue Dec 8, 2019
…ly (first case very likely is when boostraping a node with gather_facts: false)

fititnt commented Dec 8, 2019

From Ansible source code at https://github.com/ansible/ansible/blob/devel/lib/ansible/module_utils/facts/system/distribution.py#L466

    OS_FAMILY_MAP = {'RedHat': ['RedHat', 'Fedora', 'CentOS', 'Scientific', 'SLC',
                                'Ascendos', 'CloudLinux', 'PSBM', 'OracleLinux', 'OVS',
                                'OEL', 'Amazon', 'Virtuozzo', 'XenServer', 'Alibaba'],
                     'Debian': ['Debian', 'Ubuntu', 'Raspbian', 'Neon', 'KDE neon',
                                'Linux Mint', 'SteamOS', 'Devuan', 'Kali', 'Cumulus Linux'],
                     'Suse': ['SuSE', 'SLES', 'SLED', 'openSUSE', 'openSUSE Tumbleweed',
                              'SLES_SAP', 'SUSE_LINUX', 'openSUSE Leap'],
                     'Archlinux': ['Archlinux', 'Antergos', 'Manjaro'],
                     'Mandrake': ['Mandrake', 'Mandriva'],
                     'Solaris': ['Solaris', 'Nexenta', 'OmniOS', 'OpenIndiana', 'SmartOS'],
                     'Slackware': ['Slackware'],
                     'Altlinux': ['Altlinux'],
                     'SGML': ['SGML'],
                     'Gentoo': ['Gentoo', 'Funtoo'],
                     'Alpine': ['Alpine'],
                     'AIX': ['AIX'],
                     'HP-UX': ['HPUX'],
                     'Darwin': ['MacOSX'],
                     'FreeBSD': ['FreeBSD', 'TrueOS'],
                     'ClearLinux': ['Clear Linux OS', 'Clear Linux Mix']}

fititnt added a commit that referenced this issue Dec 8, 2019
fititnt added a commit that referenced this issue Dec 8, 2019
…rride of variables without replacing full original files

fititnt commented Dec 8, 2019

I really liked the way OpenStack organizes variables (https://github.com/openstack/openstack-ansible-galera_server) and took their approach as inspiration. For example, they have a task that detects the OS family, distribution and major version, and then loads the first, most specific file that exists, e.g.

  • ubuntu-18.04.yml (an Ubuntu 18.04 host would load this)
  • ubuntu-18.yml
  • ubuntu.yml (an Ubuntu 16 host would fall back to this one)
  • debian-18.yml (OS family plus major version; ok, weird, but)
  • debian.yml (besides Debian, distros like Raspbian, Linux Mint, Kali, etc. would load this)
  • (error, not found)
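
The fallback order above maps naturally onto Ansible's `first_found` lookup. A minimal sketch, assuming the files live in the role's `vars/` directory; it mirrors the list above and is not a copy of the OpenStack task:

```yaml
- name: "vars | load the most specific OS variables file that exists"
  include_vars: "{{ item }}"
  with_first_found:
    - "{{ ansible_distribution | lower }}-{{ ansible_distribution_version | lower }}.yml"
    - "{{ ansible_distribution | lower }}-{{ ansible_distribution_major_version | lower }}.yml"
    - "{{ ansible_distribution | lower }}.yml"
    - "{{ ansible_os_family | lower }}-{{ ansible_distribution_major_version | lower }}.yml"
    - "{{ ansible_os_family | lower }}.yml"
```

Only one file is loaded, and the task fails when none of the candidates exists, which corresponds to the last bullet.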

But in practice their repository needed only 4 files, so https://github.com/openstack/openstack-ansible-galera_server/tree/master/vars actually has only

  • debian.yml
  • gentoo.yml
  • redhat.yml
  • suse.yml

And their configuration is somewhat simpler than the full ALB stack. But the main issue here is that, because of the way Ansible loads variables, if the same logic found an ubuntu-18.yml file in the directory, everything from debian.yml would have to be copied and pasted into it again.

That's why in commit 1bc6b88 I changed this a bit to use 3 levels of variable loading, with a directory structure like this:

# fititnt at bravo in /alligo/code/fititnt/ap-application-load-balancer/vars on git:master o [4:50:18]
$ tree
.
├── main.yml
└── os-family
    ├── debian.yml
    ├── distribution
    │   ├── no-os-family-customization.yml
    │   ├── ubuntu.yml
    │   └── version
    │       ├── no-distribution-customization.yml
    │       ├── ubuntu-18.yml
    │       └── user-custom-overrides
    │           └── readme.yml
    ├── freebsd.yml
    ├── redhat.yml
    ├── unknown.yml
    └── untested.yml

4 directories, 11 files

The advantage of this approach (if support for both OS family and distribution version is really needed) is that it becomes possible to override only what is needed.
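
One way such layered loading could be written is three consecutive `include_vars` tasks, each with its own fallback file, so that later levels override only the keys they define. This is a guess at the mechanism based on the directory tree above, not the code from commit 1bc6b88:

```yaml
- name: "vars | level 1: OS family"
  include_vars: "{{ item }}"
  with_first_found:
    - "os-family/{{ ansible_os_family | lower }}.yml"
    - "os-family/unknown.yml"

- name: "vars | level 2: distribution"
  include_vars: "{{ item }}"
  with_first_found:
    - "os-family/distribution/{{ ansible_distribution | lower }}.yml"
    - "os-family/distribution/no-os-family-customization.yml"

- name: "vars | level 3: distribution major version"
  include_vars: "{{ item }}"
  with_first_found:
    - "os-family/distribution/version/{{ ansible_distribution | lower }}-{{ ansible_distribution_major_version }}.yml"
    - "os-family/distribution/version/no-distribution-customization.yml"
```

On Ubuntu 18.04 this would load debian.yml, then ubuntu.yml, then ubuntu-18.yml, so ubuntu-18.yml only needs the few variables that differ.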

fititnt added a commit that referenced this issue Dec 8, 2019
…hostname and alb_node_timezone; removed module Common (now part of Bootstrap); Minimal Ansible version on coltrol node: 2.9
fititnt added a commit that referenced this issue Dec 8, 2019
…rap/alb-standard; created bootstrap/ansible-control-node
fititnt added a commit that referenced this issue Dec 8, 2019
fititnt added a commit to fititnt/ansible-linux-ha-cluster that referenced this issue Dec 9, 2019
…o_debian10, ap_foxtrot_centos8, ap_golf_archilinux, rocha_anortosito_freebsd12, rocha_basalto_opensuse15 (refs fititnt/ap-application-load-balancer#34)
fititnt added a commit that referenced this issue Dec 9, 2019
…archilinux; added alb_internal_root_user & alb_internal_root_group (FreeBSD case)

fititnt commented Dec 11, 2019

OpenSUSE 15.1 does not have the error pages at the same path:

rocha-basalto-opensuse15:~ # /usr/sbin/haproxy -c -V -f /etc/haproxy/haproxy.cfg
[ALERT] 344/092537 (2187) : parsing [/etc/haproxy/haproxy.cfg:58] : error opening file </etc/haproxy/errors/400.http> for custom error message <400>.
[ALERT] 344/092537 (2187) : parsing [/etc/haproxy/haproxy.cfg:59] : error opening file </etc/haproxy/errors/403.http> for custom error message <403>.
[ALERT] 344/092537 (2187) : parsing [/etc/haproxy/haproxy.cfg:60] : error opening file </etc/haproxy/errors/408.http> for custom error message <408>.
[ALERT] 344/092537 (2187) : parsing [/etc/haproxy/haproxy.cfg:61] : error opening file </etc/haproxy/errors/500.http> for custom error message <500>.
[ALERT] 344/092537 (2187) : parsing [/etc/haproxy/haproxy.cfg:62] : error opening file </etc/haproxy/errors/502.http> for custom error message <502>.
[ALERT] 344/092537 (2187) : parsing [/etc/haproxy/haproxy.cfg:63] : error opening file </etc/haproxy/errors/503.http> for custom error message <503>.
[ALERT] 344/092537 (2187) : parsing [/etc/haproxy/haproxy.cfg:64] : error opening file </etc/haproxy/errors/504.http> for custom error message <504>.
[WARNING] 344/092537 (2187) : parsing [/etc/haproxy/haproxy.cfg:77] : backend 'backend_http' : 'option tcplog' directive is ignored in backends.
[ALERT] 344/092537 (2187) : Error(s) found in configuration file : /etc/haproxy/haproxy.cfg
[ALERT] 344/092537 (2187) : Fatal errors found in configuration.
rocha-basalto-opensuse15:~ # ls -lha /etc/haproxy/errors/
ls: cannot access '/etc/haproxy/errors/': No such file or directory
rocha-basalto-opensuse15:~ # ls -lha /etc/haproxy/
total 20K
drwxr-x---  2 root haproxy 4.0K Dec 11 08:15 .
drwxr-xr-x 86 root root    4.0K Dec 11 07:48 ..
-rw-r--r--  1 root root    4.3K Dec 11 08:15 haproxy.cfg
-rw-r-----  1 root haproxy  799 Nov 29 17:15 haproxy.cfg.31149.2019-12-11@08:15:16~
rocha-basalto-opensuse15:~ # vim /etc/haproxy/haproxy.cfg.31149.2019-12-11@08\:15\:16~ 
rocha-basalto-opensuse15:~ # cat /etc/haproxy/haproxy.cfg.31149.2019-12-11@08\:15\:16~ 
global
  log /dev/log daemon
  maxconn 32768
  chroot /var/lib/haproxy
  user haproxy
  group haproxy
  daemon
  stats socket /var/lib/haproxy/stats user haproxy group haproxy mode 0640 level operator
  tune.bufsize 32768
  tune.ssl.default-dh-param 2048
  ssl-default-bind-ciphers ALL:!aNULL:!eNULL:!EXPORT:!DES:!3DES:!MD5:!PSK:!RC4:!ADH:!LOW@STRENGTH

defaults
  log     global
  mode    http
  option  log-health-checks
  option  log-separate-errors
  option  dontlog-normal
  option  dontlognull
  option  httplog
  option  socket-stats
  retries 3
  option  redispatch
  maxconn 10000
  timeout connect     5s
  timeout client     50s
  timeout server    450s

listen stats
  bind 0.0.0.0:80
  bind :::80 v6only
  stats enable
  stats uri     /
  stats refresh 5s
  rspadd Server:\ haproxy/1.6
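
A way to guard against this in the role is to check for the packaged error pages before rendering the `errorfile` lines. A minimal sketch; the register and fact names are hypothetical, and the path is the one from the output above:

```yaml
- name: "haproxy | check whether the packaged error pages exist"
  stat:
    path: /etc/haproxy/errors/503.http
  register: alb_haproxy_errorfile_check   # hypothetical name

- name: "haproxy | remember whether errorfile lines can be rendered"
  set_fact:
    alb_haproxy_has_errorfiles: "{{ alb_haproxy_errorfile_check.stat.exists }}"   # hypothetical name
```

The haproxy.cfg template could then wrap its `errorfile` directives in a condition on that fact, or the role could ship its own error pages instead.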

fititnt added a commit that referenced this issue Dec 11, 2019
…orfiles; bsd (#37) with exclusive non-systemd manangement
fititnt added a commit that referenced this issue Dec 11, 2019
…t on how to deal with fact that on FreeBSD HAproxy user is already not created
fititnt added a commit that referenced this issue Dec 11, 2019
…oxy now have options alb_haproxy_system_user and alb_haproxy_system_group
fititnt added a commit that referenced this issue Dec 11, 2019
…ackages that requires compilation on OSs without precompiled sources

fititnt commented Dec 11, 2019

For all OSs that do not have reliable repositories that could provide long-term updates for HAProxy or OpenResty, I will not enable these features by default.

In practice, this mostly means operating systems that require OpenResty from source. To my surprise, HAProxy actually has versions 2.0 or 2.1 on Arch Linux, OpenSUSE and FreeBSD 12, while even CentOS and Debian require extra repositories to get such versions.

I will take some time just to finish the code refactoring, especially the part related to OpenResty. This was the latest Ansible play, with 6 operating systems:

[Screenshot, 2019-12-11 12:24:40]

Both Debian 10 and CentOS 8 (the ones likely to be the main targets for production usage, along with Ubuntu) are broken. I obviously know how to fix this (maybe CentOS 8 will take some extra work to temporarily force the use of CentOS 7 packages). But one of the reasons I took some extra time with the other releases was to check how to reorganize the variables for each operating system and each release of an operating system.

The vars directory now looks like this:

# fititnt at bravo in /alligo/code/fititnt/ap-application-load-balancer/vars on git:master x [12:29:42]
$ tree
.
├── main.yml
└── os-family
    ├── archlinux.yml
    ├── debian.yml
    ├── distribution
    │   ├── cloudlinux.yml
    │   ├── no-os-family-customization.yml
    │   ├── ubuntu.yml
    │   └── version
    │       ├── debian-10.yml
    │       ├── debian-11.yml
    │       ├── debian-9.yml
    │       ├── no-distribution-customization.yml
    │       ├── ubuntu-18.yml
    │       └── user-custom-overrides
    │           └── readme.yml
    ├── freebsd.yml
    ├── redhat.yml
    ├── suse.yml
    ├── unknown.yml
    └── untested.yml

4 directories, 17 files

Another difference from the version of 5 days ago is that a lot of logic that was hardcoded in the task files is now in variables, so it is easier for me or someone else to change it in one place (or to know what to do for a new or different version) than to go directly into the task files.

I also took a lot of care to make the ALB bootstrap group of tasks (similar to common) #36 somewhat compatible even with non-tested OSs. This new role alone took a good amount of time, but after this stage the differences, even across different operating systems, tend to be much smaller.

And trying to make ALB work on BSD systems #37 actually forced a better design and a better structure, even if it is likely to be used mainly on the Debian/RedHat families.

fititnt added a commit to fititnt/ansible-linux-ha-cluster that referenced this issue Dec 13, 2019
… ap_golf_centos7, rocha_anortosito_centos8, rocha_basalto_freebsd12 (refs fititnt/ap-application-load-balancer#34)
fititnt added a commit that referenced this issue Dec 13, 2019
…to allow customization of all versions of a base OS Family (useful for RHEL/CentOS with lots of distros)
fititnt added a commit that referenced this issue Dec 13, 2019
fititnt added a commit that referenced this issue Dec 14, 2019
fititnt added a commit that referenced this issue Dec 14, 2019
fititnt added a commit that referenced this issue Dec 14, 2019
fititnt added a commit that referenced this issue Dec 14, 2019
…itories ( {{ ansible_lsb.codename }} instead of bugged $(lsb_release -sc) ); other fixes related to dependencies

fititnt commented Dec 24, 2019

Done.

Repository at https://github.com/fititnt/ansible-linux-ha-cluster. Full release at https://github.com/fititnt/ansible-linux-ha-cluster/releases/tag/demo-001-ap-alb-v0.8.5-alpha. AP-ALB v0.8.6-alpha had some fixes and a demo (asciicast).

Implementing both the clustering and the cross-platform features in the same minor version, even if it worked better than I initially expected, was less smooth than I would like to admit. But it worked!

But it was definitely something that made sense to implement before the v1.0 release.

fititnt closed this as completed Dec 24, 2019

fititnt commented Dec 24, 2019

Important note

One of the reasons not to have at least one beta of v0.8.x was that RHEL/CentOS 8 is so recent that it caused the issue "RHEL/CentOS 8 and missing lua/luarocks base repositories for OpenResty" #39, and the hotfix to make it work on version 8 in the very short term was ugly.

Some features, like the "MVP of AP-ALB demo with wireguard for private networking" #29, could be called non-core (and, at least within the same datacenter and with some cloud providers, it is possible to avoid implementing a VPN at all), but the OpenResty dependencies are implicitly strong dependencies of AP-ALB.

The bootstrap group of tasks

All tested OSs (including Arch Linux and OpenSUSE) were already tested on v0.8.x to make a nearly fully functional AP-ALB node. Similar to the issue with RHEL/CentOS 8, these OSs also did not have a more official way to manage OpenResty and its dependencies.

A very short explanation of the status would be that, if the user already has OpenResty installed in the default paths (and maybe overrides some variables that I disabled in vars/os-family), they are likely to get an experience similar to the other OSs.

As for HAProxy in particular, most of these OSs already have some reliable repository. So if the latest version of v0.8 did not enable the installation by default, it was just a last-minute lack of time to make it run more smoothly. FreeBSD 12, for example, to my surprise already ships with HAProxy 2.0 (better than the 1.8 from RHEL/CentOS 8), so a node without OpenResty compiled from source could actually be very reliable on FreeBSD.
