Draft: Add support for configuring HA cluster #62
tests/tests_configure_ha_cluster.yml
Outdated
permanent: true
runtime: true
include_role:
  name: fedora.linux_system_roles.firewall
hmm - I guess this is ok for now - for CI you'll need to add a meta/requirements.yml with the fedora.linux_system_roles collection:
collections:
  - name: fedora.linux_system_roles
tox -e qemu[-ansible-core-x.xx] will also use this file to install necessary collections
I'm planning to take a look at the CI and how to implement it a bit later.
32d6264 to 48127f2
[citest pending] |
[citest pending] |
@richm |
[citest pending] |
[citest bad] |
1a64fb4 to 42780c5
I'll also need to add comments to new tasks files to describe why these files are in |
@tomjelinek can you please take a look at this PR and review the usage of the ha_cluster role? |
tasks/main.yml
Outdated
- name: with-rsc-role
  value: Master
This is already specified as `resource_leader.role`. Please remove these two lines.
tasks/main.yml
Outdated
ha_cluster_constraints_colocation:
  - resource_leader:
      id: ag_cluster-clone
      role: master
Suggested change: replace `role: master` with `role: Promoted`.
`Master` is deprecated and should not be used. The ha_cluster role will automatically change `Promoted` to `Master` if needed.
README.md
Outdated
2.4. Configure the user provided with the ``mssql_ha_login` variable for
Pacemaker.
3. Include the System Roles ha_cluster role to configure pacemaker.
Note that this role does not configure STONITH devices in Pacemaker.
This was probably done because STONITH depends on the customer's environment and thus cannot be defined by the role. However, running a cluster without STONITH is unsupported, as it may lead to data corruption and other issues. I suggest adding a variable, e.g. `mssql_ha_stonith_resources`, as a list defaulting to `[]`, adding the list to `ha_cluster_resource_primitives`, and instructing users to define their STONITH resources.
Not all environments include devices for traditional STONITH. Therefore, SBD variables should be exposed in a similar way to make it possible to configure SBD STONITH in such cases.
My intent here was to avoid making the mssql role wrap around the ha_cluster role much. We are going to provide example playbooks that will run the mssql role and then run the ha_cluster role with all required ha_cluster-related variables.
I thought that setting `ha_cluster_resource_primitives` is not required because `stonith-enabled` is set to `true` by default, and it seemed to work. Could you please provide an example of what the `mssql_ha_stonith_resources` value should look like?
In the mssql role, I guess adding `mssql_ha_stonith_resources` would look like this?
ha_cluster_resource_primitives:
- "{{ mssql_ha_stonith_resources }}"
- id: ag_cluster
...
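A note on the fragment above: a templated list item is inserted as a single element, so `mssql_ha_stonith_resources` (itself a list) would end up nested inside `ha_cluster_resource_primitives` rather than merged into it. A minimal sketch of concatenating the two lists instead, assuming a hypothetical helper variable `__mssql_ha_ag_primitives` that holds the role's own primitives:

```yaml
# Sketch only. __mssql_ha_ag_primitives is a hypothetical helper variable;
# mssql_ha_stonith_resources and ha_cluster_resource_primitives are the
# names used in this discussion.
mssql_ha_stonith_resources: []  # user-supplied STONITH primitives, empty by default

__mssql_ha_ag_primitives:
  - id: ag_cluster
    # remaining keys of the primitive omitted, as in the fragment above

# Jinja2 list concatenation yields one flat list of primitives.
ha_cluster_resource_primitives: "{{ mssql_ha_stonith_resources + __mssql_ha_ag_primitives }}"
```

With the default empty list, the result is just the role's own primitives, so users who define no STONITH resources get the same configuration as before.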
> We are going to provide example playbooks that will run the mssql role and then run the ha_cluster role with all required ha_cluster-related variables.
Ok, that would work for me. Just bear in mind this part of the ha_cluster role readme: "The role replaces the configuration of HA Cluster on specified nodes. Any settings not specified in the role variables will be lost."
> I thought that setting `ha_cluster_resource_primitives` is not required because `stonith-enabled` is set to `true` by default
That merely enables stonith. You still need to tell the cluster which stonith devices it can use, how it can connect to them, and which nodes they can fence. That is done by creating stonith resources. For further details, see for example:
- https://clusterlabs.org/pacemaker/doc/2.1/Clusters_from_Scratch/html/fencing.html
- https://clusterlabs.org/pacemaker/doc/2.1/Pacemaker_Explained/html/fencing.html
- https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_high_availability_clusters/assembly_configuring-fencing-configuring-and-managing-high-availability-clusters
> Could you please provide an example of what mssql_ha_stonith_resources value should look like?
ha_cluster_resource_primitives:
  - id: myapc
    agent: stonith:fence_apc_snmp
    instance_attrs:
      - attrs:
          - name: ipaddr
            value: apc-switch.example.com
          - name: pcmk_host_map
            value: rhel8-node1.example.com:1;rhel8-node2.example.com:2
          - name: login
            value: apc
          - name: passwd
            value: apc
> Ok, that would work for me. Just bear in mind this part of the ha_cluster role readme: "The role replaces the configuration of HA Cluster on specified nodes. Any settings not specified in the role variables will be lost."
Oh, so in the case where a user runs the mssql role and then runs the ha_cluster role with `ha_cluster_resource_primitives` containing one item, the configuration applied by the mssql role would be removed? In this case, the mssql role would need to completely wrap around the ha_cluster role, as would any other role that uses ha_cluster.
@tomjelinek, in the case where a user runs the `mssql` role and then runs the `ha_cluster` role with `ha_cluster_resource_primitives` containing one item, the cluster configuration applied previously by the `mssql` role would be removed?
One possible outcome is that we document that you cannot use the mssql role in conjunction with the ha_cluster role on the same set of hosts. I think this is ok. If you want to manage mssql and ha_cluster in the same playbook, you'll have to create separate host groups for mssql and ha_cluster, and run neither the ha_cluster role against the mssql host group nor the mssql role against the ha_cluster host group.
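The host-group separation described here could look roughly like the following playbook. This is a sketch under assumptions: the group names `mssql_ha_nodes` and `generic_cluster_nodes` are hypothetical, and the variables each role requires are left out.

```yaml
# Sketch only: group names are hypothetical and required role variables are omitted.
- name: Configure SQL Server with HA (the mssql role manages the cluster on these hosts)
  hosts: mssql_ha_nodes
  roles:
    - microsoft.sql.server

- name: Configure unrelated clusters directly with the ha_cluster role
  hosts: generic_cluster_nodes
  roles:
    - fedora.linux_system_roles.ha_cluster
```

The two groups must not share hosts, because the ha_cluster role replaces the cluster configuration on every node it runs against.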
> @tomjelinek, in the case where a user runs the `mssql` role and then runs the `ha_cluster` role with `ha_cluster_resource_primitives` containing one item, the cluster configuration applied previously by the `mssql` role would be removed?
In the case where a user runs the `mssql` role and then runs the `ha_cluster` role (with any settings), the cluster configuration applied previously by the `mssql` role would be removed.
It is unfortunate, but it is the best I can do now. Making the `ha_cluster` role work in a way where users can specify only partial cluster configuration and the role would not touch other parts of the cluster requires a lot more resources.
> It is unfortunate, but it is the best I can do now. Making the ha_cluster role work in a way where users can specify only partial cluster configuration and the role would not touch other parts of the cluster requires a lot more resources.
I think this is fine. We just document that you cannot use the ha_cluster role on mssql nodes, and vice versa.
[citest] |
README.md
Outdated
[`mssql_ha_cluster_print_vars:`](#mssql_ha_cluster_print_vars) `true` to print
planned `ha_cluster` variables. Then you can merge the printed variables with
your custom `ha_cluster` variables and specify the resulting set of variables
with the `microsoft.sql.server` role invocation.
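As an illustration of the workflow in this README excerpt, a first run might only enable printing. This is a sketch: the group name `mssql_ha_nodes` is hypothetical, and the other variables the role requires are omitted.

```yaml
# Sketch only: mssql_ha_nodes is a hypothetical group; other required
# role variables are omitted.
- name: Print the ha_cluster variables the mssql role plans to use
  hosts: mssql_ha_nodes
  vars:
    mssql_ha_cluster_print_vars: true
  roles:
    - microsoft.sql.server
```

The printed variables would then be merged with any custom `ha_cluster` variables and supplied in the `vars` of a later run of the same role.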
What is the use case for this? Is it that you are using the ha_cluster role and the mssql role in the same playbook, you are configuring mssql for HA, and you want to ensure you use the same HA parameters for both ha_cluster and mssql? I guess this is for when you first run mssql and configure it for HA, then do a subsequent playbook run with ha_cluster (with or without mssql), and you want to ensure that you do not overwrite the HA cluster settings used when deploying mssql?
Or vice versa - you deploy ha_cluster role first, then you want to grab those ha_cluster variables to use with the mssql role? If that's the case, then it seems like we need support in the ha_cluster role - have it export the settings in a format that you can use to configure something else to use ha_cluster on those same machines. Other roles have a similar feature - the ability to export the role configuration:
- linux-system-roles/kernel_settings#58
- linux-system-roles/firewall#83
(these are wip, but I believe some other roles have already implemented this)
Right, we need functionality to export settings in both the ha_cluster role and in mssql or other roles that include ha_cluster. It seems that here I do the same thing as in linux-system-roles/kernel_settings#58, but with pure Ansible.
[citest pending] |
Add firewall, ha_cluster roles, required vars
- Take setup tasks from ha_cluster role, the mssql_ha_replica_type fact
- Print sqlcmd output only when it's available
- Add HA functionality description to README.md
- Add dependency for fedora.linux-system-roles in galaxy.yml
- Do not print output of gathering facts because output is too long with -v
- Wait for mssql-server to prepare for client connections after restart
- Only print sqlcmd output when it exists
- Add tests for templates, fix bugs in templates
- Add error message when running against RHEL < 8
- Add mssql_ha_cluster_run_role and other related variables
- Add mssql_ha_cluster_run_role
- Add mssql_ha_stonith_resources
- Add mssql_ha_cluster_print_vars
TODO: