Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean up the PKI securitydomain when removing a server #5915

Closed
wants to merge 2 commits into from

Conversation

rcritten
Copy link
Contributor

PKI has its own internal knowledge of servers and services
in its securitydomain. This has not been cleaned up in the
past but is becoming more of an issue as PKI now relies on its
securitydomain for more things, and it has a healthcheck that
reports inconsistencies.

Removing entries is straightforward using the PKI REST API.

In order to operate on the API access is needed. There was an
unused Security Domain Administrators group that I've added to
the resourceACLS we created for managing the securitydomain.
The ipara user is added as a member of this group. The REST
API binds to the CA using the IPA RA certificate.

Related commits are b3c2197
and ba4df64.

These resourceACLS were originally created as a backwards
compatibility mechanism for dogtag v9 and later only created when a
replica was installed purportedly to save a restart. I don't see
any reason to not have these defined. They are apparently needed due
to the PKI database upgrade issues.

In any case if the purpose was to suppress these ACLS it failed
because as soon as a replica with a CA was installed they were as
well, and we need this ACL in order to manage the securitydomain.

https://pagure.io/freeipa/issue/8930

Signed-off-by: Rob Crittenden rcritten@redhat.com

@rcritten rcritten added the ipa-4-9 Mark for backport to ipa 4.9 label Jul 21, 2021
@rcritten rcritten force-pushed the issue_8930 branch 4 times, most recently from 3d3c9ba to af4744e Compare July 22, 2021 11:48
@@ -753,6 +753,9 @@ def pre_callback(self, ldap, dn, *keys, **options):
self._ensure_last_of_role(
pkey, ignore_last_of_role=options.get('ignore_last_of_role', False)
)
with self.api.Backend.ra_securitydomain as domain_api:
domain_api.delete_domain(pkey, 'KRA')
domain_api.delete_domain(pkey, 'CA')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the topology is CA-less, this fails with:

# ipa server-del replica.ipa.test
Removing replica.ipa.test from replication topology, please wait...
ipa: ERROR: an internal error has occurred

The code must be called only when CA is enabled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed and added a CAless test to the temp commit. Nice catch, thanks.

@rcritten
Copy link
Contributor Author

rcritten commented Aug 9, 2021

test_integration/test_server_del.py::TestLastServices::test_forced_removal_of_master due to KDC being unavailable. Not sure why this would be affected, going to re-run it.

@rcritten rcritten added the re-run Trigger a new run of PR-CI label Aug 9, 2021
@freeipa-pr-ci freeipa-pr-ci removed the re-run Trigger a new run of PR-CI label Aug 9, 2021
@flo-renaud
Copy link
Contributor

The real issue is this one:

INFO     ipatests.pytest_ipa.integration.host.Host.replica1.IPAOpenSSHTransport:transport.py:391 RUN ['ipa', 'server-del', 'master.ipa.test', '--ignore-last-of-role']
DEBUG    ipatests.pytest_ipa.integration.host.Host.replica1.cmd139:transport.py:513 RUN ['ipa', 'server-del', 'master.ipa.test', '--ignore-last-of-role']
DEBUG    ipatests.pytest_ipa.integration.host.Host.replica1.cmd139:transport.py:557 ipa: ERROR: cannot connect to 'https://master.ipa.test:443/ca/rest/account/login': [Errno 111] Connection refused
DEBUG    ipatests.pytest_ipa.integration.host.Host.replica1.cmd139:transport.py:557 Removing master.ipa.test from replication topology, please wait...
DEBUG    ipatests.pytest_ipa.integration.host.Host.replica1.cmd139:transport.py:217 Exit code: 1

If the last CA is force-removed, the server-del probably removes the CA role from the master before it tries to call the CA to update the security domain. I suspect the code is failing because it doesn't know which server to contact for the security domain op.

@rcritten
Copy link
Contributor Author

rcritten commented Aug 9, 2021

Ok. I added a broad try/except. I don't think this should block removal but the user should be notified that the server hasn't been removed from the security domain.

@flo-renaud
Copy link
Contributor

Hi @rcritten
the test_server_del error happened because the previous test did not cleanup properly.

    def test_removal_of_server_raises_error_about_last_kra(self):
        """
        test that removal of server fails on the last KRA

        We shut it down to verify that it can be removed if it failed.
        """
        tasks.install_kra(self.master)
        self.master.run_command(['ipactl', 'stop'])
        tasks.assert_error(
            tasks.run_server_del(self.replicas[0], self.master.hostname),
            "Deleting this server is not allowed as it would leave your "
            "installation without a KRA.",
            1
        )
        # Restarting the server we stopped is not necessary as it will
        # be removed in the next test.

After test_removal_of_server_raises_error_about_last_kra, ipa services are stopped on the master and that explains the failure detected by PRCI. Could you fix this test (stop/restart ipa services in a fixture)?
We will need the try/except modification that you added anyway (for instance if calling server-del for a server that crashed, was decommissioned and was the last CA).

@rcritten
Copy link
Contributor Author

Huh, that test worked fine locally. I wonder if I specifically just ran that one test instead of the whole suite. I'll add a fixture.

@rcritten rcritten force-pushed the issue_8930 branch 3 times, most recently from 9dc6c71 to a2a2ae8 Compare August 10, 2021 18:43
@flo-renaud
Copy link
Contributor

Hi @rcritten
thanks for the updated patch, WFM. You can remove the temp commit and I will ack when prci completes.

PKI has its own internal knowledge of servers and services
in its securitydomain. This has not been cleaned up in the
past but is becoming more of an issue as PKI now relies on its
securitydomain for more things, and it has a healthcheck that
reports inconsistencies.

Removing entries is straightforward using the PKI REST API.

In order to operate on the API access is needed. There was an
unused Security Domain Administrators group that I've added to
the resourceACLS we created for managing the securitydomain.
The ipara user is added as a member of this group. The REST
API binds to the CA using the IPA RA certificate.

Related commits are b3c2197
and ba4df64.

These resourceACLS were originally created as a backwards
compatibility mechanism for dogtag v9 and later only created when a
replica was installed purportedly to save a restart. I don't see
any reason to not have these defined. They are apparently needed due
to the PKI database upgrade issues.

In any case if the purpose was to suppress these ACLS it failed
because as soon as a replica with a CA was installed they were as
well, and we need this ACL in order to manage the securitydomain.

https://pagure.io/freeipa/issue/8930

Signed-off-by: Rob Crittenden <rcritten@redhat.com>
For every server-del ensure that the server being deleted is
also removed from the PKI securitydomain.

https://pagure.io/freeipa/issue/8930

Signed-off-by: Rob Crittenden <rcritten@redhat.com>
@rcritten
Copy link
Contributor Author

vagrant provisioning errors, re-running.

@rcritten rcritten added the re-run Trigger a new run of PR-CI label Aug 12, 2021
@freeipa-pr-ci freeipa-pr-ci removed the re-run Trigger a new run of PR-CI label Aug 12, 2021
@flo-renaud flo-renaud added the ack Pull Request approved, can be merged label Aug 13, 2021
@flo-renaud flo-renaud added the pushed Pull Request has already been pushed label Aug 16, 2021
@flo-renaud
Copy link
Contributor

master:

  • db69855 Clean up the PKI securitydomain when removing a server
  • c0d6c05 ipatests: Verify that securitydomain is updated on server-del

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ack Pull Request approved, can be merged ipa-4-9 Mark for backport to ipa 4.9 pushed Pull Request has already been pushed
Projects
None yet
3 participants