
Ballot SC34: Account Management #210

Closed
tobij wants to merge 2 commits from the ballot/SC34_account_management branch

Conversation

@tobij commented Aug 24, 2020

Purpose of Ballot

Encourage the use of continuous monitoring for the identification of
access permissions on Certificate Systems that are no longer necessary
for operations.

Motivation

Section 2(j) of the Network and Certificate System Security Requirements
("NSRs") provides that:

> Each CA or Delegated Third Party SHALL: [...] Review all system
> accounts at least every three (3) months and deactivate any accounts
> that are no longer necessary for operations;

Effectiveness of Human Reviews

This wording suggests that CAs should identify and remove obsolete access
permissions by performing human reviews of their Certificate Systems. In
a large CA environment, consisting of numerous systems and accounts,
such a human review is impractical to perform and therefore likely to be
less effective than the use of a monitoring solution.

Internal Consistency of the NSRs

Ballot SC29 made it a requirement that CAs continuously monitor
security-relevant configurations and alert on unauthorized changes so
that these can be addressed within at most twenty-four (24) hours.

This ballot proposes the use of a similar approach for access
permissions. It would be incoherent to require CAs to address security
relevant misconfigurations within 24 hours while allowing (a maximum of)
90 days for the detection of obsolete access permissions.

Terminology

This ballot proposes that the requirement in Section 2(j) shall apply to
"accounts and access permissions" generally. In its current version
Section 2(j) only applies to "system accounts" but that term was
considered ambiguous by the NetSec Subcommittee. A mapping of similar
provisions across the WebTrust Principles and Criteria yielded that all
types of accounts should be in scope of Section 2(j) because the
corresponding WebTrust requirements apply to "Access rights" and
"Logical access controls" generally. It would be surprising if there was
no corresponding requirement in the NSRs.

Data Sources

The Subcommittee further considered whether a recommendation could be
added that CAs "SHOULD" perform some type of automatic comparison
between access configurations and HR systems, but did not want to
dictate one particular data source or method of implementation. Instead,
the Subcommittee believes that the CA's auditor will assess as part of
its test of design whether the data sources are appropriate for the
stated purpose of the requirement, namely to identify whether the
accounts and permissions are still "necessary for operations".
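
For illustration only (this comparison is not part of the proposed requirement, and the file formats, paths, and field names below are hypothetical assumptions), such a check against an HR data source might look roughly like:

```python
# Hypothetical sketch only: flag accounts that no longer map to an active
# employee. The CSV export format, its field names ("username", "status"),
# and the UID >= 1000 convention are illustrative assumptions, not
# anything this ballot prescribes.
import csv

def active_employees(hr_export_path):
    """Usernames of employees marked active in a (hypothetical) HR export."""
    with open(hr_export_path, newline="") as f:
        return {row["username"] for row in csv.DictReader(f)
                if row["status"] == "active"}

def human_accounts(passwd_path="/etc/passwd"):
    """Account names from a passwd-style file, skipping daemon accounts
    below the conventional human-account UID threshold of 1000."""
    names = set()
    with open(passwd_path) as f:
        for line in f:
            if not line.strip():
                continue
            name, _, uid, *_ = line.strip().split(":")
            if int(uid) >= 1000:
                names.add(name)
    return names

if __name__ == "__main__":
    for name in sorted(human_accounts() - active_employees("hr_export.csv")):
        print(f"ALERT: account {name!r} has no active HR record")
```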

@sleevi (Contributor) commented Aug 25, 2020

> In a large CA environment, consisting of numerous systems and accounts, such a human review is impractical to perform and therefore likely to be less effective than the use of a monitoring solution.

I don't understand this positioning of the ballot, and as worded, I think it'd be quite difficult for Google to support.

That is, this makes a compelling argument for why automatic scanning no doubt plays an important role within CAs, and I'm all in favor of making such automatic scanning normative, if we believe CAs are not naturally adopting it as part of a risk/security management posture. However, given the significant issues we've seen with CAs with respect to misconfigurations of critical systems, it seems equally important that we should be looking at defense in depth; that is, rather than "either manual OR automatic" we should be looking at "BOTH manual AND automatic" scanning.

By every objective measure, this seems to be a strict worsening of security for CAs when contemplating failure modes. 6 months to detect an issue is significantly worse than 3 months, which is itself already admittedly troubling.

@tobij (Author) commented Aug 25, 2020

Just curious, not trying to be funny, but when you speak of "manual scanning", do you think of something substantially different from, e.g., perusing `less /etc/passwd`?

@sleevi (Contributor) commented Aug 25, 2020

I'm not sure how to meaningfully interpret your question, perhaps because it's making unclear assumptions about a CA's implementation that aren't stated here.

Assuming such a system were a unified authentication system, which seems exceedingly unlikely in the "large CA" example you set forward, examining the accounts that have access, sampling to ensure that access was correct, and cross-sampling HR events (e.g. promotion, termination) with account access and privileges "seems" like a bare minimum. Nominally, this is where a Detailed Control Report would be more illustrative as to what the controls are and what tests the auditor performs.

However, I see significant danger in a system that says "Every time someone is terminated, we manually update Scanner X, and Scanner X makes sure systems Y and Z are correct", particularly in the "update Scanner X" phase. It's turtles all the way down, and that's why you want robust controls that emphasize a multi-party examination.

@tobij (Author) commented Aug 25, 2020

I am not making any assumptions; I was just wondering if you did - in a way.

To be more helpful: I am surprised by your use of "manual scanning". I now wonder if you understand "review" to be much more than a human "scrolling" through the account database (and maybe, who knows, with a little luck, the immediately adjacent configuration that makes the account database in question "authoritative" and "exclusive"), painstakingly and diligently checking the accounts against suitable lists or adequate knowledge of who still works at the CA and what they should have access to. Specifically, maybe in your experience and/or expectation, some sort of additional tooling would be employed to perform the human review?

@sleevi (Contributor) commented Aug 25, 2020

Right, I'm supportive of tooling, but I don't think it can replace the necessity for a human-involved scan. If anything, we should be looking to increase the frequency rather than decrease it, and to support the human rather than replace them.

I don't think the question of how it's interpreted is going to be solved by the NCSSRs; the only way we're going to improve this ecosystem is through more transparency about precisely what happens, when, and how, and that effectively necessitates greater transparency by the CAs and auditors than is presently employed. In the absence of that, any option such as promoting more automation, or automation-only solutions, enables further regressions in quality while still meeting the perfunctory obligations.

At the core, this is about changing "or" into "and" (or into "should also", to be less strict), and it seems like that meets the purpose of "encouraging".

@tobij (Author) commented Aug 26, 2020

I agree that changing from "or" to "and" would, on a fundamental level, address both "your" concerns and the concerns underlying this (draft) ballot. However, I feel I am missing something here that, if I understood it, might potentially lead to a different, more elegant suggestion with more synergies in implementation.

Specifically, it seems that understanding the value you see in human review would be very helpful to me, considering I see little to almost none. Maybe that is indeed connected to assumptions of mine; for example, I do assume that the human review (as per the current NCSSRs) precludes automation and/or tooling, at the very least above some threshold of complexity.

If you considered some levels of automation and/or tooling to be acceptable for performing the review (in accordance with the current NCSSRs), I could much better understand the value you place on the human element and why you consider this (draft) ballot to lead to a worse situation if passed with "or".

At the same time, all automation we intend to encourage and foster with this ballot is, in my mind, clearly under the control of people, working with and for people.

So maybe this is actually just a question of how much automation/tooling is "too much" or might be used by CAs to dodge their responsibilities ("monitoring software failed; hardly anyone's fault!!")? Or maybe the proposal is entirely unsuitable at least in the sense that all automation/tooling considered beneficial would be considered acceptable already under the current NCSSRs?

Otherwise, if you actually were of the opinion that non-automated human review (my twisted imagination pictures a Trusted Role spending two workdays every 3 months looking at /etc/{passwd,group,{,g}shadow} on plenty of machines in the most manual way possible) provides great value, I really would not mind understanding the mechanism at play there as well, for maybe it could be captured in its essence and retained in a more automated solution.

Or maybe what I am missing is something else entirely?

@sleevi (Contributor) commented Aug 26, 2020

> At the same time, all automation we intend to encourage and foster with this ballot is, in my mind, clearly under the control of people, working with and for people.
>
> So maybe this is actually just a question of how much automation/tooling is "too much" or might be used by CAs to dodge their responsibilities ("monitoring software failed; hardly anyone's fault!!")? Or maybe the proposal is entirely unsuitable at least in the sense that all automation/tooling considered beneficial would be considered acceptable already under the current NCSSRs?

At the core, the issue is one of transparency and consistency: do we understand what happens and why, and does it consistently happen regardless of the CA? We know this is a deficiency industry-wide, and not specific to the NCSSRs: we see wildly inconsistent interpretations that are unconscionably bad ideas (e.g. auto-updates on CA systems).

Absent improvements to the transparency (of audit reports) and the consistency (of CAs and auditors), proposals which offer CAs additional flexibility or interpretation are regressions, rather than improvements. If the goal is to encourage something, either SHOULD or new normative requirements can fulfill that goal. If the goal is to improve an existing practice, then I don't believe we'll meaningfully achieve that without either more transparency or more normative requirements.

To this specific ballot, we've seen a number of CAs have repeat failures with "automatic" controls, such as detecting affected certificates, as well as with manual controls (such as validation of O, C, J, etc.). We know both are failure prone, and the mitigation tends to be a combination of "involving more humans, more carefully" and "more transparency" (e.g. Ballot SC30). We have also focused on trying to reduce the "time to detect failure" - e.g. certificate linting (pre- and post-issuance), regular sampling reviews, etc.

This Ballot seems to go against those trends in two key areas:

  • It increases the time to detect failure
  • It reduces the transparency and accountability controls (from "a human did it" to "the system is at fault")

I totally understand the apprehension that the NCSSRs currently describe a process rather than an objective, and that this may be causing some CAs pain, because they see opportunities to improve the process. Without the information from the past 5 years re: CA incidents, it'd be extremely tempting to allow CAs to achieve the objective however they see fit, without specifying the process. But as we saw with BR 3.2.2.4, that "any other (equivalent) method" is a recipe for security disaster. I worry this ballot is closer to that "any equivalent method" approach, hence the concern.

So how do we resolve this? How do we reduce the pain for CAs?

  • I think we increase the 'detection' requirement for correctness. 3 months? Why not 1 week?
  • I think we increase the 'transparency' requirement. This isn't easy to do as a one-off. Saying "document in your CP/CPS" still doesn't provide the same degree of transparency or consistency; this is likely something at the audit level.
  • In the interim, if the goal is solely to encourage X, then we can just do that, while still keeping the normative process and objectives stated.

@wthayer (Contributor) commented Aug 26, 2020

This ballot seems to conflate the account management process (automated, manual or some combination) with validation of that process. If my process is automated, I am 'continuously monitoring' and thus don't need to periodically verify that access permissions are correct (or at least I don't need to do it as often - it's not clear that reviewing 'access permission configurations' extends to actually auditing access rights). So I also think this is a step back from the current requirements.

It would also be helpful to be more specific about the scope of 'accounts and access permissions'. The prior language 'system accounts' is also vague but seems to imply that it's referring to certificate systems. Are network devices in scope? Offline CA systems? Physical security systems? I think the answer should be Yes to all, but I also suspect that many CAs are unable to 'continuously monitor' all of these systems.

Finally, the term 'continuously monitoring' is also vague. Does a job that runs daily constitute 'continuous monitoring'? Can this be defined in terms of the time it takes to detect a problem?

@tobij (Author) commented Aug 26, 2020

> This ballot seems to conflate the account management process (automated, manual or some combination) with validation of that process. If my process is automated, I am 'continuously monitoring' and thus don't need to periodically verify that access permissions are correct (or at least I don't need to do it as often - it's not clear that reviewing 'access permission configurations' extends to actually auditing access rights). So I also think this is a step back from the current requirements.

The conflation of management and verification is indeed something I actually thought would be beneficial to see. In particular, I would have considered e.g. (some suitable) configuration management solutions/approaches to constitute their own immediate continuous monitoring.

My thinking there is that, first of all, review needs to be performed by Trusted Roles (as ideally nobody else would have the access required to perform it). Right at this point I suspect we end up with people reviewing the very systems they are responsible for on a day to day basis, and if these people are any good at their job relative to the responsibility placed onto them, they will of course at all times know and/or "know" that their systems are in fantastic shape.

I further assume that these people then need to invest what can probably amount to more than one workday into reviewing system accounts using only comparatively primitive means (so as to ensure they performed a review and did not just run some programs instead).

After 4 hours of reviewing, say, /etc/passwd, /etc/shadow, /etc/group and /etc/gshadow on one system after the next, and a lot of scrolling, a user crond shows up on the screen but evades the Trusted Role's attention. Or maybe it is user cron, but what is missed is that the account has a password and a shell set.

It seems possible to me that this might actually be the best case scenario we can expect from the current NCSSRs.

Maybe I am way off here; I have a vivid imagination sometimes. But in case this is not extremely unrealistic, I would have thought we might be in a better place if we instead tried to ensure that the states we review for either cannot exist in principle (forgetting to delete account X from system Y vs. configuration management) or cannot persist undetected beyond short periods of time (attacks and misconfigurations vs. continuous monitoring), outside of a relatively compact description or definition ("approved access permission configurations"), at least for those systems in scope of 2(j) that can be covered by such solutions.
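
To make that concrete: a minimal sketch of what a check against such a compact "approved access permission configuration" could look like (the approved-state format and the specific fields are illustrative assumptions of mine, not anything the current NCSSRs or this draft specify):

```python
# Minimal illustrative sketch: every account on the system must appear in
# a compact approved state, and daemon accounts must stay locked and
# shell-less. The APPROVED format is an assumption for illustration.
# (Reading /etc/shadow requires appropriate privileges.)
APPROVED = {
    "root": True,   # True: account may log in
    "alice": True,
    "cron": False,  # daemon account: no password, no login shell
}

NOLOGIN_SHELLS = {"/usr/sbin/nologin", "/sbin/nologin", "/bin/false"}

def drift_alerts(passwd_path="/etc/passwd", shadow_path="/etc/shadow"):
    # Accounts whose shadow password field marks them as locked.
    locked = set()
    with open(shadow_path) as f:
        for line in f:
            if not line.strip():
                continue
            name, pwhash, *_ = line.split(":")
            if pwhash.startswith(("!", "*")):
                locked.add(name)
    alerts = []
    with open(passwd_path) as f:
        for line in f:
            if not line.strip():
                continue
            name, _, _, _, _, _, shell = line.strip().split(":")
            if name not in APPROVED:
                # Catches the unexpected 'crond' a scrolling human misses.
                alerts.append(f"unapproved account {name!r}")
            elif not APPROVED[name] and (
                    name not in locked or shell not in NOLOGIN_SHELLS):
                # Catches 'cron' with a password and login shell set.
                alerts.append(f"daemon account {name!r} can log in")
    return alerts

if __name__ == "__main__":
    for alert in drift_alerts():
        print("ALERT:", alert)
```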

@sleevi (Contributor) commented Aug 27, 2020 via email

@wthayer (Contributor) commented Aug 31, 2020

> My thinking there is that, first of all, review needs to be performed by Trusted Roles (as ideally nobody else would have the access required to perform it). Right at this point I suspect we end up with people reviewing the very systems they are responsible for on a day to day basis, and if these people are any good at their job relative to the responsibility placed onto them, they will of course at all times know and/or "know" that their systems are in fantastic shape.
>
> I further assume that these people then need to invest what can probably amount to more than one workday into reviewing system accounts using only comparatively primitive means (so as to ensure they performed a review and did not just run some programs instead).
>
> After 4 hours of reviewing, say, /etc/passwd, /etc/shadow, /etc/group and /etc/gshadow on one system after the next, and a lot of scrolling, a user crond shows up on the screen but evades the Trusted Role's attention. Or maybe it is user cron, but what is missed is that the account has a password and a shell set.

In my experience this isn't quite how the review should work. Yes, someone in a 'system administration' trusted role will need to gather account data from all in-scope systems. I suspect this is often automated. Then someone in a 'compliance' trusted role is tasked with crunching the data, identifying exceptions, documenting the review and driving remediation.
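
For instance, the "crunching" step might be as simple as the following sketch (the per-system JSON dump format, file names, and field names are hypothetical assumptions for illustration, not a description of any particular CA's process):

```python
# Hypothetical sketch of the compliance role's "crunch the data" step:
# merge per-system account dumps gathered by system administrators and
# list exceptions against per-system approved account sets. The JSON dump
# format ("system", "accounts") is an assumption for illustration.
import glob
import json

def exceptions(approved, dump_glob="dumps/*.json"):
    """Return (system, account) pairs present in a dump but not approved."""
    found = []
    for path in glob.glob(dump_glob):
        with open(path) as f:
            dump = json.load(f)
        for account in dump["accounts"]:
            if account not in approved.get(dump["system"], set()):
                found.append((dump["system"], account))
    return found

# Example: approved accounts per system, maintained by the CA.
approved = {"ca-signer-01": {"root", "alice"}, "ocsp-02": {"root", "bob"}}
for system, account in exceptions(approved):
    print(f"exception: {account!r} on {system} needs review/remediation")
```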

> It seems possible to me that this might actually be the best case scenario we can expect from the current NCSSRs.

> Maybe I am way off here; I have a vivid imagination sometimes. But in case this is not extremely unrealistic, I would have thought we might be in a better place if we instead tried to ensure that the states we review for either cannot exist in principle (forgetting to delete account X from system Y vs. configuration management) or cannot persist undetected beyond short periods of time (attacks and misconfigurations vs. continuous monitoring), outside of a relatively compact description or definition ("approved access permission configurations"), at least for those systems in scope of 2(j) that can be covered by such solutions.

To the extent that CAs are able to achieve 'continuous monitoring' within the scope of this requirement, I agree. But I still think there is value in a periodic human review as a "belt + suspenders" approach to access management.

Tobias S. Josefowitz added 2 commits May 13, 2021 17:41
It seems to be more natural this way. By appending to the end of Section
2, we avoid confusion with references to the provisions. So 2j becomes
2q and 2l becomes 2p.
@tobij force-pushed the ballot/SC34_account_management branch from b01398c to 63d2d1e on May 13, 2021 15:45
@tobij closed this Oct 26, 2021