# Security Operations & Administration

As an SSCP, you'll have to help people and organizations identify their information security needs, build the systems to secure their information, and keep that information secure.

## Comply with Codes of Ethics

Privacy is freedom from intrusion, and security is the protection of something or someone from loss, harm, or injury, now or in the future. 

Whether it's the business of business, the functions of government, or the actions and choices of individuals in our society, we can see that information is what makes everything work. Information provides the context for our decisions; it’s the data about price and terms that we negotiate about as buyers or sellers, and it’s the weather forecast that’s part of our choice to have a picnic today at the beach. Three characteristics of information are key to our ability to make decisions about anything:

* If it is publicly known, we must have confidence that everybody knows it or can know it; if it is private to us or those we are working with, we need to trust that it stays private or confidential.

* The information we need must be reliable. It must be accurate enough to meet our needs and come to us in ways we can trust. It must have integrity.

* The information must be there when we need it. It must be available.

Those three attributes or characteristics - the confidentiality, integrity, and availability of the information itself - reflect the needs we all have to be reasonably sure that we are making well-informed decisions, when we have to make them, and that our competitors (or our enemies!) cannot take undue or unfair advantage over us in the process. Information security practitioners refer to this as the CIA of information security. Every information user needs some CIA; for some purposes, you need a lot of it; for others, you can get by with more uncertainty (or "less CIA").

### $(ISC)^{2}$ Code of Ethics

$(ISC)^{2}$ provides us with a **Code of Ethics**, and to be an SSCP you agree to abide by it. It is short and simple. It starts with a **preamble**, which we quote in its entirety:

> The safety and welfare of society and the common good, duty to our principals, and to each other, requires that we adhere, and be seen to adhere, to the highest ethical standards of behavior.

> Therefore, strict adherence to this Code is a condition of certification.

Let's operationalize that preamble - take it apart, step by step, and see what it really asks of us:

1. Safety and welfare of society: Allowing information systems to come to harm because of the failure of their security systems or controls can lead to damage to property, or injury or death of people who were depending on those systems operating correctly.
2. The common good: All of us benefit when our critical infrastructures, providing common services that we all depend on, work correctly and reliably.
3. Duty to our principals: Our duties to those we regard as leaders, rulers, or our supervisors in any capacity.
4. Our duty to each other: To our fellow SSCPs, others in our profession, and to others in our neighborhood and society at large.
5. Adhere and be seen to adhere to: Behave correctly and set the example for others to follow. Be visible in performing our job ethically (in adherence with this Code) so that others can have confidence in us as a professional and learn from our example.

The code is equally short, containing four **canons** or principles to abide by:

> Protect society, the common good, necessary public trust and confidence, and the infrastructure.

> Act honorably, honestly, justly, responsibly, and legally.

> Provide diligent and competent service to principals.

> Advance and protect the profession.

The canons do more than just restate the preamble's two points. They show us how to adhere to the preamble. We must take action to protect what we value; that action should be done with honor, honesty, and justice as our guide. Due care and due diligence are what we owe to those we work for (including the customers of the businesses that employ us).

The final canon addresses our continued responsibility to grow as a professional. We are on a never-ending journey of learning and discovery; each day brings an opportunity to make the profession of information security stronger and more effective. We as SSCPs are members of a worldwide community of practice - the informal grouping of people concerned with the safety, security, and reliability of information systems and the information infrastructures of our modern world.

### Organizational Code of Ethics

#### Privacy

**Privacy**, which refers to a person (or a business), is the freedom from intrusion by others into one's own life, place of residence or work, or relationships with others. Privacy means that you have the freedom to choose who can come into these aspects of your life and what they can know about you. Privacy is an element of common law, or the body of unwritten legal principles that are just as enforceable by the courts as the written laws are in many countries. It starts with the privacy rights and needs of one person and grows to treat families, other organizations, and other relationships (personal, professional, or social) as being free from unwarranted intrusion.

Businesses create and use company confidential or proprietary information almost every day. Both terms declare that the business owns this information; the company has paid the costs to develop this information (such as the salaries of the people who thought up these ideas or wrote them down in useful form for the company), which represents part of the business' competitive advantage over its competitors. Both terms reflect the legitimate business need to keep some data and ideas private to the business.

Part of the concept of privacy is connected to the reasonable expectation that other people can see and hear what you are doing, where you are (or where you are going), and who might be with you. It's easy to see this in examples; walking along a sidewalk, you have every reason to think that other people can see you, whether they are out on the sidewalk as well or looking out the windows of their homes and offices, or from passing vehicles. The converse is that when out on that public sidewalk, out in the open spaces of the town or city, you have no reason to believe that you are not visible to others. This helps us differentiate between public places and private places: 

* Public places are areas or spaces in which anyone and everyone can see, hear, or notice the presence of other people, and observe what they are doing, intentionally or unintentionally. There is little to no degree of control as to who can be in a public place. A city park is a public place.

* Private places are areas or spaces in which, by contrast, you as owner (or the person responsible for that space) have every reason to believe that you can control who can enter, participate in activities with you (or just be a bystander), observe what you are doing, or hear what you are saying. You choose to share what you do in a private space with the people you choose to allow into that space with you. By law, this is your reasonable expectation of privacy, because it is "your" space, and the people you allow to share that space with you share in that reasonable expectation of privacy.

The pervasive use of the Internet and the World Wide Web, and the convergence of personal information technologies, communications and entertainment, and computing, have blurred these lines. Your smart watch or personal fitness tracker uplinks your location and exercise information to a website, and you've set the parameters of that tracker and your Web account to share with other users, even ones you don't know personally. Are you doing your workouts today in a public or private place? Is the data your smart watch collects and uploads public or private data?

"Facebook-friendly" is a phrase we increasingly see in corporate policies and codes of conduct these days. The surfing of one's social media posts, and even one's browsing histories, has become a standard and important element of prescreening procedures for job placement, admission to schools or training programs, or acceptance into government or military service. Such private postings on the public Web are also becoming routine elements in employment termination actions. The boundary between "public" and "private" keeps moving, and it moves because of the ways we think about the information, and not because of the information technologies themselves.

The General Data Protection Regulation 2016/679 (GDPR) and other data protection regulations require business leaders, directors, and owners to make clear to customers and employees what data they collect and what they do with it, which in turn implements the separation of that data into public and private data. As an SSCP, you probably won't make specific determinations as to whether certain kinds of data are public or private, but you should be familiar with your organization's privacy policies and its procedures for carrying out its data protection responsibilities. Many of the information security measures you will help implement, operate, and maintain are vital to keeping the dividing line between public and private data clear and bright.

#### Confidentiality

Often thought of as "keeping secrets", **confidentiality** is actually about sharing secrets. Confidentiality is both a legal and ethical concept about **privileged communications** or **privileged information**. Privileged information is information you have, own, or create, and that you share with someone else with the agreement that they cannot share that knowledge with anyone else without your consent, or without due process in law. You place your trust and confidence in that other person's adherence to that agreement. Relationships between professionals and their clients, such as the doctor-patient or attorney-client ones, are prime examples of this privilege in action. Except in very rare cases, courts cannot compel parties in a privileged relationship to violate that privilege and disclose what was shared in confidence.

Confidentiality refers to how much we can trust that the information we're about to use to make a decision has not been seen by unauthorized people. The term unauthorized people generally includes anybody or any group of people who could learn something from our confidential information, and then use that new knowledge in ways that would thwart our plans to attain our objectives or cause us other harm.

Confidentiality needs dictate who can read specific information or files, or who can download or copy them. This is very different from who can modify, create, or delete those files.

One way to think about this is that integrity violations change what we think we know; confidentiality violations tell others what we think is our private knowledge.
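To make the split between "who can read or copy" and "who can modify or delete" concrete, here is a minimal sketch. The `Permission` flags, the `acl` table, and the user names are all hypothetical, invented only for illustration; real access control systems define their own permission models.

```python
from enum import Flag, auto


class Permission(Flag):
    """Hypothetical permission bits; the names are illustrative only."""
    READ = auto()
    COPY = auto()
    MODIFY = auto()
    DELETE = auto()


# Confidentiality is about who may learn the contents.
CONFIDENTIALITY_BITS = Permission.READ | Permission.COPY
# Integrity is about who may change the contents.
INTEGRITY_BITS = Permission.MODIFY | Permission.DELETE

# Hypothetical access list for a single file.
acl = {
    "auditor": Permission.READ,
    "analyst": Permission.READ | Permission.COPY,
    "editor": Permission.READ | Permission.MODIFY,
}


def can_disclose(user: str) -> bool:
    """True if the user could read or copy the file (a confidentiality question)."""
    granted = acl.get(user, Permission(0))
    return bool(granted & CONFIDENTIALITY_BITS)


def can_alter(user: str) -> bool:
    """True if the user could modify or delete the file (an integrity question)."""
    granted = acl.get(user, Permission(0))
    return bool(granted & INTEGRITY_BITS)


print(can_disclose("auditor"), can_alter("auditor"))
print(can_disclose("editor"), can_alter("editor"))
```

Keeping the two questions in separate functions mirrors the point above: a user who is allowed to see a file is not automatically allowed to change it, and an unknown user gets neither.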

#### Integrity

**Integrity**, in the most common sense of the word, means that something is whole and complete, and that its parts are smoothly joined together. People with high personal integrity are ones whose actions and words consistently demonstrate the same set of ethical principles. You know that you can count on them and trust them to act both in ways they have told you they would and in ways consistent with what they've done before.

Integrity for information systems has much the same meaning. Can we rely on the information we have and trust in what it is telling us?

This attribute reflects two important decision-making needs:

* First, is the information accurate? Have we gathered the right data, processed it in the right ways, and dealt with errors, wild points, or odd elements of the data correctly so that we can count on it as inputs to our processes? We also have to have trust and confidence in those processes - do we know that our business logic that combined experience and data to produce wisdom actually works correctly?
* Next, has the information been tampered with, or have any of the intermediate steps in processing from raw data to finished "decision support data" been tampered with? This highlights our need to trust not only how we get data, and how we process it, but also how we communicate that data, store it, and how we authorize and control changes to the data and the business logic and software systems that process that data.
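The tampering question in the second bullet can be illustrated with a cryptographic digest. This is only a sketch: the sample records are made up, and it assumes the recorded digest is itself stored somewhere an attacker cannot change. A digest detects that data has changed; it says nothing about whether the data was accurate in the first place.

```python
import hashlib
import hmac


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 digest of the data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_unmodified(data: bytes, expected_digest: str) -> bool:
    """Compare the current digest with the one recorded earlier."""
    # compare_digest avoids leaking timing information during the comparison.
    return hmac.compare_digest(fingerprint(data), expected_digest)


# Hypothetical record, with its digest taken when the data was known to be good.
original = b"Q3 revenue: 1,250,000"
recorded = fingerprint(original)

# The same record after someone has altered a single digit.
tampered = b"Q3 revenue: 1,950,000"

print(is_unmodified(original, recorded))
print(is_unmodified(tampered, recorded))
```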

Integrity applies to three major elements of any information-centric set of processes: to the people who run and use them, to the data that the people need to use, and to the systems or tools that store, retrieve, manipulate, and share that data. Let's look at **DIKW, or data, information, knowledge, and wisdom**:

* **Data** are the individual facts, observations, or elements of a measurement, such as a person's name or their residential address.
* **Information** results when we process data in various ways; information is data plus conclusions or inferences.
* **Knowledge** is a set of broader, more general conclusions or principles that we've derived from lots of information.
* **Wisdom** is (arguably) the insightful application of knowledge; it is the "a-ha!" moment in which we recognize a new and powerful insight that we can apply to solve problems with or to take advantage of a new opportunity - or to resist the temptation to try!

#### Availability

Is the data there, when we need it, in a form we can use?

We make decisions based on information; whether that is new information we have gathered (via our data acquisition systems) or knowledge and information we have in our memory, it's obvious that if the information is not where we need it, when we need it, we cannot make as good a decision as we might need to:

* The information might be in our files, but if we cannot retrieve it, organize it, and display it in ways that inform the decision, then the information isn't available.

* If the information has been deleted, by accident, sabotage, or systems failure, then it's not available to inform the decision.

These might seem obvious, and they are. Key to **availability** requirements is that they specify what information is needed; where it will need to be displayed, presented, or put in front of the decision makers; and within what span of time the data is both available (displayed to the decision makers) and meaningful. Yesterday's data may not be what we need to make today's decision.
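One way to picture the "span of time" part of an availability requirement is as a freshness check. The sketch below is an illustration only; the function name, the timestamps, and the one-hour window are assumptions, not values taken from any standard.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_usable(captured_at: Optional[datetime],
              needed_at: datetime,
              max_age: timedelta) -> bool:
    """True if the data exists and is recent enough to inform the decision."""
    if captured_at is None:
        # Deleted, lost, or never retrieved: not available at all.
        return False
    age = needed_at - captured_at
    return timedelta(0) <= age <= max_age


# Hypothetical decision point and data timestamps.
decision_time = datetime(2024, 5, 2, 9, 0, tzinfo=timezone.utc)
todays_feed = datetime(2024, 5, 2, 8, 30, tzinfo=timezone.utc)
yesterdays_feed = datetime(2024, 5, 1, 8, 30, tzinfo=timezone.utc)
window = timedelta(hours=1)  # assumed requirement: data no older than one hour

print(is_usable(todays_feed, decision_time, window))
print(is_usable(yesterdays_feed, decision_time, window))
print(is_usable(None, decision_time, window))
```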

## Understand Security Concepts

### Confidentiality

### Integrity

### Availability

### Accountability

### Privacy

### Non-Repudiation

### Least Privilege

### Separation of Duties

## Document, Implement, & Maintain Functional Security Controls

### Deterrent Controls

### Preventative Controls

### Detective Controls

### Corrective Controls

### Compensating Controls

## Participate in Asset Management

**Risk management** decides what risks to try to control; **risk mitigation** is how SSCPs take those decisions to the operational level. 

Four key ideas help the SSCP keep a balanced perspective on risk as she looks to translate strategic thinking about information risk into action plans that implement, operate, and assess the use of risk management controls:

* First, we make strategic choices about which risks to pay attention to - to actively work to detect, deter, avoid, or prevent. In doing so, we also quite naturally choose which risks to just accept or ignore. These choices are driven by our sense of what's important to the survival of the organization, its growth, or its other longer-term objectives. Then we decide what to "cure" or "fix" somehow.

* Second, we must remember that many words we use to talk about risk - such as mitigation - have multiple meanings as we shift from strategic, through tactical, and into day-to-day operations. Mitigate and remediate, for example, can often be used to refer to applying patches to a system, or even to replacing components or subsystems with ones of completely different design; other times, we talk about mitigating a risk by taking remedial (curative, restorative) actions.

* Third, all of these processes constantly interact with one another; there are no clean boundaries between one "step" of risk management and the next.

* Finally, we must accept that we are never finished with information risk management and mitigation. We are always chasing residual risk, whether to keep accepting it or to take actions to mitigate or remedy it.

#### Observe, Orient, Decide, Act (OODA)

The four steps of **observe, orient, decide, and act**, known as the **OODA** loop, provide a process by which you can keep from overreacting to circumstances. The following figure shows the OODA loop, its four major steps, and the importance of feedback loops within the OODA loop itself. It shows how the OODA loop is a continually learning, constantly adjusting, forward-leaning decision-making and control process.

![John Boyd's OODA Loop](images/ooda-loop.png)

* **Observe**: Look around you! Gather information about what is happening, right now, and what's been happening very recently. Notice how events seem to be unfolding; be sensitive to what might be cause and effect being played out in front of you. Listen to what people are saying, and watch what they are doing. Look at your instruments, alarms, and sensors. Gather the data. Feed all of this into the next step.

* **Orient**: Apply your memory, your training, and your planning! Remember why you are here - what your organization's goals and objectives are. Reflect upon similar events you've seen before. Combine your observations and your orientation to build the basis for the next step.

* **Decide**: Make an educated guess as to what's going on and what needs to be done about it. This hypothesis you make, based on having oriented yourself to put the "right now" observations in a proper mental frame or context, suggests actions you should take to deal with the situation and continue toward your goals.

* **Act**: Take the action that you just decided on. Make it so! And go right back to the first step and observe what happens! Assess the newly unfolding situation (what was there plus your actions) to see if your hypothesis was correct. Check your logic. Correct your decision logic if need be. Decide to make other, different observations.
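The four steps can be sketched as a small feedback loop in code. Everything here is a toy: the `OodaLoop` class, the failed-login counts, the "three times the baseline" trigger, and the smoothing weights are assumptions chosen for illustration, not a recommended detection rule.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class OodaLoop:
    """Toy OODA loop over a stream of failed-login counts (hypothetical data)."""
    baseline: float                 # what "normal" has looked like so far
    threshold_ratio: float = 3.0    # assumed trigger: three times the baseline
    actions: List[str] = field(default_factory=list)

    def observe(self, reading: int) -> int:
        # Gather the raw data point from the sensor or log.
        return reading

    def orient(self, reading: int) -> float:
        # Put the observation in context: how does it compare with normal?
        return reading / self.baseline

    def decide(self, ratio: float) -> str:
        # Form a hypothesis about what is going on and choose an action.
        return "investigate" if ratio >= self.threshold_ratio else "continue"

    def act(self, decision: str, reading: int) -> None:
        # Carry out the decision, then feed the outcome back into the loop.
        self.actions.append(decision)
        if decision == "continue":
            # Only readings judged normal are allowed to update the baseline.
            self.baseline = 0.8 * self.baseline + 0.2 * reading

    def step(self, reading: int) -> str:
        seen = self.observe(reading)
        decision = self.decide(self.orient(seen))
        self.act(decision, seen)
        return decision


loop = OodaLoop(baseline=10.0)
decisions = [loop.step(reading) for reading in [9, 12, 11, 45, 10]]
print(decisions)
```

The feedback sits in `act`: each pass adjusts the baseline that the next `orient` step uses, which is what makes this a loop rather than a one-time checklist.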

Think about the figure above in the context of two or more decision systems working in the same decision space, such as a marketplace. Suppliers and purchasers all are using OODA loops in their own internal decision making, whether they realize it or not. When the OODA loops of customers and suppliers harmonize with one another, the marketplace is in balance; no one party has an information advantage over the other. Now imagine if the customers can observe the actions of multiple suppliers, maybe even ones located in other marketplaces in other towns. If such customers can observe more information and think "around their OODA loop" more quickly than the suppliers can, the customers can spot better deals and take advantage of them faster than the suppliers can change prices or deliveries to the markets.

Let's shift this to a less-than-cooperative situation and look at a typical adversary intrusion into an organization's IT systems. On average, the IT industry worldwide reports that it takes businesses about $220$ days to first observe that a threat actor has discovered a previously unknown or unreported vulnerability and exploited it to gain unauthorized access to the business's systems. It also takes about $170$ days, on average, to find a vulnerability, develop a fix (or patch) for it, apply the fix, and validate that the fix has removed or reduced the risk of harm that the vulnerability could allow to occur. Best case, one cycle around the OODA loop takes the business from observing the penetration to fixing it; that's $220$ plus $170$ days, which is $390$ days, or about $13$ months of being at the mercy of the intruder! By contrast, the intruder is probably running on an OODA loop that might take a few days to go from initially seeking a new target, through initial reconnaissance, to choosing to target a specific business. Once inside the target's systems, the decision cycle time to seek information assets that suit the attacker's objectives, to formulate actions, to carry out those actions, and then to cover their tracks might run into days or weeks. It's conceivable that the attacker could have executed multiple exploits per week over those $13$ months of "once-around-the-OODA" that the business world seems to find acceptable.
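The arithmetic behind that comparison is worth working through once. The two defender figures come from the paragraph above; the seven-day attacker cycle is purely an assumption, picked from the "days or weeks" range to show the scale of the mismatch.

```python
# Defender figures quoted in the text.
DAYS_TO_DETECT = 220
DAYS_TO_FIX = 170

# Average month length, used only to convert days into months.
AVG_DAYS_PER_MONTH = 365.25 / 12

defender_cycle_days = DAYS_TO_DETECT + DAYS_TO_FIX
defender_cycle_months = defender_cycle_days / AVG_DAYS_PER_MONTH

# Assumption for illustration: the attacker completes one full cycle per week.
ATTACKER_CYCLE_DAYS = 7
attacker_cycles = defender_cycle_days // ATTACKER_CYCLE_DAYS

print(f"Defender: {defender_cycle_days} days, about {defender_cycle_months:.1f} months")
print(f"Attacker: {attacker_cycles} full cycles in the same period")
```

Under that assumption, the attacker gets dozens of complete trips around the loop in the time the defender takes to finish one.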

It's worth emphasizing this aspect of the zero day exploit in OODA loop terms. The attacker does not need to find the vulnerability before anybody else does; she needs to develop a way to exploit it, against your systems, before that vulnerability has been discovered and reported through the normal, accepted vulnerability reporting channels, and before the defenders have had reasonable opportunity to become aware of its existence. Once you, as one of the white hats, could have known about it, it's no longer a zero day exploit - just one you hadn't implemented a control for yet.

We introduced the concept of the value chain, which shows each major set of processes a business uses to go from raw inputs to finished products that customers have bought and are using. Each step in the value chain creates value - it creates greater economic worth, or creates more of something else that is important to customers. Business uses what it knows about its methods to apply energy (do work) to the input of each stage in the value chain. The value chain model helps business focus on improving the individual steps, the lag time or latency within each step and between steps, and the wastage or costs incurred in each step. But business and the making of valuable products is not the only way that value chain thinking can be applied.

Modern military planners adapted the value chain concept as a way to focus on optimally achieving objectives in warfare. The kill chain is the set of activities that show, step by step, how one side in the conflict plans to achieve a particular military objective (usually a "kill" of a target, such as neutralizing the enemy's air defense systems). The defender need not defeat every step in that kill chain - all they have to do is interrupt it enough to prevent the attacker from achieving their goals, when their plans require them to.

It's often said that criminal hackers and cyber threat actors only have to be lucky once, in order to achieve their objectives, but that the cyber defender must be lucky every day to prevent all attacks. This is no doubt true if the defender's OODA loops run slower than those of their attackers. As you'll see, it takes more than just choosing and applying the right physical, logical, and administrative risk treatments or controls to achieve this.

#### Risk Mitigation

Let's start by taking apart our definition of risk mitigation, and see what it reveals in the day-to-day of business operations.

Risk mitigation is the process of implementing risk management decisions by carrying out actions that contain, transfer, reduce, or eliminate risk to levels the organization finds acceptable, which can include accepting a risk when it simply is not practical to do anything else about it.

The figure below shows the major steps in the risk mitigation process we'll use here, which continues to put the language of NIST SP 800-37 and ISO 31000:2018 into more pragmatic terms. These steps are:

1. Assess the information architecture and the information technology architectures that support it.
2. Assess vulnerabilities, and conduct threat modeling as necessary.
3. Choose risk treatments and controls.
4. Implement risk mitigation controls.
5. Verify control implementations.
6. Engage and train users as part of the control.
7. Begin routine operations with new controls in place.
8. Monitor and assess system security with new controls in place.

![Risk Mitigation Major Steps](images/risk-mitigation-major-steps.png)

The boundary between planning and doing, as we cross from Step $3$ into Step $4$, is the point where the SSCP helps the organization fit its needs for risk treatment and control into its no-doubt very constrained budget of people, money, resources, and time. In almost all circumstances, the SSCP will have to operate within real constraints. No perfect solution will exist; after all of your effort to put in place the best possible risk treatments and controls, there will be residual risk that the organization has by default chosen to accept. If you and your senior leaders have done your jobs well, that residual risk should be within the company's risk tolerance. If it is not, that becomes the priority for the next round of risk mitigation planning!


| Step | | | Special Case |
| :---: | :---: | :---: | :---: |
| Assess the Existing Architectures | | | |
| Assess Vulnerabilities and Threats | | | |
| Select Risk Treatment and Controls | | | |
| Implement Controls | | | |
| Authorize: Senior Leader Acceptance and Ownership | | | |



### Lifecycle (Hardware, Software, & Data)

### Hardware Inventory

### Software Inventory & Licensing

### Data Storage

## Implement Security Controls & Assess Compliance

Think back to how much work it was to discover, understand, and document the information architecture that the organization uses, and then the IT architectures that support that business logic and data. Chances are that during your discovery phase, you realized that a lot of elements of both architectures could be changed or replaced by local work unit managers, group leaders, or division directors, all with very little if any coordination with any other departments. If that's the case, you and the IT director, or the chief information security officer and the CIO, may have an uphill battle on your hands as you try to convince everyone that proper stewardship does require more central, coordinated change management and control than the company is accustomed to.

The definitions of these three management processes are important to keep in mind:

* **Asset management** is the process of identifying everything that could be a key or valuable asset and adding it to an inventory system that tracks information about its acquisition costs, its direct users, its physical (or logical) location, and any relevant licensing or contract details. Asset management also includes processes to periodically verify that tagged property (items that have been added to the formal inventory) is still in the company's possession and has not disappeared, been lost, or been stolen. It also includes procedures to make changes to an asset's location, use, or disposition.


* **Configuration management** is the process by which the organization decides what changes in controlled systems baselines will be made, when to implement them, and the verification and acceptance needs that the change and business conditions dictate as necessary and prudent. Change management decisions are usually made by a configuration management board, and that board may require impact assessments as part of a proposed change.


* **Configuration control** is the process of regulating changes so that only authorized changes to controlled systems baselines can be made. Configuration control implements what the configuration management process decides and prevents unauthorized changes. Configuration control also provides audit capabilities that can verify that the contents of the controlled baseline in use today are in fact what they should be.
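The audit capability mentioned under configuration control can be sketched as a comparison between the approved baseline and what is actually in use. This is a minimal illustration under stated assumptions: the item names, file contents, and the `audit_baseline` function are hypothetical, and a real baseline would be kept in a protected repository rather than in a dictionary.

```python
import hashlib
from typing import Dict, List


def digest(content: str) -> str:
    """SHA-256 digest of a configuration item's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def audit_baseline(approved: Dict[str, str],
                   current: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare items in use against the approved baseline.

    approved maps item name -> digest recorded when the change was authorized.
    current maps item name -> content found on the system today.
    """
    findings: Dict[str, List[str]] = {"modified": [], "missing": [], "unexpected": []}
    for name, approved_digest in approved.items():
        if name not in current:
            findings["missing"].append(name)
        elif digest(current[name]) != approved_digest:
            findings["modified"].append(name)
    # Anything present that the board never approved.
    findings["unexpected"] = sorted(set(current) - set(approved))
    return findings


# Hypothetical controlled baseline and the state found during an audit.
approved = {
    "sshd_config": digest("PermitRootLogin no\n"),
    "ntp.conf": digest("server time.example.org\n"),
}
current = {
    "sshd_config": "PermitRootLogin yes\n",
    "motd": "hello\n",
}

findings = audit_baseline(approved, current)
print(findings)
```

Each finding category maps to a different follow-up: a modified item suggests an unauthorized change, a missing item a loss of a controlled element, and an unexpected item something that never went through the change decision process.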

### Technical Controls (e.g., Session Timeout, Password Aging)

### Physical Controls (e.g., Mantrap, Cameras, Locks)

### Administrative Controls (e.g., Security Policies & Standards, Procedures, Baselines)

#### Baseline

A lot of hard work has gone into getting to the point where a set of information risk controls has been specified, acquired (or built), installed, tested, and signed off by the senior leaders as meeting the information security needs of the business or organization. The job thus far has been putting in place countermeasures and controls so that the organization can roll with the punches and weather the rough seas that the world, the competition, or the willful threat actors out there try to throw at it.

Now it's on to the really hard part of the job - keeping this information architecture and its IT architectures safe, secure, and resilient so that confidentiality, integrity, and availability requirements are met and stay met. How do we know all of those safety nets, countermeasures, and control techniques are still working the way we intended them to and that they're still adequate to keep us safe?

The good news is that this is no different than the work we did in making our initial security assessments of our information architecture, the business logic and business processes, and the IT architectures and systems that make them possible. The bad news is that this job never ends. We must continually monitor and assess the effectiveness of those risk controls and countermeasures, and take or recommend action when we see they no longer are adequate. Putting the controls in place was taking due care; due diligence is achieved through constant vigilance.

More good news: the data sources you used originally, to gain the insight you needed to make your first assessments, are still there, just waiting for you to come around, touch base, and ask for an update. Let's take a closer look at some of them.

As you selected and implemented each new or modified information risk mitigation control, you had to identify the training needs for end users, their managers, and others. You had to identify what users and people throughout the organization needed to know and understand about this control and its role in the bigger picture. Achieving this minimum set of awareness and understanding is key to acceptance of the control by everyone concerned. This need for acceptance is continual, and depending on the nature of the risk control itself, the need for ongoing refresher training and awareness may be quite great. Let's look at how different risks might call for different approaches to establish initial user awareness and maintain it over time:

* Suppose your organization has adopted a policy that prohibits end users from installing their own software onto company-provided computer systems. Your IT department has established logical controls throughout all computers to enforce this. Initial user training communicates and gains new employees' acknowledgment of this. Annual employee performance reviews are opportunities to reaffirm the importance of this policy and the need for employees to comply.


* Some users in your organization need to access company information systems and networks via their personal computers or smartphones. This means that the risk of commingling personal data and company data on these employee-owned devices is very real. You determine that currently available mobile device management technologies don't quite fit your circumstances, but even if they did, mobile or personal device users need to appreciate that the risks of data compromise, device loss or theft, misuse of the device by a family member, or conflicts between company-approved software and personal-use software on these devices could pose additional risks. Getting these mobile or personal device users to be actively part of keeping company data and systems secure is a daily challenge.

The key to keeping users engaged with risk management and risk mitigation controls is simple: align their own, individual interests with the interests the controls are supporting, protecting, or securing.

By this time, our newly implemented risk mitigation controls have gone operational. Day by day, users across the organization are using them to stay more secure, (hopefully) achieving improved levels of CIA in their information processing tasks. The SSCP and the information security team now need to shift their mental gears and look to ongoing monitoring and assessment of these changes. In one respect, this seems easy; the identified risk, and therefore the related vulnerability, focused us on changing something in our physical, logical, or administrative processes so that our information could be more secure, resilient, reliable, and confidential; our decisions should now be more assured.

Are they?

The rest of the world did not stand still while we were making these changes. Our marketplace continued to grow and change; no doubt other users in other organizations were finding problems in the underlying hardware, software, or platforms we use; and the vendors who build and support those systems elements have been working to make fixes and patches available (or at least provide a procedural workaround) to resolve these problems. Threat actors may have discovered new zero-day exploits. And these or other threat actors have been continuing to ping away at our systems.

We do need to look at whether this new fix, patch, control, or procedural mitigation is working correctly, but we've got to do that in the context of today's system architecture and the environment it operates in... and not just in the one in which we first spotted the vulnerability or decided to do something about the risk it engendered.

The SSCP may be part of a variety of ongoing security assessments, such as **penetration testing** or **operational test and evaluation (OT&E)** activities, all intended to help understand what the security posture of the organization is at the time that the tests or evaluations are conducted. Let's take a closer look at some of these types of testing. This kind of test and evaluation is not to be confused with the acceptance testing or verification that was done when a new control was implemented - that verification test is necessary to prove that you did that fix correctly. It should also be kept distinct in your mind from regression testing, the verification that a fix to one systems element did not break others. Ongoing security test and evaluation is looking to see if things are still working correctly now that the users - and the threat actors - have had some time to put the changes and the total system through their paces.

As an SSCP, consider asking (or looking yourself for the answers to!) the following kinds of questions:
* How do we know when a new device, such as a computer, phone, packet sniffer, etc., has been attached to our systems or networks?
* How do we know that one of our devices has gone missing, possibly with a lot of sensitive data on it?
* How do we know that someone has changed the operating system, updated the firmware, or updated the applications that are on our end users' systems?
* How do we know that an update or recommended set of security patches, provided by the systems vendor or our own IT department, has actually been implemented across all of the machines that need it?
* How do we know that end users have received updated training to make good use of these updated systems?

If you're unable to get good answers to those kinds of questions, from policy and procedural directives, from your managers, or from your own investigations, you may be working in an environment that is ripe for disaster.
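Several of these questions come down to comparing what is supposed to be on the network with what is actually observed there. The following is a minimal sketch of that comparison, not a real asset management tool: the `reconcile` helper, the device identifiers, and the integer patch levels are all hypothetical, and a real deployment would draw its inputs from an inventory database and network discovery tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reconciliation:
    unknown: frozenset    # seen on the network but not in the inventory
    missing: frozenset    # in the inventory but not seen on the network
    unpatched: frozenset  # authorized and seen, but below the required patch level


def reconcile(inventory, observed, required_patch_level):
    """Compare an authorized inventory against observed devices.

    inventory: set of authorized device IDs (hypothetical format).
    observed: dict mapping device ID -> reported patch level (int).
    """
    seen = set(observed)
    unknown = seen - inventory
    missing = inventory - seen
    unpatched = {
        dev for dev in seen & inventory
        if observed[dev] < required_patch_level
    }
    return Reconciliation(frozenset(unknown), frozenset(missing), frozenset(unpatched))


# Illustrative data only; none of these identifiers come from a real system.
inventory = {"laptop-001", "laptop-002", "phone-017"}
observed = {"laptop-001": 12, "laptop-002": 9, "sniffer-x": 0}

result = reconcile(inventory, observed, required_patch_level=12)
print(sorted(result.unknown))    # devices to investigate
print(sorted(result.missing))    # devices that may have gone missing
print(sorted(result.unpatched))  # devices still needing the update
```

Each non-empty set corresponds to one of the "How do we know...?" questions; an empty result is only as trustworthy as the inventory and the discovery data behind it.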

To be effective, any management system or process must collect and record the data used to make decisions about changes to the systems being managed; they must also include ways to audit those records against reality. For most business systems, we need to consider three different kinds of baselines: recently archived, current operational, and ongoing development. Audits against these baselines should be able to verify that:

* The recently archived baseline is available for fallback operations if that becomes necessary. If this happens, we also need to have an audited list of what changes (including security fixes) are included in it and which documented deficiencies are still a part of that baseline.
* The current operational baseline has been tested and verified to contain proper implementation of the changes, including security fixes, which were designated for inclusion in it.
* The next ongoing development baseline has the set of prioritized changes and security fixes included in its work plan and verification and test plan.

Audits of configuration management and control systems should be able to verify that the requirements and design documentation, source code files, builds and control systems files, and all other data sets necessary to build, test, and deploy the baseline contain authorized content and changes only.
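One common way to check that a baseline contains authorized content only is to compare cryptographic hashes of the files against a manifest recorded when the changes were approved. The sketch below assumes a hypothetical in-memory manifest and file set (the file names and the `audit_baseline` helper are illustrative); it uses SHA-256 from Python's standard `hashlib` module.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


def audit_baseline(manifest, actual_files):
    """Compare actual file contents against an authorized manifest.

    manifest: dict mapping file name -> authorized SHA-256 hex digest.
    actual_files: dict mapping file name -> current content (bytes).
    """
    findings = {"modified": [], "unauthorized": [], "absent": []}
    for name, content in actual_files.items():
        if name not in manifest:
            findings["unauthorized"].append(name)  # not part of the approved baseline
        elif sha256_hex(content) != manifest[name]:
            findings["modified"].append(name)      # content differs from what was approved
    for name in manifest:
        if name not in actual_files:
            findings["absent"].append(name)        # approved item is missing
    for names in findings.values():
        names.sort()
    return findings


# Illustrative data: the manifest is built from the approved content.
approved = {"build.cfg": b"opt=2\n", "main.src": b"print('hi')\n"}
manifest = {name: sha256_hex(data) for name, data in approved.items()}

on_disk = {"build.cfg": b"opt=3\n", "main.src": b"print('hi')\n", "extra.bin": b"\x00"}
findings = audit_baseline(manifest, on_disk)
print(findings)
```

A check like this only shows that the content matches the manifest; the audit still has to confirm that the manifest itself reflects changes that were properly authorized.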

#### OT&E

**OT&E**, in its broadest sense, is attempting to verify that a given system and the people-powered processes that implement the overall set of business logic and purpose actually get work done correctly and completely, when seen from the end users' or operators' perspective. That may sound straightforward, but quite often, it is a long, complex process that produces some insight rather than clear, black-and-white "succeed" or "fail" scorecard results. Without going into too much detail, this is mainly because unavoidable differences exist between the system that business analysts thought was needed and what operational users in the organization are actually doing, day by day, to get work done. Some of those differences are caused by the passage of time; if it takes months to analyze a business' needs, and more months to build the systems, install, test, and deliver them, the business has continued to move on. Some reflect different perceptions or understanding about the need; it's difficult for a group of systems builders to understand what a group of systems users actually have to do in order to get work done. (And quite often, users are not as clear and articulate as they think they are when they try to tell the systems analysts what they need from the new system. Nor are the analysts necessarily the good listeners that they pride themselves on being.)

OT&E in security faces the same kind of lags in understanding, since quite often the organization doesn't know it has a particular security requirement until it is revealed (either by testing and evaluation, or by enemy action via a real incident). This does create circular logic: we think we have a pretty solid system that fulfills our business logic, so we do some OT&E on it to understand how well it is working and where it might need to be improved - but the OT&E results cause us (sometimes) to rethink our business logic, which leads to changes in the system we just did OT&E on, and in the meantime, the rest of the world keeps changing around us.

The bottom line is that operational test and evaluation is one part of an ongoing learning experience. It has a role to play in continuous quality improvement processes; it can help an organization understand how mature its various business processes and systems are. And it can offer a chance to gain insight into potentially exploitable vulnerabilities in systems, processes, and the business logic itself.

**Ethical penetration testing** is security testing focused on trying to actively find and exploit vulnerabilities in an organization's information security posture, processes, procedures, and systems. Pen-testing, as it's sometimes called, often looks to use "ethical hackers" who attempt to gain access to protected, secure elements of those systems. There are some significant legal and ethical issues that the organization and its testers must address, however, before proceeding with even the most modest of controlled pen-testing. In most jurisdictions around the world, it is illegal for anyone to attempt to gain unauthorized entry into someone else's information systems without their express written permission; even with that permission in hand, mistakes in the execution of pen-testing activities can expose the requesting company or the penetration testers to legal or regulatory sanctions.

The first major risk to be considered in pen-testing is that pen testers are trying to actively and surreptitiously find exploitable vulnerabilities in your information security posture and systems. This activity could disrupt normal business operations, which in turn could disrupt your customers' business operations. For this reason, the scope of pen-testing activities should be clearly defined. Reporting relationships between the people doing the pen-testing, their line managers, and management and leadership within your own organization must be clear and effective.

Another risk comes into play when using external pen-testing consulting firms to do the testing, analyze the results, and present these results to you as the client. Quite often, pen-testing firms hire reformed former criminal hackers (or hackers who narrowly escaped criminal prosecution), because they've got the demonstrated technical skills and hacker mindset to know how to conduct all aspects of such an attack. Yet, you are betting your organization's success, if not survival, on how trustworthy these hackers might be. Can you count on them actually telling you about everything they find? Will they actually turn over all data, logs, and so forth that they capture during their testing and not retain any copies for their own internal use? This is not an insurmountable risk, and your contract with the pen-testing firm should be adamant about these sorts of risk containment measures. That said, it is not a trivial risk.

#### Assessment-Driven Training

Whether security assessments are done via formalized penetration testing, as part of normal operational test and evaluation, or by any of a variety of informal means, each provides the SSCP an opportunity to identify ways to make end users more effective in the ways they contribute to the overall information security posture. Initial training may instill a sense of awareness, while providing a starter set of procedural knowledge and skills; this is good, but as employees or team members grow in experience, they can and should be able to step up and do more as members of the total information security team.

End user questions and responses during security assessment activities, or during debriefs of them, can illuminate such opportunities to improve awareness and effectiveness. Make note of each "why" or "how" that surfaces during such events, during your informal walk-arounds to work spaces, or during other dialogue you have with others in the organization. Each represents a chance to improve awareness of the overall information security need; each is an opportunity to further empower teammates to be more intentional in strengthening their own security hygiene habits.

A caution is in order: some organizational cultures may believe that it's more cost-effective to gather up such questions and indicators, and then spend the money and time to develop and train with new or updated training materials when a critical mass of need has finally arisen. You'll have to make your own judgment, in such circumstances, whether this is being penny-wise but pound-foolish.

### Periodic Audit & Review

#### Ongoing, Continuous Monitoring

Prudent risk managers have been monitoring their controls for thousands of years. Guards would patrol the city and randomly check to see that doors were secured at the end of the workday and that gates were closed and barred. Tax authorities would select some number of taxpayers' records and returns for audit, to look for both honest mistakes and willful attempts to evade payment. Merchants and manufacturers, shipping companies, and customers make detailed inventory lists and compare those lists after major transactions (such as before and after a clearance sale or a business relocation). Banks and financial institutions keep detailed transaction ledgers and then balance them against statements of accounts. These are all examples of regular operational use, inspection, audit, and verification that a set of risk mitigation controls is still working correctly.

We monitor our risk mitigation controls so that we can conclude that either we are safe or we are not. Coming to a well-supported answer to that question requires information and analysis, and that can require a lot of data just to answer "Are we safe today?" Trend analysis (to see if safety or security has changed over time, with an eye to discovering why) requires even more data. The nature of our business, our risk appetite (or tolerance), and the legal and regulatory compliance requirements we face may also dictate how often we have to collect such data and for how long we have to keep it available for analysis, audit, or review.

Where does the monitoring data come from? This question may seem to have an obvious answer, but it bears thinking about the four main types of information that we deliberately produce with each step of a business process:

* First, we produce the outputs or results we require for business reasons. We calculate the new throttle setting; we transact the sale of an airline ticket; we post a debit to an account balance. Those are examples of required outputs that help achieve required outcomes of the business logic.

* Next, we produce verification outputs - additional information that lets the end user and their quality management processes look at the primary process outputs so that they can verify that the process steps have run correctly. This verification is a routine part of the business logic. An example might be where the business logic requires a confirmation (of a credit or debit card transaction by the card processing agent) before it allows the next step to proceed.

* Third, we look at safety and security requirements that add additional steps to our business logic. Administrative policy might require valid authentication and authorization of a user before they can access a customer file, and our access control systems enforce those policies. But it is the audit or accounting requirements that drive access control builders to log all attempts by all processes or people to access protected resources. From a safety perspective, we might have requirements that dictate systems are built with interlocks - hardware or software components that do not permit a potentially hazardous step to be initiated unless all of the safety prerequisite steps have been met. If nobody else requires it, our liability insurers probably want us to keep good log information on each hazardous step - who initiated it, were all initial conditions correct, and what happened?

* Finally, we consider diagnostic information, sometimes called **fault detection and fault isolation (FDFI)** information. Most hardware systems have features built into their design that facilitate finding failed hardware. Sometimes these **built-in test equipment (BITE)** systems use industry-standard communications and data protocols, such as what we see in modern computer-controlled automotive systems. Other times they use proprietary protocols and interfaces. Software, too, will often have test features built into its source code so that during development testing, the programmers can demonstrate that the software functions correctly. All of these debug features can be rich sources of systems security monitoring and assessment information.

Notice one important fact: no useful data gets generated unless somebody, somewhere, decided to create a process to get the data generated by the system, output in a form that is useful, and then captured in some kind of document, log file, or other memory device. When we choose to implement controls and countermeasures, we choose systems and components that help us deal with potential problems and inform us when problems occur.
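As an illustration of designing the data capture in, the sketch below shows a software interlock that refuses a hazardous step unless every prerequisite is met and records each attempt, permitted or not. It is a minimal example under stated assumptions: the `attempt_hazardous_step` function, the prerequisite names, and the in-memory `audit_trail` list are hypothetical stand-ins for a real control system and an append-only log store.

```python
from datetime import datetime, timezone

# Hypothetical stand-in for an append-only audit log.
audit_trail = []


def attempt_hazardous_step(user, step, prerequisites):
    """Check the interlock and record the attempt either way.

    prerequisites: dict mapping prerequisite name -> bool (True when satisfied).
    Returns True only when every prerequisite is satisfied.
    """
    unmet = sorted(name for name, ok in prerequisites.items() if not ok)
    permitted = not unmet
    # Log who initiated the step, whether the initial conditions were correct,
    # and what happened, even when the step is refused.
    audit_trail.append({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "step": step,
        "unmet_prerequisites": unmet,
        "permitted": permitted,
    })
    return permitted


allowed = attempt_hazardous_step(
    "operator-7", "open_valve", {"pressure_ok": True, "guard_closed": False}
)
print(allowed, audit_trail[-1]["unmet_prerequisites"])
```

The refused attempt is the more interesting record for security monitoring: it exists only because someone decided in advance that denied attempts should be logged too.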

All of that monitoring data does you absolutely no good at all unless you actually look at it. Analyze it. Extract from it the stories it is trying to tell you. This is perhaps the number one large-scale set of tasks that many cybersecurity and information security efforts fail to adequately plan for or accomplish. Don't repeat this mistake.

Mistake number two is to not have somebody on watch to whom the results of monitoring and event data analysis are sent. Without that watch-stander, when (not if) a potential emergency is developing, the company doesn't find out about it until the Monday morning after the long holiday weekend is over. Those watch-standers can be on call (and receive alerts via SMS or other mobile communications means) or on site, and each business will make that decision based on their mission needs and their assessment of the risks. Don't repeat this mistake either.

Mistake number three is to not look at the log data at all unless some other problem causes you to think, "Maybe the log files can tell me what's going on".

These three mistakes suggest that we need what emergency medicine calls a triage process: a way to sort out patients with life-threatening conditions needing immediate attention from the ones who can wait a while (or should go see their physician during office hours).

Let's look at the analysis problem from the point of view of those who need the analysis done and work backward from there to develop good approaches to the analytical tasks themselves. But let's not repeat mistake number four, often made by the medical profession - that more often than not, when the emergency room triage team sends you back home and says "See your doctor tomorrow", their detailed findings don't go to your doctor with you.

#### Alert Team

The alert team is watching over the deployed, in-use operational IT systems and support infrastructures. That collection of systems elements is probably supporting ongoing customer support, manufacturing, shipping, billing and finance operations, and website and public-facing information resources, as well as the various development and test systems used by different groups in the company. Their job is to know the status, state, and health of these in-use IT systems, but not necessarily the details of how or for what purpose any particular end user or organization is using those systems.

Who is the alert team? It might be a part of the day shift help desk team, the people everybody calls whenever any kind of IT issue comes up. In other organizations, the alert team is part of a separate IT security group, and their focus is on IT security issues and not normal user support activities.

What does this alert team do? The information security alert team has as their highest priority being ready and able to receive alerts from the systems they monitor and respond accordingly. That response typically includes the following:

* Receive and review alarm, alert, and systems performance reporting data in real time.
* Identify and characterize alarms as emergency or non-emergency, based on predetermined criteria.
* Take immediate corrective or containment action as dictated by predetermined procedures, if any are required for the alarm in question.
* Notify designated emergency responders, such as police, fire, and so forth, if required.
* Notify designated technical support staff, or the internal computer emergency response team (CERT), if required.
* Notify designated point of contact in management and leadership, if required.
* Log this alarm event, and their disposition of it, in the alert team's own logs.
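The task list above can be sketched as a small triage routine driven by predetermined criteria. This is an illustrative outline only: the alarm types, the notification targets, and the choice to treat unrecognized alarms as emergencies are assumptions made for the example, not a prescribed procedure.

```python
from enum import Enum


class Severity(Enum):
    EMERGENCY = "emergency"
    NON_EMERGENCY = "non-emergency"


# Hypothetical predetermined criteria: alarm type -> (severity, who to notify).
CRITERIA = {
    "fire_suppression": (Severity.EMERGENCY, ["fire_department", "management"]),
    "privilege_escalation": (Severity.EMERGENCY, ["cert", "management"]),
    "disk_nearly_full": (Severity.NON_EMERGENCY, ["tech_support"]),
}

# Stand-in for the alert team's own log.
alert_log = []


def handle_alarm(alarm_type, source):
    """Characterize an alarm, decide who is notified, and log the disposition."""
    # Assumption: an alarm with no predetermined criteria is escalated to the
    # CERT as an emergency until a person decides otherwise.
    severity, notify = CRITERIA.get(alarm_type, (Severity.EMERGENCY, ["cert"]))
    entry = {
        "alarm": alarm_type,
        "source": source,
        "severity": severity,
        "notified": list(notify),
    }
    alert_log.append(entry)
    return entry


first = handle_alarm("privilege_escalation", "db-server-3")
print(first["severity"].value, first["notified"])
```

Keeping the criteria in a table rather than in code paths makes it easier for systems designers and maintainers to review and update them without touching the triage logic.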

What we can see from that list of alert team tasks is that we're going to need the help of our systems designers, builders, and maintainers to help figure out:

* What data to look for in the monitoring and event data outputs
* What logic to apply to the data to determine that an alarm state requiring urgent action is indicated
* What, if any, immediate action is required or recommended

The immediacy of the alert team's needs suggests that lots of data has to be summarized up to some key indicators, rather like a dashboard display in an automobile or an airplane. There are logical places on that dashboard for "idiot lights", the sort of red-yellow-green indicators designed to get the operator's attention and then direct them to look at other displays to be better informed. There are also valid uses on this dashboard for indicator gauges, such as throughput measures on critical nodes and numbers of users connected.

The alert team may also need to be able to see the data about an incident shown in some kind of timeline fashion, especially if there are a number of systems elements that seem to be involved in the incident. Timeline displays can call attention to periods that need further investigation and may even reveal something about cause and effect.
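A timeline view of this kind amounts to merging time-ordered event streams from several systems elements into one sequence. The sketch below uses `heapq.merge` from Python's standard library; the event tuples, source names, and messages are invented for illustration, and it assumes each source's events are already sorted by timestamp and that clocks across sources are synchronized.

```python
import heapq
from datetime import datetime


def build_timeline(*sources):
    """Merge per-source event lists (each already sorted by time) into one timeline.

    Each event is a (timestamp, source, message) tuple.
    """
    return list(heapq.merge(*sources, key=lambda event: event[0]))


# Illustrative events only.
firewall = [
    (datetime(2024, 1, 1, 2, 0), "firewall", "port scan detected"),
    (datetime(2024, 1, 1, 2, 10), "firewall", "outbound traffic spike"),
]
server = [
    (datetime(2024, 1, 1, 2, 5), "server", "failed root login"),
]

timeline = build_timeline(firewall, server)
for when, source, message in timeline:
    print(when.isoformat(), source, message)
```

Seeing the failed login fall between the scan and the traffic spike is the kind of ordering that can suggest cause and effect, although the timeline alone does not prove it.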

Before we jump to a conclusion and buy a snazzy new security information management dashboard system, however, take a look at what the other monitoring and event data analysis customers in our organization might need.

#### IT Support Staff

The IT support team is actually looking at a different process: the process of taking user needs, building systems and data structures to meet those needs, deploying those systems, and then dealing with user issues, problems, complaints, and ideas for improvements with them. That process lends itself to a fishbone or Ishikawa diagram that takes the end users' underlying value chain and reveals all of the inputs, the necessary preconditions, the processing steps, the outputs, and how outputs relate to outcomes. This process may have many versions of the information systems and IT baselines that it must monitor, track, and support at any one time. In some cases, some of those versions may be subsets of the entire architecture, tailor-made to support specific business needs. IT and the configuration management and control board teams will be controlling these many different product baseline versions, which includes keeping track of which help desk tickets or requests for changes are assigned to (scheduled to be built into) which delivery. The IT staff must also monitor and be able to report on the progress of each piece of those software development tasks.

Some of those "magic metrics" may lend themselves to a dashboard-style display. For large systems with hundreds of company-managed end-user workstations, for example, one such status indicator could be whether all vendor-provided updates and patches have been applied to the hardware, operating systems, and applications platform systems. Other indicators could be an aggregate count of the known vulnerabilities that are still open and in need of mitigation and the critical business logic affected by them.

Trend lines are also valuable indicators for the IT support staff. Averages of indicators such as system uptime, data or user logon volumes, accesses to key information assets, or transaction processing time can be revealing when looked at over the right timeframe - and when compared to other systems or to internal or external events to see if cause-and-effect relationships exist.
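One simple way to turn such an indicator into a trend line is a moving average over a fixed window. The sketch below is illustrative: the `moving_average` helper, the three-day window, and the daily failed-logon counts are assumptions chosen for the example rather than recommended settings.

```python
def moving_average(values, window):
    """Return the simple moving average of values over the given window size."""
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        sum(values[i - window + 1:i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical daily counts of failed logons on one system.
failed_logons = [10, 12, 11, 40, 45, 50]

trend = moving_average(failed_logons, window=3)
print(trend)  # [11.0, 21.0, 32.0, 45.0]
```

The smoothed series rises steadily even though no single day crosses an obvious threshold, which is the kind of change that prompts a look at what else happened over the same timeframe.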

#### End Users

What end users require may vary a lot depending on the needs of the organization and which users are focused on which parts of its business logic. That said, end users tend to need traffic-light kind of indications that tell them whether systems, components, platforms, or other elements they need are ready and available, down for maintenance, or in a "hands-off" state while a problem is being investigated. They may also appreciate being able to see the scheduled status of particular changes that are of interest to them. Transparent change management systems are ones in which end users or other interested parties in the business have this visibility into the planned, scheduled builds and the issues or changes allocated to them.

#### Leadership & Management

We might rephrase "What do leadership and management need?" and ask how the analysis of monitoring and event data can help management and leadership fulfill their due care and due diligence responsibilities. Depending on the management and leadership style and culture within the organization, the same dashboard and summary displays used by the alert team and IT support staff may be just what they need. (This is sometimes called a "high-bandwidth-in" style of management, where the managers need to have access to lots of detailed data about what's going on in the organization.) Other management and leadership choose to work with high-level summaries, aggregates, or alarm data as their daily feeds.

One key lesson to remember is suggested by the number of alert team tasks that lead to notifying management and leadership of an incident or alarm condition. Too many infamous data breach incidents became far too costly for the companies involved because the company culture discouraged late-night or weekend calls to senior managers for "mere" IT systems problems.

#### Incident Investigation, Analysis, & Reporting

At some point, the SSCP must determine that an incident of interest has occurred. Out of the millions of events that a busy datacenter's logging and monitoring systems might take note of every 24 hours, only a handful might be worthy of sounding an alarm:

* Unplanned shutdown of any asset, such as a router, switch, or server
* Unauthorized attempts to elevate a user's or process's privilege state to systems owner or root level
* Unauthorized attempts to extract, download, or otherwise exfiltrate restricted data from the facility
* Unauthorized attempts to change, alter, delete, or replace any data, software, or other controlled elements of the baseline system
* Unplanned or unauthorized attempts to initiate system backup or recovery tasks
* Unplanned or unauthorized attempts to connect a device, cable, or process to the system
* Unauthorized attempts to access system resources of any kind as part of trying to cause any of these events to occur, or to hide, alter, or mask data that would reveal these attempts
* Alarms or alerts from malware, intrusion detection, or other defensive systems

That's a pretty substantial list, but in a well-managed and well-secured datacenter, most of those kinds of incidents shouldn't happen often. When they do (not if they do), several important things have to occur properly and promptly:

1. Alarm or notify the right first responders, whether they are normal IT staff, IT security staff, or a specialized CERT.
2. Perform immediate steps to characterize the incident and determine whether affected users should cease business operations as normal (but not log off or shut down their systems without IT responder direction!).
3. Alert appropriate management and leadership in case they need to make other decisions as part of responding to the incident.

Part of that initial triage kind of response involves determining whether the incident is sufficiently serious or disruptive that the organization should activate its incident response plans and procedures.
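The first step described above, reducing a large stream of events to the handful of incidents of interest, can be sketched as a filter. The event categories here are hypothetical labels loosely based on the list above, and the single `authorized` flag is a simplification: real events would need far richer context to judge whether an action was planned or authorized.

```python
# Hypothetical event categories loosely based on the list above.
ALWAYS_OF_INTEREST = {"unplanned_shutdown", "defensive_system_alarm"}
OF_INTEREST_IF_UNAUTHORIZED = {
    "privilege_elevation",
    "data_export",
    "baseline_change",
    "backup_or_recovery",
    "device_connection",
}


def incidents_of_interest(events):
    """Yield only the events that warrant sounding an alarm."""
    for event in events:
        kind = event["kind"]
        if kind in ALWAYS_OF_INTEREST:
            yield event
        elif kind in OF_INTEREST_IF_UNAUTHORIZED and not event.get("authorized", False):
            # A missing "authorized" flag is treated as unauthorized (assumption).
            yield event


# Illustrative events only.
events = [
    {"kind": "user_login", "authorized": True},
    {"kind": "data_export", "authorized": True},
    {"kind": "data_export", "authorized": False},
    {"kind": "unplanned_shutdown"},
    {"kind": "privilege_elevation", "authorized": False},
]

flagged = list(incidents_of_interest(events))
print([event["kind"] for event in flagged])
```

Whether any flagged event justifies activating the incident response plan remains a decision for the responders and the organization's procedures, not for the filter.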

Immediate response to an incident may mean that the first person to notice it has to make an immediate decision: is this an emergency that threatens life or property and thus requires initiating emergency alarms and procedures? Or is it "merely" an information systems incident not requiring outside emergency responders? Before you take on operational responsibilities, make sure you know how your company wants to handle these decisions.

We said at the outset of this book that the commitment by senior business leadership and management is pivotal to the success of the company's information risk management and mitigation efforts. As an SSCP, you and the rest of the team went to great efforts to get those senior leaders involved and to gain their understanding and acceptance of your risk assessments. You then gained their pledges to properly fund, staff, and support your risk mitigation strategies, as well as your chosen risk countermeasures and controls.

Much like any other accountable, reportable function in the company, information security must make regular reports to management and leadership. The good news (no incidents of interest) as well as the bad news about minor or major breaches of security must be brought to the attention of senior leaders and managers. They need to see that their investments in your efforts are still proving to be successful - and if they are not, then they need to understand why, and be informed to consider alternative actions to take in the face of new threats or newly discovered vulnerabilities.

Management and leadership may also have legal and regulatory reporting requirements of their own to meet, and your abilities to manage security systems event data, incident data, and the results of your investigations may be necessary for them to meet these obligations. These will, of course, vary as to jurisdiction; a multinational firm with operating locations in many countries may face a bewildering array of possibly conflicting reporting requirements in that regard.

Whatever the reporting burden, the bottom line is that the information security team must report its findings to management and leadership. Whether those findings are routine good news about the continued secure good health of the systems or dead-of-night emergency alarms when a serious incident seems to be unfolding, management and leadership have an abiding and enduring need to know.

No bad news about information security incidents will ever get better by waiting until later to tell management about it.

## Participate in Change Management

### Execute Change Management Process

### Identify Security Impact

### Testing/Implementing Patches, Fixes, & Updates (e.g., Operating System, Applications, SDLC)


## Participate in Security Awareness & Training



## Participate in Physical Security Operations (e.g., Data Center Assessment, Badging)