# Risk Identification, Monitoring, & Analysis

## Understand the Risk Management Process

Defense is a set of strategies, management is about making decisions, and mitigation is a set of tactics chosen to implement those decisions. **Integrated information risk management** is about protecting what's important to the organization. It's about what to protect and why; risk mitigation addresses how.

The key to risk and risk management is simple: it's about making decisions in reliable ways and using the CIA triad to help you know when the decision you’re about to make is a reliable one…and when it is a blind leap into the dark. From the SSCP’s perspective, information security is necessary because it enables more decisions to be made on time and on target. Reliable decision making is as much about long-range planning as it is about incident response. This means that you can rely on the following:

* Your individual and organizational memory (the information and knowledge you think you already have, know, and understand)
* New information that you've gathered, processed, and used as inputs to this decision
* Your ability to deliberate, examine, review, think, and then to decide, free from disruption
* Your ability to communicate your decision (the "new marching orders") to those elements of your organization and systems that have to carry them out

![Risk](images/risk)

Two important questions must be asked about such failures or risk occurrences as incidents:

* First, how predictable are incidents like these? How often do the sorts of mistakes that lead to such incidents happen? When might they happen? If we can predict how often such circumstances might occur, or identify conditions that increase the likelihood of such mistakes or failures, we might gain insight into ways to prevent them. In risk management terms, this asks us to make reasonable assumptions that help us estimate the frequencies of occurrence and probabilities of occurrence for such events.

* Second, how much impact do they have on the organization, its goals and objectives, and its assets, people, or reputation? What did this cost us, in terms of money, lost business, real damages, injuries or deaths, and loss of goodwill among our customers and suppliers?

These answers suggest that if something we do, use, or depend on can fail, no matter what the cause, then we can start to look at the how of those failures—but we let those frequencies, probabilities, and possible impacts guide us to prioritize which risks we look at first, and which we can choose to look at later.

We care about risks because when they occur (when they become an incident), they disrupt our plans. Incidents disrupt us in two ways:

* They break our chain of thought. They interrupt the flow of decision making that we “normally” would be using to carry out our planned, regular, normal activities.

* They cause us to react to their occurrence. We divert time, labor, money, effort, and decision making into responding to that incident.

Every one of those decisions, large or small, is an opportunity for somebody or something to "mess with" what you had planned and what you want and need to accomplish:

* Competitors can learn what you’re planning to do.
* Customer requests can be mishandled, misrouted, or ignored, which may lead to customers taking their business elsewhere.
* Costs can be erroneously increased, and revenues can be lost.

**Decision assurance**, then, consists of protecting the availability, reliability, and integrity of the four main components of the decision process:

* The knowledge we already have (our memory and experience), including knowledge of our goals, objectives, and priorities
* New information we receive from others (the marketplace, customers, others in the organization, and so on)
* Our cognitive ability to think and reason with these two sets of information and to come to a decision
* Taking action to carry out that decision or to communicate that decision to others, who will then be responsible for taking action

One of the most powerful decision assurance tools that managers and leaders can use at almost any organizational level is to “sanity-check” the inputs, the thinking, and the proposed actions with other people before committing to a course of action. “Does this make sense?” is a question that experience suggests ought to be asked often but isn’t. For information security specialists, checking your facts, your stored knowledge, your logic, and your planning with others can take many different forms:

* Sharing or pooling risk management information with others in your marketplace, with insurers or re-insurers, or with key stakeholders
* Actively participating in threat and risk reduction communities of practice, information exchanges, and community emergency response planning groups, which might include representation from local and national government authorities
* Using “anti-groupthink” processes and techniques to prevent your decision processes from stifling new voices or contrary views
* Finding ways to be “surprise-tolerant” so that unanticipated observations about day-to-day operational events can generate possible new insight
* Building, maintaining, and using mentors, peer groups, and trusted advisory groups, both from within the organization and from outside

### Risk Visibility & Reporting (e.g., Risk Register, Sharing Threat Intelligence, Common Vulnerability Scoring System (CVSS))

Three observations are important here, so important that they are worth considering as rules in and of themselves:
* Rule 1: All things will end. Systems will fail; parts will wear out. People will get sick, quit, die, or change their minds. Information will never be complete or absolutely accurate or true.
* Rule 2: The best you can do in the face of Rule 1 is spend money, time, and effort making some things more robust and resilient at the expense of others, and thus trade off the risk of one kind of failure for another.
* Rule 3: There’s nothing you can do to avoid Rule 1 and Rule 2.

![Four Faces of Risk](images/four-faces-of-risk.png)

Risk management, then, is trading off effort and resources now to reduce the possibility of a risk occurring later and, if it does occur, to limit the damage it can cause to us or to the things, people, and objectives we hold important. The impact or loss that can happen to us when a risk goes from being a possibility to a real occurrence (when it becomes an incident) is often looked at first in terms of how it affects our organization's goals, objectives, systems, and our people. This provides four ways of looking at risk, none of which is the one right way. All of these perspectives have something to reveal to us about the information risks our organization may be facing.

![The Layered View](images/the-layered-view.png)

These layers of function may take physical, logical, and administrative forms throughout every human enterprise:

* **Physical systems elements** are typically things such as buildings, machinery, wiring systems, and the hardware elements of IT systems. The land surrounding the buildings, the fences and landscaping, lighting, and pavements are also some of the physical elements you need to consider as you plan for information risk management. The physical components of infrastructures, such as electric power, water, sewer, storm drains, streets and transportation, and trash removal, are also important. What’s missing from this list? People. People are of course physical (perhaps illogical?) elements that should not be left out of our risk management considerations!
* **Administrative elements** are the policies, procedures, training, and expectations that we spell out for the humans in the organization to follow. These are typically the first level at which legal and regulatory constraints or directives become a part of the way the organization functions.
* **Logical elements** (sometimes called **technical elements**) are the software, firmware, database, or other control systems settings that you use to make the physical elements of the organization’s IT systems obey the dictates and meet the needs of the administrative ones.

#### Outcomes-Based Risk
This face of risk looks at why people or organizations do what they do or set out to achieve their goals or objectives. The outcomes of achieving those goals or objectives are the tangible or intangible results we produce, the harvest we reap.

Here's a hypothetical example: Search Improvement Engineering (SIE) is a small software development company that makes and markets web search optimization aids targeted to mobile phone users. SIE’s chief of product development wants to move away from in-house computers, servers, and networks and start using cloud-based integrated development and test tools instead; this, she argues, will reduce costs, improve overall product quality and sustainability, and eliminate risks of disruption that owning (and maintaining) their own development computer systems can bring. The outcome is to improve software product quality, lower costs, and enable the company to make new products for new markets. This further supports the higher-level outcomes of organizational survival, financial health, growth, and expansion. One outcomes-based risk would be the disclosure, compromise, or loss of control over SIE’s designs, algorithms, source code, or test data to other customers operating on the cloud service provider’s systems.

#### Process-Based Risk

Everything we want to achieve or do requires us to take some action; action requires us to make a decision. Even if it’s only one action that flows from one decision, that's a process. In organizational terms, a **business process** takes a logical sequence of purpose, intention, conditions, and constraints and structures them as a set of systematic actions and decisions in order to carry them out. This **business logic**, and the business processes that implement it, also typically provide indicators or measurements that allow operators and managers to monitor the execution of the process, assess whether key steps are working correctly, signal completion of the process (and thus perhaps trigger the next process), or issue an alarm to indicate that attention and action are required. When a task (a process step) fails to function properly, this can either stop the process completely or lead to erroneous results.

#### Asset-Based Risk

Broadly speaking, an asset is anything that the organization (or the individual) has, owns, uses, or produces as part of its efforts to achieve some of its goals and objectives. Buildings, machinery, or money on deposit in a bank are examples of hard, or tangible assets. The people in your organization (including you!), the knowledge that is recorded in the business logic of your business processes, your reputation in the marketplace, the intellectual property that you own as patents or trade secrets, and every bit of information that you own or use are examples of soft, or intangible assets. Assets are the tools you use to perform the steps in your business processes; without assets, the best business logic cannot do anything.

#### Threat-Based (or Vulnerability-Based) Risk

These are two sides of the same coin, really. Threat actors (natural or human) are things that can cause damage and destruction leading to loss. Vulnerabilities are weaknesses within systems, processes, assets, and so forth that are points of potential failure. When (not if) they fail, they result in damage, disruption, and loss. Typically, threats or threat actors exploit (make use of) vulnerabilities. Threats can be natural (such as storms or earthquakes), accidental (failures of processes or systems due to unintentional actions or normal wear and tear, causing a component to fail), or deliberate actions taken by humans or instigated by humans. Such intentional attackers have purposes, goals, or objectives they seek to accomplish; Mother Nature or a careless worker does not intend to cause disruption, damage, or loss.

As an example, consider a typical small office/home office (SOHO) IT network, consisting of a modem/router, a few PCs or laptops, and maybe a network attached printer and storage system. A thunderstorm can interrupt electrical power; the lack of a backup power supply is a weakness or vulnerability that the thunderstorm unintentionally exploits. By contrast, the actions of the upstairs neighbors or passers-by who try to “borrow some bandwidth” and make use of the SOHO network’s wireless connection will most likely degrade service for authorized users, quite possibly leading to interruptions in important business or personal tasks. This is deliberate action, taken by threat actors, that succeeds perhaps by exploiting poorly configured security settings in the wireless network, whether its intention was hostile (e.g., willful disruption) or merely inconsiderate.

#### Risk Register

At this point, the organization or business needs to be building a risk register, a central repository or knowledge bank of the risks that have been identified in its business and business process systems. This register should be a living document, constantly refreshed as the company moves from risk identification through mitigation to the “new normal” of operations after instituting risk controls or countermeasures.

As an internal document, a company’s risk register is a compendium of its weaknesses and should be treated as closely held, confidential, proprietary business information. In the wrong hands, it would give a would-be attacker, a competitor, or a disgruntled employee powerful insight into ways that the company might be vulnerable to attack. This need to protect the confidentiality of the risk register becomes even more acute as the register is updated from first-level outcomes-based or process-based identification through impact assessments, and then linked (as you’ll see in the next chapter, “Operationalizing Risk Mitigation”) with systems vulnerability or root cause/proximate cause assessments.
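To make the idea of a living register concrete, here is a minimal Python sketch. The class names, field names, status values, and sample risks are all illustrative assumptions, not a standard schema; a real register would also need the access controls and confidential storage described above.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class RiskBasis(Enum):
    """The four perspectives on risk described in this section."""
    OUTCOME = "outcome"
    PROCESS = "process"
    ASSET = "asset"
    THREAT = "threat_or_vulnerability"


@dataclass
class RiskEntry:
    # Field names and status values are hypothetical, for illustration only.
    risk_id: str
    description: str
    basis: RiskBasis
    status: str = "identified"  # e.g. identified -> assessed -> mitigated
    last_reviewed: date = field(default_factory=date.today)


class RiskRegister:
    """A minimal in-memory register keyed by risk ID."""

    def __init__(self) -> None:
        self._entries = {}

    def add(self, entry: RiskEntry) -> None:
        self._entries[entry.risk_id] = entry

    def update_status(self, risk_id: str, status: str) -> None:
        # Refresh the review date on every change so stale entries stand out.
        entry = self._entries[risk_id]
        entry.status = status
        entry.last_reviewed = date.today()

    def by_basis(self, basis: RiskBasis) -> list:
        return [e for e in self._entries.values() if e.basis is basis]


register = RiskRegister()
register.add(RiskEntry("R-001", "Source code disclosed to other cloud tenants",
                       RiskBasis.OUTCOME))
register.add(RiskEntry("R-002", "No backup power for the office network",
                       RiskBasis.THREAT))
register.update_status("R-002", "mitigated")
```

Recording a review date with every status change is one simple way to keep the register "living": entries that have not been revisited recently are easy to find.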


### Risk Management Concepts (e.g., Impact Assessments, Threat Modelling, Business Impact Analysis (BIA))

Information security best practices suggest a good minimum set of "when in doubt" actions to ensure that the organization:
* Physically protects and secures information systems, information storage (paper or electronic), and supporting infrastructure
* Controls access by all users, visitors, and guests, such as with usernames and passwords, for all computer systems
* Controls disclosure and disposal of information and information systems
* Trains all staff (or anyone with access) on these minimum security measures

This "safe computing" or **computing hygiene** standard is a proven starting point for any organization.

Two sets of information provide a rich source of information security requirements for an organization. The first is the legal, regulatory, and cultural context in which the organization must exist. As stated before, failure to fulfill these obligations can put the organization out of existence, and its leaders, owners, stakeholders (and even its employees) at risk of civil or criminal prosecution. The second set of information that should drive the synthesis of information security requirements is the organization’s BIA.

Information security requirements are typically expressed within an organization in two major ways. The first is to write a system requirements specification (SRS), a formal document used to capture high-level statements of function, purpose, and intent. An SRS also contains important system-level constraints. It guides or directs analysts and developers as they design, build, test, deploy, and maintain an information system; it also drives end-user training activities.

Organizations also write and implement policies and procedures that state what the information security requirements are and what the people in the organization need to do to fulfill them and comply with them:
* **Policies** are broad statements of direction and intention; in most organizations, they establish direction and provide constraints to leaders, managers, and the workforce. Policies direct or dictate what should be done, to what standards of compliance, who does it, and why they should do it. Policies are usually approved (“signed out”) by senior leadership, and are used to guide, shape, direct, and evaluate the performance of the people who are affected by the policies; they are thus considered administrative in nature.
* **Procedures** take the broad statements expressed in policies and break them down into step-by-step detailed instructions to those people who are assigned responsibility to perform them. Procedures state how a task needs to be performed and should also state what constraints or success criteria apply. As instructions to people who perform these tasks, procedures are administrative in nature.

You might ask which should come first, the SRS or the policies and procedures. Once senior leadership agrees to a statement of need, it's probably faster to publish a policy and a new procedure than it is to write the SRS, design the system, test it, deliver it, and train users on the right ways to use it. But be careful! It often takes a lot of time and effort for the people in an organization to operationalize a new policy and the procedures that come with it. Overlooking this training hurdle can cause the new policy or procedures to fail.

#### Business Impact Analysis (BIA)

The business impact analysis (BIA) is where the rubber hits the road, so to speak. Risk management must be a balance of priorities, resources, probabilities, and impacts, as you’ve seen throughout this chapter. All this comes together in the BIA. As its name implies, the BIA is a consolidated statement of how different risks could impact the prioritized goals and objectives of an organization.

The BIA reflects a combination of due care and due diligence in that it combines "how we do business" with "how we know how well we're doing it".

There is no one right, best format for a BIA; instead, each organization must determine what its BIA needs to capture and how it has to present it to achieve a mix of purposes:
* BIAs should inform, guide, and shape risk management decisions by senior leadership.
* BIAs should provide the insight to choose a balanced, prudent mix of risk mitigation tactics and techniques.
* BIAs should guide the organization in accepting residual risk to goals, objectives, processes, or assets in areas where this is appropriate.
* BIAs may be required to meet external stakeholder needs, such as for insurance, financial, regulatory, or other compliance purposes.

You must recognize one more important requirement at this point: to be effective, a BIA must be kept up to date. The BIA must reflect today's set of concerns, priorities, assets, and processes; it must reflect today's understanding of threats and vulnerabilities. Outdated information in a BIA could at best lead to wasted expenditures and efforts on risk mitigation; at worst, it could lead to failures to mitigate, prevent, or contain risks that could lead to serious damage, injury, or death, or possibly put the organization out of business completely.

At its heart, making a BIA is pretty simple: you identify what's important, estimate how often it might fail, and estimate the costs to you of those failures. You then rank those possible impacts in terms of which basis for risk best suits your organization, be that outcomes, processes, assets, or vulnerabilities. For all but the simplest and smallest of organizations, however, the amount of information that has to be gathered, analyzed, organized, assessed, and then brought together in the BIA can be overwhelming. The BIA is one of the most critical steps in the information risk management process, end to end; it's also perhaps the most iterative, the most open to reconsideration as things change, and the most in need of being kept alive, current, and useful. Most of that is well beyond the scope of the SSCP examination, and so we won’t go into the mechanics of the business impact analysis process in any further detail. As an SSCP, however, you’ll be expected to continue to grow your knowledge and skills, thus becoming a valued contributor to your organization’s BIA.
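As a rough illustration of the identify, estimate, and rank idea only (not of a full BIA process), the following Python sketch scores a few hypothetical items by expected yearly cost and groups them by basis for risk. Every name and figure is made up for the example.

```python
from collections import defaultdict

# Hypothetical BIA rows: (what is important, basis for risk,
# estimated failures per year, estimated cost per failure).
bia_rows = [
    ("Customer order processing", "process", 2.0, 15_000),
    ("Source code repository", "asset", 0.1, 400_000),
    ("Office network power", "vulnerability", 4.0, 1_500),
    ("On-time product release", "outcome", 0.5, 50_000),
]


def rank_impacts(rows):
    """Return (name, basis, expected yearly cost) tuples, highest cost first."""
    scored = [(name, basis, freq * cost) for name, basis, freq, cost in rows]
    return sorted(scored, key=lambda row: row[2], reverse=True)


def group_by_basis(ranked):
    """Group ranked rows by basis, preserving the ranking within each group."""
    groups = defaultdict(list)
    for name, basis, expected in ranked:
        groups[basis].append((name, expected))
    return dict(groups)


ranked = rank_impacts(bia_rows)
for name, basis, expected in ranked:
    print(f"{name:28s} {basis:14s} {expected:>10,.0f}")
```

Note how the rarely failing but expensive item (the source code repository) outranks the frequently failing but cheap one (office power): frequency and cost have to be considered together.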

### Risk Management Frameworks (e.g., ISO, NIST)

A **risk management framework** is a set of concepts, tools, processes, and techniques that help organize information about risk. As you’ve no doubt started to see, the job of managing risks to your information is a set of many jobs, layered together. More than that, it’s a set of jobs that changes and evolves with time as the organization, its mission, and the threats it faces evolve.

Let’s start by taking a quick look at NIST Special Publication 800-37, Risk Management Framework (RMF) for Information Systems and Organizations: A System Life Cycle Approach for Security and Privacy. In its updated draft form of May 2018, this RMF establishes a broad, overarching perspective on what it calls the fundamentals of information systems risk management. Organizational leadership and management must address these areas of concern, shown conceptually in the figure below:

![NIST RMF Areas of Concern](images/nist-rmf-areas-of-concern.png)

1. Organization-wide risk management
2. Information security and privacy
3. System and system elements
4. Control allocation
5. Security and privacy posture
6. Supply chain risk management

You can see that there’s an expressed top-down priority or sequence here. It makes little sense to worry about your IT supply chain (which might be a source of malware-infested hardware, software, and services) if leadership and stakeholders have not first come to consensus about risks and risk management at the broader, strategic level. (You should also note that in NIST’s eyes, the big-to-little picture goes from strategic, to operational, to tactical, which is how many in government and the military think of these levels. Business around the world, though, sees it as strategic, to tactical, to day-to-day operations.)

The RMF goes on by specifying seven major phases (which it calls steps) of activities for information risk management:
1. Prepare
2. Categorize
3. Select
4. Implement
5. Assess
6. Authorize
7. Monitor

It is tempting to think of these as step-by-step sets of activities: for example, once all risks have been categorized, you then start selecting which are the most urgent and compelling to make mitigation decisions about. Real-world experience shows, though, that each step in the process reveals things that may challenge the assumptions we just finished making, causing us to reevaluate what we thought we knew or decided in that previous step. It is perhaps more useful to think of these steps as overlapping sets of attitudes and outlooks that frame and guide how overlapping sets of people within the organization do the data gathering, inspection, analysis, problem solving, and implementation of the chosen risk controls. The figure below shows that there's a continual ebb and flow of information, insight, and decision between and across all elements of these "steps".

![NIST RMF Phased Approach](images/nist-rmf-phased-approach.png)

Although NIST publications are directive in nature for U.S. government systems, and indirectly provide strong guidance to the IT security market in the United States and elsewhere, many other information risk management frameworks are in widespread use around the world. For example, the International Organization for Standardization publishes ISO Standard 31000:2018, Risk Management Guidelines, in which the same concepts are arranged in slightly different fashion. First, it suggests that three main tasks must be done (and in broad terms, done in the order shown):
1. Scope, Context, Criteria
2. Risk Assessment, consisting of Risk Identification, Risk Analysis, and Risk Evaluation
3. Risk Treatment 

Three additional, broader functions support or surround these central risk mitigation tasks:

4. Recording and Reporting
5. Monitoring and Review
6. Communication and Consultation

As you can see in the figure below, the ISO RMF also conveys a sense that on the one hand, there is a sequence of major activities, but on the other hand, these major steps or phases are closely overlapping.

![ISO 31000:2018 Conceptual RMF](images/iso-31000-2018-conceptual-rmf.png)

#### Plan, Do, Check, Act (PDCA)

The Project Management Institute and many other organizations talk about the basic cycle of making decisions, taking steps to carry out those decisions, monitoring and assessing the outcomes, and taking further actions to correct what's not working and strengthen or improve what is.

One important idea to keep in mind is that these cycles of Plan, Do, Check, Act (PDCA) don’t just happen one time: they repeat, they chain together in branches and sequels, and they nest one inside the other, as you can see in the figure below. Note too that planning is a forward-looking, predictive, thoughtful, and deliberate process. We plan our next vacation before we put in for leave or make hotel and travel arrangements; we plan how to deal with a major disruption due to bad weather before the tornado season starts!

![PDCA Cycle Diagram with Subcycles](images/pdca-cycle-diagram-with-subcycles.png)

* **Planning** is the process of laying out the step-by-step path we need to take to go from “where we are” to “where we want to be.” It’s a natural human activity; we do this every moment of our lives. Our most potent tools for planning are what Kipling called his “six honest men”—asking what, why, when, how, where, and who of almost everything we are confronted with and every decision we have to make. As an SSCP, you need those six honest teammates with you at all times!
* **Doing** encompasses everything it takes to accomplish the plan. From the decisions to “execute the plan” on through all levels of action, this phase is where we see people using new or different business processes to achieve what the plan needs to accomplish, using the steps the plan asks for.
* **Checking** is part of conducting due diligence on what the plan asked us to achieve and how it asked us to get it done. We check that tasks are getting done, on time, to specification; we check that errors or exceptions are being handled correctly. And of course, we gather this feedback data and make it available for further analysis, process improvement, and leadership decision making.
* **Acting** involves making decisions and taking corrective or amplifying actions based on what the checking activities revealed. In this phase, leaders and managers may agree that a revised plan is needed, or that the existing plan is working fine but some individual processes need some fine-tuning to achieve better results.

#### Risk Assessment

Risk assessment is a systematic process of identifying risks to achieving organizational priorities.

At the heart of a risk assessment process must be the organizational goals and objectives, suitably prioritized. Typically, the highest priorities are existential ones—ones that relate to the continued existence and health of the organization. These often involve significant threats to continued operation or significant and strategic opportunities for growth. Other priorities may be vitally important in the near term, but other options may be available if the chosen favorite fails to be successful. The “merely nice to have” objectives may fall lower in the risk assessment process. This continual reevaluation of priorities allows the risk assessment team to focus on the most important, most compelling risks first.

The next major element of risk assessment is to thoroughly examine and evaluate the processes, assets, systems, information, and other elements of the organization as they relate to or support achieving these prioritized goals and objectives. This linkage of “what” and “how” with “why” helps narrow the search for system elements or process steps that, if they fail or are vulnerable to exploitation, could put these goals in jeopardy.

Most risk assessment processes summarize their findings in some form of BIA. The BIA states the costs (in money, time, and resources) that the organization could face if the risk events do occur. It also takes each risk and assesses how frequently it might occur. The expected cost of these risks (their costs multiplied by their frequencies and probabilities of occurrence, across the organization) represents the anticipated financial impact of that risk over time; this is a key input to making risk mitigation or control choices.

What happens when an organization’s information is lost, compromised by disclosure to unauthorized parties, or corrupted? These questions (which reflect the CIA triad) indicate what the organization stands to lose if such a breach of information security happens. Let’s illustrate with a few examples:

* **Personally identifying information (PII):** Loss or compromise can cause customers to take their business elsewhere and can lead to criminal and civil penalties for the organization and its owners, stakeholders, leaders, and employees.
* **Company financial data, and price and cost information:** Loss or compromise can lead to loss of business, to investors withdrawing their funds, or to loss of business opportunities as vendors and partners go elsewhere. It can also result in civil and criminal penalties.
* **Details about internal business processes:** Loss could lead to failures of business processes to function correctly; compromise could lead to loss of competitive advantage, as others in the marketplace learn how to do your business better.
* **Risk management information:** Loss or compromise could lead to insurance policies being canceled or premiums being increased, as insurers conclude that the organization cannot adequately fulfill its due diligence responsibilities.

When we view information in such terms—as “What does it cost us if we lose it?”—we decide how vital the information is to us. What this categorization or classification really does is tell us how important it is to protect that information, based on possible loss or impact. We categorize our possible losses, in terms of severity of damage, impact, or costs; we also categorize them in terms of outcomes, processes, and assets they have or depend on. Finally, we categorize them by threat or common vulnerabilities. This kind of risk analysis can help us identify critical locations, elements, or objectives that could be putting the entire organization at risk; in doing so, that focuses our risk analysis further.

Risk analysis is a complex undertaking and often involves trying to sort out what can cause a risk to become an incident. **Root cause analysis** looks to find what the underlying vulnerability or mechanism of failure is that leads to the incident, for example. By contrast, **proximate cause analysis** asks, “What was the last thing that happened that caused the risk to occur?” (This is sometimes called the “last clear opportunity to prevent” the incident, a term that insurance underwriters and their lawyers often use.) Our earlier example of backing your car out of the driveway, only to run over a child’s bicycle left in the wrong place, illustrates these ideas. You could have looked first, maybe even walked around the car before you got in and started to drive; you had the last clear opportunity to prevent damage, and thus your actions were the proximate cause. (You failed in your due diligence, in other words.) Your child, however, is the one who left the bicycle in the wrong place; the root of the problem may be the failure to help your child learn and appreciate what his responsibility of due care for his bicycle requires. And who was responsible for teaching due care to your child? (A word of advice: don’t say “My spouse.”)

We’ve looked at a number of examples of risks becoming incidents; for each, we’ve identified an outcome that describes what might happen (customers go to our competitors; we must get our car and the bicycle repaired). Outcomes are part of the basis of estimate with which we can make two kinds of **risk assessments**: quantitative and qualitative.

#### Quantitative Risk Assessment

Quantitative assessments use simple techniques (like counting possible occurrences, or estimating how often they might occur) along with estimates of the typical cost of each loss:
* Single loss expectancy (SLE): Usually measured in monetary terms, SLE is the total cost you can reasonably expect should the risk event occur. It includes immediate and delayed costs, direct and indirect costs, costs of repairs, and restoration. In some circumstances, it also includes lost opportunity costs, or lost revenues due to customers needing or choosing to go elsewhere.
* Annual rate of occurrence (ARO): ARO is an estimate of how often during a single year this event could reasonably be expected to occur.
* Annual loss expectancy (ALE): ALE is the total expected losses for a given year and is determined by multiplying the SLE by the ARO.
* Safeguard value: This is the estimated cost to implement and operate the chosen risk mitigation control. You cannot know this until you’ve chosen a risk control or countermeasure and an implementation plan for it; we’ll cover that in the next chapter.
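The arithmetic behind these terms can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the function name and every figure below are hypothetical, and the final comparison of safeguard value against ALE is only a rough sanity check that assumes the control would remove most of the expected loss.

```python
def annual_loss_expectancy(sle: float, aro: float) -> float:
    """ALE = SLE x ARO, following the definitions in the list above."""
    return sle * aro


# Hypothetical figures, chosen only for illustration.
sle = 20_000.0  # single loss expectancy: expected cost of one occurrence
aro = 0.5       # annual rate of occurrence: about once every two years
ale = annual_loss_expectancy(sle, aro)

# Hypothetical yearly cost to implement and operate a control.
safeguard_value = 4_000.0

print(f"ALE: {ale:,.2f}")
print("Safeguard costs less than the expected annual loss:", safeguard_value < ale)
```

A result like this is only as good as the SLE and ARO estimates that feed it, so it is best treated as an input to the decision rather than the decision itself.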

Other numbers associated with risk assessment relate to how the business or organization deals with time when its systems, processes, and people are not available to do business. This “downtime” can often be expressed as a mean (or average) allowable downtime, or a maximum downtime. Times to repair or restore minimum functionality, and times to get everything back to normal, are also some of the numbers the SSCP will need to deal with. For example:
* The maximum acceptable outage (MAO) is the maximum time that a business process or task cannot be performed without causing intolerable disruption or damage to the business. Sometimes referred to as the maximum tolerable outage (MTO), or the maximum tolerable period of disruption (MTPOD), determining this maximum outage time starts with first identifying mission-critical outcomes. These outcomes, by definition, are vital to the ongoing success (and survival!) of the organization; thus, the processes, resources, systems, and no doubt people they require to properly function become mission-critical resources. If only one element of a mission-critical process is unavailable, and no immediate substitute or workaround is at hand, then the MAO clock starts ticking.
* The mean time to repair (MTTR), or mean time to restore, reflects our average experience in doing whatever it takes to get the failed system, component, or process repaired or replaced. The MTTR must include time to get suitable staff on scene who can diagnose the failure, identify the right repair or restoration needed, and draw from parts or replacement components on hand to effect repairs. MTTR calculations should also include time to verify that the repair has been done correctly and that the repaired system works correctly. This last requirement is very important: it does no good at all to swap out parts and say that something is fixed if you cannot assure management and users that the repaired system is now working the way it needs to in order to fulfill mission requirements.

These types of quantitative assessments help the organization understand what a risk can do when it actually happens (becomes an incident) and what it will take to get back to normal operations and clean up the mess it caused. One more important question remains: how long to repair and restore is too long? Two more “magic numbers” shed light on this question:
* The recovery time objective (RTO) is the amount of time in which system functionality or ability to perform the business process must be back in operation. Note that the RTO must be less than or equal to the MAO (if not, there’s an error in somebody’s thinking). As an objective, RTO asks systems designers, builders, maintainers, and operators to strive for a better, faster result. But be careful what you ask for; demanding too rapid an RTO can cause more harm than it deflects by driving the organization to spend far more than makes bottom-line sense.
* The recovery point objective (RPO) measures the data loss that is tolerable to the organization, typically expressed in terms of how much data needs to be loaded from backup systems in order to bring the operational system back up to where it needs to be. For example, an airline ticketing and reservations system takes every customer request as a transaction, copies the transactions into log files, and processes the transactions (which causes updates to its databases). Once that’s done, the transaction is considered completed. If the database is backed up in its entirety once a week, let’s say, then if the database crashes five days after the last backup, that backup is reloaded and then five days’ worth of transactions must be reapplied to the database to bring it up to where customers, aircrew, airport staff, and airplanes expect it to be. Careful consideration of an RPO allows the organization to balance costs of routine backups with time spent reapplying transactions to get back into business.
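The relationships among these numbers lend themselves to a simple consistency check. The sketch below is illustrative only: `RecoveryTargets`, `check_targets`, and `meets_rpo` are hypothetical names, the figures are invented, the RPO is assumed to be expressed as days of transactions to reapply, and the MTTR-versus-RTO comparison is an added assumption (that typical repairs should fit inside the objective) rather than a rule stated above.

```python
from dataclasses import dataclass


@dataclass
class RecoveryTargets:
    mao_hours: float   # maximum acceptable outage
    rto_hours: float   # recovery time objective
    mttr_hours: float  # mean time to repair, from past experience


def check_targets(t: RecoveryTargets) -> list[str]:
    """Return a list of inconsistencies; an empty list means none were found."""
    problems = []
    if t.rto_hours > t.mao_hours:
        # The text requires RTO <= MAO.
        problems.append("RTO exceeds MAO")
    if t.mttr_hours > t.rto_hours:
        # Assumption: typical repairs should fit inside the objective.
        problems.append("MTTR exceeds RTO")
    return problems


def meets_rpo(backup_interval_days: float, rpo_days: float) -> bool:
    """Worst case, a crash just before the next backup means reapplying
    almost a full backup interval of logged transactions."""
    return backup_interval_days <= rpo_days


# Hypothetical targets for one business process.
targets = RecoveryTargets(mao_hours=24, rto_hours=8, mttr_hours=4)
print(check_targets(targets))

# Weekly full backups measured against a hypothetical one-day RPO.
print(meets_rpo(backup_interval_days=7, rpo_days=1))
```

In the airline example, weekly backups would fail a one-day RPO under this check, which suggests either more frequent backups or accepting a longer replay of logged transactions.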

We’ll go into these numbers (and others) in greater depth in Chapter 10 as you learn how to help your organization plan for and manage its response to actual information security and assurance incidents. It’s important that you realize that these numbers play three critical roles in your integrated, proactive information defense efforts. All of these quantitative assessments (plus the qualitative ones as well) help you achieve the following:
* Establish the “pain points” that lead to information security requirements that can be measured, assessed, implemented, and verified.
* Shape and guide the organization’s thinking about risk mitigation control strategies, tactics, and operations, and keep this thinking within cost-effective bounds.
* Dictate key business continuity planning needs and drive the way incident response activities must be planned, managed, and performed.

One final thought about the “magic numbers” is worth considering. The organization’s leaders hold their stakeholders’ personal and professional fortunes and futures in their hands. Exercising due diligence requires that management and leadership be able to show, by the numbers, that they’ve fulfilled that obligation and, when disaster strikes, brought the organization back from the brink of irreparable harm. Those stakeholders (the organization’s investors, customers, neighbors, and workers) need to trust in the leadership and management team’s ability to meet the bottom line every day. Solid, well-substantiated numbers like these help the stakeholders trust, but verify, that their team is doing its job.

#### Qualitative Risk Assessment
Qualitative assessments focus on an inherent quality, aspect, or characteristic of the risk as it relates to the outcome(s) of a risk occurrence. “Loss of business” could be losing a few customers, losing many customers, or closing the doors and going out of business entirely!

So, which assessment strategy works best? The answer is both. Some risk situations may present us with things we can count, measure, or make educated guesses about in numerical terms, but many do not. Some situations clearly identify existential threats to the organization (the occurrence of the threat puts the organization completely out of business); again, many situations are not as clear-cut. Senior leadership and organizational stakeholders find both qualitative and quantitative assessments useful and revealing.

Qualitative assessment of information is most often used as the basis of an information classification system, which labels broad categories of data to indicate the range of possible harm or impact. Most of us are familiar with such systems through their use by military and national security communities. Such simple hierarchical information classification systems often start with “Unclassified” and move up through “For Official Use Only,” “Confidential,” “Secret,” and “Top Secret” as their way of broadly outlining how severely the nation would be impacted if the information was disclosed, stolen, or otherwise compromised. Yet even these cannot stay simple for long.

Businesses, private organizations, and the military have another aspect of data categorization in common: the concept of need to know. Need to know limits who has access to read, use, or modify data based on whether their job functions require them to do so. Thus, a school’s purchasing department staff have a need to know about suppliers, prices, specific purchases, and so forth, but they do not need to know any of the PII pertaining to students, faculty, or other staff members. Need to know leads to compartmentalization, which creates procedural boundaries (administrative controls) around such sets of information.
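A need-to-know rule can be pictured as a lookup from job function to the categories of information that function requires. The sketch below is a toy illustration, not an access control design: the `NEED_TO_KNOW` table, the `registrar` department, the category names, and the `may_access` function are all hypothetical.

```python
# Hypothetical mapping of departments to the information categories
# their job functions require.
NEED_TO_KNOW = {
    "purchasing": {"suppliers", "prices", "purchases"},
    "registrar": {"student_pii"},
}


def may_access(department: str, category: str) -> bool:
    """Allow access only when the category is listed for the department.
    Unknown departments get no access by default."""
    return category in NEED_TO_KNOW.get(department, set())


print(may_access("purchasing", "prices"))       # purchasing needs price data
print(may_access("purchasing", "student_pii"))  # but has no need for student PII
```

Denying by default for anything not listed mirrors the compartment idea: access follows from a documented job requirement, not from the absence of a prohibition.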

### Risk Treatment (e.g., Accept, Transfer, Mitigate, Avoid, Recast)

Four strategic choices exist when we think of how to protect prioritized assets, outcomes, or processes. These choices are at the strategic level because, by their nature, they are comparable to “life-or-death” choices for the organization. A strategic risk might force the company to choose between abandoning a market or opportunity and taking on a fundamental, gut-wrenching level of change throughout its ethics, culture, processes, or people, for example. We see such choices almost before we’ve started to think about what the alternatives might cost and what they might gain us. These strategic choices are often used in combination to achieve the desired level of assurance against risk. As an SSCP, you’ll assist your organization in making these choices across strategic, tactical, and operational levels of planning, decision making, and actions that people and the organization must take. Note that each of these choices is a verb; these are things that you do, actions you perform. This is key to understanding which ones to choose and how to use them successfully. We’ll look at each individually, and then take a closer look at how they combine and mutually reinforce each other to attain greater protective effect.

There are choices at the strategic and tactical level that seem quite similar and are often mistaken as identical. The best way to keep them separate in your mind might be as follows:

* If you’ve just completed the risk assessment and BIA, your strategic choices are about operational risk mitigation planning and which risks to deal with in other ways. This is the strategic choice (as you’ll see) of deterring, detecting, preventing, or avoiding a risk altogether. Note that prevent, deter, and detect will probably involve choices of risk mitigation controls, but you cannot make those choices until after you’ve done the architectural and vulnerability assessments.

* If you’ve already done the architectural and vulnerability assessments, as we’ll cover in Chapter 4, you’re ready to start making hard mitigation choices for the risks you’re not going to avoid altogether. These are tactical choices you’ll be making, as they will dictate how, when, and to what degree of completeness you implement operational (day-to-day), functional choices in the ways you try to control risks.

Having identified the risks and prioritized them, what next? What realistic options exist? One (more!) thing to keep in mind is that as you delve into the details of your architecture, and find, characterize, and assess its vulnerabilities against the prioritized set of risks, you will probably find some risks you thought you could and should “fix” that prove far too costly or disruptive to attempt to do so. That’s okay. Like any planning process, risk management and risk mitigation taken together are a living, breathing, dynamic set of activities. Let these assessments shed light on what you’ve already thought about, as well as what you haven’t seen before.

#### Deter

To **deter** means to discourage or dissuade someone from taking an action because of their fear or dislike of the possible consequences. Deterring an attacker means that you get them to change their mind and choose to do something else instead. Your actions and your posture convince the attacker that what they stand to gain by launching the attack will probably not be worth the costs to them in time, resources, or other damages they might suffer (especially if they are caught by law enforcement!). Your actions do this by working on the attacker’s decision cycle. Why did they pick you as a target? What do they want to achieve? How probable is it that they can complete the attack and escape without being caught? What does it cost them to prepare for and conduct the attack? If you can cast sufficient doubt into the attacker’s mind on one or more of these questions, you may erode their confidence; at some point, the attacker gives up and chooses not to go through with their contemplated or planned attack.

By its nature, deterrence is directed at an active, willful threat actor. Try as you might, you cannot deter an accident, nor can you command the tides not to flood your datacenter. You do have, however, many different ways of getting into the attacker’s decision cycle, demotivating them, and shaping their thinking so that they go elsewhere:

* Physical assets such as buildings (which probably contain or protect other kinds of assets) may have very secure and tamper-proof doors, windows, walls, or rooflines that prevent physical forced entry. Guard dogs, human guards or security patrols, fences, landscaping, and lighting can make it obvious that an attacker has very little chance to approach the building without being detected or prevented from carrying out their attack.
* Strong passwords and other access control technologies can make it visibly difficult for an attacker to hack into your computer systems (be they local or cloud-hosted).
* Policies and procedures can be used to train your people to make them less vulnerable to social-engineering attacks.

Deterrence can be passive, active, or a combination of the two. Fences, the design of parking, access roads and landscaping, and lighting tend to be passive deterrence measures; they don’t take actions in response to the presence of an attacker, for example. Active measures give the defender the opportunity to create doubt in the attacker’s mind: Is the guard looking my way? Is anybody watching those CCTV cameras?

#### Detect

To **detect** means to notice or consciously observe that an event of interest is happening. Notice the built-in limitation here: you have to first decide what set of events to “be on the lookout for” and therefore which events you possibly need to make action decisions about in real time. While you’re driving your car down a residential street, for example, you know you have to be watching for other cars, pedestrians, kids, dogs, and others darting out from between parked cars—but you normally would “tune out” watching the skies to see if an airplane was about to try to land on the street behind you. You also need to decide what to do about false alarms, both the false positives (that alarm when an event of interest hasn’t occurred) and the false negatives (the absence of an alarm when an event is actually happening).
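The two kinds of detection error can be made concrete by tallying alarm outcomes against what actually happened. This is a minimal sketch with invented observations; the `classify` function is hypothetical, and the labels "true positive" and "true negative" are the usual companion terms, added here for completeness.

```python
from collections import Counter


def classify(alarm_raised: bool, event_occurred: bool) -> str:
    """Label one observation by comparing the alarm with reality."""
    if alarm_raised and event_occurred:
        return "true positive"
    if alarm_raised and not event_occurred:
        return "false positive"   # alarm, but no event of interest
    if not alarm_raised and event_occurred:
        return "false negative"   # event happened, but no alarm
    return "true negative"


# Hypothetical observations: (alarm_raised, event_occurred)
observations = [
    (True, True),
    (True, False),
    (True, False),
    (False, True),
    (False, False),
]

tally = Counter(classify(alarm, event) for alarm, event in observations)
for label, count in tally.items():
    print(f"{label}: {count}")
```

Tracking both error types over time gives a basis for tuning what the detection system looks for, since reducing one kind of error often increases the other.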

If you think of how many false alarms you hear every week from car alarms or residential burglar alarms in your neighborhood, you might ask why we bother to try to detect that an event of interest might possibly be happening. Fundamentally, you cannot respond to something if you do not know it is happening. Your response might be to prevent or disrupt the event, to limit or contain the damage being caused by it, or to call for help from emergency responders, law enforcement, or other response teams. You may also need to activate alternative operations plans so that your business is not severely disrupted by the event. Finally, you do need to know what actually happened so that you can decide what corrective actions (or remediation) to take—what you must do to repair what was damaged and to recover from the disruption the incident has caused.

#### Prevent
To **prevent** an attack means to stop it from happening or, if it is already underway, to halt it in its tracks, thus limiting its damage. A thunderstorm might knock out your commercial electrical power (which is an attack, even if a nondeliberate one), but the uninterruptible power supplies keep your critical systems up and running. Heavy steel fire doors and multiple dead-bolt locks resist all but very determined attempts to cut, pry, or force an entry into your building. Strong access control policies and technologies prevent unauthorized users from logging into your computer systems. Fire-resistant construction of your home’s walls and doors is designed to increase the time you and your family have to detect the fire and get out safely before the fire spreads from its source to where you’re sleeping. (We in the computer trades owe the idea of a firewall to this pre-computer-era, centuries-old idea of keeping harm on one side of a barrier from spreading through to the other.)

Preventive defense measures provide two immediate paybacks to the defender: they limit or contain damage to that which you are defending, and they cost the attacker time and effort to get past them. Combination locks, for example, are often rated in terms of how long it would take someone to just “play with the dial” to guess the combination or somehow sense that they’ve started to make good guesses at it. Fireproof construction standards aim to prevent the fire from burning through (or initiating a fire inside the protected space through heat transfer) for a desired amount of time.

Note that we gain these benefits whether we are dealing with a natural, nonintentional threat, an accident, or a deliberate, intentional attack.

#### Avoid

To **avoid** an attack means to change what you do, and how you do it, in such ways as to not be where your attacker is expecting you to be when they try to attack you. This can be a temporary change to your planned activities or a permanent change to your operations. In this way, you can reduce or eliminate the possible disruptions or damages of an attack from natural, accidental, or deliberate causes:

* Physically avoiding an attack might involve relocating part of your business or its assets to other locations, shutting down a location during times of extremely bad weather, or even closing a branch location that’s in too dangerous a market or location.

* Logically avoiding an attack can be done by using cloud service providers to eliminate your business’s dependence on a specific computer system or set of services in a particular place. At a smaller scale, you do this by making sure that the software, data, and communications systems allow your employees to get business done from any location or while traveling, without regard to where the data and software are hosted. Using a virtual private network (VPN) to mask your public IP address is another example of using logical means to avoid the possible consequences of an attack on your IT infrastructure and information systems. (Media Access Control, or MAC, addresses are not normally visible beyond the local network segment, so a VPN is not what conceals them.)

* A variety of administrative methods can be used, usually in conjunction with physical or logical ones such as those we’ve discussed. Typically they will be implemented in policies, procedural documents, and quite possibly contracts or other written agreements.

Like everything in risk management and risk mitigation, these basic elements of choice can be combined in a wide variety of ways:

* Alarms combine detection with notification to users and systems owners; by alerting the attacker that they’ve been spotted “in the act,” the sound of the alarms may also motivate the attacker to stop the attack and leave the scene (which both prevents further damage and deters a continued or repeated attack).
* Strong protective systems can limit or contain damage during an attack, which prevents the attack from spreading; to the degree that these protective systems are visible to the attacker, they may also deter the attack by raising the costs to the attacker to commence or continue the attack. They may also raise the attacker’s fear of capture, arrest, or other losses and thus further deter attack.
* Most physical and logical attack avoidance methods require a solid policy and procedural framework, and they quite often require users and staff members to be familiar with them and even trained in their operational use.

This last point bears some further emphasis. Organizations will often spend substantial amounts of money, time, and effort to put physical and even logical risk management systems into use, only to then put minimal effort into properly defining the who, what, when, where, how, and why of their use, maintenance, and ongoing monitoring. The money spent on a strong, imposing fence around your property will ultimately go to waste without routinely inspecting it and keeping it maintained. (Has part of it been knocked down by frost heave or a fallen tree? Has someone cut an opening in it? You’ll never know if you don’t walk the fence line often.)

This suggests that continuous follow-through is in fact the weakest link in our information risk management and mitigation efforts. We’ll look at ways to improve on this in the remainder of this book.

## Perform Security Assessment Activities

### Participate in Security Testing

### Interpretation & Reporting of Scanning & Testing Results

### Remediation Validation

### Audit Finding Remediation

## Operate & Maintain Monitoring Systems

## Analyze Monitoring Results