# DS103 Metrics and Data Processing : Lesson Three Companion Notebook

### Table of Contents <a class="anchor" id="DS103L3_toc"></a>

* [Table of Contents](#DS103L3_toc)
    * [Page 1 - Introduction](#DS103L3_page_1)
    * [Page 2 - The Capability Index C<sub>p</sub>](#DS103L3_page_2)
    * [Page 3 - Calculating Process Capability](#DS103L3_page_3)
    * [Page 4 - The Capability Index C<sub>pk</sub>](#DS103L3_page_4)
    * [Page 5 - Calculating Capability Index (C<sub>pk</sub>)](#DS103L3_page_5)
    * [Page 6 - Out of Control and Out of Spec](#DS103L3_page_6)
    * [Page 7 - Key Terms](#DS103L3_page_7)
    

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 1 - Introduction<a class="anchor" id="DS103L3_page_1"></a>

[Back to Top](#DS103L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

In [1]:
from IPython.display import VimeoVideo
# Tutorial Video Name: Statistical Process Control
VimeoVideo('236613260', width=720, height=480)

# Introduction

In the last lesson, you learned about statistical process control and when a process is out of control. The control limits play a big role in determining when the process is broken and in need of repair.

In this lesson, you will learn about another type of limits, called spec limits. You will also learn how these two sets of limits interact and what conclusions can be drawn based on their interaction. You will then be introduced to the concept of capability and demonstrate how to measure capability with a few different metrics.

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 2 - The Capability Index C<sub>p</sub> <a class="anchor" id="DS103L3_page_2"></a>

[Back to Top](#DS103L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">



# The Capability Index C<sub>p</sub>

Any process that can be monitored using an SPC chart is eligible to be evaluated for capability. Recall that any process has some inherent variability in it. If the process is running consistently and behaving well, you can determine the extent of variability by using an SPC chart.

However, there is another set of important limits in a control strategy, called *spec limits*. The word "spec" is an abbreviated form of "specification," and is universally written and spoken as "spec." In the strictest of terms, a spec limit is a limit beyond which material is usually no good and has to be scrapped.

Take a look at an example: 

> Suppose you work for a company that makes wire for very high-end applications. The wire is very fine and high-purity. The application in which the wire is used is such that if the wire gets too thick, the resistance is too high, and the electronic parts containing the wire will not function properly. On the other end of the spectrum, if the wire gets too thin, the reliability of the part is not good because the electronic signal is not consistent.

This is a good example of spec limits. "Not good" just means that the material cannot be used for the purpose for which it was created.

---

## Process Capability 

Process capability has two parts, which are compared: 

* **Control Width:** A measure of variability of the process. You can think of this as the distance between the upper control limit (UCL) and the lower control limit (LCL). Control limits are calculated.
* **Spec Width:** The variability allowed by the specification, or tolerance, of the process. This is the distance between the *upper spec limit* (USL) and the *lower spec limit* (LSL). Spec limits are set. 

---

## Spec Limits

Spec limits are determined based on any one of the following criteria:

* Some sort of physical limitation based on downstream need.
* Personal tolerance as determined by a manager.
* Personal tolerance as determined by a customer.

You will examine each of these criteria in depth.

---

## Tolerance Determined by Customer

For example, a small business makes tags to be used on the front of control panels. The control panels have dozens of lights, gauges, and dials, and each needs a tag to identify the location being monitored. It might look something like this:

![A control panel full of buttons, switches, and lights.](Media/L05-01.png)

The majority of the tags are sold to a customer who has specific size needs, so the tags must be between 2.25 and 2.5 inches wide. The customer simply won't accept tags outside of those limits. 

---

## Tolerance Determined by Manager

A manager for a call center has determined that service calls need to last between 4 and 12 minutes. Calls shorter than 4 minutes are often an indication that the call center person has not had enough time to make a "personal connection" with the caller, and calls longer than 12 minutes are an indication that the call center person is not working efficiently. However, these time limits may be very different from what another manager might put into place in the same situation. 

---

## Tolerance Determined by Downstream Need

On an automobile assembly line, there are all sorts of tolerances required by downstream processes. For example, on the frame of the car, there are engine mounts. If the engine mounts aren't in the right place, it makes no difference how beautiful the finish work on the mounts may be, because the line won't be able to install an engine into the auto frame.

---

# Is Out of Spec Truly Bad?

If something is "out of spec," it is typically not used, which would imply that the material is garbage. This may or may not be the case. Take the call center worker who likes to gab, so that his phone calls are all "out of spec": he may have the highest customer satisfaction rank in the call center. His calls aren't garbage; they are just longer than the manager wants them to be. For the small company making tags for control panels, there might be other customers that want tags that are between 2.5 inches and 2.6 inches, so the tags that are too large for their main customer might be just right for another customer. However, for the auto assembly line, if the engine mounts are in the wrong place, it is likely that the car frame would literally be scrapped and melted down to raw steel, awaiting fabrication into a new auto frame.

In many industries, there are secondary markets where material that is out of spec will be sold at a deep discount and used somewhere else. This happens in the semiconductor industry all the time. Did you ever wonder why a child's toy that costs just a couple of dollars could have multiple semiconductors in it? It is not because the toy manufacturer is taking a huge loss on each toy; it is because the toy manufacturer doesn't need a computer chip that is pristine. The chip might be half dead or more, and not suitable at all for a computer, but it is adequate for the toy, and the toy manufacturer can get it for a nickel, rather than the $250 price tag on the fully functional chip.

---


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 3 - Calculating Process Capability<a class="anchor" id="DS103L3_page_3"></a>

[Back to Top](#DS103L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Calculating Process Capability with the Capability Index

The capability of a process is usually expressed using a single number, called the *capability index*. There are actually a couple of different flavors of capability indices, which you will get to in a few minutes. For now, you will investigate the theoretical approach to capability and what it means in the real world.

The symbol for a capability index is C<sub>p</sub>. It is calculated using this simple formula:

![The formula for capability index. C sub P equals U S L minus L S L divided by six sigma.](Media/L05-02.png)

where sigma is the standard deviation of the process being measured.
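
As a minimal sketch, the formula above can be written as a small Python function. The function name `cp` and the numbers passed to it are hypothetical and chosen only for illustration; they do not come from the lesson.

```python
def cp(usl, lsl, sigma):
    """Capability index: spec width divided by six standard deviations."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if usl <= lsl:
        raise ValueError("usl must be greater than lsl")
    return (usl - lsl) / (6 * sigma)


# Hypothetical spec limits and standard deviation, for illustration only
print(cp(usl=10.0, lsl=4.0, sigma=0.5))  # (10 - 4) / (6 * 0.5) = 2.0
```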

Now recall that a normally distributed variable looks like this:

![A normal distribution with vertical lines showing the mean and three standard deviations below and above the mean.](Media/L05-03.png)

You can expect virtually all of the data (about 99.7%) to fall between x-bar - 3 stdev and x-bar + 3 stdev. In other words, the width of almost all of the data is about 6 standard deviations, or 6 sigma. That is also where the UCL and LCL are located, so in essence, the formula for C<sub>p</sub> becomes: 
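
To see how much of a normal distribution actually lies within three standard deviations of the mean, here is a short sketch using only the Python standard library. The helper name `fraction_within` is an assumption made for this illustration.

```python
import math


def fraction_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


print(round(fraction_within(3), 4))  # 0.9973
```

So a 6 sigma window, centered on the mean, is expected to cover roughly 99.7% of a normally distributed process.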

![C sub P equals U S L minus L S L divided by U C L minus L C L.](Media/L05-04.png)

Can you see that the formula for C<sub>p</sub> is simply a ratio of the spec width to the control width? Spec width is in the numerator (the top part of the fraction) and control width is in the denominator (the bottom part of the fraction).

Ideally, the capability index C<sub>p</sub> can be illustrated using a histogram to represent the process and its control limits, and then using vertical bars to indicate the spec limits. It would look something like this:

![A normal distribution. The L C L and the U C L are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L and the U S L are indicated by vertical bars extending upward, and the space between them is labeled spec width.](Media/L05-05.png)

In this situation, you can see that the spec limits are completely outside of the control limits. This scenario is ideal. Remember, C<sub>p</sub> is a measure of where the process _does_ run compared to where the process _needs_ to run.

However, it is possible that the control width and the spec width only overlap a little bit, or maybe even not at all, as illustrated in these next two pictures:

![A normal distribution. The L C L and the U C L are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L and the U S L are indicated by vertical bars extending upward, and the space between them is labeled spec width. The spec width and the control width partially overlap.](Media/L05-06.png)

![A normal distribution. The L C L and the U C L are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L and the U S L are indicated by vertical bars extending upward, and the space between them is labeled spec width. The spec width and the control width do not overlap at all.](Media/L05-07.png)

In all three of the cases above, the C<sub>p</sub> is the same. In each case, the location of the spec window relative to the location of the control window has changed, but the C<sub>p</sub> makes no reference to where either window is located.

Here are some numbers for these control limits, and then you can do the calculation for C<sub>p</sub>:

![A normal distribution. The L C L is 46 and the U C L is 64, and they are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L is 35 and the U S L is 77, and they are indicated by vertical bars extending upward, and the space between them is labeled spec width.](Media/L05-11.png)

C<sub>p</sub> = (77-35)/(64-46) = 42/18 = 7/3 ~ 2.33

![A normal distribution. The L C L is 46 and the U C L is 64, and they are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L is 15 and the U S L is 57, and they are indicated by vertical bars extending upward, and the space between them is labeled spec width. The spec width and the control width partially overlap.](Media/L05-09.png)

C<sub>p</sub> = (57-15)/(64-46) = 42/18 = 7/3 ~ 2.33

![A normal distribution. The L C L is 46 and the U C L is 64, and they are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L is 3 and the U S L is 45, and they are indicated by vertical bars extending upward, and the space between them is labeled spec width. The spec width and the control width do not overlap at all.](Media/L05-10.png)

C<sub>p</sub> = (45-3)/(64-46) = 42/18 = 7/3 ~ 2.33

The C<sub>p</sub> is the same for all 3 scenarios.
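
The three calculations above can be reproduced with a few lines of Python. This is only a sketch: the function name `cp_from_limits` and the scenario labels are made up for this example, while the limits are the ones shown in the three pictures.

```python
def cp_from_limits(usl, lsl, ucl, lcl):
    """Cp as the ratio of spec width to control width."""
    return (usl - lsl) / (ucl - lcl)


LCL, UCL = 46, 64  # the control limits are the same in all three pictures

# (LSL, USL) for each picture; the labels are informal descriptions
scenarios = {
    "spec window surrounds control window": (35, 77),
    "partial overlap": (15, 57),
    "no overlap": (3, 45),
}

for label, (lsl, usl) in scenarios.items():
    print(f"{label}: Cp = {cp_from_limits(usl, lsl, UCL, LCL):.2f}")
# Every line prints Cp = 2.33
```

Because only the two widths enter the calculation, sliding the spec window left or right leaves the result unchanged.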

---

## Process Capability Guidelines

In general, if the C<sub>p</sub> is greater than one, the process is usually considered to be capable. But that leaves little margin for error. It is more common in a business setting to use the following guidelines:

* C<sub>p</sub> less than 1 means the process is not capable

* C<sub>p</sub> between 1 and 1.33 means the process is marginally capable

* C<sub>p</sub> greater than 1.33 means the process is fully capable
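
These guidelines translate into a simple lookup, sketched below. The function name `classify_cp` is hypothetical, and the handling of the exact boundary values (1 and 1.33 are both treated as "marginally capable") is an assumption, since the guidelines do not say which side the boundaries fall on.

```python
def classify_cp(cp_value):
    """Map a Cp value to the capability guideline it falls under."""
    if cp_value < 1:
        return "not capable"
    if cp_value <= 1.33:  # assumption: boundary values count as marginal
        return "marginally capable"
    return "fully capable"


for value in (0.78, 1.2, 2.33):
    print(value, "->", classify_cp(value))
```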

Can you see how C<sub>p</sub> doesn't really account for much except the ratio of the spec width to the control width? For this reason, C<sub>p</sub> is often called the process potential. Rather than C<sub>p</sub> being a great indicator of what is going on, it is more commonly used as an indicator of what could be going on if the process were centered in the middle of the spec width.

---


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 4 - The Capability Index C<sub>pk</sub><a class="anchor" id="DS103L3_page_4"></a>

[Back to Top](#DS103L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# The Capability Index C<sub>pk</sub>

This leads us to a different capability index called C<sub>pk</sub>, which is a little more complex than the C<sub>p</sub>, but is also more sensitive to whether or not the control width is aligned with the spec width. The C<sub>pk</sub> takes into account where the process mean is.

It is important to note that process capability is designed for processes that are normally distributed. This is true for the C<sub>p</sub>, but especially true for the C<sub>pk</sub>.

Knowing the C<sub>p</sub> of a process tells you whether you are even capable of fitting a process within a spec window, and knowing the C<sub>pk</sub> tells you how you are currently doing. 

---

## Formulas for Upper and Lower Process Capability

To calculate the C<sub>pk</sub>, you have to calculate an upper and a lower C<sub>p</sub>. These are called *C<sub>pu</sub>* and *C<sub>pl</sub>*. Once those two quantities are calculated, the minimum of those two values is called the *C<sub>pk</sub>*.

The C<sub>pu</sub> is calculated like this:

![C sub P U equals U S L minus mu divided by three sigma.](Media/L05-12.png)

The calculation of C<sub>pl</sub> is similar, and looks like this:

![C sub P L equals mu minus L S L divided by three sigma.](Media/L05-13.png)

In both of these formulas, mu is the population mean, and sigma is the population standard deviation. However, it is most common to use *x*-bar as an estimate of mu, and *s* as an estimate of sigma.
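
As a sketch of how these formulas might be applied to real measurements, the code below estimates mu with *x*-bar and sigma with *s* from a sample. The function names, the sample values, and the spec limits are all hypothetical and exist only to make the example runnable.

```python
import statistics


def cpu(usl, mean, sigma):
    """Upper capability: distance from the mean up to the USL, in units of 3 sigma."""
    return (usl - mean) / (3 * sigma)


def cpl(lsl, mean, sigma):
    """Lower capability: distance from the LSL up to the mean, in units of 3 sigma."""
    return (mean - lsl) / (3 * sigma)


def cpk(usl, lsl, mean, sigma):
    """Cpk is the smaller of Cpu and Cpl."""
    return min(cpu(usl, mean, sigma), cpl(lsl, mean, sigma))


# Hypothetical measurements, invented for illustration only
sample = [54.1, 55.3, 56.0, 53.8, 55.9, 54.7, 55.2, 56.4]
x_bar = statistics.mean(sample)  # estimate of mu
s = statistics.stdev(sample)     # estimate of sigma (n - 1 denominator)

print(cpk(usl=61, lsl=47, mean=x_bar, sigma=s))
```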

---


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 5 - Calculating Capability Index (C<sub>pk</sub>)<a class="anchor" id="DS103L3_page_5"></a>

[Back to Top](#DS103L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Calculating Capability Index (C<sub>pk</sub>)

Now that you have both the C<sub>pu</sub> and the C<sub>pl</sub>, you just take whichever of those is smaller, and that is the C<sub>pk</sub>.

Another observation is that C<sub>p</sub> can never be a negative number, because both the numerator and the denominator must be positive numbers. However, C<sub>pk</sub> can be negative if either C<sub>pu</sub> or C<sub>pl</sub> is negative. 

---

## Example 1

You will now revisit the previous examples, but look at the C<sub>pk</sub>. In the picture below, the control window is almost exactly in the center of the spec window. Assume mu is in the middle of the control window, and sigma is 1/6th of the width of the control window.

![A normal distribution. The L C L is 46 and the U C L is 64, and they are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L is 35 and the U S L is 77, and they are indicated by vertical bars extending upward, and the space between them is labeled spec width.](Media/L05-08.png)

![Sigma equals sixty four minus forty six divided by six equals eighteen divided by six equals three. Mu equals sixty four plus forty six divided by two equals one hundred ten divided by two equals fifty five. C sub P L equals fifty five minus thirty five divided by nine equals twenty divided by nine equals two point two two. C sub P U equals seventy seven minus fifty five divided by nine equals twenty two divided by nine equals two point four four. C sub P K equals two point two two.](Media/process1.png)

Since C<sub>pk</sub> is the min of C<sub>pu</sub> and C<sub>pl</sub>, C<sub>pk</sub> = 2.22.

---

## Example 2

In the picture below, the control window and the spec window overlap a bit, but not too much: 

![A normal distribution. The L C L is 46 and the U C L is 64, and they are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L is 15 and the U S L is 57, and they are indicated by vertical bars extending upward, and the space between them is labeled spec width. The spec width and the control width partially overlap.](Media/L05-09.png)

And here is how you would complete the calculations: 

![Mu equals fifty five. C sub P L equals fifty five minus fifteen divided by nine equals forty divided by nine equals four point four four. C sub P U equals fifty seven minus fifty five divided by nine equals two divided by nine equals point two two. C sub P K equals point two two.](Media/process2.png)

Since C<sub>pk</sub> is the min of C<sub>pu</sub> and C<sub>pl</sub>, C<sub>pk</sub> = 0.22.

---

## Example 3

In the picture below, the control window and the spec window don't overlap at all.

![A normal distribution. The L C L is 46 and the U C L is 64, and they are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L is 3 and the U S L is 45, and they are indicated by vertical bars extending upward, and the space between them is labeled spec width. The spec width and the control width do not overlap at all.](Media/L05-10.png)

To calculate C<sub>pk</sub>: 

![Mu equals fifty five. Sigma equals three. C sub P L equals fifty five minus three divided by nine equals fifty two divided by nine equals five point seven seven. C sub P U equals forty five minus fifty five divided by nine equals negative ten divided by nine equals negative one point one one. C sub P K equals negative one point one one.](Media/process3.png)

Since C<sub>pk</sub> is the min of C<sub>pu</sub> and C<sub>pl</sub>, C<sub>pk</sub> = -1.11.
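
The three examples so far can be checked with a short sketch. It uses the assumption stated in Example 1 (mu is the midpoint of the control window and sigma is one sixth of its width); the function names are hypothetical.

```python
def mu_sigma_from_control_limits(lcl, ucl):
    """Assume mu is the midpoint of the control window and the window spans 6 sigma."""
    return (ucl + lcl) / 2, (ucl - lcl) / 6


def cpk(usl, lsl, mu, sigma):
    """Smaller of Cpu and Cpl."""
    cpu = (usl - mu) / (3 * sigma)
    cpl = (mu - lsl) / (3 * sigma)
    return min(cpu, cpl)


mu, sigma = mu_sigma_from_control_limits(lcl=46, ucl=64)  # mu = 55, sigma = 3

examples = [("Example 1", 35, 77), ("Example 2", 15, 57), ("Example 3", 3, 45)]
for label, lsl, usl in examples:
    print(f"{label}: Cpk = {cpk(usl, lsl, mu, sigma):.2f}")
# Example 1: Cpk = 2.22
# Example 2: Cpk = 0.22
# Example 3: Cpk = -1.11
```

The negative value in Example 3 appears because the process mean (55) lies above the upper spec limit (45), which makes the numerator of C<sub>pu</sub> negative.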

---

## Example 4

In the image below, the control width is pretty close to being centered in the spec window: 

![A normal distribution. The L C L is 46 and the U C L is 64, and they are indicated by vertical bars extending downward, and the width between them is labeled control width. The L S L is 47 and the U S L is 61, and they are indicated by vertical bars extending upward, and the space between them is labeled spec width.](Media/L05-14.png)

Here is how you would calculate C<sub>pk</sub>: 

![C sub P U equals sixty one minus fifty five divided by nine equals six divided by nine equals point six seven. C sub P L equals fifty five minus forty seven divided by nine equals eight divided by nine equals point eight nine. C sub P K equals point six seven.](Media/process4.png)

Since C<sub>pk</sub> is the min of C<sub>pu</sub> and C<sub>pl</sub>, C<sub>pk</sub> = 0.67.

However, if you calculate just the process capability, C<sub>p</sub>: 

![C sub P equals sixty one minus forty seven divided by sixty four minus forty six equals fourteen divided by eighteen equals point seven eight.](Media/process5.png)

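You find that C<sub>p</sub> is less than 1, so the process is not capable even though it is nearly centered in the spec window: the spec width is simply narrower than the control width. 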
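
A brief sketch of Example 4 shows how the two indices relate. It again assumes mu is the midpoint of the control window and sigma is one sixth of its width; the variable names are chosen for this illustration only.

```python
lcl, ucl = 46, 64
lsl, usl = 47, 61

mu = (ucl + lcl) / 2     # 55
sigma = (ucl - lcl) / 6  # 3

cp = (usl - lsl) / (6 * sigma)
cpu = (usl - mu) / (3 * sigma)
cpl = (mu - lsl) / (3 * sigma)
cpk = min(cpu, cpl)

print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}")  # Cp = 0.78, Cpk = 0.67

# Cp is the average of Cpu and Cpl, so Cpk can never exceed Cp
print(abs((cpu + cpl) / 2 - cp) < 1e-9)  # True
```

Since C<sub>p</sub> is the average of C<sub>pu</sub> and C<sub>pl</sub>, the smaller of the two can at best equal C<sub>p</sub>, which happens only when the process is exactly centered.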

---

<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 6 - Out of Control and Out of Spec<a class="anchor" id="DS103L3_page_6"></a>

[Back to Top](#DS103L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Out of Control and Out of Spec

So far in this lesson, you have focused on the details of C<sub>p</sub> and C<sub>pk</sub>. Now zoom back out a little bit, and think about the relationship between control limits and spec limits in general.

When a process is out of control, that presents a different environment and requires a different plan of attack than when a process is out of spec. When a process is out of control, it is broken. Something has fallen apart and needs to be fixed. This often requires someone who is familiar with the process to troubleshoot and figure out what is broken.

On the other hand, when a process is out of spec, it simply isn't good enough. Either your process is not centered in the right place, or it needs to be replaced with a better process. That requires engineering.

If someone does not understand that "out of control" and "out of spec" require different solutions, they are likely to apply the wrong solution. This is especially true for a process that is out of spec. If your process is out of spec, but not out of control, then it is not broken, so troubleshooting it means trying to fix a process that is not broken. In previous lessons, you referred to this activity as tampering. When you tamper, you are taking a process that was in control but out of spec and turning it into a process that is now out of control **and** out of spec. You've made things worse.

Here is a matrix diagram illustrating what approach should be taken when your process is out of control or out of spec:

![In spec, in control, your process is fine, do nothing. In spec, out of control, your process is broken and needs to be fixed. Out of spec and in control, your process is not good enough and needs to be re engineered. Out of spec and out of control, you have lots of problems.](Media/L05-17.png)
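
The matrix can also be expressed as a small lookup table, sketched here in Python. The names `RESPONSES` and `recommended_response` are hypothetical; the wording of each response paraphrases the diagram.

```python
# Keys are (in_spec, in_control); values paraphrase the matrix diagram
RESPONSES = {
    (True, True): "Your process is fine. Do nothing.",
    (True, False): "Your process is broken and needs to be fixed.",
    (False, True): "Your process is not good enough and needs to be re-engineered.",
    (False, False): "You have lots of problems.",
}


def recommended_response(in_spec, in_control):
    """Look up the approach for a given spec/control status."""
    return RESPONSES[(bool(in_spec), bool(in_control))]


print(recommended_response(in_spec=False, in_control=True))
```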

---

## Summary

* Process capability indices help us monitor the relationship between the control window and the spec window
* C<sub>p</sub> is often called process potential
* C<sub>pk</sub> is a more sensitive process capability index that accounts for where the process is currently running
* "Out of control" situations and "out of spec" situations require different responses


<hr style="height:10px;border-width:0;color:gray;background-color:gray">

# Page 7 - Key Terms<a class="anchor" id="DS103L3_page_7"></a>

[Back to Top](#DS103L3_toc)

<hr style="height:10px;border-width:0;color:gray;background-color:gray">


# Key Terms

Below is a list and short description of the important keywords learned in this lesson. Please read through the list and review any concepts you do not fully understand. Great work!

<table class="table table-striped">
    <tr>
        <th>Keyword</th>
        <th>Description</th>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Process Capability</td>
        <td>A comparison of the actual variability of a process (control width) with the variability it is allowed (spec width).</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Spec Limit</td>
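        <td>A boundary on the amount of variability that is allowed in a process. The distance between the upper spec limit (USL) and the lower spec limit (LSL) is the spec width. Can be determined by a manager, a customer, or a downstream need.</td>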
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Control Width</td>
        <td>The actual variability in a process.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Capability Index - Cp</td>
        <td>Spec width divided by the control width. A value of less than 1 means not capable, 1 to 1.33 means marginally capable, and greater than 1.33 means fully capable.</td>
    </tr>
    <tr>
        <td style="font-weight: bold;" nowrap>Cpk</td>
        <td>The smaller of Cpu and Cpl. A capability index that accounts for where the process mean is, so it measures how the process is currently doing.</td>
    </tr>
</table>

```c-lms
topic: Exam
```

<div class="panel panel-success">
    <div class="panel-heading">
        <h3 class="panel-title">Additional Info!</h3>
    </div>
    <div class="panel-body">
        <p>You may want to watch <a href="https://vimeo.com/449957069">this recorded live workshop</a>, which goes over the calculations you will encounter on this exam, before beginning the hands-on.</p>
    </div>
</div>