# Control charts&mdash;individuals and moving range (XmR)

# Document

<table align="left">
    <tr>
        <th class="text-align:left">Title</th>
        <td class="text-align:left">Control charts---individuals and moving range</td>
    </tr>
    <tr>
        <th class="text-align:left">Last modified</th>
        <td class="text-align:left">2019-10-24</td>
    </tr>
    <tr>
        <th class="text-align:left">Author</th>
        <td class="text-align:left">Gilles Pilon &lt;gillespilon13@gmail.com&gt;</td>
    </tr>
    <tr>
        <th class="text-align:left">Status</th>
        <td class="text-align:left">Active</td>
    </tr>
    <tr>
        <th class="text-align:left">Type</th>
        <td class="text-align:left">Jupyter notebook</td>
    </tr>
    <tr>
        <th class="text-align:left">Created</th>
        <td class="text-align:left">2017-08-26</td>
    </tr>
    <tr>
        <th class="text-align:left">File name</th>
        <td class="text-align:left">control_charts_xmr.ipynb</td>
    </tr>
    <tr>
        <th class="text-align:left">Other files required</th>
        <td class="text-align:left">xmr.csv</td>
    </tr>
</table>

# In brevi

Shewhart control charts, also called process behaviour charts, are used to determine if the variation of a process is stable and predictable, that is, in a state of statistical control. In-control variation arises from chance or common causes. No changes or adjustments to the process are needed. The charts can be used to predict future performance. Out-of-control variation arises from special or assignable causes. These charts help identify the special causes in order to minimize or eliminate their effect.

# Data

Download the [data file](https://drive.google.com/open?id=0BzrdQfHR2I5DRld4MndVT2R0dEk). It consists of a "Date" column and an "X" column of floats or integers. Dates are entered using [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601) date format (yyyy-mm-dd).
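
As a minimal sketch of the layout described above, the following builds a small in-memory file with a "Date" column and an "X" column and parses the ISO 8601 dates with pandas. The sample rows are invented for illustration and are not taken from the actual data file.

```python
import io

import pandas as pd

# Hypothetical sample rows; the real data file is not reproduced here.
sample_csv = io.StringIO(
    "Date,X\n"
    "2019-10-01,25.0\n"
    "2019-10-02,24.2\n"
    "2019-10-03,26.1\n"
)
# parse_dates converts the ISO 8601 strings to datetime values.
sample_data = pd.read_csv(sample_csv, parse_dates=['Date'])
print(sample_data.dtypes)
```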

# Methodology

The charts presented here are the individuals and moving range control charts, also called XmR or ImR. The data are collected using rational samples. The individual values (sample size is one) are plotted in time order. A central line (average) and control limits above and below the central line are plotted.


# Control chart formulae

## Individuals chart (X)

$$
    \begin{align}
        UCL_X, LCL_X & = \overline{X} \pm 3 \times \text{Sigma(X)} \\
                     & = \overline{X} \pm 3 \times \frac{\overline{R}}{d_2}
    \end{align}
$$

The constant $d_2$ can be found in tables of control chart constants. It is a rescaling constant that converts an average range to a standard deviation. The value of $d_2$ depends on the subgroup size $n$. It is common to use a moving range subgroup size of 2.
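
As a worked illustration, the sketch below applies the formula to a short series. It assumes a moving range subgroup size of 2, for which tables give $d_2 = 1.128$. The data values and the function name `x_chart_limits` are invented for the example and are not part of `datasense`.

```python
import pandas as pd

D2_N2 = 1.128  # d2 for subgroup size n = 2, from tables of constants


def x_chart_limits(values: pd.Series) -> tuple[float, float, float]:
    """Return (LCL, average, UCL) for an individuals chart."""
    # Moving ranges of size 2: absolute difference of successive values.
    moving_ranges = values.diff().abs().dropna()
    average = values.mean()
    # Sigma(X) is the average moving range rescaled by d2.
    sigma_x = moving_ranges.mean() / D2_N2
    return average - 3 * sigma_x, average, average + 3 * sigma_x


values = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0])  # invented data
lcl, average, ucl = x_chart_limits(values)
print(f'LCL = {lcl:.3f}, average = {average:.3f}, UCL = {ucl:.3f}')
```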

## Moving range chart (mR)

$$
    \begin{align}
        UCL_R, LCL_R & = \overline{R} \pm 3 \times \text{Sigma(R)} \\
                     & = \overline{R} \pm 3 \times d_3 \times \text{Sigma(X)} \\
                     & = \overline{R} \pm 3 \times d_3 \times \frac{\overline{R}}{d_2}
    \end{align}
$$

The constant $d_3$ can be found in tables of control chart constants. It is a rescaling constant that converts a standard deviation of individual values to a standard deviation of range values. The value of $d_3$ is a function of subgroup size.
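
The same calculation can be sketched for the moving range chart, assuming a subgroup size of 2 with $d_2 = 1.128$ and $d_3 = 0.8525$ taken from tables. For this subgroup size the formula yields a negative lower limit. Because a range cannot be negative, the sketch follows the usual convention of reporting zero; whether `datasense` does the same is not confirmed here. The function name `mr_chart_limits` and the data are invented.

```python
import pandas as pd

D2_N2 = 1.128   # d2 for subgroup size n = 2
D3_N2 = 0.8525  # d3 for subgroup size n = 2


def mr_chart_limits(values: pd.Series) -> tuple[float, float, float]:
    """Return (LCL, average moving range, UCL) for a moving range chart."""
    moving_ranges = values.diff().abs().dropna()
    r_bar = moving_ranges.mean()
    # Sigma(R) = d3 * Sigma(X) = d3 * R-bar / d2
    sigma_r = D3_N2 * r_bar / D2_N2
    # A range cannot be negative, so the lower limit is floored at zero.
    lcl_r = max(0.0, r_bar - 3 * sigma_r)
    return lcl_r, r_bar, r_bar + 3 * sigma_r


values = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0])  # invented data
lcl_r, r_bar, ucl_r = mr_chart_limits(values)
print(f'LCL = {lcl_r:.3f}, R-bar = {r_bar:.3f}, UCL = {ucl_r:.3f}')
```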

In [None]:
import pandas as pd
import matplotlib.pyplot as plt
from IPython.display import display_html
from datasense import X, mR
%matplotlib inline
%config InlineBackend.figure_format = 'svg'

In [None]:
def get_input():
    csvfile = 'xmr.csv'
    index_column = 'Sample'
    subgroup_size = 2
    chart_data = pd.read_csv(csvfile, index_col=index_column)
    return subgroup_size, chart_data

In [None]:
def control_chart_x(x):
    x_chart_title = 'X Control Chart'
    x_chart_subtitle = 'Subtitle'
    x_chart_ylabel = 'Response (units)'
    x_chart_xlabel = 'Sample'
    print('\nX chart')
    print('d2', x._d2, sep=' = ')
    print('Upper control limit', x.ucl, sep=' = ')
    print('Average', x.mean, sep=' = ')
    print('Lower control limit', x.lcl, sep=' = ')
    print('Sigma(X)', x.sigma, sep=' = ')
    # Print the lines from three Sigma(X) below the average to
    # three Sigma(X) above the average.
    for i in range(-3, 4):
        print(f'{i} Sigma', x.sigmas[i], sep=' = ')
    ax = x.ax
    ax.set_title(x_chart_title + '\n' + x_chart_subtitle)
    ax.set_ylabel(x_chart_ylabel)
    ax.set_xlabel(x_chart_xlabel)
    ax.figure.savefig('x.svg', format='svg')
    plt.close()

In [None]:
def control_chart_mr(mr):
    mr_chart_title = 'mR Control Chart'
    mr_chart_subtitle = 'Subtitle'
    mr_chart_ylabel = 'Response (units)'
    mr_chart_xlabel = 'Sample'
    print('\nmR chart')
    print('d2', mr._d2, sep=' = ')
    print('d3', mr._d3, sep=' = ')
    print('Upper control limit ', mr.ucl, sep=' = ')
    print('Average moving range', mr.mean, sep=' = ')
    print('Lower control limit ', mr.lcl, sep=' = ')
    print('Sigma(mR)', mr.sigma, sep=' = ')
    for i in range(-3, 4):
        print(f'{i} Sigma', mr.sigmas[i], sep=' = ')
    ax = mr.ax
    ax.set_title(mr_chart_title + '\n' + mr_chart_subtitle)
    ax.set_ylabel(mr_chart_ylabel)
    ax.set_xlabel(mr_chart_xlabel)
    ax.figure.savefig('mr.svg', format='svg')
    plt.close()

In [None]:
def rule_one(chart_data, x):
    display_html('<h1>Rule one</h1>', raw=True)
    # Values greater than the upper control limit, ten per table.
    rule_one_above = chart_data[(chart_data['X'] > x.ucl)]
    for i in range(0, rule_one_above.shape[0], 10):
        display_html('Points above', raw=True)
        display_html(rule_one_above.iloc[i:i+10].T)
    # Values less than the lower control limit, ten per table.
    rule_one_below = chart_data[(chart_data['X'] < x.lcl)]
    for i in range(0, rule_one_below.shape[0], 10):
        display_html('Points below', raw=True)
        display_html(rule_one_below.iloc[i:i+10].T)

In [None]:
if __name__ == '__main__':
    subgroup_size, chart_data = get_input()
    x = X(chart_data, subgroup_size)
    mr = mR(chart_data, subgroup_size)
    control_chart_x(x)
    control_chart_mr(mr)
    rule_one(chart_data, x)

# Interpretation

## Moving range control chart
The moving range chart measures the within-subgroup variation. If the process is in statistical control (all rules met), the estimation of dispersion should be useful. This chart should be evaluated first because $\overline{R}$ is used in the control limits of the individuals chart.
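
One check that could be applied to the moving range chart is detection rule one, applied to the moving ranges themselves. The sketch below is an assumption about how that might look, not code from `datasense`. It assumes a subgroup size of 2 ($d_2 = 1.128$, $d_3 = 0.8525$) and uses invented data.

```python
import pandas as pd

D2_N2 = 1.128   # d2 for subgroup size n = 2
D3_N2 = 0.8525  # d3 for subgroup size n = 2

# Invented data: a stable series followed by one large jump.
values = pd.Series(
    [10.0, 11.0, 10.0, 11.0, 10.0, 11.0, 10.0, 11.0, 10.0, 11.0, 10.0, 25.0]
)
# The first moving range is undefined (NaN) and is ignored by mean().
moving_ranges = values.diff().abs()
r_bar = moving_ranges.mean()
ucl_r = r_bar + 3 * D3_N2 * r_bar / D2_N2
# Rule one: any moving range greater than the upper control limit.
out_of_control = moving_ranges[moving_ranges > ucl_r]
print(out_of_control)
```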

## Individuals control chart
The individuals control chart measures the *location* of the process. Use the Shewhart detection rules in the following order.

### Detection rule one
A process is out-of-control if one value is greater than the upper control limit or one value is less than the lower control limit. If there are no out-of-control values, proceed to the next rule. If there are out-of-control values, fix the root causes.

In [None]:
# Find values greater than the upper control limit.
rule_one_above = chart_data[(chart_data['X'] > x.ucl)]
for i in range(0, rule_one_above.shape[0], 10):
    display_html(rule_one_above.iloc[i:i+10].T)

In [None]:
# Find values less than the lower control limit.
rule_one_below = chart_data[chart_data['X'] < x.lcl]
for i in range(0, rule_one_below.shape[0], 10):
    display_html(rule_one_below.iloc[i:i+10].T)

### Detection rule five
The process is out-of-control if two out-of-three consecutive values are greater than two Sigma(X) above the average or two out-of-three consecutive values are less than two Sigma(X) below the average. If there are no out-of-control values, proceed to the next rule. If there are out-of-control values, fix the root causes.

In [None]:
# Create a list with X above 2 sigma.
above_two_sigma_x_list = []
for value in chart_data['X']:
    if value >= x.sigmas[2]:
        above_two_sigma_x_list.append(1)
    else:
        above_two_sigma_x_list.append(0)
# Create a column from the list.
chart_data['above_two_sigma_x'] = above_two_sigma_x_list
# Display values where 2 of 3 consecutive X > 2 sigma.
chart_data['above_two_sigma_x_rule_5'] = chart_data['above_two_sigma_x'].rolling(3) \
                                                                        .sum()
rule_five_above = chart_data.loc[(chart_data['above_two_sigma_x_rule_5'] >= 2)][['X']]
for i in range(0, rule_five_above.shape[0], 10):
    display_html(rule_five_above.iloc[i:i+10].T)

In [None]:
# Create a list with X below 2 sigma.
below_two_sigma_x_list = []
for value in chart_data['X']:
    if value <= x.sigmas[-2]:
        below_two_sigma_x_list.append(1)
    else:
        below_two_sigma_x_list.append(0)
# Create a column from the list.
chart_data['below_two_sigma_x'] = below_two_sigma_x_list
# Display values where 2 of 3 consecutive X < 2 sigma.
chart_data['below_two_sigma_x_rule_5'] = chart_data['below_two_sigma_x'].rolling(3) \
                                                                        .sum()
rule_five_below = chart_data.loc[(chart_data['below_two_sigma_x_rule_5'] >= 2)][['X']]
for i in range(0, rule_five_below.shape[0], 10):
    display_html(rule_five_below.iloc[i:i+10].T)

In [None]:
chart_data = chart_data.drop(['above_two_sigma_x', 
                              'above_two_sigma_x_rule_5', 
                              'below_two_sigma_x',
                              'below_two_sigma_x_rule_5'
                             ], axis=1)

### Detection rule six
A process is out-of-control if four out-of-five consecutive values are greater than one Sigma(X) above the average or four out-of-five consecutive values are less than one Sigma(X) below the average.  If there are no out-of-control values, proceed to the next rule. If there are out-of-control values, fix the root causes.

In [None]:
# Create a list with X above 1 sigma.
above_one_sigma_x_list = []
for value in chart_data['X']:
    if value >= x.sigmas[1]:
        above_one_sigma_x_list.append(1)
    else:
        above_one_sigma_x_list.append(0)
# Create a column from the list.
chart_data['above_one_sigma_x'] = above_one_sigma_x_list
# Display values where 4 of 5 consecutive X > 1 sigma.
chart_data['above_one_sigma_x_rule_6'] = chart_data['above_one_sigma_x'].rolling(5) \
                                                                        .sum()
rule_six_above = chart_data.loc[(chart_data['above_one_sigma_x_rule_6'] >= 4)][['X']]
for i in range(0, rule_six_above.shape[0], 10):
    display_html(rule_six_above.iloc[i:i+10].T)

In [None]:
# Create a list with X below 1 sigma.
below_one_sigma_x_list = []
for value in chart_data['X']:
    if value <= x.sigmas[-1]:
        below_one_sigma_x_list.append(1)
    else:
        below_one_sigma_x_list.append(0)
# Create a column from the list.
chart_data['below_one_sigma_x'] = below_one_sigma_x_list
# Display values where 4 of 5 consecutive X < 1 sigma.
chart_data['below_one_sigma_x_rule_6'] = chart_data['below_one_sigma_x'].rolling(5) \
                                                                        .sum()
rule_six_below = chart_data.loc[(chart_data['below_one_sigma_x_rule_6'] >= 4)][['X']]
for i in range(0, rule_six_below.shape[0], 10):
    display_html(rule_six_below.iloc[i:i+10].T)

In [None]:
chart_data = chart_data.drop(['above_one_sigma_x', 
                              'above_one_sigma_x_rule_6', 
                              'below_one_sigma_x',
                              'below_one_sigma_x_rule_6'
                             ], axis=1)
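
Rules five and six share one pattern: at least k of n consecutive values lie beyond a threshold. The helper below is a hypothetical generalization that sketches the pattern with a rolling sum. The name `k_of_n_beyond`, its parameters, the threshold, and the data are all invented and are not part of `datasense`.

```python
import pandas as pd


def k_of_n_beyond(
    values: pd.Series, threshold: float, k: int, n: int, above: bool = True
) -> pd.Series:
    """Flag rows that end a window of n values with at least k beyond."""
    beyond = values >= threshold if above else values <= threshold
    # The rolling sum counts flagged values in each window of n rows.
    counts = beyond.astype(int).rolling(n).sum()
    # Incomplete windows give NaN, which compares as False.
    return counts >= k


values = pd.Series([0.1, 2.5, 0.3, 2.2, 0.0, -0.4])  # invented data
# Rule five, upper side, assuming the two Sigma(X) line sits at 2.0.
flags = k_of_n_beyond(values, threshold=2.0, k=2, n=3)
print(values[flags])
```

Each flagged row is the last value of a window that meets the rule, which matches how the rolling sums are used in the cells above.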

### Detection rule two
A process is out-of-control if eight or more consecutive values are above the average or eight or more consecutive values are below the average. If there are no out-of-control values, proceed to the next rule. If there are out-of-control values, fix the root causes.

In [None]:
# Create a list with X above the average for eight or more consecutive values.
above_average_x_list = []
for value in chart_data['X']:
    if value > x.mean:
        above_average_x_list.append(1)
    else:
        above_average_x_list.append(0)
# Create a column from the list.
chart_data['above_average_x'] = above_average_x_list
# Display values where 8 consecutive X > average.
chart_data['above_average_x_rule_2'] = chart_data['above_average_x'].rolling(8) \
                                                                    .sum()
rule_two_above = chart_data.loc[(chart_data['above_average_x_rule_2'] >= 8)][['X']]
for i in range(0, rule_two_above.shape[0], 10):
    display_html(rule_two_above.iloc[i:i+10].T)

In [None]:
# Create a list with X below the average for eight or more consecutive values.
below_average_x_list = []
for value in chart_data['X']:
    if value < x.mean:
        below_average_x_list.append(1)
    else:
        below_average_x_list.append(0)
# Create a column from the list.
chart_data['below_average_x'] = below_average_x_list
# Display values where 8 consecutive X < average.
chart_data['below_average_x_rule_2'] = chart_data['below_average_x'].rolling(8) \
                                                                    .sum()
rule_two_below = chart_data.loc[(chart_data['below_average_x_rule_2'] >= 8)][['X']]
for i in range(0, rule_two_below.shape[0], 10):
    display_html(rule_two_below.iloc[i:i+10].T)

In [None]:
chart_data = chart_data.drop(['above_average_x', 
                              'above_average_x_rule_2', 
                              'below_average_x',
                              'below_average_x_rule_2'
                             ], axis=1)

### Detection rule three
Variation is unpredictable when six consecutive values are increasing or decreasing.

In [None]:
# Difference between each value and the previous value.
chart_data['rule_3_list_1'] = chart_data['X'].diff()
# Flag increases with 1, everything else with 0.
rule_3_list_2 = []
for value in chart_data['rule_3_list_1']:
    if value > 0:
        rule_3_list_2.append(1)
    else:
        rule_3_list_2.append(0)
chart_data['rule_3_list_2'] = rule_3_list_2
# Six consecutive increasing values give five consecutive increases.
rule_3_increasing = chart_data['rule_3_list_2'].rolling(5).sum()
rule_three_above = chart_data.loc[rule_3_increasing >= 5][['X']]
for i in range(0, rule_three_above.shape[0], 10):
    display_html(rule_three_above.iloc[i:i+10].T)

In [None]:
# Flag decreases with 1, everything else with 0.
rule_3_list_3 = []
for value in chart_data['rule_3_list_1']:
    if value < 0:
        rule_3_list_3.append(1)
    else:
        rule_3_list_3.append(0)
chart_data['rule_3_list_3'] = rule_3_list_3
# Six consecutive decreasing values give five consecutive decreases.
rule_3_decreasing = chart_data['rule_3_list_3'].rolling(5).sum()
rule_three_below = chart_data.loc[rule_3_decreasing >= 5][['X']]
for i in range(0, rule_three_below.shape[0], 10):
    display_html(rule_three_below.iloc[i:i+10].T)

In [None]:
chart_data = chart_data.drop(['rule_3_list_1', 
                      'rule_3_list_2', 
                      'rule_3_list_3'
                     ], axis=1)

### Detection rule four
Variation is unpredictable when fourteen consecutive values alternate up and down.

In [None]:
# Flag X alternating up and down for 14 or more consecutive values.
# Difference between each value and the previous value.
chart_data['rule_4_list_1'] = chart_data['X'].diff()
# Flag increases with 1, everything else with 0.
rule_4_list_2 = []
for value in chart_data['rule_4_list_1']:
    if value > 0:
        rule_4_list_2.append(1)
    else:
        rule_4_list_2.append(0)
chart_data['rule_4_list_2'] = rule_4_list_2
# Flag decreases with 1, everything else with 0.
rule_4_list_3 = []
for value in chart_data['rule_4_list_1']:
    if value < 0:
        rule_4_list_3.append(1)
    else:
        rule_4_list_3.append(0)
chart_data['rule_4_list_3'] = rule_4_list_3
# Flag a reversal with 1: an increase that follows a decrease or
# a decrease that follows an increase.
chart_data['rule_4_list_4'] = (
    chart_data['rule_4_list_2'] * chart_data['rule_4_list_3'].shift(1)
    + chart_data['rule_4_list_3'] * chart_data['rule_4_list_2'].shift(1)
).fillna(0).astype(int)

In [None]:
chart_data.head(10)

In [None]:
# Display values where 14 or more consecutive values alternate
# up and down. Fourteen alternating values contain 13 differences
# and therefore 12 consecutive reversals.
chart_data['rule_4_list_5'] = chart_data['rule_4_list_4'].rolling(12) \
                                                         .sum()
rule_four = chart_data.loc[(chart_data['rule_4_list_5'] >= 12)][['X']]
for i in range(0, rule_four.shape[0], 10):
    display_html(rule_four.iloc[i:i+10].T)

In [None]:
chart_data = chart_data.drop(['rule_4_list_1',
                              'rule_4_list_2',
                              'rule_4_list_3',
                              'rule_4_list_4',
                              'rule_4_list_5'
                             ], axis=1)

### Detection rule seven
Variation is unpredictable when fifteen consecutive values are between $\pm$ one Sigma(X).

In [None]:
# Create a list with X within one Sigma(X) of the average.
within_one_sigma_x_list = []
for value in chart_data['X']:
    if x.sigmas[-1] <= value <= x.sigmas[1]:
        within_one_sigma_x_list.append(1)
    else:
        within_one_sigma_x_list.append(0)
# Create a column from the list.
chart_data['within_one_sigma_x'] = within_one_sigma_x_list
# Display values where 15 consecutive values are within one Sigma(X)
# of the average.
rule_seven_run = chart_data['within_one_sigma_x'].rolling(15).sum()
rule_seven = chart_data.loc[rule_seven_run >= 15][['X']]
for i in range(0, rule_seven.shape[0], 10):
    display_html(rule_seven.iloc[i:i+10].T)

In [None]:
chart_data = chart_data.drop(['within_one_sigma_x'], axis=1)

### Detection rule eight
Variation is unpredictable when eight consecutive values are on both sides of the average with none between $\pm$ one Sigma(X).

In [None]:
# Create a list with X beyond one Sigma(X).
beyond_one_sigma_x_list = []
for value in chart_data['X']:
    if value >= x.sigmas[1] or value <= x.sigmas[-1]:
        beyond_one_sigma_x_list.append(1)
    else:
        beyond_one_sigma_x_list.append(0)
# Create a column from the list.
chart_data['beyond_one_sigma_x'] = beyond_one_sigma_x_list
# Display values where 8 consecutive values are beyond one Sigma(X)
# of the average. This does not check that the run includes values
# on both sides of the average.
rule_eight_run = chart_data['beyond_one_sigma_x'].rolling(8).sum()
rule_eight = chart_data.loc[rule_eight_run >= 8][['X']]
for i in range(0, rule_eight.shape[0], 10):
    display_html(rule_eight.iloc[i:i+10].T)

In [None]:
chart_data = chart_data.drop(['beyond_one_sigma_x'], axis=1)

# Development
- Add rules to the moving range control chart
- Add code to rule four
- Find (NIST?) data sets for each rule and test

# References

Wheeler, Donald J. 1995. *Advanced Topics in Statistical Process Control*. Knoxville, TN: SPC Press, Inc.