![manufacturing gears](manufacturing.jpg)

A manufacturing process is like putting together a puzzle. Products are pieced together step by step, and keeping a close eye on each step is important.

For this project, you're supporting a team that wants to improve how they monitor and control a manufacturing process. The goal is to implement a more methodical approach known as statistical process control (SPC). SPC is an established strategy that uses data to determine whether a process is working well. The process is adjusted only when measurements fall outside an acceptable range.

This acceptable range is defined by an upper control limit (UCL) and a lower control limit (LCL), the formulas for which are:

$ucl = avg\_height + 3 * \frac{stddev\_height}{\sqrt{5}}$

$lcl = avg\_height - 3 * \frac{stddev\_height}{\sqrt{5}}$

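The UCL is the highest acceptable height for a part, and the LCL is the lowest. Ideally, every part falls between the two limits. The $\sqrt{5}$ term corresponds to the five-part rolling window (the current part and the four before it) over which the average and standard deviation are computed.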
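
As a worked example of the two formulas, here is a sketch using the values reported for the first Op-1 window in the first query output of this notebook ($avg\_height = 19.778$, $stddev\_height = 1.062812$):

$3 * \frac{1.062812}{\sqrt{5}} \approx 1.425912$

$ucl \approx 19.778 + 1.425912 = 21.203912$

$lcl \approx 19.778 - 1.425912 = 18.352088$

The height recorded for that row, 19.46, lies between these two limits.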

Using SQL window functions and nested queries, you'll analyze historical manufacturing data to define this acceptable range and identify any points in the process that fall outside it and therefore require adjustment. This helps keep the manufacturing process running smoothly and consistently producing high-quality products.
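
To make the rolling window concrete, here is a minimal sketch on toy data. The `sample_parts` rows, the `Op-X` operator, and the `rows_in_window` column are hypothetical and only for illustration, and the syntax assumes a PostgreSQL-compatible database.

```sql
-- Toy data: hypothetical heights for one operator (not taken from manufacturing_parts)
WITH sample_parts (item_no, operator, height) AS (
    VALUES
        (1, 'Op-X', 20.0),
        (2, 'Op-X', 20.5),
        (3, 'Op-X', 19.5),
        (4, 'Op-X', 20.0),
        (5, 'Op-X', 21.0),
        (6, 'Op-X', 19.0)
)
SELECT
    item_no,
    height,
    ROW_NUMBER() OVER w AS row_number,     -- position of the row within its operator
    COUNT(*) OVER w AS rows_in_window,     -- how many rows the frame actually covers
    AVG(height) OVER w AS avg_height       -- rolling average over the frame
FROM sample_parts
WINDOW w AS (
    PARTITION BY operator
    ORDER BY item_no
    ROWS BETWEEN 4 PRECEDING AND CURRENT ROW  -- current row plus the 4 before it
)
ORDER BY item_no;
```

In this sketch `rows_in_window` grows from 1 to 5 over the first five items and stays at 5 afterwards, so only rows with `row_number >= 5` are based on a complete five-part window.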

## The data
The data is available in the `manufacturing_parts` table, which has the following fields:
- `item_no`: the item number
- `length`: the length of the item made
- `width`: the width of the item made
- `height`: the height of the item made
- `operator`: the operating machine

In [88]:
SELECT 
    b.operator, 
    b.row_number, 
    b.height, 
    b.avg_height, 
    b.stddev_height, 
    b.ucl, 
    b.lcl,
    -- Flag heights that fall outside the control limits
    CASE 
        WHEN b.height NOT BETWEEN b.lcl AND b.ucl THEN TRUE
        ELSE FALSE
    END AS alert
FROM ( -- add the control limits to each rolling window
    SELECT 
        a.*,
        a.avg_height + 3 * (a.stddev_height / SQRT(5)) AS ucl, 
        a.avg_height - 3 * (a.stddev_height / SQRT(5)) AS lcl
    FROM ( -- rolling statistics per operator over the current row and the 4 before it
        SELECT 
            ROW_NUMBER() OVER my_window AS row_number,
            AVG(height) OVER my_window AS avg_height, 
            STDDEV(height) OVER my_window AS stddev_height,
            height, 
            item_no, 
            operator
        FROM 
            manufacturing_parts
        WINDOW my_window AS (
            PARTITION BY operator
            ORDER BY item_no
            ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
        )
    ) AS a
    WHERE a.row_number >= 5  -- keep only complete 5-row windows
) AS b
LIMIT 65;


Unnamed: 0,operator,row_number,height,avg_height,stddev_height,ucl,lcl,alert
0,Op-1,5,19.46,19.778,1.062812,21.203912,18.352088,False
1,Op-1,6,20.36,19.912,1.090812,21.375477,18.448523,False
2,Op-1,7,20.22,20.030,1.084574,21.485108,18.574892,False
3,Op-1,8,21.03,19.934,0.931225,21.183369,18.684631,False
4,Op-1,9,19.78,20.170,0.598832,20.973418,19.366582,False
...,...,...,...,...,...,...,...,...
60,Op-12,5,19.79,20.008,1.219106,21.643603,18.372397,False
61,Op-12,6,18.81,19.868,1.326092,21.647139,18.088861,False
62,Op-12,7,19.03,19.292,0.690594,20.218529,18.365471,False
63,Op-12,8,20.67,19.380,0.849706,20.520000,18.240000,True


In [89]:
WITH calculated_stats AS (
    SELECT 
        operator,
        item_no,
        height,
        ROW_NUMBER() OVER my_window AS row_number,
        AVG(height) OVER my_window AS avg_height,
        STDDEV(height) OVER my_window AS stddev_height
    FROM manufacturing_parts
    WINDOW my_window AS (
        PARTITION BY operator
        ORDER BY item_no
        ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
    )
)
, control_limits AS (
    SELECT 
        operator,
        row_number,
        height,
        avg_height,
        stddev_height,
        avg_height + 3 * (stddev_height / SQRT(5)) AS ucl,
        avg_height - 3 * (stddev_height / SQRT(5)) AS lcl
    FROM calculated_stats
    WHERE row_number >= 5  -- Exclude incomplete windows
)
SELECT 
    operator,
    row_number,
    height,
    avg_height,
    stddev_height,
    ucl,
    lcl,
    -- Flag heights that fall outside the control limits
    CASE 
        WHEN height NOT BETWEEN lcl AND ucl THEN TRUE
        ELSE FALSE
    END AS alert
FROM control_limits
LIMIT 65;


Unnamed: 0,operator,row_number,height,avg_height,stddev_height,ucl,lcl,alert
0,Op-1,5,19.46,19.778,1.062812,21.203912,18.352088,False
1,Op-1,6,20.36,19.912,1.090812,21.375477,18.448523,False
2,Op-1,7,20.22,20.030,1.084574,21.485108,18.574892,False
3,Op-1,8,21.03,19.934,0.931225,21.183369,18.684631,False
4,Op-1,9,19.78,20.170,0.598832,20.973418,19.366582,False
...,...,...,...,...,...,...,...,...
60,Op-12,5,19.79,20.008,1.219106,21.643603,18.372397,False
61,Op-12,6,18.81,19.868,1.326092,21.647139,18.088861,False
62,Op-12,7,19.03,19.292,0.690594,20.218529,18.365471,False
63,Op-12,8,20.67,19.380,0.849706,20.520000,18.240000,True


In [90]:
-- Count the rows whose height lies strictly inside the control limits
SELECT COUNT(*)
FROM (
    SELECT 
        b.*,
        -- TRUE when the height is in control, i.e. between lcl and ucl
        CASE 
            WHEN b.height < b.ucl AND b.height > b.lcl THEN TRUE
            ELSE FALSE
        END AS in_control
    FROM (
        -- Add the control limits to each rolling window
        SELECT 
            a.*,
            a.avg_height + 3 * a.stddev_height / SQRT(5) AS ucl, 
            a.avg_height - 3 * a.stddev_height / SQRT(5) AS lcl
        FROM (
            -- Rolling statistics per operator over the current row and the 4 before it
            SELECT 
                ROW_NUMBER() OVER my_window AS row_number,
                AVG(height) OVER my_window AS avg_height, 
                STDDEV(height) OVER my_window AS stddev_height,
                height, 
                item_no, 
                operator
            FROM 
                manufacturing_parts
            WINDOW 
                my_window AS (
                    PARTITION BY operator
                    ORDER BY item_no
                    ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
                )
        ) AS a
        WHERE a.row_number >= 5  -- keep only complete 5-row windows
    ) AS b
) AS c
WHERE in_control = TRUE;


Unnamed: 0,count
0,363
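
A per-operator breakdown can show where out-of-range heights concentrate. The following is a sketch rather than part of the original analysis: the CTE name `rolling`, the window name `w`, and the output columns `windows_checked` and `out_of_range` are assumptions, and the `FILTER` clause assumes a PostgreSQL-compatible database.

```sql
WITH rolling AS (
    -- Rolling statistics per operator over the current row and the 4 before it
    SELECT
        operator,
        item_no,
        height,
        ROW_NUMBER() OVER w AS row_number,
        AVG(height) OVER w AS avg_height,
        STDDEV(height) OVER w AS stddev_height
    FROM manufacturing_parts
    WINDOW w AS (
        PARTITION BY operator
        ORDER BY item_no
        ROWS BETWEEN 4 PRECEDING AND CURRENT ROW
    )
)
SELECT
    operator,
    COUNT(*) AS windows_checked,
    -- Count only the rows whose height falls outside [lcl, ucl]
    COUNT(*) FILTER (
        WHERE height NOT BETWEEN avg_height - 3 * stddev_height / SQRT(5)
                             AND avg_height + 3 * stddev_height / SQRT(5)
    ) AS out_of_range
FROM rolling
WHERE row_number >= 5  -- keep only complete 5-row windows
GROUP BY operator
ORDER BY out_of_range DESC;
```

Operators at the top of this result would be the first candidates for a closer look.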


In [91]:
WITH height_stats AS (
    SELECT 
        item_no,
        operator,
        height,
        ROW_NUMBER() OVER (PARTITION BY operator ORDER BY item_no) AS row_number,
        AVG(height) OVER (PARTITION BY operator ORDER BY item_no ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS avg_height,
        STDDEV(height) OVER (PARTITION BY operator ORDER BY item_no ROWS BETWEEN 4 PRECEDING AND CURRENT ROW) AS stddev_height
    FROM 
        manufacturing_parts
)
SELECT 
    operator,
    row_number,
    height,
    avg_height,
    stddev_height,
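    -- Caution: these two columns use 3 * stddev_height without the / SQRT(5) term,
    -- so they are wider than the UCL and LCL given by the formulas above
    avg_height + (3 * stddev_height) AS ucl,
    avg_height - (3 * stddev_height) AS lcl,
    -- Caution: a height cannot be above the UCL and below the LCL at the same time,
    -- so this AND condition is never TRUE; an out-of-range check needs OR instead
    CASE 
        WHEN height > (avg_height + (3 * stddev_height/SQRT(5) )) AND height < (avg_height - (3 * stddev_height/SQRT(5))) 
        THEN TRUE 
        ELSE FALSE 
    END AS alert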
FROM 
    height_stats
WHERE 
    row_number >= 5
ORDER BY 
    item_no
LIMIT 65;

Unnamed: 0,operator,row_number,height,avg_height,stddev_height,ucl,lcl,alert
0,Op-1,5,19.46,19.778,1.062812,22.966437,16.589563,False
1,Op-1,6,20.36,19.912,1.090812,23.184435,16.639565,False
2,Op-1,7,20.22,20.030,1.084574,23.283721,16.776279,False
3,Op-1,8,21.03,19.934,0.931225,22.727675,17.140325,False
4,Op-1,9,19.78,20.170,0.598832,21.966497,18.373503,False
...,...,...,...,...,...,...,...,...
60,Op-3,22,20.79,20.404,0.469819,21.813457,18.994543,False
61,Op-3,23,23.31,21.042,1.342747,25.070242,17.013758,False
62,Op-3,24,21.17,21.202,1.289252,25.069755,17.334245,False
63,Op-3,25,20.54,21.350,1.119352,24.708057,17.991943,False
