![manufacturing gears](manufacturing.jpg)

The manufacturing process for any product is like putting together a puzzle. Products are pieced together step by step, and keeping a close eye on the process is important.

For this project, you're supporting a team that wants to improve how they monitor and control a manufacturing process. The goal is to implement a more methodical approach known as statistical process control (SPC). SPC is an established strategy that uses data to determine whether the process works well. The process is adjusted only if measurements fall outside an acceptable range.

This acceptable range is defined by an upper control limit (UCL) and a lower control limit (LCL). Both are computed from the average and standard deviation of the part heights in a window of 5 consecutive parts:

$ucl = avg\_height + 3 \times \frac{stddev\_height}{\sqrt{5}}$

$lcl = avg\_height - 3 \times \frac{stddev\_height}{\sqrt{5}}$

The UCL defines the highest acceptable height for the parts, while the LCL defines the lowest acceptable height for the parts. Ideally, parts should fall between the two limits.
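As a worked check of the two formulas, the sketch below applies them to the first five heights recorded for operator Op-1 in the sample rows of this notebook. It assumes the sample standard deviation (dividing by n - 1), which is what PostgreSQL's `STDDEV` computes; the function name `control_limits` is illustrative only and not part of the project.

```python
import math
import statistics

def control_limits(heights):
    """Return (ucl, lcl) for one window of part heights."""
    n = len(heights)  # 5 in this project, matching the sqrt(5) in the formulas
    avg_height = statistics.mean(heights)
    stddev_height = statistics.stdev(heights)  # sample standard deviation (n - 1)
    margin = 3 * stddev_height / math.sqrt(n)
    return avg_height + margin, avg_height - margin

# Heights of Op-1 items 1 to 5 from the sample data (one window of 5 parts).
window = [19.69, 19.63, 21.51, 18.6, 19.46]
ucl, lcl = control_limits(window)
print(f"ucl = {ucl:.6f}, lcl = {lcl:.6f}")
```

The printed limits should agree with the `ucl` and `lcl` values in the first row of the query result (roughly 21.20 and 18.35), so the fifth part's height of 19.46 falls inside the range.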

Using SQL window functions and nested queries, you'll analyze historical manufacturing data to define this acceptable range and identify any points in the process that fall outside it and therefore require adjustment. The aim is a smooth-running manufacturing process that consistently makes high-quality products.
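To make the windowing idea concrete before turning to SQL, here is a minimal Python sketch of the same logic for a single operator: slide a window over the current part and the 4 preceding parts, skip windows with fewer than 5 parts, and flag the current height if it lies outside the limits. The names `rolling_alerts` and `WINDOW_SIZE` are assumptions made for this illustration; in SQL, the per-operator split would be handled by `PARTITION BY operator`.

```python
import math
import statistics

WINDOW_SIZE = 5  # the current part plus the 4 preceding parts

def rolling_alerts(heights, window_size=WINDOW_SIZE):
    """Yield (row_number, height, ucl, lcl, alert) for each complete window."""
    for i in range(window_size - 1, len(heights)):
        window = heights[i - window_size + 1 : i + 1]
        avg_height = statistics.mean(window)
        margin = 3 * statistics.stdev(window) / math.sqrt(window_size)
        ucl, lcl = avg_height + margin, avg_height - margin
        height = heights[i]
        # Alert when the current part's height is outside the control limits.
        yield i + 1, height, ucl, lcl, height > ucl or height < lcl

# Heights of Op-1 items 1 to 10, taken from the sample rows in this notebook.
op1_heights = [19.69, 19.63, 21.51, 18.6, 19.46, 20.36, 20.22, 21.03, 19.78, 20.71]
results = list(rolling_alerts(op1_heights))
for row_number, height, ucl, lcl, alert in results:
    print(row_number, height, round(ucl, 3), round(lcl, 3), alert)
```

The first four parts produce no output because their windows are incomplete, which mirrors filtering on a window count of 5 in a query.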

## The data
The data is available in the `manufacturing_parts` table, which has the following fields:
- `item_no`: the item number
- `length`: the length of the item made
- `width`: the width of the item made
- `height`: the height of the item made
- `operator`: the operating machine

In [9]:
-- Final solution
-- Step 1: Number each operator's parts in item_no order.
-- row_number identifies the part that closes each 5-part window.
WITH ranked AS (
	SELECT
		operator,
		item_no,
		height,
		ROW_NUMBER() OVER (PARTITION BY operator ORDER BY item_no) AS row_number
	FROM public.manufacturing_parts
),
-- Step 2: Compute the rolling average, standard deviation, and row count
-- over the current part and the 4 preceding parts for the same operator.
-- The count is only needed to drop incomplete windows; in every row that is kept it equals 5,
-- so the limits below use SQRT(5) directly, as in the UCL and LCL formulas.
stats AS (
	SELECT
		*,
		AVG(height) OVER w AS avg_height,
		STDDEV(height) OVER w AS stddev_height,
		COUNT(height) OVER w AS n_in_window_5
	FROM ranked
	WINDOW w AS (PARTITION BY operator ORDER BY item_no ROWS BETWEEN 4 PRECEDING AND CURRENT ROW)
)
-- Step 3: Derive the control limits and flag heights that fall outside them.
SELECT
	operator,
	row_number,
	height,
	avg_height,
	stddev_height,
	avg_height + 3 * stddev_height / SQRT(5) AS ucl,
	avg_height - 3 * stddev_height / SQRT(5) AS lcl,
	CASE
		WHEN height > avg_height + 3 * stddev_height / SQRT(5) THEN TRUE
		WHEN height < avg_height - 3 * stddev_height / SQRT(5) THEN TRUE
		ELSE FALSE
	END AS alert
FROM stats
WHERE n_in_window_5 = 5  -- keep only complete 5-part windows
ORDER BY operator, row_number;



| | operator | row_number | height | avg_height | stddev_height | ucl | lcl | alert |
|---|---|---|---|---|---|---|---|---|
| 0 | Op-1 | 5 | 19.46 | 19.778 | 1.062812 | 21.203912 | 18.352088 | False |
| 1 | Op-1 | 6 | 20.36 | 19.912 | 1.090812 | 21.375477 | 18.448523 | False |
| 2 | Op-1 | 7 | 20.22 | 20.030 | 1.084574 | 21.485108 | 18.574892 | False |
| 3 | Op-1 | 8 | 21.03 | 19.934 | 0.931225 | 21.183369 | 18.684631 | False |
| 4 | Op-1 | 9 | 19.78 | 20.170 | 0.598832 | 20.973418 | 19.366582 | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 415 | Op-9 | 31 | 19.01 | 18.904 | 0.203052 | 19.176422 | 18.631578 | False |
| 416 | Op-9 | 32 | 18.57 | 18.864 | 0.250260 | 19.199759 | 18.528241 | False |
| 417 | Op-9 | 33 | 20.91 | 19.266 | 0.952276 | 20.543613 | 17.988387 | True |
| 418 | Op-9 | 34 | 21.24 | 19.678 | 1.291112 | 21.410208 | 17.945792 | False |


In [21]:
-- DATA EXPLORATION
SELECT * 
FROM manufacturing_parts
LIMIT 10;

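| | item_no | length | width | height | operator |
|---|---|---|---|---|---|
| 0 | 1 | 102.67 | 49.53 | 19.69 | Op-1 |
| 1 | 2 | 102.5 | 51.42 | 19.63 | Op-1 |
| 2 | 3 | 95.37 | 52.25 | 21.51 | Op-1 |
| 3 | 4 | 94.77 | 49.24 | 18.6 | Op-1 |
| 4 | 5 | 104.26 | 47.9 | 19.46 | Op-1 |
| 5 | 6 | 105.18 | 49.39 | 20.36 | Op-1 |
| 6 | 7 | 97.35 | 48.05 | 20.22 | Op-1 |
| 7 | 8 | 99.35 | 44.59 | 21.03 | Op-1 |
| 8 | 9 | 90.62 | 47.29 | 19.78 | Op-1 |
| 9 | 10 | 97.22 | 52.14 | 20.71 | Op-1 |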


In [23]:
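-- Averages per operator and item. In this data each (operator, item_no) pair
-- appears to identify a single part, so every average equals that part's raw measurement.
-- The three aggregates are unaliased and all come back named avg; the output below
-- repeats the height value in each of them. Distinct aliases would avoid the name clash.
SELECT 
	operator,
	item_no,
	AVG(length),
	AVG(width),
	AVG(height)
FROM manufacturing_parts
GROUP BY operator, item_no
ORDER BY operator, item_no;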
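| | operator | item_no | avg | avg.1 | avg.2 |
|---|---|---|---|---|---|
| 0 | Op-1 | 1 | 19.69 | 19.69 | 19.69 |
| 1 | Op-1 | 2 | 19.63 | 19.63 | 19.63 |
| 2 | Op-1 | 3 | 21.51 | 21.51 | 21.51 |
| 3 | Op-1 | 4 | 18.60 | 18.60 | 18.60 |
| 4 | Op-1 | 5 | 19.46 | 19.46 | 19.46 |
| ... | ... | ... | ... | ... | ... |
| 495 | Op-9 | 236 | 19.01 | 19.01 | 19.01 |
| 496 | Op-9 | 237 | 18.57 | 18.57 | 18.57 |
| 497 | Op-9 | 238 | 20.91 | 20.91 | 20.91 |
| 498 | Op-9 | 239 | 21.24 | 21.24 | 21.24 |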


In [30]:
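-- Data exploration 2
-- Number of parts per operator. GROUP BY already returns one row per operator.
SELECT operator,
	COUNT(operator)
FROM public.manufacturing_parts
GROUP BY operator;

-- Number of rows per item_no. Only this last statement's result is displayed below.
SELECT item_no,
	COUNT(item_no)
FROM public.manufacturing_parts
GROUP BY item_no;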

| | item_no | count |
|---|---|---|
| 0 | 459 | 1 |
| 1 | 460 | 1 |
| 2 | 424 | 1 |
| 3 | 112 | 1 |
| 4 | 1 | 1 |
| ... | ... | ... |
| 495 | 433 | 1 |
| 496 | 481 | 1 |
| 497 | 224 | 1 |
| 498 | 157 | 1 |


In [20]:
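-- My test code: an earlier draft of the solution that computes the window
-- statistics in a subquery in the FROM clause instead of a second CTE.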
WITH ranked AS 
(
SELECT 
	operator,
	item_no,
	height,
	ROW_NUMBER() OVER( PARTITION BY operator ORDER BY item_no ) AS row_number
FROM public.manufacturing_parts
	
)
	
SELECT 
	operator,
	row_number,
	height,
	avg_height,
	stddev_height,
	avg_height + (3*(stddev_height)/SQRT(n_in_window_5)) AS ucl,
	avg_height - (3*(stddev_height)/SQRT(n_in_window_5)) AS lcl,
	CASE
		WHEN height > avg_height + (3*(stddev_height)/SQRT(n_in_window_5)) THEN TRUE
		WHEN height < avg_height - (3*(stddev_height)/SQRT(n_in_window_5)) THEN TRUE
		-- This is the part that confused me and that I initially got wrong.
		ELSE FALSE
	END AS alert
	
FROM 
	(SELECT *,
	AVG(height) OVER (PARTITION BY operator ORDER BY item_no ROWS BETWEEN 4 PRECEDING AND  CURRENT ROW) AS avg_height,
	STDDEV(height) OVER  (PARTITION BY operator ORDER BY item_no ROWS BETWEEN 4 PRECEDING AND  CURRENT ROW) AS stddev_height,
	COUNT(height) OVER (PARTITION BY operator ORDER BY item_no ROWS BETWEEN 4 PRECEDING AND  CURRENT ROW) AS n_in_window_5
	FROM ranked
	
	) stats

WHERE stats.n_in_window_5 = 5



| | operator | row_number | height | avg_height | stddev_height | ucl | lcl | alert |
|---|---|---|---|---|---|---|---|---|
| 0 | Op-1 | 5 | 19.46 | 19.778 | 1.062812 | 21.203912 | 18.352088 | False |
| 1 | Op-1 | 6 | 20.36 | 19.912 | 1.090812 | 21.375477 | 18.448523 | False |
| 2 | Op-1 | 7 | 20.22 | 20.030 | 1.084574 | 21.485108 | 18.574892 | False |
| 3 | Op-1 | 8 | 21.03 | 19.934 | 0.931225 | 21.183369 | 18.684631 | False |
| 4 | Op-1 | 9 | 19.78 | 20.170 | 0.598832 | 20.973418 | 19.366582 | False |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 415 | Op-9 | 31 | 19.01 | 18.904 | 0.203052 | 19.176422 | 18.631578 | False |
| 416 | Op-9 | 32 | 18.57 | 18.864 | 0.250260 | 19.199759 | 18.528241 | False |
| 417 | Op-9 | 33 | 20.91 | 19.266 | 0.952276 | 20.543613 | 17.988387 | True |
| 418 | Op-9 | 34 | 21.24 | 19.678 | 1.291112 | 21.410208 | 17.945792 | False |
