### Import the relevant libraries

In [1]:
import pandas as pd
from datetime import datetime, time

### Load the data

In [2]:
df = pd.read_csv("Assessment Data Set.csv")

In [3]:
df.head()

Unnamed: 0,Time,Machine A,Machine B,Detector α 1,Detector α 2,Detector β 1,Detector β 2,Diverter ψ 1,Diverter ψ 2,Detector δ 1,Detector δ 2
0,hh:mm:ss,RPM,RPM,,,,,,,,
1,01:13:04,6536,8285,0.0,1.0,0.0,1.0,1.0,1.0,0.0,1.0
2,01:13:09,6536,8285,0.0,1.0,0.0,1.0,1.0,1.0,0.0,1.0
3,01:13:14,6536,8285,0.0,1.0,0.0,1.0,1.0,1.0,0.0,1.0
4,01:13:19,6536,8285,0.0,1.0,0.0,1.0,1.0,1.0,0.0,1.0


### Data inspection and cleaning

The first row only holds the units (hh:mm:ss, RPM), so let's remove it and shift the index so that it starts at 0 again.

In [4]:
df = df.iloc[1:]
df.index = df.index - 1
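As an alternative to dropping the units row after loading, the row could be skipped while reading the file. The sketch below uses a small made-up CSV string (not the assessment file) and the `skiprows` parameter of `pd.read_csv`; the column names in `csv_text` are only illustrative.

```python
import io
import pandas as pd

# Hypothetical miniature of the file: a header line, a units line, then data.
csv_text = (
    "Time,Machine A,Detector\n"
    "hh:mm:ss,RPM,\n"
    "01:13:04,6536,0.0\n"
    "01:13:09,6536,1.0\n"
)

# skiprows=[1] skips the second line of the file (0-indexed), i.e. the units row.
demo = pd.read_csv(io.StringIO(csv_text), skiprows=[1])
print(demo)
```

With the units row skipped, the index already starts at 0, and pandas can infer numeric types for the RPM columns instead of treating them as text.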

Check for NaN

In [5]:
df.isnull().values.any()

False

So, there are no missing values. 

Let us use the datetime module to manipulate the times.

In [6]:
times = df.Time.map(lambda t: datetime.strptime(t, '%H:%M:%S').time() )
times

0        01:13:04
1        01:13:09
2        01:13:14
3        01:13:19
4        01:13:24
           ...   
17275    01:12:39
17276    01:12:44
17277    01:12:49
17278    01:12:54
17279    01:12:59
Name: Time, Length: 17280, dtype: object

We see that measurements were taken every 5 seconds over a period of 24 hours (17280 rows × 5 s = 86400 s), starting at 01:13:04.
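The sampling interval can also be checked programmatically rather than read off the printed output. The following is a minimal sketch on made-up time strings (`stamps` is an assumption, standing in for the `Time` column before conversion); the modulo handles the wrap-around at midnight.

```python
import pandas as pd

# Made-up sample of time strings that crosses midnight.
stamps = pd.Series(["23:59:54", "23:59:59", "00:00:04", "00:00:09"])

# Seconds since midnight for each measurement.
seconds = pd.to_timedelta(stamps).dt.total_seconds()

# Step between consecutive measurements; % 86400 maps the midnight jump back to a positive step.
steps = seconds.diff().dropna() % 86400
unique_steps = sorted(steps.unique().tolist())
print(unique_steps)

# 17280 measurements at 5-second spacing cover this many hours.
hours_covered = 17280 * 5 / 3600
print(hours_covered)
```

If every step is the same, `unique_steps` holds a single value.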

In [7]:
df.Time = times
del(times)

Let's change the column names to make them easier to type.

In [8]:
df = df.rename(columns={'Detector α 1': 'alpha1', 'Detector α 2': 'alpha2', 
                   'Detector β 1 ': 'beta1',  'Detector β 2': 'beta2',
                   'Detector δ 1': 'delta1',  'Detector δ 2': 'delta2',
                   'Diverter ψ 1': 'psi1',   'Diverter ψ 2': 'psi2',
                   'Machine A' : 'machineA', 'Machine B' : 'machineB'})

Detectors return a 0 when steel is not passing, and they return a 1 when steel is passing.
Diverters have the inverse rule.

Let us change the rule of the diverters, so as to make the diverters have the same rule as the detectors.

In [9]:
df.psi1 = df.psi1.replace({0.0: 1.0, 1.0: 0.0})
df.psi2 = df.psi2.replace({0.0: 1.0, 1.0: 0.0})
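Since the signals only take the values 0.0 and 1.0, the same inversion can be written as `1.0 - signal`. The sketch below compares both approaches on a made-up series (`raw` is illustrative, not taken from the data set).

```python
import pandas as pd

# Made-up diverter signal using the original rule (1 = no steel, 0 = steel).
raw = pd.Series([1.0, 1.0, 0.0, 1.0])

# Dictionary-based replacement, as used in the notebook.
by_replace = raw.replace({0.0: 1.0, 1.0: 0.0})

# Arithmetic inversion; only valid when the column contains nothing but 0.0 and 1.0.
by_arithmetic = 1.0 - raw

print(by_replace.equals(by_arithmetic))
```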

## Question 1

I will interpret this question as asking about the number of products that reached the end of production line 1.

All the products that reach the end of production line 1 are measured by the detector at delta1.

First let's define the dataframe corresponding to the times between 02:00 and 14:00.

In [10]:
df2 = df.loc[(df.Time > time(hour=2, minute=0, second=0)) & (df.Time < time(hour=14, minute=0, second=0))]
df2.index = df2.index - df2.iloc[0].name  # change the index to make it easier to loop over
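To illustrate how the time-window filter behaves at its boundaries, here is a small sketch on made-up data (`demo` and its `value` column are hypothetical). It uses the same strict comparisons against `datetime.time` objects, and `reset_index(drop=True)` as one possible way to get an index starting at 0.

```python
from datetime import time
import pandas as pd

# Made-up measurements around the window boundaries.
demo = pd.DataFrame({
    "Time": [time(1, 59, 59), time(2, 0, 4), time(13, 59, 59), time(14, 0, 4)],
    "value": [0.0, 1.0, 1.0, 0.0],
})

# Strict inequalities: a row stamped exactly 02:00:00 or 14:00:00 would be excluded.
mask = (demo.Time > time(2, 0, 0)) & (demo.Time < time(14, 0, 0))

# reset_index(drop=True) renumbers the rows from 0.
window = demo.loc[mask].reset_index(drop=True)
print(window)
```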

Let's define a function that counts how many products pass by a detector or diverter.

We will apply this function repeatedly throughout this notebook.

In [11]:
def count(point, dataf):
    """Count the products that pass by a detector or diverter.

    point is the column name of the detector or diverter in question.
    dataf is the dataframe for the time window we are considering;
    its index is assumed to start at 0.
    """
    signal = dataf[point]
    c = 0

    for indx, value in signal.iloc[2:].items():
        # A product has just passed when the signal drops to 0 after two consecutive 1s.
        if value == 0.0 and signal.iloc[indx-1] == 1.0 and signal.iloc[indx-2] == 1.0:
            c += 1
    return c
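The same counting rule (a 0 preceded by two consecutive 1s) can be sketched without an explicit loop by comparing the signal with shifted copies of itself. `count_vectorized` is a hypothetical helper, not part of the original notebook, and `demo` is a made-up signal.

```python
import pandas as pd

def count_vectorized(signal):
    """Count samples equal to 0 whose two preceding samples are both 1."""
    previous = signal.shift(1)          # value one sample earlier
    before_previous = signal.shift(2)   # value two samples earlier
    # The first two rows compare against NaN and therefore never match.
    passed = (signal == 0.0) & (previous == 1.0) & (before_previous == 1.0)
    return int(passed.sum())

# Made-up signal: two runs of at least two 1s followed by a 0, plus one isolated 1.
demo = pd.Series([0.0, 1.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0])
n_products = count_vectorized(demo)
print(n_products)
```

Unlike the loop, this version does not depend on the index starting at 0, because `shift` works by position.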

In [12]:
print(count('delta1',df2)) ## number of products detected by delta1 between 02:00 and 14:00
del(df2)

58


## Question 2

I will interpret this question as asking how many products reached the end of both production lines.

This is the same as asking how many products were detected by delta1 and delta2.

In [13]:
print(count('delta1',df)+count('delta2',df))

370


## Question 3a

The number of products diverted by psi1, the number diverted by psi2, and the total are:

In [14]:
print(count('psi1',df)) # number of products diverted by psi1  
print(count('psi2',df)) # number of products diverted by psi2
print(count('psi1',df) + count('psi2',df)) #total number of products diverted

1
2
3


## Question 3b

### Product Diverted by Psi1

In [15]:
df.loc[df.psi1 == 1.0]

Unnamed: 0,Time,machineA,machineB,alpha1,alpha2,beta1,beta2,psi1,psi2,delta1,delta2
8310,12:45:34,5638,7255,1.0,0.0,0.0,0.0,1.0,0.0,1.0,0.0
8311,12:45:39,5638,7255,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
8312,12:45:44,5638,7255,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
8313,12:45:49,5638,7255,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
8314,12:45:54,5638,7255,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
8315,12:45:59,5638,7255,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
8316,12:46:04,5638,7255,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
8317,12:46:09,5638,7255,1.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0
8318,12:46:14,5638,7255,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0
8319,12:46:19,5638,7255,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.0


It was diverted by psi1 at 12:45:34. Let's find the times at which it was detected by alpha1 and beta1.

In [16]:
pd.options.display.max_rows = 200
print(df[['Time','machineA','alpha1','beta1','psi1']].iloc[8260:8311])

          Time machineA  alpha1  beta1  psi1
8260  12:41:24     5638     0.0    0.0   0.0
8261  12:41:29     5638     0.0    0.0   0.0
8262  12:41:34     5638     1.0    0.0   0.0
8263  12:41:39     5638     1.0    0.0   0.0
8264  12:41:44     5638     1.0    0.0   0.0
8265  12:41:49     5638     1.0    0.0   0.0
8266  12:41:54     5638     1.0    0.0   0.0
8267  12:41:59     5638     1.0    0.0   0.0
8268  12:42:04     5638     1.0    0.0   0.0
8269  12:42:09     5638     1.0    0.0   0.0
8270  12:42:14     5638     1.0    0.0   0.0
8271  12:42:19     5638     1.0    0.0   0.0
8272  12:42:24     5638     1.0    0.0   0.0
8273  12:42:29     5638     1.0    0.0   0.0
8274  12:42:34     5638     1.0    0.0   0.0
8275  12:42:39     5638     1.0    1.0   0.0
8276  12:42:44     5638     1.0    1.0   0.0
8277  12:42:49     5638     1.0    1.0   0.0
8278  12:42:54     5638     1.0    1.0   0.0
8279  12:42:59     5638     1.0    1.0   0.0
8280  12:43:04     5638     1.0    1.0   0.0
8281  12:4

Scrolling through this dataframe, we see that the product entered machine A at 12:41:34 and left it at 12:45:19. Machine A's RPM was 5638 throughout.
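Instead of scrolling, the moments at which a signal switches on or off could be located programmatically. The sketch below uses a hypothetical helper `find_edges` on a made-up signal; applied to a slice of a column such as `alpha1`, the rising edge would correspond to the row where the product is first detected.

```python
import pandas as pd

def find_edges(signal):
    """Return the index labels where a 0/1 signal rises (0 -> 1) and falls (1 -> 0)."""
    previous = signal.shift(1)
    rising = signal[(signal == 1.0) & (previous == 0.0)].index.tolist()
    falling = signal[(signal == 0.0) & (previous == 1.0)].index.tolist()
    return rising, falling

# Made-up detector signal: steel is present on rows 2 to 4.
demo = pd.Series([0.0, 0.0, 1.0, 1.0, 1.0, 0.0, 0.0])
rising, falling = find_edges(demo)
print(rising, falling)
```

Note that a falling edge is reported at the first row that reads 0, so the last row with steel present is the one just before it.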

In [17]:
pd.options.display.max_rows = 200
print(df[['Time','machineB', 'delta2','psi1']].iloc[8309:8327])

          Time machineB  delta2  psi1
8309  12:45:29     7255     0.0   0.0
8310  12:45:34     7255     0.0   1.0
8311  12:45:39     7255     0.0   1.0
8312  12:45:44     7255     0.0   1.0
8313  12:45:49     7255     0.0   1.0
8314  12:45:54     7255     0.0   1.0
8315  12:45:59     7255     0.0   1.0
8316  12:46:04     7255     0.0   1.0
8317  12:46:09     7255     0.0   1.0
8318  12:46:14     7255     0.0   1.0
8319  12:46:19     7255     0.0   1.0
8320  12:46:24     7255     0.0   1.0
8321  12:46:29     7255     0.0   1.0
8322  12:46:34     7255     1.0   1.0
8323  12:46:39     7255     1.0   1.0
8324  12:46:44     7255     1.0   0.0
8325  12:46:49     7255     1.0   0.0
8326  12:46:54     7255     0.0   0.0


The product passed machine B at 12:46:49. Machine B's RPM was 7255.

### Products Diverted by Psi2

There are two products diverted by psi2. Let's analyse the first.

In [18]:
pd.options.display.max_rows = 300
print(df[['Time','machineA','machineB','alpha2','beta2','psi2', 'delta1']].iloc[7944:8237])

          Time machineA machineB  alpha2  beta2  psi2  delta1
7944  12:15:04     5638     7255     0.0    0.0   0.0     1.0
7945  12:15:09     5638     7255     0.0    0.0   0.0     1.0
7946  12:15:14     5638     7255     0.0    0.0   0.0     1.0
7947  12:15:19     5638     7255     0.0    0.0   0.0     1.0
7948  12:15:24     5638     7255     0.0    0.0   0.0     1.0
7949  12:15:29     5638     7255     0.0    0.0   0.0     1.0
7950  12:15:34     5638     7255     0.0    0.0   0.0     1.0
7951  12:15:39     5638     7255     0.0    0.0   0.0     1.0
7952  12:15:44     5638     7255     0.0    0.0   0.0     1.0
7953  12:15:49     5638     7255     0.0    0.0   0.0     1.0
7954  12:15:54     5638     7255     0.0    0.0   0.0     1.0
7955  12:15:59     5638     7255     0.0    0.0   0.0     1.0
7956  12:16:04     5638     7255     0.0    0.0   0.0     1.0
7957  12:16:09     5638     7255     0.0    0.0   0.0     1.0
7958  12:16:14     5638     7255     0.0    0.0   0.0     1.0
7959  12

This product first entered psi2 at 12:34:44. Beforehand, machine A was turned off and beta2 did not detect steel until 12:24:44. I will interpret this as production line 2 having been stopped for about 10 minutes.

The product passed machine A at 12:24:44 and machine A's RPM was 5638.

The product passed machine B at 12:39:14 and machine B's RPM was 7255.

Let's analyse the second product diverted by psi2.

In [22]:
pd.options.display.max_rows = 300
print(df[['Time','machineA','machineB','alpha2','beta2','psi2', 'delta1']].iloc[12850:12943])

           Time machineA machineB  alpha2  beta2  psi2  delta1
12850  19:03:54     5470     7211     0.0    0.0   0.0     0.0
12851  19:03:59     5470     7211     0.0    0.0   0.0     0.0
12852  19:04:04     5470     7211     0.0    0.0   0.0     0.0
12853  19:04:09     5470     7211     0.0    0.0   0.0     0.0
12854  19:04:14     5470     7211     0.0    0.0   0.0     0.0
12855  19:04:19     5470     7211     0.0    0.0   0.0     0.0
12856  19:04:24     5470     7211     0.0    0.0   0.0     0.0
12857  19:04:29     5470     7211     0.0    1.0   0.0     0.0
12858  19:04:34     5470     7211     0.0    1.0   0.0     0.0
12859  19:04:39     5470     7211     0.0    1.0   0.0     0.0
12860  19:04:44     5470     7211     0.0    1.0   0.0     0.0
12861  19:04:49     5470     7211     0.0    1.0   0.0     0.0
12862  19:04:54     5470     7211     1.0    1.0   0.0     0.0
12863  19:04:59     5470     7211     1.0    1.0   0.0     0.0
12864  19:05:04     5470     7211     1.0    1.0   0.0 

The product passed Machine A at 19:07:24. Machine A's RPM was 5470.

The product passed Machine B at 19:11:29. Machine B's RPM was 7211.

## Question 4

In [30]:
df[['Time','psi1','psi2','alpha2','beta2','delta2']].iloc[11444:11544]

Unnamed: 0,Time,psi1,psi2,alpha2,beta2,delta2
11444,17:06:44,0.0,0.0,1.0,1.0,1.0
11445,17:06:49,0.0,0.0,1.0,1.0,1.0
11446,17:06:54,0.0,0.0,1.0,1.0,1.0
11447,17:06:59,0.0,0.0,1.0,1.0,1.0
11448,17:07:04,0.0,0.0,1.0,1.0,1.0
11449,17:07:09,0.0,0.0,1.0,1.0,1.0
11450,17:07:14,0.0,0.0,1.0,1.0,1.0
11451,17:07:19,0.0,0.0,1.0,1.0,1.0
11452,17:07:24,0.0,0.0,1.0,1.0,1.0
11453,17:07:29,0.0,0.0,1.0,1.0,1.0


The product is at beta2 from 17:11:54 until 17:14:39.

The product is at alpha2 from 17:09:14 until 17:11:59.
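The time the product spends at each detector follows from the two pairs of timestamps above. A minimal sketch of the arithmetic, where `seconds_between` is a hypothetical helper that assumes both times fall on the same day:

```python
from datetime import datetime

def seconds_between(start, end):
    """Seconds elapsed between two hh:mm:ss strings on the same day."""
    fmt = "%H:%M:%S"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()

alpha2_dwell = seconds_between("17:09:14", "17:11:59")
beta2_dwell = seconds_between("17:11:54", "17:14:39")
print(alpha2_dwell, beta2_dwell)
```

Both intervals come out at 165 seconds, i.e. 2 minutes and 45 seconds at each detector.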
            