### Lab Title: Exploring Correlation and Covariance in Manufacturing Industry Data



#### Problem Statement:
Understanding how variables in the manufacturing industry relate to one another is crucial for optimizing processes and improving efficiency. Covariance and correlation help identify linear dependencies between key factors such as productivity, idle time, overtime, and incentives.



#### Objective:
- Compute the correlation and covariance between different manufacturing metrics.
- Interpret these measures to understand dependencies and trends.
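For reference, the two measures computed in this lab can be written out as follows. This is a sketch under the assumption that the pandas defaults are used, that is, the sample covariance with an $n-1$ denominator and the Pearson correlation coefficient; the symbols $x_i$, $y_i$, $\bar{x}$, $\bar{y}$, $s_X$, and $s_Y$ are notation introduced here, not taken from the dataset.

$$
\operatorname{cov}(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y}),
\qquad
r_{XY} = \frac{\operatorname{cov}(X, Y)}{s_X\, s_Y}
$$

Here $s_X = \sqrt{\operatorname{cov}(X, X)}$ and $s_Y = \sqrt{\operatorname{cov}(Y, Y)}$ are the sample standard deviations. Covariance carries the units of both variables, while $r_{XY}$ is unitless and always lies between $-1$ and $1$.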



#### Dataset:
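The dataset is the [Productivity Prediction of Garment Employees](https://www.kaggle.com/datasets/ishadss/productivity-prediction-of-garment-employees) dataset from Kaggle, loaded below from the file `garments_worker_productivity.csv`.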




#### Requirements:
- Python installed (version 3.x recommended).
- Pandas and NumPy libraries installed (`pip install pandas numpy`).
- Jupyter Notebook (optional but recommended for running the lab).



#### Implementation:


In [1]:

# Import necessary libraries
import pandas as pd
import numpy as np


In [2]:

# Load the dataset (the CSV file is assumed to be in the working directory)
df = pd.read_csv("garments_worker_productivity.csv")


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1197 entries, 0 to 1196
Data columns (total 15 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   date                   1197 non-null   object 
 1   quarter                1197 non-null   object 
 2   department             1197 non-null   object 
 3   day                    1197 non-null   object 
 4   team                   1197 non-null   int64  
 5   targeted_productivity  1197 non-null   float64
 6   smv                    1197 non-null   float64
 7   wip                    691 non-null    float64
 8   over_time              1197 non-null   int64  
 9   incentive              1197 non-null   int64  
 10  idle_time              1197 non-null   float64
 11  idle_men               1197 non-null   int64  
 12  no_of_style_change     1197 non-null   int64  
 13  no_of_workers          1197 non-null   float64
 14  actual_productivity    1197 non-null   float64
dtypes: float64(6), int64(5), object(4)
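The summary above shows that `wip` has only 691 non-null values out of 1197, so it matters how missing values enter the later calculations. The sketch below uses a small made-up frame (the names `toy`, `x`, `y`, and `z` are hypothetical, not from the dataset) to illustrate that `DataFrame.cov()` excludes missing values pair by pair, which can differ from dropping incomplete rows first.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: only column "z" has missing values
toy = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0, 5.0],
    "y": [2.0, 4.0, 6.0, 8.0, 10.0],
    "z": [1.0, np.nan, 2.0, np.nan, 4.0],
})

# Count missing values per column
missing_counts = toy.isna().sum()

# Pairwise exclusion: each pair of columns uses every row where both are present
pairwise_cov = toy.cov()

# Listwise exclusion: drop any row with a missing value before computing
complete_cov = toy.dropna().cov()

print(missing_counts)
print(pairwise_cov.loc["x", "x"], complete_cov.loc["x", "x"])
```

In this toy case the variance of `x` is computed from all five rows under pairwise exclusion but from only three rows after `dropna()`, so the two results differ. For the lab data, this suggests that entries involving `wip` are likely based on fewer rows than the other entries.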

In [5]:
df.columns

Index(['date', 'quarter', 'department', 'day', 'team', 'targeted_productivity',
       'smv', 'wip', 'over_time', 'incentive', 'idle_time', 'idle_men',
       'no_of_style_change', 'no_of_workers', 'actual_productivity'],
      dtype='object')

In [7]:
data1=df[['team', 'targeted_productivity',
       'smv', 'wip', 'over_time', 'incentive', 'idle_time', 'idle_men',
       'no_of_style_change', 'no_of_workers', 'actual_productivity']]
data1

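|  | team | targeted_productivity | smv | wip | over_time | incentive | idle_time | idle_men | no_of_style_change | no_of_workers | actual_productivity |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8 | 0.80 | 26.16 | 1108.0 | 7080 | 98 | 0.0 | 0 | 0 | 59.0 | 0.940725 |
| 1 | 1 | 0.75 | 3.94 | NaN | 960 | 0 | 0.0 | 0 | 0 | 8.0 | 0.886500 |
| 2 | 11 | 0.80 | 11.41 | 968.0 | 3660 | 50 | 0.0 | 0 | 0 | 30.5 | 0.800570 |
| 3 | 12 | 0.80 | 11.41 | 968.0 | 3660 | 50 | 0.0 | 0 | 0 | 30.5 | 0.800570 |
| 4 | 6 | 0.80 | 25.90 | 1170.0 | 1920 | 50 | 0.0 | 0 | 0 | 56.0 | 0.800382 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1192 | 10 | 0.75 | 2.90 | NaN | 960 | 0 | 0.0 | 0 | 0 | 8.0 | 0.628333 |
| 1193 | 8 | 0.70 | 3.90 | NaN | 960 | 0 | 0.0 | 0 | 0 | 8.0 | 0.625625 |
| 1194 | 7 | 0.65 | 3.90 | NaN | 960 | 0 | 0.0 | 0 | 0 | 8.0 | 0.625625 |
| 1195 | 9 | 0.75 | 2.90 | NaN | 1800 | 0 | 0.0 | 0 | 0 | 15.0 | 0.505889 |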


In [4]:

# Select the numerical columns for analysis
# (date, quarter, department, and day are non-numeric and are left out)
selected_columns = ['team', 'targeted_productivity', 'smv', 'wip', 'over_time',
                    'incentive', 'idle_time', 'idle_men', 'no_of_style_change',
                    'no_of_workers', 'actual_productivity']
data = df[selected_columns]
data

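|  | team | targeted_productivity | smv | wip | over_time | incentive | idle_time | idle_men | no_of_style_change | no_of_workers | actual_productivity |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 8 | 0.80 | 26.16 | 1108.0 | 7080 | 98 | 0.0 | 0 | 0 | 59.0 | 0.940725 |
| 1 | 1 | 0.75 | 3.94 | NaN | 960 | 0 | 0.0 | 0 | 0 | 8.0 | 0.886500 |
| 2 | 11 | 0.80 | 11.41 | 968.0 | 3660 | 50 | 0.0 | 0 | 0 | 30.5 | 0.800570 |
| 3 | 12 | 0.80 | 11.41 | 968.0 | 3660 | 50 | 0.0 | 0 | 0 | 30.5 | 0.800570 |
| 4 | 6 | 0.80 | 25.90 | 1170.0 | 1920 | 50 | 0.0 | 0 | 0 | 56.0 | 0.800382 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1192 | 10 | 0.75 | 2.90 | NaN | 960 | 0 | 0.0 | 0 | 0 | 8.0 | 0.628333 |
| 1193 | 8 | 0.70 | 3.90 | NaN | 960 | 0 | 0.0 | 0 | 0 | 8.0 | 0.625625 |
| 1194 | 7 | 0.65 | 3.90 | NaN | 960 | 0 | 0.0 | 0 | 0 | 8.0 | 0.625625 |
| 1195 | 9 | 0.75 | 2.90 | NaN | 1800 | 0 | 0.0 | 0 | 0 | 15.0 | 0.505889 |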
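The numerical columns above were listed by hand. As an alternative sketch, `select_dtypes` can pick them by dtype instead. The frame below is a hypothetical three-row stand-in (the name `toy` and its values are illustrative, not the full dataset).

```python
import pandas as pd

# Hypothetical stand-in with one text column and two numeric columns
toy = pd.DataFrame({
    "department": ["sewing", "finishing", "sewing"],
    "team": [8, 1, 11],
    "smv": [26.16, 3.94, 11.41],
})

# Keep only integer and float columns; object (string) columns are dropped
numeric = toy.select_dtypes(include="number")
numeric_columns = numeric.columns.tolist()

print(numeric_columns)
```

Selecting by dtype avoids typos in long column lists, but it also keeps identifier-like columns such as `team`, whose numeric codes are labels rather than quantities, so a manual list can still be the better choice.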


In [8]:

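# Compute the pairwise sample covariance matrix (pandas default: n - 1 denominator)
cov_matrix = data.cov()

# Compute the pairwise Pearson correlation matrix (pandas default method)
corr_matrix = data.corr()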
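The two matrices are closely related: dividing each covariance by the product of the two standard deviations gives the Pearson correlation. The sketch below checks this on randomly generated data with no missing values (the names `toy`, `a`, and `b` are hypothetical, and the seed is arbitrary).

```python
import numpy as np
import pandas as pd

# Hypothetical synthetic data: "b" depends linearly on "a" plus noise
rng = np.random.default_rng(0)
toy = pd.DataFrame({"a": rng.normal(size=50)})
toy["b"] = 2.0 * toy["a"] + rng.normal(size=50)

cov = toy.cov()

# Standard deviations are the square roots of the diagonal (the variances)
std = np.sqrt(np.diag(cov))

# Divide every covariance by the product of the matching standard deviations
corr_from_cov = cov / np.outer(std, std)

print(corr_from_cov)
print(toy.corr())
```

With complete data the rescaled covariance matrix matches `toy.corr()` up to floating-point error. When columns contain missing values, the pairwise variances used inside `corr()` can differ from the diagonal of `cov()`, so the identity should be checked pair by pair in that case.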


In [9]:

# Print results
print("Covariance Matrix:\n", cov_matrix)
print("Correlation Matrix:\n", corr_matrix)

Covariance Matrix:
                               team  targeted_productivity           smv  \
team                     11.999042               0.010266     -4.170167   
targeted_productivity     0.010266               0.009583     -0.074439   
smv                      -4.170167              -0.074439    119.754046   
wip                    -212.712691              11.630780   -485.038033   
over_time             -1122.167410             -29.030617  24732.539468   
incentive                -4.258009               0.513815     57.195571   
idle_time                 0.167131              -0.069899      7.908797   
idle_men                  0.305443              -0.017222      3.788411   
no_of_style_change       -0.016590              -0.008766      1.476655   
no_of_workers            -5.775617              -0.183154    221.580535   
actual_productivity      -0.089909               0.007201     -0.233124   

                                wip     over_time     incentive    idle_time  \

In [10]:
cov_matrix

|  | team | targeted_productivity | smv | wip | over_time | incentive | idle_time | idle_men | no_of_style_change | no_of_workers | actual_productivity |
|---|---|---|---|---|---|---|---|---|---|---|---|
| team | 11.999042 | 0.010266 | -4.170167 | -212.7127 | -1122.167 | -4.258009 | 0.167131 | 0.305443 | -0.01659 | -5.775617 | -0.089909 |
| targeted_productivity | 0.010266 | 0.009583 | -0.074439 | 11.63078 | -29.03062 | 0.513815 | -0.069899 | -0.017222 | -0.008766 | -0.183154 | 0.007201 |
| smv | -4.170167 | -0.074439 | 119.754046 | -485.038 | 24732.54 | 57.195571 | 7.908797 | 3.788411 | 1.476655 | 221.580535 | -0.233124 |
| wip | -212.712691 | 11.63078 | -485.038033 | 3376241.0 | 117385.2 | 8478.812114 | -807.632285 | -383.459374 | -71.420114 | 525.754311 | 37.299365 |
| over_time | -1122.16741 | -29.030617 | 24732.539468 | 117385.2 | 11214620.0 | -2571.212375 | 1321.044885 | -196.101555 | 85.666507 | 54574.928067 | -31.674054 |
| incentive | -4.258009 | 0.513815 | 57.195571 | 8478.812 | -2571.212 | 25658.479053 | -24.478679 | -11.069442 | -1.823491 | 175.018659 | 2.139222 |
| idle_time | 0.167131 | -0.069899 | 7.908797 | -807.6323 | 1321.045 | -24.478679 | 161.537911 | 23.231413 | -0.063067 | 16.377286 | -0.179303 |
| idle_men | 0.305443 | -0.017222 | 3.788411 | -383.4594 | -196.1016 | -11.069442 | 23.231413 | 10.686278 | 0.186901 | 7.760404 | -0.103661 |
| no_of_style_change | -0.01659 | -0.008766 | 1.476655 | -71.42011 | 85.66651 | -1.823491 | -0.063067 | 0.186901 | 0.183054 | 3.113065 | -0.015481 |
| no_of_workers | -5.775617 | -0.183154 | 221.580535 | 525.7543 | 54574.93 | 175.018659 | 16.377286 | 7.760404 | 3.113065 | 492.737294 | -0.224611 |
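Covariance values are hard to compare directly because they carry the units of both variables: the variance of `over_time` is about 11214620, while the variance of `targeted_productivity` is about 0.009583. As a worked check, the sketch below takes three entries from the covariance matrix above and converts them into a correlation for `smv` and `no_of_workers` (the variable names in the code are illustrative).

```python
import math

# Entries copied from the covariance matrix above
cov_smv_workers = 221.580535   # cov(smv, no_of_workers)
var_smv = 119.754046           # cov(smv, smv)
var_workers = 492.737294       # cov(no_of_workers, no_of_workers)

# Pearson correlation = covariance / (std of smv * std of no_of_workers)
r = cov_smv_workers / math.sqrt(var_smv * var_workers)

print(round(r, 4))
```

The result is approximately 0.9122, which agrees with the `smv` and `no_of_workers` entry (0.912176) in the correlation matrix.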


In [11]:
corr_matrix

|  | team | targeted_productivity | smv | wip | over_time | incentive | idle_time | idle_men | no_of_style_change | no_of_workers | actual_productivity |
|---|---|---|---|---|---|---|---|---|---|---|---|
| team | 1.0 | 0.030274 | -0.110011 | -0.033474 | -0.096737 | -0.007674 | 0.003796 | 0.026974 | -0.011194 | -0.075113 | -0.148753 |
| targeted_productivity | 0.030274 | 1.0 | -0.069489 | 0.062054 | -0.088557 | 0.032768 | -0.056181 | -0.053818 | -0.209294 | -0.084288 | 0.421594 |
| smv | -0.110011 | -0.069489 | 1.0 | -0.037837 | 0.674887 | 0.032629 | 0.056863 | 0.105901 | 0.315388 | 0.912176 | -0.122089 |
| wip | -0.033474 | 0.062054 | -0.037837 | 1.0 | 0.022302 | 0.16721 | -0.026299 | -0.048718 | -0.072357 | 0.030383 | 0.131147 |
| over_time | -0.096737 | -0.088557 | 0.674887 | 0.022302 | 1.0 | -0.004793 | 0.031038 | -0.017913 | 0.05979 | 0.734164 | -0.054206 |
| incentive | -0.007674 | 0.032768 | 0.032629 | 0.16721 | -0.004793 | 1.0 | -0.012024 | -0.02114 | -0.026607 | 0.049222 | 0.076538 |
| idle_time | 0.003796 | -0.056181 | 0.056863 | -0.026299 | 0.031038 | -0.012024 | 1.0 | 0.559146 | -0.011598 | 0.058049 | -0.080851 |
| idle_men | 0.026974 | -0.053818 | 0.105901 | -0.048718 | -0.017913 | -0.02114 | 0.559146 | 1.0 | 0.133632 | 0.106946 | -0.181734 |
| no_of_style_change | -0.011194 | -0.209294 | 0.315388 | -0.072357 | 0.05979 | -0.026607 | -0.011598 | 0.133632 | 1.0 | 0.327787 | -0.207366 |
| no_of_workers | -0.075113 | -0.084288 | 0.912176 | 0.030383 | 0.734164 | 0.049222 | 0.058049 | 0.106946 | 0.327787 | 1.0 | -0.057991 |
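To interpret a correlation matrix of this size, it can help to rank the variable pairs by the absolute value of their correlation. The sketch below does this for a four-variable excerpt of the matrix above; the helper `strongest_pairs` and the name `corr_excerpt` are hypothetical, introduced only for illustration.

```python
from itertools import combinations

import pandas as pd

# Excerpt of the correlation matrix above (four of the eleven variables)
names = ["smv", "over_time", "no_of_workers", "idle_men"]
corr_excerpt = pd.DataFrame(
    [
        [1.0, 0.674887, 0.912176, 0.105901],
        [0.674887, 1.0, 0.734164, -0.017913],
        [0.912176, 0.734164, 1.0, 0.106946],
        [0.105901, -0.017913, 0.106946, 1.0],
    ],
    index=names,
    columns=names,
)


def strongest_pairs(corr, top=3):
    """Return the `top` variable pairs ranked by absolute correlation."""
    # Visit each unordered pair once, which skips the diagonal and mirrored entries
    pairs = {(a, b): corr.loc[a, b] for a, b in combinations(corr.columns, 2)}
    ranked = pd.Series(pairs).sort_values(key=lambda s: s.abs(), ascending=False)
    return ranked.head(top)


top_pairs = strongest_pairs(corr_excerpt)
print(top_pairs)
```

For this excerpt, the strongest pair is `smv` and `no_of_workers` (0.912176), followed by `over_time` and `no_of_workers` (0.734164), then `smv` and `over_time` (0.674887). A high correlation indicates a strong linear association, not that one variable causes the other.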
