
## Economical Insurance Assignment

In [1]:
!pip install plotly==4.8.1
!pip install chart_studio
!pip install pyxlsb



In [3]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm
import chart_studio.plotly as py
import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, plot, iplot
pd.options.plotting.backend = "plotly"



In [4]:
data = pd.read_excel('/content/Sample Inventory Report_20200610_n.xlsx', 
                     header=0)
data.head()

Unnamed: 0,Identifier,Year,OEM,Model,Ordered Date,Assembly Completed Date,Date Shipped to Upfitter,Date Upfit Complete,Ship Date To Dealer,Delivered to Dealer,Delivered to Client
0,ABCDEFGH123456789,2019,MAN E,Model M,2019-05-09,2019-09-26,NaT,2019-10-11,2019-10-23,2019-11-12,2019-11-12
1,ABCDEFGH123456790,2019,MAN E,Model M Cargo,2019-05-09,2019-09-29,NaT,2019-10-11,2019-10-15,2019-10-16,2019-11-06
2,ABCDEFGH123456791,2019,MAN E,Model M,2019-05-09,2019-09-26,NaT,NaT,2019-10-20,2019-10-21,NaT
3,ABCDEFGH123456792,2019,MAN E,Model M,2019-05-09,2019-09-25,NaT,2019-10-08,2019-11-01,2019-11-01,2019-11-18
4,ABCDEFGH123456793,2019,MAN E,Model M,2019-05-09,2019-09-29,NaT,2019-10-14,NaT,2019-11-21,2019-11-21


In [5]:
## First step: inspect the shape, data types, and non-null counts of each column
print("Shape of Data :", data.shape)
data.info()

Shape of Data : (53570, 11)
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 53570 entries, 0 to 53569
Data columns (total 11 columns):
 #   Column                    Non-Null Count  Dtype         
---  ------                    --------------  -----         
 0   Identifier                53559 non-null  object        
 1   Year                      53570 non-null  int64         
 2   OEM                       53570 non-null  object        
 3   Model                     53570 non-null  object        
 4   Ordered Date              53570 non-null  datetime64[ns]
 5   Assembly Completed Date   49654 non-null  datetime64[ns]
 6   Date Shipped to Upfitter  18330 non-null  datetime64[ns]
 7   Date Upfit Complete       41406 non-null  datetime64[ns]
 8   Ship Date To Dealer       32118 non-null  datetime64[ns]
 9   Delivered to Dealer       41128 non-null  datetime64[ns]
 10  Delivered to Client       37230 non-null  datetime64[ns]
dtypes: datetime64[ns](7), int64(1), object(3)
memory usa

In [6]:
## Check for missing values in each column
print(data.isna().sum())

Identifier                     11
Year                            0
OEM                             0
Model                           0
Ordered Date                    0
Assembly Completed Date      3916
Date Shipped to Upfitter    35240
Date Upfit Complete         12164
Ship Date To Dealer         21452
Delivered to Dealer         12442
Delivered to Client         16340
dtype: int64
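
The per-column null counts above do not show which dates are missing together. A minimal sketch of counting missingness patterns follows. It assumes a small made-up frame named `toy` that borrows a few column names from the report; the values are illustrative only and are not taken from the data.

```python
import pandas as pd

# Toy stand-in for the report; the values are made up for illustration.
toy = pd.DataFrame({
    "Ordered Date": pd.to_datetime(["2019-05-09", "2019-05-09", "2020-01-13"]),
    "Assembly Completed Date": pd.to_datetime(["2019-09-26", None, None]),
    "Date Shipped to Upfitter": pd.to_datetime([None, None, None]),
    "Delivered to Client": pd.to_datetime(["2019-11-12", "2019-11-06", None]),
})

stage_cols = [c for c in toy.columns if c != "Ordered Date"]

# One row per distinct combination of missing (True) / present (False) stage dates,
# with the number of records that share that combination.
patterns = (
    toy[stage_cols]
    .isna()
    .value_counts()
    .rename("rows")
    .reset_index()
)
print(patterns)
```

On the real `data` frame the same idea would apply to the six stage columns after `Ordered Date`. A pattern where every stage is missing corresponds to parts that were only ordered.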


# Questions to be asked

## 1. Some items are missing the date shipped to the **upfitter**. Is the part manufactured and upfitted at the same place? If not, in which scenarios is the upfitter ship date not recorded?

## 2. For some items the assembly completed date is missing, yet the product was delivered to the dealer or client (for example **ABCDEFGH123459158**). Is this a normal scenario?

## 3.



# **Parts that were ordered but did not reach further stages**

In [16]:
## Parts that were ordered but never reached any later stage

data_only_ordered = data[data['Assembly Completed Date'].isna() &
                         data['Date Shipped to Upfitter'].isna() &
                         data['Date Upfit Complete'].isna() &
                         data['Ship Date To Dealer'].isna() &
                         data['Delivered to Dealer'].isna() &
                         data['Delivered to Client'].isna()]
print("Products that are only ordered and have not moved to further stages:", len(data_only_ordered))
data_only_ordered.head()

Products that are only ordered and have not moved to further stages: 3828


Unnamed: 0,Identifier,Year,OEM,Model,Ordered Date,Assembly Completed Date,Date Shipped to Upfitter,Date Upfit Complete,Ship Date To Dealer,Delivered to Dealer,Delivered to Client
6364,ABCDEFGH123463458,2019,MAN E,Model M,2019-02-19,NaT,NaT,NaT,NaT,NaT,NaT
13839,ABCDEFGH123471117,2020,MAN C,Model B,2020-01-09,NaT,NaT,NaT,NaT,NaT,NaT
14181,ABCDEFGH123471493,2021,MAN C,Model G,2020-01-13,NaT,NaT,NaT,NaT,NaT,NaT
14182,ABCDEFGH123471494,2021,MAN C,Model G,2020-01-13,NaT,NaT,NaT,NaT,NaT,NaT
14183,ABCDEFGH123471495,2021,MAN C,Model G,2020-01-13,NaT,NaT,NaT,NaT,NaT,NaT


In [98]:
# Distribution of only-ordered parts by order date

data_only_ordered_c = data_only_ordered.iloc[:, 0:5].copy()

# Count parts per order date once and reuse the result for the plot
parts_per_date = data_only_ordered_c.groupby('Ordered Date').count()['Year']
fig = parts_per_date.plot(title="Parts ordered but not progressed further, by order date",
                          template="seaborn",
                          labels=dict(index="Ordered Date",
                                      value="Number of Parts",
                                      variable="Option"))
fig.update_layout(showlegend=False)
fig.update_traces(mode="markers+lines")
fig.show()

In [99]:
import plotly.express as px

df = data_only_ordered_c.groupby(['Ordered Date', 'OEM'], as_index=False).count()
fig = px.line(df, x="Ordered Date", y="Year", color="OEM",
              title="Parts by OEM that were only ordered but not processed further",
              labels= dict(Year = 'Number of Parts'))
fig.update_traces(mode="markers+lines")
fig.update_layout(hovermode="x unified")

fig.show()

In [97]:
df = data_only_ordered_c.groupby('OEM', as_index=False).count()[['OEM','Identifier' ]]
df.rename(columns= {'Identifier': 'Number of parts'}, inplace = True)
df

Unnamed: 0,OEM,Number of parts
0,MAN B,1
1,MAN C,84
2,MAN E,1
3,MAN F,525
4,MISC,0
5,Man A,3206
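
The `MISC` row shows 0 because `count()` on `Identifier` counts only non-null values. The per-OEM counts above sum to 3,817, which is 11 fewer than the 3,828 only-ordered rows and matches the 11 missing identifiers reported earlier. The sketch below illustrates the difference between `count` and `size` on a made-up frame; `toy` and its values are assumptions for illustration, not rows from the report.

```python
import pandas as pd

# Made-up rows: two MISC records have no identifier.
toy = pd.DataFrame({
    "Identifier": ["A1", "A2", None, None],
    "OEM": ["Man A", "Man A", "MISC", "MISC"],
})

by_oem = toy.groupby("OEM").agg(
    non_null_identifiers=("Identifier", "count"),  # skips missing values
    rows=("Identifier", "size"),                   # counts every row in the group
)
print(by_oem)
```

If the goal is the number of records per OEM regardless of whether the identifier is present, `size` is the safer aggregate.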


## Observation

**Man A** and **MAN F** account for almost all parts that were ordered but never reached a later stage (3,731 of 3,828), which suggests an order-processing issue specific to these two **Original Equipment Manufacturers (OEMs)**:

1. Man A : **3206 Parts**
2. MAN F : **525 Parts**

## Question

What is the order collection process for OEMs MAN F and Man A? How long does it take for their parts to move to the next stage?
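
One way to approach the timing question is to measure, for parts that did progress, the number of days between order and assembly completion per OEM. This is a minimal sketch on a made-up frame; `toy`, `days_to_assembly`, and all dates are hypothetical and only borrow the report's column names.

```python
import pandas as pd

# Made-up orders; the last one has not been assembled yet.
toy = pd.DataFrame({
    "OEM": ["Man A", "Man A", "MAN F", "MAN F"],
    "Ordered Date": pd.to_datetime(
        ["2019-05-09", "2019-05-09", "2019-12-04", "2019-12-13"]),
    "Assembly Completed Date": pd.to_datetime(
        ["2019-05-19", "2019-05-29", "2019-12-14", None]),
})

# Days from order to assembly completion; missing dates propagate as NaN.
toy["days_to_assembly"] = (
    toy["Assembly Completed Date"] - toy["Ordered Date"]
).dt.days

# count ignores NaN, so it reports how many parts actually have a lead time.
lead_time = toy.groupby("OEM")["days_to_assembly"].agg(["count", "median", "max"])
print(lead_time)
```

Comparing the typical lead time with the age of the still-open orders would indicate whether those orders are overdue or simply recent.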


In [100]:
## Analysing OEMs Man A and MAN F by model
df = data_only_ordered_c[data_only_ordered_c.OEM.isin(['Man A', 'MAN F'])].groupby(['Ordered Date', 'OEM', 'Model'],
                                                                                    as_index=False).count()
fig = px.line(df, x="Ordered Date", y="Year", color="Model", 
              facet_col = "OEM",
              title="Only-ordered parts by model for OEMs Man A and MAN F", 
              labels= dict(Year = 'Number of Parts'))
fig.update_traces(mode="markers+lines")
fig.update_layout(hovermode="x unified")

fig.show()

In [101]:
df = data_only_ordered_c.groupby(['OEM', 'Model'], as_index=False).count()[['OEM','Model','Identifier' ]]
df.rename(columns= {'Identifier': 'Number of parts'}, inplace = True)
df

Unnamed: 0,OEM,Model,Number of parts
0,MAN B,Model H,1
1,MAN C,Model B,1
2,MAN C,Model G,83
3,MAN E,Model M,1
4,MAN F,Model J,1
5,MAN F,Model L,524
6,MISC,Model P,0
7,Man A,Model E CUTAWAY,2
8,Man A,Model F COMMERCIAL STRIPPED,1375
9,Man A,Model N CARGO,1829


## Observation

1. Orders were placed in large quantities on **Dec 4, 2019** and **Dec 13, 2019**.

2. OEM **Man A** models **Model F COMMERCIAL STRIPPED** and **Model N CARGO** were the top two models ordered in high volume but not processed further (1,375 and 1,829 parts respectively).

## Question
1. What is the average time these two models take to move to the next stage?
2. Is there a minimum order quantity for these two models? The order volume is quite high.
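
To put the open orders in context, it can help to compute how long each has been waiting. The sketch below assumes the report date is 2020-06-10, inferred from the `20200610` in the file name; that date, the `toy` frame, and the `days_open` column are assumptions for illustration.

```python
import pandas as pd

# Assumption: report date inferred from "20200610" in the file name.
report_date = pd.Timestamp("2020-06-10")

# Made-up open orders using the two order dates noted in the observation.
toy = pd.DataFrame({
    "Model": ["Model N CARGO", "Model N CARGO", "Model F COMMERCIAL STRIPPED"],
    "Ordered Date": pd.to_datetime(["2019-12-04", "2019-12-13", "2019-12-04"]),
})

# Age of each open order in days as of the assumed report date.
toy["days_open"] = (report_date - toy["Ordered Date"]).dt.days

age_by_model = toy.groupby("Model")["days_open"].agg(["count", "min", "max"])
print(age_by_model)
```

Under that assumption, an order placed on Dec 4, 2019 has been open for 189 days and one placed on Dec 13, 2019 for 180 days.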


# **Parts without an assembly date that proceeded to further stages**

In [None]:
## Parts that have no assembly completed date but have reached at least one later stage

data_no_assembly_date  = data[(data['Assembly Completed Date'].isna() ) & (( data['Delivered to Dealer'].notna() ) |
                                                 ( data['Date Shipped to Upfitter'].notna() )|
                                                 ( data['Delivered to Client'].notna() )| 
                                                 ( data['Date Upfit Complete'].notna() ) |
                                                 ( data['Ship Date To Dealer'].notna()  )) ]


print("Products without an assembly date that reached later stages:", len(data_no_assembly_date))
data_no_assembly_date.head()
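
For these records it may also be useful to know the furthest stage each part reached. The following is a minimal sketch, assuming that the column order in the report reflects the supply-chain sequence; that ordering, the `toy` frame, and the `furthest` series are assumptions and are not confirmed by the data.

```python
import pandas as pd

# Assumed supply-chain order, taken from the column order of the report.
stage_cols = [
    "Assembly Completed Date", "Date Shipped to Upfitter", "Date Upfit Complete",
    "Ship Date To Dealer", "Delivered to Dealer", "Delivered to Client",
]

# Made-up records: reached the dealer, only ordered, only assembled.
toy = pd.DataFrame({
    "Assembly Completed Date": pd.to_datetime([None, None, "2019-09-26"]),
    "Date Shipped to Upfitter": pd.to_datetime([None, None, None]),
    "Date Upfit Complete": pd.to_datetime(["2019-10-11", None, None]),
    "Ship Date To Dealer": pd.to_datetime([None, None, None]),
    "Delivered to Dealer": pd.to_datetime(["2019-11-12", None, None]),
    "Delivered to Client": pd.to_datetime([None, None, None]),
})

present = toy[stage_cols].notna()

# Scan stages from last to first so idxmax picks the latest stage with a date;
# rows with no stage date at all are set to missing.
furthest = present[stage_cols[::-1]].idxmax(axis=1).where(present.any(axis=1))
print(furthest)
```

Grouping the real subset by this value would show where parts without an assembly date tend to end up.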