The data was pulled from the client’s inventory database using Microsoft SQL Server. It covers a two-year history of the products issued from the distribution center to the facilities.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import date, datetime, timedelta


df = pd.read_csv(
    r"F:\Shr_Priv\Materials_Mgt_DataTeam\Laura Eshee\Demand Forcasting\Demand Forecasting Query DeID.csv",
    index_col=0,
    low_memory=False,
    skipinitialspace=True,
)

df.reset_index(inplace=True, drop=True)

print(df.head())
df.info()  # info() prints its own report and returns None, so no print() is needed


   Company Company Name   Item                     Description  Qty STOCK_UOM  \
0      100          WSS  82834  SET BLOOD 23GA12IN PSHBTN W LL  600      EA     
1      100          WSS  82834  SET BLOOD 23GA12IN PSHBTN W LL  400      EA     
2      100          WSS  82834  SET BLOOD 23GA12IN PSHBTN W LL  600      EA     
3      100          WSS  82834  SET BLOOD 23GA12IN PSHBTN W LL  900      EA     
4      100          WSS  82834  SET BLOOD 23GA12IN PSHBTN W LL  400      EA     

  Trans_UOM  UNIT_COST  Ext Amount TRANS_DATE  ... ALT_UOM_CONV_03  \
0      BX         1.25       750.0   7/8/2019  ...             200   
1      BX         1.25       500.0   7/9/2019  ...             200   
2      BX         1.25       750.0  7/10/2019  ...             200   
3      BX         1.25      1125.0  8/26/2019  ...             200   
4      BX         1.25       500.0  8/22/2019  ...             200   

  ALT_UOM_CONV_04  ALT_UOM_CONV_05 ALT_UOM_CONV_06  BUY_FL_01  BUY_FL_02  \
0               

The date column was imported as an object data type. It will now be converted to a datetime data type.

In [2]:
df['TRANS_DATE'] = pd.to_datetime(df['TRANS_DATE'])
df.info()  # info() prints its own report and returns None

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 474649 entries, 0 to 474648
Data columns (total 46 columns):
Company            474649 non-null int64
Company Name       474649 non-null object
Item               474649 non-null int64
Description        474649 non-null object
Qty                474649 non-null int64
STOCK_UOM          474649 non-null object
Trans_UOM          474383 non-null object
UNIT_COST          474649 non-null float64
Ext Amount         474649 non-null float64
TRANS_DATE         474649 non-null datetime64[ns]
Item Type          474649 non-null object
Sys                474649 non-null object
Document           474649 non-null int64
Doc Type           474649 non-null object
Line Nbr           474649 non-null int64
Req Nbr            474649 non-null int64
From Location      474649 non-null int64
From Loc Name      474649 non-null object
Req Location       474649 non-null object
TRACKING_FL_01     474649 non-null object
TRACKING_FL_02     0 non-null float64
TRACKING
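
The sample rows shown earlier suggest that TRANS_DATE is stored as month/day/year text (for example 7/8/2019). A minimal sketch, assuming that format holds for every row, passes an explicit format so dates cannot be misread as day/month, and counts any values that fail to parse. The toy frame and the name `n_bad` are made up for illustration.

```python
import pandas as pd

# Toy data for illustration only; the last value is deliberately invalid.
sample = pd.DataFrame({"TRANS_DATE": ["7/8/2019", "8/26/2019", "not a date"]})

# An explicit format avoids day/month guessing; errors="coerce" turns
# unparseable strings into NaT instead of raising an exception.
sample["TRANS_DATE"] = pd.to_datetime(
    sample["TRANS_DATE"], format="%m/%d/%Y", errors="coerce"
)

# Count the rows that could not be parsed so they can be inspected.
n_bad = int(sample["TRANS_DATE"].isna().sum())
print(sample.dtypes)
print(f"Unparseable dates: {n_bad}")
```

If the count is nonzero on the real data, those rows would need to be reviewed before forecasting.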

The client requested that the unit of measure (UOM) used for forecasting be the tracked UOM. The tracked UOM is denoted in the data by an X in one of the BUY_FL_ columns. The code below looks up, for each row, the conversion factor (ALT_UOM_CONV_) of the flagged UOM; rows with no flag get a factor of 1.

In [3]:
def conv(row):
    # Return the conversion factor of the UOM flagged with an X in BUY_FL_01..06.
    # With axis=1 each row arrives as a Series, so row[...] is a scalar and is
    # passed to pd.to_numeric directly (a scalar has no .apply method).
    for n in range(1, 7):
        if row[f'BUY_FL_0{n}'] == 'X':
            return pd.to_numeric(row[f'ALT_UOM_CONV_0{n}'], errors='coerce')
    # No flagged UOM: leave the quantity unchanged.
    return 1

df['Track_UOM_Conv'] = df.apply(conv, axis=1)

print(df['Track_UOM_Conv'])

0         1
1         1
2         1
3         1
4         1
         ..
474644    1
474645    1
474646    1
474647    1
474648    1
Name: Track_UOM_Conv, Length: 474649, dtype: int64
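
A row-wise `apply` over roughly 475,000 rows can be slow. As a possible alternative, the sketch below does the same first-match lookup with `np.select`, which works on whole columns at once. The toy frame uses only two of the six flag columns, and its values are invented for illustration; this is not the method used in the notebook.

```python
import numpy as np
import pandas as pd

# Toy rows for illustration; column names follow the data set above.
toy = pd.DataFrame({
    "Qty": [3, 5, 2],
    "BUY_FL_01": ["X", None, None],
    "BUY_FL_02": [None, "X", None],
    "ALT_UOM_CONV_01": ["10", "10", "10"],
    "ALT_UOM_CONV_02": ["200", "200", "200"],
})

suffixes = ["01", "02"]  # the real data would use "01" through "06"

# One boolean condition and one numeric choice per flag column.
conditions = [toy[f"BUY_FL_{s}"].eq("X") for s in suffixes]
choices = [
    pd.to_numeric(toy[f"ALT_UOM_CONV_{s}"], errors="coerce") for s in suffixes
]

# np.select takes the first matching condition per row, like an if/elif chain;
# rows with no flag fall back to a factor of 1.
toy["Track_UOM_Conv"] = np.select(conditions, choices, default=1)
print(toy[["Qty", "Track_UOM_Conv"]])
```

Running `value_counts()` on the resulting column is a quick way to see how many rows received a factor other than 1.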


Using the conversion amount calculated above, the transaction quantity is converted to the tracked UOM quantity by multiplying Qty by Track_UOM_Conv.

In [4]:
df['conv_qty'] = (df.Track_UOM_Conv * df.Qty).astype(int)

print(df.conv_qty)
df.info()  # info() prints its own report and returns None

0          600
1          400
2          600
3          900
4          400
          ... 
474644    2000
474645     144
474646     120
474647     200
474648      48
Name: conv_qty, Length: 474649, dtype: int32
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 474649 entries, 0 to 474648
Data columns (total 48 columns):
Company            474649 non-null int64
Company Name       474649 non-null object
Item               474649 non-null int64
Description        474649 non-null object
Qty                474649 non-null int64
STOCK_UOM          474649 non-null object
Trans_UOM          474383 non-null object
UNIT_COST          474649 non-null float64
Ext Amount         474649 non-null float64
TRANS_DATE         474649 non-null datetime64[ns]
Item Type          474649 non-null object
Sys                474649 non-null object
Document           474649 non-null int64
Doc Type           474649 non-null object
Line Nbr           474649 non-null int64
Req Nbr            474649 non-null int64
Fro
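
One caveat with `astype(int)`: it raises an error if the product contains a missing value, which could happen if a conversion factor failed to parse. A hedged sketch, using invented toy values, counts the missing products first and then casts to the nullable `Int64` dtype, which keeps missing values as `<NA>`.

```python
import pandas as pd

# Toy values for illustration; the second factor is deliberately missing.
toy = pd.DataFrame({
    "Qty": [3, 5, 2],
    "Track_UOM_Conv": [10.0, float("nan"), 1.0],
})

product = toy["Track_UOM_Conv"] * toy["Qty"]

# Count rows whose converted quantity could not be computed.
n_missing = int(product.isna().sum())

# The nullable "Int64" dtype stores whole numbers and keeps NaN as <NA>,
# whereas astype(int) would raise on the missing value.
toy["conv_qty"] = product.astype("Int64")
print(toy)
print(f"Rows with no converted quantity: {n_missing}")
```

The 64-bit width also leaves more headroom than the int32 result shown in the output above.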

The data wrangling is complete, and the wrangled data frame is saved to a CSV file.

In [5]:
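# Raw string (r"...") so the backslashes in the Windows path are not read as escape sequences.
df.to_csv(r"F:\Shr_Priv\Materials_Mgt_DataTeam\Laura Eshee\Demand Forcasting\Output Files\Demand Forecasting Data Wrangling.csv")
df.info()  # info() prints its own report and returns None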

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 474649 entries, 0 to 474648
Data columns (total 48 columns):
Company            474649 non-null int64
Company Name       474649 non-null object
Item               474649 non-null int64
Description        474649 non-null object
Qty                474649 non-null int64
STOCK_UOM          474649 non-null object
Trans_UOM          474383 non-null object
UNIT_COST          474649 non-null float64
Ext Amount         474649 non-null float64
TRANS_DATE         474649 non-null datetime64[ns]
Item Type          474649 non-null object
Sys                474649 non-null object
Document           474649 non-null int64
Doc Type           474649 non-null object
Line Nbr           474649 non-null int64
Req Nbr            474649 non-null int64
From Location      474649 non-null int64
From Loc Name      474649 non-null object
Req Location       474649 non-null object
TRACKING_FL_01     474649 non-null object
TRACKING_FL_02     0 non-null float64
TRACKING
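
A CSV file stores dates as plain text, so the datetime dtype is lost when the file is read back unless the column is parsed again. The sketch below shows one way to handle that, using an in-memory buffer and invented toy rows in place of the real file. Writing with `index=False` is an assumption made here so that the row index is not saved as an extra unnamed column.

```python
import io

import pandas as pd

# Toy rows for illustration; column names follow the data set above.
toy = pd.DataFrame({
    "Item": [82834, 82834],
    "TRANS_DATE": pd.to_datetime(["2019-07-08", "2019-07-09"]),
    "conv_qty": [600, 400],
})

# Write to an in-memory buffer instead of a file on disk.
buffer = io.StringIO()
toy.to_csv(buffer, index=False)
buffer.seek(0)

# parse_dates restores the datetime dtype while reading.
reloaded = pd.read_csv(buffer, parse_dates=["TRANS_DATE"])
print(reloaded.dtypes)
```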

The data wrangling for this data set was straightforward. First, the TRANS_DATE column was converted from the object data type to datetime. Then the conversion factor from the transaction UOM to the tracked UOM was looked up, and the tracked quantity (conv_qty) was computed by multiplying Qty by that factor. Finally, the data set was saved to a CSV file. It is now ready for the next step.