# Inventory Management System
## Week 2 - Stock Processing in Python

**Tools:** Python (pandas, NumPy)

**Tasks:**
* Load stock movement logs (CSV or JSON)
* Clean and transform data (e.g., date formatting, quantity validation)
* Calculate current stock levels using NumPy
* Use pandas to flag items below the reorder threshold

**Deliverables:**

* Python script to compute current stock
* Summary report of low-stock items

In [1]:
from google.colab import drive
drive.mount('/content/mydrive')

Mounted at /content/mydrive


**1. DATA EXTRACTION**

In [8]:
import pandas as pd
import numpy as np

data_path = (
    "/content/mydrive/MyDrive/Hexware_Training_DataEngineering/Project/"
    "Inventory_Management_System/Week-02/stock_management.csv"
    )

df = pd.read_csv(data_path)
df.head()

| | movement_id | product_id | warehouse_id | quantity | movement_type | movement_date | reference_number | reason |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 401 | 101 | 201 | 15 | IN | 2025-01-05 09:15:00 | PO-1001 | Initial stock |
| 1 | 402 | 102 | 202 | 50 | IN | 2025-01-05 10:30:00 | PO-1002 | Bulk order |
| 2 | 403 | 103 | 203 | 30 | IN | 2025-01-06 11:45:00 | PO-1003 | Quarterly restock |
| 3 | 404 | 104 | 204 | 25 | IN | 2025-01-06 14:20:00 | PO-1004 | New product line |
| 4 | 405 | 105 | 205 | 18 | IN | 2025-01-07 08:10:00 | PO-1005 | Pre-holiday stock |
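
The task list allows CSV or JSON logs, while this notebook reads only CSV. The following is a minimal sketch of a loader that picks the pandas reader from the file extension. `load_movements` is a hypothetical helper name, and the JSON file is assumed to be a list of records with the same field names as the CSV columns.

```python
from pathlib import Path

import pandas as pd


def load_movements(path):
    """Load a stock movement log from a CSV or JSON file.

    Hypothetical helper: assumes a JSON log is a list of records
    whose keys match the CSV column names.
    """
    suffix = Path(path).suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix == ".json":
        return pd.read_json(path, orient="records")
    # Any other extension is rejected rather than guessed at
    raise ValueError(f"Unsupported file type: {suffix!r}")
```

With this helper, `df = load_movements(data_path)` would replace the direct `pd.read_csv(data_path)` call.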


In [9]:
print("Dataframe Info:\n")
df.info()

print("\nMissing Values:\n")
print(df.isnull().sum())


Dataframe Info:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 16 entries, 0 to 15
Data columns (total 8 columns):
 #   Column            Non-Null Count  Dtype 
---  ------            --------------  ----- 
 0   movement_id       16 non-null     int64 
 1   product_id        16 non-null     int64 
 2   warehouse_id      16 non-null     int64 
 3   quantity          16 non-null     int64 
 4   movement_type     15 non-null     object
 5   movement_date     14 non-null     object
 6   reference_number  13 non-null     object
 7   reason            14 non-null     object
dtypes: int64(4), object(4)
memory usage: 1.1+ KB

Missing Values:

movement_id         0
product_id          0
warehouse_id        0
quantity            0
movement_type       1
movement_date       2
reference_number    3
reason              2
dtype: int64


In [14]:
print(df)

    movement_id  product_id  warehouse_id  quantity movement_type  \
0           401         101           201        15            IN   
1           402         102           202        50            IN   
2           403         103           203        30            IN   
3           404         104           204        25            IN   
4           405         105           205        18            IN   
6           407         107           207         0            IN   
10          411         101           201         5           OUT   
11          412         102           202        10           OUT   
12          413         103           203         8           OUT   
13          414         104           204         3           OUT   
15          416         101           201        10            IN   

         movement_date reference_number                 reason  
0  2025-01-05 09:15:00          PO-1001          Initial stock  
1  2025-01-05 10:30:00          PO-1002  

**2. DATA CLEANING**

In [20]:
# Handle missing and invalid values

# Parse movement_date as datetime; unparseable values become NaT
df['movement_date'] = pd.to_datetime(df['movement_date'], errors='coerce')

# Normalize movement_type to upper case
df['movement_type'] = df['movement_type'].str.upper()

# Drop rows with a missing movement_type or movement_date
df = df.dropna(subset=['movement_type', 'movement_date'])

# Drop rows with a zero product_id or warehouse_id; .copy() avoids
# SettingWithCopyWarning on the assignments below
df = df[(df['product_id'] != 0) & (df['warehouse_id'] != 0)].copy()

# Fill missing text fields with placeholders
df['reference_number'] = df['reference_number'].fillna('Not Applicable')
df['reason'] = df['reason'].fillna('Not Specified')

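# Quantity: coerce to numeric, drop unparseable (NaN) and zero rows,
# then cast to int (casting before dropping NaN would raise an error)
df = df.assign(quantity=pd.to_numeric(df['quantity'], errors='coerce'))
df = df[df['quantity'].notna() & (df['quantity'] != 0)].copy()
df['quantity'] = df['quantity'].astype(int)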

# Signed quantity: OUT movements reduce stock, all others add to it
df['net_quantity'] = np.where(
    df['movement_type'] == 'OUT', -df['quantity'], df['quantity']
)

df_clean = df.copy()
print(df_clean)




    movement_id  product_id  warehouse_id  quantity movement_type  \
0           401         101           201        15            IN   
1           402         102           202        50            IN   
2           403         103           203        30            IN   
3           404         104           204        25            IN   
4           405         105           205        18            IN   
10          411         101           201         5           OUT   
11          412         102           202        10           OUT   
12          413         103           203         8           OUT   
13          414         104           204         3           OUT   
15          416         101           201        10            IN   

         movement_date reference_number                reason  net_quantity  
0  2025-01-05 09:15:00          PO-1001         Initial stock            15  
1  2025-01-05 10:30:00          PO-1002            Bulk order            50  
2  202
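
To illustrate what `errors='coerce'` does in the cleaning cell, here is a small self-contained sketch on made-up rows (the values in `raw` are invented for illustration and are not taken from `stock_management.csv`). Unparseable dates become `NaT`, unparseable quantities become `NaN`, and both are filtered out before the integer cast.

```python
import numpy as np
import pandas as pd

# Toy rows, made up for illustration only
raw = pd.DataFrame({
    "quantity": ["15", "abc", "0", "5"],
    "movement_type": ["in", "IN", "OUT", "out"],
    "movement_date": [
        "2025-01-05 09:15:00",
        "not a date",
        "2025-01-06 11:45:00",
        "2025-01-07 08:10:00",
    ],
})

# errors='coerce' turns unparseable values into NaT / NaN instead of raising
raw["movement_date"] = pd.to_datetime(raw["movement_date"], errors="coerce")
raw["quantity"] = pd.to_numeric(raw["quantity"], errors="coerce")
raw["movement_type"] = raw["movement_type"].str.upper()

# Keep rows with a valid date and a non-zero numeric quantity
valid = (
    raw["movement_date"].notna()
    & raw["quantity"].notna()
    & (raw["quantity"] != 0)
)
clean = raw[valid].copy()
clean["quantity"] = clean["quantity"].astype(int)

# OUT movements count negatively towards stock
clean["net_quantity"] = np.where(
    clean["movement_type"] == "OUT", -clean["quantity"], clean["quantity"]
)
print(clean)
```

Only the first and last toy rows survive: the second has an invalid quantity and date, and the third has a zero quantity.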

In [21]:
# Final Check
print("Dataframe Info:\n")
df.info()

print("\nMissing Values:\n")
print(df.isnull().sum())


Dataframe Info:

<class 'pandas.core.frame.DataFrame'>
Index: 10 entries, 0 to 15
Data columns (total 9 columns):
 #   Column            Non-Null Count  Dtype         
---  ------            --------------  -----         
 0   movement_id       10 non-null     int64         
 1   product_id        10 non-null     int64         
 2   warehouse_id      10 non-null     int64         
 3   quantity          10 non-null     int64         
 4   movement_type     10 non-null     object        
 5   movement_date     10 non-null     datetime64[ns]
 6   reference_number  10 non-null     object        
 7   reason            10 non-null     object        
 8   net_quantity      10 non-null     int64         
dtypes: datetime64[ns](1), int64(5), object(3)
memory usage: 800.0+ bytes

Missing Values:

movement_id         0
product_id          0
warehouse_id        0
quantity            0
movement_type       0
movement_date       0
reference_number    0
reason              0
net_quantity        0
dtype: int64

In [22]:
# Save the cleaned data
clean_data_path = (
    "/content/mydrive/MyDrive/Hexware_Training_DataEngineering/Project/"
    "Inventory_Management_System/Week-02/cleaned_stock_data.csv"
    )
df_clean.to_csv(clean_data_path, index=False)
print("Cleaned Stock Data saved successfully.")

Cleaned Stock Data saved successfully.


**3. STOCK REPORT**

In [27]:
# Current stock per product: sum of the signed quantities
stock_summary = df_clean.groupby('product_id')['net_quantity'].sum().reset_index()
stock_summary = stock_summary.rename(columns={'net_quantity': 'current_stock'})
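
The task list asks for stock levels computed with NumPy, while the cell above uses a pandas `groupby`. As a hedged alternative, the same per-product totals can be sketched with `np.unique` and `np.add.at`. The arrays below are copied by hand from the cleaned movements printed earlier; in the notebook they would presumably come from `df_clean['product_id'].to_numpy()` and similar calls.

```python
import numpy as np

# Values copied from the cleaned movement rows printed earlier
product_ids = np.array([101, 102, 103, 104, 105, 101, 102, 103, 104, 101])
quantities = np.array([15, 50, 30, 25, 18, 5, 10, 8, 3, 10])
is_out = np.array([False] * 5 + [True] * 4 + [False])

# Signed quantities: OUT movements are negative
net = np.where(is_out, -quantities, quantities)

# Map each product_id to a dense index, then accumulate net quantities per index
unique_ids, inverse = np.unique(product_ids, return_inverse=True)
current_stock = np.zeros(len(unique_ids), dtype=np.int64)
np.add.at(current_stock, inverse, net)

for pid, stock in zip(unique_ids, current_stock):
    print(pid, stock)
```

`np.add.at` is used here instead of `current_stock[inverse] += net` because the plain indexed assignment would apply only one update per repeated index.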

In [28]:

# Reorder threshold per product_id; stock below it is flagged as low
reorder_levels = {
    101: 20,
    102: 25,
    103: 30,
    104: 15,
    105: 10,
    106: 50,
    107: 10,
    108: 20,
    109: 10
}

stock_summary['reorder_level'] = stock_summary['product_id'].map(reorder_levels)
stock_summary['low_stock'] = stock_summary['current_stock'] < stock_summary['reorder_level']
stock_summary.head()

| | product_id | current_stock | reorder_level | low_stock |
| --- | --- | --- | --- | --- |
| 0 | 101 | 20 | 20 | False |
| 1 | 102 | 40 | 25 | False |
| 2 | 103 | 22 | 30 | True |
| 3 | 104 | 22 | 15 | False |
| 4 | 105 | 18 | 10 | False |
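
For the low-stock report it can help to show how far each item is below its threshold, and to check for products that have no entry in `reorder_levels` (`.map()` yields `NaN` for those, and a comparison against `NaN` is always `False`, so they would never be flagged). A minimal sketch using the summary values above; the `shortfall` column and the `unmapped` check are illustrative additions, not part of the original script.

```python
import pandas as pd

# Summary values copied from the table above
stock_summary = pd.DataFrame({
    "product_id": [101, 102, 103, 104, 105],
    "current_stock": [20, 40, 22, 22, 18],
    "reorder_level": [20, 25, 30, 15, 10],
})

# Products without a configured reorder level (none in this data)
unmapped = stock_summary[stock_summary["reorder_level"].isna()]

# Shortfall: units below the reorder level, 0 when stock is sufficient
stock_summary["shortfall"] = (
    stock_summary["reorder_level"] - stock_summary["current_stock"]
).clip(lower=0)

# A positive shortfall is equivalent to current_stock < reorder_level
low_stock_report = stock_summary[stock_summary["shortfall"] > 0]
print(low_stock_report)
```

Product 101 sits exactly at its reorder level, so it is not flagged, which matches the strict `<` comparison used in the notebook.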


**4. LOADING DATA**


In [29]:
summary_path = (
    "/content/mydrive/MyDrive/Hexware_Training_DataEngineering/Project/"
    "Inventory_Management_System/Week-02/final_stock_report.csv"
    )
low_stock_path = (
    "/content/mydrive/MyDrive/Hexware_Training_DataEngineering/Project/"
    "Inventory_Management_System/Week-02/low_stock_report.csv"
    )

# Full summary, plus a second file with only the low-stock rows
stock_summary.to_csv(summary_path, index=False)
stock_summary[stock_summary['low_stock']].to_csv(low_stock_path, index=False)
print("Final Stock Report saved successfully.")


Final Stock Report saved successfully.
