<div style="text-align: center;">
  <h1>--: SHIPMENT TRACKING :--</h1>
</div>

## Introduction

This script defines a `ShipmentProcessor` class to ingest, transform, and export courier tracking data for analytical consumption. It:

1. **Loads** a nested JSON file of shipment events from our logistics partner.  
2. **Flattens** each shipment record—extracting tracking number, payment type, pickup/delivery timestamps, geolocation details, weight, and delivery attempt counts—into a tidy tabular format.  
3. **Exports** the cleaned dataset to `shipment_tracking_output.csv` and computes summary statistics (mean, median, mode) for journey durations and delivery attempts, saving them to `shipment_summary_statistics.csv`.  

Designed for clarity and maintainability, it uses Python's built-in `json` module, pandas for data handling, and `logging` for error reporting.
For simplicity, logging is configured to write to the console; in production environments, I extend this setup to write detailed log files for auditability and long-term monitoring.
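
To make the flattening target concrete, the sketch below shows the nested shape that the code in this notebook reads, inferred from the keys it accesses (`trackDetails`, `events`, `specialHandlings`, `datesOrTimes`). Every value is made up for illustration and does not come from the partner dataset; the real file may carry additional fields.

```python
# Hypothetical sample: the keys mirror what the notebook's code reads,
# but every value here is invented for illustration.
sample_data = [
    {
        "trackDetails": [
            {
                "trackingNumber": "TRK0001",
                "shipmentWeight": {"value": 1.5},
                "specialHandlings": [{"type": "COD"}],
                "datesOrTimes": [
                    {"type": "ACTUAL_PICKUP", "dateOrTimestamp": "2020-03-01T10:00:00+05:30"},
                    {"type": "ACTUAL_DELIVERY", "dateOrTimestamp": "2020-03-04T15:30:00+05:30"},
                ],
                "events": [
                    {
                        "eventType": "PU",  # pickup
                        "timestamp": {"$numberLong": "1583037000000"},
                        "address": {"city": "Bengaluru", "stateOrProvinceCode": "KA", "postalCode": "560001"},
                    },
                    {
                        "eventType": "OD",  # out for delivery
                        "timestamp": {"$numberLong": "1583292600000"},
                        "address": {"city": "Mumbai", "stateOrProvinceCode": "MH", "postalCode": "400001"},
                    },
                    {
                        "eventType": "DL",  # delivered
                        "timestamp": {"$numberLong": "1583316000000"},
                        "address": {"city": "Mumbai", "stateOrProvinceCode": "MH", "postalCode": "400001"},
                    },
                ],
            }
        ]
    }
]

# Walk the same path the flattening step takes: entry -> trackDetails -> shipment
shipment = sample_data[0]["trackDetails"][0]
event_types = [event["eventType"] for event in shipment["events"]]
print(event_types)  # ['PU', 'OD', 'DL']
```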


#### 1) Import Required Libraries

In [None]:
import json
import pandas as pd
import logging

#### 2) Set Up Logging for Monitoring and Exception Handling

In [None]:
logging.basicConfig(level=logging.INFO, format='%(levelname)s:%(message)s')
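
The introduction mentions extending this setup to write log files. Below is a minimal sketch of one way to do that with the standard library. The helper `build_logger`, the logger name `shipment_pipeline`, and the file name `shipment_pipeline.log` are hypothetical and not part of the original notebook.

```python
import logging

def build_logger(log_path: str) -> logging.Logger:
    """Return a logger that writes to both the console and a log file."""
    logger = logging.getLogger("shipment_pipeline")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    if not logger.handlers:   # avoid stacking handlers when a cell is re-run
        formatter = logging.Formatter("%(asctime)s %(levelname)s:%(message)s")
        handlers = (
            logging.StreamHandler(),                          # console
            logging.FileHandler(log_path, encoding="utf-8"),  # persistent log file
        )
        for handler in handlers:
            handler.setFormatter(formatter)
            logger.addHandler(handler)
    return logger

logger = build_logger("shipment_pipeline.log")  # hypothetical file name
logger.info("Pipeline started")
```

The timestamp in the format string is an addition that tends to matter more in a file than on the console, since a log file is usually read long after the run.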

#### 3) Define Required Classes and Objects

In [None]:
class ShipmentProcessor:
    # Processes shipment tracking data from nested JSON into a flattened and analyzable format.
    
    def __init__(self, json_path: str):
        self.json_path = json_path
        self.data = []
        self.records = []
        self.df = pd.DataFrame()

    def load_json(self) -> None:
        # Loads the JSON data from file
        try:
            logging.info("Loading JSON data...")
            with open(self.json_path, 'r', encoding='utf-8') as f:
                self.data = json.load(f)
            logging.info(f"Loaded {len(self.data)} records.")
        except Exception as e:
            logging.error(f"Failed to load JSON: {e}")
            raise

    def extract_fields(self, shipment: dict) -> dict:
        # Extract individual shipment fields
        try:
            # 1) Tracking Number:
            tracking_number = shipment.get('trackingNumber')
            
            # 2) Shipment weight:
            weight = shipment.get('shipmentWeight', {}).get('value')
            
            # 3) Pickup and drop locations, and out-for-delivery datetimes:
            event_details = shipment.get('events', [])
            ofd_dates = []
            # Default to None so a shipment without a PU or DL event still yields a row
            pickup_pincode = pickup_city = pickup_state = None
            drop_pincode = drop_city = drop_state = None
            # In the given data, pickup and delivery details are nested under 'events'
            for event in event_details:
                event_type = event.get('eventType')
                if event_type == 'PU':
                    pickup_details = event.get('address', {})
                    pickup_pincode = pickup_details.get('postalCode')
                    pickup_city = pickup_details.get('city')
                    pickup_state = pickup_details.get('stateOrProvinceCode')
                elif event_type == 'DL':
                    delivery_details = event.get('address', {})
                    drop_state = delivery_details.get('stateOrProvinceCode')
                    drop_city = delivery_details.get('city')
                    drop_pincode = delivery_details.get('postalCode')
                elif event_type == 'OD':
                    ts_str = event.get('timestamp', {}).get('$numberLong')
                    # Skip out-for-delivery events that carry no timestamp
                    if ts_str is not None:
                        ofd_dates.append(pd.to_datetime(int(ts_str), unit='ms'))
    
            # 4) Payment type:
            # As per given data, payment details are nested under 'specialHandlings'
            payment_details = shipment.get('specialHandlings', [])
            payment_type = 'Prepaid'
            for payment in payment_details:
                if payment.get('type') == 'COD':
                    payment_type = 'COD'
                    break
    
            # 5) Pickup and delivery datetimes:
            # In the given data, these are nested under 'datesOrTimes'
            pickup_date_time = delivery_date_time = None
            datetime_details = shipment.get('datesOrTimes', [])
            for date in datetime_details:
                if date.get('type') == 'ACTUAL_DELIVERY':
                    delivery_date_time = pd.to_datetime(date.get('dateOrTimestamp'))
                elif date.get('type') == 'ACTUAL_PICKUP':
                    pickup_date_time = pd.to_datetime(date.get('dateOrTimestamp'))
            # Fail with a clear message (logged below) instead of an unbound-variable error
            if pickup_date_time is None or delivery_date_time is None:
                raise ValueError("missing ACTUAL_PICKUP or ACTUAL_DELIVERY timestamp")

            # Calculation of required metrics

            # 6) Days taken for delivery:
            days_taken = (delivery_date_time - pickup_date_time).days
            
            # 7) Delivery attempts:
            # Each distinct out-for-delivery date is one attempt, plus one for the delivery itself
            ofd = {x.date() for x in ofd_dates}
            delivery_attempts = len(ofd) + 1
            # If delivery happens on an out-for-delivery date, it is not counted as an extra attempt
            if delivery_date_time.date() in ofd:
                delivery_attempts -= 1
            
            # Finally, return the extracted fields as a dictionary
            return {
                "Tracking Number": tracking_number,
                "Payment Type": payment_type,
                "Pickup Datetime": pickup_date_time,
                "Delivery Datetime": delivery_date_time,
                "Days Taken for Delivery": days_taken,
                "Shipment Weight": weight,
                "Pickup city": pickup_city,
                "Pickup state": pickup_state,
                "Pickup pincode": pickup_pincode,
                "Delivery city": drop_city,
                "Delivery state": drop_state,
                "Delivery pincode": drop_pincode,
                "Number of Delivery Attempts": delivery_attempts
            }
        except Exception as e:
            logging.error(f"Error extracting shipment: {e}")
            return {}

    def flatten(self) -> None:
        # Flattens all shipments into a records list
        try:
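            logging.info("Flattening data...")
            self.records = []  # reset so re-running this step does not duplicate rows
            for entry in self.data:
                for shipment in entry.get('trackDetails', []):
                    record = self.extract_fields(shipment)
                    # extract_fields returns {} on failure; skip those to avoid blank rows
                    if record:
                        self.records.append(record)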
            logging.info(f"Extracted {len(self.records)} flattened records.")
        except Exception as e:
            logging.error(f"Failed during flattening: {e}")
            raise

    def to_dataframe(self) -> pd.DataFrame:
        # Converts records to DataFrame
        try:
            self.df = pd.DataFrame(self.records)
            logging.info("Converted records to dataframe")
            return self.df
        except Exception as e:
            logging.error(f"Failed to convert records to DataFrame: {e}")
            raise

    def save_to_csv(self, filename: str) -> None:
        # Saves flattened shipment data to CSV
        try:
            self.df.to_csv(filename, index=False)
            logging.info(f"Saved shipment data to {filename}")
        except Exception as e:
            logging.error(f"Failed to save CSV: {e}")
            raise

    def safe_mode(self, series: pd.Series) -> float | None:
        # Returns the mode of a series (the smallest value on ties), or None if there is none
        mode_vals = series.mode()
        if not mode_vals.empty:
            return mode_vals.iloc[0]
        return None

    def get_summary_stats(self, series: pd.Series) -> list[float | None]:
        # Returns mean, median, and mode of a series, each rounded to 2 decimals
        mode = self.safe_mode(series)
        return [
            round(series.mean(), 2),
            round(series.median(), 2),
            # safe_mode returns None for an empty series, which round() cannot handle
            round(mode, 2) if mode is not None else None
        ]

    def export_summary_statistics(self, output_file: str) -> None:
        # Calculates and saves summary statistics to CSV
        try:
            # Dictionary to hold summary statistics
            summary = {
                "Metric": ["Mean", "Median", "Mode"],
                "Days Taken for Delivery": self.get_summary_stats(self.df['Days Taken for Delivery']),
                "Delivery Attempts": self.get_summary_stats(self.df['Number of Delivery Attempts'])
            }
            summary_df = pd.DataFrame(summary)
            summary_df.to_csv(output_file, index=False)
            logging.info(f"Saved summary statistics to {output_file}")
        except Exception as e:
            logging.error(f"Failed to compute summary: {e}")
            raise
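
The attempt-counting rule inside `extract_fields` is easier to check in isolation. The following is a sketch of the same rule as a standalone function; `count_delivery_attempts` and the sample timestamps are hypothetical, introduced only for illustration.

```python
import pandas as pd

def count_delivery_attempts(ofd_epoch_ms: list[str], delivered_at: str) -> int:
    """Count distinct out-for-delivery dates, plus one for the delivery
    itself unless it happened on one of those dates."""
    # '$numberLong' values arrive as strings holding epoch milliseconds
    ofd_dates = {pd.to_datetime(int(ms), unit="ms").date() for ms in ofd_epoch_ms}
    attempts = len(ofd_dates) + 1
    if pd.to_datetime(delivered_at).date() in ofd_dates:
        attempts -= 1
    return attempts

# Out for delivery on 2020-03-03 and 2020-03-04, delivered on 2020-03-04
two_days = count_delivery_attempts(
    ["1583206200000", "1583292600000"], "2020-03-04 10:00:00"
)
# No out-for-delivery event recorded at all
no_ofd = count_delivery_attempts([], "2020-03-04 10:00:00")
# Two out-for-delivery scans on the delivery date collapse into one attempt
same_day = count_delivery_attempts(
    ["1583292600000", "1583300000000"], "2020-03-04 10:00:00"
)
print(two_days, no_ofd, same_day)  # 2 1 1
```

Like the class method, this compares calendar dates as they are parsed, without normalizing time zones.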

#### 4) Final Actions 

#### Step 1: Initialize the Shipment Processor Class

In [None]:
processor = ShipmentProcessor('Swift Assignment 4 - Dataset.json')

#### Step 2: Load Shipment Tracking JSON Data

In [None]:
processor.load_json()

#### Step 3: Flatten Nested Shipment Records into a Tabular Format

In [None]:
processor.flatten()

#### Step 4: Convert Flattened Records into a Pandas DataFrame

In [None]:
processor.to_dataframe()

#### Step 5: Export Cleaned Shipment Data to CSV

In [None]:
processor.save_to_csv('shipment_tracking_output.csv')
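
One detail to keep in mind when consuming the exported file: CSV stores datetimes as text, so they need to be parsed again on read. The sketch below illustrates this round trip on a tiny made-up frame, using an in-memory buffer in place of a file on disk. The values are invented, and only a subset of the real columns is shown.

```python
import io
import pandas as pd

df = pd.DataFrame({
    "Tracking Number": ["TRK0001"],  # made-up values
    "Pickup Datetime": pd.to_datetime(["2020-03-01 10:00:00"]),
    "Days Taken for Delivery": [3],
})

buffer = io.StringIO()  # stands in for shipment_tracking_output.csv
df.to_csv(buffer, index=False)
buffer.seek(0)

# Without parse_dates, "Pickup Datetime" would come back as plain strings
restored = pd.read_csv(buffer, parse_dates=["Pickup Datetime"])
print(restored.dtypes)
```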

#### Step 6: Generate Summary Statistics and Export to CSV

In [None]:
processor.export_summary_statistics('shipment_summary_statistics.csv')
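
As a quick sanity check on how the three statistics behave, the sketch below computes them on a small made-up series. The values are illustrative only. `Series.mode()` can return several values when there is a tie, and it returns them sorted, so taking the first one picks the smallest.

```python
import pandas as pd

days = pd.Series([2, 3, 3, 5, 5, 7])  # made-up durations in days

mean_days = round(days.mean(), 2)
median_days = round(days.median(), 2)
modes = days.mode()         # 3 and 5 both appear twice
first_mode = modes.iloc[0]  # the notebook reports only the first mode

print(mean_days, median_days, list(modes), first_mode)
```

When the mode is tied like this, the single value written to `shipment_summary_statistics.csv` hides the other candidates, which is worth remembering when reading that file.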

#### Final Submissions:

- `shipment_tracking_output.csv`: Flattened, cleaned shipment tracking data
- `shipment_summary_statistics.csv`: Mean, Median, Mode of Days Taken and Delivery Attempts

All steps above include error handling and logging, and the code is organized in a single class.
