---

  Report

  1. Executive Summary

  This report details the methodology and findings for the Kpler Destination Forecast task. The objective is to analyze maritime activity datasets (vessels, port_calls, trades) to uncover underlying patterns and build a 
  prototype model to predict a vessel's next destination.

  My approach was a multi-stage, iterative process executed through a series of structured Jupyter notebooks. Key stages included:
   1. In-depth Exploratory Data Analysis (EDA): To understand data quality, identify key business patterns (e.g., "one-load, many-discharges"), and quantify the impact of real-world events (the Red Sea crisis).
   2. Baseline Modeling: To establish a quantitative performance benchmark using simple, feature-less models (Global Most-Frequent & Markov).
   3. Feature-based Ranking Model: To develop a robust prediction model using a "Candidate Generation + Ranking" framework, powered by extensive feature engineering and a Gradient Boosting Decision Tree (GBDT) model.

  A core finding from the EDA is the inherent complexity of the problem: simple historical frequency is a poor predictor (hits@1 of only ~3-4%), and a notable share of cargo-moving events are not classified as 
  "trades." Both findings justify a feature-driven modeling approach. The final prototype is a structured and extensible solution built on these findings.

  ---

  2. Studying the Dataset: Key Insights

  My analysis revealed clear patterns and noteworthy "anomalies" in the data, which formed the foundation for building an effective model.

  ##### 2.1. Patterns in Port Visits, Vessel Types, and Cargo

   * Geographical Hotspots: The interactive folium map (ports_map.html) and hexbin plots show that maritime activity is concentrated in a few key regions: East Asia (Singapore, Ulsan, Zhoushan), Northwest Europe 
     (Rotterdam, Antwerp), and the US Gulf Coast (Houston). By number of calls, Singapore is the busiest port in the dataset.
   * Vessel & Cargo Specialization: Heatmaps showed a strong association between a vessel's type/size and the cargo it carries. For instance, my analysis of dominant products found that the largest crude carriers, VLCCs and ULCCs 
     (dwt_bucket: 200k+), almost exclusively transport crude oil/condensate, while smaller tankers are more versatile. This suggests that vessel characteristics are highly predictive features.
   * Product & Voyage Specialization: Cargo type also dictates voyage distance. Voyages carrying crude oil/condensate have the longest average distance, whereas clean/dirty petroleum products travel significantly shorter routes (roughly 
     half the distance of crude). This reflects the industry pattern of long-haul crude transportation and regional distribution of refined products.

  ##### 2.2. Data Abnormalities & Key Specificities

  The "anomalies" in the data were crucial for understanding the complexity of the business logic.

   * The "One-to-Many" Problem: My analysis of the trades data showed that while a simple "one-load, one-discharge" pattern is the most common (122,392 instances), a substantial number of loading events result in multiple 
     discharges (e.g., 29,243 instances with 2 destinations, 7,577 with 3, and so on). A simple one-to-one mapping therefore cannot be assumed, and a rule (such as "max volume") is needed to define 
     a single primary label for modeling.
   * The `trades` Coverage Gap: My analysis shows that 6.11% of all port call events are not referenced in the trades data. More surprisingly, 65.41% of these "orphan" events had cargo_volume > 0. In other words, 
     moving cargo is not sufficient for an event to be defined as a "trade" by Kpler. This suggests that trades.csv is a refined analytical product, likely filtered by business rules (e.g., volume thresholds, cargo types), 
     which justifies my two-tier labeling strategy (prefer trades, but have a fallback).
   * Concept Drift (The Red Sea Crisis): My monthly seasonality analysis clearly shows a significant dip in traded volume starting in November 2023, coinciding with the Houthi crisis. This real-world event introduces concept 
     drift, where underlying routing patterns change fundamentally. This insight directly informed my modeling strategy, leading to a strict, crisis-aware time split (Train: pre-Oct 20, Validation: post-Oct 20) and the creation 
     of an is_crisis_time feature to allow the model to learn the different behavioral regimes.
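
  The "max volume" labeling rule and the time-based split described above can be sketched as follows. This is a minimal illustration, not the notebook code: the column names (origin_port_call_id, destination_port, volume, departure_time) are assumptions about the schema, and the cutoff year is inferred, since the report only states "Oct 20".

```python
import pandas as pd

def primary_destination(trades: pd.DataFrame) -> pd.DataFrame:
    """For each loading event, keep the discharge row with the largest volume."""
    # idxmax returns the index label of the max-volume row within each group.
    idx = trades.groupby("origin_port_call_id")["volume"].idxmax()
    return trades.loc[idx].reset_index(drop=True)

def time_split(df: pd.DataFrame, split_date: pd.Timestamp,
               time_col: str = "departure_time"):
    """Train on rows strictly before split_date, validate on the rest."""
    before = df[time_col] < split_date
    return df[before], df[~before]

# Toy data; column names are assumptions, not the real Kpler schema.
trades = pd.DataFrame({
    "origin_port_call_id": [1, 1, 2, 3, 3, 3],
    "destination_port": ["Rotterdam", "Antwerp", "Houston",
                         "Singapore", "Ulsan", "Zhoushan"],
    "volume": [300, 700, 500, 100, 50, 400],
    "departure_time": pd.to_datetime([
        "2023-09-01", "2023-09-01", "2023-10-25",
        "2023-12-02", "2023-12-02", "2023-12-02",
    ]),
})

# One row per loading event: the destination that received the most cargo.
labels = primary_destination(trades)

# Assumed cutoff: the report gives "Oct 20"; the year 2023 is inferred here.
train, valid = time_split(labels, pd.Timestamp("2023-10-20"))
print(labels[["origin_port_call_id", "destination_port"]])
```

  Splitting on time rather than at random keeps post-crisis voyages out of the training set, so validation measures how well pre-crisis patterns transfer to the new regime.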

  ---

  3. Prototype Model: A Well-Reasoned Approach

  Based on the insights above, I designed and implemented a modeling approach centered on rigor and interpretability.

  ##### 3.1. Problem Framing: Task A vs. Task B

  I first decomposed the ambiguous "predict next destination" problem into two distinct tasks:
   * Task A (Primary Focus): Predict the "Very Next Destination," strictly corresponding to the prompt. The label is the next chronological port call, regardless of its nature (e.g., STS, bunkering).
   * Task B (Higher Business Value): Predict the "Final Commercial Destination," which involves filtering out intermediate waypoints.

  My prototype development focuses on Task A to directly meet the assessment's requirements, while the architecture for Task B labels was also developed to demonstrate a deeper business understanding.
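
  A Task A label can be derived by ordering each vessel's port calls in time and shifting by one. The sketch below assumes hypothetical column names (vessel_id, port_name, start_utc); the real port_calls schema may differ.

```python
import pandas as pd

def add_next_destination(port_calls: pd.DataFrame) -> pd.DataFrame:
    """Attach the next chronological port call of the same vessel as the label."""
    ordered = port_calls.sort_values(["vessel_id", "start_utc"]).copy()
    # shift(-1) within each vessel pulls the following call's port onto this row.
    ordered["next_port"] = ordered.groupby("vessel_id")["port_name"].shift(-1)
    # A vessel's last observed call has no known successor, so it gets no label.
    return ordered.dropna(subset=["next_port"]).reset_index(drop=True)

# Toy data, deliberately unsorted; column names are assumptions.
port_calls = pd.DataFrame({
    "vessel_id": [2, 1, 1, 2, 1],
    "port_name": ["Zhoushan", "Rotterdam", "Houston", "Ulsan", "Singapore"],
    "start_utc": pd.to_datetime([
        "2023-02-10", "2023-01-20", "2023-01-01", "2023-01-15", "2023-03-01",
    ]),
})

labeled = add_next_destination(port_calls)
print(labeled[["vessel_id", "port_name", "next_port"]])
```

  A Task B label would apply the same shift after first filtering out intermediate waypoints, which is why the two tasks can share most of the pipeline.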

  ##### 3.2. Modeling Framework: "Candidate Generation + Ranking"

  Instead of a naive multi-class classification approach, I chose a more advanced and scalable two-stage framework:
   1. Candidate Generation: For any given departure, I generate a small, relevant list of ~20 potential candidates using a hybrid of historical frequency (Markov) and geographical proximity (BallTree).
   2. Ranking: I then use a machine learning model to score each candidate. The problem is thus transformed from "picking one from 2,000+" to a much more effective binary classification/ranking task.
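
  The two stages can be illustrated with a small, dependency-free sketch. Note the substitution: a brute-force haversine scan stands in for the BallTree lookup, which is only practical for a toy port list. The function names, the candidate counts, and the approximate coordinates are all illustrative assumptions.

```python
from collections import Counter, defaultdict
from math import asin, cos, radians, sin, sqrt

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))

def build_transitions(voyages):
    """First-order Markov counts: origin -> Counter of observed destinations."""
    counts = defaultdict(Counter)
    for origin, dest in voyages:
        counts[origin][dest] += 1
    return counts

def generate_candidates(origin, transitions, coords, n_hist=10, n_geo=10):
    """Most frequent historical destinations first, then the nearest other ports."""
    hist = [p for p, _ in transitions.get(origin, Counter()).most_common(n_hist)]
    others = [p for p in coords if p != origin and p not in hist]
    others.sort(key=lambda p: haversine_km(coords[origin], coords[p]))
    return hist + others[:n_geo]

def to_ranking_rows(origin, true_dest, candidates):
    """One binary-labeled row per candidate, ready for a ranking model."""
    return [(origin, c, int(c == true_dest)) for c in candidates]

# Approximate, illustrative coordinates (lat, lon in degrees).
coords = {
    "Singapore": (1.26, 103.82), "Rotterdam": (51.95, 4.14),
    "Antwerp": (51.26, 4.40), "Houston": (29.73, -95.27),
    "Ulsan": (35.50, 129.38), "Zhoushan": (30.00, 122.10),
}
voyages = [("Rotterdam", "Houston"), ("Rotterdam", "Houston"),
           ("Rotterdam", "Singapore"), ("Ulsan", "Zhoushan")]

transitions = build_transitions(voyages)
candidates = generate_candidates("Rotterdam", transitions, coords, n_hist=2, n_geo=1)
rows = to_ranking_rows("Rotterdam", "Singapore", candidates)
print(candidates)
print(rows)
```

  At the scale of 2,000+ ports, the sorted scan would be replaced by a spatial index. scikit-learn's BallTree with metric="haversine" expects [latitude, longitude] pairs in radians and returns distances in radians, so results need to be multiplied by the Earth's radius to obtain kilometres.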

  ##### 3.3. Iterative Modeling & Feature Engineering

  My modeling process was iterative:
   1. Baselines (`02_Baselines`): I established that simple, feature-less models perform poorly (hits@1 ≈ 3-4%), which quantifies the problem's difficulty.
   2. Feature-Driven Rankers (`03_LogReg`, `04_GBDT`): I engineered a rich feature set across four categories: Historical, Geospatial, Behavioral, and Temporal. Highlights include a normalized speed feature (speed_z_by_type), 
      cyclical time features (month_sin/cos), and the contextual is_crisis_time flag. By using these features with a GBDT model, performance is expected to significantly improve over the baselines.
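
  The named features and the hits@k metric can be sketched as below. This is one plausible construction rather than the notebook implementation: the input columns (vessel_type, speed_knots, departure_time) and the crisis start date passed in the example are assumptions.

```python
import numpy as np
import pandas as pd

def add_features(df: pd.DataFrame, crisis_start: pd.Timestamp) -> pd.DataFrame:
    out = df.copy()
    # Cyclical encoding places December next to January on the unit circle.
    month = out["departure_time"].dt.month
    out["month_sin"] = np.sin(2 * np.pi * month / 12)
    out["month_cos"] = np.cos(2 * np.pi * month / 12)
    # Z-score of speed within each vessel type, so speeds of large and small
    # vessels become comparable.
    grp = out.groupby("vessel_type")["speed_knots"]
    std = grp.transform("std").replace(0, np.nan)
    out["speed_z_by_type"] = ((out["speed_knots"] - grp.transform("mean")) / std).fillna(0.0)
    # Regime flag: 1 for departures on or after the assumed crisis start.
    out["is_crisis_time"] = (out["departure_time"] >= crisis_start).astype(int)
    return out

def hits_at_k(ranked_lists, truths, k):
    """Share of cases where the true destination is among the top-k candidates."""
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_lists, truths))
    return hits / len(truths)

# Toy data; column names and values are assumptions.
departures = pd.DataFrame({
    "vessel_type": ["VLCC", "VLCC", "MR", "MR"],
    "speed_knots": [12.0, 14.0, 11.0, 13.0],
    "departure_time": pd.to_datetime(
        ["2023-03-15", "2023-12-01", "2023-06-10", "2023-09-05"]
    ),
})
# Assumed crisis start, based on the November 2023 dip noted in the EDA.
feats = add_features(departures, crisis_start=pd.Timestamp("2023-11-01"))

ranked = [["A", "B"], ["C", "A"], ["B", "C"]]
truths = ["A", "A", "C"]
print(feats[["month_sin", "month_cos", "speed_z_by_type", "is_crisis_time"]])
print(hits_at_k(ranked, truths, 1), hits_at_k(ranked, truths, 2))
```

  Because hits@k is computed over ranked candidate lists, the same function can score the baselines and the GBDT ranker, which keeps the comparison like-for-like.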

  ---

  4. Productionization Discussion (MLOps)

  For deploying this model, my proposed solution centers on an automated, monitorable, and iterative ML pipeline. Key components would include a feature store for online/offline consistency, automated model retraining and 
  versioning (e.g., with MLflow), a scalable API for inference, and robust monitoring for data and model drift. A critical component would be a human-in-the-loop feedback system where analyst corrections are fed back to 
  continuously improve the model.
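
  As one possible building block for the drift monitoring mentioned above, the sketch below computes a Population Stability Index (PSI) over a categorical distribution such as predicted or observed destinations. PSI is chosen here only for illustration; the report does not name a specific drift metric, and any alerting threshold would need to be calibrated on real data.

```python
from collections import Counter
from math import log

def psi(expected, actual, eps=1e-6):
    """Population Stability Index between two samples of categorical values."""
    e_counts, a_counts = Counter(expected), Counter(actual)
    total = 0.0
    for category in set(e_counts) | set(a_counts):
        # eps avoids log(0) when a category is missing from one sample.
        e = max(e_counts[category] / len(expected), eps)
        a = max(a_counts[category] / len(actual), eps)
        total += (a - e) * log(a / e)
    return total

# Toy destination samples: a reference window and a later, shifted window.
baseline = ["Singapore"] * 50 + ["Rotterdam"] * 30 + ["Houston"] * 20
shifted = ["Singapore"] * 30 + ["Rotterdam"] * 20 + ["Houston"] * 50

print(psi(baseline, baseline))  # identical distributions give 0.0
print(psi(baseline, shifted))   # a positive value signals a shift
```

  Every term of the sum is non-negative, so the index is zero only when the two distributions match and grows as they diverge. A scheduled job could compute it per feature and per prediction window and trigger retraining when it exceeds a chosen threshold.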