# Sentinel — AI-Powered Premarket Forecasting System 
### Project Overview & Architecture  
**Author:** Brandon Theard  
**Date:** December 6, 2025

## 1. What Is Sentinel?

**Sentinel** is an AI-powered premarket forecasting engine designed to predict whether a stock will **continue** or **fade** after the opening bell.

The system combines:

- Historical **options flow data** (TradyFlow dataset)
- Live **premarket data** (Polygon API)
- Engineered volatility, flow, and momentum features
- **Machine learning models** (LightGBM)
- **Excel trade journaling** with SHAP-based explanations
- A modern **Streamlit dashboard**, deployed on **Azure**

Sentinel bridges *research, machine learning, and real-time analytics* into one cohesive platform.

## 2. Why Build Sentinel?

Premarket price action often carries early signals about how a stock will trade after the open, but most traders read it by intuition rather than data.

Sentinel aims to:

- Quantify premarket strength and weakness
- Estimate the probability of continuation vs. fade
- Log predictions, confidence, and features in an Excel trade journal
- Provide a trader-friendly dashboard for daily forecasting
- Demonstrate cloud deployment skills through an Azure architecture

This project also serves as one of the **Big 4 Portfolio Projects**, showing expertise in:
- Machine learning
- Time-series forecasting
- Feature engineering
- Explainability (SHAP)
- Full-stack data product development
- Azure deployment

## 3. High-Level Architecture

      TradyFlow Dataset (Historical)
                 │
                 ▼
       Feature Engineering (Offline)
                 │
                 ▼
          ML Model Training (LGBM)
                 │
                 ▼
     Inference Engine (Live Premarket)
                 │
                 ▼
    Excel Trade Journal + SHAP Insights
                 │
                 ▼
     Streamlit Dashboard (Azure Deploy)

## 4. Core System Components

### **1. Data Ingestion**
- TradyFlow dataset (historical options flow)
- Polygon API (live premarket data)
- The system guards against leakage: only data available before the prediction time is used.

---

### **2. Feature Engineering**
Sentinel computes the following features:
- Premarket gap %, volume surge
- Flow imbalance index
- Delta-weighted pressure
- Premarket strength score
- Volatility compression signals
- Market microstructure–inspired indicators
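
A few of the features above can be sketched in plain Python. The formulas here are assumptions made for illustration (the project has not yet fixed its definitions), and the function names and input numbers are hypothetical.

```python
from statistics import mean


def premarket_gap_pct(prev_close: float, premarket_last: float) -> float:
    """Percent gap between the prior close and the latest premarket price."""
    if prev_close <= 0:
        raise ValueError("prev_close must be positive")
    return (premarket_last - prev_close) / prev_close * 100.0


def volume_surge(premarket_volume: float, trailing_volumes: list[float]) -> float:
    """Today's premarket volume relative to its trailing average (1.0 = typical)."""
    baseline = mean(trailing_volumes)
    if baseline <= 0:
        raise ValueError("trailing average volume must be positive")
    return premarket_volume / baseline


def flow_imbalance(call_premium: float, put_premium: float) -> float:
    """Signed call/put imbalance in [-1, 1]; 0.0 when there is no flow."""
    total = call_premium + put_premium
    if total == 0:
        return 0.0
    return (call_premium - put_premium) / total


# Made-up inputs, only to show the shape of a feature row.
features = {
    "gap_pct": premarket_gap_pct(prev_close=100.0, premarket_last=103.0),
    "volume_surge": volume_surge(300_000, [100_000, 150_000, 200_000]),
    "flow_imbalance": flow_imbalance(call_premium=750_000, put_premium=250_000),
}
```

Keeping each feature as a small pure function makes it easier to reuse the exact same code in offline training and live inference.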

---

### **3. Machine Learning Model**
- LightGBM classifier
- Walk-forward validation (time-based)
- Regularization to prevent overfitting
- Probability calibration
- SHAP explainability
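
One way to judge whether probability calibration is needed, or has worked, is to compare predicted probabilities with observed outcomes. The sketch below computes a Brier score and a simple reliability table; the function names and sample numbers are hypothetical and this is a diagnostic, not the calibration step itself.

```python
def brier_score(probs: list[float], outcomes: list[int]) -> float:
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def reliability_table(probs: list[float], outcomes: list[int], n_bins: int = 5):
    """Return (mean predicted prob, observed frequency, count) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if not members:
            continue
        mean_p = sum(p for p, _ in members) / len(members)
        freq = sum(y for _, y in members) / len(members)
        table.append((mean_p, freq, len(members)))
    return table


# Made-up predictions: 1 = continued, 0 = faded.
probs = [0.1, 0.2, 0.8, 0.9]
outcomes = [0, 0, 1, 1]
score = brier_score(probs, outcomes)
table = reliability_table(probs, outcomes, n_bins=2)
```

In a well-calibrated model the first two columns of each row are close to each other; large gaps suggest a calibration step is worth adding.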

---

### **4. Live Inference Engine**
- Builds the same features used in training from live data
- Produces predictions + confidence scores
- Generates trade suggestions (Long / Short / No Trade)
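
The mapping from a continuation probability to a trade suggestion could look like the sketch below. It assumes that "continuation" means the stock keeps moving in the direction of its premarket gap; the `suggest_trade` name and the 0.60 threshold are hypothetical, not values taken from the project.

```python
def suggest_trade(p_continue: float, gap_pct: float, threshold: float = 0.60) -> str:
    """Map P(continuation) and the gap direction to Long / Short / No Trade."""
    if not 0.0 <= p_continue <= 1.0:
        raise ValueError("p_continue must be a probability in [0, 1]")
    if gap_pct == 0:
        return "No Trade"  # no premarket direction to continue or fade

    gap_up = gap_pct > 0
    if p_continue >= threshold:
        # Confident continuation: trade with the gap.
        return "Long" if gap_up else "Short"
    if p_continue <= 1.0 - threshold:
        # Confident fade: trade against the gap.
        return "Short" if gap_up else "Long"
    return "No Trade"  # the model is not confident either way


print(suggest_trade(0.72, gap_pct=2.5))
print(suggest_trade(0.50, gap_pct=2.5))
```

The symmetric band around 0.5 keeps the system out of trades when the model has no edge; its width would need tuning against journal results.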

---

### **5. Excel Trade Journal**
Each prediction includes:
- Ticker  
- Timestamp  
- Probability of continuation  
- Expected return estimate  
- SHAP feature breakdown  
- LLM commentary (optional)
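
A journal row with these fields might be modeled as a small dataclass. This is a sketch only: the `JournalEntry` class, its field names, and the sample values are hypothetical, and the SHAP values are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Optional


@dataclass
class JournalEntry:
    ticker: str
    timestamp: datetime
    p_continue: float
    expected_return_pct: float
    shap_values: dict[str, float] = field(default_factory=dict)
    commentary: Optional[str] = None  # optional LLM commentary

    def top_drivers(self, k: int = 3) -> str:
        """Summarize the k features with the largest absolute SHAP value."""
        ranked = sorted(self.shap_values.items(), key=lambda kv: abs(kv[1]), reverse=True)
        return "; ".join(f"{name}={value:+.3f}" for name, value in ranked[:k])

    def to_row(self) -> list:
        """Flatten the entry into one spreadsheet row."""
        return [
            self.ticker,
            self.timestamp.isoformat(timespec="seconds"),
            round(self.p_continue, 4),
            round(self.expected_return_pct, 2),
            self.top_drivers(),
            self.commentary or "",
        ]


# Made-up values for illustration.
entry = JournalEntry(
    ticker="AAPL",
    timestamp=datetime(2025, 12, 5, 9, 25),
    p_continue=0.68,
    expected_return_pct=0.45,
    shap_values={"gap_pct": 0.12, "flow_imbalance": -0.20, "volume_surge": 0.05},
)
row = entry.to_row()
```

A row in this form could then be written to the workbook, for example with openpyxl's `Worksheet.append`, assuming openpyxl is the Excel library chosen.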

---

### **6. Dashboard**
A modern Streamlit dashboard displaying:
- Today's signals
- Journal metrics
- SHAP explanations
- Feature insights
- Weekly performance

---

### **7. Azure Deployment**
- Azure App Service (dashboard)
- Azure Functions (scheduled inference)
- Azure Blob Storage (journal storage)
- Azure Key Vault (secrets)

## 5. Machine Learning Concepts Used in Sentinel

Sentinel is built around the following ML practices:

### **1. Data Leakage Prevention**
- Only features available before the 9:30 AM ET market open are used.
- No candles or indicators computed from data after the open.
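
The cutoff rule can be enforced with a simple filter applied before any feature is computed. The sketch below assumes bar timestamps are naive datetimes already expressed in exchange-local (US Eastern) time and that all bars belong to the current session; the `premarket_only` helper and the bar layout are hypothetical.

```python
from datetime import datetime, time

MARKET_OPEN = time(9, 30)  # assumption: timestamps are exchange-local time


def premarket_only(bars: list[dict]) -> list[dict]:
    """Keep only bars stamped strictly before the market open."""
    return [bar for bar in bars if bar["timestamp"].time() < MARKET_OPEN]


# Made-up bars for a single session.
bars = [
    {"timestamp": datetime(2025, 12, 5, 8, 0), "close": 101.2},
    {"timestamp": datetime(2025, 12, 5, 9, 29), "close": 102.0},
    {"timestamp": datetime(2025, 12, 5, 9, 30), "close": 102.4},   # at the open: excluded
    {"timestamp": datetime(2025, 12, 5, 10, 15), "close": 103.1},  # after the open: excluded
]
usable = premarket_only(bars)
```

Running the same filter in training and in live inference is what keeps the two feature sets comparable.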

### **2. Walk-Forward Validation**
- Train on past data, then test on strictly later data.
- Essential for market time series, where a shuffled split would let future information leak into training.
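
A minimal sketch of the splitting logic, using an expanding training window. It assumes the samples are already sorted by time; the `walk_forward_splits` name and the fold sizes are hypothetical choices.

```python
def walk_forward_splits(n_samples: int, n_folds: int, min_train: int):
    """Yield (train_indices, test_indices) where every test block follows its training data."""
    if n_folds <= 0 or min_train <= 0:
        raise ValueError("n_folds and min_train must be positive")
    test_size = (n_samples - min_train) // n_folds
    if test_size <= 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        train_end = min_train + fold * test_size
        test_end = train_end + test_size
        # Training always starts at index 0 and grows with each fold.
        yield range(0, train_end), range(train_end, test_end)


for train_idx, test_idx in walk_forward_splits(n_samples=10, n_folds=3, min_train=4):
    print(list(train_idx), "->", list(test_idx))
```

scikit-learn's `TimeSeriesSplit` provides similar behavior if that library is already a dependency.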

### **3. Feature Engineering Principles**
- Normalize features using percentiles/z-scores.
- Focus on domain-driven features (flow, vol, momentum).
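
Both normalizations can be computed against a trailing history so that no future value influences today's feature. The helper names below are hypothetical, and the choice of population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def trailing_zscore(history: list[float], value: float) -> float:
    """Z-score of value relative to past observations only."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0  # a flat history carries no scale information
    return (value - mu) / sigma


def percentile_rank(history: list[float], value: float) -> float:
    """Fraction of past observations less than or equal to value."""
    if not history:
        raise ValueError("history must not be empty")
    return sum(1 for x in history if x <= value) / len(history)


# Made-up trailing values of one feature.
history = [1.0, 2.0, 3.0, 4.0, 5.0]
z = trailing_zscore(history, 5.0)
rank = percentile_rank(history, 4.0)
```

Percentile ranks are less sensitive to outliers than z-scores, which may matter for heavy-tailed inputs such as volume.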

### **4. Overfitting Prevention**
- Regularization (L1/L2)
- Early stopping
- Feature selection based on SHAP and domain logic
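
For LightGBM, the first two items map onto training parameters. The dictionary below is a starting-point sketch: the parameter names are LightGBM's own, but every value is an illustrative assumption rather than a setting taken from the project.

```python
# Illustrative values only; each would need tuning under walk-forward validation.
lgbm_params = {
    "objective": "binary",        # continue vs. fade
    "metric": "binary_logloss",
    "learning_rate": 0.05,
    "num_leaves": 15,             # small trees limit model complexity
    "min_data_in_leaf": 50,       # avoid leaves fit to a handful of days
    "lambda_l1": 0.1,             # L1 regularization
    "lambda_l2": 1.0,             # L2 regularization
    "feature_fraction": 0.8,      # column subsampling per tree
    "bagging_fraction": 0.8,      # row subsampling
    "bagging_freq": 1,
}

# Stop adding trees once the validation metric stalls for this many rounds.
EARLY_STOPPING_ROUNDS = 50
```

These would typically be passed to `lightgbm.train` together with a validation set and the `lightgbm.early_stopping(stopping_rounds=EARLY_STOPPING_ROUNDS)` callback.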

### **5. SHAP Interpretability**
- Explains *why* the model predicted continuation vs fade.
- Supports Excel journal entries and dashboards.

### **6. Drift Monitoring**
- Detect changes in market regime.
- Trigger model updates if performance decays.
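
One simple way to detect performance decay is a rolling hit-rate check over recent predictions. The sketch below is one possible approach, not the project's chosen method; the `HitRateMonitor` class, window size, and floor are hypothetical.

```python
from collections import deque
from typing import Optional


class HitRateMonitor:
    """Track rolling accuracy and flag when it falls below a floor."""

    def __init__(self, window: int = 50, floor: float = 0.52):
        self.floor = floor
        self.results = deque(maxlen=window)  # oldest results drop off automatically

    def record(self, predicted: int, actual: int) -> None:
        self.results.append(predicted == actual)

    def hit_rate(self) -> Optional[float]:
        if not self.results:
            return None
        return sum(self.results) / len(self.results)

    def needs_retrain(self) -> bool:
        # Only judge once the window is full, to avoid reacting to a few bad days.
        window_full = len(self.results) == self.results.maxlen
        return window_full and self.hit_rate() < self.floor


# Made-up outcomes: 1 = continued, 0 = faded.
monitor = HitRateMonitor(window=4, floor=0.5)
for predicted, actual in [(1, 1), (1, 0), (0, 1), (1, 0)]:
    monitor.record(predicted, actual)
print(monitor.hit_rate(), monitor.needs_retrain())
```

A check on the input feature distributions could complement this, since accuracy only reveals drift after outcomes are known.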

## 6. How Sentinel Fits Into My Portfolio

Sentinel represents the **Azure / Machine Learning / Real-Time Forecasting** project in the Big 4 portfolio:

1. **VAE (GCP)**  
   - Research engine, volatility modeling, RL integration.

2. **Sentinel (Azure)**  
   - Premarket predictions, feature engineering, ML, dashboard, cloud deployment.

3. **Market NLP Intelligence Suite (AWS)**  
   - Language modeling, earnings sentiment, real-time news analysis.

4. **RL Options Trading System**  
   - Reinforcement learning environment + policy agent demonstration.

Sentinel showcases:
- Clean ML pipeline design  
- Strong feature engineering  
- Production-style architecture  
- Cloud deployment skills  
- Explainability and analytics  

## 7. What Comes Next

With Notebook 00 complete, the next steps are:

### **1. Notebook 01 — TradyFlow EDA**
- Load raw dataset
- Explore key columns (delta, gamma, volume, etc.)
- Save cleaned version to `data/processed/`

### **2. Begin Feature Engineering (Notebook 02)**
- Create premarket + flow-based features
- Normalize and validate distributions

### **3. Early Modeling Prototype**
- LightGBM model
- Walk-forward validation pipeline

### **4. Set Up Journal Writer + Dashboard Skeleton**
- Append predictions to Excel
- Basic UI to display predictions

Sentinel now has a fully defined architecture and direction.