# NIFTY Market Data Engine — API Test Suite
**Version:** 3.0  
**Author:** Data Pipeline Team  
**Purpose:** Validate all query methods of `NiftyMarketData` and demonstrate correct usage patterns for the Pricing, Hedging, and Volatility Surface teams.

---

## Table of Contents
1. [Environment Setup](#1-environment-setup)
2. [Engine Initialisation](#2-engine-initialisation)
3. [Discovery Queries](#3-discovery-queries)
4. [Basic Option Chain Query](#4-basic-option-chain-query)
5. [Strike Filtering](#5-strike-filtering)
6. [Option Type Filtering (Calls / Puts)](#6-option-type-filtering)
7. [Intraday Time Window Query](#7-intraday-time-window-query)
8. [Liquidity Filtering](#8-liquidity-filtering)
9. [Combined Query (All Filters Together)](#9-combined-query)
10. [ATM Strike Grid Generation](#10-atm-strike-grid-generation)
11. [Spot Price Merge Validation](#11-spot-price-merge-validation)
12. [Volatility Surface Snapshot](#12-volatility-surface-snapshot)
13. [Time Series Query (Multi-Day)](#13-time-series-query-multi-day)
14. [Error Handling Demonstrations](#14-error-handling-demonstrations)
15. [Performance Benchmarks](#15-performance-benchmarks)
16. [Cache Management](#16-cache-management)

---
## 1. Environment Setup

In [None]:
import sys
import os
import time
import pandas as pd

print("Python executable:", sys.executable)
print("Working directory:", os.getcwd())
print("Pandas version   :", pd.__version__)

---
## 2. Engine Initialisation
Set `BASE_DIR` to the root of the shared Google Drive dataset folder. All other paths are resolved automatically.

In [None]:
from api.marketdatav3 import NiftyMarketData

# ─── Set your local path to the shared dataset root ───────────────────────────
BASE_DIR = r"G:\.shortcut-targets-by-id\1f6XlJFCOVmETxGoJjD4O9WSmhB38IsQy\FinanceProject_LogicLabs\NiftyHistorical2024\raw_kaggle"
# ──────────────────────────────────────────────────────────────────────────────

md = NiftyMarketData(base_dir=BASE_DIR)

---
## 3. Discovery Queries
Use these before any data query to confirm that the expiry, trade date, and strikes you want actually exist in the dataset.

### 3a. List available expiries for a trade date

In [None]:
expiries = md.list_expiries(trade_date="01JAN24")

print(f"Expiries available on 01 Jan 2024: {len(expiries)} found")
for e in expiries:
    print(" ", e)

### 3b. List available trading days for a month

In [None]:
trading_days = md.list_trading_days(year=2024, month="JAN")

print(f"Trading days in January 2024: {len(trading_days)} found")
print(trading_days)

### 3c. List available strikes for a specific expiry

In [None]:
strikes_available = md.list_strikes(expiry="01FEB24", trade_date="01JAN24")

print(f"Total strikes available: {len(strikes_available)}")
print("First 15 strikes:", strikes_available[:15])
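
The discovery results can feed a pre-flight check before any data query. The sketch below is only an illustration: `missing_items` is a hypothetical helper (not part of `NiftyMarketData`), and it assumes the discovery methods return plain lists of expiry tokens and integer strikes, as the printed output above suggests. The demo lists are made-up stand-ins, not values from the dataset.

```python
from typing import Iterable, List, TypeVar

T = TypeVar("T")


def missing_items(wanted: Iterable[T], available: Iterable[T]) -> List[T]:
    """Return the wanted items that are absent from `available`, in input order."""
    available_set = set(available)
    return [item for item in wanted if item not in available_set]


# Stand-in values for illustration; in the notebook these would come from
# md.list_expiries(...) and md.list_strikes(...).
expiries_demo = ["04JAN24", "11JAN24", "01FEB24"]
strikes_demo = [21500, 21600, 21700]

missing_expiries = missing_items(["01FEB24"], expiries_demo)
missing_strikes = missing_items([21600, 21650], strikes_demo)
print("Missing expiries:", missing_expiries)
print("Missing strikes :", missing_strikes)
```

An empty result for both lists means the query inputs exist; anything else can be reported to the user before a data file is opened.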

---
## 4. Basic Option Chain Query
Fetch the complete option chain for one expiry on one trade date. No filters applied — all strikes, both Calls and Puts, all intraday timestamps.

In [None]:
df = md.query_options(
    expiry="01FEB24",
    trade_date="01JAN24"
)

print(f"Rows returned : {len(df):,}")
print(f"Columns       : {df.columns.tolist()}")
print(f"Time range    : {df['timestamp'].min()} → {df['timestamp'].max()}")
print(f"Strikes range : {df['strike'].min()} → {df['strike'].max()}")
df.head()

---
## 5. Strike Filtering
Request data for a specific list of strikes only.

In [None]:
df_strikes = md.query_options(
    expiry="01FEB24",
    trade_date="01JAN24",
    strikes=[21500, 21600, 21700, 21800, 21900]
)

print(f"Rows returned          : {len(df_strikes):,}")
print(f"Unique strikes returned: {sorted(df_strikes['strike'].unique())}")
df_strikes.head()

---
## 6. Option Type Filtering
Filter to Calls only, or Puts only.

### 6a. Calls only

In [None]:
df_calls = md.query_options(
    expiry="01FEB24",
    trade_date="01JAN24",
    option_type="C"
)

print(f"Rows returned         : {len(df_calls):,}")
print(f"Option types in result: {df_calls['option_type'].unique()}")
df_calls.head()

### 6b. Puts only

In [None]:
df_puts = md.query_options(
    expiry="01FEB24",
    trade_date="01JAN24",
    option_type="P"
)

print(f"Rows returned         : {len(df_puts):,}")
print(f"Option types in result: {df_puts['option_type'].unique()}")

---
## 7. Intraday Time Window Query
Restrict data to a specific intraday window. NIFTY trades 09:15–15:30 IST.

### 7a. One-hour window

In [None]:
df_window = md.query_options(
    expiry="01FEB24",
    trade_date="01JAN24",
    start="2024-01-01 09:30",
    end="2024-01-01 10:30"
)

print(f"Rows returned : {len(df_window):,}")
print(f"Earliest time : {df_window['timestamp'].min()}")
print(f"Latest time   : {df_window['timestamp'].max()}")
df_window.head()

### 7b. Single-minute snapshot

In [None]:
df_snapshot = md.query_options(
    expiry="01FEB24",
    trade_date="01JAN24",
    start="2024-01-01 10:00",
    end="2024-01-01 10:00"
)

print(f"Rows at exactly 10:00 AM: {len(df_snapshot)}")
print(f"Unique timestamps       : {df_snapshot['timestamp'].unique()}")
df_snapshot[["strike", "option_type", "market_price", "spot_price"]].head(10)
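
The single-minute snapshot only returns rows if both window bounds are inclusive. The sketch below shows that behaviour on plain Python data. It is an assumption about how the engine treats `start` and `end`, inferred from the example above rather than from the engine's source; `filter_window` and the demo rows are hypothetical.

```python
from datetime import datetime
from typing import Dict, List


def filter_window(rows: List[Dict], start: str, end: str) -> List[Dict]:
    """Keep rows whose 'timestamp' lies in [start, end], both ends inclusive."""
    fmt = "%Y-%m-%d %H:%M"
    lo = datetime.strptime(start, fmt)
    hi = datetime.strptime(end, fmt)
    return [row for row in rows if lo <= row["timestamp"] <= hi]


# Made-up rows for illustration only.
rows = [
    {"timestamp": datetime(2024, 1, 1, 9, 59), "strike": 21700},
    {"timestamp": datetime(2024, 1, 1, 10, 0), "strike": 21700},
    {"timestamp": datetime(2024, 1, 1, 10, 1), "strike": 21700},
]

# start == end selects exactly one minute when both bounds are inclusive.
snapshot = filter_window(rows, "2024-01-01 10:00", "2024-01-01 10:00")
print(len(snapshot), "row(s) in the 10:00 snapshot")
```

If the engine used a half-open window instead, the same call would return no rows, so the `Unique timestamps` printout above is a quick way to confirm the convention.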

---
## 8. Liquidity Filtering
Many rows in this dataset have zero traded volume. Apply `min_volume` to retain only actively traded contracts — essential for accurate implied volatility estimation.

In [None]:
# Unfiltered baseline
df_all     = md.query_options(expiry="01FEB24", trade_date="01JAN24")
zero_vol   = (df_all["volume"] == 0).sum()
print(f"Total rows           : {len(df_all):,}")
print(f"Rows with zero volume: {zero_vol:,}  ({100*zero_vol/len(df_all):.1f}%)")

In [None]:
# Apply liquidity filter
df_liquid = md.query_options(
    expiry="01FEB24",
    trade_date="01JAN24",
    min_volume=10
)

print(f"Rows after min_volume=10 filter: {len(df_liquid):,}")
print(f"Minimum volume in result       : {df_liquid['volume'].min()}")
df_liquid.head()

---
## 9. Combined Query
All filters applied simultaneously: specific strikes, Calls only, intraday window, liquidity threshold.

This is the recommended pattern for the Pricing Team when computing implied volatility.

In [None]:
# First, find the ATM strikes dynamically
atm, grid = md.get_atm_strikes("01FEB24", "01JAN24", n_strikes=3, step=100)
print(f"ATM: {atm}")
print(f"Selected strikes: {grid}")

In [None]:
df_combined = md.query_options(
    expiry="01FEB24",
    trade_date="01JAN24",
    strikes=grid,
    option_type="C",
    start="2024-01-01 10:00",
    end="2024-01-01 12:00",
    min_volume=5
)

print(f"Rows returned : {len(df_combined):,}")
df_combined[["timestamp", "strike", "option_type", "market_price", "volume", "spot_price"]].head(10)

---
## 10. ATM Strike Grid Generation
Generate a symmetric grid of strikes around the at-the-money level, derived automatically from the opening spot price.

In [None]:
# Default: 10 strikes each side of ATM, step=100
atm, grid = md.get_atm_strikes(
    expiry="01FEB24",
    trade_date="01JAN24",
    n_strikes=10,
    step=100
)

print(f"ATM Strike     : {atm}")
print(f"Grid size      : {len(grid)} strikes")
print(f"Strike range   : {min(grid)} → {max(grid)}")
print(f"Strike grid    : {grid}")

In [None]:
# Narrower grid — 5 strikes each side, step=50
atm_50, grid_50 = md.get_atm_strikes(
    expiry="01FEB24",
    trade_date="01JAN24",
    n_strikes=5,
    step=50
)

print(f"ATM: {atm_50}")
print(f"Grid (step=50): {grid_50}")
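
For intuition, a symmetric grid of this kind can be sketched in a few lines. This is not the engine's implementation: `atm_grid` is a hypothetical helper, the round-half-up rule is an assumption, and the spot value is made up. It shows why `n_strikes=10` would yield 21 strikes (10 below, ATM, 10 above).

```python
from typing import List, Tuple


def atm_grid(spot: float, n_strikes: int, step: int) -> Tuple[int, List[int]]:
    """Round spot to the nearest multiple of `step`, then build a symmetric grid."""
    # Assumed rounding rule: round half up to the nearest step.
    atm = int(spot / step + 0.5) * step
    # n_strikes below ATM, ATM itself, n_strikes above ATM.
    grid = [atm + i * step for i in range(-n_strikes, n_strikes + 1)]
    return atm, grid


# Made-up spot price for illustration.
atm_demo, grid_demo = atm_grid(spot=21727.75, n_strikes=3, step=100)
print("ATM :", atm_demo)
print("Grid:", grid_demo)
```

A smaller `step` can move the ATM strike itself, because the spot is rounded to a finer lattice; compare the `step=100` and `step=50` results above.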

---
## 11. Spot Price Merge Validation
Verify that spot prices are correctly merged and contain no null values.

In [None]:
df_check = md.query_options(expiry="01FEB24", trade_date="01JAN24")

null_spot = df_check["spot_price"].isna().sum()
print(f"Null spot_price values : {null_spot}  ← should be 0")
print(f"Spot price range       : {df_check['spot_price'].min()} → {df_check['spot_price'].max()}")

# Show spot tracking against timestamps
df_check[["timestamp", "strike", "market_price", "spot_price"]].head(10)

In [None]:
# Verify spot updates minute-by-minute
spot_by_time = (
    df_check
    .drop_duplicates("timestamp")[["timestamp", "spot_price"]]
    .sort_values("timestamp")
    .head(15)
)
print(spot_by_time.to_string(index=False))

---
## 12. Volatility Surface Snapshot
The primary deliverable for Team 2b. Builds a complete (expiry × strike) grid at a single timestamp — directly usable for implied volatility surface fitting.

In [None]:
surface = md.surface_snapshot(
    trade_date="01JAN24",
    timestamp="2024-01-01 10:00",
    n_expiries=6,
    n_strikes=10,
    step=100
)

print(f"Total rows in surface   : {len(surface)}")
print(f"Expiries included       : {sorted(surface['expiry_date'].unique())}")
print(f"Days-to-expiry range    : {surface['days_to_expiry'].min()} → {surface['days_to_expiry'].max()} days")
surface.head(10)
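
The `days_to_expiry` column can be cross-checked by hand from the date tokens. The sketch below assumes the column counts calendar days between trade date and expiry; the engine may use a different convention (for example trading days), so treat a mismatch as a prompt to check the convention rather than as a bug. `parse_token` and `calendar_days_to_expiry` are hypothetical helpers.

```python
from datetime import date

MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def parse_token(token: str) -> date:
    """Parse a DDMMMYY token such as '01JAN24' into a date (years assumed 20YY)."""
    day = int(token[:2])
    month = MONTHS[token[2:5].upper()]
    year = 2000 + int(token[5:7])
    return date(year, month, day)


def calendar_days_to_expiry(trade_date: str, expiry: str) -> int:
    """Calendar days from trade date to expiry (assumed convention)."""
    return (parse_token(expiry) - parse_token(trade_date)).days


print(calendar_days_to_expiry("01JAN24", "01FEB24"))
```

An explicit month table is used instead of `strptime("%b")` so the parse does not depend on the machine's locale.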

In [None]:
# Summary: rows per expiry
print(surface.groupby("expiry_date")[["strike","market_price"]]
      .agg(n_strikes=("strike","count"),
           strike_min=("strike","min"),
           strike_max=("strike","max"),
           avg_price=("market_price","mean"))
      .round(2)
      .to_string())

In [None]:
# Calls-only surface (for standard Black-Scholes IV fitting)
surface_calls = md.surface_snapshot(
    trade_date="01JAN24",
    timestamp="2024-01-01 10:00",
    n_expiries=6,
    n_strikes=10,
    option_type="C",
    min_volume=0
)

print(f"Call surface rows: {len(surface_calls)}")
print(f"Option types     : {surface_calls['option_type'].unique()}")

### How the Pricing Team Uses This Output

```python
# Extract key columns directly for Black-Scholes
S     = surface_calls["spot_price"]       # Spot price
K     = surface_calls["strike"]            # Strike
T     = surface_calls["days_to_expiry"] / 365.0   # Time to expiry (years)
C_mkt = surface_calls["market_price"]     # Market option price

# Then pass to your IV solver:
# sigma = implied_vol(S, K, T, r, C_mkt, option_type='C')
```
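
The `implied_vol` call in the comment above is a placeholder for whatever solver the Pricing Team uses. As one possible shape for it, here is a minimal scalar sketch that inverts the Black-Scholes call price by bisection. The function names, the search bounds, and the risk-free rate `r` are assumptions for illustration; no dividend yield is modelled.

```python
import math
from typing import Optional


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(S: float, K: float, T: float, r: float, sigma: float) -> float:
    """Black-Scholes price of a European call (no dividends)."""
    sqrt_t = math.sqrt(T)
    d1 = (math.log(S / K) + (r + 0.5 * sigma**2) * T) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    return S * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)


def implied_vol_call(S: float, K: float, T: float, r: float, C_mkt: float,
                     lo: float = 1e-4, hi: float = 5.0,
                     tol: float = 1e-8, max_iter: int = 200) -> Optional[float]:
    """Bisection on sigma; returns None if the price cannot be matched in [lo, hi]."""
    if T <= 0:
        return None
    if not bs_call_price(S, K, T, r, lo) <= C_mkt <= bs_call_price(S, K, T, r, hi):
        return None  # price violates no-arbitrage bounds or the search range
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        # The call price is increasing in sigma, so bisection is valid.
        if bs_call_price(S, K, T, r, mid) < C_mkt:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


# Round trip on made-up inputs: price at a known sigma, then recover it.
price = bs_call_price(S=21700.0, K=21700.0, T=31 / 365.0, r=0.07, sigma=0.15)
print(implied_vol_call(21700.0, 21700.0, 31 / 365.0, 0.07, price))
```

To apply it to `surface_calls`, the function could be called row by row on `S`, `K`, `T`, and `C_mkt`. Rows with stale or zero-volume prices are the most likely to return `None`, which is one reason to set `min_volume` above zero.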

---
## 13. Time Series Query (Multi-Day)
Track how a specific option's price evolves across multiple trading days.

In [None]:
# Get all trading days in January 2024
jan_days = md.list_trading_days(2024, "JAN")
print(f"January 2024 trading days ({len(jan_days)}): {jan_days}")

In [None]:
# Track the 21700 call (assumed to be near ATM) at 10:00 AM each day
df_ts = md.query_time_series(
    expiry="01FEB24",
    trade_dates=jan_days[:5],          # first 5 days
    strikes=[21700],
    option_type="C",
    snapshot_time="10:00"
)

print(f"Rows returned: {len(df_ts)}")
df_ts[["trade_date", "timestamp", "strike", "option_type",
       "market_price", "spot_price", "days_to_expiry"]]

---
## 14. Error Handling Demonstrations
The engine raises typed exceptions with clear, actionable messages. These examples show what happens when a query is malformed or the requested data is unavailable.

### 14a. Invalid option_type

In [None]:
from api.marketdatav3 import InvalidParameter, FileNotAvailable, NoDataReturned

try:
    md.query_options(expiry="01FEB24", trade_date="01JAN24", option_type="CE")
except InvalidParameter as e:
    print("Caught InvalidParameter:")
    print(e)

### 14b. Non-existent expiry

In [None]:
try:
    md.query_options(expiry="31DEC24", trade_date="01JAN24")
except FileNotAvailable as e:
    print("Caught FileNotAvailable:")
    print(e)

### 14c. Bad date format

In [None]:
try:
    md.query_options(expiry="01FEB24", trade_date="2024-01-01")
except InvalidParameter as e:
    print("Caught InvalidParameter:")
    print(e)

### 14d. Over-filtered query returns no data

In [None]:
# Very high min_volume — returns empty with a helpful message (no exception by default)
df_empty = md.query_options(
    expiry="01FEB24",
    trade_date="01JAN24",
    strikes=[21700],
    option_type="C",
    min_volume=999999
)
print(f"\nRows returned: {len(df_empty)}")

In [None]:
# With raise_if_empty=True — raises exception
try:
    md.query_options(
        expiry="01FEB24",
        trade_date="01JAN24",
        strikes=[21700],
        option_type="C",
        min_volume=999999,
        raise_if_empty=True
    )
except NoDataReturned as e:
    print("Caught NoDataReturned:")
    print(e)
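
Callers that build queries from user input may prefer to check the arguments before calling the engine. The sketch below mirrors the two parameter errors shown in 14a and 14c. It is an assumption about the accepted formats (`DDMMMYY` tokens, `"C"` or `"P"`), inferred from the examples in this notebook; `input_problems` is a hypothetical helper and does not replace the engine's own validation.

```python
import re
from typing import List, Optional

# Two digits, three uppercase letters, two digits, e.g. "01JAN24".
_DATE_TOKEN = re.compile(r"\d{2}[A-Z]{3}\d{2}")


def input_problems(expiry: str, trade_date: str,
                   option_type: Optional[str] = None) -> List[str]:
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    for name, value in (("expiry", expiry), ("trade_date", trade_date)):
        if not _DATE_TOKEN.fullmatch(value):
            problems.append(f"{name}={value!r} is not in DDMMMYY form (e.g. '01JAN24')")
    if option_type is not None and option_type not in ("C", "P"):
        problems.append(f"option_type={option_type!r} must be 'C' or 'P'")
    return problems


for problem in input_problems("01FEB24", "2024-01-01", option_type="CE"):
    print("-", problem)
```

Collecting every problem in one pass lets a caller report them all at once, instead of fixing one exception at a time.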

---
## 15. Performance Benchmarks

In [None]:
# Cold load (no cache)
md.clear_spot_cache()

# time.perf_counter() is a monotonic, high-resolution clock meant for timing intervals
t0 = time.perf_counter()
_ = md.query_options(expiry="01FEB24", trade_date="01JAN24")
t1 = time.perf_counter()
print(f"Cold load (spot not cached)  : {t1-t0:.3f}s")

# Warm load (spot is now cached)
# Note: this query also filters to one strike, so it is not a like-for-like comparison
t0 = time.perf_counter()
_ = md.query_options(expiry="01FEB24", trade_date="01JAN24", strikes=[21700])
t1 = time.perf_counter()
print(f"Warm load (spot cached)      : {t1-t0:.3f}s")

# Surface snapshot
t0 = time.perf_counter()
_ = md.surface_snapshot(trade_date="01JAN24", timestamp="2024-01-01 10:00", n_expiries=6)
t1 = time.perf_counter()
print(f"Surface snapshot (6 expiries): {t1-t0:.3f}s")
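
Single-shot timings like the ones above are sensitive to disk and OS noise. For warm-path measurements, a small repeat-and-take-the-best helper gives steadier numbers. The sketch below is a generic pattern, not part of the engine; `best_of` is a hypothetical name and the timed workload is a stand-in.

```python
import time
from typing import Any, Callable, Tuple


def best_of(fn: Callable[[], Any], repeats: int = 3) -> Tuple[float, Any]:
    """Run `fn` several times; return the fastest elapsed time and the last result."""
    best = float("inf")
    result = None
    for _ in range(repeats):
        t0 = time.perf_counter()
        result = fn()
        best = min(best, time.perf_counter() - t0)
    return best, result


# Stand-in workload; in the notebook this would be e.g.
# lambda: md.query_options(expiry="01FEB24", trade_date="01JAN24")
elapsed, value = best_of(lambda: sum(range(100_000)))
print(f"Best of 3: {elapsed:.6f}s")
```

Repeats are unsuitable for the cold-load case: after the first run the spot cache is warm, so a cold measurement needs `md.clear_spot_cache()` before every timed call.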

---
## 16. Cache Management

In [None]:
# Show what is currently cached
print("Cached months:", md.cache_status())

In [None]:
# Clear cache (useful for long-running sessions or after dataset update)
md.clear_spot_cache()
print("Cache after clearing:", md.cache_status())

---
## Summary: Full API Reference

| Method | Purpose |
|--------|---------|
| `query_options(expiry, trade_date, ...)` | Core data fetch with all filters |
| `list_expiries(trade_date)` | Discover available expiries on a date |
| `list_strikes(expiry, trade_date)` | Discover available strikes |
| `list_trading_days(year, month)` | Discover trading days in a month |
| `get_atm_strikes(expiry, trade_date, ...)` | Generate ATM-centered strike grid |
| `query_time_series(expiry, trade_dates, ...)` | Multi-day evolution query |
| `surface_snapshot(trade_date, timestamp, ...)` | Full vol-surface input grid |
| `clear_spot_cache()` | Free memory / reset cache |
| `cache_status()` | Inspect what is in memory |