# Moniker System Showcase

This notebook demonstrates the **Moniker** unified data access system:

- **5 source types**: Snowflake, Oracle, REST, Excel, MS-SQL — all accessed through the same API
- **Reflection API** (`CatalogReflector`): discover, search, and inspect catalog metadata without fetching data
- **Rich metadata**: data quality scores, schema introspection, freshness, SLAs, and documentation links
- **Pre-flight checks**: use metadata to understand data characteristics (such as time-series gaps) before querying

### Prerequisites

The Moniker service must be running on `localhost:8050`. Use `start_demo.sh` or:

```bash
cd ~/open-moniker-svc
PYTHONPATH=src:external/moniker-data/src python -m uvicorn moniker_svc.main:app --port 8050
```
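
Before running the cells below, it can help to confirm that something is listening on the service port. The following is a minimal sketch using only the standard library. `is_port_open` is a hypothetical helper, not part of the Moniker client, and it only checks TCP reachability, not that the service itself is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    # Port 8050 matches the uvicorn command above.
    if is_port_open("localhost", 8050):
        print("Service port is reachable")
    else:
        print("Nothing is listening on localhost:8050; start the service first")
```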

In [1]:
import sys, os

# Path setup — adjust if your repo layout differs
for p in [
    os.path.expanduser("~/open-moniker-svc/src"),
    os.path.expanduser("~/open-moniker-svc/external/moniker-data/src"),
    os.path.expanduser("~/open-moniker-client"),
]:
    if p not in sys.path:
        sys.path.insert(0, p)

from moniker_client import (
    Moniker, MonikerClient, ClientConfig,
    CatalogReflector,
    SearchResult, CatalogStats, SchemaInfo,
)

# Ensure all 5 mock adapters are initialized so client-side reads work
from moniker_data.adapters.oracle import MockOracleAdapter
from moniker_data.adapters.snowflake import MockSnowflakeAdapter
from moniker_data.adapters.rest import MockRestAdapter
from moniker_data.adapters.excel import MockExcelAdapter
from moniker_data.adapters.mssql import MockMssqlAdapter

MockOracleAdapter()
MockSnowflakeAdapter()
MockRestAdapter()
MockExcelAdapter()
MockMssqlAdapter()

# Create client pointing to the running service
client = MonikerClient(config=ClientConfig(service_url="http://localhost:8050"))

print("Client ready:", client.config.service_url)

Client ready: http://localhost:8050


---
## Section B: Catalog Discovery with CatalogReflector

The `CatalogReflector` provides a high-level facade for exploring the catalog — no data fetching required.
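This section calls the reflector's `stats()`, `sources()`, and `search()` methods. The domain listing and the status and tag filters go through the `/domains` endpoint and `client.search()` directly.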

In [2]:
# B1: Catalog Statistics — stats()
reflector = CatalogReflector(client=client)

stats = reflector.stats()

print(f"Total monikers: {stats.total_monikers}")
print(f"\nBy status:")
for status, count in stats.by_status.items():
    print(f"  {status:20s} {count}")
print(f"\nBy source type:")
for src, count in stats.by_source_type.items():
    print(f"  {src:20s} {count}")
coverage = stats.ownership_coverage
if isinstance(coverage, dict):
    print(f"\nOwnership coverage: {coverage.get('coverage_percent', 0):.0f}%  ({coverage.get('has_ownership', 0)} monikers with ownership)")
else:
    print(f"\nOwnership coverage: {coverage:.0%}")

Total monikers: 27

By status:
  active               27

By source type:
  snowflake            3
  rest                 4
  oracle               1
  static               1
  opensearch           1
  excel                1
  mssql                2

Ownership coverage: 37%  (10 monikers with ownership)
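
The figures in the output above can be cross-checked by hand. The sketch below copies the printed numbers into plain Python and recomputes the coverage percentage. It also counts how many monikers carry no source type, on the assumption (not confirmed by the output) that each moniker has at most one.

```python
# Figures copied from the stats() output above.
total_monikers = 27
has_ownership = 10
by_source_type = {
    "snowflake": 3, "rest": 4, "oracle": 1, "static": 1,
    "opensearch": 1, "excel": 1, "mssql": 2,
}

# Coverage is the share of monikers that have ownership defined.
coverage_percent = 100 * has_ownership / total_monikers

# Assumption: at most one source type per moniker, so the counts do not overlap.
bound = sum(by_source_type.values())
unbound = total_monikers - bound

print(f"Ownership coverage: {coverage_percent:.0f}%")
print(f"Monikers with a source type: {bound}")
print(f"Monikers without one:        {unbound}")
```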


In [None]:
# B2: Top-level Domains — /domains governance metadata
import httpx

with httpx.Client(timeout=30) as http:
    resp = http.get(f"{client.config.service_url}/domains")
    resp.raise_for_status()
    domains_data = resp.json()

domains = domains_data.get("domains", [])

print(f"{'Domain':<16} {'Owner':<34} {'Category':<12} {'Confidentiality':<18}")
print("-" * 82)
for d in domains:
    print(f"{d['name']:<16} {d.get('owner',''):<34} {d.get('data_category',''):<12} {d.get('confidentiality',''):<18}")

print(f"\nRegistered domains: {domains_data.get('count', len(domains))}")

In [None]:
# B3: Source Type Distribution — sources()
sources = reflector.sources()

print("Source type distribution:")
for src_type, count in sorted(sources.items(), key=lambda x: -x[1]):
    bar = '#' * count
    print(f"  {src_type:<15} {count:>3}  {bar}")

assert "mssql" in sources, "MS-SQL source type should be present!"
print(f"\nMS-SQL monikers in catalog: {sources['mssql']}")

In [None]:
# B4: Search with search(query)

# Find credit-related monikers
credit_results = reflector.search("credit")
print(f"Search 'credit': {credit_results.total_results} results")
for r in credit_results.results:
    has_src = r.get('has_source_binding', False)
    print(f"  {r['path']:<30} has_source={has_src!s:<6} tags={r.get('tags', [])}")

# Find MS-SQL backed monikers with a plain text search for "mssql"
# (this matches on text; no source_type filter is applied here)
print()
mssql_results = reflector.search("mssql")
print(f"Search 'mssql': {mssql_results.total_results} results")
for r in mssql_results.results:
    print(f"  {r['path']:<30} {r.get('display_name', '')}")

In [None]:
# B5: Filter by status and tags with client.search() and a client-side tag filter

# All active monikers
active_results = client.search("a", status="active", limit=200)
print(f"Active monikers (via search): {active_results.total_results}")
for a in active_results.results[:5]:
    print(f"  {a['path']}")
if active_results.total_results > 5:
    print(f"  ... and {active_results.total_results - 5} more")

# Deprecated monikers
print()
try:
    dep_results = client.search("deprecated", status="deprecated", limit=200)
    print(f"Deprecated monikers: {dep_results.total_results}")
    for d in dep_results.results:
        print(f"  {d['path']}: {d.get('deprecation_message', 'no message')}")
except Exception as exc:
    # Surface the error rather than silently reporting zero results
    print(f"Deprecated monikers: unavailable ({type(exc).__name__}: {exc})")

# Tag-based discovery — search for "credit" and filter by tag
print()
tag_results = client.search("credit", limit=200)
credit_tagged = [r for r in tag_results.results if "credit" in r.get("tags", [])]
print(f"Monikers tagged 'credit': {len(credit_tagged)}")
for t in credit_tagged:
    print(f"  {t['path']:<30} tags={t.get('tags', [])}")

---
## Section C: Schema Introspection & Pre-Flight Checks

Before fetching data, inspect metadata to understand what we're getting and check for known issues.
This pattern — **"check metadata first, then fetch"** — is critical for production workflows.
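
The "check metadata first, then fetch" pattern can be wrapped in a small helper. The sketch below is hypothetical and not part of the Moniker client. It assumes a `data_quality` mapping with optional `quality_score` and `known_issues` keys, as read in the cells below, and leaves the score threshold to the caller because the score scale is not documented here. The sample values are made up for illustration.

```python
from typing import Any, Mapping, Optional


def preflight(data_quality: Optional[Mapping[str, Any]],
              min_score: float = 0.0,
              keywords: tuple = ("gap",)) -> dict:
    """Summarise a data_quality mapping before any data is fetched."""
    dq = data_quality or {}
    score = dq.get("quality_score")
    issues = list(dq.get("known_issues", []))
    # Keep only the issues that mention one of the keywords (case-insensitive).
    flagged = [i for i in issues if any(k in i.lower() for k in keywords)]
    # A missing score is treated as acceptable; tighten this if needed.
    score_ok = score is None or score >= min_score
    return {"ok": score_ok, "score": score, "flagged_issues": flagged}


# Made-up sample metadata, for illustration only.
sample = {"quality_score": 0.9, "known_issues": ["Weekend gaps in ASOF_DATE"]}
report = preflight(sample, min_score=0.8)
print(report)
```

With real metadata the call would look like `preflight(meta.data_quality, min_score=...)`, and a fetch would only go ahead when `report["ok"]` is true and the flagged issues are acceptable.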

In [None]:
# C1: Schema introspection — schema()
schema = reflector.schema("credit.exposures")

print(f"Moniker:       {schema.moniker}")
print(f"Granularity:   {schema.granularity}")
print(f"Primary key:   {schema.primary_key}")
print(f"Semantic tags: {schema.semantic_tags}")

# Get related_monikers from metadata directly (richer than schema view)
meta_rel = client.metadata("credit.exposures")
related = (meta_rel.relationships or {}).get("related_monikers", [])
print(f"Related:       {related}")

print(f"\nColumns ({len(schema.columns)}):")
print(f"  {'Name':<22} {'Type':<10} {'Semantic':<12} {'Description'}")
print("  " + "-" * 80)
for col in schema.columns:
    print(f"  {col['name']:<22} {col.get('type',''):<10} {(col.get('semantic_type') or ''):<12} {col.get('description', '')}")

In [None]:
# C2: Metadata deep-dive via Moniker object
m = Moniker("credit.exposures", client=client)
meta = m.metadata()

print("=== Data Quality ===")
if meta.data_quality:
    print(f"  Quality score: {meta.data_quality.get('quality_score')}")
    print(f"  Known issues:")
    for issue in meta.data_quality.get('known_issues', []):
        print(f"    - {issue}")

print("\n=== Temporal Coverage ===")
if meta.temporal_coverage:
    for k, v in meta.temporal_coverage.items():
        print(f"  {k}: {v}")

print("\n=== Cost Indicators ===")
if meta.cost_indicators:
    for k, v in meta.cost_indicators.items():
        print(f"  {k}: {v}")

print("\n=== Documentation ===")
if meta.documentation:
    for k, v in meta.documentation.items():
        print(f"  {k}: {v}")

In [None]:
# C3: Data Quality Gate — check metadata BEFORE fetching
#
# Pattern: if the metadata warns about gaps, confirm them in the fetched data.

expect_gaps = False
if meta.data_quality:
    for issue in meta.data_quality.get("known_issues", []):
        if "gap" in issue.lower():
            print(f"WARNING (from metadata): {issue}")
            expect_gaps = True

# Now fetch and verify
result = client.fetch("credit.exposures", limit=10000)
print(f"\nFetched {result.row_count} rows, {len(result.columns)} columns")

# Detect actual gaps in ASOF_DATE
from datetime import datetime, timedelta

dates = sorted(set(row["ASOF_DATE"] for row in result.data))
gaps = []
for i in range(1, len(dates)):
    d1 = datetime.strptime(dates[i-1], "%Y-%m-%d").date()
    d2 = datetime.strptime(dates[i], "%Y-%m-%d").date()
    delta = (d2 - d1).days
    if delta > 1:
        gaps.append((dates[i-1], dates[i], delta))

print(f"\nDate range: {dates[0]} to {dates[-1]} ({len(dates)} unique dates)")
print(f"Gaps found: {len(gaps)}")
for g in gaps[:5]:
    print(f"  {g[0]} -> {g[1]} ({g[2]} days)")
if len(gaps) > 5:
    print(f"  ... and {len(gaps) - 5} more")

if expect_gaps and len(gaps) > 0:
    print("\nMetadata correctly warned about timeseries gaps (weekends).")
elif not expect_gaps and len(gaps) == 0:
    print("\nNo gaps expected, none found.")
else:
    print("\nUnexpected result — investigate further.")
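
The check above counts every jump of more than one day as a gap, so ordinary weekends are reported alongside genuinely missing days. A minimal sketch of a weekday-aware variant follows. `missing_business_days` is a hypothetical helper that ignores public holidays and expects `datetime.date` values, so the `ASOF_DATE` strings would first need converting with `date.fromisoformat`.

```python
from datetime import date, timedelta


def missing_business_days(dates):
    """Return weekdays (Mon-Fri) absent between consecutive observed dates."""
    observed = sorted(set(dates))
    missing = []
    for prev, curr in zip(observed, observed[1:]):
        day = prev + timedelta(days=1)
        while day < curr:
            if day.weekday() < 5:  # 0-4 are Monday to Friday
                missing.append(day)
            day += timedelta(days=1)
    return missing


# Friday, Monday, Wednesday: only the Tuesday is a missing business day.
sample = [date(2024, 1, 5), date(2024, 1, 8), date(2024, 1, 10)]
print(missing_business_days(sample))
```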

---
## Section D: Multi-Source Data Access

The same client API works across **5 different backends**: Snowflake, Oracle, REST, Excel, MS-SQL. The cells below demonstrate MS-SQL and REST.
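
One way to picture how a single call can reach different backends is an adapter registry keyed by source type. The sketch below is an illustration of that general pattern only. `SourceAdapter`, `InMemoryAdapter`, `Dispatcher`, and the `demo.*` paths are hypothetical names, and nothing here is taken from the Moniker service's actual implementation.

```python
from typing import Protocol


class SourceAdapter(Protocol):
    """Anything with a fetch(query, limit) method can act as a backend."""

    def fetch(self, query: str, limit: int) -> list: ...


class InMemoryAdapter:
    """Stand-in backend that serves rows from a Python list."""

    def __init__(self, rows: list):
        self._rows = rows

    def fetch(self, query: str, limit: int) -> list:
        # A real adapter would run `query` against its backend here.
        return self._rows[:limit]


class Dispatcher:
    """Maps a path to (source_type, query) and delegates to the adapter."""

    def __init__(self) -> None:
        self._adapters: dict = {}
        self._bindings: dict = {}

    def register(self, source_type: str, adapter: SourceAdapter) -> None:
        self._adapters[source_type] = adapter

    def bind(self, path: str, source_type: str, query: str) -> None:
        self._bindings[path] = (source_type, query)

    def fetch(self, path: str, limit: int = 100) -> list:
        source_type, query = self._bindings[path]
        return self._adapters[source_type].fetch(query, limit)


# Hypothetical paths and rows, for illustration only.
dispatcher = Dispatcher()
dispatcher.register("mssql", InMemoryAdapter([{"id": 1}, {"id": 2}, {"id": 3}]))
dispatcher.register("rest", InMemoryAdapter([{"symbol": "X"}]))
dispatcher.bind("demo.table", "mssql", "SELECT * FROM demo")
dispatcher.bind("demo.feed", "rest", "/demo/feed")

print(dispatcher.fetch("demo.table", limit=2))
print(dispatcher.fetch("demo.feed"))
```

The caller only ever names a path; which backend answers is decided by the binding, which is the property the cells in this section rely on.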

In [None]:
# D1: MS-SQL — Credit Exposures via server-side fetch
m = Moniker("credit.exposures", client=client)
result = m.fetch(limit=5000)
data = result.data

print(f"Moniker.fetch() returned {len(data)} rows")
print(f"\nFirst 10 rows (selected columns):")
print(f"{'ASOF_DATE':<12} {'CP_ID':<8} {'TYPE':<14} {'NOTIONAL':>16} {'CVA':>12} {'RATING':<6}")
print("-" * 72)
for row in data[:10]:
    print(f"{row['ASOF_DATE']:<12} {row['COUNTERPARTY_ID']:<8} {row['EXPOSURE_TYPE']:<14} "
          f"{row['NOTIONAL']:>16,.2f} {row['CVA']:>12,.2f} {row['RATING']:<6}")

In [None]:
# D2: Server-side fetch with execution stats
result = client.fetch("credit.exposures", limit=10)

print(f"Path:            {result.path}")
print(f"Source type:     {result.source_type}")
print(f"Columns:         {result.columns}")
print(f"Row count:       {result.row_count}")
print(f"Execution time:  {result.execution_time_ms:.1f} ms")
# Truncate long queries to 80 characters for display
query = result.query_executed
if query and len(query) > 80:
    query = query[:80] + "..."
print(f"Query executed:  {query}")

In [None]:
# D3: REST source — Commodities (demo of multi-source catalog)
# The REST adapter simulates a NEFA commodities API.
# Here we resolve to show the REST binding, and use the adapter directly.

resolved = client.resolve("commodities.derivatives/energy/ALL")
print(f"Source type: {resolved.source_type}")
print(f"Connection:  {resolved.connection.get('base_url', 'n/a')}")
print(f"Query:       {resolved.query}")

# Call REST adapter directly to show energy data
from moniker_data.adapters.rest import MockRestAdapter
rest = MockRestAdapter()
energy_data = rest.get_energy("CL", "SPOT")
print(f"\nDirect adapter call — CL SPOT prices: {len(energy_data)} rows")
for row in energy_data[:5]:
    print(f"  {row['TIMESTAMP']:<22} {row['SYMBOL']:<5} {row['CONTRACT']:<6} ${row['PRICE']:>8.2f}  {row['NAME']}")

In [None]:
# D4: Compare schemas — credit.exposures vs credit.limits
schema_exp = reflector.schema("credit.exposures")
schema_lim = reflector.schema("credit.limits")

print(f"{'':30} {'credit.exposures':>20} {'credit.limits':>20}")
print("-" * 72)
print(f"{'Columns':30} {len(schema_exp.columns):>20} {len(schema_lim.columns):>20}")
print(f"{'Primary key':30} {str(schema_exp.primary_key):>20} {str(schema_lim.primary_key):>20}")
print(f"{'Granularity':30} {(schema_exp.granularity or '')[:20]:>20} {(schema_lim.granularity or '')[:20]:>20}")

# Get related_monikers from metadata for each
meta_exp = client.metadata("credit.exposures")
meta_lim = client.metadata("credit.limits")
rel_exp = (meta_exp.relationships or {}).get("related_monikers", [])
rel_lim = (meta_lim.relationships or {}).get("related_monikers", [])
print(f"{'Related monikers':30} {str(rel_exp):>20} {str(rel_lim):>20}")

# Show the join key
exp_cols = {c['name'] for c in schema_exp.columns}
lim_cols = {c['name'] for c in schema_lim.columns}
shared = exp_cols & lim_cols
print(f"\nShared columns (join keys): {shared}")
print(f"These two datasets are linked via related_monikers and share COUNTERPARTY_ID for joins.")

---
## Section E: Resolution, Lineage & Ownership

Every moniker can be **resolved** to its underlying source connection details,
and **described** with full ownership lineage.
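
The `*_defined_at` fields in the lineage output suggest that ownership is inherited down the path hierarchy, with each field remembering where it was set. The sketch below shows one way such a resolution could work. It assumes that a definition closer to the leaf overrides one higher up, which the notebook does not confirm. `resolve_ownership` and the sample owners are hypothetical.

```python
def resolve_ownership(hierarchy, definitions):
    """Walk root-to-leaf; the nearest definition wins and its origin is recorded."""
    resolved = {}
    for path in hierarchy:
        for field, value in definitions.get(path, {}).items():
            resolved[field] = value
            resolved[f"{field}_defined_at"] = path
    return resolved


# Made-up ownership definitions, for illustration only.
hierarchy = ["credit", "credit.exposures"]
definitions = {
    "credit": {
        "accountable_owner": "owner@example.com",
        "support_channel": "#credit-support",
    },
    "credit.exposures": {
        "support_channel": "#exposures-support",
    },
}

own = resolve_ownership(hierarchy, definitions)
for key in sorted(own):
    print(f"{key}: {own[key]}")
```

Here the owner is inherited from `credit`, while the support channel is overridden at `credit.exposures`, and the `_defined_at` entries record that difference.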

In [None]:
# E1: Resolve — see the underlying connection details
resolved = client.resolve("credit.exposures")

print(f"Moniker:     {resolved.moniker}")
print(f"Source type: {resolved.source_type}")
print(f"Connection:")
for k, v in resolved.connection.items():
    if k != 'query':
        print(f"  {k}: {v}")
# Truncate long queries to 100 characters for display
query = resolved.query
if query and len(query) > 100:
    query = query[:100] + "..."
print(f"Query:       {query}")

In [None]:
# E2: Describe (domain + leaf) and Lineage

# Domain-level ownership
desc_domain = client.describe("credit")
print("=== Domain: credit ===")
print(f"  Display name: {desc_domain.get('display_name')}")
print(f"  Description:  {desc_domain.get('description', '')[:80]}")
print(f"  Owner:        {desc_domain.get('ownership', {}).get('accountable_owner')}")
print(f"  Support:      {desc_domain.get('ownership', {}).get('support_channel')}")

# Leaf-level ownership
print()
desc_leaf = client.describe("credit.exposures")
print("=== Leaf: credit.exposures ===")
print(f"  Display name:  {desc_leaf.get('display_name')}")
print(f"  Source type:   {desc_leaf.get('source_type')}")
print(f"  Tags:          {desc_leaf.get('tags')}")
print(f"  Data quality:  {desc_leaf.get('data_quality', {}).get('quality_score')}")

# Lineage — shows ownership provenance (where each field was defined)
print()
lineage = client.lineage("credit.exposures")
print("=== Lineage ===")
print(f"  Path:    {lineage.get('path')}")
print(f"  Source:  {lineage.get('source', {}).get('type')} (bound at {lineage.get('source', {}).get('binding_defined_at')})")
own = lineage.get("ownership", {})
print(f"  Owner:   {own.get('accountable_owner')} (defined at {own.get('accountable_owner_defined_at')})")
print(f"  Support: {own.get('support_channel')} (defined at {own.get('support_channel_defined_at')})")
print(f"  Hierarchy: {' -> '.join(lineage.get('path_hierarchy', []))}")

---
## Summary

This notebook exercised the following parts of the Moniker system:

| Feature | Demonstrated |
|---|---|
| `CatalogReflector.stats()` | Catalog-wide statistics |
| `GET /domains` | Top-level domain listing with governance metadata |
| `CatalogReflector.sources()` | Source type distribution |
| `CatalogReflector.search(query)` | Text search |
| `MonikerClient.search(query, status=)` | Status filter (active, deprecated) |
| `MonikerClient.search(query)` plus tag filter | Tag-based discovery |
| `CatalogReflector.schema(moniker)` | Schema introspection |
| `Moniker.metadata()`, `MonikerClient.metadata()` | Rich AI-discoverable metadata |
| `Moniker.fetch()`, `MonikerClient.fetch()` | Server-side query execution |
| `MonikerClient.resolve()` | Source resolution |
| `MonikerClient.describe()` | Ownership & governance |
| `MonikerClient.lineage()` | Ownership lineage chain |
| Data quality gate | Check metadata for gaps before fetching |
| Multi-source access | MS-SQL, REST, schema comparison |

The reflector methods `domains()`, `by_status()`, `deprecated()`, and `by_tag()`, and the `search("", source_type=)` filter, are not called directly; the cells use the `/domains` endpoint and `client.search()` instead. `Moniker.read()` (client-side read) is not called either: all data access goes through server-side `fetch()` or a direct adapter call. All **5 source types** are represented in the catalog.