| Aspect                               | Details / What to Know                                                                  | Where Used                 | Role in Data Engineering System Design                                |
| ------------------------------------ | --------------------------------------------------------------------------------------- | -------------------------- | --------------------------------------------------------------------- |
| **Fact Table**                       | `fact_ad_performance` containing metrics: impressions, clicks, spend, conversions, and CTR (clicks / impressions; recompute from summed clicks and impressions when aggregating) | BigQuery / Snowflake       | Central table for aggregations and reporting per campaign / ad / date |
| **Dimension Tables**                 | Examples: `dim_campaign`, `dim_ad`, `dim_date`, `dim_device`, `dim_channel`             | BigQuery / Redshift        | Provide context to metrics; used for slicing & dicing analytics       |
| **Keys**                             | Fact table has foreign keys to dimensions; dimensions have primary keys                 | Star schema                | Enables joins and aggregation in queries                              |
| **Granularity**                      | Usually **ad-level per day**; can be per impression in advanced pipelines               | ETL / dbt / Dataproc       | Sets the finest level at which metrics can be aggregated and reported |
| **ETL / ELT Relevance**              | dbt transforms raw API tables from DV360, Facebook Ads, CM360 → fact & dimension tables | Dataproc / dbt / BigQuery  | Standardizes metrics, unifies data across platforms                   |
| **Batch / Streaming Usage**          | Mostly **daily batch**, can support near-real-time if needed                            | Dataproc + Cloud Scheduler | Ensures dashboards have fresh metrics                                 |
| **Slowly Changing Dimensions (SCD)** | Campaigns may change targeting or names → type 2 SCD to preserve history                | dbt transformations        | Keeps historical context for campaign performance analysis            |
| **Partitioning**                     | Partition `fact_ad_performance` on `date` in BigQuery; Redshift native tables are not partitioned, so use a sort key on `date` there | BigQuery / Redshift        | Reduces cost and improves performance for daily analytics             |
| **BI / Analytics**                   | Connects directly to Looker / Looker Studio (formerly Data Studio) dashboards           | Looker / Looker Studio     | Used for campaign reporting, KPIs, ad performance dashboards          |
| **Backfills & Corrections**          | Use Kafka replay or batch backfill from raw tables for missing or late events           | Dataproc / BigQuery        | Ensures historical accuracy for reports                               |

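The fact-table row above lists CTR alongside additive measures. As a minimal sketch of why a ratio like CTR should be recomputed from summed components after a star-schema join, the following uses Python's built-in `sqlite3` as a stand-in for BigQuery / Snowflake. The column names (`campaign_key`, `campaign_name`, `ad_id`) and all sample rows are assumptions made for illustration, not a real schema.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny in-memory star schema (hypothetical columns and data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_campaign (
            campaign_key  INTEGER PRIMARY KEY,
            campaign_name TEXT
        );
        -- Grain: one row per ad per day.
        CREATE TABLE fact_ad_performance (
            date         TEXT,
            campaign_key INTEGER REFERENCES dim_campaign (campaign_key),
            ad_id        TEXT,
            impressions  INTEGER,
            clicks       INTEGER,
            spend        REAL
        );
        INSERT INTO dim_campaign VALUES (1, 'spring_sale'), (2, 'brand');
        INSERT INTO fact_ad_performance VALUES
            ('2024-01-01', 1, 'ad_a', 1000, 10, 5.0),
            ('2024-01-01', 1, 'ad_b',  100, 10, 2.0),
            ('2024-01-02', 1, 'ad_a', 2000, 40, 9.0),
            ('2024-01-01', 2, 'ad_c',  500,  5, 1.5);
        """
    )
    return conn


def ctr_by_campaign(conn: sqlite3.Connection) -> dict:
    """Return {campaign_name: CTR}, with CTR = SUM(clicks) / SUM(impressions)."""
    rows = conn.execute(
        """
        SELECT d.campaign_name,
               SUM(f.clicks) * 1.0 / SUM(f.impressions) AS ctr
        FROM fact_ad_performance AS f
        JOIN dim_campaign AS d USING (campaign_key)
        GROUP BY d.campaign_name
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    for name, ctr in sorted(ctr_by_campaign(build_demo_db()).items()):
        print(f"{name}: {ctr:.4f}")
```

With this sample data, averaging the three per-row CTRs for `spring_sale` would give roughly 0.043, while the volume-weighted value computed here is 60 / 3100, roughly 0.019. The low-volume `ad_b` row is what skews the naive average.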

| Aspect                               | Details / What to Know                                                                 | Where Used                    | Role in Data Engineering System Design                                               |
| ------------------------------------ | -------------------------------------------------------------------------------------- | ----------------------------- | ------------------------------------------------------------------------------------ |
| **Fact Table**                       | `fact_transactions` keyed by order_id (a degenerate dimension), with measures: revenue, units_sold, discounts, cost | BigQuery / Snowflake          | Central table for revenue, conversion, and sales metrics                             |
| **Dimension Tables**                 | Examples: `dim_customer`, `dim_product`, `dim_date`, `dim_channel`, `dim_region`       | BigQuery / Redshift           | Context for transactions; allows aggregation by product, region, or customer segment |
| **Keys**                             | Fact table foreign keys → dimension primary keys                                       | Star schema                   | Enables consistent joins; BigQuery and Snowflake do not enforce these constraints, so integrity is usually verified with tests (e.g. dbt `relationships` tests) |
| **Granularity**                      | Usually **transaction-level** (one row per order or line item)                         | ETL / dbt / Dataproc          | Lowest level from which daily, weekly, and monthly metrics are rolled up             |
| **ETL / ELT Relevance**              | Raw e-commerce / POS / CRM data → dbt transformations → fact & dimensions              | Dataproc / dbt / BigQuery     | Ensures clean, standardized, client-agnostic data                                    |
| **Batch / Streaming Usage**          | Daily batch for historical aggregation; streaming for near-real-time dashboards        | Dataproc / BigQuery streaming | Supports dashboards for revenue monitoring or alerts                                 |
| **Slowly Changing Dimensions (SCD)** | Customers or product details may change → type 2 SCD preserves historical accuracy     | dbt                           | Maintains correct attribution of revenue over time                                   |
| **Partitioning**                     | Partition fact table by `transaction_date` in BigQuery; in Redshift use a sort key on `transaction_date` instead | BigQuery / Redshift           | Improves query performance and cost efficiency                                       |
| **BI / Analytics**                   | Connects to Looker / Looker Studio (formerly Data Studio)                              | Looker / Looker Studio        | Sales, revenue, conversion rate dashboards; client reporting                         |
| **Backfills & Corrections**          | Historical orders can be reprocessed via batch jobs                                    | Dataproc / BigQuery           | Ensures accurate financial reporting, handles missing or corrected data              |

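The SCD row above relies on type 2 history so that each transaction is attributed to the customer attributes that were valid on its date. A minimal in-memory sketch of that idea follows. In practice this is typically handled by dbt or warehouse SQL; the `CustomerVersion` fields (`customer_key`, `segment`, `valid_from`, `valid_to`) and both helper functions are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class CustomerVersion:
    customer_key: int          # surrogate key, one per version
    customer_id: str           # natural (business) key
    segment: str               # tracked attribute
    valid_from: date           # inclusive
    valid_to: Optional[date]   # exclusive; None marks the current version


def apply_scd2_change(
    history: List[CustomerVersion],
    customer_id: str,
    new_segment: str,
    change_date: date,
) -> List[CustomerVersion]:
    """Close the current version and append a new one (SCD type 2)."""
    next_key = max((v.customer_key for v in history), default=0) + 1
    updated = []
    for v in history:
        if v.customer_id == customer_id and v.valid_to is None:
            if v.segment == new_segment:
                return list(history)  # attribute unchanged, keep history as is
            updated.append(replace(v, valid_to=change_date))
        else:
            updated.append(v)
    updated.append(
        CustomerVersion(next_key, customer_id, new_segment, change_date, None)
    )
    return updated


def version_as_of(
    history: List[CustomerVersion], customer_id: str, as_of: date
) -> Optional[CustomerVersion]:
    """Return the version valid on `as_of`, e.g. a transaction_date."""
    for v in history:
        if (
            v.customer_id == customer_id
            and v.valid_from <= as_of
            and (v.valid_to is None or as_of < v.valid_to)
        ):
            return v
    return None


if __name__ == "__main__":
    history = [CustomerVersion(1, "c1", "standard", date(2023, 1, 1), None)]
    history = apply_scd2_change(history, "c1", "premium", date(2024, 3, 1))
    # An order placed before the change still maps to the old segment.
    print(version_as_of(history, "c1", date(2024, 2, 15)).segment)
```

Storing the surrogate `customer_key` of the matching version on each fact row at load time avoids repeating the date-range lookup in every query.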

| Aspect                         | Details / What to Know                                                                                                                                         | Where Used                                                | Role in Data Engineering System Design                                     |
| ------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------- | -------------------------------------------------------------------------- |
| **Fact Table**                 | `fact_trades` with one **trade-level** row per execution: trade_id, symbol, timestamp, trade_type, venue, plus the measures price and quantity                 | BigQuery / Redshift / Snowflake / HDFS                    | Central table for trade analytics, P&L, and market activity                |
| **Dimension Tables**           | `dim_symbol` (ticker, instrument type, exchange), `dim_time` (date, hour, minute, second), `dim_trader` (trader_id, desk), `dim_venue` (exchange, market type) | Star schema in data warehouse                             | Provides context for slicing by symbol, exchange, trader, and time windows |
| **Keys**                       | Fact table foreign keys → dimension tables; surrogate keys in dimensions                                                                                       | Star schema                                               | Ensures joins are consistent and analytics can aggregate quickly           |
| **Granularity**                | Typically **per trade**, can be per quote (bid/ask) for high-frequency analysis                                                                                | ETL / Dataproc / PySpark                                  | Determines level of aggregation for real-time and historical analysis      |
| **Event Time**                 | `trade_timestamp` or `quote_timestamp`                                                                                                                         | Streaming ETL                                             | Critical for windowed aggregations and time-based analytics                |
| **Processing Time**            | Time when event is ingested / processed by pipeline                                                                                                            | Streaming ETL                                             | Useful for monitoring latency, detecting delays                            |
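| **Watermarks**                 | Estimate of **event-time progress** (for example, max event time seen minus an allowed lateness); a window is finalized once the watermark passes its end, so trades/quotes arriving within that bound are still counted | Apache Flink / Spark Structured Streaming / Kafka Streams (grace period) | Enables correct aggregation of windows, handles out-of-order events        |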
| **Windows / Aggregations**     | Tumbling / sliding / session windows on trade timestamp                                                                                                        | PySpark / Flink / Dataproc                                | Compute metrics like VWAP, volume per minute, bid-ask spread, realized P&L |
| **ETL / ELT Relevance**        | Raw market data → Dataproc/PySpark transformations → fact/dimension tables → BigQuery / Snowflake                                                              | Dataproc / dbt / BigQuery                                 | Clean, normalized, and enriched trade data ready for analytics             |
| **Slowly Changing Dimensions** | Symbols may be delisted, renamed, or split → SCD type 2                                                                                                        | dbt / ETL                                                 | Maintains historical correctness for backtesting and reporting             |
| **Partitioning / Clustering**  | In BigQuery, partition fact table by `trade_date` and cluster by `symbol` or `venue`; in Redshift, use sort and distribution keys on the same columns         | BigQuery / Redshift                                       | Speeds up queries for trading desks, risk reports, and dashboards          |
| **Backfills & Corrections**    | Use Kafka replay / batch reprocessing for missed trades or corrected prices                                                                                    | Dataproc / BigQuery                                       | Essential for compliance, risk, and audit                                  |
| **BI / Analytics**             | Connects to Looker / Tableau / custom trading dashboards                                                                                                       | Looker / Tableau                                          | Dashboards for P&L, risk metrics, and execution quality                    |
| **Additional Metrics**         | VWAP, TWAP, spread, liquidity, market depth, fill rates, latency                                                                                               | Data engineering pipelines                                | Enables performance analysis, risk monitoring, and strategy optimization   |
| **Reliability / Compliance**   | Immutable raw tables, audit logs, replayable streams                                                                                                           | BigQuery / Kafka / GCS                                    | Supports audit and record-keeping requirements (FINRA, SEC, MiFID II)      |
| **Scalability**                | Partitioned tables, distributed processing for high-frequency data, streaming ingestion from multiple venues                                                   | Dataproc / Kafka / Flink                                  | Handles millions of trades per day per asset class                         |

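The event-time, watermark, and window rows above fit together as follows. This is a simplified single-process sketch, not how Flink or Spark Structured Streaming implement it: trades are assigned to tumbling windows by `trade_timestamp`, the watermark is taken as the maximum event time seen minus an allowed lateness, and a window's VWAP is emitted once the watermark passes the window end. The class name `TumblingVwap`, its parameters, and the sample trades are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Trade:
    symbol: str
    price: float
    quantity: int          # assumed to be positive
    trade_timestamp: int   # event time, epoch seconds


class TumblingVwap:
    """Per-symbol VWAP over tumbling event-time windows with bounded lateness."""

    def __init__(self, window_seconds: int = 60, allowed_lateness_seconds: int = 5):
        self.window = window_seconds
        self.lateness = allowed_lateness_seconds
        self.max_event_time: Optional[int] = None
        # (symbol, window_start) -> [sum(price * quantity), sum(quantity)]
        self.open_windows = defaultdict(lambda: [0.0, 0])
        # Trades whose window was already emitted; candidates for a backfill.
        self.dropped: List[Trade] = []

    @property
    def watermark(self) -> Optional[int]:
        if self.max_event_time is None:
            return None
        return self.max_event_time - self.lateness

    def process(self, trade: Trade) -> List[Tuple[str, int, float]]:
        """Ingest one trade; return (symbol, window_start, vwap) for closed windows."""
        window_start = trade.trade_timestamp - trade.trade_timestamp % self.window
        wm = self.watermark
        if wm is not None and window_start + self.window <= wm:
            self.dropped.append(trade)  # too late: its window is already final
            return []
        acc = self.open_windows[(trade.symbol, window_start)]
        acc[0] += trade.price * trade.quantity
        acc[1] += trade.quantity
        if self.max_event_time is None or trade.trade_timestamp > self.max_event_time:
            self.max_event_time = trade.trade_timestamp
        return self._close_ready_windows()

    def _close_ready_windows(self) -> List[Tuple[str, int, float]]:
        wm = self.watermark
        ready = sorted(k for k in self.open_windows if k[1] + self.window <= wm)
        results = []
        for symbol, window_start in ready:
            notional, volume = self.open_windows.pop((symbol, window_start))
            results.append((symbol, window_start, notional / volume))
        return results


if __name__ == "__main__":
    agg = TumblingVwap(window_seconds=60, allowed_lateness_seconds=5)
    stream = [
        Trade("AAPL", 100.0, 10, 0),
        Trade("AAPL", 102.0, 30, 30),
        Trade("AAPL", 101.0, 5, 62),
        Trade("AAPL", 104.0, 10, 59),  # out of order, but within lateness
        Trade("AAPL", 101.0, 5, 66),   # moves the watermark past the first window
        Trade("AAPL", 99.0, 1, 10),    # arrives after its window closed
    ]
    for t in stream:
        for closed in agg.process(t):
            print(closed)
    print("dropped:", agg.dropped)
```

The out-of-order trade at second 59 is still included in the first window because the watermark had only reached 57 when it arrived. The trade at second 10 arrives after that window was emitted and lands in `dropped`, which is the kind of record the backfill row would reprocess in batch.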