## Question 1. dbt Lineage and Execution

Given a dbt project with the following structure:

```
models/
├── staging/
│   ├── stg_green_tripdata.sql
│   └── stg_yellow_tripdata.sql
└── intermediate/
    └── int_trips_unioned.sql (depends on stg_green_tripdata & stg_yellow_tripdata)
```

If you run `dbt run --select int_trips_unioned`, which models will be built?

- stg_green_tripdata, stg_yellow_tripdata, and int_trips_unioned (upstream dependencies)
- Any model with upstream and downstream dependencies to int_trips_unioned
- int_trips_unioned only
- int_trips_unioned, int_trips, and fct_trips (downstream dependencies)

Answer: int_trips_unioned only

Explanation:
dbt builds models based on dependency lineage, which is defined using the ref() function.

The dependency chain is:

```
stg_green_tripdata ─┐
                     ├── int_trips_unioned
stg_yellow_tripdata ┘
```

This means:
- int_trips_unioned depends on both staging models
- staging models are upstream dependencies
- dbt uses this graph to order execution: when a run includes both a model and its upstream dependencies, the upstream models are built first
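
As a rough illustration of how a dependency graph implies a build order, the lineage above can be written as a plain dictionary and sorted topologically. This is only a sketch with Python's standard `graphlib` module; the `deps` dictionary is hand-copied from the diagram and is not read from dbt, and this is not how dbt itself is implemented.

```python
from graphlib import TopologicalSorter

# Hand-written copy of the lineage above: model -> models it ref()s.
deps = {
    "stg_green_tripdata": set(),
    "stg_yellow_tripdata": set(),
    "int_trips_unioned": {"stg_green_tripdata", "stg_yellow_tripdata"},
}

# static_order() yields every node after all of its dependencies.
order = list(TopologicalSorter(deps).static_order())
print(order)
```

The two staging models can come out in either order, but int_trips_unioned always comes last. Note that this only describes ordering among the models that are part of a run; it says nothing about which models a selector picks.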

### dbt Execution Behavior

When running:

```bash
dbt run --select int_trips_unioned --target prod
```

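dbt will:
1. Parse the project and build the lineage graph from the `ref()` calls
2. Resolve the selector: a bare model name with no `+` operator matches only that model
3. Build the selected model, with each `ref()` compiled to the name of the corresponding staging relation

dbt does not check whether the upstream relations exist, and it does not build them on its own. If they are missing from the target database, the query for int_trips_unioned fails with a database error.

So dbt builds:

```
int_trips_unioned
```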

### Why downstream models are NOT built

Downstream models like:

```
int_trips
fct_trips
```

depend on int_trips_unioned, but dbt does NOT build downstream models unless explicitly requested.

To build downstream models, you would use:

```bash
dbt run --select int_trips_unioned+
```

### What happened in my local execution

When I ran `dbt run --select int_trips_unioned --target prod`, only int_trips_unioned was built.

The run succeeded because the staging models already existed in DuckDB from an earlier `dbt build --target prod`. dbt did not check for them or rebuild them; the compiled query for int_trips_unioned simply read from the relations that were already there.

![Question 1](./images/Question1.png)

### Key Concept: dbt Lineage

Symbols used in dbt selection:

| Symbol | Meaning |
|---|---|
| `model` | selected model only |
| `+model` | model + upstream dependencies |
| `model+` | model + downstream dependencies |
| `+model+` | upstream + model + downstream |
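
The four selector forms can be mimicked with a small toy resolver over the lineage from this question. This is a sketch, not dbt's implementation: the `PARENTS` dictionary and the `ancestors`, `descendants`, and `select` helpers are hypothetical, and the edges for int_trips and fct_trips are assumed, since the project structure above does not show them.

```python
# Toy lineage: model -> set of direct parents (the models it ref()s).
PARENTS = {
    "stg_green_tripdata": set(),
    "stg_yellow_tripdata": set(),
    "int_trips_unioned": {"stg_green_tripdata", "stg_yellow_tripdata"},
    "int_trips": {"int_trips_unioned"},  # assumed edge
    "fct_trips": {"int_trips"},          # assumed edge
}

def ancestors(model):
    """All models upstream of `model`, at any depth."""
    found = set()
    stack = list(PARENTS[model])
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(PARENTS[node])
    return found

def descendants(model):
    """All models that have `model` somewhere upstream."""
    return {m for m in PARENTS if model in ancestors(m)}

def select(selector):
    """Resolve 'model', '+model', 'model+' or '+model+' to a set of names."""
    name = selector.strip("+")
    chosen = {name}
    if selector.startswith("+"):
        chosen |= ancestors(name)
    if selector.endswith("+"):
        chosen |= descendants(name)
    return chosen

for s in ("int_trips_unioned", "+int_trips_unioned",
          "int_trips_unioned+", "+int_trips_unioned+"):
    print(f"{s:22} -> {sorted(select(s))}")
```

The bare name resolves to a single model, which is the case asked about in this question; the staging models only appear once a leading `+` is added.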

Final Conclusion:
The correct answer is:

int_trips_unioned only

because a bare model name in `--select` matches just that model; dbt neither builds nor checks its upstream dependencies. My local run is consistent with this: only int_trips_unioned was built. To build the staging models as well, use `dbt run --select +int_trips_unioned`.