Question 1. dbt Lineage and Execution
Given a dbt project with the following structure:
models/ ├── staging/ │ ├── stg_green_tripdata.sql │ └── stg_yellow_tripdata.sql └── intermediate/ └── int_trips_unioned.sql (depends on stg_green_tripdata & stg_yellow_tripdata) If you run dbt run --select int_trips_unioned, what models will be built?
Ans : int_trips_unioned only
Question 2 You've configured a generic test like this in your schema.yml:
columns:
- name: payment_type
data_tests:
- accepted_values: arguments: values: [1, 2, 3, 4, 5] quote: false Your model fct_trips has been running successfully for months. A new value 6 now appears in the source data.
What happens when you run dbt test --select fct_trips?
Ans : dbt will fail the test, returning a non-zero exit code
Question 3 After running your dbt project, query the fct_monthly_zone_revenue model.
What is the count of records in the fct_monthly_zone_revenue model?
SELECT COUNT(*) FROM prod.fct_monthly_zone_revenue_models;
Ans : 12184
Question 4 Using the fct_monthly_zone_revenue table, find the pickup zone with the highest total revenue (revenue_monthly_total_amount) for Green taxi trips in 2020.
Which zone had the highest revenue?
select pickup_zone, sum(revenue_monthly_total_amount) as total_revenue_2020 from prod.fct_monthly_zone_revenue where service_type = 'Green' and extract(year from revenue_month) = 2020 group by pickup_zone order by total_revenue_2020 desc limit 1;
Ans : East Harlem North
Question 5 Using the fct_monthly_zone_revenue table, what is the total number of trips (total_monthly_trips) for Green taxis in October 2019?
select sum(total_monthly_trips) as total_trips_oct_2019 from prod.fct_monthly_zone_revenue where service_type = 'Green' and extract(year from revenue_month) = 2019 and extract(month from revenue_month) = 10;
Ans : 384,624
Question 6
Create a staging model for the For-Hire Vehicle (FHV) trip data for 2019.
Load the FHV trip data for 2019 into your data warehouse Create a staging model stg_fhv_tripdata with these requirements: Filter out records where dispatching_base_num IS NULL Rename fields to match your project's naming conventions (e.g., PUlocationID → pickup_location_id) What is the count of records in stg_fhv_tripdata?
D SELECT COUNT(*) FROM prod.stg_fhv_tripdata;
Answer : 43244693