Skip to content

ravichander34/dataengineering-hw4

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 

Repository files navigation

Question 1. dbt Lineage and Execution

Given a dbt project with the following structure:

models/ ├── staging/ │ ├── stg_green_tripdata.sql │ └── stg_yellow_tripdata.sql └── intermediate/ └── int_trips_unioned.sql (depends on stg_green_tripdata & stg_yellow_tripdata) If you run dbt run --select int_trips_unioned, what models will be built?

Ans : int_trips_unioned only

Question 2 You've configured a generic test like this in your schema.yml:

columns:

  • name: payment_type data_tests:
    • accepted_values: arguments: values: [1, 2, 3, 4, 5] quote: false Your model fct_trips has been running successfully for months. A new value 6 now appears in the source data.

What happens when you run dbt test --select fct_trips?

Ans : dbt will fail the test, returning a non-zero exit code

Question 3 After running your dbt project, query the fct_monthly_zone_revenue model.

What is the count of records in the fct_monthly_zone_revenue model?

SELECT COUNT(*) FROM prod.fct_monthly_zone_revenue_models;

Ans : 12184

Question 4 Using the fct_monthly_zone_revenue table, find the pickup zone with the highest total revenue (revenue_monthly_total_amount) for Green taxi trips in 2020.

Which zone had the highest revenue?

select pickup_zone, sum(revenue_monthly_total_amount) as total_revenue_2020 from prod.fct_monthly_zone_revenue where service_type = 'Green' and extract(year from revenue_month) = 2020 group by pickup_zone order by total_revenue_2020 desc limit 1;

Ans : East Harlem North

Question 5 Using the fct_monthly_zone_revenue table, what is the total number of trips (total_monthly_trips) for Green taxis in October 2019?

select sum(total_monthly_trips) as total_trips_oct_2019 from prod.fct_monthly_zone_revenue where service_type = 'Green' and extract(year from revenue_month) = 2019 and extract(month from revenue_month) = 10;

Ans : 384,624

Question 6

Create a staging model for the For-Hire Vehicle (FHV) trip data for 2019.

Load the FHV trip data for 2019 into your data warehouse Create a staging model stg_fhv_tripdata with these requirements: Filter out records where dispatching_base_num IS NULL Rename fields to match your project's naming conventions (e.g., PUlocationID → pickup_location_id) What is the count of records in stg_fhv_tripdata?

D SELECT COUNT(*) FROM prod.stg_fhv_tripdata;

Answer : 43244693

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages