NYC Uber Taxi Data Analysis | Data Engineering - Google Cloud Platform Project

Introduction

The goal of this project is to analyze NYC Uber Taxi data using Google Cloud Storage, Python, a Compute Engine instance, the Mage data pipeline tool, BigQuery, and Looker Studio.

Architecture

Project Architecture

Technology Used

Programming Language:

  • Python
  • SQL

Google Cloud Platform:

  • Google Cloud Storage
  • Compute Engine instance
  • BigQuery
  • Looker Studio

Modern Data Pipeline Tool:

  • Mage: https://www.mage.ai/

Contribute to this open-source project - https://github.com/mage-ai/mage-ai

Dataset

TLC Trip Record Data

Yellow and green taxi trip records include fields capturing pick-up and drop-off dates/times, pick-up and drop-off locations, trip distances, itemized fares, rate types, payment types, and driver-reported passenger counts.

Dataset used: https://github.com/PreetKothari/Uber_etl_pipeline_data_analytics_project/tree/main/Data

Website - https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page

Data Dictionary - https://www.nyc.gov/assets/tlc/downloads/pdf/data_dictionary_trip_records_yellow.pdf
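
To make the fields described above concrete, here is a minimal sketch of parsing trip records with the Python standard library. It is not the project's own loading code. The column names follow the yellow taxi data dictionary; the sample rows are made up for illustration, and the timestamp format `%Y-%m-%d %H:%M:%S` is an assumption about how the CSV stores dates. `parse_trip` and `load_trips` are hypothetical helper names.

```python
import csv
import io
from datetime import datetime

# Illustrative rows only (not taken from the real dataset); the column names
# follow the TLC yellow taxi data dictionary.
SAMPLE_CSV = """VendorID,tpep_pickup_datetime,tpep_dropoff_datetime,passenger_count,trip_distance,payment_type,fare_amount,total_amount
1,2016-03-01 00:00:00,2016-03-01 00:07:55,1,2.50,1,9.0,12.35
2,2016-03-01 00:10:00,2016-03-01 00:31:30,2,7.20,2,24.0,25.30
"""

# Assumed timestamp layout; adjust if the actual file differs.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"


def parse_trip(row):
    """Convert one raw CSV row (all strings) into typed Python values."""
    pickup = datetime.strptime(row["tpep_pickup_datetime"], TIMESTAMP_FORMAT)
    dropoff = datetime.strptime(row["tpep_dropoff_datetime"], TIMESTAMP_FORMAT)
    return {
        "vendor_id": int(row["VendorID"]),
        "pickup": pickup,
        "dropoff": dropoff,
        # Derived field: trip duration in minutes.
        "duration_minutes": (dropoff - pickup).total_seconds() / 60,
        "passenger_count": int(row["passenger_count"]),
        "trip_distance": float(row["trip_distance"]),
        "payment_type": int(row["payment_type"]),
        "fare_amount": float(row["fare_amount"]),
        "total_amount": float(row["total_amount"]),
    }


def load_trips(csv_text):
    """Parse CSV text with a header row into a list of typed trip dicts."""
    return [parse_trip(row) for row in csv.DictReader(io.StringIO(csv_text))]


trips = load_trips(SAMPLE_CSV)
```

For a real file, the same `parse_trip` function could be applied to rows read with `csv.DictReader(open(path, newline=""))`.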

Data Model

NY Taxi - Uber Data Model
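
The data model diagram is not reproduced in this text. As a purely illustrative sketch of one common way to model trip data, and not necessarily the layout used in this project, the snippet below splits trips into a small payment-type dimension table and a fact table that references it by surrogate key. The payment type codes come from the yellow taxi data dictionary; the table and key names (`payment_type_id`, `trip_id`), the helper functions, and the sample trips are assumptions made for the example.

```python
# Payment type codes as listed in the TLC yellow taxi data dictionary.
PAYMENT_TYPE_NAMES = {
    1: "Credit card",
    2: "Cash",
    3: "No charge",
    4: "Dispute",
    5: "Unknown",
    6: "Voided trip",
}

# Made-up trips, reduced to the two fields this sketch needs.
sample_trips = [
    {"payment_type": 1, "fare_amount": 9.0},
    {"payment_type": 2, "fare_amount": 24.0},
    {"payment_type": 1, "fare_amount": 5.5},
]


def build_payment_type_dim(trips):
    """Build one dimension row per distinct payment type code."""
    codes = sorted({trip["payment_type"] for trip in trips})
    return [
        {
            "payment_type_id": key,  # hypothetical surrogate key
            "payment_type": code,
            "payment_type_name": PAYMENT_TYPE_NAMES.get(code, "Unknown"),
        }
        for key, code in enumerate(codes)
    ]


def build_fact_table(trips, payment_type_dim):
    """Replace each trip's payment type code with the dimension's key."""
    key_by_code = {
        row["payment_type"]: row["payment_type_id"] for row in payment_type_dim
    }
    return [
        {
            "trip_id": trip_id,  # hypothetical surrogate key
            "payment_type_id": key_by_code[trip["payment_type"]],
            "fare_amount": trip["fare_amount"],
        }
        for trip_id, trip in enumerate(trips)
    ]


payment_type_dim = build_payment_type_dim(sample_trips)
fact_table = build_fact_table(sample_trips, payment_type_dim)
```

Keeping descriptive attributes in a dimension table means the fact table stores only keys and measures, which tends to keep analytical queries in a warehouse such as BigQuery simple.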