Aryan-401/FullStackDataAnalysis

Yellow Taxi Data Engineering Project

Data Sources

Process of Data Engineering

  • Download data from the source
  • Upload data to Google Cloud Storage
  • Decompose data into Third Normal Form (3NF)
  • Use Mage-AI to create a Pipeline
  • Upload data to Google BigQuery
  • Create a Looker Studio Dashboard

Third Normal Form (3NF)

[Image: 3NF]
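
As a rough illustration of the 3NF decomposition step, the sketch below moves a descriptive attribute that depends on a non-key column into its own lookup table. The column names (`trip_id`, `payment_type_id`, `payment_type_name`, `fare_amount`) and the helper `split_dimension` are hypothetical and chosen only for this example; the project's actual schema may differ.

```python
def split_dimension(rows, key, attr):
    """Move a descriptive attribute that depends on `key` into a lookup table.

    Returns (facts, dimension): `facts` keeps every column except `attr`,
    and `dimension` maps each distinct `key` value to its `attr` value.
    """
    dimension = {}
    facts = []
    for row in rows:
        dimension[row[key]] = row[attr]
        facts.append({k: v for k, v in row.items() if k != attr})
    return facts, dimension


# Hypothetical flat records: payment_type_name depends on payment_type_id,
# not on trip_id directly, which is the transitive dependency 3NF removes.
trips = [
    {"trip_id": 1, "payment_type_id": 1, "payment_type_name": "Credit card", "fare_amount": 12.5},
    {"trip_id": 2, "payment_type_id": 2, "payment_type_name": "Cash", "fare_amount": 7.0},
    {"trip_id": 3, "payment_type_id": 1, "payment_type_name": "Credit card", "fare_amount": 30.0},
]

fact_trips, payment_types = split_dimension(trips, "payment_type_id", "payment_type_name")
```

After the split, each payment type name is stored once in `payment_types`, and `fact_trips` refers to it only through `payment_type_id`.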

Data Engineering Pipeline

[Image: Pipeline]

Combining the Monthly data

python3 data/scripts/combine_data.py
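
The contents of `combine_data.py` are not shown here, so the following is only a minimal sketch of what combining monthly files could look like, assuming the monthly files are CSVs that share one header. The function name `combine_monthly_csvs` and any file paths are hypothetical.

```python
import csv


def combine_monthly_csvs(input_paths, output_path):
    """Concatenate CSV files that share a header; return the number of data rows written."""
    rows_written = 0
    writer = None
    with open(output_path, "w", newline="") as out_file:
        for path in input_paths:
            with open(path, newline="") as in_file:
                reader = csv.DictReader(in_file)
                if reader.fieldnames is None:
                    continue  # skip completely empty files
                if writer is None:
                    # Write the header once, taken from the first non-empty file
                    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames)
                    writer.writeheader()
                for row in reader:
                    # DictWriter raises ValueError if a later file has unknown columns
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Streaming row by row keeps memory use flat, which may matter if the monthly files are large.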

Changing .geojson to .csv

python3 data/scripts/geospacial_data.py
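
For readers unfamiliar with the conversion, here is a minimal sketch of turning a GeoJSON `FeatureCollection` into CSV text using only the standard library. It is not the project's script: the function name `geojson_to_csv` and the choice to store each geometry as a JSON string in a `geometry` column are assumptions made for illustration.

```python
import csv
import io
import json


def geojson_to_csv(geojson_text):
    """Flatten each feature's properties into one CSV row, plus a JSON `geometry` column."""
    collection = json.loads(geojson_text)
    features = collection.get("features", [])

    # Collect property names across all features, preserving first-seen order
    columns = []
    for feature in features:
        for name in (feature.get("properties") or {}):
            if name not in columns:
                columns.append(name)

    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns + ["geometry"], lineterminator="\n")
    writer.writeheader()
    for feature in features:
        row = dict(feature.get("properties") or {})
        # Assumption: keep the geometry as a JSON string rather than WKT
        row["geometry"] = json.dumps(feature.get("geometry"))
        writer.writerow(row)
    return buffer.getvalue()
```

Properties missing from a feature are written as empty cells, so features with uneven property sets still produce a rectangular table.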

Running the Pipeline Locally

jupyter notebook DataModelling.ipynb

To view the Mage nodes, check out this directory

To convert the 3NF tables to a flat file using SQL, run the script in this directory
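
The idea behind flattening is a join from the fact table back to its lookup tables. The sketch below shows this with an in-memory SQLite database, swapped in so the example runs without a BigQuery project; the table and column names (`trip`, `payment_type`, and their fields) are hypothetical and not taken from the project's SQL script.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical normalized tables: one fact table and one lookup table
conn.executescript("""
CREATE TABLE payment_type (
    payment_type_id   INTEGER PRIMARY KEY,
    payment_type_name TEXT
);
CREATE TABLE trip (
    trip_id         INTEGER PRIMARY KEY,
    payment_type_id INTEGER REFERENCES payment_type(payment_type_id),
    fare_amount     REAL
);
INSERT INTO payment_type VALUES (1, 'Credit card'), (2, 'Cash');
INSERT INTO trip VALUES (1, 1, 12.5), (2, 2, 7.0), (3, 1, 30.0);
""")

# Join the lookup table back onto the fact table to get one flat row per trip
flat_rows = conn.execute("""
SELECT t.trip_id, p.payment_type_name, t.fare_amount
FROM trip AS t
JOIN payment_type AS p ON p.payment_type_id = t.payment_type_id
ORDER BY t.trip_id
""").fetchall()

conn.close()
```

A real schema would likely join several lookup tables in the same query, one `JOIN` per dimension.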

Technologies Used

Looker Studio Dashboard