In this project we build an ELT data pipeline. Before we start, let's define the abbreviation:
- E: Extract, refers to the process of getting data from the source.
- T: Transform, refers to the process of transforming the raw data from the source (e.g. joins with other tables, group by, column mapping, denormalizing, lookups on an external database, machine learning modeling, etc.).
- L: Load, refers to the process of loading the data into a table to be used.
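To make the load-before-transform order concrete, here is a minimal, self-contained sketch of the E→L→T flow in Python. It uses an in-memory SQLite database as a stand-in for the real warehouse, and the table and column names are illustrative, not taken from the project:

```python
import csv
import io
import sqlite3

# E - extract: read raw rows from the source (a CSV string stands in for a file).
raw_csv = "station,vehicles\nS1,10\nS1,5\nS2,7\n"
rows = list(csv.DictReader(io.StringIO(raw_csv)))

# L - load: write the rows into the warehouse untransformed.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE raw_traffic (station TEXT, vehicles INTEGER)")
db.executemany("INSERT INTO raw_traffic VALUES (:station, :vehicles)", rows)

# T - transform: aggregate inside the warehouse with SQL, after loading.
totals = db.execute(
    "SELECT station, SUM(vehicles) FROM raw_traffic "
    "GROUP BY station ORDER BY station"
).fetchall()
print(totals)  # [('S1', 15), ('S2', 7)]
```

Note that the transformation happens inside the database after the raw data has landed, which is exactly what distinguishes ELT from ETL.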
ELT is mostly used when we don't know in advance which transformations the data will need, so the raw data is loaded first and transformed later. The main tools used in this project are:
- SQL
- dbt
- Redash
The data used is provided at https://anson.ucdavis.edu/~clarkf/; it contains stations and the traffic movement at those stations over time.
- We will load the data from the different CSVs into our MySQL database with the help of Airflow.
- With dbt we will build models that perform transformations on the data.
- We will display the queried columns on Redash.
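As an illustration of the kind of transformation a dbt model could express on this data, the sketch below joins a stations table to traffic counts and aggregates per station. It runs against SQLite purely so the example is self-contained (the real pipeline targets MySQL), and every table and column name here is made up for demonstration:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical raw tables, loosely mirroring "stations" and "traffic movement".
db.executescript("""
CREATE TABLE stations (station_id TEXT, name TEXT);
CREATE TABLE traffic (station_id TEXT, observed_at TEXT, vehicles INTEGER);
INSERT INTO stations VALUES ('S1', 'North Bridge'), ('S2', 'East Gate');
INSERT INTO traffic VALUES
  ('S1', '2020-01-01', 12),
  ('S1', '2020-01-02', 8),
  ('S2', '2020-01-01', 5);
""")

# A dbt model is essentially a SELECT statement that dbt materializes
# as a table or view in the warehouse.
model_sql = """
SELECT s.name, SUM(t.vehicles) AS total_vehicles
FROM traffic t
JOIN stations s ON s.station_id = t.station_id
GROUP BY s.name
ORDER BY s.name
"""
db.execute("CREATE TABLE station_totals AS " + model_sql)
result = db.execute("SELECT * FROM station_totals").fetchall()
print(result)  # [('East Gate', 5), ('North Bridge', 20)]
```

The materialized `station_totals` table is the kind of output Redash would then query and chart.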
- MySQL Connector/Python DDL example: https://dev.mysql.com/doc/connector-python/en/connector-python-example-ddl.html
- Learn more about dbt in the docs
- Check out Discourse for commonly asked questions and answers
- Join the chat on Slack for live discussions and support
- Find dbt events near you
- Check out the blog for the latest news on dbt's development and best practices
- Building a dashboard with self-hosted Redash: https://medium.com/@ikishan/creating-a-new-age-dashboard-with-self-hosted-open-source-redash-41e91434390