A project that creates a scalable and robust ELT data pipeline leveraging PostgreSQL, dbt, orchestration with Apache Airflow, and data visualization with Redash and Superset



Traffic Data ELT

ELT pipeline using PostgreSQL, Airflow, DBT, Redash and Superset.

· Report Bug · Request Feature

Table of Contents
  1. About The Project
  2. Getting Started
  3. Usage
  4. Roadmap
  5. Contributing
  6. License
  7. Contact
  8. Acknowledgements

About The Project

The objective of this project was to migrate the ELT pipeline developed for the week 11 challenge (built with MySQL, dbt, Apache Airflow, and Redash) to a more scalable and robust ELT pipeline. This was accomplished by swapping the two main components: the MySQL data warehouse for PostgreSQL, and the Redash dashboard for Superset.

Built With

  • PostgreSQL
  • dbt (data build tool)
  • Apache Airflow
  • Redash
  • Superset
  • Adminer
  • Docker / docker-compose

Getting Started

Installation

  1. Clone the repo
    git clone https://github.com/ProgrammingOperative/traffic_data_etl
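
With the repo cloned, the services can be brought up with Docker Compose. This is a sketch under the assumption that the repo root ships a `docker-compose.yml` defining the services (the Redash step in the Usage section suggests it does):

```shell
# Sketch: start the stack after cloning. Assumes a docker-compose.yml at the
# repo root defines the Postgres, Adminer, Airflow, Redash and Superset services.
cd traffic_data_etl
docker-compose up -d     # start all services in the background
docker-compose ps        # verify the containers are running
```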

Usage

Adminer:

Adminer (formerly phpMinAdmin) is a full-featured database management tool written in PHP. It is used here to access the MySQL and Postgres databases.

  • Postgres:
    Navigate to `http://localhost:8080/` in the browser
    Use the `postgres-dbt` server
    Use the `testdb` database
    Use `dbtuser` for the username
    Use `pssd` for the password
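
For command-line access with the same credentials, a libpq connection URI can be assembled as follows. The host (`postgres-dbt`, taken from the Adminer server name) and port (Postgres's default 5432) are assumptions, not confirmed by the repo:

```shell
# Build a connection URI from the credentials listed above.
# Host "postgres-dbt" and port 5432 are assumptions, not taken from the repo.
PGURI="postgresql://dbtuser:pssd@postgres-dbt:5432/testdb"
echo "$PGURI"
# psql "$PGURI"    # uncomment to connect from inside the compose network
```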

Airflow:

Airflow is used for orchestration and automation.

Navigate to `http://localhost:8080/` in the browser
Use `admin` for the username
Use `admin` for the password

DBT:

DBT is used for cleaning and transforming the data in the warehouse.

  • Airflow is used for automation of running and testing dbt models
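
The automation boils down to Airflow invoking the standard dbt CLI on a schedule (for example from a BashOperator task). A hedged sketch of the commands involved; the profiles path is an illustrative placeholder, not taken from the repo:

```shell
# The dbt commands an Airflow task would run on a schedule.
# The --profiles-dir path below is an illustrative placeholder.
dbt run  --profiles-dir /path/to/profiles    # build the models in Postgres
dbt test --profiles-dir /path/to/profiles    # run tests against the built models
```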

Redash

Open a terminal and execute `docker-compose run --rm server create_db`
Using Adminer, create a user and grant read access
Navigate to `http://localhost:5000/` in the browser
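
The "create a user and grant read access" step can also be done from `psql` instead of Adminer. Everything below (user name, password, schema) is an illustrative assumption, not taken from the repo:

```shell
# Hedged sketch: create a read-only user for Redash. All identifiers here
# are illustrative placeholders.
psql "postgresql://dbtuser:pssd@postgres-dbt:5432/testdb" <<'SQL'
CREATE USER redash_reader WITH PASSWORD 'change_me';
GRANT CONNECT ON DATABASE testdb TO redash_reader;
GRANT USAGE ON SCHEMA public TO redash_reader;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO redash_reader;
SQL
```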

Superset

  • Navigate to `http://localhost:8088/` in the browser to access Superset

Roadmap

See the open issues for a list of proposed features (and known issues).

Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

  1. Fork the Project
  2. Create your Feature Branch (git checkout -b feature/AmazingFeature)
  3. Commit your Changes (git commit -m 'Add some AmazingFeature')
  4. Push to the Branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

License

Distributed under the MIT License. See LICENSE for more information.

Contact

Titus Wachira - wachura11t@gmail.com

Project Link: https://github.com/ProgrammingOperative/Migrate_traffic_data

Acknowledgements
