
ETL streaming pipeline in Python: generates a constant stream of fake web server logs, analyzes them, and stores them in a PostgreSQL database.


vicmar57/Web-server-logs-ETL-pipeline


Web server logs ETL pipeline

An ETL streaming pipeline written in Python. It generates a constant stream of fake web server logs, analyzes them, and stores them in a PostgreSQL database.

Pipeline illustration: (image not included)

Installation

Faker is required to generate the fake log data (pip install faker).

Usage

  1. Run log_generator to generate a stream of logs, which is written to log_a.txt and log_b.txt.
  2. Run store_logs to parse the log data and store it in a PostgreSQL database.
  3. Run count_visitors and count_browsers to generate analytics.
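The parsing and analytics steps above can be sketched roughly as follows. This is an illustrative sketch, not the repository's code: the log format (nginx-style combined format), the regular expression, the browser list, and the function bodies are assumptions, and the functions operate on in-memory records rather than on rows read from PostgreSQL.

```python
import re
from collections import Counter, defaultdict
from datetime import datetime

# Assumed line format: ip - user [time] "request" status size "referer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    return record


def count_visitors(records):
    """Count unique IP addresses per calendar day."""
    ips_per_day = defaultdict(set)
    for record in records:
        ips_per_day[record["time"].date().isoformat()].add(record["ip"])
    return {day: len(ips) for day, ips in ips_per_day.items()}


def count_browsers(records):
    """Count requests per browser using a crude substring match on the user agent."""
    # Chrome is checked before Safari because Chrome user agents also mention Safari.
    known = ["Firefox", "Chrome", "Safari", "Opera", "MSIE"]
    counts = Counter()
    for record in records:
        browser = next((b for b in known if b in record["agent"]), "Other")
        counts[browser] += 1
    return counts
```

In the actual pipeline the parsed records are stored in PostgreSQL between these steps; that part is omitted here because the table layout is not described in this README.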

Credits

Credit to Vik Paruchuri for his tutorial on data pipelines with Python (https://www.dataquest.io/blog/data-pipelines-tutorial/).
