ofili/README.md

Hi there, I'm Lewis 👋


  • 📫 How to reach me: LinkedIn
  • ⚡ Fun fact: Two of my favorite books are A Billion Wicked Thoughts by Ogi Ogas & Sai Gaddam, and How Will You Measure Your Life by Clayton Christensen!
  • 📚 I'm currently reading Streaming Data by Andrew Psaltis, Designing Cloud Data Platforms by Danil Zburivsky & Lynda Partner, and Building the Data Lakehouse by Bill Inmon

Pinned

  1. pyspark-template Public

    A Structured Streaming app that watches a local folder, reads new files as stream data as they are added, applies all the operations to the new data and, finally, writes the re…

    Python

  2. data_pipeline_with_airflow Public

    This project builds a data pipeline that ingests Sparkify's music data into an AWS Redshift Data Warehouse. The ETL pipeline will be run on an hourly basis, scheduled using Airflow.

    Python

  3. data-pipeline-with-gcp Public

    This project implements a data ingestion and processing pipeline to collect, store and process time-series data. The pipeline consists of a publisher, a message queue (Pub/Sub), a consumer, a data …

    Python 1

  4. nyc-taxi-data Public

    This ETL pipeline extracts NYC Taxi Trip Data and integrates it with Taxi Zone Lookup Data to create a dataset that can be used for descriptive and predictive analysis. For example, to predict the num…

    Jupyter Notebook

  5. data-lake Public

    This project builds an ETL pipeline for a data lake hosted on S3. We will load data from S3, process the data into analytics tables using Spark, and load them back into S3. We will deploy this Spar…

    Python