ETL Pipeline for processing scraped web data and preparation for loading
Updated Sep 6, 2018 - Jupyter Notebook
Data modeling with Postgres and ETL pipeline using Python.
Complete data platform that performs sentiment analysis on tweets. Built using Cassandra, Kafka, Spark, Node, and React.
A self-paced learning sandbox for rocking a data pipeline with Python.
An ETL pipeline built with PySpark that reads a supermarket_sales CSV file and stores the result in a PostgreSQL DB
Create an Apache Cassandra DB
Data Tweak is a simplified, lightweight ETL framework based on Apache Spark.
From data gathering to productionizing LLMs using LLMOps good practices.
Data warehouse and data modelling in AWS using S3 and Redshift
Extract, Transform, and Load Data: extract from Wikipedia JSON, transform using a pandas DataFrame in Python, then load into a PostgreSQL database
This project classifies messages sent during a natural disaster, which can help countries respond to it. It uses NLP and ML pipelines.
DATA SCIENCE PORTFOLIO & BLOG
An ETL project that loads Google stock data into Cassandra and runs some regression on it.
Udacity Data Engineering Nanodegree Program - My submission of Project: Data Modeling with Postgres
Data Analysis for Amazon Product Reviews
ETL pipeline that extracts their data from S3, stages them in Redshift, and transforms data into a set of dimensional tables for their analytics team to continue finding insights in what songs their users are listening to. Then we will test the database and ETL pipeline by running queries given to us by the analytics team from Sparkify and compa…
Dockerized Data Pipeline that analyzes the sentiment of tweets and a Slack Bot that publishes selected tweets
This was the first project that involved creating an automated data pipeline. The app took in data from CSV & JSON files, and automatically cleaned, formatted, and stored the data using Python (Pandas) and PostgreSQL.