sparkify

Here are 12 public repositories matching this topic...

Guli-Y / SparkifyRedshift

An ETL pipeline that extracts data from S3, stages it on Redshift, and transforms it into fact and dimension tables for song play analysis

etl s3 redshift sparkify

Updated May 21, 2021
Python

Mcamin / User-Churn-Prediction

Data Analysis in Spark to Identify Customer Churn for a fictional music service.

python pyspark tuning logistic-regression support-vector-machines churn-prediction gradient-boosting sparkify

Updated Nov 25, 2019
Jupyter Notebook

cdumen / Sparkify_Churn_Prediction

Sparkify project for predicting customer loyalty.

udacity python3 churn-prediction sparkify

Updated Nov 3, 2019
HTML

pratikwatwani / ETL-pipeline-for-Sparkify

An ETL model designed using PostgreSQL for the Sparkify database 🗄, modeling user activity data to create a database and ETL pipeline 🔀 for a music streaming app 🎼.

database etl postgresql data-modeling datamodel etl-pipeline sparkify

Updated Jun 2, 2020
Jupyter Notebook

fpcarneiro / data-lake

Udacity Data Engineer Nanodegree: Project Data Lake

udacity spark data-lake data-engineer etl-pipeline sparkify

Updated Aug 21, 2019
Python

fpcarneiro / Data-Modeling-with-Cassandra

Project: Data Modeling with Cassandra

udacity cassandra etl-pipeline sparkify

Updated May 19, 2019
Jupyter Notebook

Guli-Y / Sparkify-s3-Spark-s3

ETL script that reads data from S3, processes it with Spark, and loads it back to S3 for the data analysis team

emr spark etl s3 sparkify

Updated May 21, 2021
Python

alessiococchieri / BDA-project-sparkify

This Git repo showcases my analysis of the Sparkify dataset with PySpark on Apache Spark in cluster mode and JupyterLab on Docker. The goal was to identify at-risk customers and develop retention strategies. The analysis tested multiple machine learning models and uncovered insights into customer behavior and churn patterns.

machine-learning big-data spark apache-spark pyspark churn-prediction big-data-analytics big-data-processing churn-analysis sparkify

Updated Feb 15, 2023
Jupyter Notebook

fpcarneiro / Data-Warehouse

Students will build an ETL pipeline that extracts data from S3, stages it in Redshift, and transforms it into a set of dimensional tables for their analytics team.

udacity redshift data-engineer etl-pipeline sparkify data-warehouses