Code for "Efficient Data Processing in Spark" Course
Updated May 15, 2024 - Python
Repository of notebooks and related collateral used in the Databricks Demo Hub, showing how to use Databricks, Delta Lake, MLflow, and more.
Pyspark Notebook With Docker
The project aims to process Formula 1 racing data, create an automated data pipeline, and make the data available for presentation and analysis purposes.
Continuous delivery tool for PySpark notebook-based jobs on Databricks
Loading different types of dataset files using Flume and PySpark
PySpark RDD, DataFrame, and Dataset examples in Python
Automate Amazon EMR clusters with AWS Lambda for streamlined, scalable data-processing workflows (LambdaEMR Automator).
An anime recommendation engine, built with PySpark, that recommends titles based on a given anime or a given user.
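The core of a title-based recommender of this kind is an item-similarity lookup. Below is a minimal plain-Python sketch using cosine similarity over hypothetical genre-weight vectors (the titles, feature values, and `recommend` helper are illustrative assumptions, not the repository's actual implementation, which runs on PySpark):

```python
from math import sqrt

# Hypothetical anime feature vectors (e.g., genre weights) -- illustrative data only
anime_features = {
    "Naruto":    [1.0, 0.8, 0.1],   # action, adventure, romance
    "One Piece": [0.9, 1.0, 0.0],
    "Your Name": [0.1, 0.2, 1.0],
}

def cosine_similarity(a, b):
    # Cosine of the angle between two feature vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def recommend(title, k=1):
    # Rank all other titles by similarity to the given one, return the top k
    target = anime_features[title]
    scores = [(other, cosine_similarity(target, feats))
              for other, feats in anime_features.items() if other != title]
    return [name for name, _ in sorted(scores, key=lambda s: s[1], reverse=True)[:k]]

print(recommend("Naruto"))  # the action/adventure-heavy title ranks first
```

User-based recommendation works the same way, with user rating vectors in place of genre vectors.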
A simple PySpark tool that compares new data to historical records and tags rows as duplicate or NULL. Designed by a team of interns in Jupyter Notebook on Microsoft Fabric as a practice exercise within Lexmark Research and Development Corporation's Digital Transformation program.
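The tagging logic such a tool applies can be sketched in a few lines. This is a plain-Python illustration under assumed rules (exact-match rows against history are "duplicate", rows with a missing field are "NULL"); the actual tool does the equivalent with PySpark DataFrame joins, and the records and `tag_row` helper here are hypothetical:

```python
# Historical records and a batch of new rows -- illustrative data only
historical = [
    {"id": 1, "name": "Alice"},
    {"id": 2, "name": "Bob"},
]
new_rows = [
    {"id": 2, "name": "Bob"},   # already in history -> duplicate
    {"id": 3, "name": None},    # missing value     -> NULL
    {"id": 4, "name": "Carol"}, # genuinely new     -> ok
]

# Hashable fingerprints of the historical rows for O(1) membership checks
historical_keys = {tuple(sorted(row.items())) for row in historical}

def tag_row(row):
    # Assumed rule order: history duplicates first, then missing values
    if tuple(sorted(row.items())) in historical_keys:
        return "duplicate"
    if any(value is None for value in row.values()):
        return "NULL"
    return "ok"

tags = [tag_row(row) for row in new_rows]
print(tags)  # ['duplicate', 'NULL', 'ok']
```

In PySpark the duplicate check would typically be a left anti-join (or a join with a marker column) against the historical table rather than an in-memory set.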
spark247-jupyter-dockerized
Scaling sentiment analysis with AWS Glue and Amazon Comprehend.