pyspark-notebook

Here are 209 public repositories matching this topic...

Leveraging NYC Open Data, this repository contains Databricks notebooks for analyzing motor vehicle collisions. We perform EDA, spatial clustering, and predictive modeling on collision, vehicle, and person datasets to understand accident trends and predict potential risks.

  • Updated Feb 5, 2025
  • Jupyter Notebook

This project demonstrates how to perform Exploratory Data Analysis (EDA) on the Netflix dataset using PySpark in a Jupyter Notebook environment. It covers setting up Spark, loading the dataset, performing basic data cleaning, and visualizing the results. Everything runs in a Docker container.

  • Updated Dec 13, 2024
  • Jupyter Notebook
