Apache Spark & Python (PySpark) tutorials for Big Data Analysis and Machine Learning as IPython / Jupyter notebooks
Updated Mar 16, 2024 - Jupyter Notebook
Jupyter notebooks for pyspark tutorials given at University
Collection of Databricks and Jupyter Notebooks
PySpark notebook with Docker
JupyterLab extension that enables monitoring launched Apache Spark jobs from within a notebook
My notebook on using Python with Jupyter Notebook, PySpark, etc.
Repository of notebooks and related collateral used in the Databricks Demo Hub, showing how to use Databricks, Delta Lake, MLflow, and more.
Blog Post Notebooks
Zeppelin Notebooks for use on AWS EMR with and without using Zepl
Analytics and ML notebooks
Implementation of Spark code in Jupyter notebooks. Topics include RDDs and DataFrames, exploratory data analysis (EDA), handling multiple DataFrames, visualization, and machine learning.
A collection of data analysis projects done using PySpark via Jupyter notebooks.
PySpark notebooks to learn Apache Spark (WIP)
This repo contains my learning and practice notebooks on Spark using PySpark (the Python language API for Spark). The notebooks can be used as template code for most ML algorithms and built upon for more complex problems.