Updated Apr 30, 2018 - Shell
Apache Spark
Apache Spark is an open source distributed general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
Here are 177 public repositories matching this topic...
A container image of a Jupyter Notebook development environment with Anaconda3, Python 2.7, and some other runtimes and packages.
Updated Feb 5, 2021 - Python
PySpark and Spark (my notes and all practice notebooks)
Updated Jan 9, 2023 - Jupyter Notebook
This notebook contains detailed code for Spark, machine learning, and Databricks.
Updated Mar 15, 2023 - Jupyter Notebook
PySpark notebooks
Updated May 23, 2018 - Jupyter Notebook
Big Data Management related Zeppelin notebooks
Updated Jul 20, 2022 - Java
Updated Apr 2, 2018 - Scala
Repository containing the entire data engineering project built on Databricks, connecting to Redshift on AWS.
Updated Mar 28, 2022 - Jupyter Notebook
systemctl for Spark and Jupyter-notebook
Updated Dec 30, 2021
This is a study project. I get analytics/ML examples from Kaggle and use different technologies to re-implement them.
Updated May 25, 2021 - Jupyter Notebook
Updated Apr 26, 2022
Machine Learning notebooks with PySpark
Updated May 13, 2022 - Jupyter Notebook
Interactive Notebooks that support the book
Updated Jun 18, 2019 - Jupyter Notebook
Contains notebooks with solutions for data preparation and for implementing the Medallion Architecture and Delta Lake storage.
Updated Dec 31, 2022 - Python
Created by Matei Zaharia
Released May 26, 2014
- Followers: 416
- Repository: apache/spark
- Website: spark.apache.org
- Wikipedia