Scheduling Big Data Workloads and Data Pipelines in the Cloud with pyDag
Updated Sep 19, 2022 · Python
Run Jupyter Notebooks (and store data) on Google Cloud Platform.
An educational project that builds an end-to-end pipeline for near real-time and batch processing of data, later used for visualisation and a machine learning model.
Data Workflows with GCP Dataproc, Apache Airflow and Apache Spark
A PySpark job that runs on a Dataproc cluster and loads data from Cloud Storage into a BigQuery table.
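A job like this is typically a small PySpark script using the spark-bigquery connector. Below is a minimal sketch, not the listed repository's actual code: the function name, CSV input format, and bucket/table parameters are illustrative assumptions.

```python
def load_gcs_to_bigquery(gcs_path, bq_table, temp_bucket):
    """Read a CSV from Cloud Storage and append it to a BigQuery table.

    Hypothetical sketch: assumes the cluster has the spark-bigquery
    connector available (it ships with recent Dataproc images).
    """
    # Lazy import so the module can be imported/inspected without Spark.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("gcs-to-bq").getOrCreate()

    # Load the source data from Cloud Storage, e.g. "gs://my-bucket/data.csv".
    df = spark.read.option("header", True).csv(gcs_path)

    # Write to BigQuery via the connector; it stages data in a temporary
    # GCS bucket before loading it into the target table.
    (df.write.format("bigquery")
       .option("table", bq_table)          # e.g. "my_dataset.my_table"
       .option("temporaryGcsBucket", temp_bucket)
       .mode("append")
       .save())
```

On Dataproc this script would be submitted with `gcloud dataproc jobs submit pyspark`; the cluster's service account needs read access to the bucket and write access to the BigQuery dataset.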