Dataproc customisable HA cluster (Debian 9) with ZooKeeper, Kafka, BigQuery, and other tools/jobs, provisioned with Terraform
Updated Feb 29, 2020 - HCL
E-commerce GCP streaming pipeline ― Cloud Storage, Compute Engine, Pub/Sub, Dataflow, Apache Beam, BigQuery, and Tableau; GCP batch pipeline ― Cloud Storage, Dataproc, PySpark, Cloud Spanner, and Tableau
Monte Carlo stock simulation using Apache Spark.
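A minimal pure-Python sketch of what such a simulation computes (terminal prices under geometric Brownian motion); the repository presumably parallelises the path loop with Spark, and every parameter value here is illustrative, not taken from that project:

```python
import math
import random

def simulate_paths(s0, mu, sigma, days, n_paths, seed=42):
    """Monte Carlo simulation of stock prices under geometric Brownian motion."""
    rng = random.Random(seed)
    dt = 1 / 252  # one trading day as a fraction of a year
    finals = []
    for _ in range(n_paths):
        price = s0
        for _ in range(days):
            z = rng.gauss(0, 1)  # standard normal shock
            price *= math.exp((mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z)
        finals.append(price)
    return finals

# One year of daily steps for 1000 paths; the sample mean should sit
# near s0 * exp(mu) ~= 105 for these parameters.
prices = simulate_paths(s0=100.0, mu=0.05, sigma=0.2, days=252, n_paths=1000)
mean_price = sum(prices) / len(prices)
```

In a Spark version, each path (or batch of paths) would typically become one task, e.g. via `sc.parallelize(range(n_paths)).map(...)`, since the paths are independent.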
Implements a work queue for Dataproc Workflow Template executions
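A hedged sketch of the work-queue pattern such a project might use: worker threads drain a thread-safe queue of template IDs and hand each to a pluggable `submit` callable. The real project would presumably pass a wrapper around the Dataproc client's workflow-template instantiation call; here `submit` is injected so the queue logic stands alone:

```python
import queue
import threading

def run_work_queue(template_ids, submit, num_workers=4):
    """Drain a queue of workflow-template IDs, calling `submit` for each.

    `submit` is a caller-supplied function (hypothetical here; in practice
    it could wrap a Dataproc workflow-template instantiation request).
    """
    q = queue.Queue()
    for tid in template_ids:
        q.put(tid)

    results = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                tid = q.get_nowait()
            except queue.Empty:
                return  # queue drained, worker exits
            res = submit(tid)
            with lock:
                results.append(res)
            q.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results

# Stub submit function standing in for the real API call.
executed = run_work_queue(["t1", "t2", "t3"], submit=lambda tid: f"ran:{tid}")
```

Bounding `num_workers` caps how many template executions run concurrently, which is the usual reason to queue them rather than fire them all at once.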
Project for Cloud Computing course (A.Y. 2018/2019)
Data is fetched from StackExchange, transformed with Pig, then stored and queried in Hive. Additionally, the TF-IDF scores of the top 10 users are computed in Hive.
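For reference, TF-IDF weights a term by how often it appears in one document relative to how many documents contain it. A compact Python sketch of the computation the Hive query would express (the sample "documents" are illustrative, not from the StackExchange dataset):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF scores for a dict of {doc_id: list_of_terms}."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))
    scores = {}
    for doc_id, terms in docs.items():
        tf = Counter(terms)
        total = len(terms)
        scores[doc_id] = {
            term: (count / total) * math.log(n / df[term])
            for term, count in tf.items()
        }
    return scores

docs = {
    "user1": ["spark", "hive", "spark"],
    "user2": ["hive", "pig"],
}
scores = tf_idf(docs)
```

Note that a term appearing in every document (here `hive`) gets an IDF of log(1) = 0, so its TF-IDF score vanishes regardless of its frequency.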
Apache Spark sandbox on GCP and Amazon EMR.
First project for Big Data course held at Roma Tre University
Hadoop on Google Dataproc, a DIO study project
Project for the Scalable and Cloud Programming course (2018/19) at UNIBO.
Process large amounts of data and implement complex data analyses using Spark. The dataset, made available by Google, covers a cluster of 12,500 machines and the activity on that cluster over 29 days.
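A toy sketch of the groupBy-style aggregation such a trace analysis typically starts with, counting events per machine. In Spark this would be a `map`/`reduceByKey` (or a DataFrame `groupBy().count()`) over the real trace; the event tuples and field names below are illustrative only:

```python
from collections import defaultdict

# Toy records shaped loosely like cluster-trace task events:
# (machine_id, day, event_type) -- fields are hypothetical stand-ins.
events = [
    (1, 0, "SUBMIT"),
    (1, 0, "FINISH"),
    (2, 1, "SUBMIT"),
    (1, 2, "FAIL"),
]

def events_per_machine(events):
    """Count how many events each machine produced across the trace."""
    counts = defaultdict(int)
    for machine_id, _day, _event_type in events:
        counts[machine_id] += 1
    return dict(counts)

counts = events_per_machine(events)
```

At the scale described (12,500 machines, 29 days), the point of Spark is that this same per-key aggregation runs partitioned across the cluster instead of in one process.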