data-pipeline

Here are 17 public repositories matching this topic...

sohamray19 / KafkaSparkStreamingPOCinScala

POC in Apache Kafka and Spark Streaming using Avro serialization.

serialization kafka big-data spark avro apache pubsub spark-streaming data-pipeline

Updated Sep 6, 2018
Scala

princebhatt9588 / Real_Time_Twitter_Trends_Analytics

A cutting-edge big data initiative aimed at creating a real-time data pipeline to analyze the popularity and sentiments of trending topics on Twitter.

mongodb bigdata zookeeper spark-streaming business-intelligence realtime-database tableau twitter-sentiment-analysis kafka-streams data-pipeline apache-drill

Updated Jul 24, 2023
Scala

GameTuner / collector

GameTuner Scala Stream Collector is a project for collecting raw events from trackers.

data scala gaming analytics snowplow collector data-pipeline big-data-analytics gaming-analytics gametuner

Updated Feb 6, 2024
Scala

GameTuner / enricher

GameTuner Enricher application for processing raw events

data scala gaming analytics snowplow data-pipeline enrich gaming-analytics gametuner

Updated Feb 6, 2024
Scala

PATRICIAJUNQUEIRA / pipeline-databricks-azure

Data pipeline on Azure for a real-estate dataset, with a three-layer structure (unbound, silver, gold) and an automatic hourly trigger for consistent updates.

real-estate scala azure databricks data-pipeline azure-data-factory data-engineering-pipeline

Updated Feb 1, 2024
Scala

opensnowcat / opensnowcat-rdb-loader

OpenSnowcat Relational Database Loader (Apache 2.0 License)

analytics snowplow data-engineering event-pipeline data-pipeline

Updated May 26, 2024
Scala

stephen29xie / tweet-streaming-data-pipeline

Real-time streaming data pipeline for Twitter Tweets

docker scala kafka mongodb twitter4j data-pipeline spark-structured-streaming

Updated Jan 31, 2022
Scala

JHLeeeMe / fake-data-pipeline

Data Generators -> Kafka -> Spark Streaming -> PostgreSQL -> Grafana

docker scala kafka spark docker-compose grafana postgresql data-engineering data-pipeline

Updated Jan 31, 2023
Scala

opensnowcat / opensnowcat-enrich

OpenSnowcat Enricher (Apache 2.0 License)

analytics snowplow data-engineering event-pipeline data-pipeline

Updated May 27, 2024
Scala

techmonad / spark-data-pipeline

This project describes how to write a full ETL data pipeline using Spark.

elasticsearch scala kafka spark sbt data-pipeline

Updated Oct 15, 2022
Scala

opensnowcat / opensnowcat-collector

OpenSnowcat Collector, an open source fork of Snowplow (Apache 2.0 License)

analytics snowplow data-engineering event-pipeline data-pipeline

Updated May 27, 2024
Scala

akshitvjain / realtime-twitter-trends-analytics

A big data project to develop a real-time data pipeline for analyzing the popularity and sentiments of trending topics on Twitter.

Updated Jun 21, 2022
Scala

snowplow / enrich

Snowplow Enrichment jobs and library

etl analytics snowplow data-pipeline

Updated May 27, 2024
Scala

AbsaOSS / pramen

Resilient data pipeline framework running on Apache Spark

scala big-data spark etl hacktoberfest data-pipeline

Updated May 31, 2024
Scala

NebulaGraph Exchange is an Apache Spark application to parse data from different sources to NebulaGraph in a distributed environment. It supports both batch and streaming data in various formats and sources including other Graph Databases, RDBMS, Data warehouses, NoSQL, Message Bus, File systems, etc.

spark etl data-import graph-database hacktoberfest data-pipeline nebulagraph