POC in Apache Kafka and Spark Streaming using Avro serialization.
Updated Sep 6, 2018 - Scala
A cutting-edge big data initiative aimed at creating a real-time data pipeline to analyze the popularity and sentiments of trending topics on Twitter.
GameTuner Scala Stream Collector is a project for collecting raw events from trackers
GameTuner Enricher is an application for processing raw events
Data pipeline on Azure for a real-estate dataset, with a three-layer structure (unbound, silver, gold) and an automatic hourly trigger for consistent updates.
OpenSnowcat Relational Database Loader (Apache 2.0 License)
Real-time streaming data pipeline for Twitter Tweets
Data Generators -> Kafka -> Spark Streaming -> PostgreSQL -> Grafana
OpenSnowcat Enricher (Apache 2.0 License)
This project describes how to write a full ETL data pipeline using Spark.
OpenSnowcat Collector, an open source fork of Snowplow (Apache 2.0 License)
A big data project to develop a real-time data pipeline for analyzing the popularity and sentiments of trending topics on Twitter.
Snowplow Enrichment jobs and library
Resilient data pipeline framework running on Apache Spark
NebulaGraph Exchange is an Apache Spark application to parse data from different sources to NebulaGraph in a distributed environment. It supports both batch and streaming data in various formats and sources including other Graph Databases, RDBMS, Data warehouses, NoSQL, Message Bus, File systems, etc.
Model complex data transformation pipelines easily
The enterprise-grade behavioral data engine (web, mobile, server-side, webhooks), running cloud-natively on AWS and GCP