# data-pipeline

Here are 18 public repositories matching this topic...

sohamray19 / KafkaSparkStreamingPOCinScala

POC in Apache Kafka and Spark Streaming using Avro serialization.

serialization kafka big-data spark avro apache pubsub spark-streaming data-pipeline

Updated Sep 6, 2018
Scala

stephen29xie / tweet-streaming-data-pipeline

Real-time streaming data pipeline for Twitter Tweets

docker scala kafka mongodb twitter4j data-pipeline spark-structured-streaming

Updated Jan 31, 2022
Scala

akshitvjain / realtime-twitter-trends-analytics

A big data project to develop a real-time data pipeline for analyzing the popularity and sentiments of trending topics on Twitter.

Updated Jun 21, 2022
Scala

itkpi / trembita

Model complex data transformation pipelines easily

cats functional spark cassandra slf4j dsl lazy parallel collections finite-state-machine typesafe akka-streams phantom infinispan data-pipeline log4j2 typelevel-programming

Updated Sep 23, 2022
Scala

techmonad / spark-data-pipeline

This project describes how to write a full ETL data pipeline using Spark.

elasticsearch scala kafka spark sbt data-pipeline

Updated Oct 15, 2022
Scala

JHLeeeMe / fake-data-pipeline

Data Generators -> Kafka -> Spark Streaming -> PostgreSQL -> Grafana

docker scala kafka spark docker-compose grafana postgresql data-engineering data-pipeline

Updated Jan 31, 2023
Scala

princebhatt9588 / Real_Time_Twitter_Trends_Analytics

A cutting-edge big data initiative aimed at creating a real-time data pipeline to analyze the popularity and sentiments of trending topics on Twitter.

mongodb bigdata zookeeper spark-streaming business-intelligence realtime-database tableau twitter-sentiment-analysis kafka-streams data-pipeline apache-drill

Updated Jul 24, 2023
Scala

PATRICIAJUNQUEIRA / pipeline-databricks-azure

Data pipeline on Azure for a real-estate dataset, structured in three layers (unbound, silver, gold) with an automatic hourly trigger for consistent updates.

real-estate scala azure databricks data-pipeline azure-data-factory data-engineering-pipeline

Updated Feb 1, 2024
Scala

GameTuner / collector

GameTuner Scala Stream Collector is a project for collecting raw events from the tracker.

data scala gaming analytics snowplow collector data-pipeline big-data-analytics gaming-analytics gametuner

Updated Feb 6, 2024
Scala

GameTuner / enricher

GameTuner Enricher application for processing raw events

data scala gaming analytics snowplow data-pipeline enrich gaming-analytics gametuner

Updated Feb 6, 2024
Scala

snowplow / enrich

Snowplow Enrichment jobs and library

etl analytics snowplow data-pipeline

Updated Jun 25, 2024
Scala

vesoft-inc / nebula-exchange

NebulaGraph Exchange is an Apache Spark application to parse data from different sources to NebulaGraph in a distributed environment. It supports both batch and streaming data in various formats and sources including other Graph Databases, RDBMS, Data warehouses, NoSQL, Message Bus, File systems, etc.

spark etl data-import graph-database hacktoberfest data-pipeline nebulagraph

Updated Jul 17, 2024
Scala

opensnowcat / opensnowcat-rdb-loader

OpenSnowcat Relational Database Loader (Apache 2.0 License)

analytics snowplow data-engineering event-pipeline data-pipeline

Updated Aug 11, 2024
Scala

snowplow / snowplow

The leader in Next-Generation Customer Data Infrastructure

data analytics snowplow data-collection data-pipeline product-analytics marketing-analytics snowplow-pipeline snowplow-events

Updated Sep 2, 2024
Scala

opensnowcat / opensnowcat-enrich

OpenSnowcat Enricher (Apache 2.0 License)

analytics snowplow data-engineering event-pipeline data-pipeline

Updated Oct 21, 2024
Scala

opensnowcat / opensnowcat-collector

OpenSnowcat Collector, an open source fork of Snowplow (Apache 2.0 License)

analytics snowplow data-engineering event-pipeline data-pipeline

Updated Oct 23, 2024
Scala

AbsaOSS / pramen

Resilient data pipeline framework running on Apache Spark

scala big-data spark etl hacktoberfest data-pipeline

Updated Nov 6, 2024
Scala

starlake-ai / starlake

Declarative text-based tool for data analysts and engineers to extract, load, transform and orchestrate their data pipelines.

bigquery spark etl snowflake data-engineering hdfs data-integration redshift synapse data-pipeline

Updated Nov 6, 2024
Scala
