spark-streaming

Here are 1,017 public repositories matching this topic...

risingwavelabs / risingwave

SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.

Updated Jun 8, 2024
Rust

AlexRogalskiy / spark-patterns

🏆 Spark4You Design patterns

patterns spark ebook spark-streaming spark-sql spark-structured-streaming patterns-design

Updated Jun 8, 2024
Shell

databrickslabs / dbldatagen

Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.

python spark faker pyspark spark-streaming data-generation databricks synthetic-data datagen datagenerator deltalake datageneration delta-live-tables

Updated Jun 8, 2024
Python

MelvinCERBA / CryptoViz

Cryptocurrency monitoring system. Includes data scraping (Selenium), processing (Spark), and visualization (Grafana). Pub-sub messaging uses Kafka; processed data is persisted in PostgreSQL. Deployment with Docker Compose.

kafka spark docker-compose scraping selenium spark-streaming system-design

Updated Jun 7, 2024
Jupyter Notebook

danzipie / eth-spark-stream

kafka ethereum spark-streaming

Updated Jun 6, 2024
Python

microsoft / data-accelerator

Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience for creating, editing, and managing Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.

Updated Jun 4, 2024
C#

dotnet / spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Updated Jun 4, 2024
C#

agile-lab-dev / wasp

WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.

elasticsearch scala kafka akka spark yarn hadoop solr jdbc hbase spark-streaming hdfs parquet

Updated Jun 3, 2024
Scala

cdapio / cdap

An open source framework for building data analytic applications.

python java platform middleware spark integration dataset spark-streaming java-8 unified mapreduce cdap

Updated Jun 5, 2024
Java

GMAP / DSPBench

A suite of benchmark applications for distributed data stream processing systems.

big-data apache-spark storm data-stream bigdata evaluation stream-processing spark-streaming apache-storm apache-flink experiments big-data-analytics

Updated May 27, 2024
Java

Ashenoy64 / Reddit-Analysis

Sentiment analysis on streaming and batch data from Reddit.

docker kafka reddit spark sentiment-analysis spark-streaming

Updated May 25, 2024
Jupyter Notebook

00VALAK00 / Structured_data_streaming

Structured data streaming using Spark's Structured Streaming API, with Kafka for data ingestion and Cassandra for data storage.

python kafka spark docker-compose zookeeper spark-streaming cassandra-database

Updated May 22, 2024
Python

Shankar-Anumula / data-engineer

java scala spark spark-streaming spark-sql

Updated May 21, 2024
Scala

drisskhattabi6 / Real-Time-Twitter-Sentiment-Analysis

This repo contains a Big Data project: "Real Time Twitter Sentiment Analysis via Kafka, Spark Streaming, MongoDB and Django Dashboard".

docker django kafka big-data spark mongodb sentiment-analysis pyspark spark-streaming kafka-producer real-time-processing sentiment-classification etl-pipeline tweets-classification big-data-projects django-dashboard

Updated May 20, 2024
Jupyter Notebook

OmarNouih / Twitter-Streams

Real-Time Sentiment Analysis on Twitter Streams is a web application that categorizes tweets into sentiments such as Negative, Positive, Neutral, or Irrelevant. Built using Apache Kafka, Spark, and PySpark ML models, it offers real-time analysis capabilities.

machine-learning kafka spark streams pyspark spark-streaming kafka-streams pyspark-mllib

Updated May 20, 2024
Python

hortonworks / registry

Schema Registry

metadata kafka storm schema-registry kinesis spark-streaming schemas flink

Updated May 18, 2024
Java

trannhatnguyen2 / streaming_data_processing

Data Streaming with Debezium, Kafka, Spark Streaming, Delta Lake, and MinIO

airflow kafka minio spark-streaming debezium delta-lake

Updated May 15, 2024
Python

Mahmoud-nfz / football-big-data

This is a comprehensive solution for real-time football analytics. It uses Apache Spark running on YARN for both streaming and batch processing, Hadoop HDFS for distributed storage, Kafka for real-time data ingestion, RethinkDB for live data updates, a custom-built search engine, and Next.js for data visualization.

search-engine kafka spark hadoop nextjs rethinkdb spark-streaming hadoop-hdfs t3-stack

Updated May 14, 2024
TypeScript

LuisFalva / ophelia

Ophelia, a PySpark analytics wrapper.

spark spark-streaming dask dataframe rdd spark-mllib spark-ml ophelia ophelia-spark

Updated May 14, 2024
Python

Azure / azure-event-hubs-spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs

microsoft streaming real-time scala kafka spark apache-spark stream connector azure bigdata apache spark-streaming eventhubs ingestion continuous event-hubs databricks structured-streaming

Updated May 9, 2024
Scala
