# bigdata

Here are 155 public repositories matching this topic...

dimajix / flowman

Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.

scala sql big-data spark apache-spark hadoop etl bigdata data-engineering flowman

Updated Aug 6, 2024
Scala

scray / scray

Lambda Architecture Framework for Big Data, Spark, Versioned Data, NoSQL and SQL-Stores.

nosql bigdata lambda-architectures

Updated Aug 5, 2024
Scala

AbsaOSS / spline

Data Lineage Tracking And Visualization Solution

visualization tracking scala spark hadoop bigdata lineage

Updated Jul 29, 2024
Scala

mjakubowski84 / parquet4s

Read and write Parquet in Scala. Use Scala classes as schema. No need to start a cluster.

aws scala akka hadoop bigdata google-storage fs2 writer streams reader parquet akka-streams parquet-files

Updated Aug 6, 2024
Scala

pingcap / tispark

TiSpark is built for running Apache Spark on top of TiDB/TiKV

spark bigdata tikv tidb

Updated Jul 26, 2024
Scala

apache / incubator-livy

Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.

spark bigdata livy apachelivy

Updated Jul 25, 2024
Scala

bursasha / spark-elk-docker-datasets-visualization

Analysis of socio-economic and climatic data in the USA (1975-2020) using Apache Spark and ELK Stack 🌍

visualization docker elasticsearch kibana logstash apache-spark analysis docker-compose bigdata bash-script dataset-preprocessing

Updated Jun 23, 2024
Scala

Azure / azure-event-hubs-spark

Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs

microsoft streaming real-time scala kafka spark apache-spark stream connector azure bigdata apache spark-streaming eventhubs ingestion continuous event-hubs databricks structured-streaming

Updated Jun 11, 2024
Scala

byzer-org / byzer-lang

Byzer (formerly MLSQL): A low-code, open-source programming language for data pipelines, analytics, and AI.

machine-learning bigdata mlsql sql-like-dsl

Updated May 29, 2024
Scala

spotify / big-data-rosetta-code

Code snippets for solving common big data problems in various platforms. Inspired by Rosetta Code

scala spark bigdata scalding scio

Updated Jul 30, 2024
Scala

AbsaOSS / enceladus

Dynamic Conformance Engine

scala spark spring mongodb hadoop bigdata datalake

Updated May 16, 2024
Scala

Prashanth0205 / Minwage-Big-Data-Analysis-using-Apache-Spark

This project features Scala code powered by Apache Spark for analyzing minimum wage data, aiming to uncover trends and variations in minimum wage rates across states and over time. It encompasses data transformations, mean wage computation, inflation analysis, and comparisons with Department of Labor (DOL) reported wages.

scala spark bigdata

Updated May 13, 2024
Scala

ondergormez / BLM5127_Big_Data_Analytics

Average Temperature - Hadoop - Mapper - Reducer

java scala big-data cassandra hadoop bigdata cassandra-cql hadoop-filesystem hadoop-mapreduce hadoop-hdfs

Updated Mar 26, 2024
Scala

itsumma / spark-greenplum-connector

ITSumma Spark Greenplum Connector

spark connector bigdata greenplum

Updated Mar 26, 2024
Scala

Java-Edge / Spark-MLlib-Tutorial

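A comprehensive walkthrough of the basic algorithms in Spark MLlib, the machine learning library of the Spark big data framework, with a complete set of test files.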

machine-learning spark bigdata mllib

Updated Mar 22, 2024
Scala

mohankrishna02 / SparkSQL

This project demonstrates how to use Spark SQL to execute SQL queries on structured data in Spark, and display the results in a tabular format using the show() method.

spark apache-spark bigdata spark-sql

Updated Jan 24, 2024
Scala

SharpData / SharpETL

Write ETL using your favorite SQL dialects

scala sql spark hive etl bigdata data-warehouse flink datawarehouse spark-sql etl-framework flink-sql paimon

Updated Jan 7, 2024
Scala

yahoo / burst

BURST Behavioral Analysis System

scala big-data bigdata hacktoberfest

Updated Dec 11, 2023
Scala

datainsider-co / rocket-bi

A free, open-source, web-based self-service BI tool tailor-made for ClickHouse, Google BigQuery, MySQL, PostgreSQL, and Vertica.

mysql bigquery data dashboard etl analytics clickhouse bigdata postgresql ingestion hacktoberfest vertica bussiness-intelligence hacktoberfest2023

Updated Nov 30, 2023
Scala

lkycxb / cassandra-count

Count rows in a Cassandra table. Tested on data with row counts reaching 3 billion.

cassandra nosql bigdata

Updated Nov 29, 2023
Scala
