OpenSnowcat Relational Database Loader (Apache 2.0 License)
Updated May 26, 2024 - Scala
OpenSnowcat Collector, an open source fork of Snowplow (Apache 2.0 License)
OpenSnowcat Enricher (Apache 2.0 License)
A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly Learn-to-Rank engine
Example API implementation for Data Caterer
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
LovinData - Simplified Full Stack Data Engineering
Feathr – A scalable, unified data and AI engineering platform for enterprise
A schema-aware Scala library for data transformation
A re-implementation of Hadoop DistCP in Apache Spark
Trendyol Data Engineering Technical Case Study.
A simple Spark-powered ETL framework that just works 🍺
Flink Example
A comprehensive training program that equips developers with essential skills in the data engineering and data science life cycles, encompassing data processing, software development, ML/AI, and KPI visualization for real-world business problem-solving.
Data manipulation and reporting for Scala.
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
Huemul BigDataGovernance is a framework that runs on Spark, Hive, and HDFS. It enables a corporate single-source-of-data strategy based on Data Governance best practices. It supports tables with Primary Key and Foreign Key control when inserting and updating data through the library, null validation, the…
Scala library to validate tabular data loaded from CSV files