A low-code machine learning personalized ranking service for articles, listings, search results, and recommendations that boosts user engagement. A friendly Learn-to-Rank engine.
Feathr – A scalable, unified data and AI engineering platform for enterprise
Collection of open-source Spark tools & frameworks that have made the data engineering and data science teams at Swoop highly productive
A simple Spark-powered ETL framework that just works 🍺
Flowman is an ETL framework powered by Apache Spark. With its declarative approach, Flowman simplifies the development of complex data pipelines.
A schema-aware Scala library for data transformation
Waimak is an open-source framework that makes it easier to create complex data flows in Apache Spark.
A re-implementation of Hadoop DistCP in Apache Spark
Data manipulation and reporting for Scala.
Data Brewery is an ETL (Extract-Transform-Load) program that connects to many data sources (cloud services, databases, ...) and manages data warehouse workflows.
OpenSnowcat Collector, an open source fork of Snowplow (Apache 2.0 License)
Akka HTTP service for serving Spark machine learning models
OpenSnowcat Enricher (Apache 2.0 License)
Data Generators -> Kafka -> Spark Streaming -> PostgreSQL -> Grafana
Flink Example
Huemul BigDataGovernance is a framework that runs on Spark, Hive, and HDFS. It supports a corporate single-source-of-truth data strategy based on data governance best practices. Through the library, tables can be implemented with Primary Key and Foreign Key checks when inserting and updating data, null validation, …
Optimal distributed data deduplication and supervised learning pipeline using Apache Spark
OpenSnowcat Relational Database Loader (Apache 2.0 License)
Business Validation Testing in Spark SQL