Apache NiFi custom processors
Updated Sep 8, 2023 - Java
Generates fake data for big data projects. Can generate medical and industry datasets. File size, number of files, and number of records are configurable.
CSCI 5408: Data Management and Warehousing Analytics projects
An Android application that lets users log in (and access all their data) and connect to external distributors to get the coffee generated by a machine learning algorithm
Akka hands-on for the Distributed Data Management course at the Hasso-Plattner-Institute
A database project that my team and I worked on for an introductory Software Engineering course.
First academic big data project, implementing analysis using MapReduce and Hive
Airbyte is an open-source EL(T) platform that helps you replicate your data in your warehouses, lakes and databases.
This repository will be used to document my studies using the Apache Calcite tool.
OSS crowdsourced translation service that lets people annotate Burmese-English language-pair data to ensure quality for ML tasks
LinkedIn's previous generation Kafka to HDFS pipeline.
Configurable IoT Library for Android
HTTP API with asynchronous data processing using RabbitMQ and Spring Boot
TFE-MASTER-SIRI@IFRI.UAC
The Data Pulse pipeline processes and transforms web-scraped pageviews using Apache Beam and Google Cloud Dataflow. It reads JSON lines, parses them into PageView objects, filters for "product" post types, enriches them with country info, and writes the results to Google BigQuery. Robust logging and error handling ensure data integrity.
Code lab for Confluent Schema Registry for Kafka. [⚙️]
Extracts the Top K Common Words between 2 Text Files using Hadoop's MapReduce
A data engineering CLI for reading and writing data to and from multiple locations across multiple formats.
Apache Kafka Application Programming with Java
Clusterless is a tool for scheduling decentralized, scalable, and secure data pipelines for continuously arriving data, across clouds.