dataengineering

Here are 11 public repositories matching this topic...

This Maven Java project implements three common measures for link prediction in graphs: Common Neighbors, Jaccard Coefficient, and Adamic-Adar. It uses Apache Spark to process large graphs in a distributed environment.

  • Updated Feb 26, 2023
  • Java
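
As a rough illustration of the three measures named above, here is a minimal single-machine sketch in plain Java. It is not taken from the listed repository and does not use Spark. The class name `LinkPredictionSketch`, the adjacency-map representation, and the sample graph are all assumptions made for this example. The formulas are the standard textbook definitions for an undirected graph.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper class for illustration only; not the repository's code.
final class LinkPredictionSketch {

    // Neighbors of a node, or an empty set if the node is not in the graph.
    static Set<String> neighbors(Map<String, Set<String>> graph, String node) {
        return graph.getOrDefault(node, Set.of());
    }

    // Nodes adjacent to both u and v.
    static Set<String> shared(Map<String, Set<String>> graph, String u, String v) {
        Set<String> result = new HashSet<>(neighbors(graph, u));
        result.retainAll(neighbors(graph, v));
        return result;
    }

    // Common Neighbors: the number of shared neighbors.
    static int commonNeighbors(Map<String, Set<String>> graph, String u, String v) {
        return shared(graph, u, v).size();
    }

    // Jaccard Coefficient: shared neighbors divided by the union of both neighborhoods.
    static double jaccard(Map<String, Set<String>> graph, String u, String v) {
        Set<String> union = new HashSet<>(neighbors(graph, u));
        union.addAll(neighbors(graph, v));
        return union.isEmpty() ? 0.0 : (double) shared(graph, u, v).size() / union.size();
    }

    // Adamic-Adar: sum of 1 / ln(degree) over the shared neighbors.
    static double adamicAdar(Map<String, Set<String>> graph, String u, String v) {
        double score = 0.0;
        for (String z : shared(graph, u, v)) {
            int degree = neighbors(graph, z).size();
            // Skip degree <= 1 to avoid dividing by ln(1) = 0 on malformed input.
            if (degree > 1) {
                score += 1.0 / Math.log(degree);
            }
        }
        return score;
    }

    // Assumed sample graph with undirected edges a-c, a-d, b-c, b-d, b-e, d-e.
    static Map<String, Set<String>> sampleGraph() {
        return Map.of(
            "a", Set.of("c", "d"),
            "b", Set.of("c", "d", "e"),
            "c", Set.of("a", "b"),
            "d", Set.of("a", "b", "e"),
            "e", Set.of("b", "d"));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> graph = sampleGraph();
        System.out.println("Common Neighbors(a, b): " + commonNeighbors(graph, "a", "b"));
        System.out.println("Jaccard(a, b): " + jaccard(graph, "a", "b"));
        System.out.println("Adamic-Adar(a, b): " + adamicAdar(graph, "a", "b"));
    }
}
```

In a distributed setting the same scores would typically be computed by joining neighbor lists across partitions rather than by in-memory set operations. How the listed project does this is not stated in its description.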

This project focuses on real-time data streaming with Kinesis, using Flink for advanced processing and OpenSearch for analytics. The architecture covers the complete data lifecycle, from ingestion to actionable insights.

  • Updated May 9, 2024
  • Java

Explore essential MapReduce design patterns for big data processing! This repository includes practical implementations of patterns from the "MapReduce Design Patterns" book, complete with examples across summarization, filtering, organization, joins, and more.

  • Updated Apr 6, 2024
  • Java
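
To give a feel for the summarization category mentioned above, here is a minimal sketch of a numerical summarization (minimum, maximum, and count per key) written as explicit map, shuffle, and reduce steps in plain Java. It is not taken from the listed repository and does not use Hadoop. The class name `SummarizationSketch`, the `key,value` input format, and the method names are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of the map/shuffle/reduce flow; not the repository's code.
final class SummarizationSketch {

    // Map phase: parse one "key,value" line into a (key, value) pair.
    static Map.Entry<String, Integer> map(String line) {
        String[] parts = line.split(",");
        return Map.entry(parts[0].trim(), Integer.parseInt(parts[1].trim()));
    }

    // Shuffle phase: group all values that share a key.
    static Map<String, List<Integer>> shuffle(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: collapse one key's values into {min, max, count}.
    static int[] reduce(List<Integer> values) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int value : values) {
            min = Math.min(min, value);
            max = Math.max(max, value);
        }
        return new int[] {min, max, values.size()};
    }

    // Runs the three phases in sequence on an in-memory list of lines.
    static Map<String, int[]> run(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            pairs.add(map(line));
        }
        Map<String, int[]> summary = new TreeMap<>();
        shuffle(pairs).forEach((key, values) -> summary.put(key, reduce(values)));
        return summary;
    }

    public static void main(String[] args) {
        Map<String, int[]> summary = run(List.of("alice,3", "bob,7", "alice,9"));
        summary.forEach((key, s) ->
            System.out.println(key + " min=" + s[0] + " max=" + s[1] + " count=" + s[2]));
    }
}
```

In a real MapReduce job the framework performs the shuffle and runs mappers and reducers on separate machines. The sketch keeps everything in one process only to show how the phases fit together.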
