dataprocessing

Here are 8 public repositories matching this topic...

imehrdadmahdavi / map-reduce-inverted-index

Builds an inverted index of the words occurring in a large set of documents extracted from web pages, using Hadoop MapReduce and Google Dataproc

search-engine information-retrieval big-data hadoop clustering bigdata gcp map-reduce inverted-index mapreduce googlecloud dataprocessing dataproc

Updated Oct 28, 2019
Java

waikato-datamining / multiway-algorithms

Java library of multi-way algorithms.

java dataprocessing multiway-algorithms parafac

Updated Jun 3, 2024
Java

tirthmehta / WINGS_PROVENANCE_EXPORT

A Transformation Script for converting from the WINGS Framework to OPMW and PROV frameworks. The Transformation Script utilizes the Jena Framework in Java. Multiple aspects of the WINGS Workflow System are captured in the script including the Expanded Template. Further, it includes several tests for its validity and consistency.

java workflow-engine artificial-intelligence wings dataprocessing apache-jena transformation-script

Updated Jun 27, 2017
Java

MA-Repos / Kafka-stream-from-reddit-twitter

(Work in progress) Streams real-time data from Reddit, applies sentiment analysis to it, and stores the data in Hadoop.

machine-learning kafka apache-spark bigdata dataprocessing

Updated Sep 5, 2023
Java

BobErgot / Large-Scale-Data-Processing-Design-Patterns

Explore essential MapReduce design patterns for big data processing! This repository includes practical implementations of patterns from the "MapReduce Design Patterns" book, complete with examples across summarization, filtering, organization, joins, and more.

java hadoop bigdata datascience mapreduce designpatterns dataprocessing dataengineering cloudcomputing bigdataanalytics distributedcomputing

Updated Apr 6, 2024
Java

l3s-learnweb / interweb

Versatile API that consolidates multiple data providers into one unified interface

search rest-api dataprocessing

Updated Jun 12, 2024
Java

addingama / sid_waterpoints

Advance IT Test for Summit Institute of Development

java tdd sid dataprocessing

Updated Oct 21, 2017
Java

tirthmehta / Google-Cloud-Platform-based-Hadoop-Map-Reduce

Determines which words occur in a dataset of textbooks and counts each word's occurrences, using a Dataproc cluster on Google Cloud Platform.

java dataproc-cluster crawler4j googlecloud dataprocessing googlecloudplatform dataproc

Updated Jul 28, 2017
Java
