Creating an Inverted Index of words occurring in a large set of documents extracted from web pages using Hadoop MapReduce and Google Dataproc
Updated Oct 28, 2019 - Java
Java library of multi-way algorithms.
A Transformation Script for converting from the WINGS Framework to OPMW and PROV frameworks. The Transformation Script utilizes the Jena Framework in Java. Multiple aspects of the WINGS Workflow System are captured in the script including the Expanded Template. Further, it includes several tests for its validity and consistency.
(Work in progress) Collects real-time data from Reddit, applies sentiment analysis to it, and stores the data in Hadoop.
Explore essential MapReduce design patterns for big data processing! This repository includes practical implementations of patterns from the "MapReduce Design Patterns" book, complete with examples across summarization, filtering, organization, joins, and more.
Versatile API that consolidates multiple data providers into one unified interface
Advanced IT Test for Summit Institute of Development
Determines which words occur in a dataset of textbooks, along with each word's occurrence count, using a Dataproc cluster on Google Cloud Platform.