Practical utilities for Spark applications
Content ExtRactor and MINEr
System Analizy Orzeczeń Sądowych (Court Judgement Analysis System)
Content Analysis System is a framework for mining scientific publications using Apache Hadoop.
Tool that simplifies creating packages (tar.gz files) containing an Apache Oozie workflow definition together with all other files needed to run the workflow (configuration files, libraries, etc.).
Editor of training sets for page segmentation and zone classification of scholarly PDFs
REST service providing data for the COMAC Navigator Frontend.
An example R package project that uses code written in Java
Implementation of Random Ferns for Apache Spark
UI part of the COMAC Navigator. The frontend runs in the browser and connects over HTTP to the COMAC Navigator Backend to fetch data.
Utility for browsing and simple manipulation of Avro-based files
Proof-of-concept implementation of a blockchain approach to PIDs.
A sample Spark app
RSS feeds of journals and conferences indexed by DBLP
Oozie workflow generator
Common Map for Academia
Hadoop SequenceFile browser