DKPro large scale processing support

dkpro-bigdata

DKPro BigData makes it easy to run UIMA-based natural language processing pipelines on a Hadoop cluster.

### Features

  • Large-scale NLP processing using UIMA and Hadoop

  • Store your corpora on a Hadoop filesystem and access them from local or distributed pipelines

  • Find patterns in your textual data using adaptable collocation extraction

### Details

  • Execute DKPro pipelines on a Hadoop cluster with minimal adaptation

  • Read data stored on HDFS using DKPro collection readers

  • Read and write serialized CASes from and to HDFS

### Contributors

  • Hans-Peter Zorn

  • Johannes Simon

  • Martin Riedl

  • Richard Eckart de Castilho

  • Steffen Remus

## License

DKPro BigData is licensed under the Apache Software License (ASL), Version 2.0.

This project is a joint effort of UKP Lab and the Language Technology Group, Technical University of Darmstadt.