nzhiltsov/Anduin

Anduin

Processing Large RDF Graphs on Hadoop

Anduin is a lightweight, concise tool for processing RDF data in N-Quads and N-Triples formats on Hadoop. It is written in Scala and built on top of Scalding, a Scala library from Twitter for writing Hadoop jobs.

Current Version

0.3.1

Features

  • Support for the RDF/N-Quads and RDF/N-Triples formats
  • Tolerance of ill-formed RDF data
  • Gathering entity type statistics
  • Building adjacency matrices
  • Aggregating entity descriptions (e.g. for entity search)
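
To make the features above more concrete, here is a minimal sketch in plain Scala collections, without Hadoop or Scalding. It is not Anduin's actual code or API: the object `RdfSketch`, the case class `Quad`, and the functions `parse`, `typeCounts` and `adjacency` are hypothetical names chosen for illustration, and the regular expression covers only a simplified subset of the N-Triples and N-Quads grammars. It assumes that "tolerant" means skipping lines that fail to parse, and that type statistics count distinct subjects per `rdf:type` object.

```scala
// Illustrative sketch only; names and behaviour are assumptions, not Anduin's API.
object RdfSketch {
  // One parsed statement; `context` is present only for N-Quads lines.
  final case class Quad(subject: String, predicate: String, obj: String,
                        context: Option[String])

  val RdfType = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

  // Simplified grammar: IRI subject, IRI predicate, IRI or literal object,
  // optional IRI graph label, terminating dot. Blank nodes are not matched.
  private val Line =
    """^\s*(<[^>]*>)\s+(<[^>]*>)\s+(<[^>]*>|"(?:[^"\\]|\\.)*"(?:@[A-Za-z0-9-]+|\^\^<[^>]*>)?)\s*(<[^>]*>)?\s*\.\s*$""".r

  // Returns None for ill-formed lines instead of throwing, so bad input is skipped.
  def parse(line: String): Option[Quad] = line match {
    case Line(s, p, o, c) => Some(Quad(s, p, o, Option(c))) // unmatched group is null
    case _                => None
  }

  // Number of distinct subjects declared with each rdf:type object.
  def typeCounts(quads: Seq[Quad]): Map[String, Int] =
    quads.filter(_.predicate == RdfType)
      .groupBy(_.obj)
      .map { case (tpe, qs) => tpe -> qs.map(_.subject).distinct.size }

  // Sparse adjacency matrix over entity-to-entity links (rdf:type excluded).
  // Returns the entity-to-index mapping and the non-zero cells with link counts.
  def adjacency(quads: Seq[Quad]): (Map[String, Int], Map[(Int, Int), Int]) = {
    val links = quads.filter(q => q.predicate != RdfType && q.obj.startsWith("<"))
    val index = links.flatMap(q => Seq(q.subject, q.obj)).distinct.sorted.zipWithIndex.toMap
    val cells = links
      .groupBy(q => (index(q.subject), index(q.obj)))
      .map { case (cell, qs) => cell -> qs.size }
    (index, cells)
  }
}
```

Calling `lines.flatMap(RdfSketch.parse)` on a sequence of input lines keeps only the well-formed statements, which can then be passed to `typeCounts` or `adjacency`. In a Hadoop setting the same grouping steps would be expressed as distributed group-by operations rather than in-memory collections.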

Known Issues

Blank nodes are not currently supported.

Prerequisites

  • Java 1.6+
  • Scala 2.9.2+
  • Apache Hadoop (tested on Apache Hadoop 1.1 and on Amazon Web Services Elastic MapReduce)

Mailing list

Have a question or a suggestion? Please join our mailing list.

anduin@googlegroups.com

Development and Contribution

Anduin is developed by Nikita Zhiltsov. To add new functionality or fix existing bugs, feel free to contribute patches via pull requests against the develop branch.

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0

About

A Scala library to process RDF data on Hadoop
