printmiles/Odin

Odin

Summary

A data curation and archival tool created as part of my MSc dissertation. Users download a client (Sleipnir) from the server (Odin) and use it to upload documents to the server's repository.

The client scans each document and shows the detected metadata to the user, who can amend it (for example, by adding author details or extra keywords to aid later retrieval). The metadata and the document are then sent to a web service on the server, where they are stored for later retrieval.
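The amend-then-send step can be pictured as a merge of two key/value maps: the values detected by the client and the values typed in by the user. The sketch below is illustrative only. The class `MetadataAmendmentSketch`, the method `applyAmendments`, and the metadata keys are hypothetical and are not taken from the Sleipnir sources.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: not part of the Odin/Sleipnir code base.
final class MetadataAmendmentSketch {

    /**
     * Returns a new map holding the detected metadata, overridden or extended
     * by the user's amendments. Blank amendments are ignored so that an empty
     * form field never wipes out a detected value.
     */
    static Map<String, String> applyAmendments(Map<String, String> detected,
                                               Map<String, String> amendments) {
        Map<String, String> merged = new LinkedHashMap<>(detected);
        for (Map.Entry<String, String> entry : amendments.entrySet()) {
            String value = entry.getValue();
            if (value == null || value.trim().isEmpty()) {
                continue; // keep whatever was detected for this key
            }
            merged.put(entry.getKey(), value.trim());
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> detected = new LinkedHashMap<>();
        detected.put("Content-Type", "application/pdf"); // assumed key name
        detected.put("Author", "Unknown");

        Map<String, String> amendments = new LinkedHashMap<>();
        amendments.put("Author", "A. Writer");
        amendments.put("Keywords", "archive, curation");

        System.out.println(applyAmendments(detected, amendments));
    }
}
```

Returning a fresh map leaves the detected values untouched, so the client could still show the user what was detected versus what was changed before sending.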

Currently, the local file system is used to store both the original documents (as individual files) and the repository metadata (as an XML database). Future plans include support for Apache Hadoop's HDFS for file storage and for alternative database back-ends (MySQL and Apache Cassandra).
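One way to leave room for other storage back-ends is to hide document storage behind a small interface and supply a local file system implementation first. The sketch below assumes that approach. `DocumentStore` and `LocalFileDocumentStore` are hypothetical names and do not describe how Odin is actually structured. Metadata storage is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.UUID;

// Hypothetical abstraction: an HDFS-backed store could implement the same interface later.
interface DocumentStore {
    /** Stores the bytes and returns an identifier for later retrieval. */
    String put(byte[] content);

    /** Returns the bytes previously stored under the given identifier. */
    byte[] get(String id);
}

// Keeps each document as an individual file under a root directory.
final class LocalFileDocumentStore implements DocumentStore {
    private final Path root;

    LocalFileDocumentStore(Path root) {
        this.root = root;
    }

    @Override
    public String put(byte[] content) {
        String id = UUID.randomUUID().toString(); // assumed ID scheme
        try {
            Files.createDirectories(root);
            Files.write(root.resolve(id), content);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return id;
    }

    @Override
    public byte[] get(String id) {
        try {
            return Files.readAllBytes(root.resolve(id));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get(System.getProperty("java.io.tmpdir"), "odin-sketch");
        DocumentStore store = new LocalFileDocumentStore(root);
        String id = store.put("example document".getBytes());
        System.out.println("Stored " + store.get(id).length + " bytes under " + id);
    }
}
```

Callers depend only on `DocumentStore`, so swapping the back-end would not change the upload path.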

Interesting Code

  • src/odin/sleipnir/MetadataWorker.java - Lines 68-110 - Uses Apache Tika to identify the document's MIME type and extract its metadata.
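To give a feel for content-based MIME detection, here is a minimal sketch that inspects a file's leading bytes. It deliberately uses the JDK's `URLConnection.guessContentTypeFromStream` as a stand-in so that it runs without extra libraries. This is not the Tika code in `MetadataWorker.java`, and the JDK recognises far fewer formats than Tika does. The class name `MimeSniffSketch` and the fallback type are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.URLConnection;

// Hypothetical sketch: a JDK-only stand-in for Tika's content-based detection.
final class MimeSniffSketch {

    /**
     * Guesses a MIME type from the first bytes of a document.
     * Falls back to application/octet-stream when the JDK has no guess.
     */
    static String detectMimeType(byte[] leadingBytes) {
        // ByteArrayInputStream supports mark/reset, which the JDK method needs.
        try (InputStream in = new ByteArrayInputStream(leadingBytes)) {
            String type = URLConnection.guessContentTypeFromStream(in);
            return type != null ? type : "application/octet-stream";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The eight-byte PNG file signature.
        byte[] png = {(byte) 0x89, 'P', 'N', 'G', 0x0D, 0x0A, 0x1A, 0x0A};
        System.out.println(detectMimeType(png));
    }
}
```

Detecting from content rather than from the file extension means a renamed or extension-less upload can still be classified.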

Planned Features

  • Improve the GUI
  • Provide database reporting for SysAdmins
  • Provide communication (SOAP) between servers
  • Provide a means for rule creation on a server
  • Allow rules to be negotiated between servers and propagated accordingly

Required Libraries

  • Apache Tika
  • Apache Commons Compress
  • Saxon HE
  • XQuery for Java (XQJ)

More Information

For more information about this project and the MSc dissertation it was part of, please visit: https://sites.google.com/site/printmiles/home/projects/postgraduate-dissertation
