This repository has been archived by the owner on Jan 8, 2021. It is now read-only.
# Home

Simon Li edited this page Aug 29, 2014
The aim of this project is to investigate ways of accessing high performance computing (HPC) resources from OMERO, using Apache Hadoop as an example.
There are many potential ways in which OMERO could take advantage of an HPC cluster, such as distributing the generation of image pyramids (required for the zooming capabilities of the image viewers) and performing image processing and analysis.
There are currently two main strands to this work:
- Distributed processing: OMERO should be able to delegate the execution of computationally intensive tasks to a cluster, which may be shared with other non-OMERO users.
- Support for the Hadoop Distributed File System (HDFS) in OMERO: allow data to be stored on and retrieved from the cluster filesystem.
This work depends heavily on Pydoop (http://pydoop.sourceforge.net), a Python interface to Hadoop developed by CRS4 (http://www.crs4.it). Current progress is documented in the OMERO Hadoop development notes.
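To illustrate the distributed-processing strand, the sketch below shows how a MapReduce-style job might split per-image pyramid work into independent map tasks. It is a self-contained local simulation, not the project's actual code: the function names (`map_image`, `reduce_levels`, `run_job`), the tile size, and the halving scheme are all assumptions for illustration; a real job would express the same mapper/reducer pair through Pydoop's MapReduce API and run on the cluster.

```python
# Hypothetical sketch: distributing pyramid-level planning across a cluster
# in MapReduce style. All names and parameters here are illustrative.
from collections import defaultdict


def map_image(image_id, width, height, tile=256):
    """Mapper: emit one (image_id, level) pair per pyramid level needed,
    halving the image dimensions until it fits in a single tile."""
    level = 0
    while width > tile or height > tile:
        yield image_id, level
        width, height = width // 2, height // 2
        level += 1
    yield image_id, level  # final level that fits in one tile


def reduce_levels(image_id, levels):
    """Reducer: count how many pyramid levels each image requires."""
    return image_id, len(list(levels))


def run_job(images):
    """Local stand-in for the Hadoop shuffle phase: run every mapper,
    group the emitted values by key, then run the reducer per key."""
    grouped = defaultdict(list)
    for image_id, size in images.items():
        for key, value in map_image(image_id, *size):
            grouped[key].append(value)
    return dict(reduce_levels(k, v) for k, v in grouped.items())
```

On a real cluster each `map_image` call would run as a separate Hadoop task, so pyramid generation for many images proceeds in parallel while the reducer only aggregates lightweight bookkeeping.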