Pipistrello is a utility that allows you to create and run Map/Reduce jobs with any executable as mapper/reducer, acting on any kind of binary files (NetCDF, jpg, png, mp3... you name it!).

juanmcloaiza/Pipistrello

Pipistrello

Pipistrello is a utility for creating and running Map/Reduce jobs with (almost) any executable or script as the mapper and/or the reducer, in much the same way as Hadoop Streaming. The fundamental difference is that Pipistrello is specifically designed to operate on binary files (NetCDF, jpg, png, mp3... you name it!).
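To make the streaming idea concrete, here is a minimal single-machine sketch in Python of how a job built from arbitrary executables can be wired together over binary inputs. It is not Pipistrello's actual interface or implementation: the function `run_streaming_job`, the commands `SIZE_MAPPER` and `SUM_REDUCER`, and the tab-separated `key<TAB>value` output format are all assumptions made for illustration. Each input file's raw bytes are piped to the mapper's standard input, the mapper's output lines are sorted, and the sorted lines are piped to the reducer.

```python
import subprocess
import sys
from pathlib import Path


def run_streaming_job(input_files, mapper_cmd, reducer_cmd):
    """Hypothetical helper: run mapper_cmd once per file, then reducer_cmd once."""
    mapped_lines = []
    for path in input_files:
        # Map phase: the raw bytes of each file go to the mapper's stdin,
        # so the mapper can be any executable that understands the format.
        result = subprocess.run(
            mapper_cmd,
            input=Path(path).read_bytes(),
            stdout=subprocess.PIPE,
            check=True,
        )
        mapped_lines.extend(result.stdout.splitlines())

    # Shuffle phase (simplified): sort lines so that equal keys are adjacent.
    mapped_lines.sort()

    # Reduce phase: the reducer reads all sorted mapper output from stdin.
    reduced = subprocess.run(
        reducer_cmd,
        input=b"\n".join(mapped_lines) + b"\n",
        stdout=subprocess.PIPE,
        check=True,
    )
    return reduced.stdout


# Illustrative mapper (assumption): emits "bytes<TAB><size of its binary input>".
SIZE_MAPPER = [
    sys.executable, "-c",
    "import sys; print('bytes\\t%d' % len(sys.stdin.buffer.read()))",
]

# Illustrative reducer (assumption): sums the value column of all input lines.
SUM_REDUCER = [
    sys.executable, "-c",
    "import sys; print(sum(int(l.split('\\t')[1]) for l in sys.stdin if l.strip()))",
]
```

Calling `run_streaming_job(paths, SIZE_MAPPER, SUM_REDUCER)` on a list of file paths returns the total size of those files as bytes of text. In this sketch the mapper and reducer are plain command lists, so any program that reads standard input and writes lines to standard output could be swapped in.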

Background

Pipistrello's original goal is to bring powerful technologies used by industry (Google, Facebook, Spotify, Amazon...) into scientists' laboratories. As the amount of data generated by science grows at unprecedented speed, scientists need software that can keep up and harness the power of their data. Unfortunately, the general situation so far is that the software available in their labs is quite poor compared to the software they use at home to handle their music or photo collections. Science deserves better.

Acknowledgements

This project was entirely developed by Juan Manuel Carmona-Loaiza as part of the Master in High Performance Computing (MHPC) of the International Centre for Theoretical Physics (ICTP) and the Scuola Internazionale Superiore di Studi Avanzati (SISSA), under the supervision of Graziano Giuliani. It was funded by the Istituto Nazionale di Oceanografia e di Geofisica Sperimentale (OGS) and CINECA under the HPC-TRES program, award number 2015-04.
