Introduction to the MR4C repo
MR4C is an implementation framework that allows you to run native code within the Hadoop execution framework. Pairing the performance and flexibility of natively developed algorithms with the unfettered scalability and throughput inherent in Hadoop, MR4C enables large-scale deployment of advanced data processing applications.
Map to this repo
This repository includes the user guide, tutorials, and source code for the MR4C framework created by Google Inc. We suggest you work through this repo in the following order:
- Make sure that you have all dependencies and build (see below).
- Test that the MR4C install was successful
  - Run test_mr4c.sh from the test directory
- Study up on MR4C
  - README.md in the UserGuide directory covers the basic concepts behind MR4C
- Run through the example algorithms in the tutorial directory
- Build your own algorithm using the examples as templates and let us know if you have questions or comments!
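The install check from the steps above could be wrapped roughly as follows. This is only a sketch and not part of the repository: the function name `run_install_test` is hypothetical, and it assumes the script sits at `test/test_mr4c.sh` under the checkout root and can be run with bash without arguments.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not shipped with MR4C.
# Assumes the install test lives at <repo_root>/test/test_mr4c.sh
# and needs no arguments.
run_install_test() {
    local repo_root="${1:-.}"
    local script="$repo_root/test/test_mr4c.sh"

    # Fail early with a distinct status if the script is not where we expect.
    if [ ! -f "$script" ]; then
        echo "cannot find $script" >&2
        return 2
    fi

    # Run from inside the test directory, in a subshell so the
    # caller's working directory is left unchanged.
    ( cd "$repo_root/test" && bash ./test_mr4c.sh )
}
```

Calling `run_install_test /path/to/checkout` returns the exit status of the test script itself, or 2 if the script could not be found.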
Dependencies
- tested with Ubuntu 12.04 and CentOS 6.5
- tested with CDH 5.2.0 (either MRV1 or YARN)
- ant (1.8.2 min)
- java (1.6 min)
- ivy (2.1 min)
- make (3.81 min)
- g++ (4.6.3 min)
- log4cxx (0.10.0)
- jansson (2.2.1 min)
- cppunit (1.12.1 min)
- proj4 (4.8.0 min)
- gdal (1.10 min)
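Before building, it can help to confirm that the command-line tools in the list are on the PATH. The sketch below is not part of the repository; `have_tool` and `version_ge` are hypothetical helper names, and `version_ge` assumes GNU `sort`, whose `-V` flag orders version strings. It only looks for the tools that are invoked as commands (ant, java, make, g++). The libraries in the list would need a separate check that depends on how they were installed.

```shell
#!/usr/bin/env bash
# Hypothetical pre-build check, not part of the MR4C repo.

# Succeed when version $1 is at least version $2.
# Assumes GNU sort (-V sorts version strings in ascending order).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed when a command is available on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report which of the command-line build tools are present.
for tool in ant java make g++; do
    if have_tool "$tool"; then
        echo "found:   $tool"
    else
        echo "missing: $tool"
    fi
done
```

Once a tool's version string has been extracted by whatever means suits that tool, `version_ge "$installed" 1.8.2` succeeds when it meets the minimum. Note that a plain string comparison would wrongly rank 1.10 below 1.9, which is why the helper sorts by version instead.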
Build
There are four scripts included: one each to build MR4C, clean previous builds, deploy to /usr/local/mr4c, and remove all components.
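Separately from the four scripts, a quick way to see whether a deploy has left anything in place is to look at the deploy directory. The following is a minimal sketch under stated assumptions: `is_deployed` is a hypothetical helper that is not shipped with MR4C, and it treats any non-empty directory at /usr/local/mr4c as deployed without assuming what the directory contains.

```shell
#!/usr/bin/env bash
# Hypothetical post-deploy check, not part of the MR4C repo.
# /usr/local/mr4c is the deploy location named in this README;
# the layout inside it is not assumed here.
is_deployed() {
    local deploy_dir="${1:-/usr/local/mr4c}"
    # Treat an existing, non-empty directory as "deployed".
    [ -d "$deploy_dir" ] && [ -n "$(ls -A "$deploy_dir" 2>/dev/null)" ]
}

if is_deployed; then
    echo "MR4C appears to be deployed in /usr/local/mr4c"
else
    echo "nothing found in /usr/local/mr4c"
fi
```

The same function can confirm a removal: after removing all components, `is_deployed` should fail.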
If you get stuck, have questions, or would like to provide any feedback, please don’t hesitate to contact us at email@example.com. Let’s do big things together.