Diagrams describing Apache Hadoop internals (2.3.0 or later).



This project contains several diagrams describing Apache Hadoop internals (2.3.0 or later). Although these diagrams are NOT specified in any formal or unambiguous language (e.g., UML), they should be reasonably understandable and useful for anyone who wants to grasp the main ideas behind Hadoop. Unfortunately, the diagrams do not cover all the internal details. You are free to help :)

Ready? Go to the project website

Images linked in the wiki are dynamically generated (from LucidChart), but in the sources directory you can find diagram snapshots in the following formats:

  • PNG
  • Visio (VDX)

A VDX file can be opened with one of the many Visio-compatible editors (e.g., I am using the web-application editor LucidChart, but unfortunately only pro users can edit an imported file). These files are periodically synced with the ones shown in the wiki. On request, I can share the LucidChart files through Google Drive so that you can help me with this project (in this case, a free LucidChart account is enough for editing).