Consider storing the root tablet's list of files in ZooKeeper #936
User tablets store their list of files as absolute URIs in Accumulo's metadata table. Metadata tablets store their list of files in the root tablet. The root tablet does not store its list of files anywhere and relies on HDFS. This requires specialized code for the root tablet. It also makes some changes, like #642, more difficult and makes running in S3 more complicated.
There are three major changes in this commit:
* An abstraction layer for interacting with Accumulo's persisted metadata, called Ample, was introduced. The goal is to eventually make all metadata read and write operations use Ample.
* The way the root tablet's metadata is stored in ZooKeeper was changed. It now uses a single ZooKeeper node, which is good for making atomic updates to multiple fields. This single node stores a JSON value that has the same schema as all other metadata tablets and uses the same column families and qualifiers. This makes it easy to update the JSON using Accumulo mutations and to read it as Accumulo key values.
* A lot of the root tablet code that used to interact directly with ZooKeeper was updated to use Ample. In follow-on changes, a lot of specialized root tablet code can be completely removed. Those changes were not made in this commit in order to keep it from becoming too large and hard to review.
This change is a starting point to support many other changes like #816, #817, #936, #1121
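The single-node JSON layout described above can be pictured with a small sketch. This is not Ample or any real Accumulo class: the class name `RootTabletMetadataSketch`, its methods, the JSON shape (family, then qualifier, then value), and the example `file` family and path are all assumptions made for illustration. Only the idea of keeping every column in one value, keyed by column family and qualifier, comes from the commit message.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of root tablet metadata held as one JSON value. Not Accumulo's API. */
final class RootTabletMetadataSketch {
  // column family -> (column qualifier -> value); TreeMap keeps output deterministic
  private final Map<String, Map<String, String>> columns = new TreeMap<>();

  /** Applies a "put" mutation: sets family:qualifier to value. */
  void put(String family, String qualifier, String value) {
    columns.computeIfAbsent(family, f -> new TreeMap<>()).put(qualifier, value);
  }

  /** Applies a "delete" mutation, dropping the family once it is empty. */
  void delete(String family, String qualifier) {
    Map<String, String> qualifiers = columns.get(family);
    if (qualifiers != null) {
      qualifiers.remove(qualifier);
      if (qualifiers.isEmpty()) {
        columns.remove(family);
      }
    }
  }

  /** Serializes all columns into the single string that would be written to one node. */
  String toJson() {
    StringBuilder sb = new StringBuilder("{");
    boolean firstFamily = true;
    for (Map.Entry<String, Map<String, String>> family : columns.entrySet()) {
      if (!firstFamily) {
        sb.append(',');
      }
      firstFamily = false;
      sb.append(quote(family.getKey())).append(":{");
      boolean firstQualifier = true;
      for (Map.Entry<String, String> column : family.getValue().entrySet()) {
        if (!firstQualifier) {
          sb.append(',');
        }
        firstQualifier = false;
        sb.append(quote(column.getKey())).append(':').append(quote(column.getValue()));
      }
      sb.append('}');
    }
    return sb.append('}').toString();
  }

  // Minimal escaping (backslash and double quote only); enough for this sketch.
  private static String quote(String s) {
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }
}
```

Because every column lives in one serialized value, a change that touches several fields (for example, adding a file and updating another column) can be written back as a single node update, which is the atomicity benefit the commit message mentions.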
This commit changes Accumulo to call the volume chooser every time a tablet creates a new file. It also changes the interpretation of the srv:dir column in the metadata table. This column used to contain a URI to a directory on a specific volume that was used for all new tablet files. Now the srv:dir column only contains a directory name. This directory name will be used for new tablet files across all volumes. This change necessitated changes to the ~del markers in the metadata table that are used for garbage collection. When a table is cloned or tablets are merged out of existence, it can result in ~del markers for tablet dirs being placed in the metadata table. These ~del markers used to reference a specific volume. With this change, the ~del markers now use a special URI of the form accumulo://allVolumes/accumulo/tables/<tableId>/<dir name>. When the Accumulo GC sees this, it will delete the dir on all configured volumes when it is no longer used. This change supersedes apache#642. These changes are possible because of the changes made in apache#936.
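As a rough illustration of the two path conventions above, the sketch below builds a new file path from a per-file volume choice plus the bare directory name, and expands an allVolumes ~del marker into one directory per configured volume. The class `AllVolumesSketch`, its method names, the example volume URIs, and the rule that the `/tables/<tableId>/<dir name>` tail is appended to each volume base are all assumptions for illustration; the actual Accumulo GC and volume chooser code may resolve paths differently.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch of directory-name-only srv:dir values and allVolumes markers. */
final class AllVolumesSketch {
  static final String ALL_VOLUMES_PREFIX = "accumulo://allVolumes/";

  /** Path for a new tablet file: the volume is chosen per file, the dir name is shared. */
  static String newFilePath(String chosenVolume, String tableId, String dirName, String fileName) {
    return stripTrailingSlash(chosenVolume) + "/tables/" + tableId + "/" + dirName + "/" + fileName;
  }

  /** Builds a marker of the form accumulo://allVolumes/accumulo/tables/<tableId>/<dir name>. */
  static String allVolumesMarker(String tableId, String dirName) {
    return ALL_VOLUMES_PREFIX + "accumulo/tables/" + tableId + "/" + dirName;
  }

  /** Expands a marker into the concrete directories a collector would need to delete. */
  static List<String> expandMarker(String marker, List<String> volumes) {
    if (!marker.startsWith(ALL_VOLUMES_PREFIX)) {
      // Marker already names a specific volume, so there is nothing to expand.
      return Collections.singletonList(marker);
    }
    int idx = marker.indexOf("/tables/");
    if (idx < 0) {
      throw new IllegalArgumentException("Unexpected marker: " + marker);
    }
    // Assumption: the tail "/tables/<tableId>/<dir name>" is relative to each volume base.
    String tail = marker.substring(idx);
    List<String> dirs = new ArrayList<>();
    for (String volume : volumes) {
      dirs.add(stripTrailingSlash(volume) + tail);
    }
    return dirs;
  }

  private static String stripTrailingSlash(String s) {
    return s.endsWith("/") ? s.substring(0, s.length() - 1) : s;
  }
}
```

With volumes `hdfs://nn1/accumulo` and `hdfs://nn2/accumulo`, a marker for table `2` and dir `t-0001` would expand to the `tables/2/t-0001` directory on each volume, so a single marker covers files that were spread across volumes by per-file volume selection.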