Feature: blob size in DDFS #408

tspurway opened this Issue Jan 22, 2014 · 2 comments



tspurway commented Jan 22, 2014

It would be useful to be able to track how much space blobs occupy in DDFS. We could modify some of the ddfs subcommands to report and/or aggregate this data to help diagnose cluster free-space issues.


oldmantaiter commented Feb 7, 2015

Was thinking about this today. It would be interesting to store the following metrics in the extended attributes of the tag:

  • last accessed
  • first created
  • size in cluster

Then we could issue something along the lines of a ddfs stat for the tags and get similar output to stat on the filesystem.

This would add some overhead to each tagging operation. Alternatively, we could run this as a daily internal job that scrapes every tag/blob and gets the size from the filesystem. Depending on the number of blobs and tags, though, this might take more than 24 hours and would not be real time.
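
A rough sketch of what the scraping side could look like, assuming blobs are plain files somewhere under a per-node data directory. The function name and the idea of grouping by top-level subdirectory are hypothetical and only for illustration; this does not use any DDFS API.

```python
import os
from collections import defaultdict


def blob_sizes_by_dir(root):
    """Walk `root` and sum file sizes (bytes) per top-level subdirectory.

    Hypothetical helper: assumes blobs are ordinary files under `root`.
    Files directly in `root` are grouped under ".".
    """
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                totals[top] += os.path.getsize(path)
            except OSError:
                # File disappeared (e.g. removed by GC) between listing and stat.
                continue
    return dict(totals)
```

Each node could run this locally and report its totals, so the aggregation cost is spread across the cluster rather than paid on every tagging operation.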


oldmantaiter commented Feb 7, 2015

We could also add something like retention, where a "janitor"-type job iterates over the tags and checks whether each one is set to expire, so that it can automatically clean them up for the next GC. Currently we use a script for this, which is not very intuitive because we have to add tags to it whenever new types of data are pushed to the cluster.
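
The expiry check itself could be as simple as the sketch below. It assumes each tag carries a creation time and an optional retention period; the `created` and `retention_days` field names are made up for illustration and are not existing DDFS tag attributes.

```python
from datetime import datetime, timedelta


def expired_tags(tags, now):
    """Return the sorted names of tags whose retention period has elapsed.

    `tags` maps a tag name to a dict with a `created` datetime and an
    optional `retention_days` int (both field names are hypothetical).
    """
    expired = []
    for name, meta in tags.items():
        days = meta.get("retention_days")
        if days is None:
            # No retention set: the tag never expires.
            continue
        if meta["created"] + timedelta(days=days) <= now:
            expired.append(name)
    return sorted(expired)
```

The janitor would then delete the returned tags and leave the blobs for the next GC run to reclaim.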
