
hdp-tree for printing a tree view of an hdfs directory

1 parent c06a47a, commit 818d86ec5a2382c883faad777a6e0ea258551037, thedatachef committed Mar 9, 2011
Showing with 54 additions and 0 deletions.
  1. +28 −0 README.textile
  2. +26 −0 bin/hdp-tree
@@ -167,6 +167,34 @@ flow.describe
flow.run(:plot_results)
</code></pre>
+h3. Utils
+
+There's a fun little program called 'hdp-tree' that shows how easy the filesystem abstraction is to use:
+
+<pre><code>
+$: bin/hdp-tree /tmp/my_hdfs_directory
+---
+/tmp/my_hdfs_directory:
+ - my_hdfs_directory:
+ - sub_dir_a: leaf_file_1
+ - sub_dir_a: leaf_file_2
+ - sub_dir_a: leaf_file_3
+ - my_hdfs_directory:
+ - sub_dir_b: leaf_file_1
+ - sub_dir_b: leaf_file_2
+ - sub_dir_b: leaf_file_3
+ - my_hdfs_directory:
+ - sub_dir_c: leaf_file_1
+ - sub_dir_c: leaf_file_2
+ - sub_dir_c: leaf_file_3
+ - sub_dir_c:
+ - sub_sub_dir_a: yet_another_leaf_file
+ - sub_dir_c: sub_sub_dir_b
+ - sub_dir_c: sub_sub_dir_c
+</code></pre>
+
+I know, it's not as pretty as unix tree, but this IS github...
+
h3. TODO
* next task in a workflow should NOT run if the previous step failed
@@ -0,0 +1,26 @@
+#!/usr/bin/env jruby
+
+require 'swineherd'
+
+#
+# Creates a 'tree' view of an hdfs path. It's not as pretty as the
+# unix tree command but that's only because I'm not smart enough to
+# print the hierarchy properly.
+#
+
+FS = Swineherd::FileSystem.get(:hdfs)
+path = ARGV[0]
+
+# Recursively list paths
+def lr path
+ paths = FS.entries(path)
+ if (paths && !paths.empty?)
+ paths.map{|e| {File.basename(path) => lr(e)}}.flatten
+ else
+ File.basename(path)
+ end
+end
+
+
+tree = {path => lr(path)}.to_yaml
+puts tree
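
The repeated @my_hdfs_directory@ and @sub_dir_a@ keys in the README output appear to come from @lr@ wrapping each child in its own single-key hash. As a rough sketch only, not part of this commit, the recursion could instead group all children under one key per directory. The @FakeFS@ class and the @nested_tree@ name below are hypothetical stand-ins; the only behavior assumed from Swineherd is an @entries(path)@ call that returns full child paths for a directory and nothing for a file. The sketch also requires @yaml@ explicitly so that @to_yaml@ is available when run on its own.

```ruby
require 'yaml'

# Hypothetical stand-in for Swineherd::FileSystem.get(:hdfs). Only #entries
# is modeled: it returns full child paths for a directory, [] otherwise.
class FakeFS
  def initialize(listing)
    @listing = listing
  end

  def entries(path)
    @listing.fetch(path, [])
  end
end

# Recursively build a nested structure. A directory with children becomes
# {basename => [children...]}; a file or empty directory becomes its basename.
def nested_tree(fs, path)
  children = fs.entries(path)
  name = File.basename(path)
  return name if children.nil? || children.empty?
  { name => children.map { |child| nested_tree(fs, child) } }
end

# Illustrative directory listing (made-up paths).
listing = {
  '/tmp/demo'       => ['/tmp/demo/sub_a', '/tmp/demo/sub_b'],
  '/tmp/demo/sub_a' => ['/tmp/demo/sub_a/leaf_1', '/tmp/demo/sub_a/leaf_2']
}

fs = FakeFS.new(listing)
tree = nested_tree(fs, '/tmp/demo')
puts tree.to_yaml
```

With this shape each directory name shows up once, with its files and subdirectories listed beneath it. Swapping @FakeFS@ for the real filesystem object should work as long as @entries@ behaves as assumed above.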
