hdp-tree for printing a tree view of an hdfs directory
Jacob Perkins committed Mar 9, 2011
1 parent c06a47a commit 818d86e
Showing 2 changed files with 54 additions and 0 deletions.
28 changes: 28 additions & 0 deletions README.textile
@@ -167,6 +167,34 @@ flow.describe
flow.run(:plot_results)
</code></pre>

h3. Utils

There's a fun little program called 'hdp-tree' that shows how easy the filesystem abstraction is to use:

<pre><code>
$: bin/hdp-tree /tmp/my_hdfs_directory
---
/tmp/my_hdfs_directory:
- my_hdfs_directory:
  - sub_dir_a: leaf_file_1
  - sub_dir_a: leaf_file_2
  - sub_dir_a: leaf_file_3
- my_hdfs_directory:
  - sub_dir_b: leaf_file_1
  - sub_dir_b: leaf_file_2
  - sub_dir_b: leaf_file_3
- my_hdfs_directory:
  - sub_dir_c: leaf_file_1
  - sub_dir_c: leaf_file_2
  - sub_dir_c: leaf_file_3
  - sub_dir_c:
    - sub_sub_dir_a: yet_another_leaf_file
  - sub_dir_c: sub_sub_dir_b
  - sub_dir_c: sub_sub_dir_c
</code></pre>

I know, it's not as pretty as unix tree, but this IS github...

h3. TODO

* next task in a workflow should NOT run if the previous step failed
26 changes: 26 additions & 0 deletions bin/hdp-tree
@@ -0,0 +1,26 @@
#!/usr/bin/env jruby

require 'swineherd'

#
# Creates a 'tree' view of an hdfs path. It's not as pretty as the
# unix tree command but that's only because I'm not smart enough to
# print the hierarchy properly.
#

FS = Swineherd::FileSystem.get(:hdfs)
path = ARGV[0]

# Recursively list paths
def lr path
  paths = FS.entries(path)
  if (paths && !paths.empty?)
    paths.map{|e| {File.basename(path) => lr(e)}}.flatten
  else
    File.basename(path)
  end
end


tree = {path => lr(path)}.to_yaml
puts tree
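
The same recursion can be tried without a cluster. What follows is a minimal sketch, not part of this commit: it assumes a hypothetical in-memory @FakeFS@ stand-in that models only the @entries@ call used by @lr@ (child paths for a directory, an empty array for a file), and a hypothetical @render@ helper that emits one indented line per entry instead of building a hash for YAML.

```ruby
# Hypothetical stand-in for the filesystem object; only `entries` is modeled,
# mirroring the single call that hdp-tree makes. Not Swineherd's real class.
class FakeFS
  def initialize(children)
    @children = children # maps a directory path to its child paths
  end

  # Child paths of a directory; an empty array for a file or unknown path.
  def entries(path)
    @children.fetch(path, [])
  end
end

# Hypothetical helper: collect one indented line per entry, depth first.
def render(fs, path, depth = 0, lines = [])
  lines << ('  ' * depth) + File.basename(path)
  fs.entries(path).each { |child| render(fs, child, depth + 1, lines) }
  lines
end

fs = FakeFS.new(
  '/tmp/dir'       => ['/tmp/dir/sub_a', '/tmp/dir/sub_b'],
  '/tmp/dir/sub_a' => ['/tmp/dir/sub_a/leaf_1']
)
tree_lines = render(fs, '/tmp/dir')
puts tree_lines
```

Carrying the depth through the recursion, rather than nesting hashes, is one way to get closer to the unix tree layout; whether the real filesystem object could be passed to @render@ unchanged depends on @entries@ behaving as assumed here.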
