Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaner implementation of the CLI #136

Closed
wants to merge 8 commits into from
Closed

Conversation

barakmich
Copy link

Hi!

So I found your tool while setting up an HDFS cluster on Kubernetes and I like it a lot. However, especially under Kubernetes, I got bitten by #123 -- Hadoop by default uses IP addresses for datanodes (which works well in k8s) -- in particular, the default for dfs.client.use.datanode.hostname in hdfs-default.xml is false. (#118 is related)

So I reverted a79e6ee for my own purposes, but the correct answer is clearly to understand configurations better from inside the tool (environment variables, maybe a config file of its own). I noticed your CLI was fairly messy, and this ties into understanding configurations. So I went ahead and implemented a nice https://github.com/spf13/cobra CLI around the same underlying implementations of the subcommands, with minor changes.

My end goal is to support both modes of operation and let it be extensible for other such config options (eg, #76). Your ClientOptions struct is great, it just needs a way to be handled with flags and environment variables and the like. (And it needs to be piped in at the call sites in a79e6ee)

So here is my feature-parity (hopefully test neutral; we'll see what travis says) new CLI. Every action is self-contained on the file level, so new actions are also easy (making rebasing #133 easy, among others). I've omitted bash completions as a follow-up, as they're built into cobra (obviates #91).

Follow-ups:

  • Bash Completion
  • Viper integration (env variables, potential config file)-
  • Revert the accursed a79e6ee upstream with a replacement flag/env

"Screenshot":

(22:46|174)[barak@ganymede ~/src/hdfs/src/github.com/colinmarc/hdfs]
$ ./hdfs 
GoHDFS is a very fast client for HDFS clusters

Usage:
  hdfs [flags]
  hdfs [command]

Available Commands:
  cat         concatenate HDFS files and print on the standard output
  checksum    calculate the checksum of the HDFS files
  chmod       change HDFS file mode bits
  chown       change HDFS file owner and group
  df          concatenate HDFS files and print on the standard output
  du          estimate HDFS space usage
  get         get files from HDFS to the local filesystem
  getmerge    get a directory from HDFS and merge the files into a single local file
  head        output the first part of HDFS files
  help        Help about any command
  ls          list HDFS directory contents
  mkdir       make HDFS directories
  mv          move HDFS files
  put         put files from local filesystem to HDFS
  rm          remove HDFS files or directories
  tail        output the last part of HDFS files
  touch       create HDFS files and modify their timestamps

Flags:
  -h, --help   help for hdfs

Use "hdfs [command] --help" for more information about a command.
(22:46|175)[barak@ganymede ~/src/hdfs/src/github.com/colinmarc/hdfs]
$ ./hdfs help ls
list HDFS directory contents

Usage:
  hdfs ls [-lah] [FILE]...

Flags:
  -a, --all              show all entries, including hidden ones
      --help             display help
  -h, --human-readable   print human-readable sizes
  -l, --list             use a long listing format

@colinmarc
Copy link
Owner

Hi @barakmich,

Thanks for the work here, but I wish you'd asked me before embarking on it - I would've told you that I evaluated cobra back when I was starting out, but I prefer just using getopt to heavyweight frameworks. I'm also not interested in generalizing environment config (in fact, I'd like to move away from that).

#118 is easily fixed by adding a field in ClientOptions and a line to ClientOptionsFromConf, if you'd like to contribute that change.

@colinmarc colinmarc closed this Jul 29, 2018
@colinmarc
Copy link
Owner

colinmarc commented Jul 29, 2018

As for -n vs -f in mv, I'm following the posix spec: http://pubs.opengroup.org/onlinepubs/9699919799/utilities/mv.html

@barakmich
Copy link
Author

I'm sorry you disagree with me. You should really re-evaluate cobra, as, in this particular case, it's the same number of dependencies (one), gives actually helpful usage output (I don't have to run to man of the original command), gives you long options, and bash completions for free, something you're currently maintaining by hand.

If you don't want to go with full configuration, that's fine, this change doesn't do that.

I think I'll fork this out into its own repo and have your library as an upstream dep. I'll fix #118, but be advised the current library behavior does not follow the proper defaults. Thanks for all the fish.

(As for -n vs -f -- you should probably follow whatever the original hdfs tool does, but I suppose it's moot)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants