Ruby Interacive Shell
Switch branches/tags
Nothing to show
Pull request Compare This branch is 59 commits behind hayeah:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Failed to load latest commit information.
lib
LICENSE
README.textile
Rakefile
VERSION.yml

README.textile

Rubish

Rubish is a shell in Ruby. It is object oriented,
and it only uses Ruby’s own syntax (no
metasyntax
of its own). Rubish is pronounced
Roobish, as opposed to Rubbish, unlike Bash.

Getting Started

Install:
$ gem install hayeah-rubish —source http://gems.github.com

Fire up an irb, and start Rubish

$ irb -rrubish irb> Rubish.repl rbh> date Tue Mar 17 17:06:04 PDT 2009 rbh> uname :svrm Linux 2.6.24-16-generic #1 SMP Thu Apr 10 13:23:42 UTC 2008 i686

A few tips upfront. Sometimes Rubish could mess up
your terminal. If that happens, try hitting
C-c and C-d to get back to Bash, then,

$ reset

to reset your terminal. Also, Rubish doesn’t have
shell history. But it uses the readline library,
so you can use its history mechanism. C-r
to match a previously entered line with
string.

If you make changes to Rubish, you hit C-c to go
back to irb, and,


  irb> Rubish.reload

Command

Rubish REPL takes a line and instance_eval s it
with the shell session object. If the method is
undefined, they call is translated into a
Executable object with method_missing.


rbh> ls
awk.rb	command_builder.rb  command.rb executable.rb  LICENSE	pipe.rb  README.textile rubish.rb  sed.rb  session.rb	streamer.rb

# ls evaluates to a Rubish::Command object, which
# is a subclass of Rubish::Executable
rbh> ls.inspect
"#<Rubish::Command::ShellCommand:0xb7ac297c @args=\"\", @status=nil, @cmd=\"ls \">"

# you can store a command in an instance variable
rbh> @cmd = ls; nil
nil
# if the shell evaluates to a command, the shell
# calls the exec method on it.
rbh> @cmd # same as @cmd.exec
awk.rb	command_builder.rb  command.rb	executable.rb  LICENSE	pipe.rb  README.textile  rubish.rb  sed.rb  session.rb	streamer.rb
rbh> @cmd
awk.rb	command_builder.rb  command.rb	executable.rb  LICENSE	pipe.rb  README.textile  rubish.rb  sed.rb  session.rb	streamer.rb

You can invoke a command with arguments of String,
Symbol, or Array (of String, Symbol, or Array
(recursively)). A String argument is taken as it
is. A Symbol is translated to a flag (:flag
=> -flag). Arguments in an Array are treated
likewise. Finally, all the arguments are flatten
and joined together.

The followings are equivalent,


rbh> ls :l, "awk.rb", "sed.rb"
rbh> ls "-l awk.rb sed.rb"
rbh> ls :l, %w(awk.rb sed.rb)

Pipe


rbh> p { ls ; tr "a-z A-Z" }
AWK.RB
COMMAND_BUILDER.RB
COMMAND.RB
EXECUTABLE.RB
LICENSE
PIPE.RB
README.TEXTILE
RUBISH.RB
SED.RB
SESSION.RB
STREAMER.RB

Pipe and Command actually share the same parent
object: Rubish::Executable. So pipes are first
class values too!


  rbh> @pipe = p { ls ; tr "a-z A-Z" }; nil
  # again, we return nil so @pipe doesn't get executed.
  rbh> @pipe
  # execute @pipe once
  rbh> @pipe
  # execute @pipe again

IO redirections

IO redirections are done by methods defined in
Rubish::Executable.

Rubish::Executable#i(io=nil) Set the $stdin of the executable when it is executed. If called without an argument, return the executable’s IO object. Rubish::Executable#o(io=nil) Ditto for $stdout Rubish::Executable#err(io=nil) Ditto for $stderr

rbh> ls.o("ls-result")
rbh> cat.i("ls-result")
awk.rb
command_builder.rb
command.rb
executable.rb
LICENSE
ls-result
pipe.rb
README.textile
rubish.rb
sed.rb
session.rb
streamer.rb

Rubish can take 4 kinds of objects for IO. String
(used as a file), Integer (used as file
descriptor), IO object, or a ruby block. Using the
a block for IO, the block receives a pipe
connecting it to the command, for reading or writing.


# pump numbers into cat
rbh> cat.i { |p| p.puts((1..5).to_a) }
1
2
3
4
5

# upcase all filenames
rbh> ls.o { |p| p.each_line {|l| puts l.upcase} }
AWK.RB
COMMAND_BUILDER.RB
COMMAND.RB
EXECUTABLE.RB
LICENSE
LS-RESULT
PIPE.RB
README.TEXTILE
RUBISH.RB
SED.RB
SESSION.RB
STREAMER.RB

# kinda funny, pump numbers into cat, then pull
# them out again.
rbh> cat.i { |p| p.puts((1..10).to_a) }.o {|p| p.each_line {|l| puts l.to_i+100 }}
101
102
103
104
105
106
107
108
109
110

The input and output blocks are executed in their
own threads. Use them at your own peril. Next
section shows another way to process output with
ruby without using threads.

Rubish with Ruby

Rubish is designed so it’s easy to interface Unix
command with Ruby.

Rubish::Executable#each yield each line of output to a block. Rubish::Executable#map Like #each, but collect the values returned by the block. If no block given, collect each line of the output.

Since this is Ruby, there’s no crazy metasyntatic
issues when you want to process the output
lines.


# print filename and its extension side by side.
rbh> ls.each { |f| puts "#{f}\t#{File.extname(f)}" }
address.rb	.rb
awk.output	.output
awk.rb	.rb
command_builder.rb	.rb
command.rb	.rb
executable.rb	.rb
foo	
foobar	
foo.bar	.bar
foo.rb	.rb
LICENSE	
my.rb	.rb
pipe.rb	.rb
#README.textile#	.textile#
README.textile	.textile
rubish.rb	.rb
ruby-termios-0.9.5	.5
ruby-termios-0.9.5.tar.gz	.gz
#sed.rb#	.rb#
sed.rb	.rb
session.rb	.rb
streamer.rb	.rb
todo	
util

You can execute a command within the each block.


rbh> ls.each { |f| wc(f).exec  }
  64  131 1013 awk.rb
 116  202 1914 command_builder.rb
  56  113 1034 command.rb
 196  563 4388 executable.rb
  24  217 1469 LICENSE
 12  12 132 ls-result
  78  245 1917 pipe.rb
 142  544 3388 README.textile
 107  278 2340 rubish.rb
 46  54 546 sed.rb
  95  206 1870 session.rb
 264  708 5906 streamer.rb

One nifty thing to do is to collect the outputs of
nested commands.


rbh> ls.map {|f| stat(f).map }
[["  File: `awk.rb'",
  "  Size: 1013      \tBlocks: 8          IO Block: 4096   regular file",
  "Device: 801h/2049d\tInode: 984369      Links: 1",
  "Access: (0644/-rw-r--r--)  Uid: ( 1000/  howard)   Gid: ( 1000/  howard)",
  "Access: 2009-03-17 21:02:25.000000000 -0700",
  "Modify: 2009-03-17 21:02:13.000000000 -0700",
  "Change: 2009-03-17 21:02:13.000000000 -0700"],
 ["  File: `command_builder.rb'",
  "  Size: 1914      \tBlocks: 8          IO Block: 4096   regular file",
  "Device: 801h/2049d\tInode: 984371      Links: 1",
  "Access: (0644/-rw-r--r--)  Uid: ( 1000/  howard)   Gid: ( 1000/  howard)",
  "Access: 2009-03-17 21:02:25.000000000 -0700",
  "Modify: 2009-03-17 21:02:13.000000000 -0700",
  "Change: 2009-03-17 21:02:13.000000000 -0700"],
...
]

All the above apply to pipes as well. We can
find out how many files are in a directory as a
Ruby Integer.


rbh> p { ls; wc}
     23      23     248
rbh> p { ls; wc}.map
["     23      23     248\n"]
rbh> p { ls; wc}.map.first.split
["23", "23", "248"]
rbh> p { ls; wc}.map.first.split.first.to_i
23

Sed and Awk

Rubish has sedish and awkish things that are not quite
like sed and awk, but not entirely unlike sed and
awk.

Rubish::Sed doesn’t implicitly print (unlike real
sed). There’s actually no option to turn on
implicit printing.


  Rubish::Sed#line
    the current line sed is processing
  Rubish::Sed#p(*args)
    print current line if no argument is given.
  Rubish::Sed#s(regexp,str)
    String#sub! on the current line
  Rubish::Sed#gs(regexp,str)
    String#gsub! on the current line
  Rubish::Sed#q
    quit from sed processing.

rbh> ls.sed { gs /b/, "bee"; p if line =~ /.rbee$/ }
awk.rbee
command_beeuilder.rbee
command.rbee
executabeele.rbee
pipe.rbee
rubeeish.rbee
sed.rbee
session.rbee
streamer.rbee

# output to a file
rbh> ls.sed { p }.o "sed.result"

Rubish::Sed doesn’t have the concepts of swapping,
appending, modifying, or any interaction between
pattern space and hold space. Good riddance. The
block is instance_eval by the Sed object, so you
can keep track of state using instance variables.

Awk is a lot like sed. But you can associate
actions to be done before or after awk processing.

Rubish::Awk#begin(&block) block is instance_eval by the Awk object before processing. Rubish::Awk#act(&block) blcok is instance_eval by the Awk object for each line. Rubish::Awk#end(&block) block is instance_eval at the end of processing. Its value is returned as the result.

  rbh> ls.awk { puts do_something(line)}
  # you can have begin and end blocks for awk.
  rbh> ls.awk.begin { ...init }.act { ...body}.end { ...final}

You can associate multiple blocks with either awk
or sed. Each block is an “action” that’s processed
in left-to-right order.


  rbh> cmd.sed.act { ... }.act { ... }
  rbh> cmd.awk.act { ... }.act { ... }

Streamer

Rubish::{Sed,Awk} actually share the
Rubish::Streamer mixin. Most of their mechanisms
are implemented by this mixin. It has two
interesting features:

  1. Line buffering allows arbitrary peek ahead (of
    lines). This lets you do what sed can with
    hold space, but in a much cleaner way.
  2. Aggregation is what awk is all about. But
    Rubish::Streamer implements special aggregators
    inspired by Common Lisp’s Loop facilities.

Let’s see line buffering first.

Rubish::Streamer#peek(n=1) Return the next n lines (as Array of Strings), and put these lines in the stream buffer. Rubish::Streamer#skip(n=1) Skip the next n lines. Rubish::Streamer#next(n=1) Skip other actions in the streamer, and process next line. Rubish::Streamer#quit(n=1) Quit the streaming process.

By the way, isn’t it nice that these methods all
have four chars?


  # print files in groups of 3, separated by blank lines.
  rbh> ls.sed { p; puts peek(2); puts ""; skip(3) }

In general, the aggregating methods take a name, a
value, and an optional key. The aggregated result
is accumulated in an instance variable named by
the given name. Each aggregator type basically
does foldl on an initial value. The optional key
is used to partition an aggregation.

Rubish::Streamer#count(name,key=nil) count number of times it’s called. Rubish::Streamer#max(name,val,key=nil) Rubish::Streamer#min(name,val,key=nil) Rubish::Streamer#collect(name,val,key=nil) collect vals into an array. Rubish::Streamer#hold(name,size,val,key=nil) collect vals into a fixed-size FIFO queue. Rubish::Streamer#pick(name,val,key=nil,&block) pass the block old_val and new_val, and the value returned by block is saved in “name”.

Each aggregator’s name is used to create a
bucket. A reader method named by name can be used
to access that bucket. A bucket is a hash of
partitioned accumulation keyed by key. The special
key nil aggregates over the entire domain (like
MySQL’s rollup).


# find the length of the longest file name, and
# collect the file names.
rbh> ls.awk { f=a[0]; max(:fl,f.length,File.extname(f)); collect(:fn,f)}.end { pp buckets; [fl,fl(""),fn] }
{:fl=>{""=>10, nil=>18, ".textile"=>14, ".rb"=>18},
 :fn=>
  {nil=>
    ["awk.rb",
     "command_builder.rb",
     "command.rb",
     "executable.rb",
     "LICENSE",
     "ls-result",
     "pipe.rb",
     "README.textile",
     "rubish.rb",
     "sed.rb",
     "sed-result",
     "session.rb",
     "streamer.rb"]}}
[18,
 10,
 ["awk.rb",
  "command_builder.rb",
  "command.rb",
  "executable.rb",
  "LICENSE",
  "ls-result",
  "pipe.rb",
  "README.textile",
  "rubish.rb",
  "sed.rb",
  "sed-result",
  "session.rb",
  "streamer.rb"]]

The first printout of hash is from pp buckets. You can see the aggregation partitioned
by file extensions (in the case of fl). Note
that fl(nil) holds the max length over all the
files (the entire domain).

Happy Hacking!