branch: master

Dec 03, 2011

  1. Dan Lucraft

    Bump version

    danlucraft authored
  2. Dan Lucraft

    Bump version, and remove Hoe

    danlucraft authored
  3. Dan Lucraft

    Merge remote-tracking branch 'taw/master'

    Conflicts:
    	bin/rak
    	spec/help_spec.rb
    	spec/spec_helper.rb
    danlucraft authored
  4. Dan Lucraft

    Add a --lua option

    danlucraft authored
  5. Dan Lucraft

    Fix up specs

    danlucraft authored
  6. Dan Lucraft

    Merge pull request #2 from FooBarWidget/master

    Fixed Ruby 1.9 encoding problems
    danlucraft authored

Sep 03, 2010

  1. Hongli Lai

    Fix some Ruby 1.9 encoding problems.

    FooBarWidget authored

Jul 29, 2010

  1. Tomasz Wegrzanowski

    A few bugs caused by combinations of -m/-v/-c options fixed.

    Both these work in ack:
    * -v -c displayed incorrect results if multiple matches per line occurred.
    * -m -A/-C didn't display context after last match.
    
    Also:
    * -v -c -m treated lines after last match as missing instead of non-matching.
    
    -m can be interpreted in two ways:
    * Lines after limit still match, just don't print them (ack interpretation)
    * Lines after limit just don't match (what I made rak do)
    
    I don't see any way to support any kind of --eval together with the ack interpretation.
    --eval 'break' means code after that should never execute.
    
    This only affects a few edge cases like:
    * matches in -A/-C context past -m limit
    * -v -c -m counts
    * -c -m counts
    
    I'm not convinced -c -m is such a sensible combination.
    -A/-C -m is rather sensible, but neither way seems to me
    to be obviously wrong (except with --eval).
    taw authored
  2. Tomasz Wegrzanowski

    Finishing spec cleanup.

    As we never leave ANSI escape codes untouched,
    and we almost always want indentation, make rak()
    do both by default. Now specs really look pretty good.
    
    Also replaced split("\n") by String#lines (which keeps final \n) -
    makes code somewhat cleaner in a few places.
    taw authored
  3. Tomasz Wegrzanowski

    Big spec cleanup.

    * Extra test added checking that all supported options are mentioned in help message.
    * To make test readable, <<-END.unindent(6) + properly indented text everywhere.
    * rak() command to do suitable ansi stripping itself.
    * RSpec supports syntax like:
        foo.should include bar
      which seems more readable than:
        foo.include?(bar).should be_true
    
    This isn't how I want the specs to look; I'm just committing as
    it's a good cleanup milestone.
    taw authored
  4. Tomasz Wegrzanowski

    Fixed minor mistakes in rak help message.

    taw authored

Jul 27, 2010

  1. Tomasz Wegrzanowski

    Merge remote branch 'danlucraft/master'

    Conflicts (trivial):
    	spec/spec_helpers.rb
    taw authored
  2. Tomasz Wegrzanowski

    First attempt at making --eval interface for filtering with arbitrary…

    … Ruby code.
    
    It's sort of like -nle/-ple. It's controlled mainly with next.
    
    Simple examples (if rak_spec.rb is too unreadable):
    
    Print lines 15..25 (only line 20 is highlighted)
      rak -C5 --eval 'next unless $. == 20'
    
    Grep numbers.
    Except skip every section demarcated by =begin and =end.
    (skipped lines could still show up as context)
      rak --eval 'next if $_[/^=begin/]..$_[/^=end/]; next unless $_ =~ /\d+/'
    
    Look for fails but only in first 1000 lines per file.
      rak --eval 'break if $. >= 1000; next unless $_ =~ /\bfail/i'
    
    This next/break would be quite straightforward except we don't
    actually want to next/break File#each_line's loop - we just want
    to go to no-match case once (for next) or for the rest of file
    (for break).
    
    We should probably really break once after-context clears,
    for the sake of performance.
    
    (By the way -m in rak actually breaks, and by doing so it doesn't
    print context requested. It works in ack, but it keeps highlighting -
    I'm not sure if this is right)
    
    This --eval doesn't have elegant solution for multiple matches.
    $_.scan(/x/) { matches << $~ } is the best it can do.
    It really should be something simple like /x/g.
    
    This interface wasn't really all that well thought out, I just hacked
    it together and it seems to work for now.
    
    Full --eval API:
    * $_ is current line. $. is current line number. $~ is nil.
    * next means no match
    * break means no match for this line and for the rest of the file
    * finishing without next or break means a match
    * if you put something in matches, we're done
      (horrible things will happen if these don't correspond to current state of $_)
    * otherwise if $~ is not-nil - it's considered a match
      (likewise, horrible things will happen if $~ is not against current $_)
    * otherwise, the entire line is considered a single match.
    taw authored
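
As a rough illustration of the next/break semantics listed in the --eval API above, here is a minimal sketch. It is not rak's actual implementation: compile_filter and matching_line_numbers are hypothetical names, and the snippet sees plain locals named line and lineno instead of $_ and $. so that the example stays self-contained. The idea is to wrap the snippet in a one-shot loop so that next and break have a loop to act on without touching the real line-reading loop.

```ruby
# Hypothetical helper, not rak's code: compiles a filter snippet into a
# lambda that returns :match, :next or :break for a single line.
def compile_filter(code)
  eval(<<-RUBY)
    lambda do |line, lineno|
      one_shot = [nil]
      matched = false
      # The snippet runs inside a one-iteration loop. `next` ends the
      # iteration early, `break` makes #each return nil instead of the array.
      result = one_shot.each do
        #{code}
        matched = true
      end
      if !result.equal?(one_shot)
        :break
      elsif matched
        :match
      else
        :next
      end
    end
  RUBY
end

# Returns the 1-based numbers of matching lines. After a :break, every
# remaining line is treated as a non-match, but the outer loop keeps running
# (which is where after-context could still be printed).
def matching_line_numbers(lines, code)
  filter = compile_filter(code)
  stopped = false
  hits = []
  lines.each_with_index do |line, i|
    next if stopped
    case filter.call(line, i + 1)
    when :match then hits << i + 1
    when :break then stopped = true
    end
  end
  hits
end

lines = ["fail one", "ok", "FAIL two", "fail three"]
p matching_line_numbers(lines, 'break if lineno >= 4; next unless line =~ /\bfail/i')
```

Finishing the snippet without next or break counts as a match, which mirrors the rule in the API list; the matches array and $~ handling from the list are left out of this sketch.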

Jul 25, 2010

  1. Tomasz Wegrzanowski

    Code generation cleanup.

    I was mostly trying to figure out why rak is slower than ack.
    Turns out it's mostly Ruby i/o being slower, and these tiny changes
    have nearly no measurable effect. Oh well.
    taw authored
  2. Tomasz Wegrzanowski

    Use Pathname, not String for pathnames.

    Dir["#{path}/*"] will fail if path contains any unusual characters,
    like {}s from firefox extension ids for example.
    Now it works and is even marginally nicer.
    taw authored
  3. Tomasz Wegrzanowski

    Option to print skipped files.

    Weird file extensions happen all the time, and it's often difficult
    to figure out if rak doesn't find a match because it's not there,
    or because it doesn't recognize some extension. --skipped will
    provide a quick way to check that.
    
    Related - added some extra extensions for SML.
    taw authored

Jul 23, 2010

  1. Tomasz Wegrzanowski

    More source file types.

    I merged three lists:
    * One from rak
    * One from ack (except special types like --text/--binary/--skipped)
    * One I made up for a script for grepping my own code repository some time ago
      (only those types that seem likely to be useful more generally)
    
    In case you're wondering .l / .y / .mll / .mly are lex/yacc for C/ocaml.
    taw authored
  2. Tomasz Wegrzanowski

    --help types to look at actual list of types, don't repeat yourself.

    taw authored
  3. Dan Lucraft

    Fix help

    danlucraft authored
  4. Tomasz Wegrzanowski

    If something is specified in cmdline, it should be followed even if a…

    … symlink
    taw authored
  5. Tomasz Wegrzanowski

    Major refactoring of file finding part. It was all so entangled that …

    …it wasn't at all
    
    clear what it was doing, and it turned out it was doing many dubious things.
    Some decisions the current version makes could be argued with too, but it should at least
    be a lot clearer what it does.
    
    * Except when sorting, files are yielded to matching or printing
      immediately. This makes a huge difference in cases like grepping server
      logs over sshfs - something I do pretty much every day.
    * File matches .x only if it ends with .x.
      Old rak treated all .css files as .c files.
      (also take a look at extension_regexp method)
    * Always check shebang for type if unknown, even with -a.
      Otherwise -a can actually cause fewer files to match than not
      using -a, and cause other weirdness. For example
      --perl matched #!perl scripts, but -a --perl did not, leading to
      confusion.
    * VCS check is only performed on descend. If you explicitly
      ask for .svn - or for something inside .svn - you should get it.
    * Regexp vs Oniguruma::ORegexp stuff mostly moved aside to separate
      methods.
    * Somehow this didn't make any spec fail. You should recheck that
      it does what it's supposed to anyway. Big changes that don't cause
      test failures are suspicious ;-)
    
    I think this closes list of issues I had with rak.
    taw authored

Jul 22, 2010

  1. Tomasz Wegrzanowski

    I want to make rak able to match files immediately once found,

    not with a huge delay.
    
    Code related to that is now very messy, so first a few batches of refactoring.
    Some comments:
    
    * You cannot have both :do_search and :print_filelist be true
      It's possible that neither is true (help messages etc.) but then
      we instantly quit. Using that, a lot of cleanup.
    * xs.each{|x| puts x}
      is exactly what puts does on collections already
      puts xs
    * Testing if(x == false) is usually a bad idea. A lot of operations
      don't really bother being consistent about returning false vs nil,
      or true vs anything-but-false-and-nil. Better test if(!x) or unless(x).
    * huge cascades of nested if/elses can usually be simplified a lot.
    * when doing open(file).readline there's not much reason to rescue
      out of readline's problems but not of open's
      By the way ruby files don't need closing - gc handles them really well.
      If you write to file then open(fn,'w'){|f| ... } ensures close
      on block exit, and so you know file state. For just reading,
      open(fn).read_something works pretty much as well.
      (whichever is cleaner)
    taw authored
  2. Tomasz Wegrzanowski

    Not only is all != re not a valid shortcut, it also slows rak down a …

    …lot,
    
    and I cannot think of any case where it actually improves performance.
    
    For piping in real time - rak just couldn't do that.
    
    For huge files, rak would run out of memory instead of just keeping a few lines at a time.
    Or if we only cared about the first N - then reading so much more data in
    was pointless.
    
    Not to mention regexps that had exponential backtracking - they are totally fine
    if matched against one line at a time, but a huge file?
    
    Anyway, now it's faster, correct, and can handle streams.
    taw authored
  3. Tomasz Wegrzanowski

    Merge remote branch 'danlucraft/master'

    taw authored
  4. Tomasz Wegrzanowski

    Got rid of extra \n on the end.

    Also made -v use the same separator newlines as normal match.
    In both cases new behaviour is exactly what ack does.
    taw authored
  5. Tomasz Wegrzanowski

    With tests showing failures this time.

    You need to parenthesize a regexp before prepending or appending
    ^$\b etc.
    
    $ echo "ruby" | rak_before -s 'x|y'
    1|ruby
    $ echo "ruby" | rak_after -s 'x|y'
    
    Now this solves -s/-e/-x, but with -w it's a more complicated
    matter.
    
    I started cross-checking these things against other
    regexp tools and failures are everywhere.
    
    Bug fixed in this patch, ack also has it
      $ echo 123 | egrep -w "1|3"
      $ echo 123 | pcregrep -w "1|3"
      123
      $ echo 123 | ack -w "1|3"
      123
      $ echo 123 | rak_before -w "1|3"
      1|123
      $ echo 123 | rak_after -w "1|3"
    
    ack also has another bug on its own
    it skips \b if *regexp* starts/ends with non-\w
    That is it thinks -w of /12?/ is /\b12?/
      $ echo 123 | egrep -w "12?"
      $ echo 123 | pcregrep -w "12?"
      $ echo 123 | ack -w "12?"
      123
      $ echo 123 | rak_before -w "12?"
      $ echo 123 | rak_after -w "12?"
    
    So -w definitely doesn't mean "one word"
      $ echo "0 0" | egrep -w "0.0"
      0 0
      $ echo "0 0" | pcregrep -w "0.0"
      0 0
      $ echo "0 0" | ack -w "0.0"
      0 0
      $ echo "0 0" | rak_before -w "0.0"
      1|0 0
      $ echo "0 0" | rak_after -w "0.0"
      1|0 0
    
    ack/rak say '' is -w , egrep/pcregrep say '' is not -w
      $ echo 123 | egrep -w "1?"
      $ echo 123 | pcregrep -w "1?"
      $ echo 123 | ack -w "1?"
      123
      $ echo 123 | rak_before -w "1?"
      1|123
      $ echo 123 | rak_after -w "1?"
      1|123
    
    Similar, except ack matches 1 due to bug, rak matches ''
      $ echo "1xx" | egrep -w "^\d*"
      $ echo "1xx" | pcregrep -w "^\d*"
      $ echo "1xx" | ack -w "^\d*"
      1xx
      $ echo "1xx" | rak_before -w "^\d*"
      1|1xx
      $ echo "1xx" | rak_after -w "^\d*"
      1|1xx
    
    And this is just ridiculous. Now egrep matches '', but rak doesn't?
      $ echo " 1xx" | egrep -w "^\d*"
      1xx
      $ echo " 1xx" | pcregrep -w "^\d*"
      $ echo " 1xx" | ack -w "^\d*"
      1xx
      $ echo " 1xx" | rak_before -w "^\d*"
      $ echo " 1xx" | rak_after -w "^\d*"
    
    And to make confusion complete:
      $ echo "1xx" | egrep "\b^\d*\b"
      1xx
      $ echo "1xx" | pcregrep "\b^\d*\b"
      $ echo "1xx" | ack "\b^\d*\b"
      1xx
      $ echo "1xx" | rak_before "\b^\d*\b"
      1|1xx
      $ echo "1xx" | rak_after "\b^\d*\b"
      1|1xx
    
    By the way /\b./ matches "1" but not " " in perl/ruby/egrep
    regexps - start and end of string are treated as non-words.
    (also \b^ = ^\b - they are zero length so their order doesn't matter)
    
    If you have more clue than me, do tell.
    taw authored
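
The parenthesization point in the commit above can be shown in a few lines of Ruby. This is only a sketch of the idea, not rak's code: the four helper names are made up for illustration. Without a group, an anchor or \b binds only to the nearest alternative of 'x|y'.

```ruby
# Hypothetical helpers illustrating why a pattern must be grouped before
# anchors or \b are prepended/appended. Not taken from rak.
def naive_start(pattern)
  Regexp.new("^" + pattern)              # "^x|y" means (^x) or (y)
end

def grouped_start(pattern)
  Regexp.new("^(?:#{pattern})")          # "^(?:x|y)" anchors both branches
end

def naive_word(pattern)
  Regexp.new("\\b" + pattern + "\\b")    # "\b1|3\b" means (\b1) or (3\b)
end

def grouped_word(pattern)
  Regexp.new("\\b(?:#{pattern})\\b")
end

p naive_start("x|y").match?("ruby")      # true: the bare "y" branch matches
p grouped_start("x|y").match?("ruby")    # false
p naive_word("1|3").match?("123")        # true: "\b1" matches at the start
p grouped_word("1|3").match?("123")      # false
```

The naive results correspond to the rak_before output shown in the commit message, and the grouped ones to rak_after.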
  6. Dan Lucraft

    Update version to 1.1

    danlucraft authored
  7. Tomasz Wegrzanowski

    All specs now pass.

    * Pathname from Ruby stdlib is an instant solution to the huge mess of File.blahs/Dir.blehs
    * Using supposedly correct way of finding Ruby binary (supports ruby1.9, jruby, ruby.exe etc.).
    * All specs standardized on behaviour that a group like:
      file:
      1| line
      2| line
    
      is always followed by empty line.
      I don't think the last one should - newline should be separator not terminator,
      but this is another issue.
    * Reason #1 why Ruby is awesome for Unix scripting is that it makes it so easy
      to totally avoid manipulating global state. Changing directory in any
      way other than Dir.chdir(dir){ ... } or exactly once on start/exit is just so shell...
      Current directory is global state of the worst kind - affecting
      almost every i/o operation. Never do that.
    * ENV isn't as bad as chdir, but it's far better to encapsulate it.
      Here oddly Perl pwns Ruby with its local $ENV{A}='B' that restores
      either old value or its absence on exit.
      Ruby code for that would be something horrible like:
    
      def with_changed_env(key, *args)
        existed, old = ENV.has_key?(key), ENV[key]
        begin
          if args.empty?
            ENV.delete(key)
          else
            ENV[key] = args[0]
          end
          yield
        ensure
          if existed
            ENV[key] = old
          else
            ENV.delete(key)
          end
        end
      end
    
      But we know RAK_TEST doesn't exist when started, so it's simplified.
    
    Anyway there is no global mutable state now left,
    and all specs pass.
    taw authored
  8. Tomasz Wegrzanowski

    Fixed handling of files with no \n at the end

    taw authored
  9. Dan Lucraft

    Harmless changes to specs to make them pass

    danlucraft authored
  10. Dan Lucraft

    Merge remote branch 'taw/master'

    danlucraft authored
  11. Dan Lucraft

    Clean up tests

    danlucraft authored
  12. Tomasz Wegrzanowski

    Fixed XML files autodetection.

    Ruby doesn't support /\Q<?xml/ from Perl's regexps.
    To get \Q effect you need to backslash manually /<\?xml/,
    call Regexp.escape("<?xml"), or just ignore regexps
    and do something like line["<?xml"].
    taw authored
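
The three alternatives named in the XML autodetection commit above can be compared side by side. This is a small sketch for illustration only; the constant names and xml_header? are hypothetical and not taken from rak.

```ruby
# An unescaped "?" makes the "<" optional, so this also matches plain "xml".
UNESCAPED = /<?xml/

# Option 1: backslash the metacharacter by hand.
BY_HAND = /<\?xml/

# Option 2: let Regexp.escape do the quoting (closest to Perl's \Q).
ESCAPED = Regexp.new(Regexp.escape("<?xml"))

# Option 3: skip regexps entirely; String#[] with a string argument
# returns the substring if present, nil otherwise.
def xml_header?(line)
  !line["<?xml"].nil?
end

header = '<?xml version="1.0"?>'
prose  = "this line only mentions xml"

p UNESCAPED.match?(prose)   # true, which is the false positive
p BY_HAND.match?(prose)     # false
p ESCAPED.match?(header)    # true
p xml_header?(header)       # true
```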
  13. Tomasz Wegrzanowski

    It's a big interconnected patch, but these issues are related.

    To find multiple matches you cannot cut string after first match,
    and match against the rest. If you do that /^x/ will match 'xxx'
    three times. The first match is done correctly, so same lines get printed,
    but highlighting and --output will be broken.
    
    This also led to infinite loops if regexp could match empty string
    (eg. matching empty lines with /^$/, (?=) look-ahead hackery, or common mistakes)
    (pcregrep seems to have the same bug)
    
    If last match also matches final \n, puts'ing its post_match will
    produce double \ns in output.
    
    If matched line contained any tabs (like far too much code),
    then prefixing it with anything will just break it.
    
    Now ignoring this issue entirely seems to be the standard practice.
    I'd still prefer to fix it, as workarounds are really hard:
    * there's no way to pre-expand tabs before rak if more than one file is searched
    * piping to tab expansion disables highlighting
    * Even with highlighting forced, results of tab expansion on output
      would still be completely wrong because of line prefixes.
      (not to mention added difficulty of expanding tabs in
      presence of unprintable escape codes)
    * expanding tabs in original files would be a good idea,
      except when it's someone else's code repository ;-)
    
    So it has to be either done in rak, or stay broken.
    
    Expanding tabs is usually simple (String#expand_tabs).
    
    There is however a big complication:
    * We cannot expand tabs before matching, as that would change
      what matches and what doesn't.
    * We cannot expand tabs between matching and highlighting,
      as match indexes refer to old string, and completely
      incorrect things would get highlighted.
    * We cannot expand between highlighting and printing,
      unless we teach String#expand_tabs which characters
      are unprintable highlighting codes.
    
    We can cut string into pieces on match boundaries and expand these separately.
    Recursive formula is only slightly nontrivial, as we need to count offsets
    after not before expansion:
      (a+b).expand_tabs(i):
        ax = a.expand_tabs(i)
        bx = b.expand_tabs(i+ax.size)
        (ax+bx)
    
    It's not terribly pretty but it works just fine.
    taw authored
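
The piecewise tab expansion described in the commit above can be sketched as follows. This is a hedged illustration rather than rak's code: expand_tabs here is a standalone function (the commit refers to a String#expand_tabs method, which is not part of core Ruby), expand_pieces is a hypothetical name, a tab width of 8 is assumed, and each input is assumed to be a single line.

```ruby
# Expand tabs in str as if str started at 0-based column col.
# Assumes tab stops every tab_width columns and no newlines in str.
def expand_tabs(str, col = 0, tab_width = 8)
  out = String.new
  str.each_char do |ch|
    if ch == "\t"
      pad = tab_width - (col % tab_width)  # distance to the next tab stop
      out << " " * pad
      col += pad
    else
      out << ch
      col += 1
    end
  end
  out
end

# Expand a line that was cut into pieces on match boundaries. Each piece
# starts at the column where the previous *expanded* piece ended, which is
# the (a+b).expand_tabs(i) recurrence from the commit message.
def expand_pieces(pieces, col = 0)
  pieces.map do |piece|
    expanded = expand_tabs(piece, col)
    col += expanded.size
    expanded
  end
end

whole  = expand_tabs("a\tb\tc")
pieces = expand_pieces(["a\t", "b", "\tc"])
p pieces.join == whole
```

Because the pieces are expanded separately, highlighting codes can be wrapped around an individual piece afterwards without the expansion ever seeing them.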
  14. Tomasz Wegrzanowski

    Improved shebang matches:

    * Accept any numbers after executable name (ruby1.9, python3.0 etc.)
    * Added /sh/ for shell and /make/ for Makefile
    * Made it case insensitive (Python.app uses that...)
    * If something has shebang, it's almost certainly code, so it should be searched by default.
    taw authored
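
One way the shebang rules in the last commit above could look in Ruby is sketched below. Everything here is an assumption made for illustration: the INTERPRETERS table, the shebang_type name, and the handling of env are not taken from rak, which the commit describes as using patterns such as /sh/ and /make/.

```ruby
# Hypothetical table mapping interpreter names to file types.
INTERPRETERS = {
  "ruby"   => :ruby,
  "python" => :python,
  "perl"   => :perl,
  "sh"     => :shell,
  "make"   => :make
}

# Guess a file type from the first line of a file, or return nil.
def shebang_type(first_line)
  m = first_line.match(/\A#!\s*(\S+)(?:\s+(\S+))?/)
  return nil unless m
  prog = File.basename(m[1])
  # "#!/usr/bin/env ruby" names the real interpreter in the second word.
  prog = File.basename(m[2]) if prog == "env" && m[2]
  # Case insensitive, and any trailing version digits are ignored
  # (ruby1.9, python3.0 and so on).
  INTERPRETERS[prog.downcase.sub(/[\d.]+\z/, "")]
end

p shebang_type("#!/usr/bin/env ruby1.9\n")
p shebang_type("#!/usr/bin/python3.0\n")
p shebang_type("#!/usr/bin/make -f\n")
p shebang_type("puts 1\n")
```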