
Reformatting.

- Bringing indents in a little.
- Simplifying a little logic so that boolean shortcuts are taken faster.
1 parent 07a0a2b commit 56ce5139362e4e4fa26ceb763c79deee67489106 @halostatue committed
Showing with 1,352 additions and 1,385 deletions.
  1. +0 −2 lib/diff-lcs.rb
  2. +642 −661 lib/diff/lcs.rb
  3. +1 −14 lib/diff/lcs/array.rb
  4. +4 −17 lib/diff/lcs/block.rb
  5. +218 −221 lib/diff/lcs/callbacks.rb
  6. +2 −3 lib/diff/lcs/change.rb
  7. +0 −2 lib/diff/lcs/htmldiff.rb
  8. +65 −56 lib/diff/lcs/hunk.rb
  9. +262 −227 lib/diff/lcs/internals.rb
  10. +157 −168 lib/diff/lcs/ldiff.rb
  11. +1 −14 lib/diff/lcs/string.rb
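
The documentation reflowed in lib/diff/lcs.rb describes the longest common subsequence problem using two sample sequences. As a rough illustration only, the sketch below computes an LCS with a textbook dynamic-programming table. It is not the McIlroy-Hunt approach the library implements, and `lcs_sketch` is a hypothetical helper name, not part of Diff::LCS.

```ruby
# Hypothetical helper for illustration; not part of the diff-lcs API.
# Textbook O(n*m) dynamic programming, not the algorithm the gem uses.
def lcs_sketch(a, b)
  # table[i][j] holds the LCS length of a[i..] and b[j..].
  table = Array.new(a.size + 1) { Array.new(b.size + 1, 0) }
  (a.size - 1).downto(0) do |i|
    (b.size - 1).downto(0) do |j|
      table[i][j] =
        if a[i] == b[j]
          table[i + 1][j + 1] + 1
        else
          [table[i + 1][j], table[i][j + 1]].max
        end
    end
  end

  # Walk the table from the front, collecting the matched elements.
  result = []
  i = j = 0
  while i < a.size && j < b.size
    if a[i] == b[j]
      result << a[i]
      i += 1
      j += 1
    elsif table[i + 1][j] >= table[i][j + 1]
      i += 1
    else
      j += 1
    end
  end
  result
end

# The two sample sequences from the introduction in lib/diff/lcs.rb.
seq1 = %w(a b c d f g h j q z)
seq2 = %w(a b c d e f g i j k r x y z)
puts lcs_sketch(seq1, seq2).join(' ') # prints "a b c d f g j z"
```

The same helper applied to the repeated-element example (`a x b y c z p d q` against `a b c a x b y c z`) yields `a x b y c z`, the result the introduction says a naive front-to-back matching would miss.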
2 lib/diff-lcs.rb
@@ -1,5 +1,3 @@
# -*- ruby encoding: utf-8 -*-
require 'diff/lcs'
-
-# vim: ft=ruby
1,303 lib/diff/lcs.rb
@@ -1,136 +1,135 @@
# -*- ruby encoding: utf-8 -*-
-module Diff
- # = Diff::LCS 1.1.3
- # Computes "intelligent" differences between two sequenced Enumerables.
- # This is an implementation of the McIlroy-Hunt "diff" algorithm for
- # Enumerable objects that include Diffable.
- #
- # Based on Mario I. Wolczko's Smalltalk version (1.2, 1993) and Ned Konz's
- # Perl version (Algorithm::Diff 1.15).
- #
- # == Synopsis
- # require 'diff/lcs'
- #
- # seq1 = %w(a b c e h j l m n p)
- # seq2 = %w(b c d e f j k l m r s t)
- #
- # lcs = Diff::LCS.LCS(seq1, seq2)
- # diffs = Diff::LCS.diff(seq1, seq2)
- # sdiff = Diff::LCS.sdiff(seq1, seq2)
- # seq = Diff::LCS.traverse_sequences(seq1, seq2, callback_obj)
- # bal = Diff::LCS.traverse_balanced(seq1, seq2, callback_obj)
- # seq2 == Diff::LCS.patch(seq1, diffs)
- # seq2 == Diff::LCS.patch!(seq1, diffs)
- # seq1 == Diff::LCS.unpatch(seq2, diffs)
- # seq1 == Diff::LCS.unpatch!(seq2, diffs)
- # seq2 == Diff::LCS.patch(seq1, sdiff)
- # seq2 == Diff::LCS.patch!(seq1, sdiff)
- # seq1 == Diff::LCS.unpatch(seq2, sdiff)
- # seq1 == Diff::LCS.unpatch!(seq2, sdiff)
- #
- # Alternatively, objects can be extended with Diff::LCS:
- #
- # seq1.extend(Diff::LCS)
- # lcs = seq1.lcs(seq2)
- # diffs = seq1.diff(seq2)
- # sdiff = seq1.sdiff(seq2)
- # seq = seq1.traverse_sequences(seq2, callback_obj)
- # bal = seq1.traverse_balanced(seq2, callback_obj)
- # seq2 == seq1.patch(diffs)
- # seq2 == seq1.patch!(diffs)
- # seq1 == seq2.unpatch(diffs)
- # seq1 == seq2.unpatch!(diffs)
- # seq2 == seq1.patch(sdiff)
- # seq2 == seq1.patch!(sdiff)
- # seq1 == seq2.unpatch(sdiff)
- # seq1 == seq2.unpatch!(sdiff)
- #
- # Default extensions are provided for Array and String objects through the
- # use of 'diff/lcs/array' and 'diff/lcs/string'.
- #
- # == Introduction (by Mark-Jason Dominus)
- #
- # <em>The following text is from the Perl documentation. The only changes
- # have been to make the text appear better in Rdoc</em>.
- #
- # I once read an article written by the authors of +diff+; they said that
- # they hard worked very hard on the algorithm until they found the right
- # one.
- #
- # I think what they ended up using (and I hope someone will correct me,
- # because I am not very confident about this) was the `longest common
- # subsequence' method. In the LCS problem, you have two sequences of
- # items:
- #
- # a b c d f g h j q z
- # a b c d e f g i j k r x y z
- #
- # and you want to find the longest sequence of items that is present in
- # both original sequences in the same order. That is, you want to find a
- # new sequence *S* which can be obtained from the first sequence by
- # deleting some items, and from the second sequence by deleting other
- # items. You also want *S* to be as long as possible. In this case *S* is:
- #
- # a b c d f g j z
- #
- # From there it's only a small step to get diff-like output:
- #
- # e h i k q r x y
- # + - + + - + + +
- #
- # This module solves the LCS problem. It also includes a canned function
- # to generate +diff+-like output.
- #
- # It might seem from the example above that the LCS of two sequences is
- # always pretty obvious, but that's not always the case, especially when
- # the two sequences have many repeated elements. For example, consider
- #
- # a x b y c z p d q
- # a b c a x b y c z
- #
- # A naive approach might start by matching up the +a+ and +b+ that appear
- # at the beginning of each sequence, like this:
- #
- # a x b y c z p d q
- # a b c a b y c z
- #
- # This finds the common subsequence +a b c z+. But actually, the LCS is
- # +a x b y c z+:
- #
- # a x b y c z p d q
- # a b c a x b y c z
- #
- # == Author
- # This version is by Austin Ziegler <austin@rubyforge.org>.
- #
- # It is based on the Perl Algorithm::Diff (1.15) by Ned Konz , copyright
- # &copy; 2000&ndash;2002 and the Smalltalk diff version by Mario I.
- # Wolczko, copyright &copy; 1993. Documentation includes work by
- # Mark-Jason Dominus.
- #
- # == Licence
- # Copyright &copy; 2004 Austin Ziegler
- # This program is free software; you can redistribute it and/or modify it
- # under the same terms as Ruby, or alternatively under the Perl Artistic
- # licence.
- #
- # == Credits
- # Much of the documentation is taken directly from the Perl
- # Algorithm::Diff implementation and was written originally by Mark-Jason
- # Dominus and later by Ned Konz. The basic Ruby implementation was
- # re-ported from the Smalltalk implementation, available at
- # ftp://st.cs.uiuc.edu/pub/Smalltalk/MANCHESTER/manchester/4.0/diff.st
- #
- # #sdiff and #traverse_balanced were written for the Perl version by Mike
- # Schilli <m@perlmeister.com>.
- #
- # "The algorithm is described in <em>A Fast Algorithm for Computing
- # Longest Common Subsequences</em>, CACM, vol.20, no.5, pp.350-353, May
- # 1977, with a few minor improvements to improve the speed."
- module LCS
- VERSION = '1.1.3'
- end
+module Diff; end unless defined? Diff
+# = Diff::LCS 1.2.0
+#
+# Computes "intelligent" differences between two sequenced Enumerables. This
+# is an implementation of the McIlroy-Hunt "diff" algorithm for Enumerable
+# objects that include Diffable.
+#
+# Based on Mario I. Wolczko's Smalltalk version (1.2, 1993) and Ned Konz's
+# Perl version (Algorithm::Diff 1.15).
+#
+# == Synopsis
+# require 'diff/lcs'
+#
+# seq1 = %w(a b c e h j l m n p)
+# seq2 = %w(b c d e f j k l m r s t)
+#
+# lcs = Diff::LCS.lcs(seq1, seq2)
+# diffs = Diff::LCS.diff(seq1, seq2)
+# sdiff = Diff::LCS.sdiff(seq1, seq2)
+# seq = Diff::LCS.traverse_sequences(seq1, seq2, callback_obj)
+# bal = Diff::LCS.traverse_balanced(seq1, seq2, callback_obj)
+# seq2 == Diff::LCS.patch(seq1, diffs)
+# seq2 == Diff::LCS.patch!(seq1, diffs)
+# seq1 == Diff::LCS.unpatch(seq2, diffs)
+# seq1 == Diff::LCS.unpatch!(seq2, diffs)
+# seq2 == Diff::LCS.patch(seq1, sdiff)
+# seq2 == Diff::LCS.patch!(seq1, sdiff)
+# seq1 == Diff::LCS.unpatch(seq2, sdiff)
+# seq1 == Diff::LCS.unpatch!(seq2, sdiff)
+#
+# Alternatively, objects can be extended with Diff::LCS:
+#
+# seq1.extend(Diff::LCS)
+# lcs = seq1.lcs(seq2)
+# diffs = seq1.diff(seq2)
+# sdiff = seq1.sdiff(seq2)
+# seq = seq1.traverse_sequences(seq2, callback_obj)
+# bal = seq1.traverse_balanced(seq2, callback_obj)
+# seq2 == seq1.patch(diffs)
+# seq2 == seq1.patch!(diffs)
+# seq1 == seq2.unpatch(diffs)
+# seq1 == seq2.unpatch!(diffs)
+# seq2 == seq1.patch(sdiff)
+# seq2 == seq1.patch!(sdiff)
+# seq1 == seq2.unpatch(sdiff)
+# seq1 == seq2.unpatch!(sdiff)
+#
+# Default extensions are provided for Array and String objects through the
+# use of 'diff/lcs/array' and 'diff/lcs/string'.
+#
+# == Introduction (by Mark-Jason Dominus)
+#
+# <em>The following text is from the Perl documentation. The only changes
+# have been to make the text appear better in Rdoc</em>.
+#
+# I once read an article written by the authors of +diff+; they said that
+# they worked very hard on the algorithm until they found the right
+# one.
+#
+# I think what they ended up using (and I hope someone will correct me,
+# because I am not very confident about this) was the `longest common
+# subsequence' method. In the LCS problem, you have two sequences of items:
+#
+# a b c d f g h j q z
+# a b c d e f g i j k r x y z
+#
+# and you want to find the longest sequence of items that is present in both
+# original sequences in the same order. That is, you want to find a new
+# sequence *S* which can be obtained from the first sequence by deleting
+# some items, and from the second sequence by deleting other items. You also
+# want *S* to be as long as possible. In this case *S* is:
+#
+# a b c d f g j z
+#
+# From there it's only a small step to get diff-like output:
+#
+# e h i k q r x y
+# + - + + - + + +
+#
+# This module solves the LCS problem. It also includes a canned function to
+# generate +diff+-like output.
+#
+# It might seem from the example above that the LCS of two sequences is
+# always pretty obvious, but that's not always the case, especially when the
+# two sequences have many repeated elements. For example, consider
+#
+# a x b y c z p d q
+# a b c a x b y c z
+#
+# A naive approach might start by matching up the +a+ and +b+ that appear at
+# the beginning of each sequence, like this:
+#
+# a x b y c z p d q
+# a b c a b y c z
+#
+# This finds the common subsequence +a b c z+. But actually, the LCS is +a x
+# b y c z+:
+#
+# a x b y c z p d q
+# a b c a x b y c z
+#
+# == Author
+# This version is by Austin Ziegler <austin@rubyforge.org>.
+#
+# It is based on the Perl Algorithm::Diff (1.15) by Ned Konz , copyright
+# &copy; 2000&ndash;2002 and the Smalltalk diff version by Mario I.
+# Wolczko, copyright &copy; 1993. Documentation includes work by
+# Mark-Jason Dominus.
+#
+# == Licence
+# Copyright &copy; 2004&ndash;2012 Austin Ziegler
+# This program is free software; you can redistribute it and/or modify it
+# under the same terms as Ruby, or alternatively under the Perl Artistic
+# licence.
+#
+# == Credits
+# Much of the documentation is taken directly from the Perl Algorithm::Diff
+# implementation and was written originally by Mark-Jason Dominus and later
+# by Ned Konz. The basic Ruby implementation was re-ported from the
+# Smalltalk implementation, available at
+# ftp://st.cs.uiuc.edu/pub/Smalltalk/MANCHESTER/manchester/4.0/diff.st
+#
+# #sdiff and #traverse_balanced were written for the Perl version by Mike
+# Schilli <m@perlmeister.com>.
+#
+# "The algorithm is described in <em>A Fast Algorithm for Computing Longest
+# Common Subsequences</em>, CACM, vol.20, no.5, pp.350-353, May
+# 1977, with a few minor improvements to improve the speed."
+module Diff::LCS
+ VERSION = '1.2.0'
end
require 'diff/lcs/callbacks'
@@ -142,7 +141,7 @@ module Diff::LCS
#
# lcs = seq1.lcs(seq2)
def lcs(other, &block) #:yields self[ii] if there are matched subsequences:
- Diff::LCS.LCS(self, other, &block)
+ Diff::LCS.lcs(self, other, &block)
end
# Returns the difference set between +self+ and +other+. See
@@ -197,449 +196,388 @@ def unpatch!(patchset)
end
end
-module Diff::LCS
- class << self
- # Given two sequenced Enumerables, LCS returns an Array containing their
- # longest common subsequences.
- #
- # lcs = Diff::LCS.LCS(seq1, seq2)
- #
- # This array whose contents is such that:
- #
- # lcs.each_with_index do |ee, ii|
- # assert(ee.nil? || (seq1[ii] == seq2[ee]))
- # end
- #
- # If a block is provided, the matching subsequences will be yielded from
- # +seq1+ in turn and may be modified before they are placed into the
- # returned Array of subsequences.
- def LCS(seq1, seq2, &block) #:yields seq1[ii] for each matched:
- matches = Diff::LCS::Internals.lcs(seq1, seq2)
- ret = []
- matches.each_with_index do |ee, ii|
- unless matches[ii].nil?
- if block_given?
- ret << (yield seq1[ii])
- else
- ret << seq1[ii]
- end
+class << Diff::LCS
+ def lcs(seq1, seq2, &block) #:yields seq1[ii] for each matched:
+ matches = Diff::LCS::Internals.lcs(seq1, seq2)
+ ret = []
+ matches.each_with_index do |ee, ii|
+ unless matches[ii].nil?
+ if block_given?
+ ret << (yield seq1[ii])
+ else
+ ret << seq1[ii]
end
end
- ret
end
+ ret
+ end
+ alias_method :LCS, :lcs
- # Diff::LCS.diff computes the smallest set of additions and deletions
- # necessary to turn the first sequence into the second, and returns a
- # description of these changes.
- #
- # See Diff::LCS::DiffCallbacks for the default behaviour. An alternate
- # behaviour may be implemented with Diff::LCS::ContextDiffCallbacks. If
- # a Class argument is provided for +callbacks+, #diff will attempt to
- # initialise it. If the +callbacks+ object (possibly initialised)
- # responds to #finish, it will be called.
- def diff(seq1, seq2, callbacks = nil, &block) # :yields diff changes:
- callbacks ||= Diff::LCS::DiffCallbacks
- if callbacks.kind_of?(Class)
- cb = callbacks.new rescue callbacks
- callbacks = cb
- end
- traverse_sequences(seq1, seq2, callbacks)
- callbacks.finish if callbacks.respond_to?(:finish)
+ # #diff computes the smallest set of additions and deletions necessary to
+ # turn the first sequence into the second, and returns a description of
+ # these changes.
+ #
+ # See Diff::LCS::DiffCallbacks for the default behaviour. An alternate
+ # behaviour may be implemented with Diff::LCS::ContextDiffCallbacks. If a
+ # Class argument is provided for +callbacks+, #diff will attempt to
+ # initialise it. If the +callbacks+ object (possibly initialised) responds
+ # to #finish, it will be called.
+ def diff(seq1, seq2, callbacks = nil, &block) # :yields diff changes:
+ callbacks ||= Diff::LCS::DiffCallbacks
+ if callbacks.kind_of?(Class)
+ cb = callbacks.new rescue callbacks
+ callbacks = cb
+ end
+ traverse_sequences(seq1, seq2, callbacks)
+ callbacks.finish if callbacks.respond_to?(:finish)
- if block_given?
- res = callbacks.diffs.map do |hunk|
- if hunk.kind_of?(Array)
- hunk = hunk.map { |hunk_block| yield hunk_block }
- else
- yield hunk
- end
+ if block_given?
+ res = callbacks.diffs.map do |hunk|
+ if hunk.kind_of?(Array)
+ hunk = hunk.map { |hunk_block| yield hunk_block }
+ else
+ yield hunk
end
- res
- else
- callbacks.diffs
end
+ res
+ else
+ callbacks.diffs
end
+ end
- # Diff::LCS.sdiff computes all necessary components to show two sequences
- # and their minimized differences side by side, just like the Unix
- # utility <em>sdiff</em> does:
- #
- # old < -
- # same same
- # before | after
- # - > new
- #
- # See Diff::LCS::SDiffCallbacks for the default behaviour. An alternate
- # behaviour may be implemented with Diff::LCS::ContextDiffCallbacks. If
- # a Class argument is provided for +callbacks+, #diff will attempt to
- # initialise it. If the +callbacks+ object (possibly initialised)
- # responds to #finish, it will be called.
- def sdiff(seq1, seq2, callbacks = nil, &block) #:yields diff changes:
- callbacks ||= Diff::LCS::SDiffCallbacks
- if callbacks.kind_of?(Class)
- cb = callbacks.new rescue callbacks
- callbacks = cb
- end
- traverse_balanced(seq1, seq2, callbacks)
- callbacks.finish if callbacks.respond_to?(:finish)
+ # #sdiff computes all necessary components to show two sequences and their
+ # minimized differences side by side, just like the Unix utility
+ # <em>sdiff</em> does:
+ #
+ # old < -
+ # same same
+ # before | after
+ # - > new
+ #
+ # See Diff::LCS::SDiffCallbacks for the default behaviour. An alternate
+ # behaviour may be implemented with Diff::LCS::ContextDiffCallbacks. If a
+ # Class argument is provided for +callbacks+, #sdiff will attempt to
+ # initialise it. If the +callbacks+ object (possibly initialised) responds
+ # to #finish, it will be called.
+ def sdiff(seq1, seq2, callbacks = nil, &block) #:yields diff changes:
+ callbacks ||= Diff::LCS::SDiffCallbacks
+ if callbacks.kind_of?(Class)
+ cb = callbacks.new rescue callbacks
+ callbacks = cb
+ end
+ traverse_balanced(seq1, seq2, callbacks)
+ callbacks.finish if callbacks.respond_to?(:finish)
- if block_given?
- res = callbacks.diffs.map do |hunk|
- if hunk.kind_of?(Array)
- hunk = hunk.map { |hunk_block| yield hunk_block }
- else
- yield hunk
- end
+ if block_given?
+ res = callbacks.diffs.map do |hunk|
+ if hunk.kind_of?(Array)
+ hunk = hunk.map { |hunk_block| yield hunk_block }
+ else
+ yield hunk
end
- res
- else
- callbacks.diffs
end
+ res
+ else
+ callbacks.diffs
end
+ end
- # Diff::LCS.traverse_sequences is the most general facility provided by this
- # module; +diff+ and +LCS+ are implemented as calls to it.
- #
- # The arguments to #traverse_sequences are the two sequences to
- # traverse, and a callback object, like this:
- #
- # traverse_sequences(seq1, seq2, Diff::LCS::ContextDiffCallbacks.new)
- #
- # #diff is implemented with #traverse_sequences.
- #
- # == Callback Methods
- # Optional callback methods are <em>emphasized</em>.
- #
- # callbacks#match:: Called when +a+ and +b+ are pointing
- # to common elements in +A+ and +B+.
- # callbacks#discard_a:: Called when +a+ is pointing to an
- # element not in +B+.
- # callbacks#discard_b:: Called when +b+ is pointing to an
- # element not in +A+.
- # <em>callbacks#finished_a</em>:: Called when +a+ has reached the end of
- # sequence +A+.
- # <em>callbacks#finished_b</em>:: Called when +b+ has reached the end of
- # sequence +B+.
- #
- # == Algorithm
- # a---+
- # v
- # A = a b c e h j l m n p
- # B = b c d e f j k l m r s t
- # ^
- # b---+
- #
- # If there are two arrows (+a+ and +b+) pointing to elements of
- # sequences +A+ and +B+, the arrows will initially point to the first
- # elements of their respective sequences. #traverse_sequences will
- # advance the arrows through the sequences one element at a time,
- # calling a method on the user-specified callback object before each
- # advance. It will advance the arrows in such a way that if there are
- # elements <tt>A[ii]</tt> and <tt>B[jj]</tt> which are both equal and
- # part of the longest common subsequence, there will be some moment
- # during the execution of #traverse_sequences when arrow +a+ is pointing
- # to <tt>A[ii]</tt> and arrow +b+ is pointing to <tt>B[jj]</tt>. When
- # this happens, #traverse_sequences will call <tt>callbacks#match</tt>
- # and then it will advance both arrows.
- #
- # Otherwise, one of the arrows is pointing to an element of its sequence
- # that is not part of the longest common subsequence.
- # #traverse_sequences will advance that arrow and will call
- # <tt>callbacks#discard_a</tt> or <tt>callbacks#discard_b</tt>, depending
- # on which arrow it advanced. If both arrows point to elements that are
- # not part of the longest common subsequence, then #traverse_sequences
- # will advance one of them and call the appropriate callback, but it is
- # not specified which it will call.
- #
- # The methods for <tt>callbacks#match</tt>, <tt>callbacks#discard_a</tt>,
- # and <tt>callbacks#discard_b</tt> are invoked with an event comprising
- # the action ("=", "+", or "-", respectively), the indicies +ii+ and
- # +jj+, and the elements <tt>A[ii]</tt> and <tt>B[jj]</tt>. Return
- # values are discarded by #traverse_sequences.
- #
- # === End of Sequences
- # If arrow +a+ reaches the end of its sequence before arrow +b+ does,
- # #traverse_sequence will try to call <tt>callbacks#finished_a</tt> with
- # the last index and element of +A+ (<tt>A[-1]</tt>) and the current
- # index and element of +B+ (<tt>B[jj]</tt>). If
- # <tt>callbacks#finished_a</tt> does not exist, then
- # <tt>callbacks#discard_b</tt> will be called on each element of +B+
- # until the end of the sequence is reached (the call
- # will be done with <tt>A[-1]</tt> and <tt>B[jj]</tt> for each element).
- #
- # If +b+ reaches the end of +B+ before +a+ reaches the end of +A+,
- # <tt>callbacks#finished_b</tt> will be called with the current index
- # and element of +A+ (<tt>A[ii]</tt>) and the last index and element of
- # +B+ (<tt>A[-1]</tt>). Again, if <tt>callbacks#finished_b</tt> does not
- # exist on the callback object, then <tt>callbacks#discard_a</tt> will
- # be called on each element of +A+ until the end of the sequence is
- # reached (<tt>A[ii]</tt> and <tt>B[-1]</tt>).
- #
- # There is a chance that one additional <tt>callbacks#discard_a</tt> or
- # <tt>callbacks#discard_b</tt> will be called after the end of the
- # sequence is reached, if +a+ has not yet reached the end of +A+ or +b+
- # has not yet reached the end of +B+.
- def traverse_sequences(seq1, seq2, callbacks = Diff::LCS::SequenceCallbacks, &block) #:yields change events:
- matches = Diff::LCS::Internals.lcs(seq1, seq2)
-
- run_finished_a = run_finished_b = false
- string = seq1.kind_of?(String)
-
- a_size = seq1.size
- b_size = seq2.size
- ai = bj = 0
-
- (0 .. matches.size).each do |ii|
- b_line = matches[ii]
-
- ax = string ? seq1[ii, 1] : seq1[ii]
- bx = string ? seq2[bj, 1] : seq2[bj]
+ # #traverse_sequences is the most general facility provided by this
+ # module; #diff and #lcs are implemented as calls to it.
+ #
+ # The arguments to #traverse_sequences are the two sequences to traverse,
+ # and a callback object, like this:
+ #
+ # traverse_sequences(seq1, seq2, Diff::LCS::ContextDiffCallbacks.new)
+ #
+ # == Callback Methods
+ #
+ # Optional callback methods are <em>emphasized</em>.
+ #
+ # callbacks#match:: Called when +a+ and +b+ are pointing to
+ # common elements in +A+ and +B+.
+ # callbacks#discard_a:: Called when +a+ is pointing to an
+ # element not in +B+.
+ # callbacks#discard_b:: Called when +b+ is pointing to an
+ # element not in +A+.
+ # <em>callbacks#finished_a</em>:: Called when +a+ has reached the end of
+ # sequence +A+.
+ # <em>callbacks#finished_b</em>:: Called when +b+ has reached the end of
+ # sequence +B+.
+ #
+ # == Algorithm
+ #
+ # a---+
+ # v
+ # A = a b c e h j l m n p
+ # B = b c d e f j k l m r s t
+ # ^
+ # b---+
+ #
+ # If there are two arrows (+a+ and +b+) pointing to elements of sequences
+ # +A+ and +B+, the arrows will initially point to the first elements of
+ # their respective sequences. #traverse_sequences will advance the arrows
+ # through the sequences one element at a time, calling a method on the
+ # user-specified callback object before each advance. It will advance the
+ # arrows in such a way that if there are elements <tt>A[ii]</tt> and
+ # <tt>B[jj]</tt> which are both equal and part of the longest common
+ # subsequence, there will be some moment during the execution of
+ # #traverse_sequences when arrow +a+ is pointing to <tt>A[ii]</tt> and
+ # arrow +b+ is pointing to <tt>B[jj]</tt>. When this happens,
+ # #traverse_sequences will call <tt>callbacks#match</tt> and then it will
+ # advance both arrows.
+ #
+ # Otherwise, one of the arrows is pointing to an element of its sequence
+ # that is not part of the longest common subsequence. #traverse_sequences
+ # will advance that arrow and will call <tt>callbacks#discard_a</tt> or
+ # <tt>callbacks#discard_b</tt>, depending on which arrow it advanced. If
+ # both arrows point to elements that are not part of the longest common
+ # subsequence, then #traverse_sequences will advance one of them and call
+ # the appropriate callback, but it is not specified which it will call.
+ #
+ # The methods for <tt>callbacks#match</tt>, <tt>callbacks#discard_a</tt>,
+ # and <tt>callbacks#discard_b</tt> are invoked with an event comprising
+ # the action ("=", "+", or "-", respectively), the indices +ii+ and +jj+,
+ # and the elements <tt>A[ii]</tt> and <tt>B[jj]</tt>. Return values are
+ # discarded by #traverse_sequences.
+ #
+ # === End of Sequences
+ #
+ # If arrow +a+ reaches the end of its sequence before arrow +b+ does,
+ # #traverse_sequences will try to call <tt>callbacks#finished_a</tt> with
+ # the last index and element of +A+ (<tt>A[-1]</tt>) and the current index
+ # and element of +B+ (<tt>B[jj]</tt>). If <tt>callbacks#finished_a</tt>
+ # does not exist, then <tt>callbacks#discard_b</tt> will be called on each
+ # element of +B+ until the end of the sequence is reached (the call will
+ # be done with <tt>A[-1]</tt> and <tt>B[jj]</tt> for each element).
+ #
+ # If +b+ reaches the end of +B+ before +a+ reaches the end of +A+,
+ # <tt>callbacks#finished_b</tt> will be called with the current index and
+ # element of +A+ (<tt>A[ii]</tt>) and the last index and element of +B+
+ # (<tt>B[-1]</tt>). Again, if <tt>callbacks#finished_b</tt> does not exist
+ # on the callback object, then <tt>callbacks#discard_a</tt> will be called
+ # on each element of +A+ until the end of the sequence is reached
+ # (<tt>A[ii]</tt> and <tt>B[-1]</tt>).
+ #
+ # There is a chance that one additional <tt>callbacks#discard_a</tt> or
+ # <tt>callbacks#discard_b</tt> will be called after the end of the
+ # sequence is reached, if +a+ has not yet reached the end of +A+ or +b+
+ # has not yet reached the end of +B+.
+ def traverse_sequences(seq1, seq2, callbacks = Diff::LCS::SequenceCallbacks, &block) #:yields change events:
+ matches = Diff::LCS::Internals.lcs(seq1, seq2)
- if b_line.nil?
- unless ax.nil? or (string and ax.empty?)
- event = Diff::LCS::ContextChange.new('-', ii, ax, bj, bx)
- event = yield event if block_given?
- callbacks.discard_a(event)
- end
- else
- loop do
- break unless bj < b_line
- bx = string ? seq2[bj, 1] : seq2[bj]
- event = Diff::LCS::ContextChange.new('+', ii, ax, bj, bx)
- event = yield event if block_given?
- callbacks.discard_b(event)
- bj += 1
- end
- bx = string ? seq2[bj, 1] : seq2[bj]
- event = Diff::LCS::ContextChange.new('=', ii, ax, bj, bx)
- event = yield event if block_given?
- callbacks.match(event)
- bj += 1
- end
- ai = ii
- end
- ai += 1
+ run_finished_a = run_finished_b = false
+ string = seq1.kind_of?(String)
- # The last entry (if any) processed was a match. +ai+ and +bj+ point
- # just past the last matching lines in their sequences.
- while (ai < a_size) or (bj < b_size)
- # last A?
- if ai == a_size and bj < b_size
- if callbacks.respond_to?(:finished_a) and not run_finished_a
- ax = string ? seq1[-1, 1] : seq1[-1]
- bx = string ? seq2[bj, 1] : seq2[bj]
- event = Diff::LCS::ContextChange.new('>', (a_size - 1), ax, bj, bx)
- event = yield event if block_given?
- callbacks.finished_a(event)
- run_finished_a = true
- else
- ax = string ? seq1[ai, 1] : seq1[ai]
- loop do
- bx = string ? seq2[bj, 1] : seq2[bj]
- event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
- event = yield event if block_given?
- callbacks.discard_b(event)
- bj += 1
- break unless bj < b_size
- end
- end
- end
+ a_size = seq1.size
+ b_size = seq2.size
+ ai = bj = 0
- # last B?
- if bj == b_size and ai < a_size
- if callbacks.respond_to?(:finished_b) and not run_finished_b
- ax = string ? seq1[ai, 1] : seq1[ai]
- bx = string ? seq2[-1, 1] : seq2[-1]
- event = Diff::LCS::ContextChange.new('<', ai, ax, (b_size - 1), bx)
- event = yield event if block_given?
- callbacks.finished_b(event)
- run_finished_b = true
- else
- bx = string ? seq2[bj, 1] : seq2[bj]
- loop do
- ax = string ? seq1[ai, 1] : seq1[ai]
- event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
- event = yield event if block_given?
- callbacks.discard_a(event)
- ai += 1
- break unless bj < b_size
- end
- end
- end
+ (0 .. matches.size).each do |ii|
+ b_line = matches[ii]
- if ai < a_size
- ax = string ? seq1[ai, 1] : seq1[ai]
- bx = string ? seq2[bj, 1] : seq2[bj]
- event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
+ ax = string ? seq1[ii, 1] : seq1[ii]
+ bx = string ? seq2[bj, 1] : seq2[bj]
+
+ if b_line.nil?
+ unless ax.nil? or (string and ax.empty?)
+ event = Diff::LCS::ContextChange.new('-', ii, ax, bj, bx)
event = yield event if block_given?
callbacks.discard_a(event)
- ai += 1
end
-
- if bj < b_size
- ax = string ? seq1[ai, 1] : seq1[ai]
+ else
+ loop do
+ break unless bj < b_line
bx = string ? seq2[bj, 1] : seq2[bj]
- event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
+ event = Diff::LCS::ContextChange.new('+', ii, ax, bj, bx)
event = yield event if block_given?
callbacks.discard_b(event)
bj += 1
end
+ bx = string ? seq2[bj, 1] : seq2[bj]
+ event = Diff::LCS::ContextChange.new('=', ii, ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.match(event)
+ bj += 1
end
+ ai = ii
end
-
- # #traverse_balanced is an alternative to #traverse_sequences. It
- # uses a different algorithm to iterate through the entries in the
- # computed longest common subsequence. Instead of viewing the changes as
- # insertions or deletions from one of the sequences, #traverse_balanced
- # will report <em>changes</em> between the sequences. To represent a
- #
- # The arguments to #traverse_balanced are the two sequences to traverse
- # and a callback object, like this:
- #
- # traverse_balanced(seq1, seq2, Diff::LCS::ContextDiffCallbacks.new)
- #
- # #sdiff is implemented with #traverse_balanced.
- #
- # == Callback Methods
- # Optional callback methods are <em>emphasized</em>.
- #
- # callbacks#match:: Called when +a+ and +b+ are pointing
- # to common elements in +A+ and +B+.
- # callbacks#discard_a:: Called when +a+ is pointing to an
- # element not in +B+.
- # callbacks#discard_b:: Called when +b+ is pointing to an
- # element not in +A+.
- # <em>callbacks#change</em>:: Called when +a+ and +b+ are pointing
- # to the same relative position, but
- # <tt>A[a]</tt> and <tt>B[b]</tt> are
- # not the same; a <em>change</em> has
- # occurred.
- #
- # #traverse_balanced might be a bit slower than #traverse_sequences,
- # noticable only while processing huge amounts of data.
- #
- # The +sdiff+ function of this module is implemented as call to
- # #traverse_balanced.
- #
- # == Algorithm
- # a---+
- # v
- # A = a b c e h j l m n p
- # B = b c d e f j k l m r s t
- # ^
- # b---+
- #
- # === Matches
- # If there are two arrows (+a+ and +b+) pointing to elements of
- # sequences +A+ and +B+, the arrows will initially point to the first
- # elements of their respective sequences. #traverse_sequences will
- # advance the arrows through the sequences one element at a time,
- # calling a method on the user-specified callback object before each
- # advance. It will advance the arrows in such a way that if there are
- # elements <tt>A[ii]</tt> and <tt>B[jj]</tt> which are both equal and
- # part of the longest common subsequence, there will be some moment
- # during the execution of #traverse_sequences when arrow +a+ is pointing
- # to <tt>A[ii]</tt> and arrow +b+ is pointing to <tt>B[jj]</tt>. When
- # this happens, #traverse_sequences will call <tt>callbacks#match</tt>
- # and then it will advance both arrows.
- #
- # === Discards
- # Otherwise, one of the arrows is pointing to an element of its sequence
- # that is not part of the longest common subsequence.
- # #traverse_sequences will advance that arrow and will call
- # <tt>callbacks#discard_a</tt> or <tt>callbacks#discard_b</tt>,
- # depending on which arrow it advanced.
- #
- # === Changes
- # If both +a+ and +b+ point to elements that are not part of the longest
- # common subsequence, then #traverse_sequences will try to call
- # <tt>callbacks#change</tt> and advance both arrows. If
- # <tt>callbacks#change</tt> is not implemented, then
- # <tt>callbacks#discard_a</tt> and <tt>callbacks#discard_b</tt> will be
- # called in turn.
- #
- # The methods for <tt>callbacks#match</tt>, <tt>callbacks#discard_a</tt>,
- # <tt>callbacks#discard_b</tt>, and <tt>callbacks#change</tt> are
- # invoked with an event comprising the action ("=", "+", "-", or "!",
- # respectively), the indicies +ii+ and +jj+, and the elements
- # <tt>A[ii]</tt> and <tt>B[jj]</tt>. Return values are discarded by
- # #traverse_balanced.
- #
- # === Context
- # Note that +ii+ and +jj+ may not be the same index position, even if
- # +a+ and +b+ are considered to be pointing to matching or changed
- # elements.
- def traverse_balanced(seq1, seq2, callbacks = Diff::LCS::BalancedCallbacks)
- matches = Diff::LCS::Internals.lcs(seq1, seq2)
- a_size = seq1.size
- b_size = seq2.size
- ai = bj = mb = 0
- ma = -1
- string = seq1.kind_of?(String)
-
- # Process all the lines in the match vector.
- loop do
- # Find next match indices +ma+ and +mb+
- loop do
- ma += 1
- break unless ma < matches.size and matches[ma].nil?
+ ai += 1
+
+ # The last entry (if any) processed was a match. +ai+ and +bj+ point
+ # just past the last matching lines in their sequences.
+ while (ai < a_size) or (bj < b_size)
+ # last A?
+ if ai == a_size and bj < b_size
+ if callbacks.respond_to?(:finished_a) and not run_finished_a
+ ax = string ? seq1[-1, 1] : seq1[-1]
+ bx = string ? seq2[bj, 1] : seq2[bj]
+ event = Diff::LCS::ContextChange.new('>', (a_size - 1), ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.finished_a(event)
+ run_finished_a = true
+ else
+ ax = string ? seq1[ai, 1] : seq1[ai]
+ loop do
+ bx = string ? seq2[bj, 1] : seq2[bj]
+ event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.discard_b(event)
+ bj += 1
+ break unless bj < b_size
+ end
end
+ end
- break if ma >= matches.size # end of matches?
- mb = matches[ma]
-
- # Change(seq2)
- while (ai < ma) or (bj < mb)
+ # last B?
+ if bj == b_size and ai < a_size
+ if callbacks.respond_to?(:finished_b) and not run_finished_b
ax = string ? seq1[ai, 1] : seq1[ai]
+ bx = string ? seq2[-1, 1] : seq2[-1]
+ event = Diff::LCS::ContextChange.new('<', ai, ax, (b_size - 1), bx)
+ event = yield event if block_given?
+ callbacks.finished_b(event)
+ run_finished_b = true
+ else
bx = string ? seq2[bj, 1] : seq2[bj]
-
- case [(ai < ma), (bj < mb)]
- when [true, true]
- if callbacks.respond_to?(:change)
- event = Diff::LCS::ContextChange.new('!', ai, ax, bj, bx)
- event = yield event if block_given?
- callbacks.change(event)
- ai += 1
- bj += 1
- else
- event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
- event = yield event if block_given?
- callbacks.discard_a(event)
- ai += 1
- ax = string ? seq1[ai, 1] : seq1[ai]
- event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
- event = yield event if block_given?
- callbacks.discard_b(event)
- bj += 1
- end
- when [true, false]
+ loop do
+ ax = string ? seq1[ai, 1] : seq1[ai]
event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
event = yield event if block_given?
callbacks.discard_a(event)
ai += 1
- when [false, true]
- event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
- event = yield event if block_given?
- callbacks.discard_b(event)
- bj += 1
+ break unless bj < b_size
end
end
+ end
- # Match
+ if ai < a_size
ax = string ? seq1[ai, 1] : seq1[ai]
bx = string ? seq2[bj, 1] : seq2[bj]
- event = Diff::LCS::ContextChange.new('=', ai, ax, bj, bx)
+ event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
event = yield event if block_given?
- callbacks.match(event)
+ callbacks.discard_a(event)
ai += 1
+ end
+
+ if bj < b_size
+ ax = string ? seq1[ai, 1] : seq1[ai]
+ bx = string ? seq2[bj, 1] : seq2[bj]
+ event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.discard_b(event)
bj += 1
end
+ end
+ end
- while (ai < a_size) or (bj < b_size)
+ # #traverse_balanced is an alternative to #traverse_sequences. It uses a
+ # different algorithm to iterate through the entries in the computed
+ # longest common subsequence. Instead of viewing the changes as insertions
+ # or deletions from one of the sequences, #traverse_balanced will report
+ # <em>changes</em> between the sequences.
+ #
+ # The arguments to #traverse_balanced are the two sequences to traverse
+ # and a callback object, like this:
+ #
+ # traverse_balanced(seq1, seq2, Diff::LCS::ContextDiffCallbacks.new)
+ #
+ # #sdiff is implemented with #traverse_balanced.
+ #
+ # == Callback Methods
+ #
+ # Optional callback methods are <em>emphasized</em>.
+ #
+ # callbacks#match:: Called when +a+ and +b+ are pointing to
+ # common elements in +A+ and +B+.
+ # callbacks#discard_a:: Called when +a+ is pointing to an
+ # element not in +B+.
+ # callbacks#discard_b:: Called when +b+ is pointing to an
+ # element not in +A+.
+ # <em>callbacks#change</em>:: Called when +a+ and +b+ are pointing to
+ # the same relative position, but
+ # <tt>A[a]</tt> and <tt>B[b]</tt> are not
+ # the same; a <em>change</em> has
+ # occurred.
+ #
+ # #traverse_balanced might be a bit slower than #traverse_sequences, but
+ # the difference is noticeable only when processing huge amounts of data.
+ #
+ # == Algorithm
+ #
+ # a---+
+ # v
+ # A = a b c e h j l m n p
+ # B = b c d e f j k l m r s t
+ # ^
+ # b---+
+ #
+ # === Matches
+ #
+ # If there are two arrows (+a+ and +b+) pointing to elements of sequences
+ # +A+ and +B+, the arrows will initially point to the first elements of
+ # their respective sequences. #traverse_balanced will advance the arrows
+ # through the sequences one element at a time, calling a method on the
+ # user-specified callback object before each advance. It will advance the
+ # arrows in such a way that if there are elements <tt>A[ii]</tt> and
+ # <tt>B[jj]</tt> which are both equal and part of the longest common
+ # subsequence, there will be some moment during the execution of
+ # #traverse_balanced when arrow +a+ is pointing to <tt>A[ii]</tt> and
+ # arrow +b+ is pointing to <tt>B[jj]</tt>. When this happens,
+ # #traverse_balanced will call <tt>callbacks#match</tt> and then it will
+ # advance both arrows.
+ #
+ # === Discards
+ #
+ # Otherwise, one of the arrows is pointing to an element of its sequence
+ # that is not part of the longest common subsequence. #traverse_balanced
+ # will advance that arrow and will call <tt>callbacks#discard_a</tt> or
+ # <tt>callbacks#discard_b</tt>, depending on which arrow it advanced.
+ #
+ # === Changes
+ #
+ # If both +a+ and +b+ point to elements that are not part of the longest
+ # common subsequence, then #traverse_balanced will try to call
+ # <tt>callbacks#change</tt> and advance both arrows. If
+ # <tt>callbacks#change</tt> is not implemented, then
+ # <tt>callbacks#discard_a</tt> and <tt>callbacks#discard_b</tt> will be
+ # called in turn.
+ #
+ # The methods for <tt>callbacks#match</tt>, <tt>callbacks#discard_a</tt>,
+ # <tt>callbacks#discard_b</tt>, and <tt>callbacks#change</tt> are invoked
+ # with an event comprising the action ("=", "-", "+", or "!",
+ # respectively), the indices +ii+ and +jj+, and the elements
+ # <tt>A[ii]</tt> and <tt>B[jj]</tt>. Return values are discarded by
+ # #traverse_balanced.
+ #
+ # === Context
+ # Note that +ii+ and +jj+ may not be the same index position, even if +a+
+ # and +b+ are considered to be pointing to matching or changed elements.
+ def traverse_balanced(seq1, seq2, callbacks = Diff::LCS::BalancedCallbacks)
+ matches = Diff::LCS::Internals.lcs(seq1, seq2)
+ a_size = seq1.size
+ b_size = seq2.size
+ ai = bj = mb = 0
+ ma = -1
+ string = seq1.kind_of?(String)
+
+ # Process all the lines in the match vector.
+ loop do
+ # Find next match indices +ma+ and +mb+
+ loop do
+ ma += 1
+ break unless ma < matches.size and matches[ma].nil?
+ end
+
+ break if ma >= matches.size # end of matches?
+ mb = matches[ma]
+
+ # Change(seq2)
+ while (ai < ma) or (bj < mb)
ax = string ? seq1[ai, 1] : seq1[ai]
bx = string ? seq2[bj, 1] : seq2[bj]
- case [(ai < a_size), (bj < b_size)]
+ case [(ai < ma), (bj < mb)]
when [true, true]
if callbacks.respond_to?(:change)
event = Diff::LCS::ContextChange.new('!', ai, ax, bj, bx)
@@ -670,169 +608,212 @@ def traverse_balanced(seq1, seq2, callbacks = Diff::LCS::BalancedCallbacks)
bj += 1
end
end
+
+ # Match
+ ax = string ? seq1[ai, 1] : seq1[ai]
+ bx = string ? seq2[bj, 1] : seq2[bj]
+ event = Diff::LCS::ContextChange.new('=', ai, ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.match(event)
+ ai += 1
+ bj += 1
end
- PATCH_MAP = { #:nodoc:
- :patch => { '+' => '+', '-' => '-', '!' => '!', '=' => '=' },
- :unpatch => { '+' => '-', '-' => '+', '!' => '!', '=' => '=' }
- }
-
- # Applies a +patchset+ to the sequence +src+ according to the
- # +direction+ (<tt>:patch</tt> or <tt>:unpatch</tt>).
- #
- # If the +direction+ is not specified, Diff::LCS::patch will attempt to
- # discover the direction of the +patchset+.
- #
- # A +patchset+ can be considered to apply forward (<tt>:patch</tt>) if
- # the following expression is true:
- #
- # patch(s1, diff(s1, s2)) -> s2
- #
- # A +patchset+ can be considered to apply backward (<tt>:unpatch</tt>)
- # if the following expression is true:
- #
- # patch(s2, diff(s1, s2)) -> s1
- #
- # If the +patchset+ contains no changes, the +src+ value will be
- # returned as either <tt>src.dup</tt> or +src+. A +patchset+ can be
- # deemed as having no changes if the following predicate returns true:
- #
- # patchset.empty? or
- # patchset.flatten.all? { |change| change.unchanged? }
- #
- # === Patchsets
- # A +patchset+ is always an enumerable sequence of changes, hunks of
- # changes, or a mix of the two. A hunk of changes is an enumerable
- # sequence of changes:
- #
- # [ # patchset
- # # change
- # [ # hunk
- # # change
- # ]
- # ]
- #
- # The +patch+ method accepts <tt>patchset</tt>s that are enumerable
- # sequences containing either Diff::LCS::Change objects (or a subclass)
- # or the array representations of those objects. Prior to application,
- # array representations of Diff::LCS::Change objects will be reified.
- def patch(src, patchset, direction = nil)
- # Normalize the patchset.
- has_changes, patchset = Diff::LCS::Internals.analyze_patchset(patchset)
-
- if not has_changes
- return src.dup if src.respond_to? :dup
- return src
+ while (ai < a_size) or (bj < b_size)
+ ax = string ? seq1[ai, 1] : seq1[ai]
+ bx = string ? seq2[bj, 1] : seq2[bj]
+
+ case [(ai < a_size), (bj < b_size)]
+ when [true, true]
+ if callbacks.respond_to?(:change)
+ event = Diff::LCS::ContextChange.new('!', ai, ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.change(event)
+ ai += 1
+ bj += 1
+ else
+ event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.discard_a(event)
+ ai += 1
+ ax = string ? seq1[ai, 1] : seq1[ai]
+ event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.discard_b(event)
+ bj += 1
+ end
+ when [true, false]
+ event = Diff::LCS::ContextChange.new('-', ai, ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.discard_a(event)
+ ai += 1
+ when [false, true]
+ event = Diff::LCS::ContextChange.new('+', ai, ax, bj, bx)
+ event = yield event if block_given?
+ callbacks.discard_b(event)
+ bj += 1
end
+ end
+ end
- string = src.kind_of?(String)
- # Start with a new empty type of the source's class
- res = src.class.new
+ PATCH_MAP = { #:nodoc:
+ :patch => { '+' => '+', '-' => '-', '!' => '!', '=' => '=' },
+ :unpatch => { '+' => '-', '-' => '+', '!' => '!', '=' => '=' }
+ }
- direction ||= Diff::LCS::Internals.diff_direction(src, patchset)
+ # Applies a +patchset+ to the sequence +src+ according to the +direction+
+ # (<tt>:patch</tt> or <tt>:unpatch</tt>).
+ #
+ # If the +direction+ is not specified, Diff::LCS::patch will attempt to
+ # discover the direction of the +patchset+.
+ #
+ # A +patchset+ can be considered to apply forward (<tt>:patch</tt>) if the
+ # following expression is true:
+ #
+ # patch(s1, diff(s1, s2)) -> s2
+ #
+ # A +patchset+ can be considered to apply backward (<tt>:unpatch</tt>) if
+ # the following expression is true:
+ #
+ # patch(s2, diff(s1, s2)) -> s1
+ #
+ # If the +patchset+ contains no changes, the +src+ value will be returned
+ # as either <tt>src.dup</tt> or +src+. A +patchset+ can be deemed as
+ # having no changes if the following predicate returns true:
+ #
+ # patchset.empty? or
+ # patchset.flatten.all? { |change| change.unchanged? }
+ #
+ # === Patchsets
+ # A +patchset+ is always an enumerable sequence of changes, hunks of
+ # changes, or a mix of the two. A hunk of changes is an enumerable
+ # sequence of changes:
+ #
+ # [ # patchset
+ # # change
+ # [ # hunk
+ # # change
+ # ]
+ # ]
+ #
+ # The +patch+ method accepts <tt>patchset</tt>s that are enumerable
+ # sequences containing either Diff::LCS::Change objects (or a subclass) or
+ # the array representations of those objects. Prior to application, array
+ # representations of Diff::LCS::Change objects will be reified.
+ def patch(src, patchset, direction = nil)
+ # Normalize the patchset.
+ has_changes, patchset = Diff::LCS::Internals.analyze_patchset(patchset)
+
+ if not has_changes
+ return src.dup if src.respond_to? :dup
+ return src
+ end
- ai = bj = 0
+ string = src.kind_of?(String)
+ # Start with a new empty type of the source's class
+ res = src.class.new
- patch_map = PATCH_MAP[direction]
+ direction ||= Diff::LCS::Internals.diff_direction(src, patchset)
- patchset.flatten.each do |change|
- # Both Change and ContextChange support #action
- action = patch_map[change.action]
+ ai = bj = 0
- case change
- when Diff::LCS::ContextChange
- case direction
- when :patch
- el = change.new_element
- op = change.old_position
- np = change.new_position
- when :unpatch
- el = change.old_element
- op = change.new_position
- np = change.old_position
- end
+ patch_map = PATCH_MAP[direction]
- case action
- when '-' # Remove details from the old string
- while ai < op
- res << (string ? src[ai, 1] : src[ai])
- ai += 1
- bj += 1
- end
+ patchset.flatten.each do |change|
+ # Both Change and ContextChange support #action
+ action = patch_map[change.action]
+
+ case change
+ when Diff::LCS::ContextChange
+ case direction
+ when :patch
+ el = change.new_element
+ op = change.old_position
+ np = change.new_position
+ when :unpatch
+ el = change.old_element
+ op = change.new_position
+ np = change.old_position
+ end
+
+ case action
+ when '-' # Remove details from the old string
+ while ai < op
+ res << (string ? src[ai, 1] : src[ai])
+ ai += 1
+ bj += 1
+ end
+ ai += 1
+ when '+'
+ while bj < np
+ res << (string ? src[ai, 1] : src[ai])
ai += 1
- when '+'
- while bj < np
- res << (string ? src[ai, 1] : src[ai])
- ai += 1
- bj += 1
- end
+ bj += 1
+ end
+ res << el
+ bj += 1
+ when '='
+ # This only appears in sdiff output with the SDiff callback.
+ # Therefore, we only need to worry about dealing with a single
+ # element.
res << el
- bj += 1
- when '='
- # This only appears in sdiff output with the SDiff callback.
- # Therefore, we only need to worry about dealing with a single
- # element.
- res << el
+ ai += 1
+ bj += 1
+ when '!'
+ while ai < op
+ res << (string ? src[ai, 1] : src[ai])
ai += 1
bj += 1
- when '!'
- while ai < op
- res << (string ? src[ai, 1] : src[ai])
- ai += 1
- bj += 1
- end
+ end
- bj += 1
- ai += 1
+ bj += 1
+ ai += 1
- res << el
+ res << el
+ end
+ when Diff::LCS::Change
+ case action
+ when '-'
+ while ai < change.position
+ res << (string ? src[ai, 1] : src[ai])
+ ai += 1
+ bj += 1
end
- when Diff::LCS::Change
- case action
- when '-'
- while ai < change.position
- res << (string ? src[ai, 1] : src[ai])
- ai += 1
- bj += 1
- end
+ ai += 1
+ when '+'
+ while bj < change.position
+ res << (string ? src[ai, 1] : src[ai])
ai += 1
- when '+'
- while bj < change.position
- res << (string ? src[ai, 1] : src[ai])
- ai += 1
- bj += 1
- end
-
bj += 1
-
- res << change.element
end
- end
- end
- while ai < src.size
- res << (string ? src[ai, 1] : src[ai])
- ai += 1
bj += 1
- end
- res
+ res << change.element
+ end
+ end
end
- # Given a patchset, convert the current version to the prior
- # version. Does no auto-discovery.
- def unpatch!(src, patchset)
- Diff::LCS.patch(src, patchset, :unpatch)
+ while ai < src.size
+ res << (string ? src[ai, 1] : src[ai])
+ ai += 1
+ bj += 1
end
- # Given a patchset, convert the current version to the next
- # version. Does no auto-discovery.
- def patch!(src, patchset)
- Diff::LCS.patch(src, patchset, :patch)
- end
+ res
+ end
+
+ # Given a patchset, convert the current version to the prior
+ # version. Does no auto-discovery.
+ def unpatch!(src, patchset)
+ patch(src, patchset, :unpatch)
end
-end
-# vim: ft=ruby
+ # Given a patchset, convert the current version to the next
+ # version. Does no auto-discovery.
+ def patch!(src, patchset)
+ patch(src, patchset, :patch)
+ end
+end
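
The direction handling in +patch+ can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: +SimpleChange+, +SKETCH_PATCH_MAP+, and +apply_changes+ are hypothetical names introduced here, the sketch handles only flat Arrays of simple changes (no ContextChange objects, no Strings, no direction auto-discovery), and it assumes the changes are already ordered as in the DiffCallbacks example.

```ruby
# Hypothetical stand-in for Diff::LCS::Change; not the library's class.
SimpleChange = Struct.new(:action, :position, :element)

# Same idea as PATCH_MAP in the diff above: unpatching swaps '+' and '-'.
SKETCH_PATCH_MAP = {
  :patch   => { '+' => '+', '-' => '-' },
  :unpatch => { '+' => '-', '-' => '+' }
}

# Applies a flat list of SimpleChange objects to an Array (hypothetical helper).
def apply_changes(src, changes, direction)
  res = []
  ai = bj = 0 # ai indexes src; bj tracks the position in the sequence being built

  changes.each do |change|
    case SKETCH_PATCH_MAP[direction][change.action]
    when '-'
      # Copy untouched elements up to the deletion point, then skip one.
      while ai < change.position
        res << src[ai]
        ai += 1
        bj += 1
      end
      ai += 1
    when '+'
      # Copy untouched elements up to the insertion point, then insert.
      while bj < change.position
        res << src[ai]
        ai += 1
        bj += 1
      end
      bj += 1
      res << change.element
    end
  end

  # Copy whatever remains of the source after the last change.
  while ai < src.size
    res << src[ai]
    ai += 1
  end

  res
end

SEQ1 = %w(a b c e h j l m n p)
SEQ2 = %w(b c d e f j k l m r s t)

# The flattened simple diff shown in the DiffCallbacks documentation.
CHANGES = [
  ['-', 0, 'a'], ['+', 2, 'd'], ['-', 4, 'h'], ['+', 4, 'f'], ['+', 6, 'k'],
  ['-', 8, 'n'], ['-', 9, 'p'], ['+', 9, 'r'], ['+', 10, 's'], ['+', 11, 't']
].map { |action, position, element| SimpleChange.new(action, position, element) }

p apply_changes(SEQ1, CHANGES, :patch) == SEQ2   # prints true
p apply_changes(SEQ2, CHANGES, :unpatch) == SEQ1 # prints true
```

Mapping the action through a lookup table keeps a single code path for both directions; only the meaning of '+' and '-' is swapped when unpatching.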
15 lib/diff/lcs/array.rb
@@ -1,17 +1,4 @@
-#--
-# Copyright 2004 Austin Ziegler <diff-lcs@halostatue.ca>
-# adapted from:
-# Algorithm::Diff (Perl) by Ned Konz <perl@bike-nomad.com>
-# Smalltalk by Mario I. Wolczko <mario@wolczko.com>
-# implements McIlroy-Hunt diff algorithm
-#
-# This program is free software. It may be redistributed and/or modified under
-# the terms of the GPL version 2 (or later), the Perl Artistic licence, or the
-# Ruby licence.
-#
-# $Id$
-#++
-# Includes Diff::LCS into the Array built-in class.
+# -*- ruby encoding: utf-8 -*-
require 'diff/lcs'
21 lib/diff/lcs/block.rb
@@ -1,21 +1,8 @@
-#--
-# Copyright 2004 Austin Ziegler <diff-lcs@halostatue.ca>
-# adapted from:
-# Algorithm::Diff (Perl) by Ned Konz <perl@bike-nomad.com>
-# Smalltalk by Mario I. Wolczko <mario@wolczko.com>
-# implements McIlroy-Hunt diff algorithm
-#
-# This program is free software. It may be redistributed and/or modified under
-# the terms of the GPL version 2 (or later), the Perl Artistic licence, or the
-# Ruby licence.
-#
-# $Id$
-#++
-# Contains Diff::LCS::Block for bin/ldiff.
+# -*- ruby encoding: utf-8 -*-
- # A block is an operation removing, adding, or changing a group of items.
- # Basically, this is just a list of changes, where each change adds or
- # deletes a single item. Used by bin/ldiff.
+# A block is an operation removing, adding, or changing a group of items.
+# Basically, this is just a list of changes, where each change adds or
+# deletes a single item. Used by bin/ldiff.
class Diff::LCS::Block
attr_reader :changes, :insert, :remove
439 lib/diff/lcs/callbacks.rb
@@ -1,43 +1,31 @@
-#--
-# Copyright 2004 Austin Ziegler <diff-lcs@halostatue.ca>
-# adapted from:
-# Algorithm::Diff (Perl) by Ned Konz <perl@bike-nomad.com>
-# Smalltalk by Mario I. Wolczko <mario@wolczko.com>
-# implements McIlroy-Hunt diff algorithm
-#
-# This program is free software. It may be redistributed and/or modified under
-# the terms of the GPL version 2 (or later), the Perl Artistic licence, or the
-# Ruby licence.
-#
-# $Id$
-#++
-# Contains definitions for all default callback objects.
+# -*- ruby encoding: utf-8 -*-
require 'diff/lcs/change'
module Diff::LCS
- # This callback object implements the default set of callback events, which
- # only returns the event itself. Note that #finished_a and #finished_b are
- # not implemented -- I haven't yet figured out where they would be useful.
- #
- # Note that this is intended to be called as is, e.g.,
- #
- # Diff::LCS.LCS(seq1, seq2, Diff::LCS::DefaultCallbacks)
+ # This callback object implements the default set of callback events,
+ # which only returns the event itself. Note that #finished_a and
+ # #finished_b are not implemented -- I haven't yet figured out where they
+ # would be useful.
+ #
+ # Note that this is intended to be called as is, e.g.,
+ #
+ # Diff::LCS.LCS(seq1, seq2, Diff::LCS::DefaultCallbacks)
class DefaultCallbacks
class << self
- # Called when two items match.
+ # Called when two items match.
def match(event)
event
end
- # Called when the old value is discarded in favour of the new value.
+ # Called when the old value is discarded in favour of the new value.
def discard_a(event)
event
end
- # Called when the new value is discarded in favour of the old value.
+ # Called when the new value is discarded in favour of the old value.
def discard_b(event)
event
end
- # Called when both the old and new values have changed.
+ # Called when both the old and new values have changed.
def change(event)
event
end
@@ -46,65 +34,70 @@ def change(event)
end
end
- # An alias for DefaultCallbacks that is used in Diff::LCS#traverse_sequences.
- #
- # Diff::LCS.LCS(seq1, seq2, Diff::LCS::SequenceCallbacks)
+ # An alias for DefaultCallbacks that is used in
+ # Diff::LCS#traverse_sequences.
+ #
+ # Diff::LCS.LCS(seq1, seq2, Diff::LCS::SequenceCallbacks)
SequenceCallbacks = DefaultCallbacks
- # An alias for DefaultCallbacks that is used in Diff::LCS#traverse_balanced.
- #
- # Diff::LCS.LCS(seq1, seq2, Diff::LCS::BalancedCallbacks)
+
+ # An alias for DefaultCallbacks that is used in
+ # Diff::LCS#traverse_balanced.
+ #
+ # Diff::LCS.LCS(seq1, seq2, Diff::LCS::BalancedCallbacks)
BalancedCallbacks = DefaultCallbacks
end
- # This will produce a compound array of simple diff change objects. Each
- # element in the #diffs array is a +hunk+ or +hunk+ array, where each
- # element in each +hunk+ array is a single Change object representing the
- # addition or removal of a single element from one of the two tested
- # sequences. The +hunk+ provides the full context for the changes.
- #
- # diffs = Diff::LCS.diff(seq1, seq2)
- # # This example shows a simplified array format.
- # # [ [ [ '-', 0, 'a' ] ], # 1
- # # [ [ '+', 2, 'd' ] ], # 2
- # # [ [ '-', 4, 'h' ], # 3
- # # [ '+', 4, 'f' ] ],
- # # [ [ '+', 6, 'k' ] ], # 4
- # # [ [ '-', 8, 'n' ], # 5
- # # [ '-', 9, 'p' ],
- # # [ '+', 9, 'r' ],
- # # [ '+', 10, 's' ],
- # # [ '+', 11, 't' ] ] ]
- #
- # There are five hunks here. The first hunk says that the +a+ at position 0
- # of the first sequence should be deleted (<tt>'-'</tt>). The second hunk
- # says that the +d+ at position 2 of the second sequence should be inserted
- # (<tt>'+'</tt>). The third hunk says that the +h+ at position 4 of the
- # first sequence should be removed and replaced with the +f+ from position 4
- # of the second sequence. The other two hunks are described similarly.
- #
- # === Use
- # This callback object must be initialised and is used by the Diff::LCS#diff
- # method.
- #
- # cbo = Diff::LCS::DiffCallbacks.new
- # Diff::LCS.LCS(seq1, seq2, cbo)
- # cbo.finish
- #
- # Note that the call to #finish is absolutely necessary, or the last set of
- # changes will not be visible. Alternatively, it can be used as:
- #
- # cbo = Diff::LCS::DiffCallbacks.new { |tcbo| Diff::LCS.LCS(seq1, seq2, tcbo) }
- #
- # The necessary #finish call will be made.
- #
- # === Simplified Array Format
- # The simplified array format used in the example above can be obtained
- # with:
- #
- # require 'pp'
- # pp diffs.map { |e| e.map { |f| f.to_a } }
+# This will produce a compound array of simple diff change objects. Each
+# element in the #diffs array is a +hunk+ or +hunk+ array, where each
+# element in each +hunk+ array is a single Change object representing the
+# addition or removal of a single element from one of the two tested
+# sequences. The +hunk+ provides the full context for the changes.
+#
+# diffs = Diff::LCS.diff(seq1, seq2)
+# # This example shows a simplified array format.
+# # [ [ [ '-', 0, 'a' ] ], # 1
+# # [ [ '+', 2, 'd' ] ], # 2
+# # [ [ '-', 4, 'h' ], # 3
+# # [ '+', 4, 'f' ] ],
+# # [ [ '+', 6, 'k' ] ], # 4
+# # [ [ '-', 8, 'n' ], # 5
+# # [ '-', 9, 'p' ],
+# # [ '+', 9, 'r' ],
+# # [ '+', 10, 's' ],
+# # [ '+', 11, 't' ] ] ]
+#
+# There are five hunks here. The first hunk says that the +a+ at position 0
+# of the first sequence should be deleted (<tt>'-'</tt>). The second hunk
+# says that the +d+ at position 2 of the second sequence should be inserted
+# (<tt>'+'</tt>). The third hunk says that the +h+ at position 4 of the
+# first sequence should be removed and replaced with the +f+ from position 4
+# of the second sequence. The other two hunks are described similarly.
+#
+# === Use
+#
+# This callback object must be initialised and is used by the Diff::LCS#diff
+# method.
+#
+# cbo = Diff::LCS::DiffCallbacks.new
+# Diff::LCS.LCS(seq1, seq2, cbo)
+# cbo.finish
+#
+# Note that the call to #finish is absolutely necessary, or the last set of
+# changes will not be visible. Alternatively, it can be used as:
+#
+# cbo = Diff::LCS::DiffCallbacks.new { |tcbo| Diff::LCS.LCS(seq1, seq2, tcbo) }
+#
+# The necessary #finish call will be made.
+#
+# === Simplified Array Format
+#
+# The simplified array format used in the example above can be obtained
+# with:
+#
+# require 'pp'
+# pp diffs.map { |e| e.map { |f| f.to_a } }
class Diff::LCS::DiffCallbacks
- # Returns the difference set collected during the diff process.
+ # Returns the difference set collected during the diff process.
attr_reader :diffs
def initialize # :yields self:
@@ -120,14 +113,14 @@ def initialize # :yields self:
end
end
- # Finalizes the diff process. If an unprocessed hunk still exists, then it
- # is appended to the diff list.
+ # Finalizes the diff process. If an unprocessed hunk still exists, then it
+ # is appended to the diff list.
def finish
- add_nonempty_hunk
+ finish_hunk
end
def match(event)
- add_nonempty_hunk
+ finish_hunk
end
def discard_a(event)
@@ -138,86 +131,88 @@ def discard_b(event)
@hunk << Diff::LCS::Change.new('+', event.new_position, event.new_element)
end
-private
- def add_nonempty_hunk
+ def finish_hunk
@diffs << @hunk unless @hunk.empty?
@hunk = []
end
+ private :finish_hunk
end
- # This will produce a compound array of contextual diff change objects. Each
- # element in the #diffs array is a "hunk" array, where each element in each
- # "hunk" array is a single change. Each change is a Diff::LCS::ContextChange
- # that contains both the old index and new index values for the change. The
- # "hunk" provides the full context for the changes. Both old and new objects
- # will be presented for changed objects. +nil+ will be substituted for a
- # discarded object.
- #
- # seq1 = %w(a b c e h j l m n p)
- # seq2 = %w(b c d e f j k l m r s t)
- #
- # diffs = Diff::LCS.diff(seq1, seq2, Diff::LCS::ContextDiffCallbacks)
- # # This example shows a simplified array format.
- # # [ [ [ '-', [ 0, 'a' ], [ 0, nil ] ] ], # 1
- # # [ [ '+', [ 3, nil ], [ 2, 'd' ] ] ], # 2
- # # [ [ '-', [ 4, 'h' ], [ 4, nil ] ], # 3
- # # [ '+', [ 5, nil ], [ 4, 'f' ] ] ],
- # # [ [ '+', [ 6, nil ], [ 6, 'k' ] ] ], # 4
- # # [ [ '-', [ 8, 'n' ], [ 9, nil ] ], # 5
- # # [ '+', [ 9, nil ], [ 9, 'r' ] ],
- # # [ '-', [ 9, 'p' ], [ 10, nil ] ],
- # # [ '+', [ 10, nil ], [ 10, 's' ] ],
- # # [ '+', [ 10, nil ], [ 11, 't' ] ] ] ]
- #
- # The five hunks shown are comprised of individual changes; if there is a
- # related set of changes, they are still shown individually.
- #
- # This callback can also be used with Diff::LCS#sdiff, which will produce
- # results like:
- #
- # diffs = Diff::LCS.sdiff(seq1, seq2, Diff::LCS::ContextDiffCallbacks)
- # # This example shows a simplified array format.
- # # [ [ [ "-", [ 0, "a" ], [ 0, nil ] ] ], # 1
- # # [ [ "+", [ 3, nil ], [ 2, "d" ] ] ], # 2
- # # [ [ "!", [ 4, "h" ], [ 4, "f" ] ] ], # 3
- # # [ [ "+", [ 6, nil ], [ 6, "k" ] ] ], # 4
- # # [ [ "!", [ 8, "n" ], [ 9, "r" ] ], # 5
- # # [ "!", [ 9, "p" ], [ 10, "s" ] ],
- # # [ "+", [ 10, nil ], [ 11, "t" ] ] ] ]
- #
- # The five hunks are still present, but are significantly shorter in total
- # presentation, because changed items are shown as changes ("!") instead of
- # potentially "mismatched" pairs of additions and deletions.
- #
- # The result of this operation is similar to that of
- # Diff::LCS::SDiffCallbacks. They may be compared as:
- #
- # s = Diff::LCS.sdiff(seq1, seq2).reject { |e| e.action == "=" }
- # c = Diff::LCS.sdiff(seq1, seq2, Diff::LCS::ContextDiffCallbacks).flatten
- #
- # s == c # -> true
- #
- # === Use
- # This callback object must be initialised and can be used by the
- # Diff::LCS#diff or Diff::LCS#sdiff methods.
- #
- # cbo = Diff::LCS::ContextDiffCallbacks.new
- # Diff::LCS.LCS(seq1, seq2, cbo)
- # cbo.finish
- #
- # Note that the call to #finish is absolutely necessary, or the last set of
- # changes will not be visible. Alternatively, it can be used as:
- #
- # cbo = Diff::LCS::ContextDiffCallbacks.new { |tcbo| Diff::LCS.LCS(seq1, seq2, tcbo) }
- #
- # The necessary #finish call will be made.
- #
- # === Simplified Array Format
- # The simplified array format used in the example above can be obtained
- # with:
- #
- # require 'pp'
- # pp diffs.map { |e| e.map { |f| f.to_a } }
+# This will produce a compound array of contextual diff change objects. Each
+# element in the #diffs array is a "hunk" array, where each element in each
+# "hunk" array is a single change. Each change is a Diff::LCS::ContextChange
+# that contains both the old index and new index values for the change. The
+# "hunk" provides the full context for the changes. Both old and new objects
+# will be presented for changed objects. +nil+ will be substituted for a
+# discarded object.
+#
+# seq1 = %w(a b c e h j l m n p)
+# seq2 = %w(b c d e f j k l m r s t)
+#
+# diffs = Diff::LCS.diff(seq1, seq2, Diff::LCS::ContextDiffCallbacks)
+# # This example shows a simplified array format.
+# # [ [ [ '-', [ 0, 'a' ], [ 0, nil ] ] ], # 1
+# # [ [ '+', [ 3, nil ], [ 2, 'd' ] ] ], # 2
+# # [ [ '-', [ 4, 'h' ], [ 4, nil ] ], # 3
+# # [ '+', [ 5, nil ], [ 4, 'f' ] ] ],
+# # [ [ '+', [ 6, nil ], [ 6, 'k' ] ] ], # 4
+# # [ [ '-', [ 8, 'n' ], [ 9, nil ] ], # 5
+# # [ '+', [ 9, nil ], [ 9, 'r' ] ],
+# # [ '-', [ 9, 'p' ], [ 10, nil ] ],
+# # [ '+', [ 10, nil ], [ 10, 's' ] ],
+# # [ '+', [ 10, nil ], [ 11, 't' ] ] ] ]
+#
+# The five hunks shown are comprised of individual changes; if there is a
+# related set of changes, they are still shown individually.
+#
+# This callback can also be used with Diff::LCS#sdiff, which will produce
+# results like:
+#
+# diffs = Diff::LCS.sdiff(seq1, seq2, Diff::LCS::ContextDiffCallbacks)
+# # This example shows a simplified array format.
+# # [ [ [ "-", [ 0, "a" ], [ 0, nil ] ] ], # 1
+# # [ [ "+", [ 3, nil ], [ 2, "d" ] ] ], # 2
+# # [ [ "!", [ 4, "h" ], [ 4, "f" ] ] ], # 3
+# # [ [ "+", [ 6, nil ], [ 6, "k" ] ] ], # 4
+# # [ [ "!", [ 8, "n" ], [ 9, "r" ] ], # 5
+# # [ "!", [ 9, "p" ], [ 10, "s" ] ],
+# # [ "+", [ 10, nil ], [ 11, "t" ] ] ] ]
+#
+# The five hunks are still present, but are significantly shorter in total
+# presentation, because changed items are shown as changes ("!") instead of
+# potentially "mismatched" pairs of additions and deletions.
+#
+# The result of this operation is similar to that of
+# Diff::LCS::SDiffCallbacks. They may be compared as:
+#
+# s = Diff::LCS.sdiff(seq1, seq2).reject { |e| e.action == "=" }
+# c = Diff::LCS.sdiff(seq1, seq2, Diff::LCS::ContextDiffCallbacks).flatten
+#
+# s == c # -> true
+#
+# === Use
+#
+# This callback object must be initialised and can be used by the
+# Diff::LCS#diff or Diff::LCS#sdiff methods.
+#
+# cbo = Diff::LCS::ContextDiffCallbacks.new
+# Diff::LCS.LCS(seq1, seq2, cbo)
+# cbo.finish
+#
+# Note that the call to #finish is absolutely necessary, or the last set of
+# changes will not be visible. Alternatively, it can be used as:
+#
+# cbo = Diff::LCS::ContextDiffCallbacks.new { |tcbo| Diff::LCS.LCS(seq1, seq2, tcbo) }
+#
+# The necessary #finish call will be made.
+#
+# === Simplified Array Format
+#
+# The simplified array format used in the example above can be obtained
+# with:
+#
+# require 'pp'
+# pp diffs.map { |e| e.map { |f| f.to_a } }
class Diff::LCS::ContextDiffCallbacks < Diff::LCS::DiffCallbacks
def discard_a(event)
@hunk << Diff::LCS::ContextChange.simplify(event)
@@ -232,70 +227,72 @@ def change(event)
end
end
- # This will produce a simple array of diff change objects. Each element in
- # the #diffs array is a single ContextChange. In the set of #diffs provided
- # by SDiffCallbacks, both old and new objects will be presented for both
- # changed <strong>and unchanged</strong> objects. +nil+ will be substituted
- # for a discarded object.
- #
- # The diffset produced by this callback, when provided to Diff::LCS#sdiff,
- # will compute and display the necessary components to show two sequences
- # and their minimized differences side by side, just like the Unix utility
- # +sdiff+.
- #
- # same same
- # before | after
- # old < -
- # - > new
- #
- # seq1 = %w(a b c e h j l m n p)
- # seq2 = %w(b c d e f j k l m r s t)
- #
- # diffs = Diff::LCS.sdiff(seq1, seq2)
- # # This example shows a simplified array format.
- # # [ [ "-", [ 0, "a"], [ 0, nil ] ],
- # # [ "=", [ 1, "b"], [ 0, "b" ] ],
- # # [ "=", [ 2, "c"], [ 1, "c" ] ],
- # # [ "+", [ 3, nil], [ 2, "d" ] ],
- # # [ "=", [ 3, "e"], [ 3, "e" ] ],
- # # [ "!", [ 4, "h"], [ 4, "f" ] ],
- # # [ "=", [ 5, "j"], [ 5, "j" ] ],
- # # [ "+", [ 6, nil], [ 6, "k" ] ],
- # # [ "=", [ 6, "l"], [ 7, "l" ] ],
- # # [ "=", [ 7, "m"], [ 8, "m" ] ],
- # # [ "!", [ 8, "n"], [ 9, "r" ] ],
- # # [ "!", [ 9, "p"], [ 10, "s" ] ],
- # # [ "+", [ 10, nil], [ 11, "t" ] ] ]
- #
- # The result of this operation is similar to that of
- # Diff::LCS::ContextDiffCallbacks. They may be compared as:
- #
- # s = Diff::LCS.sdiff(seq1, seq2).reject { |e| e.action == "=" }
- # c = Diff::LCS.sdiff(seq1, seq2, Diff::LCS::ContextDiffCallbacks).flatten
- #
- # s == c # -> true
- #
- # === Use
- # This callback object must be initialised and is used by the Diff::LCS#sdiff
- # method.
- #
- # cbo = Diff::LCS::SDiffCallbacks.new
- # Diff::LCS.LCS(seq1, seq2, cbo)
- #
- # As with the other initialisable callback objects, Diff::LCS::SDiffCallbacks
- # can be initialised with a block. As there is no "fininishing" to be done,
- # this has no effect on the state of the object.
- #
- # cbo = Diff::LCS::SDiffCallbacks.new { |tcbo| Diff::LCS.LCS(seq1, seq2, tcbo) }
- #
- # === Simplified Array Format
- # The simplified array format used in the example above can be obtained
- # with:
- #
- # require 'pp'
- # pp diffs.map { |e| e.to_a }
+# This will produce a simple array of diff change objects. Each element in
+# the #diffs array is a single ContextChange. In the set of #diffs provided
+# by SDiffCallbacks, both old and new objects will be presented for both
+# changed <strong>and unchanged</strong> objects. +nil+ will be substituted
+# for a discarded object.
+#
+# The diffset produced by this callback, when provided to Diff::LCS#sdiff,
+# will compute and display the necessary components to show two sequences
+# and their minimized differences side by side, just like the Unix utility
+# +sdiff+.
+#
+# same same
+# before | after
+# old < -
+# - > new
+#
+# seq1 = %w(a b c e h j l m n p)
+# seq2 = %w(b c d e f j k l m r s t)
+#
+# diffs = Diff::LCS.sdiff(seq1, seq2)
+# # This example shows a simplified array format.
+# # [ [ "-", [ 0, "a"], [ 0, nil ] ],
+# # [ "=", [ 1, "b"], [ 0, "b" ] ],
+# # [ "=", [ 2, "c"], [ 1, "c" ] ],
+# # [ "+", [ 3, nil], [ 2, "d" ] ],
+# # [ "=", [ 3, "e"], [ 3, "e" ] ],
+# # [ "!", [ 4, "h"], [ 4, "f" ] ],
+# # [ "=", [ 5, "j"], [ 5, "j" ] ],
+# # [ "+", [ 6, nil], [ 6, "k" ] ],
+# # [ "=", [ 6, "l"], [ 7, "l" ] ],
+# # [ "=", [ 7, "m"], [ 8, "m" ] ],
+# # [ "!", [ 8, "n"], [ 9, "r" ] ],
+# # [ "!", [ 9, "p"], [ 10, "s" ] ],
+# # [ "+", [ 10, nil], [ 11, "t" ] ] ]
+#
+# The result of this operation is similar to that of
+# Diff::LCS::ContextDiffCallbacks. They may be compared as:
+#
+# s = Diff::LCS.sdiff(seq1, seq2).reject { |e| e.action == "=" }
+# c = Diff::LCS.sdiff(seq1, seq2, Diff::LCS::ContextDiffCallbacks).flatten
+#
+# s == c # -> true
+#
+# === Use
+#
+# This callback object must be initialised and is used by the Diff::LCS#sdiff
+# method.
+#
+# cbo = Diff::LCS::SDiffCallbacks.new
+# Diff::LCS.LCS(seq1, seq2, cbo)
+#
+# As with the other initialisable callback objects,
+# Diff::LCS::SDiffCallbacks can be initialised with a block. As there is no
+# "finishing" to be done, this has no effect on the state of the object.
+#
+# cbo = Diff::LCS::SDiffCallbacks.new { |tcbo| Diff::LCS.LCS(seq1, seq2, tcbo) }
+#
+# === Simplified Array Format
+#
+# The simplified array format used in the example above can be obtained
+# with:
+#
+# require 'pp'
+# pp diffs.map { |e| e.to_a }
class Diff::LCS::SDiffCallbacks
- # Returns the difference set collected during the diff process.
+ # Returns the difference set collected during the diff process.
attr_reader :diffs
def initialize #:yields self:
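The side-by-side layout described in the SDiffCallbacks comment above can be sketched without the library, working directly on rows in the simplified array format. This is only an illustration: `ROWS` copies four rows from the documented example, while `GUTTER` and `side_by_side` are hypothetical names that are not part of diff-lcs, and the column width is an arbitrary choice.

```ruby
# Rows in the simplified array format:
#   [action, [old_position, old_element], [new_position, new_element]]
# These four rows are copied from the sdiff example in the comment above.
ROWS = [
  ["-", [0, "a"], [0, nil]],
  ["=", [1, "b"], [0, "b"]],
  ["+", [3, nil], [2, "d"]],
  ["!", [4, "h"], [4, "f"]],
].freeze

# Hypothetical mapping (not part of diff-lcs) from an action to the gutter
# character shown in the layout sketch: unchanged, changed, removed, added.
GUTTER = { "=" => " ", "!" => "|", "-" => "<", "+" => ">" }.freeze

# Hypothetical helper (not part of diff-lcs): renders each row as
# "old  <gutter>  new", printing "-" where an element is nil.
def side_by_side(rows, width = 6)
  rows.map do |action, (_old_pos, old_elem), (_new_pos, new_elem)|
    left  = (old_elem || "-").to_s.ljust(width)
    right = (new_elem || "-").to_s
    "#{left} #{GUTTER.fetch(action)} #{right}"
  end
end

puts side_by_side(ROWS)
```

With real data, the rows would come from `diffs.map { |e| e.to_a }` as shown in the Simplified Array Format section.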
5 lib/diff/lcs/change.rb
@@ -1,5 +1,4 @@
# -*- ruby encoding: utf-8 -*-
-# Provides Diff::LCS::Change and Diff::LCS::ContextChange.
# Represents a simplistic (non-contextual) change. Represents the removal or
# addition of an element from either the old or the new sequenced
@@ -33,7 +32,7 @@ def initialize(*args)
end
def inspect
- %Q(#<#{self.class.name}:#{__id__.to_s(16)} @action=#{action} position=#{position} element=#{element.inspect})
+ to_a.inspect
end
def to_a
@@ -131,7 +130,7 @@ def to_a
end
def inspect(*args)
- %Q(#<#{self.class.name}:#{__id__} @action=#{action} positions=#{old_position},#{new_position} elements=#{old_element.inspect},#{new_element.inspect}>)
+ to_a.inspect
end
def self.from_a(arr)
2 lib/diff/lcs/htmldiff.rb
@@ -147,5 +147,3 @@ def run
OUTPUT
end
end
-
-# vim: ft=ruby
121 lib/diff/lcs/hunk.rb
@@ -1,13 +1,15 @@
+# -*- ruby encoding: utf-8 -*-
+
require 'diff/lcs/block'
- # A Hunk is a group of Blocks which overlap because of the context
- # surrounding each block. (So if we're not using context, every hunk will
- # contain one block.) Used in the diff program (bin/diff).
+# A Hunk is a group of Blocks which overlap because of the context
+# surrounding each block. (So if we're not using context, every hunk will
+# contain one block.) Used in the diff program (bin/diff).
class Diff::LCS::Hunk
- # Create a hunk using references to both the old and new data, as well as
- # the piece of data
- def initialize(data_old, data_new, piece, context, file_length_difference)
- # At first, a hunk will have just one Block in it
+ # Create a hunk using references to both the old and new data, as well as
+ # the piece of data.
+ def initialize(data_old, data_new, piece, flag_context, file_length_difference)
+ # At first, a hunk will have just one Block in it
@blocks = [ Diff::LCS::Block.new(piece) ]
@data_old = data_old
@data_new = data_new
@@ -16,10 +18,10 @@ def initialize(data_old, data_new, piece, context, file_length_difference)
after += @blocks[0].diff_size
@file_length_difference = after # The caller must get this manually
- # Save the start & end of each array. If the array doesn't exist
- # (e.g., we're only adding items in this block), then figure out the
- # line number based on the line number of the other file and the
- # current difference in file lengths.
+ # Save the start & end of each array. If the array doesn't exist (e.g.,
+ # we're only adding items in this block), then figure out the line
+ # number based on the line number of the other file and the current
+ # difference in file lengths.
if @blocks[0].remove.empty?
a1 = a2 = nil
else
@@ -39,7 +41,7 @@ def initialize(data_old, data_new, piece, context, file_length_difference)
@end_old = a2 || (b2 - after)
@end_new = b2 || (a2 + after)
- self.flag_context = context
+ self.flag_context = flag_context
end
attr_reader :blocks
@@ -47,10 +49,10 @@ def initialize(data_old, data_new, piece, context, file_length_difference)
attr_reader :end_old, :end_new
attr_reader :file_length_difference
- # Change the "start" and "end" fields to note that context should be added
- # to this hunk
+ # Change the "start" and "end" fields to note that context should be added
+ # to this hunk.
attr_accessor :flag_context
- undef :flag_context=
+ undef :flag_context=;
def flag_context=(context) #:nodoc:
return if context.nil? or context.zero?
@@ -67,22 +69,28 @@ def flag_context=(context) #:nodoc:
@end_new += add_end
end
- def unshift(hunk)
- @start_old = hunk.start_old
- @start_new = hunk.start_new
- blocks.unshift(*hunk.blocks)
+ # Merges this hunk and the provided hunk together if they overlap. Returns
+ # a truthy value when the merge happens and +nil+ when there is no overlap,
+ # so the caller can tell that the merge was skipped.
+ def merge(hunk)
+ if overlaps?(hunk)
+ @start_old = hunk.start_old
+ @start_new = hunk.start_new
+ blocks.unshift(*hunk.blocks)
+ else
+ nil
+ end
end
- # Is there an overlap between hunk arg0 and old hunk arg1? Note: if end
- # of old hunk is one less than beginning of second, they overlap
- def overlaps?(hunk = nil)
- return nil if hunk.nil?
-
- a = (@start_old - hunk.end_old) <= 1
- b = (@start_new - hunk.end_new) <= 1
- return (a or b)
+ # Determines whether there is an overlap between this hunk and the
+ # provided hunk. This will be true if this hunk starts no more than one
+ # position after the provided hunk ends, in either the old or the new data.
+ def overlaps?(hunk)
+ hunk and (((@start_old - hunk.end_old) <= 1) or
+ ((@start_new - hunk.end_new) <= 1))
end
+ # Returns a diff string based on a format.
def diff(format)
case format
when :old
@@ -100,44 +108,40 @@ def diff(format)
end
end
- def each_old(block)
- @data_old[@start_old .. @end_old].each { |e| yield e }
- end
-
- private
- # Note that an old diff can't have any context. Therefore, we know that
- # there's only one block in the hunk.
+ # Note that an old diff can't have any context. Therefore, we know that
+ # there's only one block in the hunk.
def old_diff
warn "Expecting only one block in an old diff hunk!" if @blocks.size > 1
op_act = { "+" => 'a', "-" => 'd', "!" => "c" }
block = @blocks[0]
- # Calculate item number range. Old diff range is just like a context
- # diff range, except the ranges are on one line with the action between
- # them.
+ # Calculate item number range. Old diff range is just like a context
+ # diff range, except the ranges are on one line with the action between
+ # them.
s = "#{context_range(:old)}#{op_act[block.op]}#{context_range(:new)}\n"
- # If removing anything, just print out all the remove lines in the hunk
- # which is just all the remove lines in the block.
+ # If removing anything, just print out all the remove lines in the hunk
+ # which is just all the remove lines in the block.
@data_old[@start_old .. @end_old].each { |e| s << "< #{e}\n" } unless block.remove.empty?
s << "---\n" if block.op == "!"
@data_new[@start_new .. @end_new].each { |e| s << "> #{e}\n" } unless block.insert.empty?
s
end
+ private :old_diff
def unified_diff
- # Calculate item number range.
+ # Calculate item number range.
s = "@@ -#{unified_range(:old)} +#{unified_range(:new)} @@\n"
- # Outlist starts containing the hunk of the old file. Removing an item
- # just means putting a '-' in front of it. Inserting an item requires
- # getting it from the new file and splicing it in. We splice in
- # +num_added+ items. Remove blocks use +num_added+ because splicing
- # changed the length of outlist.
- #
- # We remove +num_removed+ items. Insert blocks use +num_removed+
- # because their item numbers -- corresponding to positions in the NEW
- # file -- don't take removed items into account.
+ # Outlist starts containing the hunk of the old file. Removing an item
+ # just means putting a '-' in front of it. Inserting an item requires
+ # getting it from the new file and splicing it in. We splice in
+ # +num_added+ items. Remove blocks use +num_added+ because splicing
+ # changed the length of outlist.
+ #
+ # We remove +num_removed+ items. Insert blocks use +num_removed+
+ # because their item numbers -- corresponding to positions in the NEW
+ # file -- don't take removed items into account.
lo, hi, num_added, num_removed = @start_old, @end_old, 0, 0
outlist = @data_old[lo .. hi].collect { |e| e.gsub(/^/, ' ') }
@@ -159,14 +163,15 @@ def unified_diff
s << outlist.join("\n")
end
+ private :unified_diff
def context_diff
s = "***************\n"
s << "*** #{context_range(:old)} ****\n"
r = context_range(:new)
- # Print out file 1 part for each block in context diff format if there
- # are any blocks that remove items
+ # Print out file 1 part for each block in context diff format if there
+ # are any blocks that remove items
lo, hi = @start_old, @end_old
removes = @blocks.select { |e| not e.remove.empty? }
if removes
@@ -193,6 +198,7 @@ def context_diff
end
s
end
+ private :context_diff
def ed_diff(format)
op_act = { "+" => 'a', "-" => 'd', "!" => "c" }
@@ -210,9 +216,10 @@ def ed_diff(format)
end
s
end
+ private :ed_diff
- # Generate a range of item numbers to print. Only print 1 number if the
- # range has only one item in it. Otherwise, it's 'start,end'
+ # Generate a range of item numbers to print. Only print 1 number if the
+ # range has only one item in it. Otherwise, it's 'start,end'
def context_range(mode)
case mode
when :old
@@ -223,10 +230,11 @@ def context_range(mode)
(s < e) ? "#{s},#{e}" : "#{e}"
end
+ private :context_range
- # Generate a range of item numbers to print for unified diff. Print
- # number where block starts, followed by number of lines in the block
- # (don't print number of lines if it's 1)
+ # Generate a range of item numbers to print for unified diff. Print number
+ # where block starts, followed by number of lines in the block
+ # (don't print number of lines if it's 1)
def unified_range(mode)
case mode
when :old
@@ -239,4 +247,5 @@ def unified_range(mode)
first = (length < 2) ? e : s # "strange, but correct"
(length == 1) ? "#{first}" : "#{first},#{length}"
end
+ private :unified_range
end
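The two range-formatting rules in the comments above can be checked in isolation with a small sketch. The functions below are hypothetical free-standing versions, not the private Hunk methods: they take explicit bounds instead of a mode, and the conversion from 0-based inclusive positions to 1-based line numbers is an assumption, since that part of the methods is elided in this diff.

```ruby
# Hypothetical stand-in for the context range rule: print one number when the
# range holds a single item, otherwise "start,end".
def context_range(start, finish)
  s = start + 1   # assumed conversion to 1-based line numbers
  e = finish + 1
  s < e ? "#{s},#{e}" : e.to_s
end

# Hypothetical stand-in for the unified range rule: print the starting line,
# followed by the number of lines unless that number is 1.
def unified_range(start, finish)
  s = start + 1   # assumed conversion to 1-based line numbers
  e = finish + 1
  length = e - s + 1
  first = length < 2 ? e : s   # for an empty range, report the line before it
  length == 1 ? first.to_s : "#{first},#{length}"
end

puts context_range(2, 4)   # => 3,5
puts unified_range(2, 4)   # => 3,3
```

The same three-line span is written as `3,5` in context format (first and last line) and as `3,3` in unified format (first line and line count).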
489 lib/diff/lcs/internals.rb
@@ -1,271 +1,306 @@
# -*- ruby encoding: utf-8 -*-
module Diff::LCS::Internals # :nodoc:
- class << self
- # Compute the longest common subsequence between the sequenced
- # Enumerables +a+ and +b+. The result is an array whose contents is such
- # that
- #
- # result = Diff::LCS::Internals.lcs(a, b)
- # result.each_with_index do |e, ii|
- # assert_equal(a[ii], b[e]) unless e.nil?
- # end
- def lcs(a, b)
- a_start = b_start = 0
- a_finish = a.size - 1
- b_finish = b.size - 1
- vector = []
-
- # Prune off any common elements at the beginning...
- while (a_start <= a_finish) and
- (b_start <= b_finish) and
- (a[a_start] == b[b_start])
- vector[a_start] = b_start
- a_start += 1
- b_start += 1
- end
-
- # Now the end...
- while (a_start <= a_finish) and
- (b_start <= b_finish) and
- (a[a_finish] == b[b_finish])
- vector[a_finish] = b_finish
- a_finish -= 1
- b_finish -= 1
- end
+end
- # Now, compute the equivalence classes of positions of elements.
- b_matches = position_hash(b, b_start .. b_finish)
+class << Diff::LCS::Internals
+ # Compute the longest common subsequence between the sequenced
+ # Enumerables +a+ and +b+. The result is an array whose contents is such
+ # that
+ #
+ # result = Diff::LCS::Internals.lcs(a, b)
+ # result.each_with_index do |e, ii|
+ # assert_equal(a[ii], b[e]) unless e.nil?
+ # end
+ def lcs(a, b)
+ a_start = b_start = 0
+ a_finish = a.size - 1
+ b_finish = b.size - 1
+ vector = []
- thresh = []
- links = []
+ # Prune off any common elements at the beginning...
+ while (a_start <= a_finish) and
+ (b_start <= b_finish) and
+ (a[a_start] == b[b_start])
+ vector[a_start] = b_start
+ a_start += 1
+ b_start += 1
+ end
- (a_start .. a_finish).each do |ii|
- ai = a.kind_of?(String) ? a[ii, 1] : a[ii]
- bm = b_matches[ai]
- kk = nil
- bm.reverse_each do |jj|
- if kk and (thresh[kk] > jj) and (thresh[kk - 1] < jj)
- thresh[kk] = jj
- else
- kk = replace_next_larger(thresh, jj, kk)
- end
- links[kk] = [ (kk > 0) ? links[kk - 1] : nil, ii, jj ] unless kk.nil?
- end
- end
+ # Now the end...
+ while (a_start <= a_finish) and
+ (b_start <= b_finish) and
+ (a[a_finish] == b[b_finish])
+ vector[a_finish] = b_finish
+ a_finish -= 1
+ b_finish -= 1
+ end
- unless thresh.empty?
- link = links[thresh.size - 1]
- while not link.nil?
- vector[link[1]] = link[2]
- link = link[0]
- end
- end
+ # Now, compute the equivalence classes of positions of elements.
+ b_matches = position_hash(b, b_start .. b_finish)
- vector
- end
+ thresh = []
+ links = []
- # This method will analyze the provided patchset to provide a
- # single-pass normalization (conversion of the array form of
- # Diff::LCS::Change objects to the object form of same) and detection of
- # whether the patchset represents changes to be made.
- def analyze_patchset(patchset, depth = 0)
- raise "Patchset too complex" if depth > 1
-
- has_changes = false
-
- # Format:
- # [ # patchset
- # # hunk (change)
- # [ # hunk
- # # change
- # ]
- # ]
-