dgawlik/diffy

Diffy

Diffy eases work with long char or byte sequences by providing diff functionality in Java. At its core is an implementation of Myers' general-purpose diff algorithm. A comparison returns a structure containing insert|delete|replace|match ranges.

Features

Here are some of the tool's features:

  • fast general purpose diff in nearly linear time for similar sequences
  • readable representation of diff result either symbolic or colorful
  • extensible and configurable
  • minimal dependencies
  • implementation with ease of understanding in mind
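The "nearly linear time for similar sequences" claim follows from how Myers' algorithm searches: its work is roughly proportional to (N + M) * D, where N and M are the input lengths and D is the number of inserted plus deleted characters. The sketch below is not Diffy's code. It is a minimal, self-contained version of the greedy forward pass that computes only D and returns no ranges. The class and method names are made up for illustration.

```java
// Minimal sketch of the greedy forward pass of Myers' O(ND) algorithm.
// Hypothetical names; this is not Diffy's implementation.
final class MyersDistanceSketch {

    // Returns the number of single-character inserts plus deletes
    // needed to turn a into b.
    static int editDistance(char[] a, char[] b) {
        int n = a.length;
        int m = b.length;
        int max = n + m;
        // v[offset + k] holds the furthest x reached on diagonal k = x - y.
        int offset = max + 1;
        int[] v = new int[2 * max + 3];
        for (int d = 0; d <= max; d++) {
            for (int k = -d; k <= d; k += 2) {
                int x;
                if (k == -d || (k != d && v[offset + k - 1] < v[offset + k + 1])) {
                    x = v[offset + k + 1];      // step down: insert a char of b
                } else {
                    x = v[offset + k - 1] + 1;  // step right: delete a char of a
                }
                int y = x - k;
                // Follow the "snake": equal characters cost nothing.
                while (x < n && y < m && a[x] == b[y]) {
                    x++;
                    y++;
                }
                v[offset + k] = x;
                if (x >= n && y >= m) {
                    return d;
                }
            }
        }
        return max; // not reached: d = n + m always suffices
    }
}
```

For the strings used in the Usage section, only the diagonals near the main one are explored, because D is just 5 (the deleted word brown).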

Installation

First add JCenter to your repositories, then add the dependency:

Maven

<dependency>
  <groupId>org.bytediff</groupId>
  <artifactId>diffy</artifactId>
  <version>1.1.0</version>
  <type>pom</type>
</dependency>

Gradle

implementation 'org.bytediff:diffy:1.1.0'

Usage

Let's start with comparing two strings:

char[] source = "quickbrownfoxjumpingoverlazydog".toCharArray();
char[] target = "quickfoxjumpingoverlazydog".toCharArray();

DiffInfo info = Diff.compute(source, target);
String result = Printer.from(info).print();
System.out.println(result);

Console:

quick--[brown]foxjumpingoverlazydog

Comparing two arrays results in a DiffInfo, which offers various methods for retrieving diff data programmatically. Currently there is one Printer, but it is fully configurable; more on this in the next example.

According to the console output, to turn source into target we need to delete the word brown. Here --[word] stands for deletion, ++[word] for insertion and ~~[word] for replacement. A match prints the sequence unchanged.
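To make the notation concrete, here is a minimal sketch of how a list of edits could be turned into this symbolic form. It does not use Diffy's classes: Kind, Edit and render are hypothetical names. It also assumes that a replacement prints the new text.

```java
import java.util.List;

// Hypothetical sketch of the symbolic notation; not Diffy's Printer.
final class SymbolicRenderSketch {

    enum Kind { MATCH, INSERT, DELETE, REPLACE }

    // One contiguous edit: its kind and the characters it covers.
    static final class Edit {
        final Kind kind;
        final String text;

        Edit(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    static String render(List<Edit> edits) {
        StringBuilder out = new StringBuilder();
        for (Edit e : edits) {
            switch (e.kind) {
                case MATCH:
                    out.append(e.text); // matches are printed unchanged
                    break;
                case INSERT:
                    out.append("++[").append(e.text).append(']');
                    break;
                case DELETE:
                    out.append("--[").append(e.text).append(']');
                    break;
                case REPLACE:
                    // Assumption: the replacement (new) text is shown.
                    out.append("~~[").append(e.text).append(']');
                    break;
            }
        }
        return out.toString();
    }
}
```

Rendering a match of quick, a deletion of brown and a match of foxjumpingoverlazydog with this sketch yields the same line as the console output above.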

For really long strings this output would not be very handy. There is a way to show only the modifications themselves, each with some surrounding context.

Here is the next example:

char[] source = "quickbrownfoxoverlazydog".toCharArray();
char[] target = "quickbrownfoxjumpingoverlazydog".toCharArray();

DiffInfo info = Diff.compute(source, target);
String result = Printer
    .from(info)
    .verbose()
    .withLeftContext(7)
    .withRightContext(4)
    .print();
System.out.println(result);

Console:

...brownfox++[jumping]over...

Now the output contains one line per insert|delete|replace, showing the 7 characters that precede the modification and the 4 characters that follow it.
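The idea of a context window can be sketched in a few lines. This is only an illustration, not Diffy's verbose printer: insertWithContext is a hypothetical helper, and it assumes that "..." marks a side where text was cut off.

```java
// Hypothetical sketch of context trimming; not Diffy's verbose printer.
final class ContextWindowSketch {

    // Renders the inserted range [start, end) of target, keeping at most
    // `left` characters before it and `right` characters after it.
    static String insertWithContext(String target, int start, int end,
                                    int left, int right) {
        int from = Math.max(0, start - left);
        int to = Math.min(target.length(), end + right);
        StringBuilder out = new StringBuilder();
        if (from > 0) {
            out.append("..."); // assumption: ellipsis only where text was cut
        }
        out.append(target, from, start);
        out.append("++[").append(target, start, end).append(']');
        out.append(target, end, to);
        if (to < target.length()) {
            out.append("...");
        }
        return out.toString();
    }
}
```

For example, insertWithContext("abcdefghij", 4, 6, 2, 3) keeps cd on the left and ghi on the right of the inserted ef.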

If this symbolic representation is confusing, there is an option to display modifications in color.

Let's modify our first example:

char[] source = "quickbrownfoxjumpingoverlazydog".toCharArray();
char[] target = "quickfoxjumpingoverlazydog".toCharArray();

DiffInfo info = Diff.compute(source, target);
String result = Printer
  .from(info)
  .withFormatter(new AnsiColorFormatter())
  .print();
System.out.println(result);
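Colored terminal output of this kind is typically produced with ANSI escape sequences. The sketch below shows the general mechanism only. The color choices (red for deletions, green for insertions, yellow for replacements) and all names are assumptions, not a description of what AnsiColorFormatter emits.

```java
// Hypothetical sketch of ANSI coloring; the colors are assumptions and
// not necessarily what Diffy's AnsiColorFormatter uses.
final class AnsiColorSketch {

    static final String RED = "\u001B[31m";
    static final String GREEN = "\u001B[32m";
    static final String YELLOW = "\u001B[33m";
    static final String RESET = "\u001B[0m"; // restores the default color

    static String deleted(String s) {
        return RED + s + RESET;
    }

    static String inserted(String s) {
        return GREEN + s + RESET;
    }

    static String replaced(String s) {
        return YELLOW + s + RESET;
    }
}
```

Printing "quick" + AnsiColorSketch.deleted("brown") + "foxjumpingoverlazydog" on a terminal that understands ANSI codes shows brown in red and the rest in the default color.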

Lastly, when working with raw bytes, much of the sequence would not be printable. The next example encodes raw bytes to char[] and then displays their ordinals.

byte[] source = new byte[]{1, 2, 3};
byte[] target = new byte[]{4, 2, 3};

char[] sourceC = Raw.bytesToChars(source);
char[] targetC = Raw.bytesToChars(target);

DiffInfo info = Diff.compute(sourceC, targetC);
Printer p = Printer
    .from(info)
    .withEncoding(new RawValueEncoder(10));
System.out.println(p.print());

Console:

~~[\4 ]\2 \3 
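The two steps can be sketched without the library. This is a guess at the idea, not Diffy's code: it assumes that each byte maps to one char holding its unsigned value, and that the argument 10 is the radix used to print each ordinal. Both helper names are hypothetical.

```java
// Hypothetical sketch of byte-to-char mapping and ordinal printing;
// not the implementation of Raw or RawValueEncoder.
final class RawBytesSketch {

    // Assumption: one char per byte, holding the unsigned value 0..255.
    static char[] bytesToChars(byte[] bytes) {
        char[] chars = new char[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            chars[i] = (char) (bytes[i] & 0xFF);
        }
        return chars;
    }

    // Prints each char as a backslash, its ordinal in the given radix,
    // and a trailing space, mirroring the console output above.
    static String encode(char[] chars, int radix) {
        StringBuilder out = new StringBuilder();
        for (char c : chars) {
            out.append('\\').append(Integer.toString(c, radix)).append(' ');
        }
        return out.toString();
    }
}
```

Under these assumptions, encoding the target bytes {4, 2, 3} with radix 10 gives \4 \2 \3 followed by a space, which is the console line above without the ~~[ ] marker.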

Contributing

Feel free to raise an issue, submit a PR or suggest an improvement.