Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use common lineHash to share indice between text1 and text2 for correct line diffs #135

Merged
merged 1 commit into from
Jan 13, 2023

Conversation

nrnrk
Copy link
Contributor

@nrnrk nrnrk commented Oct 15, 2022

What?

Use common cache of line contents between two texts in DiffLinesToChars to get line diffs correctly.

Why?

In some cases, line diffs cannot be retrieved correctly in the standard way. The following code is one of the examples.

package main

import (
	"fmt"

	"github.com/sergi/go-diff/diffmatchpatch"
)

const (
	text1 = `hoge:
  step11:
  - arrayitem1
  - arrayitem2
  step12:
    step21: hoge
    step22: -93
fuga: flatitem
`
	text2 = `hoge:
  step11:
  - arrayitem4
  - arrayitem2
  - arrayitem3
  step12:
    step21: hoge
    step22: -92
fuga: flatitem
`
)

func main() {
	dmp := diffmatchpatch.New()
	a, b, c := dmp.DiffLinesToChars(text1, text2)
	diffs := dmp.DiffMain(a, b, false)
	diffs = dmp.DiffCharsToLines(diffs, c)
	// DiffCleanupSemantic improves a little but not enough
	// diffs = dmp.DiffCleanupSemantic(diffs)
	fmt.Println(diffs)
}
[{Insert hoge:
  step11:
hoge:
} {Equal hoge:
} {Insert hoge:
} {Equal   step11:
} {Insert hoge:
} {Equal   - arrayitem1
} {Insert hoge:
} {Equal   - arrayitem2
} {Insert hoge:
} {Equal   step12:
} {Insert hoge:
} {Equal     step21: hoge
} {Insert hoge:
} {Equal     step22: -93
} {Delete fuga: flatitem
}]

This fix corresponds to javascript implementation.

Testing?

Add a unit testcase

Anything Else?

This is my first contribution to this repository, therefore I would really appreciate any feedback, suggestions and change requests.

Use common cache of line contents between two texts in `DiffLinesToChars` to get line diffs correctly.
In some cases, line diffs cannot be retrieved correctly in the standard way (https://github.com/google/diff-match-patch/wiki/Line-or-Word-Diffs#line-mode).
In the below case, we failed to get line diffs correctly before this fix.

```go:main.go
package main

import (
	"fmt"

	"github.com/sergi/go-diff/diffmatchpatch"
)

const (
	text1 = `hoge:
  step11:
  - arrayitem1
  - arrayitem2
  step12:
    step21: hoge
    step22: -93
fuga: flatitem
`
	text2 = `hoge:
  step11:
  - arrayitem4
  - arrayitem2
  - arrayitem3
  step12:
    step21: hoge
    step22: -92
fuga: flatitem
`
)

func main() {
	dmp := diffmatchpatch.New()
	a, b, c := dmp.DiffLinesToChars(text1, text2)
	diffs := dmp.DiffMain(a, b, false)
	diffs = dmp.DiffCharsToLines(diffs, c)
	// diffs = dmp.DiffCleanupSemantic(diffs)
	fmt.Println(diffs)
}
```

```text:output
[{Insert hoge:
  step11:
hoge:
} {Equal hoge:
} {Insert hoge:
} {Equal   step11:
} {Insert hoge:
} {Equal   - arrayitem1
} {Insert hoge:
} {Equal   - arrayitem2
} {Insert hoge:
} {Equal   step12:
} {Insert hoge:
} {Equal     step21: hoge
} {Insert hoge:
} {Equal     step22: -93
} {Delete fuga: flatitem
}]
```

Note: This fix corresponds to a javascript implementation.
(ref: https://github.com/google/diff-match-patch/blob/62f2e689f498f9c92dbc588c58750addec9b1654/javascript/diff_match_patch_uncompressed.js#L466)
Copy link
Owner

@sergi sergi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sergi sergi merged commit 11fe19c into sergi:master Jan 13, 2023
@nrnrk nrnrk deleted the fix_use_common_line_hash branch January 13, 2023 09:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants