Fix broken spans in diffs #14678

Merged
merged 1 commit into from
Feb 14, 2021

Contributor

@zeripath zeripath commented Feb 14, 2021

Gitea runs diff on the highlighted code fragments for each line in order to provide
code highlight diffs. Unfortunately, this diff algorithm is not aware that span tags
and entities are atomic and cannot be split.

The current fixup code makes some attempt to fix these broken tags; however, it cannot
handle situations where a tag is split over multiple blocks.

This PR provides a more algorithmic fixup mechanism whereby spans and entities are
completely coalesced into their respective blocks.

This may result in an incompletely reduced diff, but it will definitely prevent the
broken entities and spans that are currently possible.

As a result of this fixup, several inconsistencies were discovered in our test cases,
and these were also fixed.

Fix #14231

Signed-off-by: Andrew Thornton <art27@cantab.net>
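To illustrate the coalescing idea for entities only, here is a minimal sketch. It is not the code from this PR: `Op`, `Diff`, `change`, `openEntity` and `coalesceEntities` are hypothetical names, the types are stand-ins for the diff library's, and the sketch assumes the input is HTML-escaped so that every `&` starts an entity. Fragments of a split entity are moved out of the Equal blocks and into both sides of the neighbouring change, and an Equal block that lies wholly inside an entity is absorbed.

```go
package main

import "strings"

// Op and Diff are hypothetical stand-ins for the diff library's types.
type Op int

const (
	Equal Op = iota
	Insert
	Delete
)

type Diff struct {
	Type Op
	Text string
}

// change holds the deleted and inserted text found between two Equal blocks.
type change struct{ del, ins string }

// openEntity returns the index of a trailing "&..." with no closing ';', or -1.
// It assumes the text is HTML-escaped, so every '&' starts an entity.
func openEntity(s string) int {
	i := strings.LastIndexByte(s, '&')
	if i < 0 || strings.IndexByte(s[i:], ';') >= 0 {
		return -1
	}
	return i
}

// coalesceEntities rewrites diffs so that no entity straddles a block boundary.
func coalesceEntities(diffs []Diff) []Diff {
	// Regroup as eqs[0], chs[0], eqs[1], chs[1], ..., eqs[len(chs)].
	eqs := []string{""}
	var chs []change
	for _, d := range diffs {
		if d.Type == Equal {
			if len(eqs) == len(chs) { // a change run has just ended
				eqs = append(eqs, "")
			}
			eqs[len(eqs)-1] += d.Text
			continue
		}
		if len(chs) < len(eqs) { // first changed block after an Equal
			chs = append(chs, change{})
		}
		if d.Type == Delete {
			chs[len(chs)-1].del += d.Text
		} else {
			chs[len(chs)-1].ins += d.Text
		}
	}
	if len(eqs) == len(chs) {
		eqs = append(eqs, "")
	}
	for k := 0; k < len(chs); k++ {
		// The Equal before the change ends mid-entity: push the fragment down.
		if at := openEntity(eqs[k]); at >= 0 {
			frag := eqs[k][at:]
			eqs[k] = eqs[k][:at]
			chs[k].del = frag + chs[k].del
			chs[k].ins = frag + chs[k].ins
		}
		// The change ends mid-entity: pull text from the following Equal.
		for openEntity(chs[k].del) >= 0 || openEntity(chs[k].ins) >= 0 {
			next := eqs[k+1]
			if end := strings.IndexByte(next, ';'); end >= 0 {
				chs[k].del += next[:end+1]
				chs[k].ins += next[:end+1]
				eqs[k+1] = next[end+1:]
				break
			}
			if k+1 == len(chs) {
				break // input ends mid-entity; nothing left to pull
			}
			// The whole Equal lies inside the entity: absorb it and the next change.
			chs[k].del += next + chs[k+1].del
			chs[k].ins += next + chs[k+1].ins
			chs = append(chs[:k+1], chs[k+2:]...)
			eqs = append(eqs[:k+1], eqs[k+2:]...)
		}
	}
	var out []Diff
	for k, eq := range eqs {
		if eq != "" {
			out = append(out, Diff{Equal, eq})
		}
		if k < len(chs) {
			if chs[k].del != "" {
				out = append(out, Diff{Delete, chs[k].del})
			}
			if chs[k].ins != "" {
				out = append(out, Diff{Insert, chs[k].ins})
			}
		}
	}
	return out
}
```

Because shared text is always copied into both the deleted and the inserted side, the old and new strings can still be reconstructed from the result. The price is that the change blocks grow, which is the "incompletely reduced diff" trade-off described above.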

@@ -61,22 +62,22 @@ func TestDiffToHTML(t *testing.T) {
{Type: dmp.DiffEqual, Text: "</span><span class=\"p\">)</span>"},
}, DiffLineDel))

- assertEqual(t, "<span class=\"nx\">r</span><span class=\"p\">.</span><span class=\"nf\">WrapperRenderer</span><span class=\"p\">(</span><span class=\"nx\">w</span><span class=\"p\">,</span> <span class=\"removed-code\"><span class=\"nx\">language</span></span><span class=\"removed-code\"><span class=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs</span></span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>", diffToHTML("", []dmp.Diff{
+ assertEqual(t, "<span class=\"nx\">r</span><span class=\"p\">.</span><span class=\"nf\">WrapperRenderer</span><span class=\"p\">(</span><span class=\"nx\">w</span><span class=\"p\">,</span> <span class=\"removed-code\"><span class=\"nx\">language</span><span class=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs</span></span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>", diffToHTML("", []dmp.Diff{
Contributor Author

This removes an unnecessary close and reopen of a removed-code span

{Type: dmp.DiffEqual, Text: "<span class=\"nx\">r</span><span class=\"p\">.</span><span class=\"nf\">WrapperRenderer</span><span class=\"p\">(</span><span class=\"nx\">w</span><span class=\"p\">,</span> <span class=\"nx\">"},
{Type: dmp.DiffDelete, Text: "language</span><span "},
{Type: dmp.DiffEqual, Text: "c"},
{Type: dmp.DiffDelete, Text: "lass=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs"},
{Type: dmp.DiffEqual, Text: "</span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>"},
}, DiffLineDel))
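Once the pieces of the split `<span class="p">` tag in the array above are joined, the deleted text still closes a span it never opened and opens one it never closes. A small helper can measure that imbalance, which indicates how many opening tags would need to be pulled in from the preceding Equal block and how many closing tags from the following one. This is only a sketch: `spanBalance` is a hypothetical name, not a function from this PR, and it assumes the fragment contains only complete tags and no elements other than `span`.

```go
package main

import "strings"

// spanBalance reports how many "</span>" in s have no earlier "<span" in s
// (unmatchedClose) and how many "<span" in s are never closed (unmatchedOpen).
// It assumes s holds only complete tags and that span is the only element.
func spanBalance(s string) (unmatchedClose, unmatchedOpen int) {
	for len(s) > 0 {
		switch {
		case strings.HasPrefix(s, "</span>"):
			if unmatchedOpen > 0 {
				unmatchedOpen-- // closes a span opened earlier in s
			} else {
				unmatchedClose++ // closes a span opened before s
			}
			s = s[len("</span>"):]
		case strings.HasPrefix(s, "<span"):
			unmatchedOpen++
			s = s[len("<span"):]
		default:
			s = s[1:]
		}
	}
	return
}
```

For the joined deleted text of this test case, the helper reports one unmatched close and one unmatched open, which matches the expected output: the `<span class="nx">` before `language` and the `</span>` after `attrs` both end up inside the `removed-code` span.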

- assertEqual(t, "<span class=\"added-code\">language</span></span><span class=\"added-code\"><span class=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs</span></span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>", diffToHTML("", []dmp.Diff{
+ assertEqual(t, "<span class=\"added-code\">language</span><span class=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs</span></span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>", diffToHTML("", []dmp.Diff{
Contributor Author

This removes an unnecessary close and reopen of an added-code span

{Type: dmp.DiffInsert, Text: "language</span><span "},
{Type: dmp.DiffEqual, Text: "c"},
{Type: dmp.DiffInsert, Text: "lass=\"p\">,</span> <span class=\"kc\">true</span><span class=\"p\">,</span> <span class=\"nx\">attrs"},
{Type: dmp.DiffEqual, Text: "</span><span class=\"p\">,</span> <span class=\"kc\">false</span><span class=\"p\">)</span>"},
}, DiffLineAdd))

- assertEqual(t, "<span class=\"k\">print</span><span class=\"added-code\"></span><span class=\"added-code\"><span class=\"p\">(</span></span><span class=\"sa\"></span><span class=\"s2\">&#34;</span><span class=\"s2\">// </span><span class=\"s2\">&#34;</span><span class=\"p\">,</span> <span class=\"n\">sys</span><span class=\"o\">.</span><span class=\"n\">argv</span><span class=\"added-code\"><span class=\"p\">)</span></span>", diffToHTML("", []dmp.Diff{
+ assertEqual(t, "<span class=\"k\">print</span><span class=\"added-code\"><span class=\"p\">(</span></span><span class=\"sa\"></span><span class=\"s2\">&#34;</span><span class=\"s2\">// </span><span class=\"s2\">&#34;</span><span class=\"p\">,</span> <span class=\"n\">sys</span><span class=\"o\">.</span><span class=\"n\">argv</span><span class=\"added-code\"><span class=\"p\">)</span></span>", diffToHTML("", []dmp.Diff{
Contributor Author

This removes an unnecessary close and reopen of an added-code span

@@ -85,14 +86,14 @@ func TestDiffToHTML(t *testing.T) {
{Type: dmp.DiffInsert, Text: "<span class=\"p\">)</span>"},
}, DiffLineAdd))

- assertEqual(t, "sh <span class=\"added-code\">&#39;useradd -u $(stat -c &#34;%u&#34; .gitignore) jenkins</span>&#39;", diffToHTML("", []dmp.Diff{
+ assertEqual(t, "sh <span class=\"added-code\">&#39;useradd -u $(stat -c &#34;%u&#34; .gitignore) jenkins&#39;</span>", diffToHTML("", []dmp.Diff{
Contributor Author

If you compare the provided diff array with what was previously listed as expected, you'll see that we actually expected something incorrect: the closing entity changes from &#34; to &#39;, so it belongs inside the added-code span.

{Type: dmp.DiffEqual, Text: "sh &#3"},
{Type: dmp.DiffDelete, Text: "4;useradd -u 111 jenkins&#34"},
{Type: dmp.DiffInsert, Text: "9;useradd -u $(stat -c &#34;%u&#34; .gitignore) jenkins&#39"},
{Type: dmp.DiffEqual, Text: ";"},
}, DiffLineAdd))

- assertEqual(t, "<span class=\"x\"> &lt;h<span class=\"added-code\">4 class=</span><span class=\"added-code\">&#34;release-list-title df ac&#34;</span>&gt;</span>", diffToHTML("", []dmp.Diff{
+ assertEqual(t, "<span class=\"x\"> &lt;h<span class=\"added-code\">4 class=&#34;release-list-title df ac&#34;</span>&gt;</span>", diffToHTML("", []dmp.Diff{
Contributor Author

This removes an unnecessary close and reopen of an added-code span

Comment on lines +467 to +476
func TestDiffToHTML_14231(t *testing.T) {
	setting.Cfg = ini.Empty()
	diffRecord := diffMatchPatch.DiffMain(highlight.Code("main.v", " run()\n"), highlight.Code("main.v", " run(db)\n"), true)
	diffRecord = diffMatchPatch.DiffCleanupEfficiency(diffRecord)

	expected := ` <span class="n">run</span><span class="added-code"><span class="o">(</span><span class="n">db</span></span><span class="o">)</span>`
	output := diffToHTML("main.v", diffRecord, DiffLineAdd)

	assertEqual(t, expected, output)
}
Contributor Author

Here we add a test case specific to the issue that this PR fixes.
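For readers unfamiliar with how the expected string in this test comes about, the following is a rough sketch of rendering one side of an inline diff once every block holds whole tags and entities. It is not the actual `diffToHTML` implementation: `Op`, `Diff` and `renderSide` are hypothetical names, and the blocks used with it are hand-written, not real diff output.

```go
package main

import "strings"

// Op and Diff are hypothetical stand-ins for the diff library's types.
type Op int

const (
	Equal Op = iota
	Insert
	Delete
)

type Diff struct {
	Type Op
	Text string
}

// renderSide renders one side of an inline diff as HTML. side is Insert for
// an added line and Delete for a removed line; blocks that belong to the
// other side are skipped. It assumes every block already holds whole tags
// and entities, so wrapping a block cannot break the markup.
func renderSide(diffs []Diff, side Op) string {
	class := "added-code"
	if side == Delete {
		class = "removed-code"
	}
	var b strings.Builder
	for _, d := range diffs {
		if d.Type == Equal {
			b.WriteString(d.Text)
		} else if d.Type == side {
			b.WriteString(`<span class="` + class + `">` + d.Text + `</span>`)
		}
	}
	return b.String()
}
```

With an Equal block for `run`, an Insert block holding the two complete spans for `(` and `db`, and an Equal block for `)`, rendering the added side yields the `expected` string in the test above.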

@GiteaBot GiteaBot added the lgtm/need 2 label (This PR needs two approvals by maintainers to be considered for merging.) Feb 14, 2021
@codecov-io

Codecov Report

Merging #14678 (09ab1be) into master (487f2ee) will increase coverage by 0.07%.
The diff coverage is 98.36%.

@@            Coverage Diff             @@
##           master   #14678      +/-   ##
==========================================
+ Coverage   42.21%   42.29%   +0.07%     
==========================================
  Files         767      767              
  Lines       81624    81739     +115     
==========================================
+ Hits        34458    34569     +111     
- Misses      41531    41534       +3     
- Partials     5635     5636       +1     
Impacted Files                               Coverage Δ
modules/structs/issue.go                     0.00% <ø> (ø)
modules/queue/unique_queue_disk_channel.go   53.73% <50.00%> (-0.12%) ⬇️
modules/setting/setting.go                   49.03% <100.00%> (ø)
modules/templates/base.go                    42.30% <100.00%> (+1.13%) ⬆️
services/gitdiff/gitdiff.go                  73.70% <100.00%> (+4.93%) ⬆️
models/unit.go                               46.57% <0.00%> (-2.74%) ⬇️
models/repo_list.go                          77.87% <0.00%> (-0.89%) ⬇️
routers/repo/view.go                         41.16% <0.00%> (-0.63%) ⬇️
models/login_source.go                       27.70% <0.00%> (+0.26%) ⬆️
... and 2 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update f3847c9...ae9dc35.

@GiteaBot GiteaBot added the lgtm/need 1 label (This PR needs approval from one additional maintainer to be merged.) and removed the lgtm/need 2 label (This PR needs two approvals by maintainers to be considered for merging.) Feb 14, 2021
Member

@6543 6543 left a comment

Unless we want to rewrite the whole diff code, this is the way to go 🚀

@GiteaBot GiteaBot added the lgtm/done label (This PR has enough approvals to get merged. There are no important open reservations anymore.) and removed the lgtm/need 1 label (This PR needs approval from one additional maintainer to be merged.) Feb 14, 2021
@6543
Member

6543 commented Feb 14, 2021

🚀

@6543 6543 merged commit beb2058 into go-gitea:master Feb 14, 2021
@zeripath zeripath deleted the fix-14231-fix-up-diff-line branch February 14, 2021 14:53
zeripath added a commit to zeripath/gitea that referenced this pull request Feb 14, 2021
Backport go-gitea#14678
6543 added a commit that referenced this pull request Feb 14, 2021
Backport #14678
Co-authored-by: 6543 <6543@obermui.de>
@zeripath zeripath added the backport/done label (All backports for this PR have been created) Mar 1, 2021
@go-gitea go-gitea locked and limited conversation to collaborators May 13, 2021
Labels
backport/done (All backports for this PR have been created), lgtm/done (This PR has enough approvals to get merged. There are no important open reservations anymore.), type/bug
Development

Successfully merging this pull request may close these issues.

Display issue in commit diff
5 participants