UI: Detect and restore encoding and BOM in content #6727

Merged
11 commits merged on Apr 26, 2019

@zeripath
Contributor

commented Apr 23, 2019

Fixes #6716

When decoding content, if the first 3 bytes match the UTF-8 BOM, remove them.

When updating content through the editor, check the previous content for its encoding and BOM, and re-encode the new content to match. If the content can't be encoded that way, default to UTF-8.
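
The strip-on-display, restore-on-save behaviour described above could be sketched roughly as follows. This is a minimal illustration and not the code from this PR: `splitBOM` and `restoreBOM` are hypothetical names, only the UTF-8 BOM is handled, and re-encoding to a non-UTF-8 charset is left out.

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM is the three-byte UTF-8 byte order mark (EF BB BF).
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// splitBOM reports whether content starts with a UTF-8 BOM and returns
// the content without it. Hypothetical helper, not Gitea's API.
func splitBOM(content []byte) (body []byte, hadBOM bool) {
	if bytes.HasPrefix(content, utf8BOM) {
		return content[len(utf8BOM):], true
	}
	return content, false
}

// restoreBOM prepends the BOM to edited content when the previous
// version of the file had one and the edit does not already carry it.
func restoreBOM(edited []byte, hadBOM bool) []byte {
	if !hadBOM || bytes.HasPrefix(edited, utf8BOM) {
		return edited
	}
	out := make([]byte, 0, len(utf8BOM)+len(edited))
	out = append(out, utf8BOM...)
	return append(out, edited...)
}

func main() {
	// Previous file content as stored in the repository, with a BOM.
	previous := append([]byte{0xEF, 0xBB, 0xBF}, "hello\n"...)

	// Display path: the BOM is stripped before rendering.
	shown, hadBOM := splitBOM(previous)
	fmt.Printf("shown=%q hadBOM=%v\n", shown, hadBOM)

	// Save path: the editor submits plain text; the BOM is put back.
	saved := restoreBOM([]byte("hello, world\n"), hadBOM)
	fmt.Printf("first bytes of saved content: % x\n", saved[:3])
}
```

Remembering only a boolean keeps the editor content BOM-free while still letting the save path reproduce the original byte layout.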

Signed-off-by: Andrew Thornton <art27@cantab.net>

detect and remove a decoded BOM
Signed-off-by: Andrew Thornton <art27@cantab.net>

@zeripath zeripath added this to the 1.9.0 milestone Apr 23, 2019

@zeripath

Contributor Author

commented Apr 23, 2019

This is now less easy to backport to 1.8 because of the re-encode step, but it is still possible.

@codecov-io

commented Apr 23, 2019

Codecov Report

Merging #6727 into master will decrease coverage by 0.02%.
The diff coverage is 24.44%.

@@            Coverage Diff            @@
##           master   #6727      +/-   ##
=========================================
- Coverage   41.03%     41%   -0.03%     
=========================================
  Files         421     421              
  Lines       57967   58050      +83     
=========================================
+ Hits        23784   23804      +20     
- Misses      31024   31078      +54     
- Partials     3159    3168       +9
Impacted Files                 Coverage Δ
modules/repofiles/update.go    39.47% <23.37%> (-5.27%) ⬇️
modules/templates/helper.go    48.66% <25%> (-0.3%) ⬇️
modules/base/tool.go           72.26% <40%> (-0.42%) ⬇️
models/gpg_key.go              55.83% <0%> (-0.84%) ⬇️
modules/log/event.go           65.98% <0%> (+1.52%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

zeripath added some commits Apr 23, 2019

On error keep as UTF-8
Signed-off-by: Andrew Thornton <art27@cantab.net>
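
The "on error keep as UTF-8" rule can be illustrated with a small sketch. Everything here is an assumption made for illustration: `reencode` and `encodeLatin1` are hypothetical names, and the hand-rolled ISO-8859-1 encoder merely stands in for whatever charset library the real code uses.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnencodable is returned when text holds a rune the target charset lacks.
var errUnencodable = errors.New("rune not representable in ISO-8859-1")

// encodeLatin1 is a hand-rolled stand-in for a real charset encoder:
// ISO-8859-1 maps code points 0..255 directly onto single bytes.
func encodeLatin1(text string) ([]byte, error) {
	out := make([]byte, 0, len(text))
	for _, r := range text {
		if r > 0xFF {
			return nil, errUnencodable
		}
		out = append(out, byte(r))
	}
	return out, nil
}

// reencode tries to write text in the previous charset. If the charset is
// unknown to this sketch, or the text cannot be encoded, it keeps UTF-8.
func reencode(text, previousCharset string) (data []byte, usedCharset string) {
	if previousCharset == "ISO-8859-1" {
		if b, err := encodeLatin1(text); err == nil {
			return b, previousCharset
		}
	}
	return []byte(text), "UTF-8"
}

func main() {
	// The first string fits ISO-8859-1; the second contains U+2615 and does not.
	for _, s := range []string{"café", "café ☕"} {
		data, charset := reencode(s, "ISO-8859-1")
		fmt.Printf("%q -> %s (%d bytes)\n", s, charset, len(data))
	}
}
```

Falling back rather than failing means an edit is never rejected just because the new text no longer fits the file's original charset.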

@zeripath zeripath changed the title from "UI: Detect and remove a decoded BOM" to "UI: Detect BOM and restore encoding and BOMs on updates" Apr 23, 2019

@silverwind

Contributor

commented Apr 25, 2019

Tested this and can confirm it's working well. The BOM is no longer rendered, and it is preserved, if present, when editing the file in the web editor.

@GiteaBot GiteaBot added lgtm/need 1 and removed lgtm/need 2 labels Apr 25, 2019

@GiteaBot GiteaBot added lgtm/done and removed lgtm/need 1 labels Apr 25, 2019

@zeripath

Contributor Author

commented Apr 25, 2019

Should I provide a backport?

zeripath and others added some commits Apr 25, 2019

@lafriks

Member

commented Apr 26, 2019

Yes, please do so.

@@ -267,6 +267,10 @@ func ToUTF8WithErr(content []byte) (string, error) {
	if err != nil {
		return "", err
	} else if charsetLabel == "UTF-8" {
		if len(content) > 2 && bytes.Equal(content[0:3], base.UTF8BOM) {

@lunny

lunny Apr 26, 2019

Member

How about creating a function named RemoveUTF8BOM(content string) string?

@zeripath

zeripath Apr 26, 2019

Author Contributor

Done
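
For reference, a helper with the signature suggested in the review could look something like the sketch below. The name and signature follow the reviewer's suggestion, but the body is an assumption; the helper that was actually merged is not shown in this conversation and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// utf8BOM is U+FEFF; Go stores it in a string as the bytes EF BB BF.
const utf8BOM = "\uFEFF"

// RemoveUTF8BOM returns content without a leading UTF-8 BOM.
// Content that does not start with a BOM is returned unchanged.
func RemoveUTF8BOM(content string) string {
	return strings.TrimPrefix(content, utf8BOM)
}

func main() {
	fmt.Printf("%q\n", RemoveUTF8BOM("\uFEFFpackage main"))
	fmt.Printf("%q\n", RemoveUTF8BOM("no BOM here"))
}
```

Using `strings.TrimPrefix` avoids the explicit length check and slice comparison in the diff hunk above, since it is a no-op when the prefix is absent.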

@lunny

Member

commented Apr 26, 2019

@zeripath see my comment.

zeripath added some commits Apr 26, 2019

@zeripath

Contributor Author

commented Apr 26, 2019

I also noticed that I wasn't handling updates to LFS-tracked content, so that's done too.

@zeripath

Contributor Author

commented Apr 26, 2019

(Well, it's dealt with in the same way we deal with it on the front end.)

@zeripath

Contributor Author

commented Apr 26, 2019

Don't merge - I've just noticed that the LFS stuff is slightly wrong. Edit: fixed.

Resolved review thread on modules/repofiles/update.go (outdated)
@zeripath

Contributor Author

commented Apr 26, 2019

OK fixed!

zeripath added a commit to zeripath/gitea that referenced this pull request Apr 26, 2019

Detect encoding and BOM in content (go-gitea#6727)
Detect and remove a decoded BOM when showing content.
Restore the previous encoding and BOM when updating content.
On error keep as UTF-8 encoding.

Signed-off-by: Andrew Thornton <art27@cantab.net>

@zeripath zeripath changed the title from "UI: Detect BOM and restore encoding and BOMs on updates" to "UI: Detect and restore encoding and BOM in content" Apr 26, 2019

@lafriks lafriks merged commit f6eedd4 into go-gitea:master Apr 26, 2019

2 checks passed

approvals/lgtm this commit looks good
continuous-integration/drone/pr Build is passing

techknowlogick added a commit that referenced this pull request Apr 27, 2019

Detect encoding and BOM in content (#6727) (#6765)
Detect and remove a decoded BOM when showing content.
Restore the previous encoding and BOM when updating content.
On error keep as UTF-8 encoding.

Signed-off-by: Andrew Thornton <art27@cantab.net>

@zeripath zeripath deleted the zeripath:fix-#6716 branch May 2, 2019
