Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug #16143] BOM after rewind #2430

Open
wants to merge 3 commits into
base: master
from
Open

Conversation

@nobu
Copy link
Member

nobu commented Sep 5, 2019

No description provided.

@nobu nobu force-pushed the nobu:bug/16143-bom-after-rewind branch from 99ecb17 to 8f07985 Sep 5, 2019
io.c Outdated
pos = io_seek(fptr, pos, whence);
if (pos < 0 && errno) rb_sys_fail_path(fptr->pathv);
if (fptr->mode & FMODE_SETENC_BY_BOM) io_set_encoding_by_bom(fptr);

This comment has been minimized.

Copy link
@kou

kou Oct 10, 2019

Member

We should not detect BOM when we update position.

require "tempfile"

Tempfile.create("x", encoding: "utf-8:utf-8") do |f|
  f.write("\uFEFF" "abc" "\uFEFF" "efg")
  f.close
  File.open(f.path, 'r:bom|utf-8:utf-8') do |f|
    f.pos = 6
    p f.read.unpack("U*")
  end
end

should output [65279, 101, 102, 103] but this outputs [101, 102, 103].

How about this?

diff --git a/io.c b/io.c
index 3b4e25dfa1..ba19fff799 100644
--- a/io.c
+++ b/io.c
@@ -1939,7 +1939,6 @@ io_setpos(rb_io_t *fptr, off_t pos, int whence)
 {
     pos = io_seek(fptr, pos, whence);
     if (pos < 0 && errno) rb_sys_fail_path(fptr->pathv);
-    if (fptr->mode & FMODE_SETENC_BY_BOM) io_set_encoding_by_bom(fptr);
 
     return pos;
 }
@@ -2047,6 +2046,7 @@ rb_io_rewind(VALUE io)
 
     GetOpenFile(io, fptr);
     io_setpos(fptr, 0L, SEEK_SET);
+    if (fptr->mode & FMODE_SETENC_BY_BOM) io_set_encoding_by_bom(fptr);
     if (io == ARGF.current_file) {
 	ARGF.lineno -= fptr->lineno;
     }
nobu added 2 commits Sep 5, 2019
For the IO has set_encoding_by_bom flag set, strip and set the
external encoding by the BOM if exists.  [Bug #16143]
@nobu nobu force-pushed the nobu:bug/16143-bom-after-rewind branch from 8f07985 to b4d53b3 Oct 12, 2019
@nobu nobu force-pushed the nobu:bug/16143-bom-after-rewind branch from b4d53b3 to bb4ab1f Oct 12, 2019
@kou
kou approved these changes Oct 12, 2019
enc = [f.external_encoding, f.internal_encoding].compact.join(":")
f.write("\uFEFF" "abc")
f.close
File.open(f.path, 'r:bom|utf-8:utf-8') do |f|

This comment has been minimized.

Copy link
@kou

kou Oct 12, 2019

Member

f.path -> path is better?

end
pos = File.size(path)
File.open(path, "a", encoding: enc) {|f| f.write("\uFEFF" "efg")}
File.open(f.path, 'r:bom|utf-8:utf-8') do |f|

This comment has been minimized.

Copy link
@kou

kou Oct 12, 2019

Member

ditto

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
2 participants
You can’t perform that action at this time.