fmt: issues related to EUC-KR encoding string scanf #24353

Closed
newro opened this issue Mar 12, 2018 · 3 comments
@newro newro commented Mar 12, 2018

Please answer these questions before submitting your issue. Thanks!

What version of Go are you using (go version)?

go version go1.9.1 linux/amd64

Does this issue reproduce with the latest release?

I didn't check it.

What operating system and processor architecture are you using (go env)?

GOARCH="amd64"
GOBIN=""
GOEXE=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOOS="linux"
GOPATH="/home/newro/app/go"
GORACE=""
GOROOT="/home/newro/go1.9.1"
GOTOOLDIR="/home/newro/go1.9.1/pkg/tool/linux_amd64"
GCCGO="gccgo"
CC="gcc"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build985492979=/tmp/go-build -gno-record-gcc-switches"
CXX="g++"
CGO_ENABLED="1"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"

What did you do?

Sample code : https://play.golang.org/p/-GtWnBxeDAB

fmt.Sscanf does not seem to work properly with EUC-KR encoded strings.
I am working in an EUC-KR environment, but I have not been able to test multi-byte strings more broadly.

What did you expect to see?

When fmt.Sscanf scans an EUC-KR string, word and word2 should be assigned the same EUC-KR strings that appear in the input.

What did you see instead?

Sscanf assigns strings in an unknown encoding to the string variables.

@robpike robpike commented Mar 12, 2018

Scanf, like all the string routines in the standard library, assumes UTF-8. It is working as intended, unfortunate though that may be for you. You will need help from another package to succeed here.

@mpvl

@mattn mattn commented Mar 12, 2018

@newro As rob mentioned, the reader should read contents as utf-8. Please try below.

// korean is golang.org/x/text/encoding/korean
r := korean.EUCKR.NewDecoder().Reader(strings.NewReader(str))
fmt.Fscanf(r, "%d %d %s %s", &sp, &ep, &word, &word2)
@ALTree ALTree changed the title from "Issues related to EUC-KR encoding string scanf" to "fmt: issues related to EUC-KR encoding string scanf" Mar 12, 2018
@andybons andybons added this to the Unplanned milestone Mar 26, 2018

@gopherbot gopherbot commented Apr 13, 2018

Timed out in state WaitingForInfo. Closing.

(I am just a bot, though. Please speak up if this is a mistake or you have the requested information.)

@gopherbot gopherbot closed this Apr 13, 2018
@golang golang locked and limited conversation to collaborators Apr 13, 2019