html: html.UnescapeString("(&#9)") does not decode single character number #66058
Comments
Entities need to end with a semicolon.
Entities do not need to end with a semicolon.

package main

import (
	"fmt"
	"html"
)

func main() {
	fmt.Println(html.UnescapeString("(!)"))
}

output:

https://go.dev/play/p/1EawuIJeFka

Edge/Chrome/Firefox allow an entity without a semicolon.
MDN says an entity starts with & and ends with ; and the WHATWG HTML spec agrees.
Yes, the current HTML5 specification states that an entity must end with a semicolon. https://www.w3.org/TR/1999/REC-html401-19991224/charset.html#entities
If html.UnescapeString() follows the HTML5 specification, how do you explain the following code's output?
output:
See previously: #21563
says: <html><body></body></html> and <html><body></body></html> parse to the same document. But the current html.UnescapeString() behaves differently.

package main

import (
	"fmt"
	"html"
)

func main() {
	fmt.Printf("%q\n", html.UnescapeString("()"))
	fmt.Printf("%q\n", html.UnescapeString("()"))
}

output:
If <html><body></body></html> and <html><body></body></html> mean the same document, then html.UnescapeString("()") should equal html.UnescapeString("()").
@seankhliao Please reopen this issue. This is clearly a bug. This code explains everything.

package main

import (
	"fmt"
	"html"
)

func main() {
	fmt.Printf("%q\n", html.UnescapeString("(	)"))
	fmt.Printf("%q\n", html.UnescapeString("(	)"))
}

output:
Fix handling of "&#9" and add tests for other single-digit cases. Fixes golang#66058 Updates golang#21563
Change https://go.dev/cl/569456 mentions this issue:
Go version
go version go1.21.4 linux/amd64
Output of go env in your module/workspace:
What did you do?
The html.UnescapeString() function does not correctly decode a numeric character reference that has a single digit.
https://go.dev/play/p/kCEC5INrCNt
What did you see happen?
output:
I think the "No characters matched." check in the function html.unescapeEntity() is incorrect.
https://cs.opensource.google/go/go/+/refs/tags/go1.22.0:src/html/escape.go;l=107
What did you expect to see?
want:
Edge/Chrome/Firefox display the following HTML as ( ).
https://jsfiddle.net/Lkm6jy3c/