Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(toml): parse() duplicates the character next to reserved escape sequences #4192

Merged
merged 9 commits into from
Jan 16, 2024
Merged

fix(toml): parse() duplicates the character next to reserved escape sequences #4192

merged 9 commits into from
Jan 16, 2024

Conversation

babiabeo
Copy link
Contributor

Fix: #4187

@babiabeo babiabeo requested a review from kt3k as a code owner January 14, 2024 10:43
@github-actions github-actions bot added the toml label Jan 14, 2024
@kt3k
Copy link
Member

kt3k commented Jan 15, 2024

There are 2 cases that I expected:

  • Ignores the reserved escape sequence:
parse('a = "a\\0b"'); // { a: "ab" }
  • Throws an error as specified in the spec:

@babiabeo
Do you think ignoring is more useful than throwing?

@iuioiua
Copy link
Collaborator

iuioiua commented Jan 15, 2024

I think we should stick to the spec and throw.

@babiabeo
Copy link
Contributor Author

babiabeo commented Jan 15, 2024

There are 2 cases that I expected:

  • Ignores the reserved escape sequence:
parse('a = "a\\0b"'); // { a: "ab" }
  • Throws an error as specified in the spec:

@babiabeo Do you think ignoring is more useful than throwing?

Currently, this PR solves the issue by ignoring reserved escape sequences.

However, I saw that most TOML parser packages throw an error in this case

import { load } from "npm:js-toml";

console.log(load('a = "a\\0b"'));

// error: Uncaught (in promise) SyntaxParseError: Syntax error
// unexpected character: ->"<- at offset: 4, skipped 3 characters.
// unexpected character: ->b<- at offset: 8, skipped 2 characters.
import tomllib

print(tomllib.loads('a = "a\\0b"'))

# File "D:\coding\programming\languages\python\Lib\tomllib\_parser.py", line 494, in parse_basic_str_escape
#     raise suffixed_err(src, pos, "Unescaped '\\' in a string") from None
# tomllib.TOMLDecodeError: Unescaped '\' in a string (at line 1, column 9)
package main

import (
	"fmt"

	"github.com/BurntSushi/toml"
)

type Data struct {
    A string `toml:"a"`
}

func main() {
    var data Data
    _, err := toml.Decode("a = \"a\\0b\"", &data)

    fmt.Println(err)

    // toml: line 1 (last key "a"): invalid escape in string '\0'
}

Copy link
Collaborator

@iuioiua iuioiua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it best if we instead threw instead of ignoring, as to follow the spec. Are you able to rework this PR to do that?

Copy link
Collaborator

@iuioiua iuioiua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@babiabeo, I've made some tweaks to improve the thrown error message and accuracy of the tests. Do these changes seem reasonable to you?

@babiabeo
Copy link
Contributor Author

@babiabeo, I've made some tweaks to improve the thrown error message and accuracy of the tests. Do these changes seem reasonable to you?

Yes. I forgot to add \\. Thanks for adding it!

@iuioiua iuioiua changed the title fix(toml): parse duplicates the character next to reserved escape sequences fix(toml): parse() duplicates the character next to reserved escape sequences Jan 16, 2024
Copy link
Collaborator

@iuioiua iuioiua left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you, David.

@iuioiua iuioiua merged commit 17cabfa into denoland:main Jan 16, 2024
12 checks passed
@babiabeo babiabeo deleted the fix-toml-parse branch January 16, 2024 01:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug(toml): parse duplicates the character next to reserved escape sequences
3 participants