Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handled UTF-8 BOM #1285

Merged
merged 3 commits into from
Nov 10, 2020
Merged

Handled UTF-8 BOM #1285

merged 3 commits into from
Nov 10, 2020

Conversation

FrankHB
Copy link
Contributor

@FrankHB FrankHB commented Jul 25, 2020

Fixed #1155 .

FrankHB added a commit to FrankHB/YSLib-book that referenced this pull request Jul 25, 2020
使用 YSLib-wiki.hg 版本 b1d91974e33d ,符号链接到 src 。
依赖 mdBook 0.4.1 修改版,参见 rust-lang/mdBook#1285 。

Signed-off-by: FrankHB <frankhb1989@gmail.com>
FrankHB added a commit to FrankHB/YSLib-book that referenced this pull request Jul 25, 2020
使用 YSLib-wiki.hg 版本 b1d91974e33d ,符号链接到 src 。
依赖 mdBook 0.4.1 修改版,参见 rust-lang/mdBook#1285 。

Signed-off-by: FrankHB <frankhb1989@gmail.com>
Copy link
Contributor

@ehuss ehuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Can you also add a test for this?

src/book/book.rs Outdated
@@ -264,6 +264,10 @@ fn load_chapter<P: AsRef<Path>>(
format!("Unable to read \"{}\" ({})", link.name, location.display())
})?;

if content.as_bytes().starts_with(b"\xef\xbb\xbf") {
content = content[3..].to_string()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to avoid the extra allocation, perhaps this could be content.replace_range(..3, "");

@ehuss ehuss added the S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. label Sep 6, 2020
Signed-off-by: FrankHB <frankhb1989@gmail.com>
@FrankHB
Copy link
Contributor Author

FrankHB commented Sep 29, 2020

Not sure where should be the test. Is an additional BOM in tests/dummy_book/src/first/unicode.md enough?

@ehuss
Copy link
Contributor

ehuss commented Sep 30, 2020

I'm a little concerned about having invisible bytes in a file. unicode.md is already quite subtle. But I think if you have an editor that can't handle a BOM properly, it will have a risk of corrupting the existing unicode, so it's probably fine.

My only other idea is to create a simple 1-chapter book in a temp directory like book_with_a_reserved_filename_does_not_build does. Whichever seems easiest!

@FrankHB
Copy link
Contributor Author

FrankHB commented Oct 2, 2020

Sorry, I don't get the point of the temp directory.

As of the subtleness... I largely agree. Nevertheless, I can use a hex editor to make sure only the BOM is inserted exactly. Anyway, the visible part of the file has already warned us about this. (Or one more visible warning about invisible bytes?)

@ehuss
Copy link
Contributor

ehuss commented Oct 5, 2020

Creating a book in Rust code (and writing it to a temp directory) would allow writing the content in a Rust string where the characters can be escaped and explicit. For example, fs::write(chapter, "\u{feff}# Example chapter").

Signed-off-by: FrankHB <frankhb1989@gmail.com>
@ehuss ehuss removed the S-waiting-on-author Status: The marked PR is awaiting some action (such as code changes) from the PR author. label Nov 10, 2020
Copy link
Contributor

@ehuss ehuss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@ehuss ehuss merged commit 643d5ec into rust-lang:master Nov 10, 2020
FrankHB added a commit to FrankHB/YSLib-book that referenced this pull request Jan 22, 2024
使用 YSLib-wiki.hg 版本 e1f073d604e6 ,符号链接到 src 。
依赖 mdBook 0.4.1 修改版,参见 rust-lang/mdBook#1285 。

Signed-off-by: FrankHB <frankhb1989@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing support of byte order mark
2 participants