Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing links in print.html #866

Merged
merged 3 commits into from
Jan 18, 2019
Merged

Fixing links in print.html #866

merged 3 commits into from
Jan 18, 2019

Conversation

steveklabnik
Copy link
Member

This is only a failing test, not an implementation. Opening it now for two reasons:

  1. this fails two tests right now. @Michael-F-Bryan or someone else, how do I properly regenerate the search index json fixture? I tried to do it manually and it didn't seem to work.

  2. does this test look reasonable?

working on a patch now cc @cetra3 if you have any advice

@cetra3
Copy link
Contributor

cetra3 commented Jan 14, 2019

The search index is changed due to the addition of an extra chapter, which means you have to regenerate the searchindex_fixture.json file by:

  • Setting GENERATE_FIXTURE to true on line 469
  • Running cargo test
  • Setting GENERATE_FIXTURE back to false

Unfortunately you can't just copy it from a normal book build, as it's not pretty printed.

The current code basically concatenates all the files into one big html file and does it as it's rendering all the files. I.e, it renders a file, outputs it to the location it's at, then appends the html into print.html.

I was going to use another project I did (https://github.com/cetra3/mdcollate) to generate a correct print.md file.

With mdcollate, I was going to:

  • Traverse from SUMMARY.md
  • Generate a single print.md file
  • Use this as print.html

Unfortunately there would need to be a few enhancements to mdcollate to support this. Namely, dealing with windows file systems (backslashes!) & converting it into a library.

@steveklabnik
Copy link
Member Author

which means you have to regenerate the searchindex_fixture.json file by:

Oh wow, I just missed this, thanks :(

The current code basically concatenates all the files into one big html file and does it as it's rendering all the files. I.e, it renders a file, outputs it to the location it's at, then appends the html into print.html.

Yep, that's the root cause here. The strategy I'm attempting to take is to check if links are relative, and then re-base them. We'll see...

@steveklabnik
Copy link
Member Author

Okay! While this hasn't built here, things pass locally, as well as testing in the main Rust repository (cc rust-lang/rust#57495)

Can someone review this?

@ehuss
Copy link
Contributor

ehuss commented Jan 15, 2019

I don't know the mdbook source code well enough to provide a review, but I have kicked the tires on this PR and it seems to work for me.

I tossed together a quick script to convert mdbook 0.1 to 0.2: https://gist.github.com/ehuss/b2a647eb69aa8f57bc4fb397656dfac1 I ran it against the reference and I found some curious issues (unrelated to this PR) that maybe someone can answer:

  • Previously you could have a link like #foo and it would be relative to the same page. Is it possible to do that in mdbook 0.2? Or does it have to include the page name like page.html#foo?
  • The .md to .html translation doesn't seem to work for links with # fragments that have md in the middle. For example, foo.html#phantomdata gets rewritten as foo.html#phant.html. Am I doing something wrong, or is this just a bug?

@steveklabnik
Copy link
Member Author

@ehuss oh that's super cool!

I have to re-read the code, but I think those are both bugs; I'd like to support them, at least. Maybe I can fix those issues quickly and make them a part of this PR as well. I'll investigate later today and report back.

In a regex, `.` means `[0-9a-zA-Z]`, not a period character. This means
that this regex worked, unless a link anchor contained `md` inside of
it. This fixes the regex to match a literal period.

Reported by @ehuss in
#866 (comment)
@steveklabnik
Copy link
Member Author

@ehuss I've pushed a commit to fix the second issue, nice catch!

The first issue, #foo, is hard though: when we make everything one page, we can't guarantee that these are all accurate. Imagine you have two different pages that contain #foo headings and anchors... fixing that problem is hard. Does linkcheck fail on these links, do you know?

@ehuss
Copy link
Contributor

ehuss commented Jan 16, 2019

The print.html page just ends up with a few links that point to the wrong section. The headers get unique anchor names (#foo #foo-1 #foo-2), but the links will all point to #foo. I believe this has always been the case (see #542).

The issue I was specifically referring to is if a fragment-only link can elide the page name or not. It's not a big deal, just a minor convenience. The script I posted inserts the page name, so it's an easy workaround.

@steveklabnik
Copy link
Member Author

I plan on merging this tomorrow and cutting a new release.

@steveklabnik steveklabnik merged commit 25c1ca1 into master Jan 18, 2019
@steveklabnik
Copy link
Member Author

Thanks folks!

@steveklabnik steveklabnik deleted the gh828 branch January 18, 2019 14:54
Ruin0x11 pushed a commit to Ruin0x11/mdBook that referenced this pull request Aug 30, 2020
In a regex, `.` means `[0-9a-zA-Z]`, not a period character. This means
that this regex worked, unless a link anchor contained `md` inside of
it. This fixes the regex to match a literal period.

Reported by @ehuss in
rust-lang#866 (comment)
Ruin0x11 pushed a commit to Ruin0x11/mdBook that referenced this pull request Aug 30, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants