Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[INFRA] PDF version of spec: fix handling of internal links #915

Merged
merged 23 commits into from Nov 15, 2021

Conversation

sappelhoff
Copy link
Member

@sappelhoff sappelhoff commented Oct 28, 2021

fixes #852
replaces #855 (thereby closes #855)
related: #596 (but this PR does not really work towards that yet)

I think this is a nice bunch of fixes / enhancements. The one thing that would "wrap this up" would be support for removing "reference style" internal links from the PDF build. (i.e., links like [I am text][some-ref], where "some-ref" is declared at the bottom of the doc). But I think that's a rabbit hole, so I wanted to push this PR as long as it's reasonably clean.

fixes / enhancements so far

  • one typo in the spec
  • better docs about pdf building dependencies
  • regexp that removes internal links
    • better docs
    • roadmap of what's needed (TODO, maybe another PR, don't know)
    • simplify regexp, also make it catch more cases (probably solving the LGTM error, cc @DimitriPapadopoulos )
  • better logging during pdf table fixing when errors are encountered
  • fix table detection: previously, filename templates like [_part-<mag|phase|real|imag>] would be detected as table (due to numerous pipe chars)
  • fix weird markdown syntax in pre GH changelog MD doc
  • PDF building job now runs independently of linkchecker (+artifacts-redirector updated)

to do

  • fix for reference style links

better/alternative solution

nice to have

  • a "check" going through all "reference style links" asserting that no single ref has been "overloaded" with two links across all documents
  • CI step that checks that links are not defined over multiple lines, like below:
this is an example of a link [that goes to
google](https://google.com)

That should be:

this is an example of a link
[that goes to google](https://google.com)

To make it easier to parse links

Comment on lines -204 to -212
# Ensure that build_docs_pdf always runs last, so that we can use the CircleCI API link for the "latest" artifact
# https://circleci.com/api/v1.1/project/github/bids-standard/bids-specification/latest/artifacts/0/bids-spec.pdf?branch=master
- build_docs_pdf:
requires:
- build_docs
- linkchecker
- github-changelog-generator
- remark
- Changelog-bot
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is no longer needed (and hasn't been needed for a long time actually), because we use https://github.com/larsoner/circleci-artifacts-redirector-action

It's good we finally change this, because now we can see whether PDF build passes when linkchecker doesn't ... and linkchecker is arguably the test that fails most often, which often prevented us from seeing the PDF check

Comment on lines +98 to +99
Minor revisions may be made using GitHub's
[suggestion feature](https://help.github.com/en/articles/incorporating-feedback-in-your-pull-request).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reformatted several such "multiline links" to be on a single line. this is a style choice that makes it easier to parse the links ... and it's also nice on the eye (the latter just IMHO). I added a CI check for this, so this style can now be enforced.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the advantage to this was to avoid a line break in GitHub's rendering of the document. It doesn't matter much to me, though.

msg = "\n\nFound duplicate link references. Please make them unique.\n"
for log_dict in duplicate_link_refs:
msg += "\n" + json.dumps(log_dict, indent=4)
raise RuntimeError(msg)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strictly speaking, having duplicate source links across different MKDocs .md documents is not an issue, because for HTML, these are rendered as separate pages.

It does become a problem for the PDF build though, where we first concatenate all files, so duplication results in partially wrong links.

I think it's reasonable to ask people to take a unique reference for a link reference. This short script will enforce this in CI, making use of the pandoc log, which detects such violations.

Comment on lines +12 to +13
- Added support for describing events with Hierarchical Event Descriptors: [4.3 Task events](04-modality-specific-files/05-task-events.md).
- Added `VolumeTiming` and `AcquisitionDuration` fields: [4.1 Task (including resting state) imaging data](04-modality-specific-files/01-magnetic-resonance-imaging-data.md#task-including-resting-state-imaging-data).
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This style change from [[.....]] to : [...] is also for convenience of link parsing.

Copy link
Member Author

@sappelhoff sappelhoff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left a couple of comments to make it easier for reviewers to see what I meant with some changes.

This PR is ready to be merged from my side.

@sappelhoff sappelhoff added the formatting Aesthetics and formatting of the spec label Oct 30, 2021
@sappelhoff sappelhoff added this to the 1.7.0 milestone Nov 9, 2021
Copy link
Member

@tsalo tsalo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't speak to the code, but the PDF artifact looks good to me, as do the markdown changes.

Copy link
Collaborator

@Remi-Gau Remi-Gau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me. Also looks like a lot of work. Thanks @sappelhoff

@sappelhoff sappelhoff merged commit 41704e8 into bids-standard:master Nov 15, 2021
@sappelhoff sappelhoff deleted the fix/pdf/internal_links branch November 15, 2021 16:03
@effigies effigies added the exclude-from-changelog This item will not feature in the automatically generated changelog label Jul 27, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
exclude-from-changelog This item will not feature in the automatically generated changelog formatting Aesthetics and formatting of the spec
Projects
None yet
Development

Successfully merging this pull request may close these issues.

"LGTM" tool warns about regexp for "remove_internal_links" function (for PDF build)
4 participants