Enable unique slugs across multiple files #20

Closed
4 tasks done
about-code opened this issue Aug 21, 2021 · 8 comments
Labels
🙅 no/wontfix This is not (enough of) an issue for this project 👎 phase/no Post cannot or will not be acted on

Comments

@about-code

Initial checklist

Problem

Given a headline that appears multiple times within a set of markdown files processed by unified-engine and remark-slug, resetting the slugger within the transformer produces slugs that are unique only on a per-file basis.

As a remark-slug and unified-engine user I would like to be able to postprocess markdown files with pandoc and concatenate them into a single output file. In such a scenario I need slugs that are unique across the whole set of files and multiple syntax trees to maintain uniqueness after concatenation.
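To make the collision concrete, here is a minimal sketch. It does not use remark-slug or github-slugger; `MiniSlugger`, `slugifyAll`, and the file names are hypothetical stand-ins that only mimic the append-a-counter behaviour described above.

```javascript
// Hypothetical, simplified stand-in for a slugger: lowercases, turns
// whitespace into dashes, and appends -1, -2, ... to repeated slugs.
class MiniSlugger {
  constructor() {
    this.seen = new Map()
  }

  reset() {
    this.seen.clear()
  }

  slug(text) {
    const base = text.trim().toLowerCase().replace(/\s+/g, '-')
    let result = base
    while (this.seen.has(result)) {
      const n = this.seen.get(base) + 1
      this.seen.set(base, n)
      result = `${base}-${n}`
    }
    this.seen.set(result, 0)
    return result
  }
}

// Slugify the headings of every file, optionally resetting between files.
function slugifyAll(files, {resetPerFile}) {
  const slugger = new MiniSlugger()
  const out = {}
  for (const [name, headings] of Object.entries(files)) {
    if (resetPerFile) slugger.reset()
    out[name] = headings.map((heading) => slugger.slug(heading))
  }
  return out
}

// Assumed example input: two files that share the heading "Install".
const files = {
  'guide.md': ['Install', 'Usage'],
  'glossary.md': ['Install']
}

const perFile = slugifyAll(files, {resetPerFile: true})
const acrossFiles = slugifyAll(files, {resetPerFile: false})

console.log(perFile['glossary.md']) // [ 'install' ], same as in guide.md
console.log(acrossFiles['glossary.md']) // [ 'install-1' ], unique in the set
```

With a reset per file, both files end up with the anchor `install`, so a link to `#install` becomes ambiguous once the files are concatenated.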

Solution

I propose an option multifile which is false by default for backwards compatibility but when true prevents the slugger from being reset:

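import Slugger from 'github-slugger'

// Module-level slugger, shared by every tree this plugin processes.
const slugs = new Slugger()

export default function remarkSlug(opts) {
  const {multifile} = {multifile: false, ...opts}

  slugs.reset() // Reset once, when the plugin is attached.
  return (tree) => {
    if (!multifile) {
      slugs.reset() // Default behaviour: slugs are unique per file only.
    }
    // [...]
  }
}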

Open questions

  • Does the name multifile make sense? Other ideas: reset, autoreset, resetSlugs

Alternatives

  1. Fork remark-slug and publish a forked package
  2. Implement it privately and drop remark-slug in favour of a custom function

Given the widespread use of remark and pandoc a contribution to remark-slug may be the best alternative.

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Aug 21, 2021
@about-code
Author

A similar candidate for such a feature would be remark-reference-links, which starts counting reference ids in a similar fashion on a per-tree/per-file basis.

@wooorm
Member

wooorm commented Aug 22, 2021

What is the reason you don't first concatenate the markdown?

@about-code
Author

about-code commented Aug 22, 2021

> What is the reason you don't first concatenate the markdown?

Fair and reasonable question.

  1. In my particular case upfront concatenation would wipe semantic differences in the set of input files. For glossarify-md the file set is subdivided into glossary files and document files (much like it used to be in GitBook). Headings in glossary files are considered terms. From a conceptual perspective concatenating the set of input files into a single input file beforehand would make the resulting file a document and a glossary file at the same time.

glossarify-md would be able to handle such glossary-document-duality. But it would make every heading phrase in that file subject to "term-ination" and auto-linking which might not be what its user wants and thus should not be enforced upon users whose goal is to linkify glossary terms, only.

  2. Moreover, the actual concatenation is a bit out of scope. I just want to account for the fact that the tool's intended audience may wish to carry out additional postprocessing on the tool's outputs, e.g. when publishing MD books in single-file formats such as PDF or single-HTML docs. What can be considered within the tool's scope, though, is preparing output files in a way that maintains link stability when they are concatenated. I could achieve this if remark-slug (and remark-reference-links) were able to create identifiers that are unique across the file set.

For the sake of transparency, I do not want to hide that, having tested my patch locally, it doesn't get me the whole way yet. There are still some rough edges to be tackled with pandoc as the concatenator and postprocessor. But apart from that, "fileset uniqueness" of heading IDs remains a step in that direction. So if we focus on the question of whether such a property and option would be a nice addition, particularly when using the plug-in with unified-engine, then I'd be willing to complete the drafted PR with tests and provide a similar one for remark-reference-links, too.

@wooorm
Member

wooorm commented Aug 23, 2021

The main problem I see is that unified pipelines handle one file at a time, and if you pass that file through twice, users expect the same result. Or, if files are passed through twice in different orders (e.g., because of async), the same output could be expected as well.
unified-engine through unified-args (on the CLI) also supports watch mode, where it keeps running and passes a changed file through on its own when it’s edited.

So I don’t really see an option like this solving your original issue. Or, perhaps it works for you, but it would have unexpected consequences for other users.
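The consequences described above can be illustrated with a small sketch. It assumes a slugger that is never reset; `MiniSlugger` and `processInOrder` are hypothetical helpers written for this illustration, not part of unified-engine or remark-slug.

```javascript
// Hypothetical, simplified slugger that appends -1, -2, ... to repeats.
class MiniSlugger {
  constructor() {
    this.seen = new Map()
  }

  slug(text) {
    const base = text.trim().toLowerCase().replace(/\s+/g, '-')
    let result = base
    while (this.seen.has(result)) {
      const n = this.seen.get(base) + 1
      this.seen.set(base, n)
      result = `${base}-${n}`
    }
    this.seen.set(result, 0)
    return result
  }
}

// Process files in the given order with one shared, never-reset slugger.
function processInOrder(order, files, slugger = new MiniSlugger()) {
  const out = {}
  for (const name of order) {
    out[name] = files[name].map((heading) => slugger.slug(heading))
  }
  return out
}

// Assumed example input: two files with the same heading.
const files = {'a.md': ['Intro'], 'b.md': ['Intro']}

// 1. Different processing orders (e.g. because of async) give different slugs.
const run1 = processInOrder(['a.md', 'b.md'], files)
const run2 = processInOrder(['b.md', 'a.md'], files)
console.log(run1['a.md'], run2['a.md']) // [ 'intro' ] [ 'intro-1' ]

// 2. A long-lived process (such as watch mode) re-processing the same file.
const watcher = new MiniSlugger()
const firstPass = processInOrder(['a.md'], files, watcher)
const secondPass = processInOrder(['a.md'], files, watcher)
console.log(firstPass['a.md'], secondPass['a.md']) // [ 'intro' ] [ 'intro-1' ]
```

In both cases the slug for the same heading in the same file depends on what happened earlier, which is the non-determinism the comment warns about.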

I also have some experience with a similar issue: one big markdown file that’s split up into multiple HTML files (for epubs, which often do that to improve rendering speed).
Like you note: remark-slug / rehype-slug don’t get you all the way to what’s needed. So perhaps it’s better to write a custom utility/plugin that handles slugs/links exactly how you’d want?
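One possible shape for such a custom utility, sketched under assumptions the thread does not prescribe: slugs are made unique across files by prefixing them with the file's stem, so the result depends only on the file itself and not on processing order or earlier runs. `fileScopedSlugs`, `slugify`, and the `--` separator are hypothetical choices for this illustration.

```javascript
// Hypothetical, simplified slug function: lowercase, whitespace to dashes.
function slugify(text) {
  return text.trim().toLowerCase().replace(/\s+/g, '-')
}

// Build slugs for one file. State lives only inside this call, so the
// output for a file is the same no matter when or how often it runs.
function fileScopedSlugs(filePath, headings) {
  // 'docs/guide.md' -> 'guide' (assumes file stems are unique in the set).
  const stem = filePath.replace(/^.*\//, '').replace(/\.[^.]+$/, '')
  const prefix = slugify(stem)
  const used = new Set()

  return headings.map((heading) => {
    const base = `${prefix}--${slugify(heading)}`
    let slug = base
    // Number repeats within the same file: base, base-1, base-2, ...
    for (let n = 1; used.has(slug); n++) {
      slug = `${base}-${n}`
    }
    used.add(slug)
    return slug
  })
}

const guide = fileScopedSlugs('docs/guide.md', ['Install', 'Install'])
const glossary = fileScopedSlugs('glossary.md', ['Install'])

console.log(guide) // [ 'guide--install', 'guide--install-1' ]
console.log(glossary) // [ 'glossary--install' ]
```

The trade-off is longer anchors, and the sketch breaks down if two files in different directories share a stem; a real utility would need a rule for that case.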

@about-code
Author

Okay, I see. In watch mode we would likely see sequentially increasing numbers, because each run adds to the state of the previous one, and results would vary depending on how often a file was changed. Well, a bit of a pity, but convincing. Time for going on with Sinatra 🎼 and doing it my way.

It has been a pleasure to contribute. Maybe next time.

@wooorm wooorm added 👎 phase/no Post cannot or will not be acted on 🙅 no/wontfix This is not (enough of) an issue for this project and removed 🤞 phase/open Post is being triaged manually labels Aug 23, 2021
@github-actions

Hi team! Could you describe why this has been marked as wontfix?

Thanks,
— bb

@wooorm
Member

wooorm commented Aug 23, 2021

Would love to have you as a contributor, in the future! All the best!

2 participants