Enable unique slugs across multiple files #20

about-code · 2021-08-21T18:32:43Z

Initial checklist

I read the support docs
I read the contributing guide
I agree to follow the code of conduct
I searched issues and couldn’t find anything (or linked relevant results below)

Problem

Given a heading that appears multiple times within a set of markdown files processed by unified-engine and remark-slug, resetting the slugger within the transformer produces slugs that are unique only on a per-file basis.

As a remark-slug and unified-engine user I would like to be able to postprocess markdown files with pandoc and concatenate them into a single output file. In such a scenario I need slugs that are unique across the whole set of files and multiple syntax trees to maintain uniqueness after concatenation.

Solution

I propose an option multifile, which is false by default for backwards compatibility but, when true, prevents the slugger from being reset for each file:

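// Module-level slugger that the snippet relies on.
import Slugger from 'github-slugger'

const slugs = new Slugger()

export default function remarkSlug(opts) {
  // `multifile` defaults to false for backwards compatibility.
  const {multifile} = {multifile: false, ...opts}

  slugs.reset() // Reset once, when the plugin is attached.
  return (tree) => {
    // Keep the per-file reset unless `multifile` is enabled.
    if (!multifile) {
      slugs.reset()
    }
    // [...]
  }
}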
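
To illustrate the intended effect of the option, here is a minimal sketch. It assumes a simplified stand-in slugger (ToySlugger, not the real github-slugger algorithm) and a hypothetical createSlugPlugin whose transformer takes an array of heading strings instead of a syntax tree.

```javascript
// Simplified stand-in for a slugger; NOT the real github-slugger algorithm.
class ToySlugger {
  constructor() {
    this.seen = new Map()
  }

  slug(text) {
    const base = text.toLowerCase().trim().replace(/[^a-z0-9 -]/g, '').replace(/ /g, '-')
    const count = this.seen.get(base) || 0
    this.seen.set(base, count + 1)
    return count === 0 ? base : `${base}-${count}`
  }

  reset() {
    this.seen.clear()
  }
}

// Hypothetical plugin factory mirroring the proposal above.
function createSlugPlugin(opts) {
  const {multifile} = {multifile: false, ...opts}
  const slugs = new ToySlugger()
  // The "tree" is reduced to a list of heading strings for illustration.
  return (headings) => {
    if (!multifile) {
      slugs.reset()
    }
    return headings.map((heading) => slugs.slug(heading))
  }
}

const fileA = ['Introduction', 'Usage']
const fileB = ['Introduction', 'API']

// Default: state is reset per file, so both files get `introduction`.
const perFile = createSlugPlugin()
const perFileResult = [perFile(fileA), perFile(fileB)]

// multifile: state is shared, so the second file gets `introduction-1`.
const shared = createSlugPlugin({multifile: true})
const sharedResult = [shared(fileA), shared(fileB)]

console.log(perFileResult, sharedResult)
```

With the default, concatenating the two outputs would yield two headings with the same id; with multifile enabled, the ids stay distinct across the file set.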

Open questions

Does the name multifile make sense? Other ideas: reset, autoreset, resetSlugs

Alternatives

Fork remark-slug and publish a forked package
Implement it privately and drop remark-slug in favour of a custom function

Given the widespread use of remark and pandoc a contribution to remark-slug may be the best alternative.

about-code · 2021-08-22T11:42:12Z

A similar candidate for such a feature would be remark-reference-links (Permalink) which starts counting reference ids in a similar fashion on a per tree/per file basis.

wooorm · 2021-08-22T11:45:03Z

What is the reason you don't first concatenate the markdown?

about-code · 2021-08-22T16:47:56Z

> What is the reason you don't first concatenate the markdown?

Fair and reasonable question.

In my particular case upfront concatenation would wipe semantic differences in the set of input files. For glossarify-md the file set is subdivided into glossary files and document files (much like it used to be in GitBook). Headings in glossary files are considered terms. From a conceptual perspective concatenating the set of input files into a single input file beforehand would make the resulting file a document and a glossary file at the same time.

glossarify-md would be able to handle such glossary-document-duality. But it would make every heading phrase in that file subject to "term-ination" and auto-linking which might not be what its user wants and thus should not be enforced upon users whose goal is to linkify glossary terms, only.

Moreover, the actual concatenation is a bit out of scope. I just want to account for the fact that the tool's intended audience may wish to carry out additional postprocessing on the tool's outputs, e.g. when publishing MD books in single-file formats such as PDF or single-HTML docs. What can be considered within the tool's scope, though, is to prepare output files in a way which maintains link stability when they are concatenated. I could achieve this with remark-slug (and remark-reference-links) being able to create identifiers unique across the file set.

For the sake of transparency, I do not want to hide that, having tested my patch locally, it doesn't get me the whole way yet. There are still some rough edges to be tackled with pandoc as the concatenator and postprocessor. But apart from that, "fileset uniqueness" of heading IDs remains a step in that direction. So if we focus on the question whether such a property and option would be a nice addition, particularly when using the plug-in with unified-engine, then I'd be willing to complete the drafted PR with tests and provide a similar one to remark-reference-links, too.

wooorm · 2021-08-23T09:52:00Z

The main problem I see is that unified pipelines handle one file at a time, and if you pass that file through twice, users expect the same result. Or, if files are passed through twice in different orders (e.g., because of async), the same output would be expected as well.
unified-engine, through unified-args (on the CLI), also supports a watch mode, where it keeps running and passes a changed file through on its own when it’s edited.
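
The order dependence can be sketched as follows. This assumes a simplified stand-in slugger (createToySlugger, not github-slugger itself) and a hypothetical slugifyFiles helper that shares one slugger across all files without resetting it.

```javascript
// Simplified stand-in slugger; NOT the real github-slugger algorithm.
function createToySlugger() {
  const seen = new Map()
  return (text) => {
    const base = text.toLowerCase().trim().replace(/[^a-z0-9 -]/g, '').replace(/ /g, '-')
    const count = seen.get(base) || 0
    seen.set(base, count + 1)
    return count === 0 ? base : `${base}-${count}`
  }
}

// Hypothetical helper: one shared slugger for the whole file set, never reset.
function slugifyFiles(files) {
  const slug = createToySlugger()
  const result = {}
  for (const [name, headings] of files) {
    result[name] = headings.map((heading) => slug(heading))
  }
  return result
}

const a = ['a.md', ['Setup']]
const b = ['b.md', ['Setup']]

// The same two files, processed in different orders.
const forward = slugifyFiles([a, b])
const backward = slugifyFiles([b, a])

console.log(forward['a.md'], backward['a.md'])
```

In this sketch a.md gets `setup` when it is processed first and `setup-1` when it is processed second, so links into the file are only stable if the processing order is.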

So I don’t really see an option like this solving your original issue. Or, perhaps it works for you, but it would have unexpected consequences for other users.

I also have some experience with a similar issue: one big markdown file that’s split up into multiple HTML files (for epubs, which often do that to improve rendering speed).
Like you note: remark-slug / rehype-slug don’t completely get you all the way to what’s needed. So perhaps just better to write a custom utility/plugin that handles slugs/links exactly how you’d want?
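
One possible shape for such a custom utility, as a sketch only: derive each slug from the file name plus the heading, so the result does not depend on shared state or processing order. The helper names (basicSlug, slugsForFile) and the `--` separator are assumptions made for illustration, not part of remark-slug or rehype-slug.

```javascript
// Simplified slug function; NOT the real github-slugger algorithm.
function basicSlug(text) {
  return text.toLowerCase().trim().replace(/[^a-z0-9 -]/g, '').replace(/ /g, '-')
}

// Hypothetical helper: prefix every slug with the file's base name and
// count duplicates per file only, so no state is shared between files.
function slugsForFile(filePath, headings) {
  const stem = filePath.split('/').pop().replace(/\.[^.]+$/, '')
  const prefix = basicSlug(stem)
  const seen = new Map()
  return headings.map((heading) => {
    const base = `${prefix}--${basicSlug(heading)}`
    const count = seen.get(base) || 0
    seen.set(base, count + 1)
    return count === 0 ? base : `${base}-${count}`
  })
}

const glossarySlugs = slugsForFile('docs/glossary.md', ['Term', 'Term'])
const guideSlugs = slugsForFile('docs/guide.md', ['Term'])

console.log(glossarySlugs, guideSlugs)
```

The trade-off is longer ids, and files with the same base name in different directories would still collide unless more of the path is included in the prefix.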

about-code · 2021-08-23T17:18:32Z

Okay, I see. In watch mode we would likely see sequentially increasing numbers, because each run adds to the state of the previous one, and results would vary depending on how often a file was changed. Well, a bit of a pity, but convincing. Time for going on with Sinatra 🎼 and doing it my way.
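
The watch-mode effect can be sketched like this, assuming a simplified stand-in slugger (not github-slugger itself) and a hypothetical transform that never resets its state between runs:

```javascript
// Simplified stand-in slugger; NOT the real github-slugger algorithm.
function createToySlugger() {
  const seen = new Map()
  return (text) => {
    const base = text.toLowerCase().trim().replace(/[^a-z0-9 -]/g, '').replace(/ /g, '-')
    const count = seen.get(base) || 0
    seen.set(base, count + 1)
    return count === 0 ? base : `${base}-${count}`
  }
}

const slug = createToySlugger()

// Hypothetical transformer without a per-file reset.
const transform = (headings) => headings.map((heading) => slug(heading))

// The same unchanged file passes through three times, as a watcher might do.
const firstRun = transform(['Glossary'])
const secondRun = transform(['Glossary'])
const thirdRun = transform(['Glossary'])

console.log(firstRun, secondRun, thirdRun)
```

Each pass yields a different slug for the same heading (`glossary`, `glossary-1`, `glossary-2`), which is the varying result described above.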

Had been a pleasure to contribute. Maybe next time.

github-actions · 2021-08-23T18:18:54Z

Hi team! Could you describe why this has been marked as wontfix?

Thanks,
— bb

wooorm · 2021-08-23T18:19:00Z

Would love to have you as a contributor, in the future! All the best!

github-actions bot added the 👋 phase/new and 🤞 phase/open labels and removed the 👋 phase/new label on Aug 21, 2021

about-code mentioned this issue Aug 21, 2021

Option to enable unique slugs across files or trees. #21

Closed

5 tasks

about-code closed this as completed Aug 23, 2021

wooorm added the 👎 phase/no and 🙅 no/wontfix labels and removed the 🤞 phase/open label on Aug 23, 2021
