[epic] MarkdownDB plugin system #2

rufuspollock · 2023-04-28T11:23:39Z

We want a plugin system in MarkdownDB so that people can easily extend the core functionality, for example to extract additional metadata. That way not all functionality has to live in core, and people can add functionality rapidly.

Sketch (April 2023)

https://link.excalidraw.com/l/9u8crB2ZmUo/9hkrQmVl9QX

Acceptance

Identify the different types of plugins ✅2023-11-19 roughly: parsing, computing, validating (and maybe serializing ...)
Research how remark works to see if we can reuse it 🚧2023-11-19 see notes in comment below
Design of MarkdownDB and especially the plugin system.
- extract first heading as title metadata
- add a metadata field
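
To make the two design targets above concrete, here is a minimal sketch of what "extract first heading as title metadata" could look like. It runs on a hand-written tree in the shape remark produces (mdast), so it needs no dependencies. The names `toText` and `extractTitle`, and the idea that a plugin takes and returns a metadata object, are assumptions for illustration and not an existing MarkdownDB API.

```javascript
// Hypothetical helper: concatenate the text of a node and its descendants.
function toText(node) {
  if (typeof node.value === 'string') return node.value;
  return (node.children || []).map(toText).join('');
}

// Hypothetical "computing" plugin: use the first level-1 heading as the title,
// unless the metadata (e.g. from frontmatter) already has one.
function extractTitle(tree, metadata) {
  if (metadata.title) return metadata;
  const heading = tree.children.find(
    (node) => node.type === 'heading' && node.depth === 1
  );
  return heading ? { ...metadata, title: toText(heading) } : metadata;
}

// Hand-written mdast-like tree for:
//   # Hello *world*
//   Some text.
const titleTree = {
  type: 'root',
  children: [
    {
      type: 'heading',
      depth: 1,
      children: [
        { type: 'text', value: 'Hello ' },
        { type: 'emphasis', children: [{ type: 'text', value: 'world' }] },
      ],
    },
    { type: 'paragraph', children: [{ type: 'text', value: 'Some text.' }] },
  ],
};

console.log(extractTitle(titleTree, { tags: ['demo'] }));
// logs { tags: [ 'demo' ], title: 'Hello world' }
```

In this sketch, adding a metadata field is just returning a new object with the extra key, which keeps plugins easy to chain.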

Notes

MarkdownDB vs Contentlayer

Contentlayer supported:

document types with
- frontmatter schema definition and validation
- assigning document types based on glob patterns
- computed fields, e.g. description auto-extracted from the document content
excluding/including some content folders (we kinda already have this, but it's not configurable)
...

What we need:

probably a config file similar to Contentlayer's, with:
- custom document types,
- content include/exclude option
- plugins
- ...
...
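
A rough sketch of what such a config could look like, purely to make the list above tangible. Every key and function name here (`include`, `exclude`, `documentTypes`, `pathPrefix`, `plugins`, `isIncluded`, `resolveType`, `validate`) is hypothetical. To stay dependency-free it matches paths by prefix instead of glob patterns, which a real implementation would probably want.

```javascript
// Hypothetical config shape; none of these keys exist in MarkdownDB today.
const config = {
  include: ['content/'],
  exclude: ['content/drafts/'],
  documentTypes: [
    {
      name: 'Post',
      pathPrefix: 'content/blog/', // simplification: prefix, not a glob
      fields: {
        title: { type: 'string', required: true },
        draft: { type: 'boolean', required: false },
      },
    },
  ],
  plugins: [],
};

// A file is indexed if it matches an include prefix and no exclude prefix.
function isIncluded(cfg, filePath) {
  const matches = (prefix) => filePath.startsWith(prefix);
  return cfg.include.some(matches) && !cfg.exclude.some(matches);
}

// Assign a document type based on where the file lives.
function resolveType(cfg, filePath) {
  return cfg.documentTypes.find((t) => filePath.startsWith(t.pathPrefix)) || null;
}

// Check frontmatter against the field definitions of a document type.
function validate(docType, frontmatter) {
  const errors = [];
  for (const [name, spec] of Object.entries(docType.fields)) {
    const value = frontmatter[name];
    if (value === undefined) {
      if (spec.required) errors.push(`missing required field "${name}"`);
    } else if (typeof value !== spec.type) {
      errors.push(`field "${name}" should be of type ${spec.type}`);
    }
  }
  return errors;
}

const postType = resolveType(config, 'content/blog/hello.md');
console.log(isIncluded(config, 'content/drafts/wip.md')); // false
console.log(validate(postType, { draft: 'yes' }));
// logs [ 'missing required field "title"', 'field "draft" should be of type boolean' ]
```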


rufuspollock · 2023-11-19T15:59:45Z

Doing a bunch of research on remark and micromark re the parsing part of this - could remark be our plugin system here? (probably)

Should we just build on top of the remark ecosystem i.e. use remark plugins for doing the parsing? ✅2023-11-19 my sense is yes
- Should we use remark plugins or micromark (what's the difference even?). 🚧2023-11-19 still confused on this one (as others are) but my sense is we just use remark and its plugins
How do you create a plugin? 🚧2023-11-19 see https://github.com/remarkjs/remark/blob/main/doc/plugins.md and its guide
- How do you pass data around? see notes below (no answer yet!) 🚧2023-11-19 there is something called messages ...
What remark plugins could we learn from?
- For tasklists: https://github.com/micromark/micromark-extension-gfm-task-list-item
- How would we extract tags?
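
One possible answer to the tags question, as a sketch only: walk the tree and collect `#tag` tokens from text nodes. The tag syntax (a `#` followed by a letter, then word characters, `/` or `-`) and the names `walk` and `extractTags` are assumptions. `walk` is a tiny stand-in for unist-util-visit so the example runs without dependencies.

```javascript
// Minimal depth-first walk, standing in for unist-util-visit.
function walk(node, visitor) {
  visitor(node);
  (node.children || []).forEach((child) => walk(child, visitor));
}

// Hypothetical "parsing" plugin: collect unique #tags from text nodes only,
// so code blocks and inline code are ignored.
function extractTags(tree) {
  const tags = new Set();
  walk(tree, (node) => {
    if (node.type !== 'text') return;
    for (const match of node.value.matchAll(/(?:^|\s)#([A-Za-z][\w/-]*)/g)) {
      tags.add(match[1]);
    }
  });
  return [...tags];
}

// Hand-written mdast-like tree with two paragraphs and a code block.
const tagTree = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: 'Notes on #remark and #markdown-db, see issue #2.' },
      ],
    },
    { type: 'code', value: '#not-a-tag' },
    { type: 'paragraph', children: [{ type: 'text', value: 'More on #remark.' }] },
  ],
};

console.log(extractTags(tagTree)); // logs [ 'remark', 'markdown-db' ]
```

Requiring a letter after `#` keeps issue references like `#2` out of the result. Whether that is the right rule is an open design choice.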

Can you pass "data" along the chain of a plugin

The example in remarkjs/remark#251 computes word counts, but it only logs the result with console.log ...

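// Uses require(), so this assumes CommonJS releases of these packages.
var unified = require('unified');
var parse = require('remark-parse');
var stringify = require('remark-stringify');
var english = require('retext-english');
var remark2retext = require('remark-retext');
var visit = require('unist-util-visit');

// Parse markdown, run the English (retext) pipeline with the count plugin
// on the result, then serialize back to markdown.
unified()
  .use(parse)
  .use(remark2retext, unified().use(english).use(count))
  .use(stringify)
  .processSync('*This* and _that_. \n> And some more stuff.\n\nAnd another thing.');

// Plugin: returns a transformer that counts nodes by type.
function count() {
  return counter;
  function counter(tree) {
    var counts = {};
    visit(tree, visitor);
    // The counts are only logged here, not passed on to later plugins.
    console.log(counts);
    function visitor(node) {
      counts[node.type] = (counts[node.type] || 0) + 1;
    }
  }
}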

{ RootNode: 1,
  ParagraphNode: 3,
  SentenceNode: 3,
  WordNode: 10,
  TextNode: 10,
  WhiteSpaceNode: 10,
  PunctuationNode: 3 }
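
A possible way to pass the data along instead of logging it: as far as I understand, a unified transformer receives the file (a vfile) as its second argument, and the file's `data` object is meant for custom information. The sketch below imitates that with a plain `{ data: {} }` object and a hypothetical `runPlugins` helper, so that it runs without unified installed. `countNodes`, `summarize` and `walkTree` are illustrative names.

```javascript
// Minimal depth-first walk, standing in for unist-util-visit.
function walkTree(node, visitor) {
  visitor(node);
  (node.children || []).forEach((child) => walkTree(child, visitor));
}

// Plugin in the unified style: a function returning a transformer (tree, file).
function countNodes() {
  return function transformer(tree, file) {
    const counts = {};
    walkTree(tree, (node) => {
      counts[node.type] = (counts[node.type] || 0) + 1;
    });
    file.data.counts = counts; // store the result instead of console.log
  };
}

// A later plugin can read what an earlier one stored.
function summarize() {
  return function transformer(tree, file) {
    const counts = file.data.counts || {};
    file.data.totalNodes = Object.values(counts).reduce((a, b) => a + b, 0);
  };
}

// Hypothetical stand-in for unified's runner: one shared file for all plugins.
function runPlugins(tree, plugins) {
  const file = { data: {} };
  for (const plugin of plugins) plugin()(tree, file);
  return file;
}

const sampleTree = {
  type: 'root',
  children: [
    { type: 'paragraph', children: [{ type: 'text', value: 'Hi' }] },
    { type: 'paragraph', children: [{ type: 'text', value: 'there' }] },
  ],
};

const processed = runPlugins(sampleTree, [countNodes, summarize]);
console.log(processed.data);
// logs { counts: { root: 1, paragraph: 2, text: 2 }, totalNodes: 5 }
```

With the real libraries, the equivalent would presumably be to read `file.data` from the file returned by the processor, at which point MarkdownDB could store it alongside the other metadata.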

mohamedsalem401 · 2023-11-20T18:24:51Z

The immediate question that arises is how the output of running plugins can be stored. Let's consider a straightforward example using a simple plugin available at https://github.com/florianeckerstorfer/remark-a11y-emoji. This plugin wraps emojis in a <span> tag and sets the emoji name as the aria-label.

Assuming we successfully run the markdown files through such plugins, the next query is where the newly generated markdown should be stored. Currently, the library only generates SQL databases from metadata, lacking a method to load the content of a file.

Possible solutions include:

1. Add content to the database/JSON: store each file's body content in the generated database or local JSON files. This approach consolidates the parsed content along with the metadata.
2. Generate separate markdown files: create a designated folder, say .markdown, and write the markdown files there after parsing. This process involves removing metadata from the files.
3. Introduce a loading method: implement a method like loadFile(file_path) to retrieve the content of a given file after running the plugins. However, a drawback of this approach is that it does not help users who generate the database/JSON files using the library but employ another tool to load the markdown file content.

rufuspollock · 2023-11-21T08:37:17Z

@mohamedsalem401 we aren't using plugins to transform markdown at all - we are using plugins to extract information from the markdown and then store that somewhere ...

See the section of my last comment titled "Can you pass "data" along the chain of a plugin" ... because we just want to pass data along the chain. Or see the example above, which computes word counts etc.

To repeat: we are not using remark plugins to transform the content but rather to extract information from it ...

olayway added the enhancement New feature or request label May 9, 2023

rufuspollock added the epic label Sep 24, 2023

rufuspollock changed the title ~~MarkdownDB plugin system~~ [epic] MarkdownDB plugin system Sep 24, 2023

rufuspollock mentioned this issue Nov 20, 2023

[epic] MarkdownDB Index and Library v1 #3

