This repository was archived by the owner on Feb 21, 2023. It is now read-only.

A Metalsmith plugin to find related files within collections.


emmercm/metalsmith-collections-related

metalsmith-collections-related

⚠️ This repository has been moved to metalsmith-plugins. ⚠️

A Metalsmith plugin to find related files within collections.

Files are "related" if they share important terms in their contents.

For each file in a collection, Term Frequency-Inverse Document Frequency (TF-IDF) is used to:

  • Find the top natural.maxTerms important terms in the file's contents
  • Find how much weight those terms have in every other file in the collection
  • Keep only the matches whose weight is at least natural.minTfIdf
  • Sort by descending weight (most "related" first)
  • Limit to maxRelated number of matches
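
The steps above can be sketched in plain JavaScript. This is an illustration only, not the plugin's source: the helper names (tokenize, buildIndex, tfidf, findRelated), the simple tokenizer, and the smoothed IDF formula are all assumptions, and the plugin itself may weigh terms differently.

```javascript
// Illustrative sketch only; all names here are hypothetical.
const tokenize = (text) => text.toLowerCase().match(/[a-z0-9]+/g) || [];

// Count terms per document and how many documents contain each term.
function buildIndex(docs) {
  const counts = docs.map((doc) => {
    const tf = new Map();
    for (const term of tokenize(doc.contents)) {
      tf.set(term, (tf.get(term) || 0) + 1);
    }
    return tf;
  });
  const df = new Map();
  for (const tf of counts) {
    for (const term of tf.keys()) {
      df.set(term, (df.get(term) || 0) + 1);
    }
  }
  return { counts, df, size: docs.length };
}

// TF-IDF weight of one term in one document (smoothed IDF, an assumption).
function tfidf(index, term, docIdx) {
  const tf = index.counts[docIdx].get(term) || 0;
  if (tf === 0) return 0;
  const df = index.df.get(term) || 0;
  return tf * (1 + Math.log(index.size / (1 + df)));
}

function findRelated(docs, docIdx, { maxRelated = 3, maxTerms = 10, minTfIdf = 0 } = {}) {
  const index = buildIndex(docs);
  // Step 1: the file's top `maxTerms` terms by weight.
  const topTerms = [...index.counts[docIdx].keys()]
    .map((term) => ({ term, weight: tfidf(index, term, docIdx) }))
    .sort((a, b) => b.weight - a.weight)
    .slice(0, maxTerms)
    .map((entry) => entry.term);
  return docs
    // Step 2: total weight of those terms in every other file.
    .map((doc, idx) => ({
      idx,
      path: doc.path,
      weight: topTerms.reduce((sum, term) => sum + tfidf(index, term, idx), 0),
    }))
    .filter((match) => match.idx !== docIdx)
    // Step 3: keep matches with at least `minTfIdf` weight.
    .filter((match) => match.weight >= minTfIdf)
    // Step 4: most related first.
    .sort((a, b) => b.weight - a.weight)
    // Step 5: limit to `maxRelated` matches.
    .slice(0, maxRelated)
    .map((match) => match.path);
}

// Made-up example collection.
const exampleDocs = [
  { path: 'a.md', contents: 'metalsmith plugin builds static sites' },
  { path: 'b.md', contents: 'metalsmith plugin for static site collections' },
  { path: 'c.md', contents: 'recipe for chocolate cake' },
];
console.log(findRelated(exampleDocs, 0, { maxRelated: 1 })); // → [ 'b.md' ]
```

Note that with the default minTfIdf of 0, a file that shares no terms at all still passes the filter, so raising minTfIdf is what excludes unrelated files when a collection is small.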

Installation

npm install --save metalsmith-collections-related

JavaScript Usage

Collections need to be processed before related files can be found:

const Metalsmith  = require('metalsmith');
const collections = require('metalsmith-collections');
const related     = require('metalsmith-collections-related');

Metalsmith(__dirname)
    .use(collections({
        // options here
    }))
    .use(related({
        // options here
    }))
    .build((err) => {
        if (err) {
            throw err;
        }
    });

File metadata

This plugin adds a metadata field named related to each file in the format:

{
  "contents": "...",
  "path": "...",
  "related": {
    "[collection name]": [
      { "contents": "...", "path": "..." },
      { "contents": "...", "path": "..." }
      // up to the `maxRelated` number of files
    ],
    "[another collection name]": [
      { "contents": "...", "path": "..." },
      { "contents": "...", "path": "..." }
      // up to the `maxRelated` number of files
    ]
    // up to as many collections as the file is in
  }
}

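which can be used with templating engines such as Handlebars. Because related is keyed by collection name, loop over each collection and then over its files:

{{#each related}}
    {{#each this}}
        <a href="{{ path }}">{{ path }}</a>
    {{/each}}
{{/each}}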
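
Outside of a template, the same metadata can be walked in plain JavaScript, for example from another plugin later in the build. The sketch below assumes the file object has the shape shown above; listRelated is a hypothetical helper, not part of this plugin's API.

```javascript
// Hypothetical helper: flattens `file.related` into { collection, path } pairs.
function listRelated(file) {
  const links = [];
  // `related` is keyed by collection name; each value is an array of files.
  for (const [collection, files] of Object.entries(file.related || {})) {
    for (const relatedFile of files) {
      links.push({ collection, path: relatedFile.path });
    }
  }
  return links;
}

// Made-up file object in the documented format.
const exampleFile = {
  contents: '...',
  path: 'posts/one.md',
  related: {
    posts: [{ contents: '...', path: 'posts/two.md' }],
    guides: [{ contents: '...', path: 'guides/setup.md' }],
  },
};
console.log(listRelated(exampleFile));
```

A file that belongs to no collection has no related field in this sketch's assumption, so the helper falls back to an empty object and returns an empty array.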

Options

pattern (optional)

Type: string Default: **/*

A micromatch glob pattern to find input files.

maxRelated (optional)

Type: number Default: 3

The maximum number of related files to add to each file's metadata, per collection.

natural (optional)

Type: object Default:

{
    "minTfIdf": 0,
    "maxTerms": 10
}

natural.minTfIdf (optional)

Type: number Default: 0

The minimum Term Frequency-Inverse Document Frequency (TF-IDF) weight a file must have to count as a match.

natural.maxTerms (optional)

Type: number Default: 10

The maximum number of terms to use for TF-IDF weighting.

sanitizeHtml (optional)

Type: object Default:

{
    "allowedTags": [],
    "allowedAttributes": {},
    "nonTextTags": ["pre"]
}

An object of sanitize-html options.
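
For reference, the documented defaults can be written out as one explicit options object. This is a sketch assembled from the defaults listed above; the variable name relatedOptions is an assumption, and the comments on the sanitize-html fields describe that library's general behavior rather than anything specific to this plugin.

```javascript
// Explicit copy of the documented defaults; `relatedOptions` is a hypothetical name.
const relatedOptions = {
  pattern: '**/*',      // micromatch glob: consider every file
  maxRelated: 3,        // related files to add per collection
  natural: {
    minTfIdf: 0,        // minimum TF-IDF weight for a match
    maxTerms: 10,       // terms used for TF-IDF weighting
  },
  sanitizeHtml: {
    allowedTags: [],        // in sanitize-html, an empty list strips all tags
    allowedAttributes: {},  // no attributes are kept
    nonTextTags: ['pre'],   // in sanitize-html, contents of these tags are discarded
  },
};

// Would be passed to the plugin as: .use(related(relatedOptions))
```

Overriding a single value, such as maxRelated, should only require passing that key, assuming the plugin merges the given options over its defaults.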

Changelog

Changelog
