Simple, expected, and deterministic best-match sorting of an array in JavaScript
Clone or download
tikotzky and kentcdodds feat: add ability to set a threshold per key (#50)
<!--
Thanks for your interest in the project. Bugs filed and PRs submitted are appreciated!

Please make sure that you are familiar with and follow the Code of Conduct for
this project (found in the CODE_OF_CONDUCT.md file).

Also, please make sure you're familiar with and follow the instructions in the
contributing guidelines (found in the CONTRIBUTING.md file).

If you're new to contributing to open source projects, you might find this free
video course helpful: http://kcd.im/pull-request

Please fill out the information below to expedite the review and (hopefully)
merge of your pull request!
-->

<!-- What changes are being made? (What feature/bug is being fixed here?) -->

**What**:
Add the ability to set a threshold per key
<!-- Why are these changes necessary? -->

**Why**:
To be able to set specific keys to only match when they meet a specified threshold
<!-- How were these changes implemented? -->

**How**:
By adding returning the keys threshold with each match and verifying that the the match is `>=` the threshold specified on its key.
<!-- Have you done all of these things?  -->

**Checklist**:

<!-- add "N/A" to the end of each line that's irrelevant to your changes -->

<!-- to check an item, place an "x" in the box like so: "- [x] Documentation" -->

* [x] Documentation
* [x] Tests
* [x] Ready to be merged <!-- In your opinion, is this ready to be merged as soon as it's reviewed? -->
* [x] Added myself to contributors table <!-- this is optional, see the contributing guidelines for instructions -->

<!-- feel free to add additional comments -->
Latest commit 22f1d73 Aug 30, 2018

README.md

match-sorter

Simple, expected, and deterministic best-match sorting of an array in JavaScript


Demo

Build Status Code Coverage Dependencies version downloads MIT License

All Contributors PRs Welcome Donate Code of Conduct Roadmap Examples

gzip size size module formats: umd and cjs Watch on GitHub Star on GitHub Tweet

The problem

  1. You have a list of dozens, hundreds, or thousands of items
  2. You want to filter and sort those items intelligently (maybe you have a filter input for the user)
  3. You want simple, expected, and deterministic sorting of the items (no fancy math algorithm that fancily changes the sorting as they type)

This solution

This follows a simple and sensible (user friendly) algorithm that makes it easy for you to filter and sort a list of items based on given input. Items are ranked based on sensible criteria that result in a better user experience.

To explain the ranking system, I'll use countries as an example:

  1. CASE SENSITIVE EQUALS: Case-sensitive equality trumps all. These will be first. (ex. France would match France, but not france)
  2. EQUALS: Case-insensitive equality (ex. France would match france)
  3. STARTS WITH: If the item starts with the given value (ex. Sou would match South Korea or South Africa)
  4. WORD STARTS WITH: If the item has multiple words, then if one of those words starts with the given value (ex. Repub would match Dominican Republic)
  5. CASE STARTS WITH: If the item has a defined case (camelCase, PascalCase, snake_case or kebab-case), then if one of the parts starts with the given value (ex. kingdom would match unitedKingdom or united_kingdom)
  6. CASE ACRONYM If the item's case matches the synonym (ex. uk would match united-kingdom or UnitedKingdom)
  7. CONTAINS: If the item contains the given value (ex. ham would match Bahamas)
  8. ACRONYM: If the item's acronym is the given value (ex. us would match United States)
  9. SIMPLE MATCH: If the item has letters in the same order as the letters of the given value (ex. iw would match Zimbabwe, but not Kuwait because it must be in the same order). Furthermore, if the item is a closer match, it will rank higher (ex. ua matches Uruguay more closely than United States of America, therefore Uruguay will be ordered before United States of America)

This ranking seems to make sense in people's minds. At least it does in mine. Feedback welcome!

Getting Started

Installation

This module is distributed via npm which is bundled with node and should be installed as one of your project's dependencies:

npm install --save match-sorter

Usage

const matchSorter = require('match-sorter')
// ES6 imports work too
// Also available in global environment via `matchSorter` global
const list = ['hi', 'hey', 'hello', 'sup', 'yo']
matchSorter(list, 'h') // ['hi', 'hey', 'hello']
matchSorter(list, 'y') // ['yo', 'hey']
matchSorter(list, 'z') // []

Advanced options

keys: [string]

Default: undefined

By default it just uses the value itself as above. Passing an array tells match-sorter which keys to use for the ranking.

const objList = [
  {name: 'Janice', color: 'Green'},
  {name: 'Fred', color: 'Orange'},
  {name: 'George', color: 'Blue'},
  {name: 'Jen', color: 'Red'},
]
matchSorter(objList, 'g', {keys: ['name', 'color']})
// [{name: 'George', color: 'Blue'}, {name: 'Janice', color: 'Green'}, {name: 'Fred', color: 'Orange'}]

matchSorter(objList, 're', {keys: ['color', 'name']})
// [{name: 'Jen', color: 'Red'}, {name: 'Janice', color: 'Green'}, {name: 'Fred', color: 'Orange'}, {name: 'George', color: 'Blue'}]

Array of values: When the specified key matches an array of values, the best match from the values of in the array is going to be used for the ranking.

const iceCreamYum = [
  {favoriteIceCream: ['mint', 'chocolate']},
  {favoriteIceCream: ['candy cane', 'brownie']},
  {favoriteIceCream: ['birthday cake', 'rocky road', 'strawberry']},
]
matchSorter(iceCreamYum, 'cc', {keys: ['favoriteIceCream']})
// [{favoriteIceCream: ['candy cane', 'brownie']}, {favoriteIceCream: ['mint', 'chocolate']}]

Nested Keys: You can specify nested keys using dot-notation.

const nestedObjList = [
  {name: {first: 'Janice'}},
  {name: {first: 'Fred'}},
  {name: {first: 'George'}},
  {name: {first: 'Jen'}},
]
matchSorter(nestedObjList, 'j', {keys: ['name.first']})
// [{name: {first: 'Janice'}}, {name: {first: 'Jen'}}]

const nestedObjList = [
  {name: [{first: 'Janice'}]},
  {name: [{first: 'Fred'}]},
  {name: [{first: 'George'}]},
  {name: [{first: 'Jen'}]},
]
matchSorter(nestedObjList, 'j', {keys: ['name.0.first']})
// [{name: {first: 'Janice'}}, {name: {first: 'Jen'}}]
// matchSorter(nestedObjList, 'j', {keys: ['name[0].first']}) does not work

Property Callbacks: Alternatively, you may also pass in a callback function that resolves the value of the key(s) you wish to match on. This is especially useful when interfacing with libraries such as Immutable.js

const list = [{name: 'Janice'}, {name: 'Fred'}, {name: 'George'}, {name: 'Jen'}]
matchSorter(list, 'j', {keys: [item => item.name]})
// [{name: 'Janice'}, {name: 'Jen'}]

Threshold: You may specify an individual threshold for specific keys. A key will only match if it meets the specified threshold. For more information regarding thresholds see below

const list = [{name: 'Fred', color: 'Orange'}, {name: 'Jen', color: 'Red'}]
matchSorter(list, 'ed', {
  keys: [{threshold: rankings.STARTS_WITH, key: 'name'}, 'color'],
})
//[{name: 'Jen', color: 'Red'}]

Min and Max Ranking: You may restrict specific keys to a minimum or maximum ranking by passing in an object. A key with a minimum rank will only get promoted if there is at least a simple match.

const tea = [
  {tea: 'Earl Grey', alias: 'A'},
  {tea: 'Assam', alias: 'B'},
  {tea: 'Black', alias: 'C'},
]
matchSorter(tea, 'A', {
  keys: ['tea', {maxRanking: matchSorter.rankings.STARTS_WITH, key: 'alias'}],
})
// without maxRanking, Earl Grey would come first because the alias "A" would be CASE_SENSITIVE_EQUAL
// `tea` key comes before `alias` key, so Assam comes first even though both match as STARTS_WITH
// [{tea: 'Assam', alias: 'B'}, {tea: 'Earl Grey', alias: 'A'},{tea: 'Black', alias: 'C'}]
const tea = [
  {tea: 'Milk', alias: 'moo'},
  {tea: 'Oolong', alias: 'B'},
  {tea: 'Green', alias: 'C'},
]
matchSorter(tea, 'oo', {
  keys: ['tea', {minRanking: matchSorter.rankings.EQUAL, key: 'alias'}],
})
// minRanking bumps Milk up to EQUAL from CONTAINS (alias)
// Oolong matches as STARTS_WITH
// Green is missing due to no match
// [{tea: 'Milk', alias: 'moo'}, {tea: 'Oolong', alias: 'B'}]

threshold: number

Default: MATCHES

Thresholds can be used to specify the criteria used to rank the results. Available thresholds (from top to bottom) are:

  • CASE_SENSITIVE_EQUAL
  • EQUAL
  • STARTS_WITH
  • WORD_STARTS_WITH
  • STRING_CASE
  • STRING_CASE_ACRONYM
  • CONTAINS
  • ACRONYM
  • MATCHES (default value)
  • NO_MATCH
const fruit = ['orange', 'apple', 'grape', 'banana']
matchSorter(fruit, 'ap', {threshold: matchSorter.rankings.NO_MATCH})
// ['apple', 'grape', 'orange', 'banana'] (returns all items, just sorted by best match)

const things = ['google', 'airbnb', 'apple', 'apply', 'app'],
matchSorter(things, 'app', {threshold: matchSorter.rankings.EQUAL})
// ['app'] (only items that are equal)

const otherThings = ['fiji apple', 'google', 'app', 'crabapple', 'apple', 'apply']
matchSorter(otherThings, 'app', {threshold: matchSorter.rankings.WORD_STARTS_WITH})
// ['app', 'apple', 'apply', 'fiji apple'] (everything that matches with "word starts with" or better)

keepDiacritics: boolean

Default: false

By default, match-sorter will strip diacritics before doing any comparisons. This is the default because it makes the most sense from a UX perspective.

You can disable this behavior by specifying keepDiacritics: true

const thingsWithDiacritics = [
  'jalapeño',
  'à la carte',
  'café',
  'papier-mâché',
  'à la mode',
]
matchSorter(thingsWithDiacritics, 'aa')
// ['jalapeño', 'à la carte', 'à la mode', 'papier-mâché']

matchSorter(thingsWithDiacritics, 'aa', {keepDiacritics: true})
// ['jalapeño', 'à la carte']

matchSorter(thingsWithDiacritics, 'à', {keepDiacritics: true})
// ['à la carte', 'à la mode']

Using ES6?

In the examples above, we're using CommonJS. If you're using ES6 modules, then you can do:

import matchSorter, {rankings, caseRankings} from 'match-sorter'

Inspiration

Actually, most of this code was extracted from the very first library I ever wrote: genie!

Other Solutions

You might try Fuse.js. It uses advanced math fanciness to get the closest match. Unfortunately what's "closest" doesn't always really make sense. So I extracted this from genie.

Contributors

Thanks goes to these people (emoji key):


Kent C. Dodds

💻 📖 🚇 ⚠️ 👀

Conor Hastings

💻 📖 ⚠️ 👀

Rogelio Guzman

📖

Claudéric Demers

💻 📖 ⚠️

Kevin Davis

💻 ⚠️

Denver Chen

💻 📖 ⚠️

Christian Ruigrok

🐛 💻 📖

Hozefa

🐛 💻 ⚠️ 🤔

pushpinder107

💻

Mordy Tikotzky

💻 📖 ⚠️

This project follows the all-contributors specification. Contributions of any kind welcome!

LICENSE

MIT