Object-Scan

Traverse object hierarchies using matching and callbacks.

Install

Install with npm:

$ npm install --save object-scan

Usage

haystack: { a: { b: { c: 'd' }, e: { f: 'g' } } }
needles: ['a.*.f']
spoiler: false

Features

  • Input traversed exactly once during search
  • Dependency free, small in size and very performant
  • Separate Object and Array matching
  • Wildcard and Regex matching
  • Arbitrary depth matching
  • Or-clause Syntax
  • Exclusion Matching
  • Full support for escaping
  • Traversal in "delete-safe" order
  • Recursion free implementation
  • Search syntax validated
  • Lots of tests and examples

Matching

A needle expression specifies one or more paths to an element (or a set of elements) in a JSON structure. Paths use the dot notation:

store.book[0].title

Array

Use square brackets to match array indices.

Examples:

haystack: [0, 1, 2, 3, 4]
needles: ['[2]']
comment: exact in array
haystack: { 0: 'a', 1: 'b', 2: 'c' }
needles: ['[1]']
comment: no match in object

Object

Use the property name to match object properties.

Examples:

haystack: { foo: 0, bar: 1 }
needles: ['foo']
comment: exact in object
haystack: [0, 1, 2, 3, 4]
needles: ['1']
comment: no match in array

Wildcard

The following characters have special meaning when not escaped:

  • *: Match zero or more characters
  • +: Match one or more characters
  • ?: Match exactly one character
  • \: Escape the subsequent character

Wildcards can be used with Array and Object selectors.

Examples:

haystack: { a: { b: 0, c: 1 }, d: 2 }
needles: ['*']
comment: top level
haystack: [...Array(30).keys()]
needles: ['[?5]']
comment: two digit ending in five
haystack: { a: { b: { c: 0 }, d: { f: 0 } } }
needles: ['a.+.c']
comment: nested
haystack: { a: { b: { c: 0 }, '+': { c: 0 } } }
needles: ['a.\\+.c']
comment: escaped
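
To make the wildcard rules above concrete, here is a minimal sketch that translates a single wildcard segment into an anchored regular expression. The helpers `escapeRegex` and `wildcardToRegex` are hypothetical names used for illustration only; they are not part of the object-scan API and do not claim to mirror its internals.

```javascript
// Hypothetical helpers for illustration; not part of object-scan.
// Escapes a single character so it is matched literally inside a RegExp.
const escapeRegex = (char) => char.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Converts one wildcard segment into an anchored RegExp:
//   *  -> zero or more characters
//   +  -> one or more characters
//   ?  -> exactly one character
//   \x -> the literal character x
const wildcardToRegex = (segment) => {
  let source = '';
  for (let i = 0; i < segment.length; i += 1) {
    const char = segment[i];
    if (char === '\\' && i + 1 < segment.length) {
      i += 1;
      source += escapeRegex(segment[i]);
    } else if (char === '*') {
      source += '.*';
    } else if (char === '+') {
      source += '.+';
    } else if (char === '?') {
      source += '.';
    } else {
      source += escapeRegex(char);
    }
  }
  return new RegExp(`^${source}$`);
};

// Mirrors the `[?5]` example: two digit indices ending in five.
const indices = [...Array(30).keys()];
const twoDigitEndingInFive = indices.filter((idx) => wildcardToRegex('?5').test(String(idx)));
console.log(twoDigitEndingInFive); // [ 15, 25 ]

// An escaped plus only matches the literal key '+'.
console.log(wildcardToRegex('\\+').test('+')); // true
console.log(wildcardToRegex('\\+').test('b')); // false
```

The sketch assumes that array indices are compared in their string form, which is consistent with the `[?5]` example but is not confirmed by the text above.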

Regex

A regex is defined by wrapping it in parentheses.

Can be used with Array and Object selectors.

Examples:

haystack: { foo: 0, foobar: 1, bar: 2 }
needles: ['(^foo)']
comment: starting with `foo`
haystack: [...Array(20).keys()]
needles: ['[(5)]']
comment: containing `5`
haystack: ['a', 'b', 'c', 'd']
needles: ['[(^[01]$)]']
comment: `[0]` and `[1]`
haystack: ['a', 'b', 'c', 'd']
needles: ['[(^[^01]$)]']
comment: other than `[0]` and `[1]`
haystack: ['a', 'b', 'c', 'd']
needles: ['[*]', '[!(^[01]$)]']
comment: match all and exclude `[0]` and `[1]`

Arbitrary Depth

There are two types of arbitrary depth matching:

  • **: Matches zero or more nestings
  • ++: Matches one or more nestings

Recursions can be combined with a regex or a group by appending the regex or group.

Examples:

haystack: { a: { b: 0, c: 0 } }
needles: ['a.**']
comment: zero or more nestings under `a`
haystack: { a: { b: 0, c: 0 } }
needles: ['a.++']
comment: one or more nestings under `a`
haystack: { 1: { 1: ['c', 'd'] }, 510: 'e', foo: { 1: 'f' } }
needles: ['**(1)']
comment: all containing `1` at every level
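
The difference between `**` and `++` is only whether the starting point itself counts as a match. The sketch below lists paths below a value with a configurable minimum number of nestings. `pathsBelow` is a hypothetical helper written for this illustration; it handles plain objects only, and the ordering of its output is sorted for readability and is not the order object-scan would return.

```javascript
// Hypothetical helper for illustration; object-scan does not expose it.
// Lists every key path located at least `minNestings` levels below `value`.
// Uses an explicit stack instead of recursion.
const pathsBelow = (value, minNestings) => {
  const result = [];
  const stack = [{ node: value, path: [] }];
  while (stack.length !== 0) {
    const { node, path } = stack.pop();
    if (path.length >= minNestings) {
      result.push(path.join('.'));
    }
    if (node !== null && typeof node === 'object') {
      Object.keys(node).forEach((key) => {
        stack.push({ node: node[key], path: path.concat(key) });
      });
    }
  }
  return result.sort();
};

const haystack = { a: { b: 0, c: 0 } };

// 'a.**': zero or more nestings under `a`, so `a` itself is included.
const zeroOrMore = pathsBelow(haystack.a, 0).map((p) => (p === '' ? 'a' : `a.${p}`));
console.log(zeroOrMore); // [ 'a', 'a.b', 'a.c' ]

// 'a.++': one or more nestings under `a`, so `a` itself is excluded.
const oneOrMore = pathsBelow(haystack.a, 1).map((p) => `a.${p}`);
console.log(oneOrMore); // [ 'a.b', 'a.c' ]
```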

Or Clause

Or clauses are defined by using curly brackets.

Can be used with Array and Object selectors.

Examples:

haystack: ['a', 'b', 'c', 'd']
needles: ['[{0,1}]']
comment: `[0]` and `[1]`
haystack: { a: { b: 0, c: 1 }, d: { e: 2, f: 3 } }
needles: ['{a,d}.{b,f}']
comment: `a.b`, `a.f`, `d.b` and `d.f`
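
An or-clause can be thought of as shorthand for the cross product of its alternatives. The sketch below expands a needle such as `{a,d}.{b,f}` into the four paths named in the example. `expandOr` is a hypothetical helper for illustration; it assumes each group spans a whole dot-separated segment and ignores nesting, escaping and array selectors, all of which the real syntax supports.

```javascript
// Hypothetical helper for illustration; not part of object-scan.
// Expands top-level or-clauses that span a whole segment.
const expandOr = (needle) => needle
  .split('.')
  // turn '{a,d}' into ['a', 'd'] and 'x' into ['x']
  .map((segment) => (/^\{.*\}$/.test(segment) ? segment.slice(1, -1).split(',') : [segment]))
  // build the cross product of all segment alternatives
  .reduce(
    (paths, options) => paths.flatMap((path) => options.map((option) => [...path, option])),
    [[]]
  )
  .map((path) => path.join('.'));

const expanded = expandOr('{a,d}.{b,f}');
console.log(expanded); // [ 'a.b', 'a.f', 'd.b', 'd.f' ]
```

Of these four paths only `a.b` and `d.f` exist in the example haystack, so only those two would produce matches.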

Nested Path Recursion

To match a nested path recursively, combine arbitrary depth matching with an or-clause.

There are two types of nested path matching:

  • **{...}: Matches path(s) in group zero or more times
  • ++{...}: Matches path(s) in group one or more times

Examples:

haystack: [[[[0, 1], [1, 2]], [[3, 4], [5, 6]]], [[[7, 8], [9, 10]], [[11, 12], [13, 14]]]]
needles: ['++{[0][1]}']
comment: `cyclic path`
haystack: [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
needles: ['++{[0],[1]}']
comment: `nested or`
haystack: [[[{ a: [1] }], [2]]]
needles: ['**{[*]}']
comment: `traverse only array`
haystack: { a: [0, { b: 1 }], c: { d: 2 } }
needles: ['**{*}']
comment: `traverse only object`
haystack: { a: { b: { c: { b: { c: 0 } } } } }
needles: ['a.**{b.c}']
comment: `zero or more times`
haystack: { a: { b: { c: { b: { c: 0 } } } } }
needles: ['a.++{b.c}']
comment: `one or more times`

Exclusion

To exclude a path, prefix it with an exclamation mark.

Examples:

haystack: { a: 0, b: 1 }
needles: ['{a,b},!a']
comment: only `b`
strict: false
haystack: { a: 0, b: { a: 1, c: 2 } }
needles: ['**,!**.a']
comment: all except ending in `a`

Escaping

The following characters are considered special and need to be escaped using \, if they should be matched in a key:
[, ], {, }, (, ), ,, ., !, ?, *, + and \.

Examples:

haystack: { '[1]': 0 }
needles: ['\\[1\\]']
comment: special object key

Options

Signature of all callbacks is

Fn({ key, value, ... })

where:

  • key: key that callback is invoked for (respects joined option).
  • value: value for key.
  • entry: entry consisting of [key, value].
  • property: current parent property.
  • gproperty: current grandparent property.
  • parent: current parent.
  • gparent: current grandparent.
  • parents: array of form [parent, grandparent, ...].
  • isMatch: true iff last targeting needle exists and is non-excluding.
  • matchedBy: all non-excluding needles targeting key.
  • excludedBy: all excluding needles targeting key.
  • traversedBy: all needles involved in traversing key.
  • isCircular: true iff value is contained in parents
  • isLeaf: true iff value cannot be traversed
  • depth: length of key
  • result: intermediate result as defined by rtn
  • getKey: function that returns key
  • getValue: function that returns value
  • getEntry: function that returns entry
  • getProperty: function that returns property
  • getGproperty: function that returns gproperty
  • getParent: function that returns parent
  • getGparent: function that returns gparent
  • getParents: function that returns parents
  • getIsMatch: function that returns isMatch
  • getMatchedBy: function that returns matchedBy
  • getExcludedBy: function that returns excludedBy
  • getTraversedBy: function that returns traversedBy
  • getIsCircular: function that returns isCircular
  • getIsLeaf: function that returns isLeaf
  • getDepth: function that returns depth
  • getResult: function that returns result
  • context: as passed into the search

Notes on Performance:

  • Arguments backed by getters are implemented as function getters and should be accessed via destructuring to prevent redundant computation.
  • The get* functions should be used to improve performance for conditional access, e.g. if (isMatch) { getParents() ... }.
  • For performance reasons, the same object is passed to all callbacks.
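
The first two notes follow from how JavaScript getters behave in general. The sketch below uses a hand-written stand-in for the callback argument (the object `kwargs` and its counter are hypothetical and only mimic the shape described above) to show that each property access re-runs a getter, while destructuring reads it once.

```javascript
let computations = 0;

// Hypothetical stand-in for the object passed to callbacks.
// `parents` is backed by a getter, so it is computed on every access.
const kwargs = {
  getParents: () => {
    computations += 1;
    return ['parent', 'grandparent'];
  },
  get parents() {
    return this.getParents();
  }
};

// Two property accesses trigger the getter twice.
const viaAccess = [kwargs.parents.length, kwargs.parents[0]];
const afterAccess = computations;
console.log(afterAccess); // 2

// Destructuring reads the getter once; the local binding is then reused.
const { parents } = kwargs;
const viaDestructuring = [parents.length, parents[0]];
const afterDestructuring = computations - afterAccess;
console.log(afterDestructuring); // 1
```

Calling `getParents()` only inside a conditional branch avoids the computation entirely when the branch is not taken, which is the point of the second note.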

filterFn

Type: function
Default: undefined

When defined, this callback is invoked for every match. If false is returned, the current key is excluded from the result.

The return value of this callback has no effect when a search context is provided.

Can be used to do processing as matching keys are traversed.

Invoked in same order as matches would appear in result.

This method is conceptually similar to Array.filter().

Examples:

haystack: { a: 0, b: 'bar' }
needles: ['**']
comment: filter function
filterFn: ({ value }) => typeof value === 'string'

breakFn

Type: function
Default: undefined

When defined, this callback is invoked for every key that is traversed by the search. If true is returned, all keys nested under the current key are skipped in the search and from the final result.

Note that breakFn is invoked before the corresponding filterFn might be invoked.

Examples:

haystack: { a: { b: { c: 0 } } }
needles: ['**']
comment: break function
breakFn: ({ key }) => key === 'a.b'

beforeFn

Type: function
Default: undefined

When defined, this function is called before traversal as beforeFn(state = { haystack, context }).

If a value other than undefined is returned from beforeFn, that value is written to state.haystack before traversal.

The content of state can be modified in the function. After beforeFn has executed, the traversal happens using state.haystack and state.context.

The content in state can be accessed in afterFn. Note however that the key result is being overwritten.

Examples:

haystack: { a: 0 }
context: { b: 0 }
needles: ['**']
comment: combining haystack and context
beforeFn: ({ haystack: h, context: c }) => [h, c]
rtn: 'key'
haystack: { a: 0, b: 1 }
needles: ['**']
comment: pre-processing haystack
beforeFn: ({ haystack: h }) => Object.keys(h)
rtn: ['key', 'value']

afterFn

Type: function
Default: undefined

When defined, this function is called after traversal as afterFn(state = { result, haystack, context }).

Additional information written to state in beforeFn is available in afterFn.

The content of state can be modified in the function. In particular the key state.result can be updated.

If a value other than undefined is returned from afterFn, that value is written to state.result.

After afterFn has executed, the key state.result is returned as the final result.

Examples:

haystack: { a: 0 }
context: 5
needles: ['**']
comment: returning count plus context
afterFn: ({ result, context }) => result + context
rtn: 'count'
joined: false
haystack: { a: 0, b: 3, c: 4 }
needles: ['**']
comment: post-processing result
afterFn: ({ result }) => result.filter((v) => v > 3)
rtn: 'value'
joined: false
haystack: {}
needles: ['**']
comment: pass data from beforeFn to afterFn
beforeFn: (state) => { /* eslint-disable no-param-reassign */ state.custom = 7; }
afterFn: (state) => state.custom
joined: false

compareFn

Type: function
Default: undefined

This function has the same signature as the callback functions. When defined it is expected to return a function or undefined.

The returned value is used as a comparator to determine the traversal order of any object keys.

This works together with the reverse option.

Examples:

haystack: { a: 0, c: 1, b: 2 }
needles: ['**']
compareFn: () => (k1, k2) => k1.localeCompare(k2)
comment: simple sort
reverse: false

reverse

Type: boolean
Default: true

When set to true, the scan is performed in reverse order. This means breakFn is executed in reverse post-order and filterFn in reverse pre-order. Otherwise breakFn is executed in pre-order and filterFn in post-order.

When reverse is true, the scan is delete-safe, i.e. a property can be deleted or spliced from its parent object or array in filterFn.

Examples:

haystack: { f: { b: { a: {}, d: { c: {}, e: {} } }, g: { i: { h: {} } } } }
needles: ['**']
context: []
breakFn: ({ isMatch, property, context }) => { if (isMatch) { context.push(property); } }
comment: breakFn, reverse true
reverse: true
joined: false
haystack: { f: { b: { a: {}, d: { c: {}, e: {} } }, g: { i: { h: {} } } } }
needles: ['**']
context: []
filterFn: ({ property, context }) => { context.push(property); }
comment: filterFn, reverse true
reverse: true
joined: false
haystack: { f: { b: { a: {}, d: { c: {}, e: {} } }, g: { i: { h: {} } } } }
needles: ['**']
context: []
breakFn: ({ isMatch, property, context }) => { if (isMatch) { context.push(property); } }
comment: breakFn, reverse false
reverse: false
joined: false
haystack: { f: { b: { a: {}, d: { c: {}, e: {} } }, g: { i: { h: {} } } } }
needles: ['**']
context: []
filterFn: ({ property, context }) => { context.push(property); }
comment: filterFn, reverse false
reverse: false
joined: false
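
Why reverse order is delete-safe can be shown without the library. The sketch below removes even numbers from an array while iterating over its original indices, once in reverse and once forwards. `removeEven` is a hypothetical helper for illustration; it only demonstrates the general splice behavior and not how object-scan traverses.

```javascript
// Illustration only: plain JavaScript, no object-scan involved.
// Visits the original indices in the given order and splices even values.
const removeEven = (input, reverse) => {
  const arr = [...input];
  const indices = [...arr.keys()];
  if (reverse) {
    indices.reverse();
  }
  indices.forEach((idx) => {
    if (arr[idx] % 2 === 0) {
      arr.splice(idx, 1);
    }
  });
  return arr;
};

// Reverse order: removing an element never shifts a not yet visited index.
const safe = removeEven([0, 2, 4, 1], true);
console.log(safe); // [ 1 ]

// Forward order: each splice shifts later elements, so some are skipped.
const unsafe = removeEven([0, 2, 4, 1], false);
console.log(unsafe); // [ 2, 1 ]
```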

orderByNeedles

Type: boolean
Default: false

When set to false, all targeted keys are traversed and matched in the order determined by the compareFn and reverse option.

When set to true, all targeted keys are traversed and matched in the order determined by the corresponding needles, falling back to the above ordering.

Note that this option is constrained by the depth-first search approach.

Examples:

haystack: { a: 0, b: 1, c: 1 }
needles: ['c', 'a', 'b']
orderByNeedles: true
comment: order by needle
haystack: { a: 0, b: 1, c: 1 }
needles: ['b', '*']
orderByNeedles: true
reverse: true
comment: fallback reverse
haystack: { a: 0, b: 1, c: 1 }
needles: ['b', '*']
orderByNeedles: true
reverse: false
comment: fallback not reverse
haystack: { a: 0, b: { c: 1 }, d: 2 }
needles: ['a', 'b.c', 'd']
orderByNeedles: true
comment: nested match
haystack: { a: 0, b: { c: 1 }, d: 2 }
needles: ['b', 'a', 'b.c', 'd']
orderByNeedles: true
comment: matches traverse first

abort

Type: boolean
Default: false

When set to true the scan immediately returns after the first match.

Examples:

haystack: { a: 0, b: 1 }
needles: ['a', 'b']
joined: false
rtn: 'property'
abort: true
comment: only return first property
haystack: ['a', 'b']
needles: ['[0]', '[1]']
joined: false
rtn: 'count'
abort: true
comment: abort changes count

rtn

Type: string or array or function
Default: dynamic

Defaults to key when search context is undefined and to context otherwise.

Can be explicitly set as a string:

  • context: search context is returned
  • key: as passed into filterFn
  • value: as passed into filterFn
  • entry: as passed into filterFn
  • property: as passed into filterFn
  • gproperty: as passed into filterFn
  • parent: as passed into filterFn
  • gparent: as passed into filterFn
  • parents: as passed into filterFn
  • isMatch: as passed into filterFn
  • matchedBy: as passed into filterFn
  • excludedBy: as passed into filterFn
  • traversedBy: as passed into filterFn
  • isCircular: as passed into filterFn
  • isLeaf: as passed into filterFn
  • depth: as passed into filterFn
  • bool: returns true iff a match is found
  • count: returns the match count

When set to an array, it can contain any of the above except context, bool and count.

When set to a function, it is called with the callback signature for every match. The returned value is added to the result.

When abort is set to true and rtn is not context, bool or count, the first entry of the result or undefined is returned.

Examples:

haystack: ['a', 'b', 'c']
needles: ['[*]']
joined: false
rtn: 'value'
comment: return values
haystack: { foo: ['bar'] }
needles: ['foo[*]']
joined: false
rtn: 'entry'
comment: return entries
haystack: { a: { b: { c: 0 } } }
needles: ['a.b.c', 'a']
joined: false
rtn: 'property'
comment: return properties
haystack: { a: { b: 0, c: 1 } }
needles: ['a.b', 'a.c']
joined: false
rtn: 'bool'
comment: checks for any match, full scan
haystack: { a: 0 }
needles: ['**']
joined: false
rtn: 'context'
comment: return not provided context
haystack: { a: { b: { c: 0, d: 1 } } }
needles: ['a.b.{c,d}']
joined: false
rtn: 'key'
context: []
comment: return keys with context passed
haystack: { a: { b: { c: 0, d: 1 } } }
needles: ['a.b.{c,d}']
joined: false
rtn: ['property', 'value']
context: []
comment: return custom array
haystack: { a: { b: { c: 0, d: 1 } } }
needles: ['**']
joined: false
rtn: ({ value }) => value + 1
filterFn: ({ isLeaf }) => isLeaf
comment: return value plus one

joined

Type: boolean
Default: false

When set to true, keys are returned as strings instead of lists.

Setting this option to true will negatively impact performance.

Note that lodash's _.get and _.set fully support lists.

Examples:

haystack: [0, 1, { foo: 'bar' }]
needles: ['[*]', '[*].foo']
joined: true
comment: joined
haystack: [0, 1, { foo: 'bar' }]
needles: ['[*]', '[*].foo']
joined: false
comment: not joined
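
The relationship between the two key formats can be sketched as follows. `joinKey` is a hypothetical helper for illustration and not the library's own conversion. It assumes that list keys use numbers for array indices and strings for object properties, and it does not escape special characters.

```javascript
// Hypothetical helper for illustration; not part of object-scan.
// Assumption: numbers are array indices, strings are object properties.
// Special characters in property names are not escaped here.
const joinKey = (key) => key.reduce((acc, segment) => {
  if (typeof segment === 'number') {
    return `${acc}[${segment}]`;
  }
  return acc === '' ? segment : `${acc}.${segment}`;
}, '');

console.log(joinKey([2, 'foo'])); // [2].foo
console.log(joinKey(['store', 'book', 0, 'title'])); // store.book[0].title
```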

useArraySelector

Type: boolean
Default: true

When set to false, no array selectors should be used in any needles and arrays are automatically traversed.

Note that the results still include the array selectors.

Examples:

haystack: [{ a: 0 }, { b: [{ c: 1 }, { d: 2 }] }]
needles: ['a', 'b.d']
useArraySelector: false
comment: automatic array traversal
haystack: [{ a: 0 }, { b: 1 }]
needles: ['']
useArraySelector: false
comment: top level array matching

strict

Type: boolean
Default: true

When set to true, errors are thrown when:

  • a path is identical to a previous path
  • a path invalidates a previous path
  • a path contains consecutive recursions

Examples:

haystack: []
needles: ['a.b', 'a.b']
comment: identical
haystack: []
needles: ['a.{b,b}']
comment: identical, same needle
haystack: []
needles: ['a.b', 'a.**']
comment: invalidates previous
haystack: []
needles: ['**.!**']
comment: consecutive recursion

Search Context

A context can be passed into a search invocation as a second parameter. It is available in all callbacks and can be used to manage state across a search invocation without having to recompile the search.

By default all matched keys are returned from a search invocation. However, when it is not undefined, the context is returned instead.

Examples:

haystack: { a: { b: { c: 2, d: 11 }, e: 7 } }
needles: ['**.{c,d,e}']
context: { sum: 0 }
filterFn: ({ value, context }) => { context.sum += value; }
comment: sum values

Examples

More extensive examples can be found in the tests.

haystack: { a: { b: { c: 'd' }, e: { f: 'g' }, h: ['i', 'j'] }, k: 'l' }
needles: ['a.*.f']
comment: nested
needles: ['*.*.*']
comment: multiple nested
needles: ['a.*.{c,f}']
comment: or filter
needles: ['a.*.{c,f}']
comment: or filter, not joined
joined: false
needles: ['*.*[*]']
comment: list filter
needles: ['*[*]']
comment: list filter, unmatched
needles: ['**']
comment: star recursion
needles: ['++.++']
comment: plus recursion
needles: ['**.f']
comment: star recursion ending in f
needles: ['**[*]']
comment: star recursion ending in array
needles: ['a.*,!a.e']
comment: exclusion filter
needles: ['**.(^[bc]$)']
comment: regex matching

Edge Cases

Top level object(s) are matched by the empty needle ''. This is useful for matching objects nested in arrays by setting useArraySelector to false. To match the actual empty string as a key, use (^$).

Note that the empty string does not work to match top level objects with _.get or _.set.

Examples:

haystack: [{}, {}]
needles: ['']
useArraySelector: false
comment: match top level objects in array
haystack: {}
needles: ['']
comment: match top level object
haystack: { '': 0, a: { '': 1 } }
needles: ['**.(^$)']
joined: false
comment: match empty string keys
haystack: [0, [{ a: 1 }, 2]]
needles: ['**(^a$)']
useArraySelector: false
comment: star recursion matches roots

Internals

Conceptually this package works as follows:

  1. During initialization the needles are parsed and built into a search tree. Various information is pre-computed and stored for every node. Finally the search function is returned.

  2. When the search function is invoked, the input is traversed simultaneously with the relevant nodes of the search tree. Processing multiple search tree branches in parallel allows for a single traversal of the input.

Having a separate initialization stage allows for a performant search and significant speedups when applying the same search to different inputs.
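
The two stages can be sketched in a heavily simplified form. The `compile` function below is hypothetical and is not object-scan's implementation: it supports only dot-separated object keys and a `*` that stands for a whole segment, and it ignores arrays, regex, recursion, exclusion and escaping. It does show the general shape: needles are built into a tree once, and the returned function walks the input and the relevant tree nodes together using an explicit stack.

```javascript
// Simplified sketch under the assumptions stated above.
const compile = (needles) => {
  // Stage 1: parse the needles into a search tree, once.
  const root = { children: new Map(), needle: null };
  needles.forEach((needle) => {
    let node = root;
    needle.split('.').forEach((segment) => {
      if (!node.children.has(segment)) {
        node.children.set(segment, { children: new Map(), needle: null });
      }
      node = node.children.get(segment);
    });
    node.needle = needle;
  });

  // Stage 2: the returned function traverses the input a single time,
  // tracking all search tree nodes that are still relevant for each key.
  return (haystack) => {
    const matches = [];
    const stack = [{ value: haystack, path: [], nodes: [root] }];
    while (stack.length !== 0) {
      const { value, path, nodes } = stack.pop();
      if (nodes.some((node) => node.needle !== null)) {
        matches.push(path.join('.'));
      }
      if (value !== null && typeof value === 'object' && !Array.isArray(value)) {
        Object.entries(value).forEach(([key, child]) => {
          // follow every branch that accepts this key, exact or wildcard
          const next = nodes
            .flatMap((node) => [node.children.get(key), node.children.get('*')])
            .filter((node) => node !== undefined);
          if (next.length !== 0) {
            stack.push({ value: child, path: [...path, key], nodes: next });
          }
        });
      }
    }
    return matches;
  };
};

// The compiled search is reused across different inputs.
const search = compile(['a.*.f', 'k']);
const first = search({ a: { b: { c: 'd' }, e: { f: 'g' } }, k: 'l' });
const second = search({ a: { x: { f: 1 } } });
console.log(first); // [ 'k', 'a.e.f' ]
console.log(second); // [ 'a.x.f' ]
```

Subtrees that no needle can reach, such as `a.b` in the first input, are never pushed onto the stack, which is one reason a pre-built tree keeps the traversal cheap.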