Fergie's Inverted Index


This is an inverted index library. There are many like it, but this one is Fergie's.

Throw JavaScript objects at the index and they will become retrievable by their properties using promises and map-reduce (see examples).

This library works in Node.js and also in the browser.
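To make the idea of "retrievable by their properties" concrete, here is a minimal in-memory sketch of what an inverted index does. It is not this library's implementation: `buildToyIndex` and the `field:value` key layout are illustrative assumptions.

```javascript
// Toy inverted index: maps each "field:value" token to the set of
// document ids that contain it. Illustrative only, not the library's code.
function buildToyIndex (docs) {
  const index = new Map()
  for (const doc of docs) {
    for (const [field, value] of Object.entries(doc)) {
      if (field === '_id') continue // the id is the payload, not a token
      const token = `${field}:${value}`
      if (!index.has(token)) index.set(token, new Set())
      index.get(token).add(doc._id)
    }
  }
  return index
}

const toyIndex = buildToyIndex([
  { _id: 1, land: 'SCOTLAND', colour: 'GREEN' },
  { _id: 2, land: 'SCOTLAND', colour: 'BLUE' },
  { _id: 3, land: 'IRELAND', colour: 'GREEN' }
])

// Look up ids by property instead of scanning every document
const scottishIds = [...toyIndex.get('land:SCOTLAND')]
console.log(scottishIds) // [ 1, 2 ]
```

The lookup cost depends on the number of matching ids rather than on the total number of documents, which is the main reason to build such an index.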

Getting started

Initialise and populate an index

import fii from 'fergies-inverted-index'

const db = await fii()

db.PUT([ /* my array of objects to be searched */ ]).then(doStuff)

Query the index

// (given objects that contain: { land: <land>, colour: <colour>, population: <number> ... })

// get all object IDs where land=SCOTLAND and colour=GREEN
db.AND(['land:SCOTLAND', 'colour:GREEN']).then(result)

// the query strings above can alternatively be expressed using JSON objects
db.AND([
  {
    FIELD: 'land',
    VALUE: 'SCOTLAND'
  }, {
    FIELD: 'colour',
    VALUE: 'GREEN'
  }
]).then(result)

// as above, but return whole objects
db.AND(['land:SCOTLAND', 'colour:GREEN']).then(db.OBJECT).then(result)

// Get all object IDs where land=SCOTLAND, and those where land=IRELAND
db.OR(['land:SCOTLAND', 'land:IRELAND']).then(result)

// queries can be embedded within each other
db.AND([
  'land:SCOTLAND',
  db.OR(['colour:GREEN', 'colour:BLUE'])
]).then(result)

// get all object IDs where land=SCOTLAND and colour is NOT GREEN
db.NOT(
  db.GET('land:SCOTLAND'),                 // everything in this set
  db.GET('colour:GREEN')                   // minus everything in this set
).then(result)

// Get max population
db.MAX('population').then(result)

(See the tests for more examples.)

API

fii(options)

Returns a promise

import fii from 'fergies-inverted-index'

// creates a DB called "myDB" using levelDB (node.js), or indexedDB (browser)
const db = await fii({ name: 'myDB' })

In some cases you will want to start operating on the database instantaneously. In these cases you can wait for the callback:

import fii from 'fergies-inverted-index'

// creates a DB called "myDB" using levelDB (node.js), or indexedDB (browser)
fii({ name: 'myDB' }, (err, db) => {
  // db is guaranteed to be open and available
})

db.AGGREGATION_FILTER(aggregation, query).then(result)

The aggregation (either FACETS or BUCKETS) is filtered by the query

Promise.all([
  db.FACETS({
    FIELD: ['drivetrain', 'model']
  }),
  db.AND(['colour:Black'])
])
  .then(([facetResult, queryResult]) =>
    db.AGGREGATION_FILTER(facetResult, queryResult)
  )
  .then(result)
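Conceptually, filtering an aggregation by a query means keeping, in each facet or bucket, only the ids that also appear in the query result. The sketch below illustrates that idea on plain arrays. The result shapes (`{ FIELD, VALUE, _id: [...] }` for aggregations and `{ _id }` for query hits) and the helper `filterAggregation` are assumptions made for illustration, not something this README confirms.

```javascript
// Assumed shapes, for illustration only
const facetResult = [
  { FIELD: 'drivetrain', VALUE: 'Petrol', _id: [1, 2, 3] },
  { FIELD: 'drivetrain', VALUE: 'Hybrid', _id: [4, 5] }
]
const queryResult = [{ _id: 2 }, { _id: 4 }, { _id: 9 }]

// Keep only the ids in each aggregation that the query also matched
function filterAggregation (aggregation, query) {
  const keep = new Set(query.map(hit => hit._id))
  return aggregation.map(group => ({
    ...group,
    _id: group._id.filter(id => keep.has(id))
  }))
}

const filtered = filterAggregation(facetResult, queryResult)
console.log(filtered.map(group => group._id)) // [ [ 2 ], [ 4 ] ]
```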

db.AND([ ...token ]).then(result)

db.AND returns a set of object IDs that match every clause in the query.

For example, get the set of objects where the land property is set to scotland, year is 1975 and color is blue:

db.AND([ 'land:scotland', 'year:1975', 'color:blue' ]).then(result)
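Matching every clause amounts to a set intersection over the ids found for each token. The following sketch shows that operation on hypothetical data: `postings` and `intersectAll` are made-up names, and nothing here is taken from the library's source.

```javascript
// Hypothetical posting lists: token -> ids of objects containing it
const postings = {
  'land:scotland': [1, 2, 3, 5],
  'year:1975': [2, 3, 4],
  'color:blue': [3, 5, 2]
}

// Keep only the ids that are present in every clause's posting list
function intersectAll (tokens) {
  if (tokens.length === 0) return new Set()
  return tokens
    .map(token => new Set(postings[token] || []))
    .reduce((acc, set) => new Set([...acc].filter(id => set.has(id))))
}

const andIds = [...intersectAll(['land:scotland', 'year:1975', 'color:blue'])]
console.log(andIds) // [ 2, 3 ]
```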

db.BUCKETS( ...token ).then(result)

Every bucket returns all object ids for objects that contain the given token

db.BUCKETS(
  {
    FIELD: ['year'],
    VALUE: {
      LTE: 2010
    }
  },
  {
    FIELD: ['year'],
    VALUE: {
      GTE: 2010
    }
  }
).then(result)

db.CREATED().then(result)

Returns the timestamp that indicates when the index was created

db.CREATED().then(result)

db.DELETE([ ...id ]).then(result)

Deletes all objects with the given ids. The result indicates whether the delete operation was successful.

db.DELETE([ 1, 2, 3 ]).then(result)

db.DISTINCT(options).then(result)

db.DISTINCT returns every value in the db that is greater than or equal to GTE and less than or equal to LTE (sorted alphabetically).

For example, get all names between h and l:

db.DISTINCT({ GTE: 'h', LTE: 'l' }).then(result)
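The range semantics can be sketched with ordinary string comparison. This is only an illustration under the assumption of plain lexicographic ordering. `distinctInRange` and the sample names are hypothetical, and the library's exact behaviour at the range boundaries is not claimed here.

```javascript
const names = ['alice', 'harry', 'ian', 'lucy', 'l', 'kate', 'harry', 'zed']

// De-duplicate, keep values with GTE <= value <= LTE, sort alphabetically
function distinctInRange (values, { GTE, LTE }) {
  return [...new Set(values)]
    .filter(value => value >= GTE && value <= LTE)
    .sort()
}

const inRange = distinctInRange(names, { GTE: 'h', LTE: 'l' })
console.log(inRange) // [ 'harry', 'ian', 'kate', 'l' ]
```

Note that under plain string comparison 'lucy' sorts after 'l', so an upper bound of 'l' excludes it. An upper bound such as 'm' would be needed to include every name starting with l.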

db.EXIST( ...id ).then(result)

Indicates whether the documents with the given ids exist in the index

db.EXIST(1, 2, 3).then(result)

db.EXPORT().then(result)

Exports the index to a text file. See also IMPORT.

db.EXPORT().then(result)

db.FACETS( ...token ).then(result)

Creates an aggregation for each value in the given range. FACETS differs from BUCKETS in that FACETS creates one aggregation per value, whereas BUCKETS can create aggregations over ranges of values.

db.FACETS(
  {
    FIELD: 'colour'
  },
  {
    FIELD: 'drivetrain'
  }
).then(result)
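The difference between the two kinds of aggregation can be sketched on a small array. `toyFacets` and `toyBucket` are hypothetical helpers, and the `{ FIELD, VALUE, _id }` output shape is an assumption made for illustration rather than a statement about what the library returns.

```javascript
const cars = [
  { _id: 1, year: 2004 },
  { _id: 2, year: 2010 },
  { _id: 3, year: 2015 },
  { _id: 4, year: 2004 }
]

// Facet-style: one group of ids per distinct value of the field
function toyFacets (docs, field) {
  const groups = new Map()
  for (const doc of docs) {
    const value = doc[field]
    if (!groups.has(value)) groups.set(value, [])
    groups.get(value).push(doc._id)
  }
  return [...groups].map(([VALUE, _id]) => ({ FIELD: field, VALUE, _id }))
}

// Bucket-style: one group of ids for a whole range of values
function toyBucket (docs, field, { GTE = -Infinity, LTE = Infinity }) {
  return {
    FIELD: field,
    VALUE: { GTE, LTE },
    _id: docs
      .filter(doc => doc[field] >= GTE && doc[field] <= LTE)
      .map(doc => doc._id)
  }
}

const facets = toyFacets(cars, 'year')
const bucket = toyBucket(cars, 'year', { LTE: 2010 })
console.log(facets.map(f => f._id)) // [ [ 1, 4 ], [ 2 ], [ 3 ] ]
console.log(bucket._id) // [ 1, 2, 4 ]
```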

db.FIELDS().then(result)

db.FIELDS returns all available fields

db.FIELDS().then(result) // 'result' is an array containing all available fields

db.GET(token).then(result)

db.GET returns all object ids for objects that contain the given property, aggregated by object id.

For example to get all Teslas do:

db.GET('Tesla').then(result)  // get all documents that contain Tesla, somewhere in their structure

Perhaps you want to be more specific and only return documents that contain Tesla in the make FIELD

db.GET('make:Tesla').then(result)

which is equivalent to:

db.GET({
  FIELD: 'make',
  VALUE: 'Tesla'
}).then(result)

To get all cars whose make begins with a letter from O to V:

db.GET({
  FIELD: 'make',
  VALUE: {
    GTE: 'O',   // GTE == greater than or equal to
    LTE: 'V'    // LTE == less than or equal to
  }
}).then(result)

db.IMPORT(exportedIndex).then(result)

Reads in an exported index and returns a status.

See also EXPORT.

db.IMPORT(exportedIndex).then(result)

db.LAST_UPDATED().then(result)

Returns a timestamp indicating when the index was last updated.

db.LAST_UPDATED().then(result)

db.MAX(token).then(result)

Get the highest alphabetical value in a given token

For example, get the highest price:

db.MAX('price').then(result)

db.MIN(token).then(result)

Get the lowest alphabetical value in a given token

For example, get the lowest price:

db.MIN('price').then(result)
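The word "alphabetical" matters when values look like numbers. The sketch below shows how plain string comparison differs from numeric comparison in JavaScript. It assumes values are compared as strings, which is an inference from the wording of the MAX and MIN descriptions and may not reflect how the library treats numeric fields.

```javascript
// Hypothetical prices stored as strings
const prices = ['81177', '3751', '9000']

// String comparison goes character by character, so '9000' > '81177'
const alphabeticalMax = prices.reduce((a, b) => (a > b ? a : b))

// Numeric comparison gives the answer most people expect for prices
const numericMax = Math.max(...prices.map(Number))

console.log(alphabeticalMax) // '9000'
console.log(numericMax) // 81177
```

If the two orderings disagree for your data, zero-padding numeric strings to a fixed width is one common way to make them agree.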

db.NOT(A, B).then(result)

Where A and B are sets, db.NOT returns the ids of objects that are present in A, but not in B.

For example:

db.NOT(
  db.GET({
    FIELD: 'sectorcode',
    VALUE: {
      GTE: 'A',
      LTE: 'G'
    }
  }),
  'sectorcode:YZ'
).then(result)
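"Present in A, but not in B" is a set difference. A minimal sketch on plain id arrays follows. The `difference` helper and the sample ids are hypothetical and say nothing about how the library computes the result.

```javascript
// Return the ids in a that do not appear in b
function difference (a, b) {
  const exclude = new Set(b)
  return a.filter(id => !exclude.has(id))
}

const setA = [1, 2, 3, 4] // e.g. ids matched by the first argument
const setB = [2, 4, 7] // e.g. ids matched by the second argument

const notIds = difference(setA, setB)
console.log(notIds) // [ 1, 3 ]
```

The operation is not symmetric: swapping the arguments here would give [ 7 ].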

db.OBJECT([ ...id ]).then(result)

Given an array of ids, db.OBJECT will return the corresponding objects.

db.AND([
  'board_approval_month:October',
  db.OR([
    'sectorcode:LR',
    db.AND(['sectorcode:BC', 'sectorcode:BM'])
  ])
])
  .then(db.OBJECT)
  .then(result)

db.OR([ ...tokens ]).then(result)

Return ids of objects that are in one or more of the query clauses

For example, get the set of objects where the land property is set to scotland, or year is 1975, or color is blue:

db.OR([ 'land:scotland', 'year:1975', 'color:blue' ]).then(result)
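Matching one or more clauses corresponds to a set union over the ids found for each token. The sketch below uses hypothetical data and a made-up `unionAll` helper. It illustrates the set operation only and is not the library's code.

```javascript
// Hypothetical posting lists: token -> ids of objects containing it
const postingLists = {
  'land:scotland': [1, 2],
  'year:1975': [2, 4],
  'color:blue': [5]
}

// Collect every id that appears under at least one of the tokens
function unionAll (tokens) {
  const ids = new Set()
  for (const token of tokens) {
    for (const id of postingLists[token] || []) ids.add(id)
  }
  return ids
}

const orIds = [...unionAll(['land:scotland', 'year:1975', 'color:blue'])]
  .sort((a, b) => a - b)
console.log(orIds) // [ 1, 2, 4, 5 ]
```

Id 2 appears under two tokens but is reported once, because the union is taken over a Set.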

db.PUT([ ...documents ]).then(result)

Add documents to index

For example:

db.PUT([
  {
    _id: 8,
    make: 'BMW',
    colour: 'Silver',
    year: 2015,
    price: 81177,
    model: '3-series',
    drivetrain: 'Petrol'
  },
  {
    _id: 9,
    make: 'Volvo',
    colour: 'White',
    year: 2004,
    price: 3751,
    model: 'XC90',
    drivetrain: 'Hybrid'
  }
]).then(result)

db.SORT(resultSet).then(result)

Example:

db.GET('blue').then(db.SORT)

db.STORE

Property that points to the underlying level store
