jimlambie
Contributor

Description

This PR introduces a search endpoint and indexing routine.

Configuration

A search block must exist in the configuration file:

"search": {
  "enabled": true,
  "minQueryLength": 3,
  "datastore": "@dadi/api-mongodb",
  "database": "search"
}

minQueryLength

The minimum number of characters a query must contain before a search is executed.

datastore

The data connector to use to store search collections. Currently supports @dadi/api-mongodb only.

database

The database in which to store words and search instances.
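
As a rough illustration of how `minQueryLength` might gate a request, the sketch below rejects queries shorter than the configured length. `validateQuery` and its return shape are hypothetical and are not taken from this PR.

```javascript
// Hypothetical helper, not part of this PR. It mirrors the "search" config
// block above and collects validation errors for a query string.
const searchConfig = { enabled: true, minQueryLength: 3 }

function validateQuery (query, config) {
  const errors = []

  if (typeof query !== 'string' || query.trim().length < config.minQueryLength) {
    errors.push(`Query must be at least ${config.minQueryLength} characters`)
  }

  return { errors }
}

console.log(validateQuery('ha', searchConfig).errors.length) // 1
console.log(validateQuery('harry', searchConfig).errors.length) // 0
```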

Indexable fields

To enable indexing you must specify a search block for each field you'd like indexed within the collection schema, including the weight:

{
  "fields": {
    "title": {
      "type": "String",
      "label": "Title",
      "search": {
        "weight": 2
      }
    }
  }
}

Running a query

Query an indexed collection by appending /search to the collection's endpoint and including a q parameter in the querystring:

https://somedomain.com/1.0/my-db/books/search?q=harry wizard

Field filters can be applied in the same way as collection filtering:

https://somedomain.com/1.0/my-db/books/search?q=harry wizard&fields={"title": 1}
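
The example URLs above contain a space and a JSON object, both of which need encoding when a client sends the request. A minimal sketch of building such a URL with Node's built-in `URL` class follows; `buildSearchUrl` is a hypothetical helper, and the host and path are the placeholder values from the examples.

```javascript
// Sketch only: builds the search URL shown above using Node's WHATWG URL API.
// buildSearchUrl is a hypothetical helper, not something this PR provides.
function buildSearchUrl (collectionUrl, query, fields) {
  const url = new URL(`${collectionUrl}/search`)

  url.searchParams.set('q', query)

  if (fields) {
    // Field filters are passed as a JSON-encoded object.
    url.searchParams.set('fields', JSON.stringify(fields))
  }

  return url.toString()
}

const searchUrl = buildSearchUrl(
  'https://somedomain.com/1.0/my-db/books',
  'harry wizard',
  { title: 1 }
)

console.log(searchUrl)
```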

@eduardoboucas changed the title from "Actual Search" to "Add search" on Apr 5, 2018
@abovedave mentioned this pull request on Jun 1, 2018
@jimlambie added 5 commits on June 8, 2018, 11:37
…arch

# Conflicts:
#	dadi/lib/controller/index.js
#	dadi/lib/model/index.js
#	dadi/lib/search/index.js
#	package.json
#	test/acceptance/db-connection.js
#	test/acceptance/search_collections.js
#	workspace/collections/vjoin/testdb/collection.books.json
…arch

# Conflicts:
#	README.md
#	dadi/lib/index.js
#	dadi/lib/model/delete.js
#	dadi/lib/model/index.js
#	dadi/lib/model/update.js
#	dadi/lib/search/index.js
#	package.json
#	test/acceptance/search_collections.js
#	test/test-connector/index.js
config.js Outdated
default: 3
},
wordCollection: {
doc: '',
Contributor

🤔

config.js Outdated
default: 'words'
},
datastore: {
doc: "",
Contributor

😑

config.js Outdated
default: '@dadi/api-mongodb'
},
database: {
doc: '',
Contributor

😱

let queryOptions = this._prepareQueryOptions(options)

if (queryOptions.errors.length !== 0) {
  sendBackJSON(400, res, next)(null, queryOptions)
Contributor

In this case, don't we want to return?
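
To illustrate the concern, here is a minimal sketch of the early return. The `sendBackJSON` below is a simplified stand-in that only records responses (the real helper also takes `res` and `next`), and `search` is a hypothetical wrapper around the lines under review.

```javascript
// Stand-in for the real response helper: records each response sent.
const sent = []
const sendBackJSON = (status) => (err, body) => { sent.push({ status, body }) }

function search (queryOptions) {
  if (queryOptions.errors.length !== 0) {
    // Returning here stops the handler; without it, execution would fall
    // through and a second response would be attempted below.
    return sendBackJSON(400)(null, queryOptions)
  }

  return sendBackJSON(200)(null, { results: [] })
}

search({ errors: ['Bad query'] })
console.log(sent.map(response => response.status)) // [ 400 ]
```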

})

return help.sendBackJSON(200, res, next)(null, results)
}).catch(error => {
Contributor

I believe you don't need this catch, since you have one on the parent Promise.

* @param {Object} options - options to use in the query
* @return {Promise}
*/
// Search.prototype.insert = function (datastore, data, collection, schema, options = {}) {
Contributor

Is this meant to be commented out?

if (!Object.keys(this.indexableFields).length) return

let skip = (page - 1) * limit
console.log(`Indexing page ${page} (${limit} per page)`)
Contributor

Should this be a debug()?

settings: this.model.settings
}).then(results => {
  if (results.results && results.results.length) {
    console.log(`Indexed ${results.results.length} ${results.results.length === 1 ? 'record' : 'records'} for ${this.model.name}`)
Contributor

Should this be a debug()?


if (results.results.length > 0) {
  this.index(results.results).then(response => {
    console.log(`Indexed page ${options.page}/${results.metadata.totalPages}`)
Contributor

Should this be a debug()?

@@ -1,4 +1,3 @@
--bail
Contributor

Can we not commit this change? I think it's useful if tests fail early in the CI.


coveralls commented Jul 17, 2018

Pull Request Test Coverage Report for Build 1472

  • 257 of 279 (92.11%) changed or added relevant lines in 9 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.06%) to 89.018%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| dadi/lib/controller/index.js | 20 | 21 | 95.24% |
| dadi/lib/search/analysers/standard.js | 49 | 51 | 96.08% |
| dadi/lib/search/index.js | 154 | 160 | 96.25% |
| dadi/lib/controller/searchIndex.js | 8 | 21 | 38.1% |

| Totals | |
|---|---|
| Change from base Build 1460 | 0.06% |
| Covered Lines | 3507 |
| Relevant Lines | 3830 |

💛 - Coveralls

let err

if (typeof options === 'function') {
  // done = options
Contributor

This can go, right?

Contributor Author

it's gone!


areValidWords (words) {
return Array.isArray(words) &&
words.every(word => {
Contributor

Indentation here looks a bit off?

* N.B. May only be used with the MongoDB Data Connector.
*/
const Search = function (model) {
  if (!model || model.constructor.name !== 'Model') throw new Error('model should be an instance of Model')
Contributor

New line? 😬

}
}
}).catch(err => {
  console.log(err)
Contributor

What happens if we go into this catch? We're swallowing the error, so doesn't the request hang?

Contributor Author

return Promise.reject(err) <- ?

Contributor

Unless we want to handle this error in a special way, I would just remove the catch and let the error be handled upstream. But it needs confirmation (and ideally a unit test); I'm just being that annoying guy who points out potential issues just by looking at the code.

Contributor Author

Noted.
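
For reference, a small sketch of the difference discussed in this thread. `failingIndex` is a hypothetical stand-in for the indexing call; the point is only how a logging-only catch changes what the caller sees.

```javascript
// Hypothetical stand-in for an indexing step that fails.
const failingIndex = () => Promise.reject(new Error('index failed'))

// Logging in the catch and returning nothing swallows the error: the chain
// resolves with undefined and the caller never learns that indexing failed.
const swallowed = () => failingIndex().catch(err => { console.log(err.message) })

// Without a catch (or with one that rethrows), the rejection reaches the caller.
const propagated = () => failingIndex()

swallowed().then(value => console.log('resolved with', value))
propagated().catch(err => console.log('caller saw:', err.message))
```

Whether removing the catch or rethrowing is the better fit depends on how the upstream handler reports errors, which the thread leaves open.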

}

if (!Array.isArray(documents)) {
  return
Contributor

Ignore if this doesn't make sense, but is it okay to be returning undefined here where the other exit routes return a Promise? i.e. isn't there a risk that we'll add a .then() somewhere upstream, which will throw an error if the subject is undefined?

Contributor Author

I don't actually see us calling this method from anywhere.

@mingard any pointers to where I might find this being called?
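
A minimal sketch of the consistent-return-type option raised above. `deleteDocuments` is a hypothetical stand-in for the method under review, and the per-document work is reduced to resolving the document id.

```javascript
// Hypothetical stand-in for the delete routine under review.
function deleteDocuments (documents) {
  if (!Array.isArray(documents)) {
    // Resolving keeps the return type consistent with the other exit routes.
    return Promise.resolve()
  }

  return Promise.all(documents.map(doc => Promise.resolve(doc._id)))
}

// Both calls can be chained safely. With a bare `return` in the guard, the
// first call would throw a TypeError because undefined has no .then().
deleteDocuments(null).then(result => console.log(result)) // undefined
deleteDocuments([{ _id: 'a' }]).then(result => console.log(result)) // [ 'a' ]
```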


debug('deleting documents from the %s index', this.searchCollection)

let deleteQueue = documents.map(document => this.clearDocumentInstances(document._id.toString()))
Contributor

New line? 😬

Contributor Author

Presumably you mean:

  let deleteQueue = documents.map(document => {
    this.clearDocumentInstances(document._id.toString())
  })

Contributor

  let deleteQueue = documents.map(document => {
    return this.clearDocumentInstances(document._id.toString())
  })

(You were missing the return)
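
The effect of the missing `return` can be seen in a short, self-contained sketch. The `clearDocumentInstances` here is a hypothetical stand-in that simply resolves with the id it receives.

```javascript
// Hypothetical stand-in that resolves with the id it was asked to clear.
const clearDocumentInstances = id => Promise.resolve(id)
const documents = [{ _id: 1 }, { _id: 2 }]

// Braces without `return`: every element of the queue is undefined.
const withoutReturn = documents.map(document => {
  clearDocumentInstances(document._id.toString())
})

// With `return`, the queue holds the promises, so Promise.all can wait on them.
const withReturn = documents.map(document => {
  return clearDocumentInstances(document._id.toString())
})

console.log(withoutReturn) // [ undefined, undefined ]
Promise.all(withReturn).then(ids => console.log(ids)) // [ '1', '2' ]
```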

Search.prototype.getIndexableFields = function () {
  let schema = this.model.schema

  return Object.assign({}, ...Object.keys(schema)
Contributor

I find this return statement super confusing 😞
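
One possible, more explicit shape for this kind of function is sketched below. It is not the PR's implementation: it assumes that indexable fields are those whose schema entry has a `search` block and that the result maps each field name to that block. The function takes the schema as an argument here purely to keep the example self-contained.

```javascript
// Assumption: indexable fields are those whose schema entry has a "search"
// block, and the result maps each field name to that block.
function getIndexableFields (schema) {
  const indexableFields = {}

  Object.keys(schema).forEach(key => {
    if (schema[key] && schema[key].search) {
      indexableFields[key] = schema[key].search
    }
  })

  return indexableFields
}

const schema = {
  title: { type: 'String', label: 'Title', search: { weight: 2 } },
  isbn: { type: 'String', label: 'ISBN' }
}

console.log(getIndexableFields(schema)) // { title: { weight: 2 } }
```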

Search.prototype.removeNonIndexableFields = function (document) {
  if (typeof document !== 'object') return {}

  return Object.assign({}, ...Object.keys(document)
Contributor

I find this return statement super confusing 😞

settings: this.getWordSchema().settings
}).then(results => {
  // Get all word instances from Analyser
  this.clearDocumentInstances(docId).then(response => {
Contributor

Shouldn't we return this.clearDocumentInstances(docId)?
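
A sketch of why the `return` matters for ordering. The `clearDocumentInstances` below is a hypothetical stand-in that finishes on a later tick, and `log` records the order in which things happen.

```javascript
const log = []

// Hypothetical stand-in: records when clearing finishes, on a later tick.
const clearDocumentInstances = () => new Promise(resolve => {
  setTimeout(() => { log.push('cleared'); resolve() }, 10)
})

// Not returned: the outer chain does not wait for the clearing step.
const withoutReturn = () => Promise.resolve().then(() => {
  clearDocumentInstances()
}).then(() => log.push('done (no return)'))

// Returned: the outer chain resumes only after clearing completes.
const withReturn = () => Promise.resolve().then(() => {
  return clearDocumentInstances()
}).then(() => log.push('done (return)'))

const demo = withoutReturn()
  .then(() => withReturn())
  .then(() => log)

demo.then(entries => console.log(entries))
```

In the first chain "done" is logged before "cleared"; in the second it is logged after, which is the behaviour a caller waiting on the index would expect.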

options: options,
schema: this.model.schema,
settings: this.model.settings
}).then(results => {
Contributor

Minor thing: perhaps we could use object destructuring here to avoid writing results.results, so:

// (...)
}).then(({metadata, results}) => {
  if (results && results.length) {
// (...)

}

// 404 if Search is not enabled
if (config.get('search.enabled') === false) {
Contributor

Should we be more strict and do if (config.get('search.enabled') !== true) {?

Contributor Author

done
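
The difference between the two checks shows up when the value is missing rather than explicitly `false`. The `config` object below is a hypothetical stub whose getter returns `undefined`; the real configuration may always supply a default, in which case the two checks behave the same.

```javascript
// Hypothetical config getter: "search.enabled" has not been set at all.
const config = { get: key => ({})[key] }

const looseCheckBlocks = config.get('search.enabled') === false
const strictCheckBlocks = config.get('search.enabled') !== true

console.log(looseCheckBlocks) // false: an unset value slips past the 404
console.log(strictCheckBlocks) // true: anything other than true returns 404
```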

return Promise.resolve()
}

if (!Array.isArray(documents)) {
Contributor

Could we merge this with the first if, perhaps?

Contributor Author

ok

@jimlambie merged commit 60230f4 into develop on Aug 1, 2018
@jimlambie deleted the feature/search branch on August 1, 2018, 11:10