An `entities` collection contains suggestable items. These items can come from any source. Entities can be extracted from a main collection using an aggregation pipeline that maps them into the schema described below and ends with a `$merge` stage. See "Examples" below for concrete pipelines.
- `_id`: Unique, stable identifier for the entity
- `type`: Entity type
- `name`: Text string name of the entity
- other fields: as needed for filtering or relevancy
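As a rough sketch of such an extraction pipeline, the following mongosh-style JavaScript builds entities from a `movies` collection. The helper name `entityPipeline`, the `<type>-<name>` form of `_id`, and the `$merge` options are illustrative assumptions, not necessarily what the bundled examples use.

```javascript
// Hypothetical helper: builds a pipeline that turns one source field into entities.
// `isArray` says whether the source field holds an array (for example `cast`).
function entityPipeline(type, field, isArray = false) {
  const stages = [];
  if (isArray) {
    stages.push({ $unwind: `$${field}` }); // one document per array element
  }
  stages.push(
    // Keep only string values, then collapse duplicates into one entity per name.
    { $match: { [field]: { $type: "string" } } },
    { $group: { _id: `$${field}` } },
    // Map into the entity schema: _id, type, name.
    {
      $project: {
        _id: { $concat: [type, "-", "$_id"] }, // assumed id scheme
        type: { $literal: type },
        name: "$_id",
      },
    },
    // $merge has to be the last stage; re-running replaces matching entities.
    { $merge: { into: "entities", on: "_id", whenMatched: "replace", whenNotMatched: "insert" } }
  );
  return stages;
}

const titlePipeline = entityPipeline("title", "title");
const castPipeline = entityPipeline("cast", "cast", true);

// `db` only exists inside mongosh; the guard lets the file load elsewhere too.
if (typeof db !== "undefined") {
  db.movies.aggregate(titlePipeline).toArray(); // toArray() forces execution
  db.movies.aggregate(castPipeline).toArray();
}
```

Prefixing `_id` with the type keeps a title and a cast member with the same text from overwriting each other, which is one way to keep identifiers stable across repeated runs.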
Set up an Atlas Search `entities_index` on the `entities` collection with the following considerations:
- `type`: Index it as a `stringFacet` to allow faceting by entity types. Also, index it as a `token` field type for filtering capability.
- `name`: This field is indexed in numerous "multi"-analyzed ways to facilitate a variety of partial matching techniques.
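A minimal sketch of an index definition along these lines is shown here. The custom analyzer `wordPrefix`, the multi names `exact`, `english`, and `prefix`, and the gram sizes are assumptions chosen for illustration; the actual configuration may use different analyzers.

```javascript
// Sketch of an Atlas Search index definition; names and sizes are assumptions.
const entitiesIndexDefinition = {
  analyzers: [
    {
      // Hypothetical custom analyzer: lowercased word prefixes ("m", "ma", "mat", ...).
      name: "wordPrefix",
      tokenizer: { type: "standard" },
      tokenFilters: [
        { type: "lowercase" },
        { type: "edgeGram", minGram: 1, maxGram: 10 },
      ],
    },
  ],
  mappings: {
    dynamic: false,
    fields: {
      // Same field indexed twice: once for faceting, once for exact-match filtering.
      type: [{ type: "stringFacet" }, { type: "token" }],
      name: {
        type: "string",
        analyzer: "lucene.standard",
        // Each entry under `multi` indexes `name` again with a different analyzer.
        multi: {
          exact: { type: "string", analyzer: "lucene.keyword" },
          english: { type: "string", analyzer: "lucene.english" },
          // Query text is analyzed as whole words so it is not split into prefixes too.
          prefix: { type: "string", analyzer: "wordPrefix", searchAnalyzer: "lucene.standard" },
        },
      },
    },
  },
};

// `db` only exists inside mongosh.
if (typeof db !== "undefined") {
  db.entities.createSearchIndex("entities_index", entitiesIndexDefinition);
}
```

Each `multi` variation can then be targeted at query time with a path of the form `{ value: "name", multi: "<variation>" }`.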
Given a user's query, `$search` the entities with a broad set of optional query clauses across the "multi"-analyzed `name` field variations. Boosts can be attached to each clause to tune the relevancy of results.
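One way such a query might be assembled is sketched below. It assumes the `name` field has multi-analyzed variations named `exact`, `english`, and `prefix` (hypothetical names), and the boost values are arbitrary starting points rather than tuned numbers.

```javascript
// Builds a $search stage with optional (should) clauses over the `name` variations.
// `entityType` is optional; when given, results are restricted to that type.
function buildSuggestStage(query, entityType = null) {
  const clause = (path, boost) => ({
    text: { query, path, score: { boost: { value: boost } } },
  });
  const compound = {
    should: [
      clause({ value: "name", multi: "exact" }, 10), // whole-name match ranks highest
      clause("name", 5),                             // default analyzer
      clause({ value: "name", multi: "english" }, 3), // stemmed match
      clause({ value: "name", multi: "prefix" }, 1),  // partial (as-you-type) match
    ],
    minimumShouldMatch: 1, // at least one optional clause has to match
  };
  if (entityType !== null) {
    // Relies on `type` being indexed as a `token` field.
    compound.filter = [{ equals: { path: "type", value: entityType } }];
  }
  return { $search: { index: "entities_index", compound } };
}

const suggestPipeline = [
  buildSuggestStage("matr"),
  { $limit: 10 },
  { $project: { _id: 0, name: 1, type: 1 } },
];

// `db` only exists inside mongosh.
if (typeof db !== "undefined") {
  printjson(db.entities.aggregate(suggestPipeline).toArray());
}
```

Because every clause is optional, a partial match on one variation is enough to surface an entity, while entities matching several variations accumulate a higher score.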
The `entities` collection can be faceted by `type` (provided the field was configured as `stringFacet`) to provide an overview of the various types matched.
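A facet request of this kind could look roughly like the following sketch. The facet name `types`, the bucket limit, and the simple `text` operator are assumptions for illustration.

```javascript
// Builds a $searchMeta stage that counts matching entities per `type`.
function buildTypeFacetStage(query) {
  return {
    $searchMeta: {
      index: "entities_index",
      facet: {
        // Which entities to count; a plain text match keeps the sketch short.
        operator: { text: { query, path: "name" } },
        facets: {
          // Requires `type` to be indexed as `stringFacet`.
          types: { type: "string", path: "type", numBuckets: 10 },
        },
      },
    },
  };
}

// `db` only exists inside mongosh.
if (typeof db !== "undefined") {
  // The per-type counts are returned under `facet.types.buckets`.
  printjson(db.entities.aggregate([buildTypeFacetStage("matr")]).toArray());
}
```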
- If there is only one type of entity to suggest, a separate `entities` collection isn't needed. The index configuration for the existing collection can be augmented with the analyzers and queried using the techniques described here.
It's easy to get started with this as-you-type suggestion solution using the sample movies data available within Atlas. First, load the sample data into your cluster and make sure `mongosh` is installed, then follow these steps:
- `cd examples/movies`
- Run `mongosh "<connection string>" setup.js`
- Wait until the index has been built and is available. This may take a few minutes, depending on your cluster tier. You can check on the status through the Atlas web UI or with Compass.
- Then run `mongosh "<connection string>" suggest.js`
`suggest.js` emulates a user typing the query that is coded at the top of the file, issuing one query for each additional character of the string. Adjust this string to try other examples. The output is a series of result blocks like this one, each ending with the time its query took to execute:
matr
* The Matrix (title)
* The Matrix Reloaded (title)
* The Matrix Revolutions (title)
* The Matriarch (title)
* The Matrimony (title)
* India: Matri Bhumi (title)
* Mia Madre (title)
* Holy Matrimony (title)
* David Matranga (cast)
* Mother (title)
63ms
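The typing emulation can be approximated with a loop over the prefixes of the query string. This is a simplified sketch, not the contents of `suggest.js`: the function names `prefixes`, `formatSuggestions`, and `emulateTyping` are hypothetical, and the search itself is passed in as a callback.

```javascript
// Every prefix of the text, shortest first: the queries issued while typing.
function prefixes(text) {
  const out = [];
  for (let i = 1; i <= text.length; i++) {
    out.push(text.slice(0, i));
  }
  return out;
}

// One output block: the query, one line per entity, then the elapsed time.
function formatSuggestions(query, entities, elapsedMs) {
  const lines = entities.map((e) => `* ${e.name} (${e.type})`);
  return [query, ...lines, `${elapsedMs}ms`].join("\n");
}

// Runs `search` once per prefix and returns the formatted blocks.
function emulateTyping(text, search) {
  const blocks = [];
  for (const query of prefixes(text)) {
    const start = Date.now();
    const entities = search(query);
    blocks.push(formatSuggestions(query, entities, Date.now() - start));
  }
  return blocks;
}

// `db` and `print` only exist inside mongosh.
if (typeof db !== "undefined") {
  const search = (query) =>
    db.entities
      .aggregate([
        { $search: { index: "entities_index", text: { query, path: "name" } } },
        { $limit: 10 },
      ])
      .toArray();
  for (const block of emulateTyping("matr", search)) {
    print(block);
  }
}
```

Passing the search in as a callback keeps the loop independent of the query shape, so the same harness can time different clause and boost combinations.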
If you need to run `setup.js` again, you may encounter `MongoServerError: Duplicate Index` because the existing `entities_index` lingers briefly before it is automatically removed so that it can be set up again. Wait a few seconds and try again.