Release v1.7.0 - Array datatypes and two new modules (NER, Spellcheck) · weaviate/weaviate

Features

Array Datatypes (#1611)

Starting with this release, primitive object properties are no longer limited to individual values, but can also hold lists of primitives. Array types can be stored, filtered and aggregated in the same way as other primitives.

Auto-schema will automatically recognize lists of string/text and number/int. You can also explicitly specify lists in the schema by using one of the following data types: string[], text[], int[], number[]. A property that is declared as an array must always stay an array, even if it only contains a single element.
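As a rough illustration of the explicit variant, the sketch below builds a class definition with two array properties and an object payload for it. The class name Post and the property names tags and ratings are hypothetical, and the class/properties/dataType layout is assumed to match the schema JSON accepted by your Weaviate version. The as_array helper is also hypothetical: it shows one way to make sure a single value is still sent as a list.

```python
import json

# Hypothetical class definition; "Post", "tags" and "ratings" are illustrative names.
post_class = {
    "class": "Post",
    "properties": [
        {"name": "content", "dataType": ["text"]},
        {"name": "tags", "dataType": ["string[]"]},   # list of strings
        {"name": "ratings", "dataType": ["int[]"]},   # list of integers
    ],
}


def as_array(value):
    """Wrap a single value in a list so an array property always receives a list."""
    return value if isinstance(value, list) else [value]


# Hypothetical object payload: even a single tag is sent as a one-element list.
post_object = {
    "class": "Post",
    "properties": {
        "content": "Hello",
        "tags": as_array("release"),
        "ratings": as_array([4, 5]),
    },
}

print(json.dumps(post_class, indent=2))
print(json.dumps(post_object, indent=2))
```

Normalizing values on the client side like this keeps imports from failing when a source record happens to carry only one element.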

New Module: `text-spellcheck` - Check and auto-correct misspelled search terms (#1606)

Use the new spellchecker module to verify that user-provided search queries (in existing nearText or ask functions) are spelled correctly, and to suggest alternative, correct spellings. Spell-checking happens at query time.

There are two ways to use this module:

It provides a new _additional property, spellCheck, which can be used to check (but not alter) the provided queries.
The following query:

 {
   Get {
     Post(nearText:{
       concepts: "missspelled text"
     }) {
       content
       _additional{
         spellCheck{
           changes{
             corrected
             original
           }
           didYouMean
           location
           originalText
         }
       }
     }
   }
 }

will produce results similar to the following:

   "_additional": {
     "spellCheck": [
       {
         "changes": [
           {
             "corrected": "misspelled",
             "original": "missspelled"
           }
         ],
         "didYouMean": "misspelled text",
         "location": "nearText.concepts[0]",
         "originalText": "missspelled text"
       }
     ]
   },
   "content": "..."
 },

It extends the existing text2vec modules with an autoCorrect flag, which can be used to correct a misspelled query in the background.
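For the first, check-only approach, a client still has to decide what to do with the suggestions. The following is a minimal sketch, assuming the response has the shape of the sample above; the function name suggestions is hypothetical and not part of any Weaviate client.

```python
# Sample data copied from the spellCheck result shown above (_additional part only).
additional = {
    "spellCheck": [
        {
            "changes": [{"corrected": "misspelled", "original": "missspelled"}],
            "didYouMean": "misspelled text",
            "location": "nearText.concepts[0]",
            "originalText": "missspelled text",
        }
    ]
}


def suggestions(additional):
    """Return (location, original, suggestion) for every entry that has changes."""
    found = []
    for entry in additional.get("spellCheck") or []:
        if entry.get("changes"):
            found.append(
                (entry["location"], entry["originalText"], entry["didYouMean"])
            )
    return found


for location, original, suggestion in suggestions(additional):
    print(f"{location}: did you mean {suggestion!r} instead of {original!r}?")
```

Because the check does not alter the query, the application can show the suggestion to the user and only re-run the search if they accept it.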

New Module: `ner-transformers` - Extract entities from Weaviate using transformers (#1632)

Use transformer-based models to extract entities from your existing Weaviate objects on the fly. Entity extraction happens at query time. Note that for maximum performance, transformer-based models should run with GPUs. CPUs can be used, but the throughput will be lower.

To make use of the module's capabilities, simply extend your query with the following new _additional property:

{
  Get {
    Post {
      content
      _additional {
        tokens(
          properties: ["content"],    # is required
          limit: 10,                  # optional, int
          certainty: 0.8              # optional, float
        ) {
          certainty
          endPosition
          entity
          property
          startPosition
          word
        }
      }
    }
  }
}

It will return results similar to the following:

 "_additional": {
   "tokens": [
     {
       "property": "content",
       "entity": "PER",
       "certainty": 0.9894614815711975,
       "word": "Sarah",
       "startPosition": 11,
       "endPosition": 16
     },
     {
       "property": "content",
       "entity": "LOC",
       "certainty": 0.7529033422470093,
       "word": "London",
       "startPosition": 31,
       "endPosition": 37
     }
   ]
 }
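The returned tokens can be post-processed on the client, for example to keep only confident matches and to map them back onto the source text. The sketch below makes two assumptions that the release notes do not state: the content string is invented so that the sample offsets line up, and endPosition is treated as exclusive, which is consistent with the sample values (16 - 11 equals the length of "Sarah"). The function name entities is hypothetical.

```python
# Assumed property value; the release notes do not show the actual content.
content = "My name is Sarah and I live in London"

# Token data copied from the sample result above.
tokens = [
    {"property": "content", "entity": "PER", "certainty": 0.9894614815711975,
     "word": "Sarah", "startPosition": 11, "endPosition": 16},
    {"property": "content", "entity": "LOC", "certainty": 0.7529033422470093,
     "word": "London", "startPosition": 31, "endPosition": 37},
]


def entities(tokens, min_certainty=0.0):
    """Return (entity, word, start, end) for tokens at or above min_certainty."""
    return [
        (t["entity"], t["word"], t["startPosition"], t["endPosition"])
        for t in tokens
        if t["certainty"] >= min_certainty
    ]


for entity, word, start, end in entities(tokens, min_certainty=0.8):
    # With exclusive end offsets, slicing the text reproduces the word.
    print(entity, word, content[start:end] == word)
```

Filtering on the client is an alternative to the certainty argument of the query when the same result set should be viewed with several thresholds.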

Fixes

Aggregation can get stuck when aggregating number datatypes (#1660)
