
Add neural_sparse #84

Closed
Tracked by #86
esmarkowski opened this issue Mar 16, 2024 · 0 comments · Fixed by #92

esmarkowski commented Mar 16, 2024

Add neural_sparse to provide sparse vector search and convenient access to the search results.

class OnboardingDoc < StretchyModel
	attribute :body, :text
	attribute :status, :string
	attribute :filename, :keyword
	attribute :path, :keyword
	attribute :owner, :keyword
	attribute :client, :keyword
	attribute :embedding, :rank_features

	default_pipeline :nlp_sparse_pipeline, model_id: 'q32Pw02BJ3squ3VZa'

end
question = "What clients are due for a monthly review?"
context = OnboardingDoc.neural_sparse(embedding: question)
#=>
   {
     "_index" : "onboarding_docs",
     "_id" : "1",
     "_score" : 30.0029,
     "_source" : {
       "body" : "Perform monthly review by QA team",
       "embedding" : {
         "review" : 0.8708904,
         "monthly" : 0.8587369,
         "QA" : 2.3929274,
         "team" : 2.7839446,
         "weekly" : 0.75845814,
       },
       "id" : "s1"
     }
   }

context.embedding.review
#=> 0.8708904
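For context, the default_pipeline declared above refers to an OpenSearch ingest pipeline that must already exist. A minimal sketch of what nlp_sparse_pipeline could look like, assuming the standard sparse_encoding processor is used to encode body into the embedding rank_features field (the description and field mapping here are illustrative, not part of this proposal):

PUT /_ingest/pipeline/nlp_sparse_pipeline
{
  "description": "Generates sparse vector embeddings for onboarding docs",
  "processors": [
    {
      "sparse_encoding": {
        "model_id": "q32Pw02BJ3squ3VZa",
        "field_map": {
          "body": "embedding"
        }
      }
    }
  ]
}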

Neural Sparse

model.neural_sparse [1]

query_text (String, required): The query text from which to generate vector embeddings.

model_id (String, required): The ID of the sparse encoding model or tokenizer model that will be used to generate vector embeddings from the query text. The model must be deployed in OpenSearch before it can be used in sparse neural search. For more information, see Using custom models within OpenSearch and Neural sparse search.

max_token_score (Float, optional, deprecated): The theoretical upper bound of the score for all tokens in the vocabulary (required for performance optimization). For OpenSearch-provided pretrained sparse embedding models, we recommend setting max_token_score to 2 for amazon/neural-sparse/opensearch-neural-sparse-encoding-doc-v1 and to 3.5 for amazon/neural-sparse/opensearch-neural-sparse-encoding-v1. This field has been deprecated as of OpenSearch 2.12.
# General form: model.neural_sparse(<embedding_field>: <query_text>, **options)

model.neural_sparse(passage_embedding: 'Hi world')
# or
model.neural_sparse(passage_embedding: 'Hi world', model_id: 'aP2Q8ooBpBj3wT4HVS8a')
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world",
        "model_id": "aP2Q8ooBpBj3wT4HVS8a"
      }
    }
  }
}
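
Other options from the parameter list above could presumably be forwarded the same way. For example, passing max_token_score (this is only a sketch, assuming the gem merges any extra keyword options straight into the neural_sparse clause, as it already does with model_id):

model.neural_sparse(passage_embedding: 'Hi world', model_id: 'aP2Q8ooBpBj3wT4HVS8a', max_token_score: 2)
{
  "query": {
    "neural_sparse": {
      "passage_embedding": {
        "query_text": "Hi world",
        "model_id": "aP2Q8ooBpBj3wT4HVS8a",
        "max_token_score": 2
      }
    }
  }
}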

Footnotes

  1. https://opensearch.org/docs/latest/query-dsl/specialized/neural-sparse/
