GitHub - mongodb-developer/Triggers-Sentence-Transformers: This trigger demonstrates how you can automatically update document embeddings whenever a new document is inserted or modified in a specific collection.

Atlas Triggers And HuggingFace Sentence Transformers

The sample Python code uses the all-MiniLM-L6-v2 sentence transformer model from HuggingFace. The model maps each sentence (both the documents inserted into the collection and the query string) to a 384-dimensional dense vector space and produces a vector embedding (a list of numbers) for each one. This tutorial requires basic knowledge of Python and assumes that you have an existing Atlas cluster. A simple vector search demo follows.

Steps:

Install pymongo:

pip3 install pymongo

Install sentence-transformers:

pip3 install -U sentence-transformers

Run the following code to create the test collection with corresponding vector embeddings:

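import pymongo
from sentence_transformers import SentenceTransformer

# Replace the placeholder with the SRV connection string of your Atlas cluster.
client = pymongo.MongoClient("SRV URL TO YOUR ATLAS CLUSTER")
db = client.vector_tests

# Load the embedding model (it produces 384-dimensional vectors).
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')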

docs = [
    "The students studied for their exams.",
    "Studying hard, the students prepared for their exams.",
    "The chef cooked a delicious meal.",
    "The chef cooked the chicken with the vegetables.",
    "Known for its power and aggression, Mike Tyson's boxing style was feared by many."
]

print(docs)

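for doc in docs:
    # encode() returns a NumPy array; convert it to a plain list so it can be stored.
    doc_vector = model.encode(doc).tolist()
    # Build a fresh document per sentence instead of mutating a shared dict.
    result_doc = {'sentence': doc, 'vectorEmbedding': doc_vector}
    result = db.vectors_demo_1.insert_one(result_doc)
    print(result.inserted_id)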
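
The loop above issues one insert_one call per sentence. As a rough sketch of an alternative, the hypothetical helper build_documents below pairs each sentence with its embedding so that the whole batch could be written in a single call. The helper name and the fake_encode stand-in (used only so the snippet runs without downloading the model) are assumptions for illustration and are not part of this repository.

```python
from typing import Callable, Dict, List, Sequence


def build_documents(
    sentences: Sequence[str],
    encode: Callable[[str], List[float]],
) -> List[Dict[str, object]]:
    """Pair each sentence with its embedding, using the tutorial's field names."""
    return [
        {"sentence": sentence, "vectorEmbedding": encode(sentence)}
        for sentence in sentences
    ]


def fake_encode(sentence: str) -> List[float]:
    """Hypothetical stand-in for the model: returns a 384-dimensional dummy vector."""
    return [float(len(sentence))] * 384


batch = build_documents(
    ["The chef cooked a delicious meal.", "The students studied for their exams."],
    fake_encode,
)
# Each entry has the same shape as the documents inserted by the loop above.
print(len(batch), len(batch[0]["vectorEmbedding"]))
```

With the real model, the encoder would be something like `lambda s: model.encode(s).tolist()`, and the batch could then be written with `db.vectors_demo_1.insert_many(batch)`.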

Create the following Atlas Search index on the vectors_demo_1 collection:

{
  "mappings": {
    "dynamic": true,
    "fields": {
      "vectorEmbedding": {
        "type": "knnVector",
        "dimensions": 384,
        "similarity": "euclidean"
      }
    }
  }
}
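
The index definition above sets "similarity" to "euclidean", so candidates are compared by the straight-line distance between their embedding and the query vector: the smaller the distance, the closer the match. The sketch below illustrates that ranking on made-up three-dimensional vectors (the embeddings in this tutorial have 384 dimensions). The toy vectors, labels, and function name are assumptions for illustration; how Atlas converts a distance into a search score is not shown here.

```python
import math
from typing import Dict, List, Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up 3-dimensional "embeddings" standing in for real 384-dimensional ones.
toy_vectors: Dict[str, List[float]] = {
    "exam sentence": [0.9, 0.1, 0.0],
    "cooking sentence": [0.1, 0.9, 0.1],
    "boxing sentence": [0.0, 0.1, 0.9],
}
toy_query = [0.8, 0.2, 0.1]

# Sort labels from nearest to farthest from the query vector.
ranked = sorted(
    toy_vectors,
    key=lambda name: euclidean_distance(toy_vectors[name], toy_query),
)
print(ranked)
```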

Now you can run the following code to perform a semantic search for various sentences (uncomment the query you want to run). Note that my index is named "default", which is why the query does not specify an index name. If your index is not named "default", include your index's name in the query:

import pymongo
from sentence_transformers import SentenceTransformer

# Replace the placeholder with the SRV connection string of your Atlas cluster.
client = pymongo.MongoClient("SRV URL TO YOUR ATLAS CLUSTER")
db = client.vector_tests

model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')

# Sample searches: the first query is active so the script runs as-is.
# To try another one, uncomment it (the last uncommented assignment wins).
query = "The cook prepared a meal of poultry and veggies."
#query = "The pupils worked hard for their test."
#query = "Studying hard, the students prepped for their exams."
#query = "A delicious meal was cooked by the chef."
#query = "Tyson's boxing style was feared for its power and aggression."

# Embed the query with the same model that was used for the stored documents.
vector_query = model.encode(query).tolist()
pipeline = [
    {
        "$search": {
            "knnBeta": {
                "vector": vector_query,
                "path": "vectorEmbedding",
                "k": 3
            }
        }
    },
    {
        "$limit": 1
    },
    {
        "$project": {
            "vectorEmbedding": 0,
            "_id": 0,
            'score': {
                '$meta': 'searchScore'
            }
        }
    }
]

results = db.vectors_demo_1.aggregate(pipeline)
for result in results:
    print("\n")
    print("*************Vector Search Result*****************")
    print(result['sentence'])
    print("**************************************************")
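
The pipeline above relies on the index being named "default". As a sketch of how a differently named index might be targeted, the hypothetical helper below builds the same three stages and sets the "index" field of the $search stage. The helper name, its parameters, and the example index name my_vector_index are assumptions for illustration and are not part of this repository.

```python
from typing import Any, Dict, List, Sequence


def build_knn_pipeline(
    vector_query: Sequence[float],
    index_name: str = "default",
    k: int = 3,
    limit: int = 1,
) -> List[Dict[str, Any]]:
    """Build the tutorial's aggregation pipeline for a given search index name."""
    return [
        {
            "$search": {
                # Name of the Atlas Search index to query.
                "index": index_name,
                "knnBeta": {
                    "vector": list(vector_query),
                    "path": "vectorEmbedding",
                    "k": k,
                },
            }
        },
        {"$limit": limit},
        {
            "$project": {
                "vectorEmbedding": 0,
                "_id": 0,
                "score": {"$meta": "searchScore"},
            }
        },
    ]


# A zero vector is used here only so the snippet runs without the model.
pipeline = build_knn_pipeline([0.0] * 384, index_name="my_vector_index")
print(pipeline[0]["$search"]["index"])
```

In the script above, the result would be passed to `db.vectors_demo_1.aggregate(pipeline)` with `vector_query` in place of the zero vector.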

🤗 HuggingFace Documentation

🍃 Atlas Triggers Documentation
