# Atlas Vector Search - Create Embeddings - Open Source - New Data

This notebook is a companion to the [Create Embeddings](https://www.mongodb.com/docs/atlas/atlas-vector-search/create-embeddings/) page. Refer to the page for set-up instructions and detailed explanations.

This notebook shows how to generate embeddings from **new data** by using the open-source ``nomic-embed-text-v1`` model.

<a target="_blank" href="https://colab.research.google.com/github/mongodb/docs-notebooks/blob/main/create-embeddings/open-source-new-data.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

In [11]:
pip install --quiet --upgrade sentence-transformers pymongo einops

## Use an Embedding Model

In [12]:
from sentence_transformers import SentenceTransformer

# Load the embedding model
model = SentenceTransformer("nomic-ai/nomic-embed-text-v1", trust_remote_code=True)

# Define a function to generate embeddings
def get_embedding(data, precision="float32"):
   return model.encode(data, precision=precision).tolist()

# Generate an embedding
embedding = get_embedding("foo")
print(embedding)



[-0.029808253049850464, 0.03841473162174225, -0.02561120130121708, -0.06707508116960526, 0.03867151960730553, 0.0031592303421348333, -0.04971666261553764, 0.023594636470079422, -0.07017440348863602, -0.08847323805093765, -0.05611487850546837, 0.051312509924173355, 0.027612315490841866, 0.03640267997980118, 0.018126828595995903, -0.0032138070091605186, -0.004208534490317106, -0.034853965044021606, -0.015637759119272232, -0.04466874152421951, -0.004550334066152573, 0.04953199625015259, -0.01731801964342594, -0.004371760878711939, 0.20573385059833527, -0.024629119783639908, -0.020009297877550125, -0.002872822340577841, -0.05270130932331085, -0.009439852088689804, 0.0029349259566515684, 0.024012627080082893, -0.007290663663297892, -0.0500018335878849, -0.04934301599860191, -0.09039969742298126, 0.014878465794026852, 0.03805148974061012, 0.027957448735833168, 0.014889146201312542, -0.023368364199995995, 0.029722414910793304, 0.019776511937379837, 0.01744695007801056, 0.0008264359785243869, 

### (Optional) Compress your embeddings

Optionally, run the following code to define a function that converts your embeddings into BSON `binData` vectors for [efficient storage and retrieval](https://www.mongodb.com/docs/atlas/atlas-vector-search/create-embeddings/#vector-compression).

In [13]:
from bson.binary import Binary
from bson.binary import BinaryVectorDtype

# Define a function to generate BSON vectors
def generate_bson_vector(vector, vector_dtype):
   return Binary.from_vector(vector, vector_dtype)

# Generate BSON vector from the sample float32 embedding
bson_float32_embedding = generate_bson_vector(embedding, BinaryVectorDtype.FLOAT32)

# Print the converted embedding
print(f"The converted BSON embedding is: {bson_float32_embedding}")

The converted BSON embedding is: b'\'\x00p0\xf4\xbc\xc4X\x1d=\x95\xce\xd1\xbc\xa9^\x89\xbd\x07f\x1e=\x17\x0bO;\xb3\xa3K\xbd\x8aI\xc1<\x99\xb7\x8f\xbdu1\xb5\xbd\xb7\xd8e\xbd\x11-R=93\xe2<\xfa\x1a\x15=\xb7~\x94<\xbc\x9eR\xbb\xbf\xe7\x89\xbb\x08\xc3\x0e\xbd\xc2\x1a\x80\xbc\x92\xf66\xbd\xf8\x1a\x95\xbb\x10\xe2J=\x85\xde\x8d\xbc\xfd@\x8f\xbb\xe5\xabR>\x02\xc3\xc9\xbc\x8a\xea\xa3\xbc\xf6E<\xbbT\xddW\xbd\x9c\xa9\x1a\xbc\xe3W@;!\xb6\xc4<\x85\xe6\xee\xbb\xb9\xceL\xbd\xe7\x1bJ\xbdz#\xb9\xbd\xcf\xc4s<\xe1\xdb\x1b=\x05\x07\xe5<\x9b\xf1s<\x03o\xbf\xbcl|\xf3<Z\x02\xa2<\xe8\xec\x8e<.\xa5X:\'\x13*\xbd\xe9\xa3\xa1<b8?=gOJ=\xb4v\xeb\xbcE\x1c\x98\xbcF\xac\x9b9\x90\xc1<\xbb\x8c\x05}\xbd\xa0F\'<\xf8\xe2\xb9\xbb\xb0\x1cS<8\x84\xac\xbd\x14\xa5\x96\xbc\xf7\xc5\x16;\x98\xff!\xbd\x1a\xb5\x81<\x8f\x85\xea\xbc \x88\xd0<\x83\xde$<\xaa\x8a4\xbd\x0eQ\xcb\xbc8Il\xbc\x08\xc7\x00<$\x14^\xbdmHP=\xa9fq\xbc\x8f4\xed\xbc\x9a\xb0\xb0<\xc7\x93\xf7\xbck\xe5Y<\xec\x1f\x19\xbb\xde\x18\xee;0\xf2!\xbd\x10\xac\xe1<uMl<[\x87\xba\xb
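The leading bytes of the output above can be decoded by hand to see what the BSON vector holds. The following is a minimal sketch using only the standard library. It assumes the layout of a one-byte dtype marker, a one-byte padding value, and then little-endian 32-bit floats; the variable names are illustrative and not part of the original notebook.

```python
import struct

# First 10 bytes of the BSON vector printed above (copied from the output).
prefix = b"'\x00p0\xf4\xbc\xc4X\x1d="

# Assumed layout: byte 0 is the dtype marker, byte 1 is the padding value.
dtype, padding = prefix[0], prefix[1]

# The remaining 8 bytes are read as two little-endian float32 values.
values = struct.unpack("<2f", prefix[2:10])

print(f"dtype marker: {hex(dtype)}, padding: {padding}")
print(f"first two values: {values}")
```

If the assumed layout holds, the two decoded values should line up with the first two entries of the float32 embedding printed earlier. Each value takes 4 bytes instead of a full BSON double array element.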

## Generate Embeddings

In [14]:
# Sample data
texts = [
  "Titanic: The story of the 1912 sinking of the largest luxury liner ever built",
  "The Lion King: Lion cub and future king Simba searches for his identity",
  "Avatar: A marine is dispatched to the moon Pandora on a unique mission"
]

In [15]:
# Generate embeddings from the sample data
embeddings = []
for text in texts:
   embedding = get_embedding(text)

   # Uncomment the following line to convert to BSON vectors
   # embedding = generate_bson_vector(embedding, BinaryVectorDtype.FLOAT32)

   embeddings.append(embedding)

   # Print the embeddings
   print(f"\nText: {text}")
   print(f"Embedding: {embedding[:3]}... (truncated)")


Text: Titanic: The story of the 1912 sinking of the largest luxury liner ever built
Embedding: [-0.010890400037169456, 0.059266459196805954, -0.0029132517520338297]... (truncated)

Text: The Lion King: Lion cub and future king Simba searches for his identity
Embedding: [-0.056070517748594284, -0.013606156222522259, 0.005238529294729233]... (truncated)

Text: Avatar: A marine is dispatched to the moon Pandora on a unique mission
Embedding: [-0.027525829151272774, 0.01144346408545971, -0.02360895648598671]... (truncated)


## Ingest Embeddings into Atlas

In [16]:
def create_docs_with_embeddings(embeddings, data):
   docs = []
   for i, (embedding, text) in enumerate(zip(embeddings, data)):
      doc = {
         "_id": i,
         "text": text,
         "embedding": embedding,
      }
      docs.append(doc)
   return docs

In [17]:
# Create documents with embeddings and sample data
docs = create_docs_with_embeddings(embeddings, texts)

In [18]:
import pymongo

# Connect to your Atlas cluster
# Replace the placeholder with your own Atlas connection string
mongo_client = pymongo.MongoClient("<connection-string>")
db = mongo_client["sample_db"]
collection = db["embeddings"]

# Ingest data into Atlas
collection.insert_many(docs)

BulkWriteError: batch op errors occurred, full error: {'writeErrors': [{'index': 0, 'code': 11000, 'errmsg': 'E11000 duplicate key error collection: sample_db.embeddings index: _id_ dup key: { _id: 0 }', 'keyPattern': {'_id': 1}, 'keyValue': {'_id': 0}, 'op': {'_id': 0, 'text': 'Titanic: The story of the 1912 sinking of the largest luxury liner ever built', 'embedding': [-0.010890400037169456, 0.059266459196805954, -0.0029132517520338297, ... (truncated)]}}], 'writeConcernErrors': [], 'nInserted': 0, 'nUpserted': 0, 'nMatched': 0, 'nModified': 0, 'nRemoved': 0, 'upserted': []}

This error appears when the cell runs against a collection that already contains these documents. The sample documents use fixed `_id` values (0, 1, 2), so inserting them a second time fails with a duplicate key error and nothing is inserted.

## Index and Query Your Embeddings

In [None]:
from pymongo.operations import SearchIndexModel
import time

# Create your index model, then create the search index
search_index_model = SearchIndexModel(
  definition = {
    "fields": [
      {
        "type": "vector",
        "path": "embedding",
        "similarity": "dotProduct",
        "numDimensions": 768
      }
    ]
  },
  name="vector_index",
  type="vectorSearch"
)
result = collection.create_search_index(model=search_index_model)

# Wait for initial sync to complete
print("Polling to check if the index is ready. This may take up to a minute.")
predicate = lambda index: index.get("queryable") is True

while True:
  indices = list(collection.list_search_indexes(result))
  if len(indices) and predicate(indices[0]):
    break
  time.sleep(5)
print(result + " is ready for querying.")
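The index definition above sets `numDimensions` to 768, which is expected to match the length of the vectors stored in the `embedding` field. The following sketch shows one way to check that before indexing. The helper name `find_dimension_mismatches` and the sample documents are hypothetical, and the check assumes the embeddings are plain Python lists, not BSON vectors.

```python
def find_dimension_mismatches(docs, field="embedding", expected=768):
    """Return the _id of every document whose vector length differs from `expected`."""
    return [doc["_id"] for doc in docs if len(doc[field]) != expected]


# Hypothetical sample documents: one with the expected length, one without.
sample_docs = [
    {"_id": 0, "embedding": [0.0] * 768},
    {"_id": 1, "embedding": [0.0] * 512},
]

mismatched = find_dimension_mismatches(sample_docs)
print(f"Documents with unexpected vector length: {mismatched}")
```

In this notebook, the same check could be run on `docs` before calling `insert_many`, assuming the BSON conversion line was left commented out.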

In [None]:
# Generate embedding for the search query
query_embedding = get_embedding("ocean tragedy")

# Sample vector search pipeline
pipeline = [
   {
      "$vectorSearch": {
         "index": "vector_index",
         "queryVector": query_embedding,
         "path": "embedding",
         "exact": True,
         "limit": 5
      }
   },
   {
      "$project": {
         "_id": 0,
         "text": 1,
         "score": {
            "$meta": "vectorSearchScore"
         }
      }
   }
]

# Execute the search
results = collection.aggregate(pipeline)

# Print results
for i in results:
   print(i)
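The pipeline above sets `"exact": True`, which requests an exact nearest neighbor search: every stored vector is compared with the query vector. The following is a rough, client-side sketch of that idea with toy two-dimensional vectors. The function names and data are illustrative only, and the score normalization `(1 + dot) / 2` is an assumption based on how Atlas documents `dotProduct` scoring, not a reproduction of the server's implementation.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def exact_search(query_vector, docs, limit=5):
    """Score every document against the query and return the top `limit` results."""
    scored = [
        # Assumed normalization: maps a dot product in [-1, 1] to a score in [0, 1].
        {"text": doc["text"], "score": (1 + dot(query_vector, doc["embedding"])) / 2}
        for doc in docs
    ]
    scored.sort(key=lambda result: result["score"], reverse=True)
    return scored[:limit]


# Toy unit-length vectors standing in for real 768-dimensional embeddings.
toy_docs = [
    {"text": "a", "embedding": [1.0, 0.0]},
    {"text": "b", "embedding": [0.0, 1.0]},
    {"text": "c", "embedding": [0.6, 0.8]},
]

top = exact_search([1.0, 0.0], toy_docs, limit=2)
for result in top:
    print(result)
```

A full scan like this costs time proportional to the collection size, which is why exact search tends to suit small collections such as the three sample documents here.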
