# Generating embeddings with Nomic
https://docs.nomic.ai/atlas/embeddings-and-retrieval/generate-embeddings

https://docs.gpt4all.io/gpt4all_python/home.html#embeddings

In [1]:
from nomic import embed
import numpy as np

output = embed.text(
    texts=['The text you want to embed.'],
    model='nomic-embed-text-v1.5',
    task_type='search_document',
)

embeddings = np.array(output['embeddings'])

In [2]:
embeddings

array([[ 6.25228900e-03,  2.37121580e-02, -1.25488280e-01,
        -2.77252200e-02,  6.45141600e-02, -6.17980960e-03,
        -6.66618350e-04,  5.93566900e-03, -1.12304690e-02,
        -5.94177250e-02, -1.49993900e-02,  1.91497800e-02,
         7.71484400e-02,  3.50952150e-02, -1.55334470e-02,
        -3.19213870e-02,  2.60925300e-02, -4.30603030e-02,
         1.05438230e-02,  8.17260740e-02, -2.97698970e-02,
        -1.59759520e-02, -3.72619630e-02, -1.49993900e-02,
         3.01055900e-02,  3.13568120e-03,  2.45361330e-02,
        -3.31726070e-02, -5.45349120e-02, -2.50854500e-02,
        -1.23214720e-02, -2.53295900e-02,  1.89208980e-02,
        -5.75256350e-02,  2.65045170e-02, -7.08618200e-02,
        -2.48107910e-02,  5.82275400e-02, -1.16424560e-02,
        -8.58306900e-03,  9.53674300e-03,  7.20596300e-03,
        -4.95605470e-02, -2.94189450e-02,  3.12194820e-02,
        -4.21752930e-02,  4.30603030e-02, -1.40762330e-02,
         5.63049300e-02, -1.00585940e-01,  1.62353520e-0

In [4]:
from nomic import embed
embeddings = embed.text(["String 1", "String 2"], inference_mode="local")['embeddings']
print("Number of embeddings created:", len(embeddings))
print("Number of dimensions per embedding:", len(embeddings[0]))

Downloading: 100%|██████████| 274M/274M [03:30<00:00, 1.30MiB/s] 
Verifying: 100%|██████████| 274M/274M [00:02<00:00, 122MiB/s] 
Embedding texts: 100%|██████████| 2/2 [00:02<00:00,  1.06s/inputs]

Number of embeddings created: 2
Number of dimensions per embedding: 768





## Embedding task types
Nomic Embed supports four task types, grouped here by use case.

<b>Retrieval task types</b>
1. search_query: Use this when you want to encode a query for question-answering over text that was embedded with search_document.
2. search_document: The default embedding task type. Any document you want to use for retrieval or store in a vector database should use this task type.

<b>Semantic search</b>
* If you want to do semantic similarity search instead of question answering, encode both queries and documents with the search_document task type.
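
The query/document pairing described above can be sketched as follows. This is a minimal sketch, not Nomic's reference implementation: the helper names `cosine_similarity`, `rank_documents`, and `retrieve` are hypothetical, and `retrieve` assumes the `nomic` package is installed and able to run `embed.text` (remotely or locally). It only passes `embed.text` parameters that already appear earlier in this notebook.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices sorted from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


def retrieve(question, documents, model='nomic-embed-text-v1.5'):
    """Hypothetical helper: embed documents and a question, then rank the documents."""
    # Imported lazily so the pure-Python helpers above work without nomic installed.
    from nomic import embed

    # Documents to be retrieved are embedded with search_document.
    doc_vecs = embed.text(
        texts=documents, model=model, task_type='search_document'
    )['embeddings']
    # The question is embedded with search_query.
    query_vec = embed.text(
        texts=[question], model=model, task_type='search_query'
    )['embeddings'][0]
    return [documents[i] for i in rank_documents(query_vec, doc_vecs)]
```

For semantic similarity search rather than question answering, the second `embed.text` call would use `task_type='search_document'` as well, as described above.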

<b>Classification and clustering tasks</b>
1. classification: Use this if your embeddings are for classification (e.g., training a linear probe for a target classification task).
2. clustering: Use this if your embeddings need very high linear separability (e.g., building a topic model on your embeddings).
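
As a rough summary of the four options, the helper below maps a use case to a task type string. It is only a sketch: the function name `pick_task_type` and the use-case labels (`'retrieval'`, `'semantic_search'`, and so on) are invented for illustration and are not part of the Nomic API. Only the four task type strings come from the list above.

```python
# The four task types listed above.
TASK_TYPES = ('search_query', 'search_document', 'classification', 'clustering')


def pick_task_type(use_case, is_query=False):
    """Hypothetical helper: choose a task_type string to pass to embed.text.

    use_case: 'retrieval', 'semantic_search', 'classification', or 'clustering'
              (labels invented for this sketch).
    is_query: True when embedding a question for retrieval.
    """
    if use_case == 'retrieval':
        # Questions and stored documents use different task types.
        return 'search_query' if is_query else 'search_document'
    if use_case == 'semantic_search':
        # Queries and documents are both encoded as documents.
        return 'search_document'
    if use_case in ('classification', 'clustering'):
        return use_case
    raise ValueError(f'unknown use case: {use_case!r}')
```

The result would then be passed along the lines of `embed.text(texts=[...], model='nomic-embed-text-v1.5', task_type=pick_task_type('clustering'))`.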

https://docs.nomic.ai/atlas/embeddings-and-retrieval/generate-embeddings#embedding-task-types