# Embeddings API Examples with MLX Server

This notebook demonstrates how to use the embeddings endpoint of MLX Server through its OpenAI-compatible API. It covers generating embeddings for a single text and for batches, measuring semantic similarity between texts, and ranking a small document collection against a search query.

## Setup and Connection

In [1]:
# Import the OpenAI client for API communication
from openai import OpenAI

# Connect to the local MLX Server through its OpenAI-compatible API
client = OpenAI(
    base_url="http://localhost:8000/v1",
    api_key="fake-api-key",  # placeholder value
)

## Basic Embedding Generation

### Single Text Embedding


In [2]:
# Generate an embedding for a single text input
single_text = "Artificial intelligence is transforming how we interact with technology."
response = client.embeddings.create(
    input=[single_text],
    model="mlx-community/DeepSeek-R1-Distill-Qwen-1.5B-MLX-Q8"
)
# The vector is available as response.data[0].embedding

### Batch Processing Multiple Texts

In [3]:
# Embed several texts in a single request
text_batch = [
    "Machine learning algorithms improve with more data",
    "Natural language processing helps computers understand human language",
    "Computer vision allows machines to interpret visual information"
]

batch_response = client.embeddings.create(
    input=text_batch,
    model="mlx-community/DeepSeek-R1-Distill-Qwen-1.5B-MLX-Q8"
)

In [4]:
# Access all embeddings
embeddings = [item.embedding for item in batch_response.data]
print(f"Number of embeddings generated: {len(embeddings)}")
print(f"Dimensions of each embedding: {len(embeddings[0])}")

Number of embeddings generated: 3
Dimensions of each embedding: 1536
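
For collections too large to send in one request, the batch call above can be wrapped in a loop over fixed-size chunks. The following is a minimal sketch: `chunked` and `batch_size` are hypothetical names, not part of MLX Server or the OpenAI client, and a suitable batch size depends on your model and hardware.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Made-up texts used only to show how the list is split
texts = [f"document {i}" for i in range(5)]
batches = list(chunked(texts, batch_size=2))
print([len(batch) for batch in batches])  # [2, 2, 1]
```

Each batch can then be passed as `input` to `client.embeddings.create`, with the resulting `item.embedding` lists concatenated in order.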


## Semantic Similarity Calculation

One of the most common uses for embeddings is measuring semantic similarity between texts.
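
The measure used in this notebook is cosine similarity, the cosine of the angle between two embedding vectors $a$ and $b$ (the symbols are chosen here for illustration):

```latex
\operatorname{sim}(a, b) = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
```

The value lies between -1 and 1, and values closer to 1 indicate vectors that point in nearly the same direction.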

In [5]:
import numpy as np

In [6]:
def cosine_similarity_score(vec1, vec2):
    """Return the cosine similarity between two vectors."""
    dot_product = np.dot(vec1, vec2)
    norm1 = np.linalg.norm(vec1)
    norm2 = np.linalg.norm(vec2)
    return dot_product / (norm1 * norm2)

In [7]:
# Example texts to compare
text1 = "Dogs are loyal pets that provide companionship"
text2 = "Canines make friendly companions for humans"
text3 = "Quantum physics explores the behavior of matter at atomic scales"

In [8]:
# Generate embeddings
comparison_texts = [text1, text2, text3]
comparison_response = client.embeddings.create(
    input=comparison_texts,
    model="mlx-community/DeepSeek-R1-Distill-Qwen-1.5B-MLX-Q8"
)
comparison_embeddings = [item.embedding for item in comparison_response.data]

In [9]:
# Compare similarities
similarity_1_2 = cosine_similarity_score(comparison_embeddings[0], comparison_embeddings[1])
similarity_1_3 = cosine_similarity_score(comparison_embeddings[0], comparison_embeddings[2])
similarity_2_3 = cosine_similarity_score(comparison_embeddings[1], comparison_embeddings[2])

print(f"Similarity between text1 and text2: {similarity_1_2:.4f}")
print(f"Similarity between text1 and text3: {similarity_1_3:.4f}")
print(f"Similarity between text2 and text3: {similarity_2_3:.4f}")

Similarity between text1 and text2: 0.8142
Similarity between text1 and text3: 0.6082
Similarity between text2 and text3: 0.5739
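
With more than a few texts, computing each pair by hand becomes tedious. A minimal sketch of the same calculation as a single matrix operation follows. `pairwise_cosine` is a hypothetical helper, and the small vectors are made-up stand-ins for real embeddings so that the block runs without a server.

```python
import numpy as np


def pairwise_cosine(vectors):
    """Return an (n, n) matrix of cosine similarities between all rows."""
    matrix = np.asarray(vectors, dtype=float)
    # Scale every row to unit length; the dot product of two unit
    # vectors is their cosine similarity.
    unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return unit @ unit.T


# Toy 3-dimensional vectors standing in for real embeddings
toy_vectors = [
    [1.0, 0.0, 0.0],
    [2.0, 0.0, 0.0],  # same direction as the first, different length
    [0.0, 3.0, 0.0],  # orthogonal to the first two
]
similarity_matrix = pairwise_cosine(toy_vectors)
print(np.round(similarity_matrix, 4))
```

Passing `comparison_embeddings` instead of `toy_vectors` should give a 3×3 matrix whose off-diagonal entries match the three scores printed above, up to floating-point rounding.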


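## Text Search Using Embeddings

The same similarity score can rank a document collection against a query: embed the documents once, embed the query, and sort the documents by cosine similarity.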

In [10]:
# Sample document collection
documents = [
    "The quick brown fox jumps over the lazy dog",
    "Machine learning models require training data",
    "Neural networks are inspired by biological neurons",
    "Deep learning is a subset of machine learning",
    "Natural language processing helps with text analysis",
    "Computer vision systems can detect objects in images"
]

In [11]:
# Generate embeddings for all documents
doc_response = client.embeddings.create(
    input=documents,
    model="mlx-community/DeepSeek-R1-Distill-Qwen-1.5B-MLX-Q8"
)
doc_embeddings = [item.embedding for item in doc_response.data]

In [12]:
def search_documents(query, doc_collection, doc_embeddings):
    """Rank documents by cosine similarity to the query, highest first."""
    # Embed the query with the same model used for the documents
    query_response = client.embeddings.create(
        input=[query],
        model="mlx-community/DeepSeek-R1-Distill-Qwen-1.5B-MLX-Q8"
    )
    query_embedding = query_response.data[0].embedding

    # Pair each document with its similarity to the query
    results = [
        (doc, cosine_similarity_score(query_embedding, doc_embedding))
        for doc, doc_embedding in zip(doc_collection, doc_embeddings)
    ]

    # Sort by similarity score (highest first)
    return sorted(results, key=lambda x: x[1], reverse=True)

# Example search
search_results = search_documents("How do AI models learn?", documents, doc_embeddings)

print("Search results:")
for doc, score in search_results:
    print(f"Score: {score:.4f} - {doc}")

Search results:
Score: 0.8574 - Computer vision systems can detect objects in images
Score: 0.8356 - Neural networks are inspired by biological neurons
Score: 0.8266 - Natural language processing helps with text analysis
Score: 0.8141 - Deep learning is a subset of machine learning
Score: 0.7474 - Machine learning models require training data
Score: 0.5936 - The quick brown fox jumps over the lazy dog
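
For larger collections, the per-document Python loop can be replaced by one matrix-vector product, returning only the best few matches. This is a sketch under assumptions: `top_k_matches` and its `k` parameter are hypothetical, and the toy vectors stand in for `doc_embeddings` and a query embedding so that the block runs without a server.

```python
import numpy as np


def top_k_matches(query_vec, doc_matrix, k=3):
    """Return (index, score) pairs for the k rows most similar to query_vec."""
    docs = np.asarray(doc_matrix, dtype=float)
    query = np.asarray(query_vec, dtype=float)
    # Cosine similarity of the query against every row at once
    scores = (docs @ query) / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query))
    # argsort is ascending, so negate the scores to get highest first
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]


# Toy 2-dimensional vectors standing in for real embeddings
toy_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_query = [1.0, 0.1]
matches = top_k_matches(toy_query, toy_docs, k=2)
print(matches)
```

The returned indices can be used to look up the matching entries in `documents`.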
