## Embeddings API Demo

In [1]:
import requests 
from pprint import pprint 

### Initial Setup

Start the API server from a terminal before running the cells below:

```sh
$ uvicorn main:app --reload
```

In [2]:
host = "http://localhost"
port = "8000"

headers = {
    "Content-Type": "application/json"
}

### Available Models

See which models are available through the API.

In [3]:
url = f"{host}:{port}/models"
print(f"url: {url}")

response = requests.get(url=url, headers=headers)
print(response)

if response.status_code == 200:
    pprint(response.json())
else:
    print(response.content)

url: http://localhost:8000/models
<Response [200]>
{'data': {'models': {'all-MiniLM-L12-v2': {'dim': 384},
                     'all-mpnet-base-v2': {'dim': 768}}},
 'generated': '2023-12-02 @ 22:11:51',
 'id': '363da7f3-1181-42ce-90a7-8aa4bd05f463'}
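
The payload above can be flattened into a name-to-dimension mapping, which may be handy when choosing a checkpoint programmatically. This is a minimal sketch that assumes the response shape shown in the output. `models_payload` is a hand-copied stand-in for `response.json()`, and `model_dims` is a hypothetical helper name, not part of the API.

```python
# Stand-in for response.json() from the /models call (copied from the output above).
models_payload = {
    "data": {
        "models": {
            "all-MiniLM-L12-v2": {"dim": 384},
            "all-mpnet-base-v2": {"dim": 768},
        }
    },
    "generated": "2023-12-02 @ 22:11:51",
    "id": "363da7f3-1181-42ce-90a7-8aa4bd05f463",
}


def model_dims(payload):
    """Return {model_name: embedding_dimension} from a /models payload."""
    return {name: info["dim"] for name, info in payload["data"]["models"].items()}


dims = model_dims(models_payload)
for name, dim in sorted(dims.items()):
    print(f"{name}: {dim} dimensions")
```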


### Text Embeddings

Choose one of the available models to generate embeddings.

In [4]:
checkpoint = "all-mpnet-base-v2"

In [5]:
embeddings_url = f"{host}:{port}/embeddings"
print(f"url (for embeddings): {embeddings_url}")

url (for embeddings): http://localhost:8000/embeddings


In [6]:
text = "Can we mimic OpenAI's embeddings API ouputs? I dunno, let's see."

In [7]:
response = requests.post(
    url=embeddings_url,
    json={"model": checkpoint, "input": text},
    headers=headers
)
print(response)

<Response [200]>


In [8]:
if response.status_code == 200:
    pprint(response.json())
else:
    print(response.content)

{'data': [{'embedding': [-0.03878581151366234,
                         0.03181726858019829,
                         -0.020964443683624268,
                         0.07907546311616898,
                         -0.002314141718670726,
                         -0.016988448798656464,
                         0.03300256282091141,
                         0.022271115332841873,
                         0.05998004972934723,
                         -0.0046280971728265285,
                         -0.008012671954929829,
                         0.05871560052037239,
                         -0.013547050766646862,
                         0.04765020310878754,
                         0.0069237565621733665,
                         -0.0740000456571579,
                         -0.012723986990749836,
                         0.01012865174561739,
                         -0.0020324080251157284,
                         0.008839319460093975,
                         -0.0044814227148890495,
        

### Input Multiple Text Sequences

In [9]:
response = requests.post(
    url=embeddings_url,
    json={
        "model": checkpoint,
        "input": [text, "well, how does it look?"]
    },
    headers=headers
)

print(response)

<Response [200]>


In [10]:
if response.status_code == 200:
    content = response.json()
    print(f"** returned {len(content['data'])} embeddings...")

** returned 2 embeddings...
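
Once two embeddings come back, a common next step is to compare them. This is a minimal sketch of cosine similarity in plain Python. It assumes each item in `content['data']` carries an `embedding` list, as in the single-input output shown earlier. The three-value vectors are toy stand-ins, not real model output, and `cosine_similarity` is a hypothetical helper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy stand-in for response.json(); a real all-mpnet-base-v2 embedding has 768 values.
content = {"data": [{"embedding": [1.0, 0.0, 1.0]}, {"embedding": [1.0, 1.0, 0.0]}]}

first, second = (item["embedding"] for item in content["data"])
similarity = cosine_similarity(first, second)
print(f"cosine similarity: {similarity:.3f}")
```

With the real response, the same function can be applied to `content['data'][0]['embedding']` and `content['data'][1]['embedding']`.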


### Invalid Model Checkpoint

If an _unspecified_ (and thus _unavailable_) model is provided in the request, the API returns an HTTP `422` error.

This behavior could be extended, e.g., to suggest a similarly named model (in case of a user typo) or to download another open-source model.

In [11]:
response = requests.post(
    url=embeddings_url,
    json={"model": "the-finest-model-youve-got!", "input": text},
    headers=headers
)

print(response, "...", response.content)

<Response [422]> ... b'{"detail":[{"type":"assertion_error","loc":["body","model"],"msg":"Assertion failed, Model `the-finest-model-youve-got!` has not been specified and is not available.","input":"the-finest-model-youve-got!","ctx":{"error":{}},"url":"https://errors.pydantic.dev/2.5/v/assertion_error"}]}'
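
The error body is structured JSON, so a client can pull out the offending field and value. The idea of suggesting a similar model name could be approximated on the client side with the standard library's `difflib`. This is only a sketch under stated assumptions: `error_body` is copied from the output above, `available_models` is hard-coded from the earlier `/models` response, and `suggest_model` is a hypothetical helper that the API itself does not provide.

```python
import difflib
import json

# Error body copied from the 422 response above (stand-in for response.content).
error_body = b'{"detail":[{"type":"assertion_error","loc":["body","model"],"msg":"Assertion failed, Model `the-finest-model-youve-got!` has not been specified and is not available.","input":"the-finest-model-youve-got!","ctx":{"error":{}},"url":"https://errors.pydantic.dev/2.5/v/assertion_error"}]}'

# Model names reported by the /models endpoint earlier in this demo.
available_models = ["all-MiniLM-L12-v2", "all-mpnet-base-v2"]


def suggest_model(requested, available, cutoff=0.6):
    """Return up to three available model names that resemble `requested`."""
    return difflib.get_close_matches(requested, available, n=3, cutoff=cutoff)


detail = json.loads(error_body)["detail"][0]
print(f"field: {'.'.join(detail['loc'])}")
print(f"rejected value: {detail['input']}")
print(f"suggestions: {suggest_model(detail['input'], available_models)}")

# A near-miss name, as a user typo might produce.
print(f"suggestions for a typo: {suggest_model('all-mpnet-base-v3', available_models)}")
```

The `cutoff` value controls how strict the match is. The 0.6 used here is `difflib`'s default and would likely need tuning for real model names.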
