# Using the databricks-gte-large-en endpoint

[Databricks documentation for the endpoint](https://docs.databricks.com/aws/en/machine-learning/foundation-model-apis/supported-models#gte-large-en)

[Model documentation on Hugging Face](https://huggingface.co/Alibaba-NLP/gte-large-en-v1.5)

In [0]:
%sql
SELECT ai_query('databricks-gte-large-en',
    request => "this is my embedding")

Note: As described in the documentation, this model only returns embedding vectors. You can use 01.5-ai_classify_embedding to run a similarity comparison yourself.
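
Since the endpoint only returns vectors, any similarity score has to be computed on the client side. The following is a minimal sketch of cosine similarity in plain Python. The function name `cosine_similarity` and the toy 3-dimensional vectors are illustrative assumptions; they stand in for real embeddings returned by the endpoint, and this is not necessarily the method used in 01.5-ai_classify_embedding.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (assumption: illustration only)
query = [1.0, 0.0, 1.0]
doc_same_direction = [2.0, 0.0, 2.0]
doc_orthogonal = [0.0, 1.0, 0.0]

print(cosine_similarity(query, doc_same_direction))  # close to 1: same direction
print(cosine_similarity(query, doc_orthogonal))      # 0: no overlap
```

Scores close to 1 indicate vectors pointing in nearly the same direction, and scores near 0 indicate unrelated vectors.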

In [0]:
df = spark.sql(
  """
SELECT ai_query('databricks-gte-large-en',
    request => "this is my embedding")
  """
)
display(df)

Note: The [databricks-gte-large-en pay-per-token endpoint is subject to limits](https://docs.databricks.com/aws/en/machine-learning/foundation-model-apis/limits).
If you need to compute many embeddings in a batch, we recommend using [provisioned throughput](https://docs.databricks.com/aws/en/machine-learning/foundation-model-apis/deploy-prov-throughput-foundation-model-apis).
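
Whichever serving mode is used, a large list of texts is usually split into smaller requests on the client side. The sketch below shows one way to do that. The helper name `chunked` and the batch size of 2 are illustrative assumptions; the real per-request limits are the ones listed in the linked documentation, and the `{"input": [...]}` payload shape is the one used by the `requests` example in this notebook.

```python
def chunked(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical input texts and batch size, chosen only for illustration
texts = ["first text", "second text", "third text", "fourth text", "fifth text"]
payloads = [{"input": batch} for batch in chunked(texts, 2)]

for payload in payloads:
    print(payload)
```

Each resulting payload can then be sent as its own request, which keeps every call small and makes it easier to retry a single failed batch.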

Alternatively, you can query the model serving endpoint directly with the Python `requests` library.

In [0]:
import requests
import json

# Get the Databricks workspace URL and token
workspace_url = spark.conf.get("spark.databricks.workspaceUrl")
token = dbutils.notebook.entry_point.getDbutils().notebook().getContext().apiToken().get()

# Define the API endpoint for the embedding model
api_url = f"https://{workspace_url}/serving-endpoints/databricks-gte-large-en/invocations"

# Define the request payload: a list of texts to embed under the "input" key
payload = {
    "input": ["this is my embedding"]
}

# Set up headers
headers = {
    "Authorization": f"Bearer {token}",
    "Content-Type": "application/json"
}

# Make the API request
response = requests.post(api_url, headers=headers, json=payload, timeout=60)

# Check if the request was successful
if response.status_code == 200:
    result = response.json()
    print("API Response structure:")
    print(json.dumps(result, indent=2)[:500] + "...")
    
    # Extract embeddings (structure may vary)
    if 'data' in result:
        embeddings = result['data'][0]['embedding']
        print(f"\nSuccessfully generated embedding with {len(embeddings)} dimensions")
        print(f"First 10 values: {embeddings[:10]}")
    else:
        print("\nFull response:")
        print(result)
else:
    print(f"Error: {response.status_code}")
    print(f"Response: {response.text}")
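
The cell above reads only the first embedding. When several texts are sent in one request, all vectors need to be collected. The sketch below assumes the response keeps the `data` list of objects with an `embedding` key that the cell above expects; the `index` field and the helper name `extract_embeddings` are assumptions, and the sample dictionary is hand-written rather than real endpoint output.

```python
def extract_embeddings(result):
    """Return one embedding per input, ordered by 'index' when that key is present."""
    data = result.get("data")
    if not isinstance(data, list):
        raise ValueError("response has no 'data' list")
    # sorted() is stable, so the original order is kept if 'index' is missing
    ordered = sorted(data, key=lambda item: item.get("index", 0))
    return [item["embedding"] for item in ordered]


# Hand-written sample with the assumed response shape (not real model output)
sample_result = {
    "data": [
        {"index": 1, "embedding": [0.3, 0.4]},
        {"index": 0, "embedding": [0.1, 0.2]},
    ]
}

vectors = extract_embeddings(sample_result)
print(f"Extracted {len(vectors)} embeddings; first one: {vectors[0]}")
```

Raising an error when `data` is missing makes an unexpected response format visible immediately instead of failing later with a `KeyError`.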