1 change: 1 addition & 0 deletions guides/databases/_menu.md
@@ -10,3 +10,4 @@
## [ PostgreSQL ](postgres.md)
# [ Schema Evolution ](schema-evolution.md)
# [ Performance Guide ](performance.md)
# [ Vector Embeddings ](vector-embeddings.md)
80 changes: 0 additions & 80 deletions guides/databases/hana.md
@@ -261,86 +261,6 @@ See the [Deploying to Cloud](../deploy/index.md) guide for information about how

The HANA Service provides dedicated support for native SAP HANA features as follows.

### Vector Embeddings

Vector embeddings let you add semantic search, recommendations, and generative AI features to your CAP application. Embeddings are numeric arrays that represent the meaning of unstructured data (text, images, etc.), making it possible to compare and search for items that are semantically related to each other or a user query.

#### Choose an Embedding Model

Choose an embedding model that fits your use case and data (for example english or multilingual text). The model determines the number of dimensions of the resulting output vector. Check the documentation of the respective embedding model for details.

Use the [SAP Generative AI Hub](https://community.sap.com/t5/technology-blogs-by-sap/how-sap-s-generative-ai-hub-facilitates-embedded-trustworthy-and-reliable/ba-p/13596153) for unified consumption of embedding models and LLMs across different vendors and open source models. Check for available models on the [SAP AI Launchpad](https://help.sap.com/docs/ai-launchpad/sap-ai-launchpad-user-guide/models-and-scenarios-in-generative-ai-hub-fef463b24bff4f44a33e98bb1e4f3148#models).

#### Add Embeddings to Your CDS Model
Use the `cds.Vector` type in your CDS model to store embeddings on SAP HANA Cloud. Set the dimension to match your embedding model (for example, 1536 embedding dimensions for OpenAI *text-embedding-3-small*).

```cds
entity Books : cuid {
title : String(111);
description : LargeString;
embedding : Vector(1536); // adjust dimensions to embedding model
}
```

#### Generate Embeddings
Use an embedding model to convert your data (for example, book descriptions) into vectors. The [SAP Cloud SDK for AI](https://sap.github.io/ai-sdk/) makes it easy to call SAP AI Core services to generate these embeddings.

:::details Example using SAP Cloud SDK for AI
```Java
var aiClient = OpenAiClient.forModel(OpenAiModel.TEXT_EMBEDDING_3_SMALL);
var response = aiClient.embedding(
new OpenAiEmbeddingRequest(List.of(book.getDescription())));
book.setEmbedding(CdsVector.of(response.getEmbeddingVectors().get(0)));
```
:::

#### Query for Similarity
At runtime, use SAP HANA's built-in vector functions to search for similar items. For example, find books with embeddings similar to a user question:

::: code-group
```Java [Java]
// Compute embedding for user question
var request = new OpenAiEmbeddingRequest(List.of("How to use vector embeddings in CAP?"));
CdsVector userQuestion = CdsVector.of(
aiClient.embedding(request).getEmbeddingVectors().get(0));

// Compute similarity between user question and book embeddings
var similarity = CQL.cosineSimilarity( // computed on SAP HANA
CQL.get(Books.EMBEDDING), userQuestion);

// Find Books related to user question ordered by similarity
hana.run(Select.from(BOOKS).limit(10)
.columns(b -> b.ID(), b -> b.title(), b -> similarity.as("similarity"))
.orderBy(b -> b.get("similarity").desc())
);
```

```js [Node.js]
const response = await new AzureOpenAiEmbeddingClient(
'text-embedding-3-small'
).run({
input: 'How to use vector embeddings in CAP?'
});

const questionEmbedding = response.getEmbedding();
let similarBooks = await SELECT.from('Books')
.where`cosine_similarity(embedding, to_real_vector(${questionEmbedding})) > 0.9`;
```
:::

:::tip Evolve embeddings with your model
Store embeddings when you create or update your data. Regenerate embeddings if you change your embedding model.
:::

:::tip Use SAP Cloud SDK for AI
Use the [SAP Cloud SDK for AI](https://sap.github.io/ai-sdk/) for unified access to embedding models and large language models (LLMs) from [SAP AI Core](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/what-is-sap-ai-core).
:::

Learn more about the [SAP Cloud SDK for AI (Java)](https://sap.github.io/ai-sdk/docs/java/getting-started) or the [SAP Cloud SDK for AI (JavaScript)](https://sap.github.io/ai-sdk/docs/js/getting-started) {.learn-more}

[Learn more about Vector Embeddings in CAP Java](../../java/cds-data#vector-embeddings) {.learn-more}


### Geospatial Functions

CDS supports the special syntax for SAP HANA geospatial functions:
109 changes: 109 additions & 0 deletions guides/databases/vector-embeddings.md
@@ -0,0 +1,109 @@
---
label: Vector Embeddings
---
# Vector Embeddings

Vector embeddings convert unstructured content (text, images, and so on) into numeric vectors that encode semantics (meaning). Comparing these vectors enables semantic search, recommendations, and enhanced generative AI features in your CAP application, for example, retrieving related records, ranking results by relevance, or augmenting prompts for LLMs.

## Choose an Embedding Model

Choose an embedding model that fits your use case and data (for example, English or multilingual text). The model determines the number of dimensions of the resulting output vector. Check the documentation of the respective embedding model for details.

Use the [SAP Generative AI Hub](https://www.sap.com/products/artificial-intelligence/generative-ai-hub.html) for unified consumption of embedding models and LLMs across different vendors and open-source models. Check for available models on the [SAP AI Launchpad](https://help.sap.com/docs/ai-launchpad/sap-ai-launchpad-user-guide/models-and-scenarios-in-generative-ai-hub-fef463b24bff4f44a33e98bb1e4f3148#models).

## Add Embeddings to Your CDS Model
Use the built-in CDL [Vector type](../../cds/types) in your CDS model to store embeddings. Set the vector dimensions to match the embedding model (for example, 768 for *SAP_GXY.20250407*).

```cds
extend Incidents with {
embedding : Vector(768);
}
```
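
Because the declared dimension has to match the model output, it can help to check the vector length before storing it. The following is a minimal sketch in plain Java, assuming the embedding arrives as a `float[]`. The class `EmbeddingDimensions` and its method are hypothetical helpers for illustration and not part of CAP.

```java
import java.util.Objects;

/** Hypothetical helper (not a CAP API): guards against dimension mismatches. */
final class EmbeddingDimensions {

    /** Assumed to mirror the dimension declared in the CDS model, here Vector(768). */
    static final int DECLARED_DIMENSIONS = 768;

    private EmbeddingDimensions() {}

    /** Returns the vector unchanged if its length matches, otherwise throws. */
    static float[] requireDimensions(float[] embedding, int expected) {
        Objects.requireNonNull(embedding, "embedding");
        if (embedding.length != expected) {
            throw new IllegalArgumentException(
                "Expected " + expected + " dimensions but got " + embedding.length);
        }
        return embedding;
    }

    public static void main(String[] args) {
        // A zero-filled placeholder stands in for a real model response.
        float[] embedding = new float[DECLARED_DIMENSIONS];
        System.out.println(requireDimensions(embedding, DECLARED_DIMENSIONS).length);
    }
}
```

A check like this might run right before the embedding is written, so that a switch to a model with a different output size fails early in the application instead of at the database.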

## Generate Embeddings
Use an embedding model to convert your data (for example, incident titles and summaries) into vectors.

:::warning Evolve embeddings with your model
Store embeddings when you create or update your data. Regenerate embeddings if you change your embedding model.
:::
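
One possible way to find out which embeddings need regeneration after a model change is to record the model identifier next to each embedding. The sketch below assumes such a bookkeeping map from record ID to model name exists; this map, the class `EmbeddingStaleness`, and the method `idsToRegenerate` are hypothetical and not something CAP provides.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not a CAP API): finds embeddings created by another model. */
final class EmbeddingStaleness {

    private EmbeddingStaleness() {}

    /**
     * Returns the IDs whose recorded model differs from the current one.
     * A null model name (no embedding stored yet) also counts as stale.
     */
    static List<String> idsToRegenerate(Map<String, String> modelByRecordId, String currentModel) {
        List<String> stale = new ArrayList<>();
        for (Map.Entry<String, String> entry : modelByRecordId.entrySet()) {
            if (!currentModel.equals(entry.getValue())) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        // Assumed bookkeeping data; the IDs and the older model name are made up.
        Map<String, String> models = new LinkedHashMap<>();
        models.put("incident-1", "SAP_GXY.20250407");
        models.put("incident-2", "some-older-model");
        System.out.println(idsToRegenerate(models, "SAP_GXY.20250407"));
    }
}
```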

### Generate Embeddings on the Database

To generate vector embeddings on write in SAP HANA, you can use the [vector_embedding](https://help.sap.com/docs/hana-cloud-database/sap-hana-cloud-sap-hana-database-sql-reference-guide/vector-embedding-function-vector) function as a calculated element [on-write](../../cds/cdl#on-write) with embedding models from [SAP HANA NLP](https://help.sap.com/docs/hana-cloud-database/sap-hana-cloud-sap-hana-database-vector-engine-guide/creating-text-embeddings-with-nlp-51eb170d038d4099a9bbb85c08fda888) or from a configured remote source for SAP AI Core:

```cds
extend Incidents with {
@cds.api.ignore
embedding : Vector(768) = vector_embedding(
'Title: ' || title || ', Summary: ' || summary,
'DOCUMENT', 'SAP_GXY.20250407'
) stored;
}
```

:::tip Prefer calculated elements for vector embeddings
If the database calculates vector embeddings on write, it automatically regenerates the embedding when the input data changes.
:::

::: info Local Testing with H2 and SQLite
On H2 and SQLite, the `CQL.vectorEmbedding` function is emulated to support local testing.
:::

> [!warning] Java only and <Beta/>
> The `vector_embedding` function is currently in beta and only supported by the CAP Java runtime.

[Learn more about Vector Embeddings in CAP Java](../../java/cds-data#vector-embeddings) {.learn-more}

### Generate Embeddings Programmatically

Alternatively, you can compute vector embeddings in your application layer using the [SAP Cloud SDK for AI](https://sap.github.io/ai-sdk/) to call SAP AI Core services for generating embeddings.

:::details Example using SAP Cloud SDK for AI
```Java
var aiClient = OpenAiClient.forModel(OpenAiModel.TEXT_EMBEDDING_3_SMALL);
var response = aiClient.embedding(
new OpenAiEmbeddingRequest(List.of(book.getDescription())));
book.setEmbedding(CdsVector.of(response.getEmbeddingVectors().get(0)));
```
:::

:::tip Use SAP Cloud SDK for AI
Use the [SAP Cloud SDK for AI](https://sap.github.io/ai-sdk/) for unified access to embedding models and large language models (LLMs) from [SAP AI Core](https://help.sap.com/docs/sap-ai-core/sap-ai-core-service-guide/what-is-sap-ai-core).
:::

Learn more about the [SAP Cloud SDK for AI (Java)](https://sap.github.io/ai-sdk/docs/java/getting-started) or the [SAP Cloud SDK for AI (JavaScript)](https://sap.github.io/ai-sdk/docs/js/getting-started) {.learn-more}

## Query for Similarity
At runtime, use vector functions to search for similar items. In an example Retrieval-Augmented Generation (RAG) scenario, use `CQL.cosineSimilarity` to enhance the context of a user query for the LLM. First, compute the vector embedding of the user query, then use it to find related incidents.

::: code-group
```Java [Java]
// Compute embedding for user question
var query = CQL.val(
"Any incidents with solar inverters this month? How were they resolved?");
var embedding = CQL.vectorEmbedding(query, TextType.QUERY, "SAP_GXY.20250407");

// Compute similarity between user question and incident embeddings
var similarity = CQL.cosineSimilarity(CQL.get(Incidents.EMBEDDING), embedding);

// Find Incidents related to user question ordered by relevance
Select.from(INCIDENTS)
.columns(i -> similarity.times(100).as("relevance"),
i -> i.ID(), i -> i.title(), i -> i.summary(), i -> i.date())
.where(i -> similarity.gt(0.75))
.orderBy(i -> i.get("relevance").desc());
```

```js [Node.js]
const response = await new AzureOpenAiEmbeddingClient(
'text-embedding-3-small'
).run({
input: 'Any incidents with solar inverters this month? How were they resolved?'
});

const questionEmbedding = response.getEmbedding();
let similarIncidents = await SELECT.from('Incidents')
.where`cosine_similarity(embedding, to_real_vector(${questionEmbedding})) > 0.75`;
```
:::
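
To make the ranking logic more tangible, the following sketch reproduces in plain Java what the queries above delegate to the database: computing the cosine similarity (dot product divided by the product of the vector lengths), keeping results above a threshold, and ordering by relevance. It is an illustration only and not how CAP or SAP HANA implement the function. The class `CosineSimilaritySketch`, its methods, and the sample vectors are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: in-memory counterpart of a cosine similarity query. */
final class CosineSimilaritySketch {

    private CosineSimilaritySketch() {}

    /** Cosine similarity in [-1, 1]; both vectors need the same dimension. */
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("Vectors must have the same dimension");
        }
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += (double) a[i] * b[i];
            normA += (double) a[i] * a[i];
            normB += (double) b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            throw new IllegalArgumentException("Cosine similarity is undefined for zero vectors");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Keeps entries with similarity above the threshold, most similar first. */
    static List<Map.Entry<String, Double>> rank(
            float[] query, Map<String, float[]> embeddingsById, double threshold) {
        List<Map.Entry<String, Double>> result = new ArrayList<>();
        for (Map.Entry<String, float[]> entry : embeddingsById.entrySet()) {
            double similarity = cosineSimilarity(query, entry.getValue());
            if (similarity > threshold) {
                result.add(Map.entry(entry.getKey(), similarity));
            }
        }
        result.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return result;
    }

    public static void main(String[] args) {
        // Tiny two-dimensional vectors stand in for real 768-dimensional embeddings.
        Map<String, float[]> incidents = new LinkedHashMap<>();
        incidents.put("inverter-failure", new float[] {2f, 1f});
        incidents.put("billing-question", new float[] {0f, 1f});
        System.out.println(rank(new float[] {1f, 0f}, incidents, 0.75));
    }
}
```

For real data volumes, the comparison belongs in the database as shown in the queries above. An in-memory version like this is mainly useful for reasoning about thresholds or for unit tests.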

2 changes: 1 addition & 1 deletion java/cds-data.md
@@ -330,7 +330,7 @@ Map data can be nested and may contain nested maps and lists, which are serializ

## Vector Embeddings { #vector-embeddings }

In CDS, [vector embeddings](../guides/databases/hana#vector-embeddings) are stored in elements of type [`Vector`](/@external/cds/types).
In CDS, [vector embeddings](../guides/databases/vector-embeddings) are stored in elements of type `cds.Vector`:

CAP Java supports the vector type on SAP HANA, as well as H2 and SQLite for local testing. On Postgres (beta), support for vectors requires the [pgvector](https://github.com/pgvector/pgvector) extension.

4 changes: 2 additions & 2 deletions java/working-with-cql/query-api.md
@@ -1643,7 +1643,7 @@ Scalar functions are values that are calculated from other values. This calculat

#### Vector Functions

Vector functions allow you to compute similarity and distance of [vectors](../cds-data.md#vector-embeddings), as well as [vector embeddings](../../guides/databases/hana.md#vector-embeddings) of text data directly in the database.
Vector functions allow you to compute similarity and distance of [vectors](../cds-data.md#vector-embeddings), as well as [vector embeddings](../../guides/databases/vector-embeddings) of text data directly in the database.

##### Computing Vector Embeddings in SAP HANA <Beta />

Expand Down Expand Up @@ -1682,7 +1682,7 @@ On H2 and SQLite, the `vectorEmbedding` function is emulated. You can also use l

##### Computing Vector Similarity and Distance

You can use the functions, `CQL.cosineSimilarity`, and `CQL.l2Distance` (Euclidean distance) in queries to compute the similarity and distance of vectors. Distance functions are used in use cases such as finding similar items based on [vector embeddings](../../guides/databases/hana.md#vector-embeddings), for example to improve the response of an LLM to a user query. To use vector embeddings in functions, wrap them using `CQL.vector`:
You can use the functions `CQL.cosineSimilarity` and `CQL.l2Distance` (Euclidean distance) in queries to compute the similarity and distance of vectors. These functions support use cases such as finding similar items based on [vector embeddings](../../guides/databases/vector-embeddings), for example to improve the response of an LLM to a user query. To use vector embeddings in functions, wrap them using `CQL.vector`:

```Java
CqnVector vec = CQL.vector(embedding);