Merged
175 changes: 164 additions & 11 deletions docs/sphinx/source/reference/sql_types.rst
@@ -29,7 +29,7 @@ You can define a *struct type* (often interchangeably referred to as a *nested t

.. code-block:: sql

-CREATE TYPE AS STRUCT nested_type (d INT64, e STRING);
CREATE TYPE AS STRUCT nested_type (d INT64, e STRING)
CREATE TABLE foo (a STRING, b DOUBLE, c nested_type, PRIMARY KEY(a));

In this example, :sql:`nested_type` is a struct within the table `foo`, and its full contents are materialized alongside the full record for each entry in the `foo` table.
@@ -38,8 +38,8 @@ Struct types can have columns which are themselves struct types. Thus, this exam

.. code-block:: sql

-CREATE TYPE AS STRUCT nested_nested_type (f STRING, g STRING);
-CREATE TYPE AS STRUCT nested_type (d INT64, e STRING, f nested_nested_type);
CREATE TYPE AS STRUCT nested_nested_type (f STRING, g STRING)
CREATE TYPE AS STRUCT nested_type (d INT64, e STRING, f nested_nested_type)
CREATE TABLE foo (a STRING, b DOUBLE, c nested_type, PRIMARY KEY(a));

In this example, :sql:`nested_type` is a struct within the table :sql:`foo`, and :sql:`nested_nested_type` is a struct within the type :sql:`nested_type`.
@@ -65,21 +65,174 @@ Arrays can also be created with struct columns:

.. code-block:: sql

-CREATE TYPE AS STRUCT nested_struct (b STRING, d STRING);
CREATE TYPE AS STRUCT nested_struct (b STRING, d STRING)
CREATE TABLE structArray (a STRING, c nested_struct array);

In this example, `c` is an array, and each record within the array is a struct of type :sql:`nested_struct`. You can generally treat an array as a "nested `ResultSet`"--that is to say, you can just pull up a `ResultSet` of an array type, and interrogate it as if it were the output of its own query.

It is possible to nest arrays within structs, and structs within arrays, to an arbitrary depth (limited by the JVM's stack size, currently).

-NULL Semantics
-##############
.. _vector_types:

-For any unset primitive type or struct type fields, queries will return NULL for the column, unless default values are defined.
Vector Types
############

The Relational Layer supports *vector types* for storing fixed-size numerical vectors, commonly used in machine learning and similarity search applications. A vector type represents a fixed-dimensional array of floating-point numbers with a specific precision.

Vector Type Declaration
=======================

Vectors are declared using the :sql:`VECTOR(dimension, precision)` syntax, where:

* **dimension**: The number of elements in the vector (must be a positive integer)
* **precision**: The floating-point precision, which can be:

* :sql:`HALF` - 16-bit half-precision floating-point (2 bytes per element)
* :sql:`FLOAT` - 32-bit single-precision floating-point (4 bytes per element)
* :sql:`DOUBLE` - 64-bit double-precision floating-point (8 bytes per element)

Examples of vector column definitions:

.. code-block:: sql

CREATE TABLE embeddings (
id BIGINT,
embedding_half VECTOR(128, HALF),
embedding_float VECTOR(128, FLOAT),
embedding_double VECTOR(128, DOUBLE),
PRIMARY KEY(id)
);

Vectors can also be used within struct types:

.. code-block:: sql

CREATE TYPE AS STRUCT model_embedding (
model_name STRING,
embedding VECTOR(512, FLOAT)
)
CREATE TABLE documents (
id BIGINT,
content STRING,
embedding model_embedding,
PRIMARY KEY(id)
);

Internal Storage Format
=======================

Vectors are stored as byte arrays with the following format:

* **Byte 0**: Vector type identifier (0 = HALF, 1 = FLOAT, 2 = DOUBLE)
* **Remaining bytes**: Vector components in big-endian byte order

The storage size for each vector is:

* :sql:`VECTOR(N, HALF)`: 1 + (2 × N) bytes
* :sql:`VECTOR(N, FLOAT)`: 1 + (4 × N) bytes
* :sql:`VECTOR(N, DOUBLE)`: 1 + (8 × N) bytes
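
The layout above can be sketched in plain Java. This is a minimal illustration of the documented format only (one type byte followed by big-endian components), not the Relational Layer's actual serializer. The class and method names (``VectorStorageSketch``, ``storageSize``, ``encodeFloat``, ``decodeFloat``) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper that mirrors the documented byte layout; not library code.
final class VectorStorageSketch {
    // Type identifiers as listed in the storage format description.
    static final byte TYPE_HALF = 0;
    static final byte TYPE_FLOAT = 1;
    static final byte TYPE_DOUBLE = 2;

    // Bytes used by one component for the given type identifier.
    static int bytesPerElement(byte type) {
        switch (type) {
            case TYPE_HALF: return 2;
            case TYPE_FLOAT: return 4;
            case TYPE_DOUBLE: return 8;
            default: throw new IllegalArgumentException("unknown vector type " + type);
        }
    }

    // Total size: 1 type byte plus the components.
    static int storageSize(byte type, int dimension) {
        return 1 + bytesPerElement(type) * dimension;
    }

    // Encode a FLOAT vector: type byte first, then big-endian 32-bit floats.
    static byte[] encodeFloat(float[] components) {
        ByteBuffer buf = ByteBuffer.allocate(storageSize(TYPE_FLOAT, components.length))
                .order(ByteOrder.BIG_ENDIAN);
        buf.put(TYPE_FLOAT);
        for (float c : components) {
            buf.putFloat(c);
        }
        return buf.array();
    }

    // Decode bytes produced by encodeFloat, checking the type byte.
    static float[] decodeFloat(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN);
        if (buf.get() != TYPE_FLOAT) {
            throw new IllegalArgumentException("not a FLOAT vector");
        }
        float[] out = new float[buf.remaining() / 4];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getFloat();
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = encodeFloat(new float[]{0.5f, 1.2f, -0.8f});
        System.out.println(bytes.length);                  // 1 + 4 * 3 = 13
        System.out.println(storageSize(TYPE_HALF, 128));   // 1 + 2 * 128 = 257
    }
}
```

The sketch omits :sql:`HALF` encoding, because converting to 16-bit floats needs extra code that does not change the layout.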

Working with Vectors
====================

Vector Literals and Prepared Statements
----------------------------------------

**Important**: Vector literals are not directly supported in SQL. Vectors must be inserted using **prepared statement
parameters** through the JDBC API.

In the JDBC API, you would create a prepared statement and bind vector parameters using the appropriate Java objects
(e.g., :java:`HalfRealVector`, :java:`FloatRealVector`, or :java:`DoubleRealVector`):

.. code-block:: java

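// Java JDBC example. Illustrative only: assumes an open JDBC Connection named
// "connection" and a two-column table such as (id BIGINT, embedding VECTOR(3, FLOAT)).
PreparedStatement stmt = connection.prepareStatement(
"INSERT INTO embeddings VALUES (?, ?)");
stmt.setLong(1, 1L);
// Bind the vector as a Java object; its length must match the column's dimension.
stmt.setObject(2, new FloatRealVector(new float[]{0.5f, 1.2f, -0.8f}));
stmt.executeUpdate();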

The SQL examples below construct vectors by casting numeric arrays to vector types. This works in :sql:`SELECT`
queries, but :sql:`CAST` expressions have limitations in :sql:`INSERT` statements in prepared statement contexts, so
**insert vectors using prepared statement parameters through the JDBC API** (as in the Java example above):

.. code-block:: sql

-- Example: Constructing vectors using CAST (works in SELECT contexts)
SELECT CAST([0.5, 1.2, -0.8] AS VECTOR(3, HALF)) AS half_vector;
SELECT CAST([0.5, 1.2, -0.8] AS VECTOR(3, FLOAT)) AS float_vector;
SELECT CAST([0.5, 1.2, -0.8] AS VECTOR(3, DOUBLE)) AS double_vector;

Casting Arrays to Vectors
--------------------------

While vector literals are not supported, you can use :sql:`CAST` to convert array expressions to vectors. The source array elements can be of any numeric type (:sql:`INTEGER`, :sql:`BIGINT`, :sql:`FLOAT`, :sql:`DOUBLE`):

.. code-block:: sql

-- Cast FLOAT array to FLOAT vector
SELECT CAST([1.2, 3.4, 5.6] AS VECTOR(3, FLOAT)) AS vec;

-- Cast INTEGER array to HALF vector
SELECT CAST([1, 2, 3] AS VECTOR(3, HALF)) AS vec;

-- Cast mixed numeric types to DOUBLE vector
SELECT CAST([1, 2.5, 3L] AS VECTOR(3, DOUBLE)) AS vec;

The array must have exactly the same number of elements as the vector's declared dimension, or the cast will fail with error code :sql:`22F3H`. Only numeric arrays can be cast to vectors.
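
The dimension rule can be mimicked in plain Java as a rough mental model. This is a sketch under stated assumptions, not the Relational Layer's cast implementation. The class ``CastToVectorSketch`` and method ``castToFloatVector`` are hypothetical, and using ``SQLException`` to carry the ``22F3H`` code is an assumption made for illustration.

```java
import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the documented CAST rule; not library code.
final class CastToVectorSketch {
    // Convert any numeric list to a float[] of exactly `dimension` elements.
    static float[] castToFloatVector(List<? extends Number> source, int dimension)
            throws SQLException {
        if (source.size() != dimension) {
            // Assumption: surface the documented error code as the SQLSTATE.
            throw new SQLException(
                    "array has " + source.size() + " elements, expected " + dimension,
                    "22F3H");
        }
        float[] out = new float[dimension];
        for (int i = 0; i < dimension; i++) {
            out[i] = source.get(i).floatValue(); // INTEGER, BIGINT, FLOAT, DOUBLE all narrow to float
        }
        return out;
    }

    public static void main(String[] args) throws SQLException {
        // Mixed numeric types, matching dimension: succeeds.
        float[] v = castToFloatVector(Arrays.<Number>asList(1, 2.5, 3L), 3);
        System.out.println(v.length); // 3

        // Two elements cast to a 3-dimensional vector: rejected.
        try {
            castToFloatVector(Arrays.<Number>asList(1, 2), 3);
        } catch (SQLException e) {
            System.out.println(e.getSQLState()); // 22F3H
        }
    }
}
```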

Querying Vectors
----------------

Vectors can be selected and compared like other column types. In application code, vector values in :sql:`WHERE` clauses are typically bound as prepared statement parameters through JDBC; the examples here construct them with :sql:`CAST` for illustration:

.. code-block:: sql

-- Select vectors
SELECT embedding FROM embeddings WHERE id = 1;

-- Compare vectors for equality (in actual code use PreparedStatement for better performance)
SELECT id FROM embeddings WHERE embedding = CAST([0.5, 1.2, -0.8] AS VECTOR(3, FLOAT));

-- Compare vectors for inequality
SELECT id FROM embeddings WHERE embedding != CAST([1.0, 2.0, 3.0] AS VECTOR(3, FLOAT));

-- Check for NULL vectors
SELECT id FROM embeddings WHERE embedding IS NULL;
SELECT id FROM embeddings WHERE embedding IS NOT NULL;

-- Use IS DISTINCT FROM for NULL-safe comparisons
SELECT embedding IS DISTINCT FROM CAST([0.5, 1.2, -0.8] AS VECTOR(3, FLOAT)) FROM embeddings;

Vectors in Struct Fields
-------------------------

When vectors are nested within struct types, you can access them using dot notation. As with other vector comparisons, JDBC code would typically bind the vector as a prepared statement parameter:

.. code-block:: sql

-- Access vector within a struct
SELECT embedding.embedding FROM documents WHERE id = 1;

-- Filter by vector within struct (in actual code use PreparedStatement for better performance)
SELECT id FROM documents
WHERE embedding.embedding = CAST([0.5, 1.2, -0.8] AS VECTOR(3, FLOAT));

-- Check NULL for vector field in struct
SELECT id FROM documents WHERE embedding.embedding IS NULL;

Supported Operations
====================

The following operations are supported on vector types:

* **Equality comparison** (:sql:`=`, :sql:`!=`)
* **NULL checks** (:sql:`IS NULL`, :sql:`IS NOT NULL`)
* **NULL-safe comparison** (:sql:`IS DISTINCT FROM`, :sql:`IS NOT DISTINCT FROM`)
* **CAST from numeric arrays** to vectors
Review comment (Collaborator):

Should there be a mention here of the fact that you cannot create value indexes on vectors?
We also probably want to mention that you cannot ORDER BY a vector field at this point.
Can we group by a vector field?

Reply (Contributor Author):

I haven't tested any of that (yet), but I can get to it later (not important for now, but nice to cover for completeness).


-For array type fields:
Note that mathematical operations (addition, subtraction, dot product, etc.) are performed through the Java API using the :java:`RealVector` interface and its implementations (:java:`HalfRealVector`, :java:`FloatRealVector`, :java:`DoubleRealVector`), not through SQL.

-* If the whole array is unset, query returns NULL.
-* If the array is set to empty, query returns empty list.
-* All elements in the array should be set, arrays like :sql:`[1, NULL, 2, NULL]` are not supported.

@@ -190,7 +190,7 @@ public static RecordKeyExpressionProto.Value toProtoValue(@Nullable Object value
} else if (value instanceof byte[]) {
builder.setBytesValue(ZeroCopyByteString.wrap((byte[])value));
} else if (value != null) {
-throw new RecordCoreException("Unsupported value type").addLogInfo(
throw new RecordCoreException("Unsupported value type " + value.getClass()).addLogInfo(
"value_type", value.getClass().getName());
}
return builder.build();
@@ -22,6 +22,7 @@

import com.apple.foundationdb.annotation.API;
import com.apple.foundationdb.annotation.SpotBugsSuppressWarnings;
import com.apple.foundationdb.linear.RealVector;
import com.apple.foundationdb.record.Bindings;
import com.apple.foundationdb.record.EvaluationContext;
import com.apple.foundationdb.record.ObjectPlanHash;
@@ -224,6 +225,8 @@ public static Object toClassWithRealEquals(@Nonnull Object obj) {
return obj;
} else if (obj instanceof List) {
return obj;
} else if (obj instanceof RealVector) {
return obj;
} else {
throw new RecordCoreException("Tried to compare non-comparable object " + obj.getClass());
}