Skip to content

Conversation

danbev
Copy link
Member

@danbev danbev commented Oct 2, 2025

This commit adds support for models that use SentenceTransformer layers.

The motivation for this is that if converted model includes any of the numbered layers specified in the original models repository then these changes enable these models to be used and verified. Currently the model-conversion only support the base model output without any of the additional transformation layers.

Usage:
Convert the model that also includes the SentenceTransformer layers:

(venv) $ export EMBEDDING_MODEL_PATH="~/google/embeddinggemma-300M"
(venv) make embedding-convert-model

Verify the produced embeddings from the converted model against the original model embeddings:

(venv) make embedding-verify-logits-st

The original model can be run using SentenceTransformer:

(venv) make embedding-run-original-model-st

Run the converted model using "SentenceTransformer" layers whic enables pooling and normalization:

(venv) make embedding-run-converted-model-st

This commit adds support for models that use SentenceTransformer layers.

The motivation for this is that if converted model includes any of the
numbered layers specified in the original models repository then these
changes enable these models to be used and verified. Currently the
model-conversion only support the base model output without any of
the additional transformation layers.

Usage:
Convert the model that also includes the SentenceTransformer layers:
```console
(venv) $ export EMBEDDING_MODEL_PATH="~/google/embeddinggemma-300M"
(venv) make embedding-convert-model
```

Verify the produced embeddings from the converted model against the
original model embeddings:
```console
(venv) make embedding-verify-logits-st
```

The original model can be run using SentenceTransformer:
```console
(venv) make embedding-run-original-model-st
```

Run the converted model using "SentenceTransformer" layers whic
enables pooling and normalization:
```console
(venv) make embedding-run-converted-model-st
```
@github-actions github-actions bot added examples python python script changes labels Oct 2, 2025
printf("%9.6f ", embeddings[j * n_embd + i]);
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: The embd_norm parameter is not validated to ensure it's within acceptable range before being used

Suggestion: Add validation to ensure embd_norm is a valid value (>= -1) before using it in normalization

Code to review:

printf("Embeddings size: %d\n", n_embeddings);

Suggested fix:

if (embd_norm < -1) { fprintf(stderr, "Invalid embedding normalization value: %d\n", embd_norm); return 1; }

Automated review by PR Code Review Service

} else if (strcmp(argv[i], "-pooling") == 0) {
pooling_enabled = true;
} else if (strcmp(argv[i], "-embd-norm") == 0) {
if (i + 1 < argc) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: The code parses -embd-norm flag without validating the range of embd_norm values, which may cause unexpected behavior for invalid normalization types.

Suggestion: Add validation to ensure embd_norm is within acceptable range (-1 to 2) or is a positive integer for p-norm.

Code to review:

                        embd_norm = std::stoi(argv[++i]);

Suggested fix:

if (embd_norm < -1) { fprintf(stderr, "Invalid embd_norm value: %d\n", embd_norm); return 1; }

Automated review by PR Code Review Service

static void print_usage(int, char ** argv) {
printf("\nexample usage:\n");
printf("\n %s -m model.gguf [-ngl n_gpu_layers] -embd-mode [prompt]\n", argv[0]);
printf("\n %s -m model.gguf [-ngl n_gpu_layers] -embd-mode [-pooling] [-embd-norm <norm>] [prompt]\n", argv[0]);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Unnecessary code duplication in parameter parsing

Suggestion: Refactor duplicated code into a helper function to reduce redundancy

Code to review:

printf("%s: error: unable to load model\n" , __func__);

Automated review by PR Code Review Service

printf("Saving logits to %s\n", bin_filename);
printf("Saving data to %s\n", bin_filename);

FILE * f = fopen(bin_filename, "wb");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Missing error checking for file operations when writing binary data

Suggestion: Add checks for fwrite return value to ensure data was written successfully

Code to review:

fwrite(data_ptr, sizeof(float), data_size, f);

Automated review by PR Code Review Service

@@ -211,27 +239,27 @@

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Missing error checking for file operations when writing text data

Suggestion: Add checks for fprintf return values to ensure data was written successfully

Code to review:

fprintf(f, "%d: %.6f\n", i, data_ptr[i]);

Automated review by PR Code Review Service

printf("\nexample usage:\n");
printf("\n %s -m model.gguf [-ngl n_gpu_layers] -embd-mode [prompt]\n", argv[0]);
printf("\n %s -m model.gguf [-ngl n_gpu_layers] -embd-mode [-pooling] [-embd-norm <norm>] [prompt]\n", argv[0]);
printf("\n");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: The usage message for command-line arguments is updated but doesn't fully reflect the new options.

Suggestion: Ensure usage message includes all new command-line flags and their descriptions.

Code to review:

    printf("  -embd-norm: normalization type for pooled embeddings (default: 2)\n");

Suggested fix:

printf("  -embd-norm: normalization type for pooled embeddings (default: 2)\n");

Automated review by PR Code Review Service


cmake --build ../../build --target llama-logits -j8
# TODO: update logits.cpp to accept a --file/-f option for the prompt
../../build/bin/llama-logits -m "$CONVERTED_MODEL" -embd-mode "$PROMPT"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: The script does not validate that the model file exists before attempting to use it, which could result in a cryptic error message.

Suggestion: Add a check to ensure the model file exists before proceeding.

Code to review:

if [ -n "$USE_POOLING" ]; then

Suggested fix:

if [ ! -f "$CONVERTED_MODEL" ]; then echo "Error: Model file '$CONVERTED_MODEL' not found" >&2; exit 1; fi

Automated review by PR Code Review Service

# TODO: update logits.cpp to accept a --file/-f option for the prompt
../../build/bin/llama-logits -m "$CONVERTED_MODEL" -embd-mode "$PROMPT"
if [ -n "$USE_POOLING" ]; then
../../build/bin/llama-logits -m "$CONVERTED_MODEL" -embd-mode -pooling "$PROMPT"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: The script does not handle the case where cmake build fails, potentially leading to an execution of a non-existent binary.

Suggestion: Add error handling to ensure cmake build succeeds before proceeding.

Code to review:

cmake --build ../../build --target llama-logits -j8

Suggested fix:

cmake --build ../../build --target llama-logits -j8 || { echo "Error: Build failed" >&2; exit 1; }

Automated review by PR Code Review Service

# Parse command line arguments
CONVERTED_MODEL=""
PROMPTS_FILE=""
USE_POOLING=""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue: Variable initialization for USE_POOLING is a good practice but should be done in a way that's consistent with other variable declarations.

Suggestion: Initialize USE_POOLING with empty string at the same time as other variables for consistency.

Code to review:

+USE_POOLING=""

Suggested fix:

CONVERTED_MODEL=""
PROMPTS_FILE=""
USE_POOLING=""

Automated review by PR Code Review Service

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
examples python python script changes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants