Skip to content

Conversation

@michalkulakowski
Copy link
Collaborator

🛠 Summary

CVS-178915

This PR adds allowed_media_domains parameter that specifies domains from which OVMS can download images requested in v3/completion request image_url field

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.
    ``

@michalkulakowski michalkulakowski changed the title Support allowed domains parameter [WIP] Support allowed domains parameter Jan 19, 2026
@michalkulakowski michalkulakowski force-pushed the mkulakow/allowed_domains_parameter branch from 761630d to df5ea5f Compare January 23, 2026 12:27
@michalkulakowski michalkulakowski changed the title [WIP] Support allowed domains parameter Support allowed domains parameter Jan 23, 2026
@michalkulakowski michalkulakowski force-pushed the mkulakow/allowed_domains_parameter branch from df5ea5f to e85d080 Compare January 23, 2026 12:30
| `allowed_origins` | `string` (default: *) | Comma-separated list of allowed origins in CORS requests. |
| `api_key_file` | `string` | Path to the text file with the API key for generative endpoints `/v3/`. The value of first line is used. If not specified, server is using environment variable API_KEY. If not set, requests will not require authorization.|
| `allowed_local_media_path` | `string` | Path to the directory containing images to include in requests. If unset, local filesystem images in requests are not supported.|
| `allowed_media_domains` | `string` | Comma separated list of media domains from which urls can be used as input for LLMs. Set to \"all\" to disable this restriction."
Copy link
Collaborator

@dkalinowski dkalinowski Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| `allowed_media_domains` | `string` | Comma separated list of media domains from which urls can be used as input for LLMs. Set to \"all\" to disable this restriction."
| `allowed_media_domains` | `string` | Comma separated list of media domains from which URLs can be used as input for LLMs. Set to \"all\" to disable this restrictions."

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mention default behavior (as in the allowed_local_media_path - "If unset, ...")

Generative endpoints starting with `/v3`, might be restricted with authorization and API key. It can be set during the server initialization with a parameter `api_key_file` or environment variable `API_KEY`.
The `api_key_file` should contain a path to the file containing the value of API key. The content of the file first line is used. If parameter api_key_file and variable API_KEY are not set, the server will not require any authorization. The client should send the API key inside the `Authorization` header as `Bearer <api_key>`.

OVMS supports multimodal models with image inputs provided as url's. However due to vulnerability to Server-Side Request Forgery (SSRF) attacks, all the url's are restricted by default. To allow fetching image from specific domains use `--allowed_media_domains` parameter described [here](parameters.md). Also, consider setting OVMS_MEDIA_URL_ALLOW_REDIRECTS=0 to prevent HTTP redirects from being followed to bypass domain restrictions.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
OVMS supports multimodal models with image inputs provided as url's. However due to vulnerability to Server-Side Request Forgery (SSRF) attacks, all the url's are restricted by default. To allow fetching image from specific domains use `--allowed_media_domains` parameter described [here](parameters.md). Also, consider setting OVMS_MEDIA_URL_ALLOW_REDIRECTS=0 to prevent HTTP redirects from being followed to bypass domain restrictions.
OVMS supports multimodal models with image inputs provided as URL. However due to prevent Request Forgery (SSRF) attacks, all the URLs are restricted by default. To allow fetching image from specific domains use `--allowed_media_domains` parameter described [here](parameters.md). Also, consider setting OVMS_MEDIA_URL_ALLOW_REDIRECTS=0 to prevent HTTP redirects from being followed to bypass domain restrictions.

@mzegla mzegla requested a review from Copilot January 23, 2026 12:33
cxxopts::value<std::string>()->default_value(""),
"CPU_EXTENSION")
("allowed_media_domains",
"Comma separated list of media domains from which urls can be used as input for LLMs. Set to \"all\" to disable this restriction.",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"Comma separated list of media domains from which urls can be used as input for LLMs. Set to \"all\" to disable this restriction.",
"Comma separated list of media domains from which URLs can be used as input for LLMs. Set to \"all\" to disable this restriction.",

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds support for restricting image URL domains in multimodal LLM requests to mitigate Server-Side Request Forgery (SSRF) vulnerabilities. The feature introduces an allowed_media_domains parameter that controls which domains can be used in image URL requests.

Changes:

  • Added allowed_media_domains CLI parameter and configuration option to specify allowed domains for image URLs
  • Implemented domain validation logic that checks URLs against the allowed domains list before downloading images
  • Updated documentation to explain the security implications and usage of the new parameter

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
src/test/ovmsconfig_test.cpp Added test coverage for parsing the new allowed_media_domains parameter
src/test/http_openai_handler_test.cpp Added comprehensive tests for domain validation including positive and negative test cases
src/llm/visual_language_model/legacy/servable.cpp Updated to pass allowedMediaDomains parameter through the request parsing chain
src/llm/apis/openai_completions.hpp Updated method signatures to accept allowedMediaDomains parameter
src/llm/apis/openai_completions.cpp Implemented domain validation logic and integrated it into the message parsing flow
src/cli_parser.cpp Added CLI option definition and server settings configuration for allowed_media_domains
src/capi_frontend/server_settings.hpp Added allowedMediaDomains field to server settings structure
docs/security_considerations.md Documented the SSRF vulnerability and mitigation using the new parameter
docs/parameters.md Added documentation for the allowed_media_domains parameter
docs/model_server_rest_api_chat.md Added note about requiring the parameter for URL-based image requests
.bazelrc Added environment variable for test configuration

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

return absl::OkStatus();
}

static bool isDomainAllowed(std::vector<std::string> allowedDomains, const char* url) {
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The allowedDomains parameter is passed by value, causing an unnecessary copy of the vector on each function call. Pass by const reference instead: const std::vector<std::string>& allowedDomains.

Copilot uses AI. Check for mistakes.
return false;
}
bool allowed = false;
for (auto allowedDomain : allowedDomains) {
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop variable allowedDomain creates a copy of each string in the vector. Use a const reference to avoid unnecessary copies: for (const auto& allowedDomain : allowedDomains).

Copilot uses AI. Check for mistakes.
bool allowed = false;
for (auto allowedDomain : allowedDomains) {
if (allowedDomain.compare(host) == 0) {
allowed = true;
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop continues iterating even after finding a match. Add a break statement after setting allowed = true to exit early when a match is found.

Suggested change
allowed = true;
allowed = true;
break;

Copilot uses AI. Check for mistakes.
char* host;
rc = curl_url_get(parsedUrl, CURLUPART_HOST, &host, 0);
if (rc) {
SPDLOG_LOGGER_DEBUG(llm_calculator_logger, "Parsing url {} hostname failed", url);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leak, missing curl_url_cleanup(parsedUrl); ?
use guards or free everything in good order?

#endif
"--cache_dir", "/tmp/model_cache",
"--allowed_local_media_path", "/tmp/path",
"--allowed_media_domains", "raw.githubusercontent.com,githubusercontent.com,google.com",
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about C-API parameter?

return nullptr;
}

DLL_PUBLIC OVMS_Status* OVMS_ServerSettingsSetAllowedLocalMediaPath(OVMS_ServerSettings* settings,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OVMS_API_VERSION_MINOR?

Copy link
Collaborator

@dkalinowski dkalinowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should increase OVMS_API_VERSION_MINOR
other than that it looks ok

| `allowed_origins` | `string` (default: *) | Comma-separated list of allowed origins in CORS requests. |
| `api_key_file` | `string` | Path to the text file with the API key for generative endpoints `/v3/`. The value of first line is used. If not specified, server is using environment variable API_KEY. If not set, requests will not require authorization.|
| `allowed_local_media_path` | `string` | Path to the directory containing images to include in requests. If unset, local filesystem images in requests are not supported.|
| `allowed_media_domains` | `string` | Comma separated list of media domains from which urls can be used as input for LLMs. Set to \"all\" to disable this restriction."
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please mention default behavior (as in the allowed_local_media_path - "If unset, ...")

return true;
}
CURLUcode rc;
CURLU* parsedUrl = curl_url();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure it's a valid pointer at this point?

@michalkulakowski michalkulakowski merged commit edcba96 into main Jan 26, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants