[OPIK-6626] [BE] feat: support multiple environments per prompt version #6873

Merged
BorisTkachenko merged 5 commits into main from boryst/OPIK-6626-be-support-multiple-environments-for-prompts
May 27, 2026

@BorisTkachenko
Contributor

Details

Extends the prompt version environment model from a single environment field to a Set<String> environments, allowing one version to be pinned to multiple environments simultaneously.

Schema changes:

  • New prompt_version_envs table (migration 000077) tracks environment-to-version mappings with soft-delete via ended_at, replacing the environment column on prompt_versions
  • Indexes added for efficient lookup by workspace+prompt+environment and by workspace+version

API changes:

  • PromptVersion.environment → environments: Set<String> (nullable, max 100 entries, each validated against Environment.NAME_PATTERN)
  • PromptVersionEnvironmentUpdate.environment → environments: Set<String> (non-null; an empty set clears all assignments; previously a nullable string, where null meant clear)
  • Validation updated: mask versions reject non-empty environments sets
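
The constraints listed above can be sketched as a standalone check. This is only an illustration: EnvironmentSetValidator and ASSUMED_NAME_PATTERN are hypothetical names, and the regex is a placeholder because the actual Environment.NAME_PATTERN is not shown in this PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical validator mirroring the rules described in the PR summary. */
final class EnvironmentSetValidator {

    // Assumed stand-in for Environment.NAME_PATTERN; the real regex may differ.
    static final Pattern ASSUMED_NAME_PATTERN = Pattern.compile("^[a-zA-Z0-9_-]{1,150}$");
    static final int MAX_ENVIRONMENTS = 100;

    /** Returns human-readable violations; an empty list means the set is acceptable. */
    static List<String> validate(Set<String> environments, boolean isMaskVersion) {
        List<String> violations = new ArrayList<>();
        if (environments == null) {
            return violations; // nullable on PromptVersion: nothing requested
        }
        if (isMaskVersion && !environments.isEmpty()) {
            violations.add("mask versions cannot be assigned to environments");
        }
        if (environments.size() > MAX_ENVIRONMENTS) {
            violations.add("at most " + MAX_ENVIRONMENTS + " environments are allowed");
        }
        for (String name : environments) {
            if (name == null || !ASSUMED_NAME_PATTERN.matcher(name).matches()) {
                violations.add("invalid environment name: " + name);
            }
        }
        return violations;
    }
}
```

In the real code these rules are presumably expressed as bean-validation constraints on the API model rather than as a hand-written method.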

Service / DAO changes:

  • PromptService.setVersionEnvironment accepts Set<String> and performs: close all current env assignments for the version, close any existing ownerships for the incoming envs on the same prompt, then batch-insert the new assignments
  • createVersionWithEnvironment checks for conflicts via findTakenEnvironments (set-based) instead of single-version lookup; raises 409 listing all conflicting env names
  • savePromptVersion now also calls saveEnvironments in the same transaction
  • Queries that return PromptVersion objects use a correlated subquery (JSON_ARRAYAGG) to populate environments from the new table
  • EnvironmentService.findExistingNames added to validate the full set of requested envs before assignment
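
The three steps that setVersionEnvironment performs (close the version's current assignments, close existing ownerships of the incoming environments on the same prompt, insert the new rows) can be sketched with an in-memory list standing in for prompt_version_envs. EnvAssignmentStore and its members are hypothetical names, and there is no transaction or locking here, unlike the real service.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** In-memory stand-in for prompt_version_envs; every name here is illustrative. */
final class EnvAssignmentStore {

    /** One ownership row; endedAt == null means the assignment is still active. */
    static final class Row {
        final String promptId;
        final String versionId;
        final String environment;
        Instant endedAt;

        Row(String promptId, String versionId, String environment) {
            this.promptId = promptId;
            this.versionId = versionId;
            this.environment = environment;
        }
    }

    final List<Row> rows = new ArrayList<>();

    /** Replaces the active environments of one version, keeping closed rows as history. */
    void setVersionEnvironments(String promptId, String versionId, Set<String> environments) {
        Instant now = Instant.now();
        for (Row row : rows) {
            if (row.endedAt != null) {
                continue; // already closed, kept as history
            }
            // Step 1: close everything the target version currently holds.
            boolean heldByTargetVersion = row.versionId.equals(versionId);
            // Step 2: close rows on the same prompt that own an incoming environment.
            boolean ownsIncomingEnv = row.promptId.equals(promptId)
                    && environments.contains(row.environment);
            if (heldByTargetVersion || ownsIncomingEnv) {
                row.endedAt = now; // soft delete
            }
        }
        // Step 3: insert the new assignments.
        for (String environment : environments) {
            rows.add(new Row(promptId, versionId, environment));
        }
    }

    /** Active environments of a version, sorted for stable output. */
    Set<String> activeEnvironments(String versionId) {
        Set<String> active = new TreeSet<>();
        for (Row row : rows) {
            if (row.endedAt == null && row.versionId.equals(versionId)) {
                active.add(row.environment);
            }
        }
        return active;
    }
}
```

Passing an empty set runs only the closing steps, which matches the "empty set clears all" behavior described for the update endpoint.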

Change checklist

  • User facing
  • Documentation update

Issues

  • OPIK-6626

Testing

Integration tests in PromptResourceTest updated to cover:

  • Creating a version with multiple environments
  • Setting multiple environments via PUT /v1/private/prompts/versions/{id}/environment
  • Conflict detection when two versions claim overlapping environments
  • Clearing all environments with an empty set
  • Mask version rejection when environments is non-empty
  • Environment existence validation (404 on unknown names)
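
The conflict and existence checks exercised by these tests can be sketched as two set operations. The class and method names below are hypothetical, chosen to mirror the roles of findTakenEnvironments and findExistingNames from the description; the real DAO signatures are not shown in this PR.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Illustrative set-based checks; not the actual service code. */
final class EnvironmentChecks {

    /** Requested names that are not known environments (the 404 case). */
    static Set<String> findMissingNames(Set<String> requested, Set<String> existing) {
        Set<String> missing = new TreeSet<>(requested);
        missing.removeAll(existing);
        return missing;
    }

    /** Requested names whose active owner is a different version (the 409 case). */
    static Set<String> findTakenByOthers(Set<String> requested,
                                         Map<String, String> activeOwnerByEnv,
                                         String versionId) {
        Set<String> taken = new TreeSet<>();
        for (String environment : requested) {
            String owner = activeOwnerByEnv.get(environment);
            if (owner != null && !owner.equals(versionId)) {
                taken.add(environment);
            }
        }
        return taken;
    }
}
```

Returning the full sorted set, rather than failing on the first hit, is what lets the error response list every conflicting name at once.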

Documentation

No public documentation changes — internal API behavior change. Callers previously passing a single environment string must migrate to environments: [<name>] (or environments: [] to clear).
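
For callers making this migration, a minimal sketch of the mapping from the old nullable string to the new set follows. EnvironmentPayloadMigration is a hypothetical helper, and the JSON rendering assumes names contain no characters that need escaping.

```java
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical client-side helper for the environment -> environments change. */
final class EnvironmentPayloadMigration {

    /** Maps the legacy nullable value to the new non-null set (null meant "clear"). */
    static Set<String> toEnvironments(String legacyEnvironment) {
        return legacyEnvironment == null ? Set.of() : Set.of(legacyEnvironment);
    }

    /** Renders the request body; names are assumed to need no JSON escaping. */
    static String toJsonBody(Set<String> environments) {
        return environments.stream()
                .sorted()
                .map(name -> "\"" + name + "\"")
                .collect(Collectors.joining(", ", "{\"environments\": [", "]}"));
    }
}
```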

@BorisTkachenko BorisTkachenko self-assigned this May 26, 2026
@github-actions github-actions bot added the java, Backend, and tests labels May 26, 2026
@BorisTkachenko BorisTkachenko marked this pull request as ready for review May 26, 2026 17:21
@BorisTkachenko BorisTkachenko requested a review from a team as a code owner May 26, 2026 17:21
Contributor

@thiagohora thiagohora left a comment


A few follow-ups before moving forward

Comment thread apps/opik-backend/src/main/java/com/comet/opik/api/PromptVersion.java Outdated
'type', pv.type,
'version_type', pv.version_type,
'environment', pv.environment,
'environments', (SELECT JSON_ARRAYAGG(pve.environment) FROM prompt_version_envs pve WHERE pve.workspace_id = pv.workspace_id AND pve.version_id = pv.id AND pve.ended_at IS NULL),
Contributor


Can you check the query plan? I think this will run once per row

Contributor Author


Updated the query

Contributor


Commit d1f5296 addressed this comment by introducing CTEs that materialize prompt versions and their active environments before the final SELECT, so the environment aggregation no longer runs per row but is computed once and joined back to the version records.
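
The difference between a per-row subquery and "aggregate once, then join" can be illustrated outside SQL. The sketch below is an in-memory analogy only, with hypothetical names; it is not the query from commit d1f5296. It groups active environment rows by version in a single pass and then attaches the result to each version.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** In-memory analogy for materializing active environments once and joining them back. */
final class ActiveEnvJoin {

    /** Simplified environment row; active corresponds to ended_at IS NULL. */
    static final class EnvRow {
        final String versionId;
        final String environment;
        final boolean active;

        EnvRow(String versionId, String environment, boolean active) {
            this.versionId = versionId;
            this.environment = environment;
            this.active = active;
        }
    }

    /** One pass over the environment rows, analogous to materializing the CTE once. */
    static Map<String, Set<String>> groupActiveByVersion(List<EnvRow> envRows) {
        Map<String, Set<String>> byVersion = new HashMap<>();
        for (EnvRow row : envRows) {
            if (row.active) {
                byVersion.computeIfAbsent(row.versionId, id -> new TreeSet<>()).add(row.environment);
            }
        }
        return byVersion;
    }

    /** Left-join style lookup: versions without active environments get an empty set. */
    static Map<String, Set<String>> attach(List<String> versionIds, List<EnvRow> envRows) {
        Map<String, Set<String>> grouped = groupActiveByVersion(envRows); // computed once
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (String versionId : versionIds) {
            result.put(versionId, grouped.getOrDefault(versionId, Set.of()));
        }
        return result;
    }
}
```

One difference from the SQL: a left join with JSON_ARRAYAGG would yield NULL for a version with no active rows, whereas this sketch returns an empty set.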

Contributor Author

@BorisTkachenko BorisTkachenko May 27, 2026


After updating the query, I checked the query plan. All CTEs are run only once, and everything seems okay.

| -> Table scan on p  (cost=46 rows=435)
-> Select #2 (subquery in projection; run only once)
    -> Aggregate: count(pv.id)  (cost=116 rows=1)
        -> Filter: (pv.version_type = 'prompt_version')  (cost=79.9 rows=366)
            -> Table scan on pv  (cost=79.9 rows=731)
-> Select #4 (subquery in projection; run only once)
    -> Limit: 1 row(s)
        -> Sort: id DESC, limit input to 1 row(s) per chunk
            -> Stream results  (cost=46.5 rows=366)
                -> Left hash join (<hash>(ve.version_id)=<hash>(pv.id)), extra conditions: (ve.version_id = pv.id)  (cost=46.5 rows=366)
                    -> Filter: (pv.version_type = 'prompt_version')  (cost=79.9 rows=366)
                        -> Table scan on pv  (cost=79.9 rows=731)
                    -> Hash
                        -> Table scan on ve  (cost=3.1..3.1 rows=1)
                            -> Materialize CTE ver_envs if needed  (cost=0.585..0.585 rows=1)
                                -> Group aggregate: json_arrayagg(pve.environment)  (cost=0.485 rows=1)
                                    -> Nested loop inner join  (cost=0.385 rows=1)
                                        -> Sort: version_id  (cost=0.26 rows=1)
                                            -> Filter: (pve.ended_at is null)  (cost=0.26 rows=1)
                                                -> Index lookup on pve using uq_pve_active_env (workspace_id='default')  (cost=0.26 rows=1)
                                        -> Single-row covering index lookup on pv using PRIMARY (id=pve.version_id)  (cost=1.25 rows=1)
-> Select #9 (subquery in projection; run only once)
    -> Left hash join (<hash>(ve.version_id)=<hash>(pv.id)), extra conditions: (ve.version_id = pv.id)  (cost=46.5 rows=366)
        -> Filter: (pv.version_type = 'prompt_version')  (cost=79.9 rows=366)
            -> Table scan on pv  (cost=79.9 rows=731)
        -> Hash
            -> Table scan on ve  (cost=3.1..3.1 rows=1)
                -> Materialize CTE ver_envs if needed (query plan printed elsewhere)  (cost=0.585..0.585 rows=1)
 |

Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptService.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptService.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptVersionDAO.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptVersionDAO.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptService.java Outdated
Contributor

@thiagohora thiagohora left a comment


Two review notes (non-blocking):

  1. Tests — the new environment tests assert only .environments() rather than the whole PromptVersion. Reuse the existing getPromptVersionAndAssert / usingRecursiveComparison helpers so other fields are covered too. (inline)

  2. Single-owner invariant — dropping the unique index moves this guarantee from the DB to the app lock. I verified the write paths, and a DB-level unique constraint via a generated active_env column is fully compatible with them and would restore defense-in-depth. (inline)
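
The effect of a unique constraint over a generated active_env column can be sketched in memory. The assumption here is that the generated column equals the environment name while the row is active and becomes NULL once ended_at is set, so closed rows never collide. ActiveEnvUniqueIndex is a hypothetical name, and the map plays the role of the unique index.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** In-memory emulation of a unique index over (prompt, active environment). */
final class ActiveEnvUniqueIndex {

    // Only active rows have a key; ended rows drop out, like a NULL generated column.
    private final Map<List<String>, String> activeOwner = new HashMap<>();

    /** Registers an active owner, rejecting a second active owner for the same key. */
    void insertActive(String promptId, String environment, String versionId) {
        String existing = activeOwner.putIfAbsent(List.of(promptId, environment), versionId);
        if (existing != null) {
            throw new IllegalStateException(
                    "environment '" + environment + "' is already owned by version " + existing);
        }
    }

    /** Ends the active ownership, freeing the key for a new owner. */
    void end(String promptId, String environment) {
        activeOwner.remove(List.of(promptId, environment));
    }
}
```

With such a constraint in place, a write path that skipped the application lock would fail at insert time instead of silently creating two active owners.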

Member

@andrescrz andrescrz left a comment


There's a change in the contract for environments. I agree that it's for good reasons and a good one in the long term, but it breaks backwards compatibility. Can we do that? How do you ensure that we don't break existing clients? Are our clients only internal SDKs or the FE, or do we potentially have external clients?

@BorisTkachenko
Contributor Author

> There's a change in the contract for environments. I agree that it's for good reasons and a good one in the long term, but it breaks backwards compatibility. Can we do that? How do you ensure that we don't break existing clients? Are our clients only internal SDKs or the FE, or do we potentially have external clients?

@andrescrz This feature is not yet in production, so the current environment field is not being used by the Frontend or the SDK. Breaking the contract is safe here.

Member

@andrescrz andrescrz left a comment


No concern about the backwards compatibility after your explanation.

I gave a more specific review.

Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/EnvironmentDAO.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/EnvironmentService.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptService.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptService.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptService.java Outdated
--liquibase formatted sql
--changeset boryst:000077_add_prompt_version_envs_table

CREATE TABLE prompt_version_envs (
Member


Given the low number of envs (100), I'm not sure a normalised schema is beneficial here over simply storing an array of values on the version. The number is unlikely to increase.

I believe this needs more investigation for a proper decision.

Contributor Author


The reasons for a separate table are as follows:

  • Ownership tracking across versions. The ended_at column is what makes the "move environment" operation possible without losing history; we discussed with the product team that this history might be needed later. With an array on the version row, moving env=production from v2 to v3 means patching two rows (v2.environments -= production, v3.environments += production). With the current schema, you soft-delete the old ownership row and insert a new one atomically, so the history is preserved.

  • DB-level uniqueness. The generated column + unique index enforces "only one active version owns this environment per prompt" at the database level. With a JSON array you can't do that — you'd be back to app-level enforcement only.

  • Indexed ownership lookups. "Which version currently owns environment X for prompt Y?" is a single indexed point lookup against prompt_version_envs WHERE environment = :env AND ended_at IS NULL. With a JSON array on the version, you'd need JSON_CONTAINS, which cannot use an ordinary index in MySQL (only a multi-valued index, available from 8.0.17).
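
The "move environment" operation and the two lookups it enables (current owner and ownership history) can be sketched for a single prompt as follows. EnvOwnershipHistory and its members are hypothetical names; the linear scan stands in for what the table serves through an index.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Per-prompt ownership log: moving an environment closes one row and opens another. */
final class EnvOwnershipHistory {

    static final class Ownership {
        final String versionId;
        final Instant startedAt;
        Instant endedAt; // null while this version still owns the environment

        Ownership(String versionId, Instant startedAt) {
            this.versionId = versionId;
            this.startedAt = startedAt;
        }
    }

    private final Map<String, List<Ownership>> byEnvironment = new HashMap<>();

    /** Soft-closes the current owner (if any) and records the new one. */
    void moveTo(String environment, String versionId, Instant at) {
        List<Ownership> history = byEnvironment.computeIfAbsent(environment, e -> new ArrayList<>());
        for (Ownership ownership : history) {
            if (ownership.endedAt == null) {
                ownership.endedAt = at;
            }
        }
        history.add(new Ownership(versionId, at));
    }

    /** The version that currently owns the environment, if any. */
    Optional<String> currentOwner(String environment) {
        for (Ownership ownership : byEnvironment.getOrDefault(environment, List.of())) {
            if (ownership.endedAt == null) {
                return Optional.of(ownership.versionId);
            }
        }
        return Optional.empty();
    }

    /** Every version that has owned the environment, oldest first. */
    List<String> ownersOverTime(String environment) {
        List<String> owners = new ArrayList<>();
        for (Ownership ownership : byEnvironment.getOrDefault(environment, List.of())) {
            owners.add(ownership.versionId);
        }
        return owners;
    }
}
```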

Member

@andrescrz andrescrz left a comment


The PR still has some blocking gaps. Please take a look at the specific comments. I'll leave it to the other reviewer, @thiagohora, to double-check the PromptDAO queries as a result of the query plan review.

Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/EnvironmentDAO.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/EnvironmentService.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptVersionDAO.java Outdated

@SqlUpdate("""
UPDATE prompt_version_envs
SET ended_at = CURRENT_TIMESTAMP(6)
Member


I believe you should track the typical update metadata: last_updated_at and last_updated_by. Any reason for not doing this in the new prompt_version_envs table?

Contributor Author

@BorisTkachenko BorisTkachenko May 27, 2026


No, this is the desired behavior. With our typical update metadata, we would not be able to track the history; the row would just be updated in place.

Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptVersionDAO.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptVersionDAO.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptVersionDAO.java Outdated
Comment thread apps/opik-backend/src/main/java/com/comet/opik/domain/PromptVersionDAO.java Outdated
Comment on lines +10 to +11
created_at TIMESTAMP(6) NOT NULL DEFAULT CURRENT_TIMESTAMP(6),
created_by VARCHAR(255) NOT NULL DEFAULT '',
Member


As mentioned in another comment, there are update semantics here. We likely need the typical metadata for that: last_updated_at and last_updated_by.

Contributor Author


We never update an environment in a regular way

Member


Not a blocker on my end, but you won't capture who closed the environment.

Contributor Author


Actually, we will be able to capture it. Whoever created the next assignment for that environment is the one who closed this one, so we will be able to retrieve this info.
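
The inference described here can be sketched as a lookup for the successor row: the earliest row for the same environment created at or after the closed row's ended_at. ClosedByLookup and its Row type are hypothetical, and the sketch assumes a successor exists; when an assignment is closed without a replacement for that environment, it returns an empty result.

```java
import java.time.Instant;
import java.util.List;
import java.util.Optional;

/** Derives who closed an assignment from the creator of its successor row. */
final class ClosedByLookup {

    /** Simplified assignment row for one prompt. */
    static final class Row {
        final String environment;
        final String createdBy;
        final Instant createdAt;
        final Instant endedAt; // null while active

        Row(String environment, String createdBy, Instant createdAt, Instant endedAt) {
            this.environment = environment;
            this.createdBy = createdBy;
            this.createdAt = createdAt;
            this.endedAt = endedAt;
        }
    }

    /** Creator of the earliest same-environment row created at or after closed.endedAt. */
    static Optional<String> closedBy(Row closed, List<Row> rowsForPrompt) {
        if (closed.endedAt == null) {
            return Optional.empty(); // still active, nobody closed it
        }
        Row successor = null;
        for (Row candidate : rowsForPrompt) {
            if (candidate == closed || !candidate.environment.equals(closed.environment)) {
                continue;
            }
            if (candidate.createdAt.isBefore(closed.endedAt)) {
                continue; // created before the close, cannot be the successor
            }
            if (successor == null || candidate.createdAt.isBefore(successor.createdAt)) {
                successor = candidate;
            }
        }
        return successor == null ? Optional.empty() : Optional.of(successor.createdBy);
    }
}
```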

Member

@andrescrz andrescrz left a comment


On my end, the PR looks good and can go through. I recommend that @thiagohora double-check the latest PromptDAO queries, as he requested the execution plan.

@BorisTkachenko BorisTkachenko merged commit 018d81e into main May 27, 2026
65 checks passed
@BorisTkachenko BorisTkachenko deleted the boryst/OPIK-6626-be-support-multiple-environments-for-prompts branch May 27, 2026 16:55
Labels

Backend, java, tests

3 participants