@prwhelan
Member

Preview currently shows the source document as the index request. Some internal components need both the source document and the id of the generated entities, so we've added a query param that controls the output format. `as_index_request` defaults to `false`, which shows the existing structure; `true` shows the response that the transform would use to index the generated entities.

@prwhelan added the >enhancement, :ml/Transform, Team:ML, and v9.3.0 labels Oct 31, 2025
@elasticsearchmachine
Collaborator

Hi @prwhelan, I've created a changelog YAML for you.

@prwhelan marked this pull request as ready for review November 3, 2025 19:09
@elasticsearchmachine
Collaborator

Pinging @elastic/ml-core (Team:ML)

Member

@benwtrent left a comment

Makes sense to me. However, we should throw if not supported by the transport layer.

Comment on lines 132 to 134
if (out.getTransportVersion().supports(PREVIEW_AS_INDEX_REQUEST)) {
    out.writeBoolean(previewAsIndexRequest);
}
Member

Let's throw if somebody specifically requested true, but the transport version doesn't support it. This way they don't think they ran into a bug, but instead their cluster just isn't fully upgraded yet.
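The suggestion above can be sketched in plain Java, without any Elasticsearch classes. This is only an illustration under stated assumptions: versions are modeled as plain integers, and `PreviewWireSketch`, `writeFlag`, and `bytesWritten` are hypothetical names, not the code in this pull request. The point is the third case: when the receiver cannot read the flag and the caller asked for `true`, the writer fails instead of silently dropping the value.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class PreviewWireSketch {
    // Hypothetical version id at which the flag became part of the wire format.
    static final int PREVIEW_AS_INDEX_REQUEST = 2;

    // Writes the flag only to receivers that can read it, and fails loudly
    // when an explicit "true" would otherwise be dropped.
    static void writeFlag(DataOutputStream out, int receiverVersion, boolean previewAsIndexRequest)
            throws IOException {
        if (receiverVersion >= PREVIEW_AS_INDEX_REQUEST) {
            out.writeBoolean(previewAsIndexRequest);
        } else if (previewAsIndexRequest) {
            throw new IllegalArgumentException(
                "as_index_request set to true only works if all the nodes support it");
        }
        // Older receiver and the default (false): nothing is written,
        // so the existing behavior is unchanged.
    }

    // Helper for experimenting: returns how many bytes were written for the flag.
    static int bytesWritten(int receiverVersion, boolean flag) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            writeFlag(out, receiverVersion, flag);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        System.out.println("new receiver, true:  " + bytesWritten(2, true) + " byte(s)");
        System.out.println("old receiver, false: " + bytesWritten(1, false) + " byte(s)");
        try {
            bytesWritten(1, true);
        } catch (IllegalArgumentException e) {
            System.out.println("old receiver, true:  rejected, " + e.getMessage());
        }
    }
}
```

With this shape, callers that rely on the default keep working against older nodes, and only an explicit `true` is rejected.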

Member Author

@prwhelan Nov 5, 2025

Is there an example of this kind of thing elsewhere? This would effectively break the API until they're on the same version - would it be easier/better to use NodeFeature to instead disable the feature until every node is on the latest version?

Member

> This would effectively break the API until they're on the same version - would it be easier/better to use NodeFeature to instead disable the feature until every node is on the latest version?

No, it wouldn't. Folks can still use `false`, which is the default value. Why would we want to silently fail the request if somebody specifically asked for a new feature that we cannot provide?

Member

> would it be easier/better to use NodeFeature

This I do not know, but NodeFeature has additional guarantees and costs that we don't really need.

I think transport version is perfectly fine. I am saying that the default behavior of the API should still be usable, but if somebody sends a parameter that is not supported by the nodes that will receive the message, we should not silently ignore it and should instead tell the user "Hey, this feature you want to use isn't available yet."

This has been done before by other APIs in the past, though I don't have any examples handy.

"_preview with "
+ TransformField.PREVIEW_AS_INDEX_REQUEST.getPreferredName()
+ " set to true only works if all the nodes support it.",
RestStatus.FORBIDDEN
Member

I suggest BAD_REQUEST

Member Author

Sure - in the past I've used 403 or 422 for this, since the request was understood and syntactically correct, but the server is not going to process it. 400 has more of an "I don't know what you just said to me" vibe.
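To make the status-code discussion concrete, here is a minimal Java sketch of a request-level check that rejects `as_index_request=true` when any node is too old. Everything in it is an assumption for illustration: versions are plain integers, `PreviewRequestCheckSketch`, `StatusException`, `validate`, and `statusFor` are hypothetical names, and the status is set to 400 only because that was the value suggested in review; the thread does not say which status was finally used.

```java
import java.util.List;

final class PreviewRequestCheckSketch {
    // Hypothetical version id at which nodes started to understand the flag.
    static final int PREVIEW_AS_INDEX_REQUEST = 2;

    // Status used for the rejection; 400 follows the review suggestion,
    // while 403 or 422 were mentioned as alternatives.
    static final int REJECTION_STATUS = 400;

    // Hypothetical stand-in for an exception that carries an HTTP status.
    static final class StatusException extends RuntimeException {
        final int status;

        StatusException(String message, int status) {
            super(message);
            this.status = status;
        }
    }

    // The default (false) always passes; an explicit true requires every node
    // to be on a version that understands the flag.
    static void validate(boolean asIndexRequest, List<Integer> nodeVersions) {
        if (!asIndexRequest) {
            return;
        }
        // An empty node list is treated as unsupported, which is the conservative choice.
        int minVersion = nodeVersions.stream().mapToInt(Integer::intValue).min().orElse(0);
        if (minVersion < PREVIEW_AS_INDEX_REQUEST) {
            throw new StatusException(
                "_preview with as_index_request set to true only works if all the nodes support it.",
                REJECTION_STATUS);
        }
    }

    // Convenience wrapper: 200 when the request would proceed, otherwise the rejection status.
    static int statusFor(boolean asIndexRequest, List<Integer> nodeVersions) {
        try {
            validate(asIndexRequest, nodeVersions);
            return 200;
        } catch (StatusException e) {
            return e.status;
        }
    }

    public static void main(String[] args) {
        System.out.println(statusFor(true, List.of(2, 1)));
        System.out.println(statusFor(false, List.of(1, 1)));
        System.out.println(statusFor(true, List.of(2, 3)));
    }
}
```

Keeping the status in a single constant means that switching between 400, 403, and 422 is a one-line change once the reviewers settle on one.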

phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Nov 7, 2025
BASE=cb11a7164033f79a0c401a85ca8db22105631888
HEAD=078cff9adadabdb362d925d026037fe3ae84fed7
Branch=main
@prwhelan enabled auto-merge (squash) November 7, 2025 14:47
@prwhelan merged commit f5e8241 into elastic:main Nov 7, 2025
34 checks passed
phananh1010 added a commit to phananh1010/elasticsearch that referenced this pull request Nov 7, 2025
BASE=18dc946cc969eed4385e71085a6fa3f4198fa534
HEAD=b16caf7c0183196d8b034e090ca0561ade060a4e
Branch=main
Kubik42 pushed a commit to Kubik42/elasticsearch that referenced this pull request Nov 10, 2025