Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Migrate scripted metric agg scripts to new ScriptContext design and use context variables for agg state #29328

Closed
rationull opened this issue Apr 1, 2018 · 16 comments

Comments

@rationull
Copy link
Contributor

rationull commented Apr 1, 2018

Per discussion on #28819 it would be nice to have custom ScriptContexts (and the abstract Script classes to go with them) for the scripts involved in scripted metric aggregations. One goal is to provide the _agg/_aggs variable as a script context variable rather than in the params map. Another goal is to continue migrating scripts to use the new dedicated context design.

Per discussion in #30111 _agg and _aggs shall become state and states when moved into the context.

The four script types in play, with their desired context properties:

  • init: params map, agg map
  • map: params map, agg map, doc, score (anything else? Do we need to do anything for _fields and _source here? I have a feeling we don't but I'm not sure)
  • combine: params map, agg map
  • reduce: params map, aggs array

We'll also need to modify ScriptedMetricAggregator where it pulls out the _agg param to use the new context variable instead.

Another concern is backwards compatibility since ultimately we would remove _agg/_aggs from params. I would guess the strategy here is to leave both the old and the new in place for at least one major release. Would we emit a deprecation warning whenever scripted metric aggregations are used or somehow try to detect uses of the old _agg/_aggs params in scripts and only emit a warning then? Any cases where we touch the agg variable content in code would need to honor the legacy _agg/_aggs content if present as well. I would assume we can rule that either the new context variable or the legacy params (not a mix of both) would be used within a given set of aggregation scripts.

@rationull
Copy link
Contributor Author

@colings86 after discussing a bit with @rjernst I'm interested in taking a shot at this if that's alright with you. It'll probably take me a bit of time as my availability/hobby time is a little spotty (plus it'll be some time figuring out the codebase) but it seems like a pretty low priority issue so guessing that won't be a problem unless this is something you'd prefer to handle more internally.

Seems prudent to split the design change and the params._agg -> agg change into separate PRs and they could even be separate tasks too if you think that's more appropriate.

@rationull
Copy link
Contributor Author

@rjernst one thing I'm not sure of is whether the map script will be a bigger deal than the rest since it's a SearchScript, not just an ExecutableScript. Should the SearchScript class+context really be replaced in this case or do we need to e.g. derive from SearchScript to retain all its behavior in addition to the new context variables? Not sure if there are any deeper SearchScript related special cases in the system in general or in Painless that would need to be modified in that case.

@rjernst
Copy link
Member

rjernst commented Apr 1, 2018

No need for the map script to derive from search or executable script. The only reason it is a search script right now is to give access to doc values, but that can be done with the new script context, and without trappily exposing (as search script does) _score.

@rationull
Copy link
Contributor Author

_score is currently documented as being available for aggregation scripts but sounds like that's not something that needs to carry forward with this change then.

@rjernst
Copy link
Member

rjernst commented Apr 2, 2018

I did not realize that. I'll let @colings86 comment on why that is necessary. If it is, then a subclass of a SearchScript makes sense.

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@colings86
Copy link
Contributor

colings86 commented Apr 4, 2018

_score is made available to most (if not all) aggregation scripts. Personally I do not know the use cases for wanting to aggregate on the score but I know that users do this so at least for now I think we would need to maintain this functionality unless its a big problem for this change? I think we could also open a discussion about whether we should continue to expose the score to aggregation scripts.

For the scripted_metric aggregation only the map script needs to be a SearchScript as the other scripts are not involved in the query phase.

@rjernst
Copy link
Member

rjernst commented Apr 13, 2018

After further thought, I don't think subclassing SearchScript (or a future ScoringScript based on script contexts I plan to add) will work, as the execute method in that case will return a double, but this map script should be void. Additionally, the execute signature should take in the _agg variable (or whatever we should call it). So I think a new context is warranted, even for map.

@rationull
Copy link
Contributor Author

@colings86 @rjernst any feedback on this?

Other than changes to context parameter naming and writing a more detailed commit comment, I think any further changes as mentioned in the PR #30111 could be done in separate PRs to keep things simple. Plus any changes that come from your feedback, of course. If you’d rather I make progress on some of the outstanding items before merging this (to avoid leaving work in progress from an outside contributor) then let me know.

Another thing is if you’d rather keep the script context refactor separate from adding the new way to access the agg map, then I would be happy to pull the new context variables out into a separate PR.

@colings86
Copy link
Contributor

@rationull I left comments on the PR and set it for CI testing

colings86 pushed a commit that referenced this issue Jun 25, 2018
…30111)

* Migrate scripted metric aggregation scripts to ScriptContext design #29328

* Rename new script context container class and add clarifying comments to remaining references to params._agg(s)

* Misc cleanup: make mock metric agg script inner classes static

* Move _score to an accessor rather than an arg for scripted metric agg scripts

This causes the score to be evaluated only when it's used.

* Documentation changes for params._agg -> agg

* Migration doc addition for scripted metric aggs _agg object change

* Rename "agg" Scripted Metric Aggregation script context variable to "state"

* Rename a private base class from ...Agg to ...State that I missed in my last commit

* Clean up imports after merge
@rationull
Copy link
Contributor Author

With #30111 merged, the bulk of this is done. There will be two more (much smaller) PRs, one for a deprecation warning and system property control of access to the deprecated params._agg/_aggs and the second to remove this deprecated parameter. I will keep these separate since the removal will need to target a different release branch from the rest of the implementation.

@colings86 @rjernst I have these changes implemented already -- I will give them another look and create the next PR when I get a chance either today or tomorrow.

@rationull
Copy link
Contributor Author

rationull commented Jun 27, 2018

@colings86 I just created PR #31597 adding a deprecation warning and a path forward for disabling the legacy params._agg/_aggs. Pending feedback on that I will follow up with the final PR removing the deprecated functionality once you on the ES team have decided which branch this change should go on and when the old params should be removed. AFAIK that will be it for this issue.

rjernst pushed a commit that referenced this issue Jul 2, 2018
…30111)

* Migrate scripted metric aggregation scripts to ScriptContext design #29328

* Rename new script context container class and add clarifying comments to remaining references to params._agg(s)

* Misc cleanup: make mock metric agg script inner classes static

* Move _score to an accessor rather than an arg for scripted metric agg scripts

This causes the score to be evaluated only when it's used.

* Documentation changes for params._agg -> agg

* Migration doc addition for scripted metric aggs _agg object change

* Rename "agg" Scripted Metric Aggregation script context variable to "state"

* Rename a private base class from ...Agg to ...State that I missed in my last commit

* Clean up imports after merge
@rationull
Copy link
Contributor Author

rationull commented Aug 17, 2018

Sanity check @rjernst @nik9000 there are some leftover script files in docs/src/test/cluster/config/scripts/ my_[init,map,combine,reduce]_script.painless that look like they're defunct (replaced by yaml in build.gradle). They still have references to the old _agg params. Looks like they can be removed instead of modified. Agree?

@rjernst
Copy link
Member

rjernst commented Aug 17, 2018

Yes those can be removed. They are from when we had file scripts.

@rationull
Copy link
Contributor Author

With the deprecation/bwc removal commit on master I think there's no more work to be done under this task. @rjernst @colings86 do you agree?

@colings86
Copy link
Contributor

@rationull yep you are right, this can be closed now. Thanks again for your work on this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants