Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Refactor] XContent base classes from xcontent to core library #5902

Merged
merged 8 commits into from
Feb 21, 2023

Conversation

nknize
Copy link
Collaborator

@nknize nknize commented Jan 17, 2023

Refactors the xcontent base classes from the xcontent module to the core library so serializable xcontent contracts can be used across other libraries (e.g., new aggregation library, other community contributed libraries). Since opensearch-core is intended to be the foundation library for all of opensearch, this PR continues the effort to refactor those foundation classes from the :server module to appropriate library and base module plugins in order to reduce the surface area of the :server module.

Note: The surface area of this change is large due to xcontent being used as a foundation across the entire project.

relates #5910
depends on #6218

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@github-actions
Copy link
Contributor

Gradle Check (Jenkins) Run Completed with:

@codecov-commenter
Copy link

codecov-commenter commented Jan 17, 2023

Codecov Report

Merging #5902 (b605a19) into main (c95a6b9) will decrease coverage by 0.09%.
The diff coverage is 17.64%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

@@             Coverage Diff              @@
##               main    #5902      +/-   ##
============================================
- Coverage     70.70%   70.61%   -0.09%     
+ Complexity    58869    58765     -104     
============================================
  Files          4789     4789              
  Lines        281788   281783       -5     
  Branches      40669    40669              
============================================
- Hits         199229   198972     -257     
- Misses        66227    66436     +209     
- Partials      16332    16375      +43     
Impacted Files Coverage Δ
...ensearch/gradle/precommit/ThirdPartyAuditTask.java 0.00% <0.00%> (ø)
...ch/plugin/noop/action/bulk/RestNoopBulkAction.java 0.00% <ø> (ø)
...java/org/opensearch/client/GetAliasesResponse.java 88.75% <ø> (ø)
...main/java/org/opensearch/client/NodesResponse.java 0.00% <ø> (ø)
...ava/org/opensearch/client/NodesResponseHeader.java 0.00% <ø> (ø)
.../java/org/opensearch/client/RequestConverters.java 83.55% <ø> (-1.50%) ⬇️
...ava/org/opensearch/client/RestHighLevelClient.java 41.75% <ø> (ø)
...pensearch/client/cluster/RemoteConnectionInfo.java 0.00% <ø> (-73.18%) ⬇️
.../opensearch/client/cluster/RemoteInfoResponse.java 61.53% <ø> (-38.47%) ⬇️
...g/opensearch/client/core/AcknowledgedResponse.java 70.58% <ø> (ø)
... and 694 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@andrross
Copy link
Member

Am I correct in assuming that the Java package change (from org.opensearch.common to org.opensearch.core) makes this a breaking change that cannot be backported to the 2.x branch?

@reta
Copy link
Collaborator

reta commented Jan 18, 2023

@nknize I have difficulties understanding the reason for this change

Refactors the xcontent base classes from the xcontent module to the core library so serializable xcontent contracts can be used across other libraries

What exactly is stopping other libraries to use x-content module? We publish it as an artifact for everyone to use, I think consolidating all related x-content classes (pulling from core & co if needed) would be the way to shape this change.

Since opensearch-core is intended to be the foundation library for all of opensearch, this PR continues the effort to refactor those foundation classes from the :server module to appropriate library and base module plugins in order to reduce the surface area of the :server module.

We have more or less concise set of modules, why do we want to beef everything into opensearch-core vs keeping a set of modules? Also, there is more profound issue that this change introduces (or better to say, amplifies) the packages split brain: the o.o.c.xcontext package is now split between different modules, a showstopper for any future modularity efforts.

@nknize
Copy link
Collaborator Author

nknize commented Jan 18, 2023

....a showstopper for any future modularity efforts.

On the surface it seems that way so let me address each of these questions separately.

What exactly is stopping other libraries to use x-content module?

The build is configured such that library modules may not depend on other library modules except opensearch-core. This heavily restricts us from using libraries to trim down the bloated :server module (I'll give an example below). It was designed this way so that core retains generics and abstract contracts used across all of opensearch with decoupled concrete implementations in specific libraries, modules, and plugins.

We have more or less concise set of modules, why do we want to beef everything into opensearch-core vs keeping a set of modules?

This change isn't beefing everything into core. It's teasing out the reusable generics and interface contracts to the core library which is used to implement the concrete classes and architecture details across modules and concrete libraries.

Also, there is more profound issue that this change introduces (or better to say, amplifies) the packages split brain: the o.o.c.xcontext package is now split between different modules

It seems that way, but if you look at the change it's sticking w/ the intended design and refactoring the XContent base classes and interfaces to the core library while leaving the concrete implementations of those MediaTypes (e.g., JSON, CBOR, SMILE, YAML) in the x-content library. As it turns out, we might need to refactor those concrete implementations to an x-content module in a follow up PR but I'm deferring that unless it's absolutely necessary.

The driver for this is to further modularize the :server module by teasing out the generics to libraries and implementations to modules. Let me give an example where the cross library dependency limitation currently restricts a big part of this (hence the xcontent change).

Aggregations

Third party libraries, plugins, modules, etc, want to build new aggregations all the time. We provide this ability and it's super powerful. To do so is quite boiler plate: 1. If the CoreValuesSourceType library doesn't already provide the requisite ValueSource, write the needed ValuesSourceType the new aggregation depends on, 2. write the Aggregator, 3. write the Aggregator Factory, 4. write the AggregationBuilder, 5. register the aggregation through your plugin using the AggregationSpec

As you can see in this process there is a large number of abstract classes and interfaces required to support implementing a new concrete aggregation anywhere in the "ecosystem". Organizationally, these base classes are spread across the aggregations and aggregations.support namespace, along with the mechanisms needed to implement the aggregation phase(s) in the SearchModule and SearchService class implementations. This means for anyone wanting to build a custom aggregation they would take a build.gradle dependency on "org.opensearch:opensearch:${opensearchVersion}" which pulls in a bloated server jar containing much more than the classes and dependencies they need to build their aggregation.

I think most on this project agree this is a modularization anti-pattern, so to remedy we're striving to make the codebase much more modular. So how do we achieve that here with this clumsy aggregation foundation and user facing "API" (which isn't exactly well documented, or implemented, or supported)? At first pass you might say, "hey, how about we stick with the intent of the libraries and modules design and refactor the generics and base classes from the bloated :server module to a new :opensearch-aggregations library with the concrete implementation in a new :aggs-common module that continue to use the search.aggregations.* namespace? I'd agree with this, in fact I'd go a step further and say this is where I think we should go with, not just aggregations, most of the big opensearch mechanisms (field mappers, field types, doc value types, field data, etc). In the end the :server module should be nothing more than the "motherboard" providing the concrete implementation of all the opensearch base components within the libraries and concrete extendable mechanisms provided by the modules and optional plugins.

But if you embark down that journey you will quickly find the cross library dependency restriction blocks much of this objective and the tight dependency requirement of the :server module with other modules quickly leads to JarHell issues. Take the aggregation decoupling use case, for example. Aggregations, like field mappers, are serializable and therefore depend on XContent in order to implement the ToXContentFragment logic. So how about refactoring all the XContent base classes to the :opensearch-x-content library then? That works until you refactor any base interfaces that depend on XContent to a library other than opensearch-core, because libraries cannot depend on each other (e.g., any new library cannot take a dependency on both core and x-content).

So that leaves us with a few options:

  1. remove the cross library dependency restriction
  2. refactor all base classes that would result in a cross library dependency situation into modules instead
  3. Do nothing, leave all base classes and interfaces that would result in a cross library dependency in the server module and leave users to depend on the massive jar
  4. refactor opensearch core base classes and interface contracts that would result in a cross library dependency to the opensearch-core library and the implementation details to other libraries, modules, and plugins

With this PR I chose Option 4 because it not only fits the modularization objective but after the aggregation framework is refactored to a new module (and supporting library) users in the aggregation scenario above would only need to take a dependency on the opensearch-core and a (soon coming) opensearch-aggregations library to implement a new concrete aggregation in their own plugin.

I hope this helps paint a clearer picture?

@nknize
Copy link
Collaborator Author

nknize commented Jan 18, 2023

Am I correct in assuming that the Java package change (from org.opensearch.common to org.opensearch.core) makes this a breaking change that cannot be backported to the 2.x branch?

This change intends to target 3.0 only (I just added the labels, sorry about that). However, that wouldn't preclude a backport if we decided to pull it into 2.x since mixed cluster rolling upgrades are only mixed nodes not mixed jars. The side effect, however, is a namespace ripple across all plugins that depend on xcontent base classes. I'd wager that's at least 90% of plugins. The refactor would have to be done before releasing the 2.x version that would include this change.

nknize added a commit to nknize/OpenSearch that referenced this pull request Feb 23, 2023
…earch-project#5902)

Refactors the xcontent base classes from the xcontent module to the core
library so serializable xcontent contracts can be used across libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
MaxKsyunz pushed a commit to opensearch-project/sql that referenced this pull request Mar 3, 2023
Xcontent classes were moved in this PR opensearch-project/OpenSearch#5902 causing builds to fail.

---------

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
GumpacG pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Mar 6, 2023
Xcontent classes were moved in this PR opensearch-project/OpenSearch#5902 causing builds to fail.

---------

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
matthewryanwells pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Mar 7, 2023
Xcontent classes were moved in this PR opensearch-project/OpenSearch#5902 causing builds to fail.

---------

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
acarbonetto pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Mar 7, 2023
Xcontent classes were moved in this PR opensearch-project/OpenSearch#5902 causing builds to fail.

---------

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
matthewryanwells pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Mar 10, 2023
Xcontent classes were moved in this PR opensearch-project/OpenSearch#5902 causing builds to fail.

---------

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
Signed-off-by: Matthew Wells <matthew.wells@improving.com>
nknize added a commit to nknize/OpenSearch that referenced this pull request Mar 14, 2023
…earch-project#5902)

Refactors the xcontent base classes from the xcontent module to the core
library so serializable xcontent contracts can be used across libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
nknize added a commit to nknize/OpenSearch that referenced this pull request Mar 14, 2023
…earch-project#5902)

Refactors the xcontent base classes from the xcontent module to the core
library so serializable xcontent contracts can be used across libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
nknize added a commit to nknize/OpenSearch that referenced this pull request Mar 16, 2023
…earch-project#5902)

Refactors the xcontent base classes from the xcontent module to the core
library so serializable xcontent contracts can be used across libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
nknize added a commit to nknize/OpenSearch that referenced this pull request Mar 20, 2023
…earch-project#5902)

Refactors the xcontent base classes from the xcontent module to the core
library so serializable xcontent contracts can be used across libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
nknize added a commit that referenced this pull request Mar 28, 2023
#6470)

Refactors the xcontent base classes from the xcontent module to the core
library so serializable xcontent contracts can be used across libraries.

Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
opensearch-trigger-bot bot pushed a commit to opensearch-project/sql that referenced this pull request Mar 29, 2023
Xcontent classes were moved in this PR opensearch-project/OpenSearch#5902 causing builds to fail.

---------

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
(cherry picked from commit cd95bc5)
MaxKsyunz pushed a commit to Bit-Quill/opensearch-project-sql that referenced this pull request Mar 29, 2023
Xcontent classes were moved in this PR opensearch-project/OpenSearch#5902 causing builds to fail.

---------

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
(cherry picked from commit cd95bc5)
Yury-Fridlyand pushed a commit to opensearch-project/sql that referenced this pull request Mar 29, 2023
* Address build failures (#1366)

Xcontent classes were moved in this PR opensearch-project/OpenSearch#5902 causing builds to fail.

Signed-off-by: MaxKsyunz <maxk@bitquilltech.com>
@AndreVirtimo
Copy link

This is a breaking change in version 2.10!

Where is the communication of breaking changes? This should be part of the changelog.

@reta
Copy link
Collaborator

reta commented Oct 18, 2023

This is a breaking change in version 2.10!

@AndreVirtimo you are absolutely right, the change was slipped away but it should have been in the change log and release notes for 2.10, sincere apologies for that.

@nknize
Copy link
Collaborator Author

nknize commented Oct 18, 2023

This is a breaking change in version 2.10!

@AndreVirtimo Can you share your use case? Where / how did you run into this? Did you get a NoClassDefFound? Do you have an implementation that relies on it? Are you importing dependencies yourself?

The reason I ask is because the core technically does not (yet) provide a formal or comprehensive plugin / library API (we're working on this; feel free to track history through #5910) and dependency libraries are (should) already be handled by the min-distribution, bundle, or docker releases. I want to make sure you didn't stumble on a release artifact bug.

@olfuerniss
Copy link

@nknize A breaking change info for developers would be important (+ necessary migration steps).

E.g. for a OpenSearch plugin, you must adjust the plugin version for the target OpenSearch version. This means also you must re-compile the plugin. This will break due to this and the other related refactoring. In our plugin, nearly every class (> 200) had to be adjusted. Not only moved classes but also API changes happened in 2.10. (Strings.toString(XContentBuilder) removed etc.)

But also in other code that uses the REST High Level Client we use the XContentBuilder quite often and like it. The RestStatus class and others also moved.

It was an extremely suprising and big change for a minor release (developer point of view). ;-)

@nknize
Copy link
Collaborator Author

nknize commented Oct 19, 2023

@olfuerniss thank you for sharing your experiences. It's always helpful to get feedback from others in how they're using the artifacts. It sounds like you're developing your own plugin? Similar to your comments, the OpenSearch plugins in general deal with these kinds of issues every time-based minor release cycle for reasons outlined in #5910. As I mentioned, the refactoring effort is a move to fix this in the future to support an official Developer API and (since it's not there yet and requires all core maintainers pulling in the same direction) will require some rocky moments like this since the project attempts to minimize major releases.

Just to add a few things to open the aperture here:

A breaking change info for developers would be important (+ necessary migration steps).

There is a >breaking label that is supposed to be used for this purpose. Lot's of low hanging areas for improvement here. For example, labels (beyond just >breaking) could/should be better used and applied by maintainers (this is an area I've been trying hard to pull folks along), the changelog process is manual, non-standard, and clumsy (this was discussed on a separate issue, moved to a new project wide proposal, and months overdue on being revised. I think your experiences and wishes could help nudge that along to fix these communication struggles). e.g., it would be nice to autogen a changelog (including breaking changes) direct from properly labeled merged PRs.

It was an extremely suprising and big change for a minor release (developer point of view)

+1. Until the core API is defined (through JPMS, library exports, and other java mechanisms like sealed classes), plugin developers within and outside the opensearch project org are going to experience changes like this relatively often. It's a side effect of forking a code base that did not establish a formal developer API (for these exact painful reasons your describing). This is why the refactoring is happening to begin with but requires a delicate balance of how much deprecated code surface area and internal logic paths should be retained solely for backwards compatibility purposes vs moving to a well supported secure Generally Available API. On this project we talk about SemVer in three areas. Progress toward Developer API enforcement was only added as recent as two months ago (to assist with the API blast radius from PRs like this). But expectations need to be set. Annotations aren't yet fully applied, and where they are applied many will change over the coming months because of all the maintainers on core only one worked at elastic when the core plugin framework and xcontent classes were first designed and written so the annotation choice is only best effort.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport 2.x Backport to 2.x branch Build Libraries & Interfaces enhancement Enhancement or improvement to existing feature or request extensions v3.0.0 Issues and PRs related to version 3.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants