Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOLR-15733: WIP #1094

Closed
wants to merge 15 commits into from
Closed

SOLR-15733: WIP #1094

wants to merge 15 commits into from

Conversation

joel-bernstein
Copy link
Contributor

@joel-bernstein joel-bernstein commented Oct 19, 2022

https://issues.apache.org/jira/browse/SOLR-XXXXX

Description

Please provide a short description of the changes you're making with this pull request.

Solution

Please provide a short description of the approach taken to implement your solution.

Tests

Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.

Checklist

Please review the following and check all that apply:

  • I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
  • I have created a Jira issue and added the issue ID to my pull request title.
  • I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
  • I have developed this patch against the main branch.
  • I have run ./gradlew check.
  • I have added tests for my changes.
  • I have added documentation for the Reference Guide

@risdenk
Copy link
Contributor

risdenk commented Oct 19, 2022

I would have waited for after #953 potentially but if they are straight moves - might not be an issue. git may just handle it :)

@risdenk
Copy link
Contributor

risdenk commented Oct 19, 2022

Shouldn't these all be similar to https://github.com/apache/solr/tree/main/solr/solrj-zookeeper? looks like these ended up in modules?

@HoustonPutman
Copy link
Contributor

Agreed, modules is not for solrj code, its for server modules. We don't have a solrj folder that hosts sub-folders yet (but we will starting in Solr 10), so it should just be called solrj-streaming. (though I really don't know why streaming expressions live in client code anyways...)

@joel-bernstein
Copy link
Contributor Author

joel-bernstein commented Oct 20, 2022

Before I close this PR and start to work on a solrj-streaming PR lets be sure we don't want this to be a server module.

The original streaming classes were designed to be a Java API using the decorator pattern to layer on new behaviors. This fit pretty well with the Solrj library which is Java based. So it was placed there.

When streaming expressions came around it became easy to support complex expressions that simply are not practical with the Java API. The /stream handler became the focal point which moved things towards a server side model. At this point I consider Streaming Expressions a server side tool.

My suggestion is that we make it a server side module and we can add Solrj support for sending a Streaming Expression to the stream handler.

The only reason I see to keep it a client side tool is to support existing clients that are using the Java API. Which may be reason enough.

One compromise would be to leave a small subset of TupleStream implementations in Solrj. Then move all the rest of the classes to a server side module.

@janhoy
Copy link
Contributor

janhoy commented Oct 20, 2022

I have used streaming in just a limited amount client-side. And the number of classes was quite overwhelming. And a bit confusing to have all the backend classes also on the client. To be honest, I ended up using the low-level builders and constructing the expressions as JSON strings in my Java client. But it would be nice to get IDE help building more complex nested streams.

So if it is possible to figure out what Bean classes and builders are needed on the client side, it would be a better end-user experience to get only those classes in solrj-streaming. In that case, the server-side classes could be a module for sure. Do you imagine the module would be enabled by default in 9.x and disabled by default in 10.x?

@joel-bernstein
Copy link
Contributor Author

joel-bernstein commented Oct 20, 2022

Ok, I like the idea of keeping a very small workable subset of classes in Solrj and moving the rest along with the /stream handler etc... to a server side module. The best supported TupleStreams Java API classes are the ones that are used by the /sql handler. So I think we can just leave these few class in Solrj. That would move 98% of the classes to a server module.

Default enabled makes sense for Solr 9 and disabled in Solr 10.

@joel-bernstein
Copy link
Contributor Author

Another way to split the classes between server and client side module is any class that has a Java API test case can stay in Solrj. 98% of the classes only have streaming expression based test cases and they will be moved to the server side module.

@joel-bernstein
Copy link
Contributor Author

If people are ok with this approach I will keep this PR alive and continue working on the server side module. Once thats done, I'll migrate a small set of classes to solrj-streaming.

@joel-bernstein
Copy link
Contributor Author

joel-bernstein commented Oct 20, 2022

As part of the solrj-streaming module I will add documentation for using the Java API. There is a massive amount of documentation for streaming expressions on the server side but nothing for the Java API now.

@risdenk
Copy link
Contributor

risdenk commented Oct 20, 2022

@joel-bernstein for 9.x - I'm pretty sure everything needs to stay client side - so moving stuff should stay in solrj just in a separate artifact solrj-streaming.

For 10.x maybe we can see what it looks like to have a server side only module. The comments I made about solrj stuff being like maven dependencies - only applied to solrj client side. It does NOT apply to server side modules - at least not the same way.

@joel-bernstein
Copy link
Contributor Author

joel-bernstein commented Oct 20, 2022

@risdenk, I'm curious why it all needs to stay client?

@risdenk
Copy link
Contributor

risdenk commented Oct 20, 2022

user is using the streaming solrj client side java classes. - Solr 9.1 user upgrades to Solr 9.x - if all those classes are moved only to a server side module that would break in the point release.

@joel-bernstein
Copy link
Contributor Author

It is possible that people are using a more broad set of classes on the client side. Which would indeed break if they are removed. So I see your point about waiting for 10 for splitting.

I'm ok with a solrj-streaming module.

@janhoy
Copy link
Contributor

janhoy commented Oct 20, 2022

Yes, similar considerations were made for solrj-zookeeper.

So if you start with the new solrj-zookeeper maven-artifact, moving all IO classes as-is, that should be back-compat with 9.0 (client-side).

Next, @Deprecate 98% of the classes in the new solrj-zookeeper folder with forRemoval="10.0", so users can continue using them on the client in 9.x. Then I suppose you could COPY those classes to a the server-side module, give them other package names etc, and release the new server-side module in 9.x. The server-side streaming module will only use the backend classes from the module and not the deprecated ones from SolrJ (even if they may be on the classpath). Is this possible?

@joel-bernstein
Copy link
Contributor Author

joel-bernstein commented Oct 20, 2022

For this ticket I'll focus on the move to solrj-streaming as that cleans things up in Solrj. I'll likely create a new PR for this rather than working with this PR's branch.

I'll open another ticket to discuss the client side deprecations and the creation of the server side module.

@HoustonPutman
Copy link
Contributor

This sounds good to me, although I'm not sure why the streaming expressions stuff would really need to go into a module instead of solr-core. Does it require any dependencies that Solr doesn't already have?

@joel-bernstein
Copy link
Contributor Author

joel-bernstein commented Oct 20, 2022

@HoustonPutman, I'll focus on the move to solrj-streaming since thats what is needed for back compat for 9x.

For Solr 10 we can decide between a module or solr-core. I'm fine with either approach.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants