Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/adding solrcloud collection crd #17

Closed
wants to merge 5 commits into from
Closed

Feature/adding solrcloud collection crd #17

wants to merge 5 commits into from

Conversation

sepulworld
Copy link
Contributor

@sepulworld sepulworld commented Aug 26, 2019

*Issue number of the reported bug or feature request: #7 *

Describe your changes
DONE:

  • Added a CRD kind for SolrCloudCollections creation
  • Finalizer for deleting a collection
  • Support basic configs for both implicit and compositeId router names
  • Support autoAddReplica
  • Support basic modifications

Testing performed
DONE:
Local k8s on Docker for Mac, tested collection creation
Tested deletion of collection
Tested creation and modification of collections

@sepulworld
Copy link
Contributor Author

@HoustonPutman I think this is good for a review. Will make any suggested changes or additions per your feedback. I can follow up in a new PR to add ability to modify existing collections. Or do you want to have that in this PR?

@sepulworld sepulworld changed the title [WIP] Feature/adding solrcloud collection crd Feature/adding solrcloud collection crd Aug 28, 2019
@sepulworld
Copy link
Contributor Author

@HoustonPutman I think this is good for a review. Will make any suggested changes or additions per your feedback. I can follow up in a new PR to add ability to modify existing collections. Or do you want to have that in this PR?

Ok, I went with an implementation in this PR. 1032783

Tested this locally and it handles spec changes and calls the modification function as intended

@HoustonPutman
Copy link
Contributor

@sepulworld Thanks so much for the work and effort!

It's kinda busy season around here, but I'll try to find some time next week to test this out and do a review. Maybe we can hook it up with the backup controller at some point too.

@sepulworld
Copy link
Contributor Author

Ok, I think I am done with this PR. Last commit adds support for basic setup for both 'implicit' and 'compositeId' router types. Also, added support for autoAddReplicas parameter.

Let me know what you think and if you need further changes or testing I am down to do more. I did a lot of local testing and it all seems to work as expected. 👍

Copy link
Contributor

@HoustonPutman HoustonPutman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general I really like the work, and I have a few comments to get started.

Also I wonder if we should have a broader conversation about what the status of the solr collection CRD could contain, and if we could make it more empowering to people using the CRD. But that could also be further work after this is merged in.

pkg/apis/solr/v1beta1/solrcloudcollection_types.go Outdated Show resolved Hide resolved
pkg/apis/solr/v1beta1/solrcloudcollection_types.go Outdated Show resolved Hide resolved
collection.Status.CreatedTime = &now
}

if solrCloud != nil && collectionCreationStatus == false {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do you update the status subResource of the collection CRD?

Like done here for the backup: https://github.com/bloomberg/solr-operator/blob/master/pkg/controller/solrbackup/solrbackup_controller.go#L175

Also I'm surprised that it works for you given that you don't requeue the reconcile after you issue the collections api call. Otherwise I think you have to wait for the api server to get an update for the reconcile to be called again.

We do it here for the backups: https://github.com/bloomberg/solr-operator/blob/master/pkg/controller/solrbackup/solrbackup_controller.go#L141

I can go further into detail if you want. Also point out if I am missing it somewhere here, that's definitely possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think because the response from the solrcloud API is so fast that it didn't see it reattempting and breaking. I will look into adding this requeue logic for sure.

@sepulworld
Copy link
Contributor Author

sepulworld commented Sep 13, 2019

@HoustonPutman on a side note. Are you interested in starting a Slack Team or channel in K8s Slack Team for solr-operator? Would be nice to have a place to chat. 💯

@HoustonPutman
Copy link
Contributor

HoustonPutman commented Sep 13, 2019

So before this gets merged, could you squash down the commits into 1 or 2, and sign them as well? Because the rules for this repo make that required, as you can see from the DCO check that failed.

Also run make manifests before committing, as that check is failing in the Travis CI build above, which checks to make sure that the configs are correct. Running make manifests will make sure the configs are up to date with the code.

You can run make test and make manifest-check (this one you will have to run after a git commit because it uses git diff instead of just diff) to try to make sure that the Travis CI builds will pass before you push to github, which will hopefully make the debugging process faster :)

@HoustonPutman
Copy link
Contributor

Also good idea on the slack channel. I'm not sure how we'd get that done. But I'll reach out to someone on Monday and get their opinion.

@HoustonPutman
Copy link
Contributor

HoustonPutman commented Sep 17, 2019

So I merged my addition of the metrics collector. It looks like it only caused two conflicts. You should be able to just checkout the test_solrcloud.yaml, as I made additional fixes to that. And hopefully the README conflicts are straightforward.

Also I'm going to cut a release now, but I will also cut a release after this gets merged as it's a pretty big addition.

@sepulworld
Copy link
Contributor Author

Thanks for the feedback! No problem on the merge conflict, will rebase with master.

Presently, I am working on getting the reconcile working well with modifications and trying out the ReQueue functionality. Most of the calls we are making for collection creation, deletion and modifications aren't really async. I believe they should API requests is finished when response is provided. So, perhaps the ReQueue isn't needed? I'm still testing out some of the behavior though.

@sepulworld
Copy link
Contributor Author

Presently, I am trying to figure out best way to deal with spec modifications (Items like autoAddReplicas, numShards, replicationFactor are modifiable). I first tried to compare previous Status to current Status and then previous Spec to current Spec using DeepCopy() but that doesn't seem to be doable. So, I believe the best approach will be to compare the Spec values for those mutable parameters to what the Solr API shows for each collection and if there is a difference with one of the values, run the modify function. Any thoughts on how to handle Spec updates for mutable collection parameters like the ones noted above @HoustonPutman?

@HoustonPutman
Copy link
Contributor

honestly it sounds like having a ZkStateReader implementation in Go would be extremely useful and I think pretty necessary to collection management via the operator. With that we could watch on collection states, and react accordingly when things change, much like we do currently with the Collection CRDs via the controller APIs. Because I don't want to have to call out to Solr everytime we get a Reconcile request. That is potentially a ton of calls to solr. It could be a good first step before we have the necessary Solr Go libraries, which will not be trivial to implement.

Without that I'm not entirely sure the best course of action. I'll definitely think on it though, as it will be a recurring problem we have with the more solr-integrating features we add.

As for the async requests, I use them because Solr has timeouts that we may not be able to work around, as the client may want to set them to something low. Creating a collection with many replicas may take longer than that timeout and we wouldn't want to throw a failure when the collection actually exists. Every call to the Collections API can be called asynchronously, so the workflow to making the async calls for Collection Delete/Create/Modify should be that much different than doing it for the Collection Backup that is already implemented.

The async stuff will definitely be necessary in the future, but I don't think it is a blocker for this PR. If it isn't straightforward to do it now, I'm fine merging the PR without async calls and adding a GH issue to convert the calls to async when we get time in the future. So I wouldn't spend a whole lot of time working on that part right now. It'll be much easier to add that later when the base functionality is already merged and tested.

@sepulworld
Copy link
Contributor Author

honestly it sounds like having a ZkStateReader implementation in Go would be extremely useful and I think pretty necessary to collection management via the operator.

This makes good sense to me. Do you think this would be useful as a separate Go package and source? That we could import in solr-operator and other projects?

@HoustonPutman
Copy link
Contributor

I think this would be great as a separate Go package/library. I don't have time to get it started anytime soon, but I definitely think it would be a great addition and let us make the operator much more powerful.

@sepulworld
Copy link
Contributor Author

@HoustonPutman curious if you have any suggestions with the following:

https://play.golang.org/p/vgKNJqWGYc1

Basically, trying to unMarshall the collection status response which includes a 'key' that is the name of the collection. This key will change depending on the name of the collection you are requesting the status for. Curious if there is a clean way to deal with this type of response from SolrAPI. I put some notes in the Playground link too.

@sepulworld
Copy link
Contributor Author

@HoustonPutman curious if you have any suggestions with the following:

https://play.golang.org/p/vgKNJqWGYc1

Basically, trying to unMarshall the collection status response which includes a 'key' that is the name of the collection. This key will change depending on the name of the collection you are requesting the status for. Curious if there is a clean way to deal with this type of response from SolrAPI. I put some notes in the Playground link too.

Ok, I think I figured out best way... using a pointer. https://play.golang.org/p/fGLNP9kSYZQ will test it out

@sepulworld
Copy link
Contributor Author

Super annoying that the Solr api response has a key as a value. The key should be “collection” that has a hash of values that contains the collection name. Oy!

@sepulworld
Copy link
Contributor Author

Closing this in favor of #23 which I believe I performed the appropriate rebase and sign-off measure.

@sepulworld sepulworld closed this Sep 25, 2019
@swarupdonepudi
Copy link
Contributor

Created a issue to track adding zkstatereader to the mix - #40

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants