Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

initial implementation for tiered storage support #9727

Merged
merged 9 commits into from Feb 29, 2024
Merged

Conversation

Staniel
Copy link
Contributor

@Staniel Staniel commented Feb 22, 2024

Type of change

Select the type of your PR

  • new feature

Description

Please describe your pull request
Support tiered storage: https://github.com/strimzi/proposals/blob/main/065-support-tiered-storage.md

Checklist

Please go through this checklist and make sure all applicable tasks have been done

  • Write tests
  • Make sure all tests pass
  • Update documentation
  • Check RBAC rights for Kubernetes / OpenShift roles
  • Try your changes from Pod inside your Kubernetes and OpenShift cluster, not just locally
  • Reference relevant issue(s) and close them after merging
  • Update CHANGELOG.md
  • Supply screenshots for visual changes, such as Grafana dashboards

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Staniel I think this looks pretty good ... nice work. I left few minor comments. Also, if you could run make crd_install, it will copy the CRD files to the places where they need to be updated and hopefully help the CI build to pass.

Lixin Yao added 3 commits February 22, 2024 10:21
Signed-off-by: Lixin Yao <lixin_yao@apple.com>
Signed-off-by: Lixin Yao <lixin_yao@apple.com>
Signed-off-by: Lixin Yao <lixin_yao@apple.com>
@scholzj
Copy link
Member

scholzj commented Feb 26, 2024

@im-konge @see-quick Can you please think a bit about the system tests for this? We do not have any plugin to test it with (Kafka has the [LocalTieredStorage)[https://github.com/apache/kafka/blob/trunk/storage/src/test/java/org/apache/kafka/server/log/remote/storage/LocalTieredStorage.java], but even that is not part of the Kafka distribution. That makes this a bit hard to test as we would need to first add it to the container image and build a new container and then use it to test something. Any ideas how to do it? Is it worth doing it? Or do you think unit testing that the configuration works is sufficient?

Copy link
Member

@ppatierno ppatierno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some nits but looks overall good to me.

Signed-off-by: Lixin Yao <lixin_yao@apple.com>
@im-konge
Copy link
Member

@scholzj IIRC we discussed this a while ago and there were two ideas:

  • having something like that on the long running environment and there do some tests
  • building the image (as we have f.e. for the Connect with File sink plugin) and then use it in our STs.

But I think having this tested on the ST level makes sense and it can possibly find some issues, which unit tests will not catch.
The only issue (which currently comes to my mind) is some upgrade testing, where we would need more than just one image with the plugin built before running those (I'm not sure if it's the requirement or not, I'm just thinking about it).

I can take a look at it :)

@scholzj
Copy link
Member

scholzj commented Feb 26, 2024

@scholzj IIRC we discussed this a while ago and there were two ideas:

  • having something like that on the long running environment and there do some tests
  • building the image (as we have f.e. for the Connect with File sink plugin) and then use it in our STs.

But I think having this tested on the ST level makes sense and it can possibly find some issues, which unit tests will not catch. The only issue (which currently comes to my mind) is some upgrade testing, where we would need more than just one image with the plugin built before running those (I'm not sure if it's the requirement or not, I'm just thinking about it).

I can take a look at it :)

I think using it in one of your long-running clusters would make sense. But I'm not completely sure how much would the system test really add. Would we just check that the cluster starts with given config? In any case, can then please help @Staniel with how to do it and point him in the right direction as this is not really a question of simply copying some existing test with the custom image.

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some more nits around formatting of the code. But it looks good to me. I tried it as well and it seems to work for me with the LocalTieredStorage plugin as well as with the Aiven S3 plugin. So I think we just need to figure out the system tests to be able to merge it.

Signed-off-by: Lixin Yao <lixin_yao@apple.com>
@Staniel
Copy link
Contributor Author

Staniel commented Feb 27, 2024

I left some more nits around formatting of the code. But it looks good to me. I tried it as well and it seems to work for me with the LocalTieredStorage plugin as well as with the Aiven S3 plugin. So I think we just need to figure out the system tests to be able to merge it.

Thank you! I have also verified the change using a local build Strimzi image on my K8s cluster and the config is working as expected.

@scholzj scholzj added this to the 0.40.0 milestone Feb 27, 2024
@im-konge
Copy link
Member

In any case, can then please help @Staniel with how to do it and point him in the right direction as this is not really a question of simply copying some existing test with the custom image.

Yeah I will help @Staniel for sure. I'm just thinking that in case that it will be on the long running cluster, the STs will be skipped from this repository.
Regarding STs in this repository - I guess that we can always try to deploy the cluster with tiered storage, do some changes there, do a message transmission and check that everything is okay.

I will take a look at it tomorrow and provide info regarding STs. Maybe - would it be possible to add STs in separate PR, or it would be a blocker?

@scholzj
Copy link
Member

scholzj commented Feb 27, 2024

@im-konge I think we can do that in a separate PR, yes.

@scholzj
Copy link
Member

scholzj commented Feb 28, 2024

/azp run regression

Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@scholzj scholzj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks goo to me. Thanks for the PR.

Copy link
Member

@ppatierno ppatierno left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some nits/suggestions but LGTM overall. Thanks for the PR!

Copy link
Member

@see-quick see-quick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, Thanks 👍

@im-konge
Copy link
Member

@Staniel @scholzj to not block this PR, let's have the STs added in the different PR. Thanks!

Copy link
Contributor

@fvaleri fvaleri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @Staniel.

Look mostly good for me, I just left a couple of comments.

Additionally, I tried with LocalTieredStorage and it works fine.

Copy link
Contributor

@PaulRMellor PaulRMellor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! I've left a few suggestions related to the doc.

Lixin Yao added 2 commits February 28, 2024 10:44
Signed-off-by: Lixin Yao <lixin_yao@apple.com>
Signed-off-by: Lixin Yao <lixin_yao@apple.com>
@scholzj
Copy link
Member

scholzj commented Feb 28, 2024

@Staniel You will need to updated the derived files:

ERROR: Uncommitted changes in derived resources:
M	documentation/modules/appendix_crds.adoc
M	packaging/helm-charts/helm3/strimzi-kafka-operator/crds/040-Crd-kafka.yaml
M	packaging/install/cluster-operator/040-Crd-kafka.yaml
Run the following to add up-to-date resources:
  mvn clean verify -DskipTests -DskipITs \
    && make crd_install \
    && make dashboard_install \
    && make helm_install \
    && git add packaging/install/ packaging/helm-charts/ documentation/modules/appendix_crds.adoc cluster-operator/src/main/resources/cluster-roles \
    && git commit -s -m 'Update derived resources'

Signed-off-by: Lixin Yao <lixin_yao@apple.com>
@scholzj
Copy link
Member

scholzj commented Feb 28, 2024

@PaulRMellor @fvaleri Does it look good now?

Copy link
Contributor

@fvaleri fvaleri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@PaulRMellor
Copy link
Contributor

Thanks for the updates

Signed-off-by: Jakub Scholz <www@scholzj.com>
@scholzj scholzj merged commit 4ca2bc6 into strimzi:main Feb 29, 2024
13 checks passed
@scholzj
Copy link
Member

scholzj commented Feb 29, 2024

Thanks for the PR @Staniel

@scholzj scholzj mentioned this pull request Mar 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants