Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Issue 8887][tiered-storage-jcloud] support tiered-storage provider by aliyun OSS #8985

Merged
merged 5 commits into from
Dec 29, 2020

Conversation

yufan022
Copy link
Contributor

@yufan022 yufan022 commented Dec 17, 2020

Fixes #8887

Motivation

support tiered storage by aliyun oss.

Modifications

Aliyun OSS is compatible with the S3 API.
For security reasons, OSS supports only virtual hosted style access.
So I use S3ApiMetadata to connect OSS, and set jclouds.s3.virtual-host-buckets=true.

documents:
China: https://help.aliyun.com/document_detail/64919.html
Internationale: https://www.alibabacloud.com/help/doc-detail/64919.htm

Verifying this change

test with local

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? more tiered-storage provider
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

@yufan022
Copy link
Contributor Author

/pulsarbot run-failure-checks

@sijie
Copy link
Member

sijie commented Dec 18, 2020

@yufan022 great contributions!

@gaoran10 @Renkai can you help review this change?

@sijie sijie added this to the 2.8.0 milestone Dec 18, 2020
@sijie sijie added type/feature The PR added a new feature or issue requested a new feature doc-required Your PR changes impact docs and you will update later. labels Dec 18, 2020
@yufan022
Copy link
Contributor Author

/pulsarbot run-failure-checks

Copy link
Contributor

@gaoran10 gaoran10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!


@Override
public BlobStore getBlobStore(TieredStorageConfiguration config) {
return OSS_BLOB_STORE_BUILDER.getBlobStore(config);
Copy link
Contributor

@gaoran10 gaoran10 Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

S3_STANDARD_BLOB_STORE_BUILDER, S3_COMMENT_BLOB_STORE_BUILDER or other names? It seems that oss is a universal name.

Copy link
Contributor Author

@yufan022 yufan022 Dec 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AWS_S3 enum is already compatible with S3 API.
if use OSS_BLOB_STORE_BUILDER,OSS_VALIDATION , it will only belongs to Aliyun OSS.
But using S3_API_BLOB_STORE_BUILDER,S3_API_VALIDATION make it universal, both make sense.
What's your suggestion?

ALIYUN_OSS("aliyun-oss", new AnonymousProviderMetadata(new S3ApiMetadata(), "")) {
@Override
public void validate(TieredStorageConfiguration config) throws IllegalArgumentException {
VALIDATION.validate(config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the serviceEndpoint is a necessary configuration? If necessary it's better to add a configuration validationcheck.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, serviceEndpointis a necessary configuration, I will add a new validator.


@Override
public void buildCredentials(TieredStorageConfiguration config) {
AWS_CREDENTIAL_BUILDER.buildCredentials(config);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the Aliyun security validate is the same as AWS? Do we need to add environment configurations for the Aliyun service? Currently, the AWS security configurations could refer to doc.

Copy link
Contributor Author

@yufan022 yufan022 Dec 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Environment variables are reuse for AWS_ACCESS_KEY_ID , AWS_SECRET_ACCESS_KEY .
You mean we should add new environment variables like ALIYUN_ACCESS_KEY_ID, ALIYUN_SECRET_ACCESS_KEY?

@jiazhai
Copy link
Member

jiazhai commented Dec 20, 2020

@yufan022 @gaoran10 Could we add some tests for this part of code change?

@yufan022
Copy link
Contributor Author

yufan022 commented Dec 20, 2020

@yufan022 @gaoran10 Could we add some tests for this part of code change?

Sure, Is there any example I can refer to? I couldn't find test case about AWS or AZURE

@Renkai
Copy link
Contributor

Renkai commented Dec 21, 2020

@yufan022 @gaoran10 Could we add some tests for this part of code change?

@jiazhai
Seems we don't have actually run any test on S3/Azureblob/Google cloud storage, only some mocks, so add a storage vendor will not actually change the mocks.

@yufan022
Copy link
Contributor Author

/pulsarbot run-failure-checks

@sijie
Copy link
Member

sijie commented Dec 22, 2020

@Renkai okay. make sense.

@sijie
Copy link
Member

sijie commented Dec 22, 2020

@jiazhai @gaoran10 can you review the change again?

@sijie
Copy link
Member

sijie commented Dec 24, 2020

Ping @gaoran10 @jiazhai

@yufan022
Copy link
Contributor Author

Hi, have we made any progress?

@gaoran10
Copy link
Contributor

Sorry for the late response, LGTM.

@jiazhai
Copy link
Member

jiazhai commented Dec 29, 2020

@yufan022 Would you please also add the docs for this aliyun offloader in another PR? Or we add this code, but most of the user will not know how to use this.

Here is some reference: https://github.com/apache/pulsar/blob/master/site2/website/versioned_docs/version-2.7.0/tiered-storage-aws.md

@codelipenghui codelipenghui merged commit 98ad39f into apache:master Dec 29, 2020
@yufan022
Copy link
Contributor Author

@jiazhai No problem. Will the code be merged into the latest version(2.8.0)? It seems that the 2.8.0 document has not been created in the master branch.

@jiazhai
Copy link
Member

jiazhai commented Dec 29, 2020

@jiazhai No problem. Will the code be merged into the latest version(2.8.0)? It seems that the 2.8.0 document has not been created in the master branch.

Thanks @yufan022, For the latest master, the doc is put at : https://github.com/apache/pulsar/tree/master/site2/docs

@yufan022
Copy link
Contributor Author

Sorry for late.
docs: #9696

@yufan022
Copy link
Contributor Author

@codelipenghui Hi, Will the code be merged into 2.7.1 or 2.8.0? Do I need to add documents for 2.7.1?

@Anonymitaet Anonymitaet removed the doc-required Your PR changes impact docs and you will update later. label Apr 12, 2021
zymap added a commit to zymap/pulsar that referenced this pull request May 23, 2022
---

*Motivation*

There have some cloud storages are compatible with S3
APIs, such as aliyun-oss. Some other storages also use
the S3 APIs and want to offload the data into them, but
we only support the AWS or the Aliyun.
The PR apache#8985 provides
the Aliyun offload provider, but it has a force limitation of
the `S3Constants.PROPERTY_S3_VIRTUAL_HOST_BUCKETS`. That
is not a limitation on other storage service which compatible
with S3 APIs.
This PR provides  a more general offload provider `S3` which uses
pure JClouds S3 metadata and allows people to override the
default JClouds properties through system properties.

*Modifications*

- Add the pure S3 offload provider
zymap added a commit that referenced this pull request May 25, 2022
* [improve] [tiered-storage] Add pure S3 provider for the offloader
---

*Motivation*

There have some cloud storages are compatible with S3
APIs, such as aliyun-oss. Some other storages also use
the S3 APIs and want to offload the data into them, but
we only support the AWS or the Aliyun.
The PR #8985 provides
the Aliyun offload provider, but it has a force limitation of
the `S3Constants.PROPERTY_S3_VIRTUAL_HOST_BUCKETS`. That
is not a limitation on other storage service which compatible
with S3 APIs.
This PR provides  a more general offload provider `S3` which uses
pure JClouds S3 metadata and allows people to override the
default JClouds properties through system properties.

*Modifications*

- Add the pure S3 offload provider
hangc0276 pushed a commit that referenced this pull request Jun 7, 2022
* [improve] [tiered-storage] Add pure S3 provider for the offloader
---

*Motivation*

There have some cloud storages are compatible with S3
APIs, such as aliyun-oss. Some other storages also use
the S3 APIs and want to offload the data into them, but
we only support the AWS or the Aliyun.
The PR #8985 provides
the Aliyun offload provider, but it has a force limitation of
the `S3Constants.PROPERTY_S3_VIRTUAL_HOST_BUCKETS`. That
is not a limitation on other storage service which compatible
with S3 APIs.
This PR provides  a more general offload provider `S3` which uses
pure JClouds S3 metadata and allows people to override the
default JClouds properties through system properties.

*Modifications*

- Add the pure S3 offload provider

(cherry picked from commit 047cb0e)
codelipenghui pushed a commit to codelipenghui/incubator-pulsar that referenced this pull request Jun 7, 2022
…e#15710)

* [improve] [tiered-storage] Add pure S3 provider for the offloader
---

*Motivation*

There have some cloud storages are compatible with S3
APIs, such as aliyun-oss. Some other storages also use
the S3 APIs and want to offload the data into them, but
we only support the AWS or the Aliyun.
The PR apache#8985 provides
the Aliyun offload provider, but it has a force limitation of
the `S3Constants.PROPERTY_S3_VIRTUAL_HOST_BUCKETS`. That
is not a limitation on other storage service which compatible
with S3 APIs.
This PR provides  a more general offload provider `S3` which uses
pure JClouds S3 metadata and allows people to override the
default JClouds properties through system properties.

*Modifications*

- Add the pure S3 offload provider

(cherry picked from commit 047cb0e)
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jun 7, 2022
…e#15710)

* [improve] [tiered-storage] Add pure S3 provider for the offloader
---

*Motivation*

There have some cloud storages are compatible with S3
APIs, such as aliyun-oss. Some other storages also use
the S3 APIs and want to offload the data into them, but
we only support the AWS or the Aliyun.
The PR apache#8985 provides
the Aliyun offload provider, but it has a force limitation of
the `S3Constants.PROPERTY_S3_VIRTUAL_HOST_BUCKETS`. That
is not a limitation on other storage service which compatible
with S3 APIs.
This PR provides  a more general offload provider `S3` which uses
pure JClouds S3 metadata and allows people to override the
default JClouds properties through system properties.

*Modifications*

- Add the pure S3 offload provider

(cherry picked from commit 047cb0e)
(cherry picked from commit 72629be)
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jun 7, 2022
…e#15710)

* [improve] [tiered-storage] Add pure S3 provider for the offloader
---

*Motivation*

There have some cloud storages are compatible with S3
APIs, such as aliyun-oss. Some other storages also use
the S3 APIs and want to offload the data into them, but
we only support the AWS or the Aliyun.
The PR apache#8985 provides
the Aliyun offload provider, but it has a force limitation of
the `S3Constants.PROPERTY_S3_VIRTUAL_HOST_BUCKETS`. That
is not a limitation on other storage service which compatible
with S3 APIs.
This PR provides  a more general offload provider `S3` which uses
pure JClouds S3 metadata and allows people to override the
default JClouds properties through system properties.

*Modifications*

- Add the pure S3 offload provider

(cherry picked from commit 047cb0e)
(cherry picked from commit d1db33e)
codelipenghui pushed a commit that referenced this pull request Jun 10, 2022
* [improve] [tiered-storage] Add pure S3 provider for the offloader
---

*Motivation*

There have some cloud storages are compatible with S3
APIs, such as aliyun-oss. Some other storages also use
the S3 APIs and want to offload the data into them, but
we only support the AWS or the Aliyun.
The PR #8985 provides
the Aliyun offload provider, but it has a force limitation of
the `S3Constants.PROPERTY_S3_VIRTUAL_HOST_BUCKETS`. That
is not a limitation on other storage service which compatible
with S3 APIs.
This PR provides  a more general offload provider `S3` which uses
pure JClouds S3 metadata and allows people to override the
default JClouds properties through system properties.

*Modifications*

- Add the pure S3 offload provider

(cherry picked from commit 047cb0e)
dragonls pushed a commit to dragonls/pulsar that referenced this pull request Oct 21, 2022
…e#15710)

* [improve] [tiered-storage] Add pure S3 provider for the offloader
---

*Motivation*

There have some cloud storages are compatible with S3
APIs, such as aliyun-oss. Some other storages also use
the S3 APIs and want to offload the data into them, but
we only support the AWS or the Aliyun.
The PR apache#8985 provides
the Aliyun offload provider, but it has a force limitation of
the `S3Constants.PROPERTY_S3_VIRTUAL_HOST_BUCKETS`. That
is not a limitation on other storage service which compatible
with S3 APIs.
This PR provides  a more general offload provider `S3` which uses
pure JClouds S3 metadata and allows people to override the
default JClouds properties through system properties.

*Modifications*

- Add the pure S3 offload provider
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/tieredstorage type/feature The PR added a new feature or issue requested a new feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Does tiered storage support aliyun oss?
7 participants