-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactored JCloud Tiered Storage #6335
Conversation
Any update on this? |
@david-streamlio sorry for the late response. @jiazhai @codelipenghui and @gaoran10 can you guys review this pull request? @david-streamlio is waiting for a review. |
OrderedScheduler scheduler) throws IOException { | ||
|
||
TieredStorageConfiguration config = TieredStorageConfiguration.create(userMetadata); | ||
return BlobStoreManagedLedgerOffloader.create(config, userMetadata, scheduler); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, the Pulsar support independent offload policy for each namespace, the userMetadata
param contains the global offload configs and the offloadPolicies
param contains the Namespace level offload configs. If the namespace has its own offload policy, we should use the independent offload policy, otherwise, we should use the global offload policy.
The OffloadPolicies
has independent config for S3 and GCS, such as s3ManagedLedgerOffloadRegion
and gcsManagedLedgerOffloadRegion
, follow your way, we could combine them as the region
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The focus of this PR is to refactor the tiered storage to use JCloud. I feel that we need to keep this PR to just that. I suggest we create an issue for this and assign it to me. I will make the change after this PR has been successfully merged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late response. If we don't care about the OffloadPolicies config, it will cause the namespace offload policies invalid.
public OffloadPolicies getOffloadPolicies() { | ||
return offloadPolicies; | ||
// TODO Auto-generated method stub | ||
return null; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, when the LedgerOffloader
is generated by the PulsarService
, the PulsarService
will check the namespace's OffloadPolicies
, if the new OffloadPolicies
is same as the existing OffloadPolicies
, the PulsarService
will reuse the existing LedgerOffloader
, so we should provide the OffloadPolicies
of the LedgerOffloader
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The focus of this PR is to refactor the tiered storage to use JCloud. I feel that we need to keep this PR to just that. I suggest we create an issue for this and assign it to me. I will make the change after this PR has been successfully merged.
move to 2.7.0 first. |
Anything needed from me on this? |
@gaoran10 @codelipenghui can you review this change again? |
@david-streamlio Would you please help resolve the conflicts with master branch? |
I will work on this conflicts shortly. |
@congbobo184 @gaoran10 Would you please help review this PR? |
LGTM, that's so cool. I added an issue to track the problem with the namespace level offload feature, we could complete this PR first and repair that problem. @david-streamlio Hi, could you make a rebase for this PR? Thanks. |
Code has been rebased. |
/pulsarbot run-failure-checks |
@david-streamlio Hi, there are still some problems in this PR, I made some changes based on your work, could you have a look and add these changes to this PR, thanks. gaoran10#12 |
Signed-off-by: xiaolong.ran <rxl@apache.org>
/pulsarbot run-failure-checks |
Would like to merge this first. |
Move this change to 2.6.2, because the #8310 depends on it. |
Motivation In order to facilitate the support of additional JClouds-supported providers, we first needed to clean up the existing code, as there were a lot of if/then/else constructs throughout the code that were based on the assumption that we either supported AWS S3 or Google Cloud Storage. I didn't want to keep adding else if's to these code blocks for every new provider we add, so I decided to refactor the code to make it a bit cleaner Modifications in addition to being home for most of the aforementioned if/then/else blocks, the BlobStoreManagedLedgerOffloader class had multiple responsibilities in addition to providing an implementation for the LedgerOffloader interface. My goal was to simplify this class such that its only responsibility was to implement the LedgerOffloader interface. The other major change was the addition of the JCloudBlobStoreProvider enum, which implements 3 interfaces that allow for it to handle the provider specific logic for things such as acquiring the credentials, validating the configuration, and creating a provider-specific instance of BlobStore. Result After this change, we will be able to easily add support for additional JClouds-supported providers by simply adding new elements to the JCloudBlobStoreProvider Enums since the other logic has been isolated and is not vendor specific. See #2865 for more details * Refactored JCloud Tiered Storage * Refactored JCloud Tiered Storage * Added missing import statements * Refactored JCloud Tiered Storage * Refactored JCloud Tiered Storage * Added missing import statements * Refactored JCloud Tiered Storage * Refactored JCloud Tiered Storage * Added missing import statements * fix test * add test logs * fix logs * fix test * fix * fix * fix broker log * fix configuration * fix * fix * fix * fix get BlobStore * repair test presto query tiered storage data * fix test * fix test * fix test TestPrestoQueryTieredStorage * fix * fix Co-authored-by: Sijie Guo <sijie@apache.org> Co-authored-by: gaoran10 <gaoran_10@126.com> Co-authored-by: xiaolong.ran <rxl@apache.org> (cherry picked from commit d3a5233)
Motivation In order to facilitate the support of additional JClouds-supported providers, we first needed to clean up the existing code, as there were a lot of if/then/else constructs throughout the code that were based on the assumption that we either supported AWS S3 or Google Cloud Storage. I didn't want to keep adding else if's to these code blocks for every new provider we add, so I decided to refactor the code to make it a bit cleaner Modifications in addition to being home for most of the aforementioned if/then/else blocks, the BlobStoreManagedLedgerOffloader class had multiple responsibilities in addition to providing an implementation for the LedgerOffloader interface. My goal was to simplify this class such that its only responsibility was to implement the LedgerOffloader interface. The other major change was the addition of the JCloudBlobStoreProvider enum, which implements 3 interfaces that allow for it to handle the provider specific logic for things such as acquiring the credentials, validating the configuration, and creating a provider-specific instance of BlobStore. Result After this change, we will be able to easily add support for additional JClouds-supported providers by simply adding new elements to the JCloudBlobStoreProvider Enums since the other logic has been isolated and is not vendor specific. See apache#2865 for more details * Refactored JCloud Tiered Storage * Refactored JCloud Tiered Storage * Added missing import statements * Refactored JCloud Tiered Storage * Refactored JCloud Tiered Storage * Added missing import statements * Refactored JCloud Tiered Storage * Refactored JCloud Tiered Storage * Added missing import statements * fix test * add test logs * fix logs * fix test * fix * fix * fix broker log * fix configuration * fix * fix * fix * fix get BlobStore * repair test presto query tiered storage data * fix test * fix test * fix test TestPrestoQueryTieredStorage * fix * fix Co-authored-by: Sijie Guo <sijie@apache.org> Co-authored-by: gaoran10 <gaoran_10@126.com> Co-authored-by: xiaolong.ran <rxl@apache.org>
Motivation In order to facilitate the support of additional JClouds-supported providers, we first needed to clean up the existing code, as there were a lot of if/then/else constructs throughout the code that were based on the assumption that we either supported AWS S3 or Google Cloud Storage. I didn't want to keep adding else if's to these code blocks for every new provider we add, so I decided to refactor the code to make it a bit cleaner Modifications in addition to being home for most of the aforementioned if/then/else blocks, the BlobStoreManagedLedgerOffloader class had multiple responsibilities in addition to providing an implementation for the LedgerOffloader interface. My goal was to simplify this class such that its only responsibility was to implement the LedgerOffloader interface. The other major change was the addition of the JCloudBlobStoreProvider enum, which implements 3 interfaces that allow for it to handle the provider specific logic for things such as acquiring the credentials, validating the configuration, and creating a provider-specific instance of BlobStore. Result After this change, we will be able to easily add support for additional JClouds-supported providers by simply adding new elements to the JCloudBlobStoreProvider Enums since the other logic has been isolated and is not vendor specific. See apache#2865 for more details * Refactored JCloud Tiered Storage * Refactored JCloud Tiered Storage * Added missing import statements * Refactored JCloud Tiered Storage * Refactored JCloud Tiered Storage * Added missing import statements * Refactored JCloud Tiered Storage * Refactored JCloud Tiered Storage * Added missing import statements * fix test * add test logs * fix logs * fix test * fix * fix * fix broker log * fix configuration * fix * fix * fix * fix get BlobStore * repair test presto query tiered storage data * fix test * fix test * fix test TestPrestoQueryTieredStorage * fix * fix Co-authored-by: Sijie Guo <sijie@apache.org> Co-authored-by: gaoran10 <gaoran_10@126.com> Co-authored-by: xiaolong.ran <rxl@apache.org>
# Motivation The PR #6335 lost some PR changes, related PRs as below. 1. PR 4196 (2019/5/29 Merli) Configure static PulsarByteBufAllocator to handle OOM errors (#4196) 2. PR 5356 (2019/10/30 Kelly) [TIEREDSTORAGE] Only seek when reading unexpected entry (#5356) 3. PR 4433 (2019/6/4 Higham) [tiered-storage] Add support for AWS instance and role creds (#4433)
# Motivation The PR #6335 lost some PR changes, related PRs as below. 1. PR 4196 (2019/5/29 Merli) Configure static PulsarByteBufAllocator to handle OOM errors (#4196) 2. PR 5356 (2019/10/30 Kelly) [TIEREDSTORAGE] Only seek when reading unexpected entry (#5356) 3. PR 4433 (2019/6/4 Higham) [tiered-storage] Add support for AWS instance and role creds (#4433) (cherry picked from commit 68759ff)
…ache#8630) # Motivation The PR apache#6335 lost some PR changes, related PRs as below. 1. PR 4196 (2019/5/29 Merli) Configure static PulsarByteBufAllocator to handle OOM errors (#4196) 2. PR 5356 (2019/10/30 Kelly) [TIEREDSTORAGE] Only seek when reading unexpected entry (#5356) 3. PR 4433 (2019/6/4 Higham) [tiered-storage] Add support for AWS instance and role creds (#4433) (cherry picked from commit 68759ff) (cherry picked from commit 8793963)
…ache#8630) # Motivation The PR apache#6335 lost some PR changes, related PRs as below. 1. PR 4196 (2019/5/29 Merli) Configure static PulsarByteBufAllocator to handle OOM errors (#4196) 2. PR 5356 (2019/10/30 Kelly) [TIEREDSTORAGE] Only seek when reading unexpected entry (#5356) 3. PR 4433 (2019/6/4 Higham) [tiered-storage] Add support for AWS instance and role creds (#4433) (cherry picked from commit 68759ff) (cherry picked from commit 8793963) (cherry picked from commit 0388a1a)
Motivation
In order to facilitate the support of additional JClouds-supported providers, we first needed to clean up the existing code, as there were a lot of if/then/else constructs throughout the code that were based on the assumption that we either supported AWS S3 or Google Cloud Storage. I didn't want to keep adding else if's to these code blocks for every new provider we add, so I decided to refactor the code to make it a bit cleaner
Modifications
in addition to being home for most of the aforementioned if/then/else blocks, the BlobStoreManagedLedgerOffloader class had multiple responsibilities in addition to providing an implementation for the LedgerOffloader interface. My goal was to simplify this class such that its only responsibility was to implement the LedgerOffloader interface.
The other major change was the addition of the JCloudBlobStoreProvider enum, which implements 3 interfaces that allow for it to handle the provider specific logic for things such as acquiring the credentials, validating the configuration, and creating a provider-specific instance of BlobStore.
Result
After this change, we will be able to easily add support for additional JClouds-supported providers by simply adding new elements to the JCloudBlobStoreProvider Enums since the other logic has been isolated and is not vendor specific.
See #2865 for more details