New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Storage: Add Oracle Cloud Infrastructure Object Storage Bucket support #4661
Storage: Add Oracle Cloud Infrastructure Object Storage Bucket support #4661
Conversation
16a3cfd
to
21e07f2
Compare
Here is
The |
For ci/circleci testing |
For go / Linters (Static Analysis) for Go
I assume this is normal for a PR. Please let me know if this is not the case here. Thx! |
Please add the DCO (: |
0970286
to
91db551
Compare
Please let me know if there is any additional information that I need to provide to help the review process. Thx! |
3ef73f7
to
e67c44d
Compare
Hi, looks great and I think it makes sense to provide wider support for different clouds.
Is it though? I'm quite confident that a specific ULID is created and it should not be a problem to use the same object store for n-amount of instances. If one would like to separate the data, one could use a different object store instance. I'm not really a fan of the Perhaps you could elaborate a bit more on it for me, it might be that I'm missing the actual point of course :) |
To answer your initial question. Here is use case for our team: We use OCI instance principle to grant bucket access to the K8s cluster. We also create/delete prometheus instances dynamically (on demand) all the time. With the In additional, with the
I am open for suggestion for my MR. If you really think I have the similar code running in few of my prometheus instances for over couple months with hiccup 🤞 Again, thanks for reviewing my MR. |
To be clear, you have equal say in how we should do things! :) I think your use-case makes sense and perhaps it should even be a thing to implement that path for each backend we have. Let's wait for some more reviews. I'll reminder myself of this PR and will ping others within 2 weeks if there is no other review to get this moving :) |
Thanks. Looking forward to seeing more reviewers. If it helps, I am happy to do a quick demo during the Thursday US community meeting in couple weeks so reviewers/people can better understand the motivation of this MR and how things work in our team and OCI. Just let me know here and feel free to slack me in #thanos-dev. |
e67c44d
to
729f80b
Compare
Hi @wiardvanrij, @GiedriusS, @bwplotka |
729f80b
to
83084b1
Compare
Hi @wiardvanrij @GiedriusS @bwplotka |
docs/storage.md
Outdated
The default HTTP Client comes with this Thanos OCI Object Storage should be able to handle most of the use cases. | ||
|
||
By default, like other clients, Thanos OCI Object Storage client will storage the data from a single prometheus instance to the top directory of the bucket. | ||
This means each bucket can only store one prometheus instance data. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this mean? Why only one? 🤔 Block UUIDs are quite random and it is very unlikely that there would be a clash.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the poor description.
When thanos uploads all the blocks from a prometheus instance, it will create a directory structure similar to the following in OCI objectstore
|-- 01FF8QX35297NG9HCDHGH1BN1
| |-- chunks
| |-- 000001
| |-- 000002
| |-- ....
|-- 01FFPFKMAQQNF4N945X7JWPQ8
| |-- chunks
|. |-- ....
|-- .... other prometheus blocks
|
|-- debug
|-- meta
|-- .....
I consider above as one prometheus instance data.
In this PR, a new option called object_base_path is added.
If defined, the whole directory structure above will be uploaded under the object_base_path prefix.
In second screenshot in this PR, it holds 4 prometheus instances.
Prometheus A - object_base_path is set to thanos.dev.thanos
Prometheus B - object_base_path is set to thanos.meta.sa/0
Prometheus C - object_base_path is set to thanos.meta.sa/1
Prometheus D - object_base_path is set to thanos.meta.sa/2
Here is the illustration in OCI objectstorage bucket.
|-- thanos.dev.thanos
|-- <<Promethus-A with all blocks and debug meta>>
|-- thanos.meta.sa
|-- 0 <<Promethus-B with all blocks and debug meta>>
|-- 1 <<Promethus-C with all blocks and debug meta>>
|-- 2 <<Promethus-D with all blocks and debug meta>>
|-- 01FF8QX35297NG9HCDHGH1BN1
| |-- chunks
| |-- 000001
| |-- 000002
| |-- ....
|-- 01FFPFKMAQQNF4N945X7JWPQ8
| |-- chunks
|. |-- ....
|-- .... other prometheus blocks
|
|-- debug
|-- meta
|-- .....
If you want to provide a better description for the Thanos community, I will be happy to change it in my PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's what I mean - this #1318 has been a long-standing issue. I think this kind of functionality needs to be added "one level above" i.e. not at the raw object storage calls level. Otherwise, it would be weird to have duplicate functionality. I think others would agree as well. My suggestion would be to remove this prefix functionality and move it one level above so that it could be used with any kind of object storage. It could be done in a separate pull request. We could double check in #thanos-dev, though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it. Yes, adding the feature to one level up
is a great idea for all Objectstorage clients.
Since allowing the same bucket to hold multiple prometheus instances' data is the main use case of our team and we would like to use Thanos master branch instead of my forked branch.
I can think of three different approaches (order by my preferences) to avoid duplicate functionality in my PR:
Approach # 1
- hide the object_base_path configuration from storage.md documentation
- without the object_base_path setting, my PR will act exactly the same as other ObjecctStorage Clients
- the hidden feature can still be used by the insider who know the existence of this feature
- after the code check it, whoever want to work on *: Support multi-tenancy on object storage using custom prefix #1318 will have an example how OCI client is implemented.
Approach # 2
- hide the object_base_path configuration from storage.md documentation
- hard code the object_base_path to empty string and won't accept any hidden customization
Approach # 3
Remove all the object_base_path throughout the PR.
Please let me know which approach is acceptable for you.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay again, bit busy times.
As I like the different approaches I'm not that fan of hidden features in a way that one needs to know the existence in an open source project. Also I took some time to read the long standing issue and the consensus after the discussion was this:
Looks like the decision is that we are ok for simple custom prefix on config level (for all objstorages) in YAML (:
So, I would say that if we implement it, it should be "that". Still open for discussion though, but then it should require a lot more discussion from everyone else again IMO ^^
I would say, let's just implement this on level higher as discussed in that main issue. For all the storage backends, implement at least the config in a way that it can say "This is not supported yet for this object store" if one would configure it. With then the exception for OCI, in which you did implement it.
This gives a bit more motivation to get that issue fixed, and we can have iterations per objectstorage to actually implement it. I think by saying "It needs to be done for them all in one go", is really limiting us to get it moving forward.
What do you think? Also, I'm actually happy to help on that. I got some spare time soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really think we should do (3) approach here. Then, you could implement it in a follow up PR. We have interfaces for object storage so IMHO it should be fairly easy to implement prefix functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove the prefix option when I have a chance.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't forget to remove this sentence as it is not true in practice, it is very hard to get a UUID collision.
Thanks @GiedriusS for the review. |
83084b1
to
c6c1e11
Compare
f0edce8
to
091134f
Compare
399c53c
to
98e4346
Compare
Hi @wiardvanrij |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because the acronym OCI is overloaded, it would be better to expand out all the visible uses of OCI to Oracle Cloud Infrastructure to avoid any potential confusion. I've also had a go at tidying up the docs and error messages, but please test this before updating the PR! :)
98e4346
to
f972f82
Compare
@wiardvanrij
|
Yea, katacoda closed it's doors (: - We have fixed this on main now (#5454) , please have an other rebase (sorry about that :D ) |
Also feel free to create a changelog entry already ^^ |
f972f82
to
e5c3c12
Compare
Hi @wiardvanrij Rebased. Hopefully, that will resolve the last failed doc check. I also checked the |
e5c3c12
to
8ad71ac
Compare
FYI
|
pkg/objstore/oci/helper.go
Outdated
"github.com/thanos-io/thanos/pkg/objstore" | ||
) | ||
|
||
func DefaultTransport(config Config) *http.Transport { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a common function for this now - could you please remove this and use that function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @GiedriusS,
Sorry for the late response, I am on vacation this week.
This DefaultTransport
func allow user to define some customize setting here
type: OCI
config:
provider: "default"
bucket: ""
compartment_ocid: ""
part_size: "" // Optional part size to override the OCI default of 128 MiB, value is in bytes.
max_request_retries: "" // Optional maximum number of retries for a request.
request_retry_interval: "" // Optional sleep duration in seconds between retry requests.
http_config:
idle_conn_timeout: 1m30s // Optional maximum amount of time an idle (keep-alive) connection will remain idle before closing itself. Zero means no limit.
response_header_timeout: 2m // Optional amount of time to wait for a server's response headers after fully writing the request.
tls_handshake_timeout: 10s // Optional maximum amount of time waiting to wait for a TLS handshake. Zero means no timeout.
expect_continue_timeout: 1s // Optional amount of time to wait for a server's first response headers. Zero means no timeout and causes the body to be sent immediately.
insecure_skip_verify: false // Optional. If true, crypto/tls accepts any certificate presented by the server and any host name in that certificate.
max_idle_conns: 100 // Optional maximum number of idle (keep-alive) connections across all hosts. Zero means no limit.
max_idle_conns_per_host: 100 // Optional maximum idle (keep-alive) connections to keep per-host. If zero, DefaultMaxIdleConnsPerHost=2 is used.
max_conns_per_host: 0 // Optional maximum total number of connections per host.
disable_compression: false // Optional. If true, prevents the Transport from requesting compression.
client_timeout: 90s // Optional time limit for requests made by the HTTP Client.
Do you want me to ONLY
use the DefaultTransport
func in net/http/transport.go to construct the Transport for the httpclient?
If this is the case, I will need to remove the customize config fields (which are hard coded) from the PR.
For now, I renamed the DefaultTransport
to CustomTransport
.
Please advice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the late response, I am on vacation this week.
Sorry @aarontams for the earlier message, just saw this one. Maybe you can open another PR against the new repo when you get back.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some suggestions by @Djelibeybi to the documentation + one nit. Could you please look through those, rebase, and then we can merge this?
Signed-off-by: aaron.tam <aaron.tam@oracle.com>
8ad71ac
to
575d30e
Compare
It was already done in previous update from 2 weeks ago Thx! |
@aarontams Could you please rebase the PR? I would like to merge these changes and sync them to thanos-io/objstore before the #5254 is merged. |
Closing this to move it to thanos-io/objstore#14 |
@kakkoyun @GiedriusS @wiardvanrij |
@aarontams it was moved to thanos-io/objstore#14 |
@Djelibeybi
I thought @kakkoyun wanted to merge this PR to https://github.com/thanos-io/thanos repo first. Instead this PR is closed. I am confused. Am I missing something obvious? |
Ah, right. Sorry, I just saw the link and didn't look too closely. Guess we'll have to wait and see. |
Sorry about that. I totally missed this pr and enabled auto-merge for #5510. This causes all the object-store related code to be removed from this codebase. I think @kakkoyun intended to merge this pr first so that's why he asked you for a rebase. But I just merged that #5510 so that Kemal had to only close this pr. I am so sorry about it. @aarontams Would you mind opening a pr on the repo for this change? Thanks! |
Sorry for not clearly communicating my actions here. It is exactly as @yeya24 explained. I saw that you're on vacation and I didn't want to push you to rebase. So I moved on. We still want to have these changes. That's why I have opened an issue on the new objstore repo for this. I know it's a bit of more work. Could you please open a new PR on the new repo? I hadn't done it because the package structure has changed and it's more involved than syncing over some files. Sorry for the inconvenience again @aarontams |
Hi @kakkoyun and @yeya24 I have a new PR ready in thanos-io/objstore#15 with new test result. Hopefully, it will be a smooth ride from here. We will move the future conversation to the new PR. |
Changes
Add Oracle Cloud Infrastructure (OCI) Object Storage Bucket support
OCI Object Storage supports three different providers, default, instance principle, and raw. A lot of OCI users like to use Instance principle to security. It is very important to support instance principle for the OCI users.
S3 doesn't support OCI instance principle.
Support multiple prometheus instances with a single bucket
By default, like other clients, this new Thanos OCI Object Storage client will storage the data from a single prometheus instance to the top directory of the bucket. This means each bucket can only store one prometheus instance data.
In OCI, a lot of users like to put different type of data in a single bucket.
To support multiple prometheus instances with a single bucket, a new object_base_path option is introduced in this OCI Object Storage client. If defined, the value will be used as the prefix of the prometheus data in OCI bucket.
For example, if you want to use the same bucket for 2 prometheus instances, you can define object_base_path as /path/for/prometheus1 for the first prometheus instance, and /path/for/prometheus2 for the second prometheus instance.
Three tests are included here to demo how the object_base_path work.
Verification
Running similar OCI client against 3 prometheus instances for over 50 days without any issue.
For each prometheus instance
Testing
See the test steps in storage.md under OCI section.
Ran three tests with different OCI_OBJECT_BASE_PATH settings. Below are the screenshots from Oracle Cloud Infrastructure console to show the object paths when the tests were running.
Test 1 - OCI_OBJECT_BASE_PATH is not set
Test 2 - OCI_OBJECT_BASE_PATH=/my/base/path
Test 3 - OCI_OBJECT_BASE_PATH=test3/base-path