
Compatibility with Oracle Kubernetes Engine. #146

Open · shb-mll opened this issue May 19, 2023 · 11 comments

shb-mll commented May 19, 2023

Hi Team,

I want to install the core dump handler on an OKE cluster (the nodes are on v1.24.1 with Oracle-Linux-8.7). The compatibility matrix at https://github.com/IBM/core-dump-handler#kubernetes-service-compatibility doesn't list Oracle Linux as a supported product. Could you confirm whether this can be deployed on OKE? If yes, could you provide the below:

  • supporting configuration to deploy.
  • a configuration we can use to send core dumps to OCI buckets.

No9 commented May 19, 2023

Hey @shb-mll
Unfortunately I don't have an OKE account, so I can't provide a supporting configuration or the configuration to send core dumps to OCI buckets.

That said, if you want to try installing it and report errors or questions in this thread, I am happy to troubleshoot with you as much as I can.

I'm also happy for you to land the changes you discover as a PR.


shb-mll commented May 20, 2023

I installed core-dump-handler on OKE with the daemonset values below. The segfaulter test confirmed that cores were collected at /var/mnt/core-dump-handler/cores on the node.

daemonset:
  crioEndpoint: "unix:///var/run/crio/crio.sock"
  hostContainerRuntimeEndpoint: "/run/crio/crio.sock"
  mountContainerRuntimeEndpoint: true
  extraEnvVars: |-
    - name: S3_ENDPOINT
      value: "https://{bucketnamespace}.compat.objectstorage.us-ashburn-1.oraclecloud.com"
  s3BucketName: "BUCKETNAME"
  s3Region: "us-ashburn-1"
  s3Secret: "XXX"
  s3AccessKey: "XXX"

OCI is Amazon S3 API compatible - link.
As outlined in the prerequisite section of the article, I configured the following to set up access from Amazon S3 (core-dump-handler) to Object Storage:
a. designated a compartment for the Amazon S3 Compatibility API
b. created a customer secret key
c. set S3_ENDPOINT

However, I see the below error during upload to the OCI bucket.

[2023-05-20T09:20:52Z INFO  core_dump_agent] Executing Agent with location : /var/mnt/core-dump-handler/cores
[2023-05-20T09:20:52Z INFO  core_dump_agent] Setting s3 endpoint location to: https://{bucketnamespace}.compat.objectstorage.us-ashburn-1.oraclecloud.com
[2023-05-20T09:20:52Z INFO  core_dump_agent] Dir Content ["/var/mnt/core-dump-handler/cores/858c15b9-8ef2-47b5-97ac-6ce2febb272a-dump-1684532650-segfaulter-segfaulter-1-4.zip"]
[2023-05-20T09:20:52Z INFO  core_dump_agent] Uploading: /var/mnt/core-dump-handler/cores/858c15b9-8ef2-47b5-97ac-6ce2febb272a-dump-1684532650-segfaulter-segfaulter-1-4.zip
[2023-05-20T09:20:52Z INFO  core_dump_agent] zip size is 29662
[2023-05-20T09:20:52Z ERROR core_dump_agent] Upload Failed Got HTTP 403 with content '<?xml version="1.0" encoding="UTF-8"?><Error><Message>The required information to complete authentication was not provided.</Message><Code>SignatureDoesNotMatch</Code></Error>'
[2023-05-20T09:20:52Z INFO  core_dump_agent] INotify Starting...
[2023-05-20T09:20:52Z INFO  core_dump_agent] INotify Initialised...
[2023-05-20T09:20:52Z INFO  core_dump_agent] INotify watching : /var/mnt/core-dump-handler/cores


shb-mll commented May 20, 2023

OK, it appears that there was some issue with the node where the core dump was collected. I removed the node and ran the segfaulter test again, and it completed with a successful upload to the OCI bucket.

However, storing the customer secret key in a Kubernetes secret is not an optimal approach; I need to find a better way to authenticate to the OCI bucket.


No9 commented May 21, 2023

Hey @shb-mll
This is excellent progress, thanks for the update.
In terms of a different way to manage access, you may want to investigate whether OKE workload identities are integrated with OCI buckets.
There may be a pattern available similar to the AWS Security Token Service:
https://github.com/IBM/core-dump-handler/blob/main/charts/core-dump-handler/values.aws.sts.yaml
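
For context, the AWS version follows the standard IRSA pattern of annotating the chart's service account with an IAM role. A rough sketch of that pattern (the ARN is a placeholder and the exact key layout may differ from the linked file):

serviceAccount:
  annotations:
    # Placeholder role ARN: IRSA projects a web identity token into the pod and
    # the AWS SDK exchanges it with STS for temporary credentials, so no
    # long-lived keys need to live in a Kubernetes secret.
    eks.amazonaws.com/role-arn: "arn:aws:iam::123456789012:role/core-dump-upload"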


shb-mll commented May 22, 2023

Hey @No9,
Yeah, I checked that; however, the workload identity implementation is different in OKE. As of today there is no concept of annotating a k8s service account with an OCI IAM account.
With the workload identity feature the OKE service account does act as an identity, but the authorisation is handled in the application code. So to use this, one would need to update the core-dump-handler code to use OkeWorkloadIdentityAuthenticationDetailsProvider and also specify the OCI resource to access.

Currently workload identity is only supported in the Go and Java SDKs (https://docs.oracle.com/en-us/iaas/Content/ContEng/Tasks/contenggrantingworkloadaccesstoresources.htm#:~:text=The%20following%20OCI,v2.54.0%20(and%20later)), so I am not sure whether it will work in Rust.


No9 commented May 22, 2023

OK, can you explain a bit more about what is meant by:

  "storing the customer secret key in a kubernetes secret is not optimal way"

If you are looking to provide the core dump to an external user, you may want to look at building a post-processor using one of these two options:

  1. Disable the built-in uploader and provide your own.
    https://github.com/IBM/core-dump-handler/blob/main/FAQ.md#how-should-i-integrate-my-own-uploader

  2. Enable the event system and implement an external service (see the sketch after this list).
    By setting the daemonset.eventDirectory and composer.coreEvents options in the chart, an extra file is generated whenever a core dump is generated, and that file can be used for post-processing. This enables you to still utilise the upload feature, but you may want to move the core dump to a location outside of the core environment.
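
For option 2, the values change would look roughly like the sketch below. The two option names come from the chart; the directory path is my assumption, so check the chart's values.yaml for the exact default:

daemonset:
  # Host directory where event files are written for each dump
  # (this path is illustrative, not necessarily the chart default).
  eventDirectory: "/var/mnt/core-dump-handler/events"
composer:
  # Emit an additional event file alongside each core dump zip so an
  # external service can watch the directory and post-process.
  coreEvents: true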


shb-mll commented May 22, 2023

The customer secret key is created per user in OCI. It is an access key/secret key pair used to access Object Storage in OCI via the Amazon S3 compatible API.

In the default setup of core-dump-handler the keys are base64 encoded and stored in the secret s3config, which is not that secure.

Thanks for the suggestions, I will check whether it's possible to implement the two options in my setup.

About option 2: is there more information on how exactly the additional file can be used for post-processing? Could you share some example setups if available?


shb-mll commented Sep 15, 2023

Hi @No9. Due to some requirements I had to downgrade my worker nodes to Oracle Linux 7 (earlier I was using Oracle Linux 8). I made some changes to the daemonset values (listed below), and with these changes I can see that the core dump is generated by the composer, but it runs into an error at the start of the upload step. Could you assist with this error message?

[2023-09-15T19:28:15Z INFO  core_dump_agent] Uploading: /var/mnt/core-dump-handler/cores/xxx-xxx-xxx-xx-xxx-dump-xxx-segfaulter-segfaulter-1-4.zip
[2023-09-15T19:28:15Z INFO  core_dump_agent] zip size is 29397
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidHeaderValue', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/rust-s3-0.31.0/src/request_trait.rs:434:65
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

My current settings for daemonset and composer:

daemonset:
  crioEndpoint: "unix:///var/run/crio/crio.sock"
  hostContainerRuntimeEndpoint: "/run/crio/crio.sock"
  mountContainerRuntimeEndpoint: true
  vendor: rhel7

  extraEnvVars: |-
    - name: S3_ENDPOINT
      value: "https://{namespace}.compat.objectstorage.us-ashburn-1.oraclecloud.com"
composer:
  logLevel: "Debug"


No9 commented Sep 15, 2023

Hey @shb-mll
This error is being thrown by the rust-s3 library:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidHeaderValue', /root/.cargo/registry/src/github.com-1ecc6299db9ec823/rust-s3-0.31.0/src/request_trait.rs:434:65

https://github.com/durch/rust-s3/blob/7fdb685d71385152198f906068f15faaabd28592/s3/src/error.rs#L39

It looks like the Oracle Object Storage API isn't compatible with that library.

Just double-checking: in your daemonset config, have you replaced {namespace} with the actual namespace?

If you have configured the namespace properly, then can I suggest you raise an issue (or provide a fix) in the rust-s3 library, and we can pick it up by bumping the dependency.

[Edit]
Or you can provide your own uploader, as discussed previously.


shb-mll commented Sep 20, 2023

@No9 yes, I provided the namespace value for S3_ENDPOINT.

Also, this worked on Oracle Linux 8 when I tested earlier, on 20th May (screenshot not reproduced here).

Also, there has been no update to the Oracle compatibility API since 2017:
https://docs.oracle.com/en-us/iaas/releasenotes/changes/0045f4a2-9afa-4f68-86b6-59dd70052ca8/


No9 commented Sep 22, 2023

Thanks for the update. As the last release of this project was in January, the service was working in May, and the error is based on the HTTP response from the object storage service, it does point to the issue being caused by downstream (i.e. object storage) config issues or changes.

We don't have an Oracle Cloud account to validate or debug, so any further investigation would need to come from your side.

Can I suggest, as a next step, that you reproduce the issue by creating a standalone app that just uses the same version of the rust-s3 library, 0.31.0? This will really help with triage.
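
For reference, a minimal repro sketch under a few assumptions: it pins rust-s3 0.31.0 and uses the Region::Custom / Bucket::new / put_object API shape I'd expect from that release (return types may differ slightly between versions), and the endpoint, bucket name and keys are placeholders:

// Cargo.toml (assumed):
//   [dependencies]
//   rust-s3 = "0.31.0"
//   tokio = { version = "1", features = ["macros", "rt-multi-thread"] }

use s3::creds::Credentials;
use s3::{Bucket, Region};

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Mirror the agent's config: a custom region pointing at the OCI
    // S3-compatibility endpoint. Replace {bucketnamespace} with your
    // Object Storage namespace before running.
    let region = Region::Custom {
        region: "us-ashburn-1".to_owned(),
        endpoint: "https://{bucketnamespace}.compat.objectstorage.us-ashburn-1.oraclecloud.com"
            .to_owned(),
    };

    // Placeholder customer secret key pair (access key / secret key).
    let creds = Credentials::new(Some("ACCESS_KEY"), Some("SECRET_KEY"), None, None, None)?;

    let bucket = Bucket::new("BUCKETNAME", region, creds)?;

    // A small PUT should exercise the same signing/header code path that
    // panics with InvalidHeaderValue in request_trait.rs.
    // In 0.31, put_object is expected to return (response_bytes, http_status).
    let (_, status) = bucket.put_object("/repro-test.zip", b"repro payload").await?;
    println!("upload returned HTTP {}", status);
    Ok(())
}

If this small program panics the same way against your endpoint, it gives the rust-s3 maintainers something self-contained to triage.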
