[Merged by Bors] - Allow enabling secure mode with Kerberos #334

Closed
wants to merge 107 commits into from

Conversation

sbernauer
Member

@sbernauer sbernauer commented Mar 16, 2023

Description

Closes #178
Fixes #338

TODOs

  • Release new Hadoop image with openssl and Kerberos clients, and use it in docs and tests
  • Release and use operator-rs change
  • Fix hardcoded kinit nn/simple-hdfs-namenode-default.default.svc.cluster.local@CLUSTER.LOCAL -kt /stackable/kerberos/keytab in entrypoints
  • Go through all hadoop settings and see if they can be improved
  • Test different realms
  • Discuss CRD change
  • Discuss how to expose this in Discovery CM -> During on-site 2023/05 we have decided to ship this feature without exposing it via discovery for now
  • Implement discovery
  • Tests
  • Docs
  • Let @maltesander have a look at how we can better include the init container in the code structure
  • Test long running cluster (maybe turn down ticket lifetime for that)
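
The "hardcoded kinit" item above can be illustrated with a minimal sketch. It assumes the principal follows the `<service>/<host>@<REALM>` layout of the hardcoded example; `kinit_command` and its parameters are hypothetical names for illustration, not the operator's actual code.

```rust
/// Hypothetical helper (not taken from the operator): builds the `kinit`
/// command line from its parts instead of hardcoding the principal.
/// Assumes the principal has the form `<service>/<host>@<REALM>`.
fn kinit_command(service: &str, host: &str, realm: &str, keytab: &str) -> String {
    // `-kt <keytab>` tells kinit to authenticate with the given keytab file.
    format!("kinit {service}/{host}@{realm} -kt {keytab}")
}
```

Calling `kinit_command("nn", "simple-hdfs-namenode-default.default.svc.cluster.local", "CLUSTER.LOCAL", "/stackable/kerberos/keytab")` reproduces the hardcoded command from the TODO item, while a different realm only changes one argument.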

Definition of Done Checklist

  • Not all of these items are applicable to all PRs; the author should update this template to leave in only the boxes that are relevant
  • Please make sure all these things are done and tick the boxes

Author

  1. Changes are OpenShift compatible
  2. CRD changes approved
  3. Helm chart can be installed and deployed operator works
  4. Integration tests passed (for non trivial changes)

Reviewer

  1. Code contains useful comments
  2. (Integration-)Test cases added
  3. Documentation added or updated
  4. Changelog updated
  5. Cargo.toml only contains references to git tags (not specific commits or branches)

Acceptance

  1. Feature Tracker has been updated
  2. Proper release label has been added

Once the review is done, comment bors r+ (or bors merge) to merge. Further information

…he datanode wait-for-namenodes init container fails because it's stupid and requires the dfs.namenode.kerberos.principal setting (and ignores the dfs.namenode.kerberos.principal.pattern) (hdfs will start anyway)
bors bot pushed a commit to stackabletech/operator-rs that referenced this pull request Mar 17, 2023
@sbernauer sbernauer requested a review from a team June 2, 2023 06:41
Member

@nightkr nightkr left a comment

Mostly LGTM in theory, a few nits and haven't tested it yet.

docs/modules/hdfs/pages/usage-guide/security.adoc (outdated, resolved)
rust/operator/src/container.rs (outdated, resolved)
rust/operator/src/container.rs (outdated, resolved)
rust/operator/src/container.rs (resolved)
rust/operator/src/kerberos.rs (resolved)
rust/operator/src/kerberos.rs (outdated, resolved)
@@ -0,0 +1,49 @@
# Tribute to https://github.com/Netflix/chaosmonkey

# We need to force-delete the Pods, because IONOS is sometimes unable to delete the pod (it's stuck in Terminating for > 20 minutes)
Member

That's worrying... do we have access to IONOS' kubelet logs? Is this an IONOS-only issue?

Member Author

Have to admit I'm not sure whether we have access to the kubelet logs. The test passed on Azure, however.

Member Author

Update: Support recommended using https://github.com/kvaps/kubectl-node-shell, which worked.
Sadly it only logs stuff during startup and then remains silent.

journalctl -u kubelet --since "48 hour ago"
-- Logs begin at Sat 2023-06-03 19:51:35 UTC, end at Tue 2023-06-06 12:01:05 UTC. --
-- No entries --

Anyway, even with force deletion, tests have random timeouts on IONOS since we added the chaosmonkey. Everything is really slow.
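
For readers unfamiliar with the force deletion discussed in this thread, here is a minimal sketch of the `kubectl` arguments involved. `force_delete_args` is a hypothetical helper for illustration and not part of the chaosmonkey test file; `--force` together with `--grace-period=0` is the standard kubectl way to skip graceful termination.

```rust
/// Hypothetical helper (not from the test suite): builds the argument list
/// for force-deleting a single Pod with kubectl.
fn force_delete_args(namespace: &str, pod: &str) -> Vec<String> {
    vec![
        "delete".to_string(),
        "pod".to_string(),
        pod.to_string(),
        "--namespace".to_string(),
        namespace.to_string(),
        // Skip graceful termination so a Pod stuck in Terminating
        // does not block the test run.
        "--force".to_string(),
        "--grace-period=0".to_string(),
    ]
}
```

The resulting vector could be passed to `std::process::Command::new("kubectl").args(...)`; the namespace and Pod name are whatever the test selects.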

@sbernauer
Member Author

Should be ready to be merged 👍

@sbernauer sbernauer requested a review from nightkr June 12, 2023 13:13
Member

@nightkr nightkr left a comment

Waiting for integration tests to complete, but LGTM otherwise.

We'll probably want to note this in the changelog though?

@nightkr
Member

nightkr commented Jun 13, 2023

Tests passed.

@sbernauer
Member Author

All that and I forgot the CHANGELOG :D
Thanks for your review and work + help on Kerberos overall!

I have had 4 test clusters running since this morning; let's wait a while before merging to see if they survive

@sbernauer
Member Author

All 4 clusters are still healthy, merging this

@sbernauer
Member Author

bors r+

bors bot pushed a commit that referenced this pull request Jun 14, 2023
# Description

Closes #178
Fixes #338

TODOs

- [x] Release new Hadoop image with openssl and Kerberos clients, and use it in docs and tests
- [x] Release and use operator-rs change
- [x] Fix hardcoded `kinit nn/simple-hdfs-namenode-default.default.svc.cluster.local@CLUSTER.LOCAL -kt /stackable/kerberos/keytab` in entrypoints
- [x] Go through all hadoop settings and see if they can be improved
- [X] Test different realms
- [x] Discuss CRD change
- [x] Discuss how to expose this in Discovery CM -> During on-site 2023/05 we have decided to ship this feature without exposing it via discovery *for now*
- [x] Implement discovery
- [x] Tests
- [x] Docs
- [x] Let @maltesander have a look at how we can better include the init container in the code structure
- [x] Test long running cluster (maybe turn down ticket lifetime for that)
@bors
Contributor

bors bot commented Jun 14, 2023

Pull request successfully merged into main.

Build succeeded!

@bors bors bot changed the title Allow enabling secure mode with Kerberos [Merged by Bors] - Allow enabling secure mode with Kerberos Jun 14, 2023
@bors bors bot closed this Jun 14, 2023
@bors bors bot deleted the spike/security2 branch June 14, 2023 11:54
Successfully merging this pull request may close these issues.

  • Cluster bricks when all journalnodes are down
  • Allow enabling secure mode with Kerberos on HDFS
3 participants