feat: enable TLS for Flame #388

Merged: k82cn merged 12 commits into xflops:main from k82cn:flm_234 on Mar 29, 2026
@k82cn k82cn commented Mar 26, 2026

fix #234

@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the security posture of the Flame cluster by implementing mutual TLS (mTLS) across all internal communication channels. Beyond encrypting data in transit and authenticating components, it introduces a robust mechanism for user-based isolation by associating sessions and applications with the identity of the client that created them. This foundational change improves data integrity, prevents unauthorized access, and lays the groundwork for future access control features.

Highlights

  • Mutual TLS (mTLS) Implementation: Enabled mTLS for all inter-component communication within the Flame cluster, securing connections between the Session Manager, Executor Manager, and Object Cache. This includes encryption, server authentication, and mandatory client authentication.
  • Session and Application Isolation: Introduced client identity extraction from mTLS certificates (Common Name) to establish ownership for sessions and applications. This enforces user-based isolation, ensuring users can only access resources they own, while allowing configurable admin users full access.
  • TLS Configuration and Management: Added comprehensive TLS configuration options to flame-cluster.yaml and flame.yaml, allowing users to enable/disable mTLS and specify certificate paths. New flmadm cert commands were introduced to simplify the generation of CA, server, and client certificates.
  • Dependency and Schema Updates: Updated Cargo.toml and Cargo.lock with new Rust crates for X.509 parsing, certificate generation, and Rustls-based TLS. Database schemas for sessions and applications were modified to include an owner field and corresponding indexes for persistent ownership tracking.
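
The ownership rule described in the highlights can be illustrated with a minimal Python sketch. This is not Flame's actual (Rust) implementation: the `Session` type, the `can_access` helper, and the `admins` set are hypothetical names chosen for illustration. The only facts taken from the summary are that each session records an owner derived from the client certificate's Common Name, and that configured admin users have full access.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical stand-in for a session record with an owner field."""
    id: str
    owner: str  # Common Name of the client that created the session


def can_access(session: Session, client_cn: str, admins=frozenset()) -> bool:
    """Allow access when the caller is an admin or owns the session."""
    return client_cn in admins or session.owner == client_cn


if __name__ == "__main__":
    s = Session(id="ssn-1", owner="alice")
    print(can_access(s, "alice"))                   # owner
    print(can_access(s, "bob"))                     # neither owner nor admin
    print(can_access(s, "root", admins={"root"}))   # configured admin
```

Keeping the check in one small function means every read, update, and delete path can apply the same rule, which is one way to avoid the isolation being enforced inconsistently.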
Ignored Files
  • Ignored by pattern: .github/workflows/** (3)
    • .github/workflows/e2e-bench.yaml
    • .github/workflows/e2e-py.yaml
    • .github/workflows/e2e-rust.yaml

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements mutual TLS (mTLS) across Flame components, enhancing security by encrypting inter-component communication and enabling client authentication. Key changes include adding owner fields to Application and Session structs to enforce identity-based access control using client certificate Common Names (CNs). Configuration files (flame-cluster.yaml, flame.yaml) are updated to support TLS settings, and new flmadm cert commands are introduced for certificate generation. The Cargo.toml files are updated with necessary TLS-related dependencies and tonic features. Review feedback points out a discrepancy between the design document and implementation regarding a min_version TLS configuration option, a missing tls-native-roots feature for tonic, an opportunity to simplify client CN extraction logic, and a missing test case for duplicate session ID creation.
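
As a rough illustration of what compact client CN extraction can look like, here is a Python sketch. It is not the code under review in `session_manager/src/apiserver/frontend.rs`, which is Rust. It assumes the certificate is available in the dictionary format returned by the standard library's `ssl.SSLSocket.getpeercert()`, where `subject` is a tuple of relative distinguished names. The helper name `common_name` is hypothetical.

```python
from typing import Optional


def common_name(peercert: dict) -> Optional[str]:
    """Return the first commonName in the certificate subject, if any.

    `peercert` follows the layout of ssl.SSLSocket.getpeercert():
    {"subject": ((("countryName", "US"),), (("commonName", "alice"),))}
    """
    for rdn in peercert.get("subject", ()):
        for key, value in rdn:
            if key == "commonName":
                return value
    return None


if __name__ == "__main__":
    cert = {"subject": ((("countryName", "US"),), (("commonName", "alice"),))}
    print(common_name(cert))
    print(common_name({}))
```

Returning `None` rather than raising leaves the caller to decide whether a missing CN means the request is rejected or treated as anonymous.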

Comment thread docs/designs/RFE234-mtls/FS.md Outdated
Comment thread docs/designs/RFE234-mtls/FS.md Outdated
Comment thread session_manager/src/apiserver/frontend.rs Outdated
Comment thread session_manager/src/storage/engine/filesystem.rs
Signed-off-by: Klaus Ma <klausm@nvidia.com>

k82cn commented Mar 28, 2026

@gemini-code-assist, can you help review the new code?

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements server-side TLS encryption for all gRPC and Arrow Flight connections across Flame components, including the Session Manager, Executor Manager, and Object Cache. It introduces a new configuration structure where executor settings are moved under the cluster section, adds a certificate generation script, and updates both the Rust and Python SDKs to support secure connections. Critical security issues were identified in the implementation of the insecure_skip_verify flag in both SDKs, as the current code does not actually bypass certificate verification. Additionally, feedback suggests centralizing TLS configuration logic, removing redundant file existence checks to prevent TOCTOU race conditions, and correcting contradictions in the design documentation regarding TLS inheritance.
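
Two of the points above, a skip-verify flag that really disables verification and the removal of file existence pre-checks, can be sketched with Python's standard `ssl` module. This is an illustration only: the Flame Python SDK connects over gRPC, and its actual code in `sdk/python/src/flamepy/core/client.py` is not shown here. The function name `build_ssl_context` and its parameters are assumptions made for the example.

```python
import ssl
from typing import Optional


def build_ssl_context(ca_file: Optional[str] = None,
                      insecure_skip_verify: bool = False) -> ssl.SSLContext:
    """Build a client-side TLS context (hypothetical helper)."""
    if insecure_skip_verify:
        ctx = ssl.create_default_context()
        # check_hostname must be turned off before verify_mode can be
        # set to CERT_NONE; otherwise the ssl module raises ValueError.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
        return ctx
    # No os.path.exists() pre-check: loading the file directly avoids a
    # time-of-check/time-of-use gap, and a missing file surfaces as
    # FileNotFoundError from the load itself.
    return ssl.create_default_context(cafile=ca_file)


if __name__ == "__main__":
    insecure = build_ssl_context(insecure_skip_verify=True)
    print(insecure.verify_mode == ssl.CERT_NONE)
    try:
        build_ssl_context(ca_file="/nonexistent/ca.crt")
    except FileNotFoundError as err:
        print("CA file could not be loaded:", err)
```

A flag like this is only meaningful if a test asserts the resulting verification mode, since a flag that is parsed but never applied fails silently in the safe direction and is easy to overlook.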

Comment thread sdk/python/src/flamepy/core/client.py Outdated
Comment thread sdk/rust/src/client/mod.rs Outdated
Comment thread docs/designs/RFE234-tls/FS.md Outdated
Comment thread sdk/rust/src/apis/ctx.rs
Comment thread ci/generate-certs.sh Outdated
Comment thread common/src/ctx.rs Outdated
Comment thread docs/designs/RFE234-tls/FS.md
@k82cn changed the title from "feat: enable mTLS for isolation." to "feat: enable TLS for Flame" on Mar 28, 2026
k82cn added 11 commits March 28, 2026 22:18
Signed-off-by: Klaus Ma <klausm@nvidia.com> (on each of the 11 commits)
@k82cn merged commit 040bf11 into xflops:main on Mar 29, 2026
6 checks passed
@k82cn deleted the flm_234 branch on March 29, 2026 07:04
Development

Successfully merging this pull request may close these issues.

Enable TLS for all components
