feat: Authoritative CoreDNS for Slurm/MPI hostname resolution #4353
Open
sujit-jadhav wants to merge 1 commit into
Implement CoreDNS as the authoritative DNS server for cluster-internal hostname resolution, replacing /etc/hosts-based management.

New input configuration:
- input/dns_config.yml: dns_enabled, dns_domain, dns_ttl, dns_cache_ttl, dns_fabric_suffixes, dns_soa, dns_reverse_enabled

Validation:
- JSON schema (dns_config.json) and validation logic (validate_dns_config)
- RFC 1035 domain validation, TTL range checks, SOA positive-int checks, fabric suffix format validation, reserved domain detection
- 33 unit tests covering all validation paths

CoreDNS deployment (OIM):
- Corefile.j2 template: file plugin for forward/reverse zones, cache, reload (10s), forward to upstream DNS
- Systemd quadlet (coredns.container.j2) for podman-managed container
- deploy_coredns.yml task: image pull, config generation, service start

DNS zone rendering pipeline:
- forward_zone.j2: SOA + NS + A records from ip_name_map
- reverse_zone.j2: SOA + NS + PTR records
- generate_dns_zones.yml: reads SMD inventory, renders zones
- generate_reverse_zone_additional.yml: per-additional-subnet reverse zones
- update_dns_zones.yml: lifecycle hook for node add/remove

Cloud-init templates (7 files):
- Conditional: resolv.conf pointing to OIM CoreDNS when dns_enabled, otherwise legacy /etc/hosts append

Slurm /etc/hosts management:
- update_hosts_munge.yml: skip /etc/hosts edits when dns_enabled
- update_hosts.yml: skip bulk /etc/hosts updates when dns_enabled

K8s CoreDNS integration:
- Forward dns_domain queries to OIM CoreDNS from K8s CoreDNS ConfigMap

Multi-subnet DHCP compatibility (PR #4352):
- Reverse zones generated for admin + additional subnets
- All variable names compatible with multi-subnet PR

Backward compatible: dns_enabled defaults to false, preserving existing /etc/hosts behavior for users who do not opt in.
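For reviewers who want a feel for the validation checks named in the commit message (RFC 1035 domain validation, TTL range checks, SOA positive-int checks, fabric suffix format, reserved domain detection), here is a minimal Python sketch. It is not the PR's validate_dns_config(): the function name validate_dns_config_sketch, the TTL bounds, the reserved-domain list, the suffix pattern, the treatment of dns_soa as a mapping of integers, and the choice to skip validation when dns_enabled is false are all assumptions made for illustration.

```python
import re

# Hypothetical limits and reserved names; the PR does not state its actual values.
TTL_MIN, TTL_MAX = 1, 86400
RESERVED_DOMAINS = {"localhost", "local", "invalid", "test", "example"}

# RFC 1035 label: starts with a letter, ends with a letter or digit,
# hyphens only in the interior, at most 63 characters.
_LABEL_RE = re.compile(r"[A-Za-z]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?")
# Assumed fabric suffix format: a hyphen followed by lowercase alphanumerics, e.g. "-ib".
_SUFFIX_RE = re.compile(r"-[a-z0-9]+")


def is_rfc1035_domain(domain):
    """Return True if every dot-separated label satisfies the RFC 1035 label rule."""
    if not domain or len(domain) > 253:
        return False
    return all(_LABEL_RE.fullmatch(label) for label in domain.split("."))


def _is_int(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_dns_config_sketch(cfg):
    """Return a list of error strings; an empty list means the config passed."""
    errors = []
    if not cfg.get("dns_enabled", False):
        return errors  # assumption: a disabled feature is not validated further

    domain = cfg.get("dns_domain", "")
    if not is_rfc1035_domain(domain):
        errors.append(f"dns_domain '{domain}' is not a valid RFC 1035 name")
    elif domain.lower().split(".")[-1] in RESERVED_DOMAINS:
        errors.append(f"dns_domain '{domain}' uses a reserved domain")

    for key in ("dns_ttl", "dns_cache_ttl"):
        if key in cfg:
            value = cfg[key]
            if not _is_int(value) or not TTL_MIN <= value <= TTL_MAX:
                errors.append(f"{key} must be an integer in [{TTL_MIN}, {TTL_MAX}]")

    for field, value in cfg.get("dns_soa", {}).items():
        if not _is_int(value) or value <= 0:
            errors.append(f"dns_soa.{field} must be a positive integer")

    for suffix in cfg.get("dns_fabric_suffixes", []):
        if not isinstance(suffix, str) or not _SUFFIX_RE.fullmatch(suffix):
            errors.append(f"fabric suffix '{suffix}' has an invalid format")

    return errors
```

Returning a list of messages instead of raising on the first failure lets a caller report every problem in dns_config.yml in one pass.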
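The zone rendering described in the commit message (SOA + NS + A records for the forward zone, SOA + NS + PTR records for the reverse zone) can be illustrated with a short Python sketch. The PR itself uses Jinja2 templates (forward_zone.j2, reverse_zone.j2), which are not reproduced here. The function names, the IP-to-hostname direction of ip_name_map, the "oim" name server host, the hostmaster contact, the SOA timer values, and the restriction to /8, /16 and /24 subnets are all assumptions.

```python
import ipaddress


def _soa_and_ns(domain, ns_host, ttl, serial):
    # SOA fields: primary NS, contact, (serial refresh retry expire minimum).
    # The timer values are placeholders, not the PR's defaults.
    return [
        f"$TTL {ttl}",
        f"@ IN SOA {ns_host}.{domain}. hostmaster.{domain}. ({serial} 3600 600 86400 {ttl})",
        f"@ IN NS {ns_host}.{domain}.",
    ]


def render_forward_zone(domain, ns_host, ip_name_map, ttl=3600, serial=1):
    """Render forward zone text: SOA + NS, then one A record per host."""
    lines = [f"$ORIGIN {domain}."] + _soa_and_ns(domain, ns_host, ttl, serial)
    # Sort numerically by address so the output is stable across runs.
    for ip, name in sorted(ip_name_map.items(), key=lambda kv: ipaddress.ip_address(kv[0])):
        lines.append(f"{name} IN A {ip}")
    return "\n".join(lines) + "\n"


def render_reverse_zone(subnet, domain, ns_host, ip_name_map, ttl=3600, serial=1):
    """Return (zone_name, zone_text) with PTR records for hosts inside subnet."""
    network = ipaddress.ip_network(subnet)
    if network.version != 4 or network.prefixlen not in (8, 16, 24):
        raise ValueError("this sketch only handles IPv4 /8, /16 and /24 subnets")
    # Reverse zone name: the network octets in reverse order, e.g. 0.0.10.in-addr.arpa
    octets = str(network.network_address).split(".")[: network.prefixlen // 8]
    zone_name = ".".join(reversed(octets)) + ".in-addr.arpa"

    lines = [f"$ORIGIN {zone_name}."] + _soa_and_ns(domain, ns_host, ttl, serial)
    for ip, name in sorted(ip_name_map.items(), key=lambda kv: ipaddress.ip_address(kv[0])):
        addr = ipaddress.ip_address(ip)
        if addr in network:
            # reverse_pointer yields e.g. "11.0.0.10.in-addr.arpa"; the trailing dot makes it absolute.
            lines.append(f"{addr.reverse_pointer}. IN PTR {name}.{domain}.")
    return zone_name, "\n".join(lines) + "\n"
```

Calling render_reverse_zone once per subnet mirrors the per-additional-subnet reverse zones mentioned above: hosts outside the given subnet are simply skipped.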
Summary

Implements CoreDNS as the authoritative DNS server for cluster-internal hostname resolution, replacing /etc/hosts-based management for Slurm and MPI workloads.

Changes

New Files (11 new, 19 modified; 30 files, +1038 / -17 lines)

Input & Validation
- input/dns_config.yml: user configuration (dns_enabled, dns_domain, TTLs, SOA, fabric suffixes)
- common/.../schema/dns_config.json: JSON Schema
- common/.../en_us_validation_msg.py: 8 DNS error message constants
- common/.../provision_validation.py: validate_dns_config() function
- common/.../config.py: register dns_config in validation pipeline
- common/.../tests/test_dns_config_validation.py: 33 unit tests (all pass)

CoreDNS Deployment (OIM)
- prepare_oim/.../templates/Corefile.j2: file, cache, reload, forward plugins
- prepare_oim/.../templates/coredns.container.j2: systemd quadlet
- prepare_oim/.../tasks/deploy_coredns.yml: pull, configure, start

DNS Zone Pipeline
- provision/.../templates/dns/forward_zone.j2: A records from ip_name_map
- provision/.../templates/dns/reverse_zone.j2: PTR records
- provision/.../tasks/generate_dns_zones.yml: zone rendering from SMD inventory
- provision/.../tasks/generate_reverse_zone_additional.yml: per-additional-subnet reverse zones
- provision/.../tasks/update_dns_zones.yml: lifecycle hook (node add/remove)

Cloud-init Templates (7 files)
- resolv.conf → OIM CoreDNS when dns_enabled, otherwise legacy /etc/hosts

Slurm /etc/hosts
- update_hosts_munge.yml / update_hosts.yml: skip when dns_enabled

K8s Integration
- Forward dns_domain queries to OIM CoreDNS from K8s CoreDNS ConfigMap

PR #4352 Compatibility
- Reverse zones generated for admin + additional subnets
- All variable names compatible with the multi-subnet PR

Backward Compatible
- dns_enabled defaults to false: zero behavioral change for existing deployments.

Tests