Integrate the zone server(s) with the new zone storage #552

Merged
bal-e merged 10 commits into main from server-storage-integration
Apr 13, 2026

Contributor

@bal-e bal-e commented Mar 26, 2026

This PR integrates the last of Cascade's major components, the zone server, with the new zone storage. A new ZoneService type is added, which implements domain::net::server::service::Service; it stores a LoadedZoneReviewer/SignedZoneReviewer/ZoneViewer and returns data from that. This type is used in place of the previous zone_server_service() function, which read from zonetrees.

src/zone/storage.rs has been modified to exchange viewers with these zone server units when a zone undergoes a change. This process is async, which complicates the control flow a bit; I hope to simplify it soon.

This change also interacts with how zones are initialized; at initialization, the zone (re)viewer types are held in ZoneState and are extracted by the zone server units. This is a hack to work around design issues in zone initialization, and they should be addressed in the future. This also interacts with zone persistence.

This PR removes the last major use of zonetrees within Cascade. However, it does not remove the zonetrees yet; they will be removed in a future PR (once some minor uses are also resolved).


  • If you are changing Rust code or integration tests (Cargo.*, crates/, etc/, integration-tests/, src/):
    • Did you run the integration tests with act through the act-wrapper (as described in TESTING.md)?

Note that the ixfr-in test is currently failing, but for apparently minor reasons (the new zone server only supports AXFR and SOA queries, but the test attempts A and MX queries). The test will be changed in a separate PR, before this PR is merged.

@bal-e bal-e requested review from tertsdiepraam and ximon18 March 26, 2026 12:34
@bal-e bal-e self-assigned this Mar 26, 2026
@bal-e bal-e force-pushed the server-storage-integration branch from 3421a9e to 6502cb3 Compare March 31, 2026 13:53
@bal-e bal-e marked this pull request as ready for review March 31, 2026 14:11
@bal-e bal-e requested review from tertsdiepraam and ximon18 March 31, 2026 14:11
@bal-e bal-e force-pushed the server-storage-integration branch from 6502cb3 to 0e12c74 Compare April 2, 2026 09:01
Contributor

@tertsdiepraam tertsdiepraam left a comment

I think this looks good. There's not much that I'm really surprised by. I think you wanted a second opinion on it and that's probably a good idea, but the commits tell a good story and if it passes the integration tests then this should be good.

Comment thread src/server/service.rs
Comment thread src/server/service.rs Outdated
Comment thread src/server/service.rs Outdated
Comment thread src/server/service.rs
//----------- ZoneServiceHandle ------------------------------------------------

/// A handle for controlling a [`ZoneService`].
pub struct ZoneServiceHandle<V> {

Contributor

At this point in reviewing I'm wondering why the handle is needed if it contains the same field as ZoneService. Maybe the next commits will clear that up.

Contributor Author

These types might change to sender/receiver ends of a channel in the future (which is mentioned on ZoneService). Also, they have distinct uses -- one is for actual serving, and the other is for interacting with the service. Separating them made sense from an API perspective.
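
The split described here can be sketched with two types that wrap the same shared state but expose different operations. This is only an illustration using the standard library: `State`, `Service`, `Handle`, `service_pair`, `current_serial` and `publish` are hypothetical names, not the actual `ZoneService` or `ZoneServiceHandle` API.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for the state shared by the two ends.
#[derive(Debug, Default)]
struct State {
    /// Serial of the zone instance currently being served, if any.
    serial: Option<u32>,
}

/// The serving end: it only reads the shared state.
#[derive(Clone)]
struct Service {
    state: Arc<RwLock<State>>,
}

/// The controlling end: it only updates the shared state.
struct Handle {
    state: Arc<RwLock<State>>,
}

/// Create a connected pair, much like a channel constructor would.
fn service_pair() -> (Service, Handle) {
    let state = Arc::new(RwLock::new(State::default()));
    let service = Service { state: Arc::clone(&state) };
    (service, Handle { state })
}

impl Service {
    /// Report the serial that would be served right now.
    fn current_serial(&self) -> Option<u32> {
        self.state.read().unwrap().serial
    }
}

impl Handle {
    /// Switch the service over to a new zone instance.
    fn publish(&self, serial: u32) {
        self.state.write().unwrap().serial = Some(serial);
    }
}
```

Both types hold the same field, but a caller holding only a `Service` cannot change what is served, and a caller holding only a `Handle` cannot answer requests. Keeping the two ends separate also leaves room to swap the shared lock for a channel later without changing either public surface.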

Comment thread src/server/mod.rs
Comment thread src/zone/storage.rs
//
// TODO: Move into the zone server unit.
loaded_reviewer: LoadedZoneReviewer,
// TODO: Output it directly somehow?

Contributor

Output what directly?

Contributor Author

This field is currently acting as a hacky output of zone initialization; it is Some when a zone is created, and is then immediately moved out. Ideally it would be returned directly by the zone initialization code. I'll leave a comment to make that more explicit.

Contributor Author

This code will be addressed by the changes I will make to #550, so I'm going to leave it alone here.
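
For readers following this thread, the "field as output" pattern and the alternative it hints at might look roughly like the sketch below. Everything except the `loaded_reviewer` field name is an assumption made for illustration: the real `ZoneState` and `LoadedZoneReviewer` are far richer, and `take_loaded_reviewer` and `init_zone` are hypothetical helpers.

```rust
/// Hypothetical stand-in for the reviewer type named in the diff.
#[derive(Debug, PartialEq)]
struct LoadedZoneReviewer(u32);

/// Hypothetical, much-reduced zone state.
struct ZoneState {
    /// `Some` right after initialization; moved out by the zone server unit.
    loaded_reviewer: Option<LoadedZoneReviewer>,
}

impl ZoneState {
    /// The current pattern: initialization parks the reviewer in a field.
    fn new() -> Self {
        Self { loaded_reviewer: Some(LoadedZoneReviewer(0)) }
    }

    /// The zone server unit moves the reviewer out; later calls get `None`.
    fn take_loaded_reviewer(&mut self) -> Option<LoadedZoneReviewer> {
        self.loaded_reviewer.take()
    }
}

/// The alternative: initialization returns the reviewer directly, so the
/// state never holds a value that is only valid for a brief moment.
fn init_zone() -> (ZoneState, LoadedZoneReviewer) {
    (ZoneState { loaded_reviewer: None }, LoadedZoneReviewer(0))
}
```

The second form makes the hand-off visible in the function signature instead of relying on callers to remember to take the field.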

Member

@ximon18 ximon18 left a comment

Some initial feedback; unfortunately, I don't have time to continue further at this point.

Comment thread src/server/mod.rs
Comment thread src/server/service.rs Outdated
Comment thread src/server/service.rs
// Obtain a read lock to read the zone for an extended duration.
let viewer = zone.viewer.read_owned().await;

if viewer.is_empty() {

Member

How can a zone in Cascade be empty?

Contributor Author

This is a somewhat murky part of the zone storage. When a zone in Cascade is initialized, I still create viewers for it, so it could hypothetically be served; but since there is no content yet available, I express this as "empty zone". I think this is a useful way to express this "we don't have data" case, but the semantics of it are still not documented or entirely clear.

Comment thread src/server/service.rs Outdated
Comment thread src/server/service.rs

// TODO: Support IXFR.
ZoneRequestKind::Ixfr { .. } => Box::pin(std::future::ready(error(
old_request.message(),

Member

We don't need to return NOTIMP here, we can instead return an AXFR response per https://datatracker.ietf.org/doc/html/rfc1995#section-4.

Contributor Author

Yes, we could just return the AXFR response. AFAIK IXFR client implementations are expected to observe a NOTIMP and request an AXFR as usual. I don't think the difference is substantial; as the code is right now, it leaves IXFR to be tackled as a whole new function, which seems nicer for whoever will write it.
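
The two options discussed in this thread can be put side by side in a small dispatch sketch. The types below are simplified, hypothetical stand-ins (the real `ZoneRequestKind` carries more data), and `IxfrPolicy` exists only to make the comparison explicit; it is not part of the PR.

```rust
/// Hypothetical, simplified request kinds.
#[derive(Debug, Clone, Copy, PartialEq)]
enum RequestKind {
    Soa,
    Axfr,
    Ixfr { client_serial: u32 },
}

/// What the server decides to do with a request.
#[derive(Debug, PartialEq)]
enum Action {
    AnswerSoa,
    SendFullZone,
    NotImplemented,
}

/// How to treat IXFR while incremental transfers are unsupported.
#[derive(Debug, Clone, Copy)]
enum IxfrPolicy {
    /// Reply with RCODE NOTIMP and leave the AXFR retry to the client.
    NotImp,
    /// Reply with the whole zone, as RFC 1995 section 4 permits.
    FallBackToAxfr,
}

/// Pick an action for a request under the given IXFR policy.
fn dispatch(kind: RequestKind, policy: IxfrPolicy) -> Action {
    match (kind, policy) {
        (RequestKind::Soa, _) => Action::AnswerSoa,
        (RequestKind::Axfr, _) => Action::SendFullZone,
        (RequestKind::Ixfr { .. }, IxfrPolicy::NotImp) => Action::NotImplemented,
        (RequestKind::Ixfr { .. }, IxfrPolicy::FallBackToAxfr) => Action::SendFullZone,
    }
}
```

With `NotImp`, the IXFR arm stays a self-contained stub that a later implementation can replace wholesale. With `FallBackToAxfr`, the client is spared one round trip at the cost of tying the IXFR arm to the AXFR code path.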

Comment thread src/units/zone_server.rs Outdated
Comment thread src/server/request.rs
pub kind: RequestKind,
//
// TODO:
// - EDNS cookie state.

Member

These are handled by lower-level middleware services, which you currently still use, so they don't need to be handled here. Unless I am forgetting something; it's been a while since I looked at domain::net::server.

Contributor Author

You're right; but I want to handle them here eventually, partly as part of the transition to domain::new and partly so I can use a different control flow here. I believe I left a comment about this somewhere.

Comment thread src/server/service.rs
Comment thread src/zone/storage.rs Outdated
}
};

let span = trace_span!("reset_loaded_review_server");

Member

What does it mean to "reset" a viewer?

Contributor Author

Updating it so it stops serving the upcoming instance of the zone. Would "rewind" be better?

Contributor Author

I've renamed it to "rewind", moved it to a dedicated function, and added some docs there to hopefully make things clearer.

Comment thread src/zone/storage.rs Outdated
cleaner
}

_ => panic!("The zone was left in 'CleanLoadedPending' state"),

Member

I don't understand this match arm. Don't you match against CleanLoadedPending in the arm above, so if you match this arm that means it was NOT in CleanLoadedPending state, while the comment says it IS in CleanLoadedPending state?

Contributor Author

This should have been an unreachable!(). Does the message make more sense in that context? It states that the above code should have worked because the zone had been left in the CleanLoadedPending state.
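
A reduced sketch of the intent behind that arm, assuming a two-variant state enum (the real state machine has more variants, and `take_cleaner` is a hypothetical helper). The catch-all arm documents an invariant established by earlier code, so `unreachable!()` with a message about the expected state reads more naturally than a bare `panic!`.

```rust
use std::mem;

/// Hypothetical, much-reduced zone state machine.
#[derive(Debug, PartialEq)]
enum State {
    Idle,
    CleanLoadedPending { cleaner: String },
}

/// Take the cleaner out of a zone that earlier code left in the
/// `CleanLoadedPending` state, leaving `Idle` behind.
fn take_cleaner(state: &mut State) -> String {
    match mem::replace(state, State::Idle) {
        State::CleanLoadedPending { cleaner } => cleaner,
        // Reaching this arm means an invariant was broken elsewhere, so the
        // message states what was expected and what was actually found.
        other => unreachable!("expected 'CleanLoadedPending', found {:?}", other),
    }
}
```

Naming the expected state and printing the actual one avoids the ambiguity raised in the review, where the message could be read as describing the state that was found.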

Comment thread src/zone/storage.rs Outdated
Comment thread src/zone/storage.rs Outdated
Comment thread src/zone/storage.rs Outdated
Comment thread src/zone/storage.rs Outdated
Comment thread src/zone/storage.rs Outdated
Comment thread src/server/service.rs
pub struct ZoneService<V> {
/// The underlying state.
//
// TODO: The state is currently wrapped in an 'RwLock'. This is necessary

Member

Maybe you can explain this to me in person. Your ZoneServiceState type doesn't hold any types that I can see that are from domain so in what way does domain::net::server architecture require you to have an RwLock here?

Contributor Author

Well, ZoneService implements the domain::net::server Service trait, and so is bound by its requirements (e.g. Send + Sync, Service::call() taking &self). I can describe that in more detail, but I didn't want to draw out the comment too much.
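
To make the constraint concrete, here is a sketch with a hypothetical trait that mirrors only the two requirements named above (`Send + Sync` and a `call` method taking `&self`). It is not the actual `domain::net::server` `Service` trait, whose signature is more involved, and the viewer is reduced to a `String`.

```rust
use std::sync::RwLock;

/// Hypothetical trait: implementors are shared across threads and are only
/// ever called through a shared reference.
trait Service: Send + Sync {
    fn call(&self, request: &str) -> String;
}

struct ZoneService {
    /// With only `&self` available, replacing the viewer requires interior
    /// mutability, and `Sync` rules out `Cell` and `RefCell`.
    viewer: RwLock<String>,
}

impl ZoneService {
    /// Swap in a new viewer while other threads may be calling the service.
    fn set_viewer(&self, viewer: &str) {
        *self.viewer.write().unwrap() = viewer.to_owned();
    }
}

impl Service for ZoneService {
    fn call(&self, request: &str) -> String {
        // Many requests can hold the read lock at the same time.
        let viewer = self.viewer.read().unwrap();
        format!("{} answered from {}", request, *viewer)
    }
}
```

Under these assumptions the lock is needed because of how the service is called, not because of what the state contains: any field that changes after construction has to sit behind some thread-safe wrapper.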

Comment thread src/server/service.rs
///
/// In the future, the network server stack should be gradually inlined here,
/// so it can use [`domain::new`] and support more functionality (e.g. handling
/// XFRs by spawning OS threads).

Member

In what way does domain::net::server prevent using a thread here if wanted? I'm not convinced that "should" is the right term here. I don't believe we as a team have discussed the changes you have in mind for net::server yet, so I'd prefix "the network server stack should be gradually inlined here" with "I think that".

Contributor Author

I think everybody is on board with the transition to domain::new, and the network stack is necessarily affected by that.

Regarding the threads: When a DNS request is received and passed to the ZoneService by Service::call(), it must be responded to by returning an async stream/future. Instead, I want to try spawning a thread to respond to each XFR, and moving all the state necessary for building the response to that. I hope this can simplify the AXFR message building logic and eliminate some async code.
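
The thread-per-transfer idea could take roughly the following shape. This is a standard-library sketch under stated assumptions, not the planned design: `XfrJob` and `spawn_axfr` are hypothetical, DNS messages are reduced to strings, and a channel stands in for whatever would carry responses back to the network layer.

```rust
use std::iter;
use std::sync::mpsc;
use std::thread;

/// Hypothetical snapshot of everything needed to build an AXFR response.
struct XfrJob {
    zone: String,
    records: Vec<String>,
}

/// Spawn one OS thread for a transfer and stream its messages over a channel.
fn spawn_axfr(job: XfrJob) -> mpsc::Receiver<String> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The job is owned by this thread, so plain blocking code suffices.
        let soa = format!("{} SOA", job.zone);
        // An AXFR response begins and ends with the zone's SOA record.
        let messages = iter::once(soa.clone())
            .chain(job.records.into_iter())
            .chain(iter::once(soa));
        for message in messages {
            // Stop early if the receiving side has gone away.
            if tx.send(message).is_err() {
                break;
            }
        }
    });
    rx
}
```

Because all state is moved into the thread up front, the response builder is an ordinary loop rather than a hand-written async stream; the receiver ends once the thread finishes and drops its sender.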

@bal-e bal-e force-pushed the server-storage-integration branch from e45a0a0 to f6c326d Compare April 9, 2026 08:49
Contributor Author

bal-e commented Apr 9, 2026

Broken pending #569.

bal-e added 10 commits April 13, 2026 11:56
These are specialized versions of 'ZoneServer'. They provide some methods that forward to 'ZoneServer' for now; these will gradually get inlined.

Having a dedicated type for each server allows storing type-specific data, prevents methods from being called on the wrong types, and helps simplify the code (removing 'match' statements).

This performs basic parsing of request messages. In the future, it will collect data across the whole DNS message, e.g. EDNS options and TSIG state.

The newly introduced 'ZoneService' will serve as the basis for the zone server. It integrates with 'domain::net::server' and so should be a drop-in replacement for the existing query service.

At the moment, 'ZoneService' only supports SOA queries. Support for AXFR and IXFR will follow; NOTIFY messages and extensions like EDNS and TSIG are left to the existing middleware services for now.
@bal-e bal-e force-pushed the server-storage-integration branch from 54c90d8 to 176097f Compare April 13, 2026 10:00
@bal-e bal-e merged commit 4a0799f into main Apr 13, 2026
@bal-e bal-e deleted the server-storage-integration branch April 13, 2026 10:00