feat: add network monitoring application #1217

Merged: bobbinth merged 20 commits into next from santiagopittella-network-monitoring-app on Sep 18, 2025

Conversation

@SantiagoPittella (Collaborator):

partially addresses #1190

This is the current look of the frontend:
[Screenshot 2025-09-09 at 17:50:19]

SantiagoPittella force-pushed the santiagopittella-network-monitoring-app branch 2 times, most recently from 7b88f03 to 5c3c2c3, and then from 5c3c2c3 to f6d0e0b (September 9, 2025 21:52)
@@ -0,0 +1,32 @@
# Miden network monitoring

This crate contains a binary for running a Miden network monitor that can monitor multiple remote provers.
Collaborator:

Maybe something like

Suggested change
This crate contains a binary for running a Miden network monitor that can monitor multiple remote provers.
A monitoring app for a Miden network's infrastructure.
It serves a webpage with an overview of the current infrastructure status and emits OpenTelemetry events which can be ingested for more advanced monitoring and alerting.

Collaborator:

Maybe also a section on currently supported items, and future items?

Collaborator:

Could we get away with just `cargo check`? But maybe the full one is better - I'm mainly concerned with keeping GitHub cache usage low. So maybe we keep it as is, and then if the cache gets clobbered we revisit.

Collaborator (Author):

It is a relatively simple binary; using just `cargo check` is probably enough. I will remove both jobs (build, install) and add a check one. I can re-add them if we change our minds.

Comment on lines +50 to +51
// Wait for either task to complete or fail, then abort the other
let (frontend_result, status_result) = tokio::join!(frontend_task, status_task);
Collaborator:

I don't think this works - `join!` waits for all tasks to complete:

> Waits on multiple concurrent branches, returning when all branches complete.

For short lists like this `tokio::select!` is probably the way to go; for longer lists I use `JoinSet` with `join_next_with_id`, but that involves storing the IDs as well to identify which task completed.

Comment on lines +35 to +44
pub async fn get_dashboard() -> Html<&'static str> {
Html(include_str!("../assets/index.html"))
}

pub async fn get_status(
axum::extract::State(shared_status): axum::extract::State<SharedStatus>,
) -> axum::response::Json<crate::status::NetworkStatus> {
let status = shared_status.lock().await;
axum::response::Json(status.clone())
}
Collaborator:

Do these need to be `pub`?

Collaborator (Author):

No, they do not.

///
/// * `shared_status` - The shared status of the network.
/// * `config` - The configuration of the network.
pub async fn run_frontend(shared_status: SharedStatus, config: MonitoringConfig) {
Collaborator:

Nit:

Suggested change
pub async fn run_frontend(shared_status: SharedStatus, config: MonitoringConfig) {
pub async fn serve(shared_status: SharedStatus, config: MonitoringConfig) {

// build our application with routes
let app = Router::new()
// Serve static files from assets directory
.nest_service("/assets", ServeDir::new("bin/network-monitoring/assets"))
Collaborator:

Does this work when installed via `cargo install`? I don't think the assets folder will be bundled with the binary unless `ServeDir` somehow embeds the assets into it.

I've typically forced this by embedding the files directly in the binary using `include_str!` and friends, but then you can't use a directory approach.

See 0xMiden/miden-faucet#59 for some additional context.

.route("/status", get(get_status))
.with_state(shared_status);

// run our app with hyper, listening globally on the configured port
Collaborator:

Nit: I don't think we use `hyper` directly.

Suggested change
// run our app with hyper, listening globally on the configured port

Collaborator:

This is fine for now, but I think we'll run into problems if/when we start adding more checks.

An alternative I would suggest as a follow-up is to split each check into its own task, using a tokio `watch` channel to communicate its latest status.

Collaborator (Author):

Just created #1223

@bobbinth (Contributor) left a comment:

Looks great! Thank you! This is a very shallow review from me, but I left a couple of comments inline. The main one is renaming everything from "miden network monitoring" to "miden network monitor".

bobbinth requested a review from sergerad on September 11, 2025 23:47

let bind_address = format!("0.0.0.0:{}", config.port);
println!("Starting web server on {bind_address}");
println!("Dashboard available at: http://localhost:{}/", config.port);
Collaborator:

Did we want to use the same tracing setup we use everywhere else?

Collaborator (Author):

I can do it easily by importing it from `miden_node_utils`. Should I always mark OpenTelemetry as disabled? Not sure if we want that here.

Collaborator:

I would want it enabled so that we can keep track of it via Honeycomb too

Contributor:

I'm actually not sure the monitor should be tied in to Honeycomb - or even have OTEL enabled at all. In my mind it is a simple binary that we run purely as an external user (and anyone can run it as well). So, the simpler we can make it, the better.

@bobbinth (Contributor) left a comment:

Looks good! Thank you! Again, not a very deep review from me, but I left a couple of small comments inline.


[dependencies]
anyhow = { workspace = true }
axum = { version = "0.8.4" }
Contributor:

nit: just `0.8` for the version should be enough.


@bobbinth (Contributor):

@SantiagoPittella, @Mirko-von-Leipzig, @sergerad - is this mostly done? or is there anything outstanding?

@SantiagoPittella (Collaborator, Author):

> @SantiagoPittella, @Mirko-von-Leipzig, @sergerad - is this mostly done? or is there anything outstanding?

Yes, there is a follow-up issue that I'm already addressing; it has an open PR that can be merged after this one. All comments were addressed here.

@bobbinth (Contributor) left a comment:

Not a deep review from me - but looks good! Thank you!

@Mirko-von-Leipzig, @sergerad - we should probably deploy this on a small instance (e.g., 1-2 cores). Maybe one deployment for testnet and one for devnet? It would be cool if this were accessible under something like status.testnet.miden.io and status.devnet.miden.io.

Another cool thing for a future PR would be to include some additional data for the block producer (and maybe the NTX builder). For example, something like the number of transactions in the mempool. Let's create an issue for this (unless we already have one).

bobbinth merged commit df04d7d into next on Sep 18, 2025 (7 checks passed)
bobbinth deleted the santiagopittella-network-monitoring-app branch on September 18, 2025 07:03
@Mirko-von-Leipzig (Collaborator):

> Another cool thing for a future PR would be to include some additional data for block producer (and maybe NTX builder). For example, something like number of tx in the mempool. Let's create an issue for this (unless we already have one).

We do already track this via the mempool's instrumentation, so it's queryable in Honeycomb.

[screenshot: Honeycomb query of mempool metrics]

But of course this isn't available publicly
