feat: Validator database #1614

Merged
sergerad merged 83 commits into next from sergerad-validator-db
Feb 17, 2026
Collaborator

@sergerad sergerad commented Jan 29, 2026

Context

The Validator currently stores validated transactions only in memory. We want to persist them for resilience and for potential future use cases around debugging and auditability of public transactions.

The Validator will run as a separate instance from the node, so it needs its own database, schema, etc.

Relates to #1316.

Changes

  • Add data directory CLI arg for Validator command.
  • Make various types and functions from Db crate pub for use in Validator crate.
  • Add sqlite, diesel scaffolding to Validator crate alongside new schema.
  • Fix standalone Validator setup with bundled components.
  • Rename insecure key variables to just validator key.
  • Merge Validator command args into ValidatorConfig struct.

@sergerad sergerad marked this pull request as ready for review January 29, 2026 22:47
Collaborator Author

sergerad commented Feb 1, 2026

@bobbinth I have validated these behaviours on this branch:

  • Validator as separate process works
  • Validator going down does not stop the Node, which can still serve read requests
  • Node resumes normal functioning when Validator is brought back up

#[instrument(target = COMPONENT, skip_all)]
pub async fn load(database_filepath: PathBuf) -> Result<miden_node_store::Db, DatabaseSetupError> {
    let manager = ConnectionManager::new(database_filepath.to_str().unwrap());
    let pool = deadpool_diesel::Pool::builder(manager).max_size(16).build()?;
Collaborator

Does max_size mean the capacity, or is the actual capacity chosen differently and this is just the upper bound to that algorithm?

Collaborator Author

An upper bound but I don't know what goes on under the hood. @drahnr do you know why we wouldn't just use default?

    /// Maximum size of the [`Pool`].
    ///
    /// Default: `cpu_count * 4`
    ///
    /// [`Pool`]: super::Pool
    pub max_size: usize,

Contributor

No, I don't remember when or who introduced it
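
For reference, the quoted doc comment gives the default as `cpu_count * 4`, so the explicit `max_size(16)` only matches the default on a 4-CPU machine. The sketch below shows that arithmetic using the standard library only. `default_max_size` is a hypothetical helper, and `std::thread::available_parallelism` is an assumed stand-in for however the pool library actually counts CPUs.

```rust
use std::thread;

/// Hypothetical helper mirroring the documented default of `cpu_count * 4`.
/// The real pool library may count CPUs differently (assumption).
pub fn default_max_size() -> usize {
    // Fall back to a single CPU if the parallelism cannot be queried.
    let cpus = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(1);
    cpus * 4
}

/// Returns true when an explicit upper bound is lower than the
/// documented default would be on this machine.
pub fn explicit_is_below_default(explicit: usize) -> bool {
    explicit < default_max_size()
}
```

On a machine with more than four CPUs, `explicit_is_below_default(16)` returns true, meaning the hard-coded value caps the pool below what the default would allow.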

Comment on lines 19 to 25
pub fn new(info: &ValidatedTransactionInfo) -> Self {
    Self {
        id: info.tx_id().to_bytes(),
        block_num: info.block_num().to_raw_sql(),
        account_id: info.account_id().to_bytes(),
        info: info.to_bytes(),
    }
Collaborator

I think we should handle all the serialization here, instead of serializing in ValidatedTransactionInfo, as I allude to in this comment

Collaborator Author

LMK if my latest changes reflect your thinking here
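
One way to read the suggestion above is that the database row type owns every conversion into SQL-friendly values, while the domain type stays free of storage concerns. The following is a minimal sketch of that shape. `TxInfo` and `TxRow` are hypothetical stand-ins, and their field types are assumptions rather than the real `ValidatedTransactionInfo` layout.

```rust
/// Hypothetical stand-in for the domain type. It carries no
/// knowledge of how it is stored.
pub struct TxInfo {
    pub tx_id: [u8; 32],
    pub block_num: u32,
    pub account_id: Vec<u8>,
}

/// Hypothetical insert row. All conversion to column values lives here.
pub struct TxRow {
    pub id: Vec<u8>,
    pub block_num: i64,
    pub account_id: Vec<u8>,
}

impl TxRow {
    /// Builds the row from the domain type, doing every conversion in one place.
    pub fn new(info: &TxInfo) -> Self {
        Self {
            id: info.tx_id.to_vec(),
            // SQLite integers are signed 64-bit, so widen the block number.
            block_num: i64::from(info.block_num),
            account_id: info.account_id.clone(),
        }
    }
}
```

With this split, a change to the column encoding touches only `TxRow::new`.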

}

// Start the Validator if we have bound a socket.
if let Some(address) = validator_socket_address {
Contributor

Basing the configuration decision on validator_socket_address somewhat obscures the fact that the user provided a key rather than a validator URL

Collaborator Author

Not sure that part of the stack should be considering the key. Key presence is considered elsewhere. This part of the stack only cares about whether to run the local instance or not

Collaborator Author

@Mirko-von-Leipzig please remind me: do we have an issue for cleaning up the bundled config/startup stuff?
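
To illustrate the concern in this thread, the sketch below replaces an `Option` of a socket address with an explicit mode enum, so the "local or remote" decision is named where it is made. Everything here is hypothetical: `ValidatorMode`, `from_cli`, and the rule that the user supplies exactly one of a URL or a key are assumptions for illustration, not the PR's actual configuration code.

```rust
use std::net::SocketAddr;

/// Hypothetical explicit discriminator for how the Validator is reached.
#[derive(Debug, PartialEq)]
pub enum ValidatorMode {
    /// Run a bundled Validator in this process, listening on `listen`.
    Local { listen: SocketAddr },
    /// Connect to an externally managed Validator at `url`.
    Remote { url: String },
}

impl ValidatorMode {
    /// Resolves the mode from user input. Assumed rule: exactly one of
    /// a validator URL or a validator key must be given.
    pub fn from_cli(
        validator_url: Option<String>,
        has_validator_key: bool,
        listen: SocketAddr,
    ) -> Result<Self, String> {
        match (validator_url, has_validator_key) {
            (Some(url), false) => Ok(Self::Remote { url }),
            (None, true) => Ok(Self::Local { listen }),
            (Some(_), true) => {
                Err("provide either a validator URL or a validator key, not both".to_string())
            }
            (None, false) => Err("provide a validator URL or a validator key".to_string()),
        }
    }

    /// The startup code only needs to ask this question.
    pub fn runs_local_instance(&self) -> bool {
        matches!(self, Self::Local { .. })
    }
}
```

The startup path can then branch on `runs_local_instance()` without inspecting the key, which keeps both positions in the thread intact: the key is considered once during resolution, and the spawning code only sees the resulting mode.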

/// Open a connection to the DB and apply any pending migrations.
#[instrument(target = COMPONENT, skip_all)]
pub async fn load(database_filepath: PathBuf) -> Result<miden_node_store::Db, DatabaseSetupError> {
    let manager = ConnectionManager::new(database_filepath.to_str().unwrap());
Collaborator Author

@Mirko-von-Leipzig @drahnr despite this (and the PRAGMA queries it makes for WAL etc.), I still get "database is locked" errors when running the integration tests.

ERROR rpc:submit_proven_transaction: miden-validator: crates/validator/src/server/mod.rs:126: 
error: status: 'Internal error', self: "database is locked" rpc.service: 
"validator.Api", rpc.method: "SubmitProvenTransaction", otel.name: "validator.Api/SubmitProvenTransaction"

The Validator could receive parallel requests on the submit_proven_transaction endpoint, so I do think it's something we need to solve (not just the integration tests' fault).

Collaborator Author

Adding this does avoid it. ATM I'm reusing the store's connection manager code for validator.

pub(crate) fn configure_connection_on_creation(
    conn: &mut SqliteConnection,
) -> Result<(), ConnectionManagerError> {
    // Wait up to 5 seconds for writer locks before erroring.
    diesel::sql_query("PRAGMA busy_timeout=5000")
        .execute(conn)
        .map_err(ConnectionManagerError::ConnectionParamSetup)?;

Contributor

@drahnr drahnr Feb 17, 2026

I think we need to serialize writes. By default you get a DB-locked error if a write is already in progress; it's the application's responsibility to deal with this. The timeout only fixes this on the surface: almost all queries finish in under 5 seconds, so you'll only see the locked error again under load.
I suggest adding an internal bounded channel to serialize the queries (at least the writes).

Collaborator Author

@sergerad sergerad Feb 17, 2026

Adding a bounded channel would effectively add another "queue" on top of the one in the database only for the benefit of controlling capacity via the size of the channel (instead of timeout seconds or retry count). In both situations we would error out on the basis of too many writes (no difference for the client of the corresponding endpoint(s)).

I agree notionally about serializing writes, but I'm not sure we are actually better off adding the necessary code for the channel rather than relying on a timeout or retry count.
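
For comparison, the retry-count alternative mentioned in this comment could look roughly like the sketch below. It uses only the standard library. `WriteError` and `retry_on_locked` are hypothetical names, and mapping a real SQLite "database is locked" error onto `WriteError::Locked` is left out as an assumption about the caller.

```rust
use std::{thread, time::Duration};

/// Hypothetical error type: `Locked` models SQLite's "database is locked".
#[derive(Debug, PartialEq)]
pub enum WriteError {
    Locked,
    Other(String),
}

/// Runs `op` at least once and retries it while it reports `Locked`,
/// up to `max_attempts` calls in total, sleeping `backoff` between tries.
pub fn retry_on_locked<T>(
    max_attempts: u32,
    backoff: Duration,
    mut op: impl FnMut() -> Result<T, WriteError>,
) -> Result<T, WriteError> {
    let mut attempt = 1;
    loop {
        match op() {
            // Only lock contention is retried, and only while attempts remain.
            Err(WriteError::Locked) if attempt < max_attempts => {
                thread::sleep(backoff);
                attempt += 1;
            }
            // Success, a non-lock error, or the final `Locked` is returned as-is.
            other => return other,
        }
    }
}
```

As the comment notes, once the attempts are exhausted the caller still sees a lock error, so the client-facing outcome under overload is the same as with a timeout.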

Contributor

The main difference is retaining control and the error information we can provide, which to me, given our latest debugging adventures, is well worth the additional 100 (fair, 200) LOC for queue management.
I can see myself debugging something unrelated 6 months down the line where all we get is a DB-locked error again, with the root cause somewhere else, and we can no longer distinguish an overload-induced DB lock from a bug-induced one.
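
The bounded-channel idea from this thread can be sketched as a single writer thread fed through a fixed-capacity queue, so that overload surfaces as its own error instead of as a database lock. This is a minimal standard-library sketch under stated assumptions: `WriteQueue`, `SubmitError`, and `Job` are hypothetical names, jobs are plain closures rather than real diesel queries, and an async service would more likely use an async bounded channel.

```rust
use std::sync::mpsc::{self, SyncSender, TrySendError};
use std::thread;

/// Hypothetical error type that separates overload from database failures.
#[derive(Debug, PartialEq)]
pub enum SubmitError {
    /// The write queue is full: the service is overloaded.
    Overloaded,
    /// The writer thread has stopped.
    WriterGone,
    /// The write itself failed.
    Db(String),
}

/// A write job. In a real service this would run a query on a connection.
pub type Job = Box<dyn FnOnce() -> Result<(), String> + Send>;

type Envelope = (Job, mpsc::Sender<Result<(), String>>);

pub struct WriteQueue {
    tx: SyncSender<Envelope>,
}

impl WriteQueue {
    /// Spawns one writer thread that executes jobs strictly one at a time.
    pub fn new(capacity: usize) -> Self {
        let (tx, rx) = mpsc::sync_channel::<Envelope>(capacity);
        thread::spawn(move || {
            for (job, reply) in rx {
                // Ignore the send error if the submitter has gone away.
                let _ = reply.send(job());
            }
        });
        Self { tx }
    }

    /// Enqueues a job without blocking on a full queue, then waits for its result.
    pub fn submit(&self, job: Job) -> Result<(), SubmitError> {
        let (reply_tx, reply_rx) = mpsc::channel();
        match self.tx.try_send((job, reply_tx)) {
            Ok(()) => {}
            Err(TrySendError::Full(_)) => return Err(SubmitError::Overloaded),
            Err(TrySendError::Disconnected(_)) => return Err(SubmitError::WriterGone),
        }
        match reply_rx.recv() {
            Ok(Ok(())) => Ok(()),
            Ok(Err(e)) => Err(SubmitError::Db(e)),
            Err(_) => Err(SubmitError::WriterGone),
        }
    }
}
```

Because only the writer thread touches the database for writes, a lock error that still appears would point at something other than concurrent writers in this process, which is the disambiguation argued for above.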

Contributor

drahnr commented Feb 17, 2026

LGTM, bar the pragma usage and the config around the validator socket address being used as a discriminator for spawning a task.

@sergerad sergerad merged commit 9bed52e into next Feb 17, 2026
19 checks passed
@sergerad sergerad deleted the sergerad-validator-db branch February 17, 2026 22:09