Skip to content

⚡ perf: reduce startup and eval overhead across engine and CLI #1785

Merged
harehare merged 12 commits into
mainfrom
perf/startup-and-eval-improvements
May 24, 2026
Merged

⚡ perf: reduce startup and eval overhead across engine and CLI #1785
harehare merged 12 commits into
mainfrom
perf/startup-and-eval-improvements

Conversation

@harehare
Copy link
Copy Markdown
Owner

No description provided.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented May 24, 2026

Merging this PR will improve performance by ×2.7

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 22 improved benchmarks
✅ 7 untouched benchmarks

Performance Changes

Benchmark BASE HEAD Efficiency
eval_array_chained_operations 9.2 ms 3.9 ms ×2.4
eval_array_filter 9.8 ms 4.4 ms ×2.2
eval_nested_object_access 6.7 ms 1.3 ms ×5.1
eval_string_equality 6.7 ms 1.4 ms ×4.8
eval_array_fold 7.4 ms 2.1 ms ×3.5
eval_no_macro_large_program 288.3 µs 203.6 µs +41.59%
eval_variable_assignment_chain 672.4 µs 606 µs +10.96%
eval_array_map 8 ms 2.7 ms ×3
eval_nodes 6.4 ms 1 ms ×6.3
eval_yaml_parse 7.3 ms 2.1 ms ×3.5
parse_fibonacci 180 µs 93.1 µs +93.25%
eval_csv_parse 7.3 ms 2 ms ×3.6
eval_object_field_access 7.1 ms 1.7 ms ×4.2
eval_pipeline_with_conditionals 7 ms 1.7 ms ×4.2
eval_foreach 6.8 ms 1.5 ms ×4.6
eval_select_h 287.8 µs 200.1 µs +43.78%
eval_if_else_branching 8.1 ms 2.8 ms ×2.9
eval_json_parse 6.9 ms 1.6 ms ×4.4
eval_large_markdown_filtering 25.2 ms 20.1 ms +25.66%
eval_macro_expansion_simple 245.9 µs 137.9 µs +78.3%
... ... ... ... ...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.


Comparing perf/startup-and-eval-improvements (7a82dc5) with main (1d5fde9)1

Open in CodSpeed

Footnotes

  1. No successful run was found on main (ab78796) during the generation of this report, so 1d5fde9 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report.

@harehare harehare requested a review from Copilot May 24, 2026 02:53
@harehare harehare force-pushed the perf/startup-and-eval-improvements branch from 5a83ef4 to efba1b0 Compare May 24, 2026 02:55
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR focuses on reducing startup and evaluation overhead by reusing parsed artifacts in both the CLI and the language engine.

Changes:

  • Refactors mq-run CLI execution flow and adds an optimization to compile the effective query once per batch when safe.
  • Introduces a thread-local cache for the built-in module parse result to reduce repeated builtin parsing cost.
  • Adds a public Engine::compile + Engine::eval_compiled API and updates docs/tests accordingly.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
crates/mq-run/src/cli.rs Refactors execution helpers and adds a batch fast-path that uses compiled programs.
crates/mq-lang/src/module.rs Adds a thread-local cache intended to reuse parsed builtin module tokens/module across engines.
crates/mq-lang/src/lib.rs Updates ast-json feature documentation to reflect new execution API.
crates/mq-lang/src/eval.rs Adjusts evaluator env passing to reduce cloning (but still clones in hot paths).
crates/mq-lang/src/engine.rs Adds compile/eval_compiled, updates examples/tests, and removes eval_ast.
crates/mq-lang/src/arena.rs Adds extend_from_slice/as_slice helpers used by the builtin cache.
Comments suppressed due to low confidence (1)

crates/mq-run/src/cli.rs:1215

  • write_file is currently unused in this test module (no call sites found). Consider removing it or using it in tests to avoid dead-code warnings and keep the test helpers minimal.
    #[test]
    fn test_cli_null_input() {
        let cli = Cli {
            input: InputArgs {
                input_format: Some(InputFormat::Null),
                ..Default::default()
            },

Comment thread crates/mq-lang/src/module.rs Outdated
Comment thread crates/mq-lang/src/eval.rs
Comment thread crates/mq-lang/src/engine.rs Outdated
Comment thread crates/mq-lang/src/engine.rs Outdated
Comment thread crates/mq-lang/src/engine.rs Outdated
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comment thread crates/mq-lang/src/engine.rs
Comment thread crates/mq-lang/src/engine.rs Outdated
Comment thread crates/mq-lang/src/engine.rs Outdated
Comment thread crates/mq-lang/src/eval.rs
Comment on lines 210 to +242
pub fn load_builtin(&mut self, token_arena: TokenArena) -> Result<Module, ModuleError> {
self.load(Module::BUILTIN_MODULE, BUILTIN_FILE, token_arena)
if self.loaded_modules.contains(Module::BUILTIN_MODULE.into()) {
return Err(ModuleError::AlreadyLoaded(Cow::Borrowed(Module::BUILTIN_MODULE)));
}

// Cache is only valid when both arenas are in their initial state (builtin
// module_id == 1, tokens right after the dummy EOF). Fall back to full parse otherwise.
let pristine = self.loaded_modules.len() == 1 && {
#[cfg(not(feature = "sync"))]
{
token_arena.borrow().len() == 1
}
#[cfg(feature = "sync")]
{
token_arena.read().unwrap().len() == 1
}
};

if pristine {
let cached =
BUILTIN_CACHE.with(|cache| cache.borrow().as_ref().map(|c| (c.tokens.clone(), c.module.clone())));

if let Some((tokens, module)) = cached {
{
#[cfg(not(feature = "sync"))]
token_arena.borrow_mut().extend_from_slice(&tokens);
#[cfg(feature = "sync")]
token_arena.write().unwrap().extend_from_slice(&tokens);
}
self.loaded_modules.alloc(Module::BUILTIN_MODULE.into());
return Ok(module);
}
}
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

crates/mq-lang/src/eval.rs:296

  • Same issue as above: &Shared::clone(&self.env) here adds unnecessary cloning/temporary creation. Use &self.env (or reuse an &Shared<_> binding) to avoid refcount churn in the hot path.
            } else {
                self.eval_program(&nodes_program, values?.into(), &Shared::clone(&self.env))
                    .map(|value| {

Comment on lines 283 to 288
let values: Result<Vec<RuntimeValue>, InnerError> = input
.map(|runtime_value| match &runtime_value {
RuntimeValue::Markdown(node, _) => self.eval_markdown_node(&program, node),
_ => self
.eval_program(&program, runtime_value, Shared::clone(&self.env))
.eval_program(&program, runtime_value, &Shared::clone(&self.env))
.map_err(|e| e.into_inner_error()),
Comment thread crates/mq-lang/src/engine.rs
Comment on lines +253 to 268
pub fn eval_compiled<I: Iterator<Item = RuntimeValue>>(
&mut self,
compiled: &CompiledProgram,
input: I,
) -> MqResult {
self.evaluator
.eval(&program, input.into_iter())
.eval(&compiled.program, input)
.map(|values| values.into())
.map_err(|e| Box::new(error::Error::from_error("", e, self.evaluator.module_loader.clone())))
.map_err(|e| {
Box::new(error::Error::from_error(
&compiled.source,
e,
self.evaluator.module_loader.clone(),
))
})
}
Comment on lines +220 to 241
/// Compiles mq code into a [`CompiledProgram`] that can be evaluated multiple times.
///
/// This is similar to `eval`, but takes an AST directly, skipping parsing.
/// The AST is typically obtained from deserializing a JSON AST.
/// Use this with `eval_compiled` to avoid re-parsing the same query for each input.
pub fn compile(&mut self, code: &str) -> Result<CompiledProgram, Box<error::Error>> {
if code.is_empty() {
return Ok(CompiledProgram {
source: String::new(),
program: vec![],
});
}
let program = parse(code, Shared::clone(&self.token_arena))?;
Ok(CompiledProgram {
source: code.to_string(),
program,
})
}

/// Evaluates a pre-compiled program against the given input.
///
/// Use with `compile` to avoid re-parsing the same query for each input file,
/// or with a [`CompiledProgram`] constructed from a deserialized JSON AST (`ast-json` feature).
///
Comment thread crates/mq-lang/src/lib.rs
harehare added 10 commits May 24, 2026 21:50
- Cache parsed builtin.mq tokens + Module in thread-local storage so
  each rayon worker thread parses builtin only once instead of once per
  file; adds Arena::extend_from_slice / as_slice helpers to support it
- Add Engine::compile and Engine::eval_compiled for pre-parsed query
  reuse; CLI sequential mode now compiles the query once and reuses it
  across all files when all share the same effective query prefix
- Change eval_program env parameter from owned Shared<SharedCell<Env>>
  to &Shared<SharedCell<Env>>, eliminating Rc clone on every loop
  iteration in for/foreach/map pipelines
Remove eval_ast (which was behind the ast-json feature gate and took
Program by value) and unify into eval_compiled (&Program, no feature
gate). Update the test and lib.rs docs accordingly.
…eferences

- Add AlreadyLoaded check at the top of load_builtin for explicit idempotency
- Only use the thread-local cache when both arenas are in pristine state
  (loaded_modules.len()==1 and token_arena.len()==1); fall back to full
  parse otherwise to avoid broken offsets when other modules are loaded first
- Handle AlreadyLoaded in Evaluator::load_builtin_module as Ok(()) so
  re-entrant calls (e.g. DAP adapter on re-launch) do not panic
- Add tests for idempotent and non-pristine load scenarios
… error diagnostics

Bundle the original query source with the compiled AST so that runtime
errors surfaced via eval_compiled can display the relevant code snippet
and source range, matching the diagnostic quality of the eval path.

- Add CompiledProgram { source, program } with source()/program() accessors
- Implement From<Program> for CompiledProgram for ast-json deserialization use cases
- Change Engine::compile to return CompiledProgram
- Change Engine::eval_compiled to accept &CompiledProgram and thread source into from_error
Cover source preservation, program access, clone, empty-code compile,
eval reuse across inputs, runtime error source propagation, and the
From<Program> path that produces empty source in diagnostics.
Cover the key invariants of the thread-local BUILTIN_CACHE:

module.rs (pristine_token_arena fixture required to hit the cache path):
- arena size is identical whether tokens come from cache or fresh parse
- module function/var/macro/module counts are consistent across both paths
- BUILTIN_MODULE is always registered at loaded_modules index 1
- all cached tokens carry module_id == BUILTIN_MODULE_ID (so error diagnostics
  resolve to the builtin source file, not garbage)

engine.rs:
- sequential engines calling the same builtin functions produce identical results
- eval_compiled on an engine that used the cache produces correct output
  (verifies token_id references in compiled programs remain valid after cache replay)
- runtime error source_code is correctly preserved on a cache-using engine
Add two tests that check the error *location* (token offset + span), not
just source_code, to catch token_id shifts introduced by cache replay:

- test_builtin_cache_runtime_error_token_location_correct: for queries
  where the erroring identifier is at different offsets (0, 4, 9 bytes),
  confirm that location.offset() == query.find(ident) and the slice at
  that offset equals the expected identifier.

- test_builtin_cache_and_fresh_parse_error_location_identical: two
  sequential engines must produce identical error.location, so that a
  cache-using engine cannot silently shift token_ids relative to a
  fresh-parse engine.
@harehare harehare force-pushed the perf/startup-and-eval-improvements branch from 42b0c1f to fd24083 Compare May 24, 2026 12:50
Set source code on the module loader inside eval_compiled when the
debugger feature is enabled, so token location info is available
during evaluation of pre-compiled programs.
@harehare harehare force-pushed the perf/startup-and-eval-improvements branch from b2d9733 to f8fb3a7 Compare May 24, 2026 13:06
@harehare harehare merged commit 6a69972 into main May 24, 2026
10 checks passed
@harehare harehare deleted the perf/startup-and-eval-improvements branch May 24, 2026 22:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants