⚡ perf: reduce startup and eval overhead across engine and CLI by harehare · Pull Request #1785 · harehare/mq

harehare · 2026-05-24T01:04:16Z

No description provided.

codspeed-hq · 2026-05-24T01:15:30Z

Merging this PR will improve performance by ×2.7

⚠️

Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 22 improved benchmarks
✅ 7 untouched benchmarks

Performance Changes

	Benchmark	`BASE`	`HEAD`	Efficiency
⚡	`eval_array_chained_operations`	9.2 ms	3.9 ms	×2.4
⚡	`eval_array_filter`	9.8 ms	4.4 ms	×2.2
⚡	`eval_nested_object_access`	6.7 ms	1.3 ms	×5.1
⚡	`eval_string_equality`	6.7 ms	1.4 ms	×4.8
⚡	`eval_array_fold`	7.4 ms	2.1 ms	×3.5
⚡	`eval_no_macro_large_program`	288.3 µs	203.6 µs	+41.59%
⚡	`eval_variable_assignment_chain`	672.4 µs	606 µs	+10.96%
⚡	`eval_array_map`	8 ms	2.7 ms	×3
⚡	`eval_nodes`	6.4 ms	1 ms	×6.3
⚡	`eval_yaml_parse`	7.3 ms	2.1 ms	×3.5
⚡	`parse_fibonacci`	180 µs	93.1 µs	+93.25%
⚡	`eval_csv_parse`	7.3 ms	2 ms	×3.6
⚡	`eval_object_field_access`	7.1 ms	1.7 ms	×4.2
⚡	`eval_pipeline_with_conditionals`	7 ms	1.7 ms	×4.2
⚡	`eval_foreach`	6.8 ms	1.5 ms	×4.6
⚡	`eval_select_h`	287.8 µs	200.1 µs	+43.78%
⚡	`eval_if_else_branching`	8.1 ms	2.8 ms	×2.9
⚡	`eval_json_parse`	6.9 ms	1.6 ms	×4.4
⚡	`eval_large_markdown_filtering`	25.2 ms	20.1 ms	+25.66%
⚡	`eval_macro_expansion_simple`	245.9 µs	137.9 µs	+78.3%
...	...	...	...	...

ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.

Tip

Curious why this is faster? Comment @codspeedbot explain why this is faster on this PR, or directly use the CodSpeed MCP with your agent.

_{Comparing perf/startup-and-eval-improvements (7a82dc5) with main (1d5fde9)¹}

No successful run was found on main (ab78796) during the generation of this report, so 1d5fde9 was used instead as the comparison base. There might be some changes unrelated to this pull request in this report. ↩

Copilot

Pull request overview

This PR focuses on reducing startup and evaluation overhead by reusing parsed artifacts in both the CLI and the language engine.

Changes:

Refactors mq-run CLI execution flow and adds an optimization to compile the effective query once per batch when safe.
Introduces a thread-local cache for the built-in module parse result to reduce repeated builtin parsing cost.
Adds a public Engine::compile + Engine::eval_compiled API and updates docs/tests accordingly.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
crates/mq-run/src/cli.rs	Refactors execution helpers and adds a batch fast-path that uses compiled programs.
crates/mq-lang/src/module.rs	Adds a thread-local cache intended to reuse parsed builtin module tokens/module across engines.
crates/mq-lang/src/lib.rs	Updates `ast-json` feature documentation to reflect new execution API.
crates/mq-lang/src/eval.rs	Adjusts evaluator env passing to reduce cloning (but still clones in hot paths).
crates/mq-lang/src/engine.rs	Adds `compile`/`eval_compiled`, updates examples/tests, and removes `eval_ast`.
crates/mq-lang/src/arena.rs	Adds `extend_from_slice`/`as_slice` helpers used by the builtin cache.

Comments suppressed due to low confidence (1)

crates/mq-run/src/cli.rs:1215

write_file is currently unused in this test module (no call sites found). Consider removing it or using it in tests to avoid dead-code warnings and keep the test helpers minimal.

    #[test]
    fn test_cli_null_input() {
        let cli = Cli {
            input: InputArgs {
                input_format: Some(InputFormat::Null),
                ..Default::default()
            },

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

    pub fn load_builtin(&mut self, token_arena: TokenArena) -> Result<Module, ModuleError> {
-        self.load(Module::BUILTIN_MODULE, BUILTIN_FILE, token_arena)
+        if self.loaded_modules.contains(Module::BUILTIN_MODULE.into()) {
+            return Err(ModuleError::AlreadyLoaded(Cow::Borrowed(Module::BUILTIN_MODULE)));
+        }
+
+        // Cache is only valid when both arenas are in their initial state (builtin
+        // module_id == 1, tokens right after the dummy EOF). Fall back to full parse otherwise.
+        let pristine = self.loaded_modules.len() == 1 && {
+            #[cfg(not(feature = "sync"))]
+            {
+                token_arena.borrow().len() == 1
+            }
+            #[cfg(feature = "sync")]
+            {
+                token_arena.read().unwrap().len() == 1
+            }
+        };
+
+        if pristine {
+            let cached =
+                BUILTIN_CACHE.with(|cache| cache.borrow().as_ref().map(|c| (c.tokens.clone(), c.module.clone())));
+
+            if let Some((tokens, module)) = cached {
+                {
+                    #[cfg(not(feature = "sync"))]
+                    token_arena.borrow_mut().extend_from_slice(&tokens);
+                    #[cfg(feature = "sync")]
+                    token_arena.write().unwrap().extend_from_slice(&tokens);
+                }
+                self.loaded_modules.alloc(Module::BUILTIN_MODULE.into());
+                return Ok(module);
+            }
+        }


Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (1)

crates/mq-lang/src/eval.rs:296

Same issue as above: &Shared::clone(&self.env) here adds unnecessary cloning/temporary creation. Use &self.env (or reuse an &Shared<_> binding) to avoid refcount churn in the hot path.

            } else {
                self.eval_program(&nodes_program, values?.into(), &Shared::clone(&self.env))
                    .map(|value| {

            let values: Result<Vec<RuntimeValue>, InnerError> = input
                .map(|runtime_value| match &runtime_value {
                    RuntimeValue::Markdown(node, _) => self.eval_markdown_node(&program, node),
                    _ => self
-                        .eval_program(&program, runtime_value, Shared::clone(&self.env))
+                        .eval_program(&program, runtime_value, &Shared::clone(&self.env))
                        .map_err(|e| e.into_inner_error()),


+    pub fn eval_compiled<I: Iterator<Item = RuntimeValue>>(
+        &mut self,
+        compiled: &CompiledProgram,
+        input: I,
+    ) -> MqResult {
        self.evaluator
-            .eval(&program, input.into_iter())
+            .eval(&compiled.program, input)
            .map(|values| values.into())
-            .map_err(|e| Box::new(error::Error::from_error("", e, self.evaluator.module_loader.clone())))
+            .map_err(|e| {
+                Box::new(error::Error::from_error(
+                    &compiled.source,
+                    e,
+                    self.evaluator.module_loader.clone(),
+                ))
+            })
    }


+    /// Compiles mq code into a [`CompiledProgram`] that can be evaluated multiple times.
    ///
-    /// This is similar to `eval`, but takes an AST directly, skipping parsing.
-    /// The AST is typically obtained from deserializing a JSON AST.
+    /// Use this with `eval_compiled` to avoid re-parsing the same query for each input.
+    pub fn compile(&mut self, code: &str) -> Result<CompiledProgram, Box<error::Error>> {
+        if code.is_empty() {
+            return Ok(CompiledProgram {
+                source: String::new(),
+                program: vec![],
+            });
+        }
+        let program = parse(code, Shared::clone(&self.token_arena))?;
+        Ok(CompiledProgram {
+            source: code.to_string(),
+            program,
+        })
+    }
+
+    /// Evaluates a pre-compiled program against the given input.
+    ///
+    /// Use with `compile` to avoid re-parsing the same query for each input file,
+    /// or with a [`CompiledProgram`] constructed from a deserialized JSON AST (`ast-json` feature).
    ///


- Cache parsed builtin.mq tokens + Module in thread-local storage so each rayon worker thread parses builtin only once instead of once per file; adds Arena::extend_from_slice / as_slice helpers to support it - Add Engine::compile and Engine::eval_compiled for pre-parsed query reuse; CLI sequential mode now compiles the query once and reuses it across all files when all share the same effective query prefix - Change eval_program env parameter from owned Shared<SharedCell<Env>> to &Shared<SharedCell<Env>>, eliminating Rc clone on every loop iteration in for/foreach/map pipelines

Remove eval_ast (which was behind the ast-json feature gate and took Program by value) and unify into eval_compiled (&Program, no feature gate). Update the test and lib.rs docs accordingly.

…env) for improved efficiency

…er methods

…eferences - Add AlreadyLoaded check at the top of load_builtin for explicit idempotency - Only use the thread-local cache when both arenas are in pristine state (loaded_modules.len()==1 and token_arena.len()==1); fall back to full parse otherwise to avoid broken offsets when other modules are loaded first - Handle AlreadyLoaded in Evaluator::load_builtin_module as Ok(()) so re-entrant calls (e.g. DAP adapter on re-launch) do not panic - Add tests for idempotent and non-pristine load scenarios

… error diagnostics Bundle the original query source with the compiled AST so that runtime errors surfaced via eval_compiled can display the relevant code snippet and source range, matching the diagnostic quality of the eval path. - Add CompiledProgram { source, program } with source()/program() accessors - Implement From<Program> for CompiledProgram for ast-json deserialization use cases - Change Engine::compile to return CompiledProgram - Change Engine::eval_compiled to accept &CompiledProgram and thread source into from_error

Cover source preservation, program access, clone, empty-code compile, eval reuse across inputs, runtime error source propagation, and the From<Program> path that produces empty source in diagnostics.

Cover the key invariants of the thread-local BUILTIN_CACHE: module.rs (pristine_token_arena fixture required to hit the cache path): - arena size is identical whether tokens come from cache or fresh parse - module function/var/macro/module counts are consistent across both paths - BUILTIN_MODULE is always registered at loaded_modules index 1 - all cached tokens carry module_id == BUILTIN_MODULE_ID (so error diagnostics resolve to the builtin source file, not garbage) engine.rs: - sequential engines calling the same builtin functions produce identical results - eval_compiled on an engine that used the cache produces correct output (verifies token_id references in compiled programs remain valid after cache replay) - runtime error source_code is correctly preserved on a cache-using engine

Add two tests that check the error *location* (token offset + span), not just source_code, to catch token_id shifts introduced by cache replay: - test_builtin_cache_runtime_error_token_location_correct: for queries where the erroring identifier is at different offsets (0, 4, 9 bytes), confirm that location.offset() == query.find(ident) and the slice at that offset equals the expected identifier. - test_builtin_cache_and_fresh_parse_error_location_identical: two sequential engines must produce identical error.location, so that a cache-using engine cannot silently shift token_ids relative to a fresh-parse engine.

Set source code on the module loader inside eval_compiled when the debugger feature is enabled, so token location info is available during evaluation of pre-compiled programs.

…stics

harehare requested a review from Copilot May 24, 2026 02:53

Copilot started reviewing on behalf of harehare May 24, 2026 02:53 View session

harehare force-pushed the perf/startup-and-eval-improvements branch from 5a83ef4 to efba1b0 Compare May 24, 2026 02:55

Copilot AI reviewed May 24, 2026

View reviewed changes

Comment thread crates/mq-lang/src/module.rs Outdated

Comment thread crates/mq-lang/src/eval.rs

Comment thread crates/mq-lang/src/engine.rs Outdated

Comment thread crates/mq-lang/src/engine.rs Outdated

Comment thread crates/mq-lang/src/engine.rs Outdated

harehare requested a review from Copilot May 24, 2026 09:37

Copilot started reviewing on behalf of harehare May 24, 2026 09:38 View session

Copilot AI reviewed May 24, 2026

View reviewed changes

harehare requested a review from Copilot May 24, 2026 12:43

Copilot started reviewing on behalf of harehare May 24, 2026 12:44 View session

Copilot AI reviewed May 24, 2026

View reviewed changes

harehare added 10 commits May 24, 2026 21:50

♻️ refactor(engine): consolidate eval_ast into eval_compiled

50e8e75

Remove eval_ast (which was behind the ast-json feature gate and took Program by value) and unify into eval_compiled (&Program, no feature gate). Update the test and lib.rs docs accordingly.

♻️ refactor(engine): remove unused Program import for cleaner code

40fcbdf

♻️ refactor(evaluator): replace env.clone() with Shared::clone(&self.…

28099e1

…env) for improved efficiency

✨ feat(cli): enhance input handling and result emission with new help…

4711b88

…er methods

✅ test(engine): add rstest parameterized tests for CompiledProgram

b88071e

Cover source preservation, program access, clone, empty-code compile, eval reuse across inputs, runtime error source propagation, and the From<Program> path that produces empty source in diagnostics.

harehare force-pushed the perf/startup-and-eval-improvements branch from 42b0c1f to fd24083 Compare May 24, 2026 12:50

🐛 fix(engine): propagate compiled source to module loader for debugger

f8fb3a7

Set source code on the module loader inside eval_compiled when the debugger feature is enabled, so token location info is available during evaluation of pre-compiled programs.

harehare force-pushed the perf/startup-and-eval-improvements branch from b2d9733 to f8fb3a7 Compare May 24, 2026 13:06

🐛 fix(error): update source_code to use NamedSource for better diagno…

7a82dc5

…stics

harehare merged commit 6a69972 into main May 24, 2026
10 checks passed

harehare deleted the perf/startup-and-eval-improvements branch May 24, 2026 22:22

BrewTestBot mentioned this pull request May 26, 2026

mq 0.5.31 Homebrew/homebrew-core#284775

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

⚡ perf: reduce startup and eval overhead across engine and CLI #1785

⚡ perf: reduce startup and eval overhead across engine and CLI #1785
harehare merged 12 commits into
mainfrom
perf/startup-and-eval-improvements

harehare commented May 24, 2026

Uh oh!

codspeed-hq Bot commented May 24, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

harehare commented May 24, 2026

Uh oh!

codspeed-hq Bot commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Merging this PR will improve performance by ×2.7

Performance Changes

Footnotes

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codspeed-hq Bot commented May 24, 2026 •

edited

Loading