perf(complete): bail out of type inference when completion budget expires#9247

Merged
mscolnick merged 2 commits into main from ms/jedi-completions
Apr 17, 2026

Conversation

@mscolnick
Contributor

Completing against heavy libraries (pandas, numpy, torch, aiohttp, ...) on a cold jedi cache was paying the full per-item inference cost for every completion, even when the list is large enough that we already skip docstrings. On molab this made the first pd. tab take ~11s; on a local Mac with a cleared cache, ~5.5s.

  • Add compute_type=True kwarg to _get_completion_option; when False, skip the completion.type access (and the docstring/signature path, which re-triggers the same inference).
  • Collapse the split oversize/normal branches of _get_completion_options into one loop that tracks a time budget. Once the budget expires, remaining completions are returned with name only.

Per-completion options cost: 29.6 ms (OLD) -> 19.0 ms (NEW), a ~36% reduction across the suite. Worst-case savings for heavy libs are bigger:

  • aiohttp (169 attrs): ~9.6s projected -> 2.8s measured (~71%)
  • pandas cold (141 attrs): ~11.3s pre-patch -> ~2.0s capped (~82%)

Projected savings, same-lib cold:

  ┌────────────────────────────────────────────┬────────┬───────────┬─────────────────┬───────────────┐
  │                    lib                     │ #comps │    OLD    │       NEW       │     saved     │
  ├────────────────────────────────────────────┼────────┼───────────┼─────────────────┼───────────────┤
  │ pandas (molab, measured pre-patch)         │    141 │ 11,300 ms │ ~2,000 ms (cap) │ ~9.3 s (82 %) │
  ├────────────────────────────────────────────┼────────┼───────────┼─────────────────┼───────────────┤
  │ aiohttp (projected OLD = 169 × 57 ms/comp) │    169 │ ~9,600 ms │        2,839 ms │ ~6.8 s (71 %) │
  ├────────────────────────────────────────────┼────────┼───────────┼─────────────────┼───────────────┤
  │ rich (measured OLD, projected NEW cap)     │     98 │  5,547 ms │ ~2,000 ms (cap) │ ~3.5 s (64 %) │
  └────────────────────────────────────────────┴────────┴───────────┴─────────────────┴───────────────┘

  Local cold cache, same scenarios before vs after

  ┌──────────────────────────────────────┬──────────┬──────────┬──────────────────────────────────┐
  │               scenario               │  before  │  after   │                Δ                 │
  ├──────────────────────────────────────┼──────────┼──────────┼──────────────────────────────────┤
  │ import pandas as pd; pd. (141 attrs) │ 5,506 ms │ 2,520 ms │                            −54 % │
  ├──────────────────────────────────────┼──────────┼──────────┼──────────────────────────────────┤
  │ import numpy as np; np.lin (2 attrs) │ 1,002 ms │   902 ms │                            −10 % │
  ├──────────────────────────────────────┼──────────┼──────────┼──────────────────────────────────┤
  │ import os; os.path.join(             │   109 ms │   399 ms │ noise (jedi.Script, not options) │
  ├──────────────────────────────────────┼──────────┼──────────┼──────────────────────────────────┤
  │ d[" key completion                   │   1.0 ms │   0.7 ms │                            noise │
  ├──────────────────────────────────────┼──────────┼──────────┼──────────────────────────────────┤
  │ import sys; sys                      │  12.8 ms │  10.9 ms │                            noise │
  └──────────────────────────────────────┴──────────┴──────────┴──────────────────────────────────┘

Tradeoff: on cold heavy libs, only the first ~50-80 completions get type icons; the rest go out blank and fill in on subsequent requests as jedi's internal cache populates. Warm completions are unchanged.

Head-to-head on molab, 4 fresh cold libs each (older jedi cache behavior):

  impl  lib             n   options_ms  typed  info
  OLD   rich           98       5547     70     0
  OLD   requests       67       1313     51    48
  NEW   aiohttp       169       2839     50     0   (budget cap)
  NEW   bs4            78       1984     56    49

Copilot AI review requested due to automatic review settings April 17, 2026 18:29
@vercel

vercel bot commented Apr 17, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project: marimo-docs | Deployment: Ready | Actions: Preview, Comment | Updated (UTC): Apr 17, 2026 9:39pm

@mscolnick
Contributor Author

mscolnick commented Apr 17, 2026

This analysis was done with marimo pair #notanad

Contributor

Copilot AI left a comment

Pull request overview

Improves runtime code completion performance by introducing a time-budgeted fast path that stops triggering expensive Jedi inference (especially on cold caches for heavy libraries), returning lightweight completion entries once the budget expires.

Changes:

  • Add a compute_type flag to _get_completion_option to skip completion.type (and associated inference) when over budget.
  • Rework _get_completion_options into a single loop that tracks a timeout budget and disables type/docstring inference after the budget is exceeded.
  • Add focused unit tests covering the new inference-skipping behavior and timeout bail-out semantics.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

  • marimo/_runtime/complete.py: Adds time-budgeted bail-out logic and compute_type to avoid costly Jedi inference once over budget.
  • tests/_runtime/test_complete.py: Adds regression tests ensuring type/docstring/signature inference is skipped appropriately under limits/timeouts.
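
A regression test in the spirit of the ones described above might count type accesses on a fake completion and assert that a zero budget skips inference entirely. This is a hypothetical harness with illustrative names, not the contents of the actual test file.

```python
import time


class CountingCompletion:
    """Fake jedi completion whose .type access is counted."""

    inference_calls = 0

    def __init__(self, name):
        self.name = name

    @property
    def type(self):
        # Stands in for the expensive jedi inference being avoided.
        CountingCompletion.inference_calls += 1
        return "function"


def budgeted_names_and_types(completions, timeout):
    # Same budget shape as the PR: once over budget, return name only.
    start = time.monotonic()
    out = []
    for c in completions:
        under_budget = (time.monotonic() - start) < timeout
        out.append((c.name, c.type if under_budget else ""))
    return out


def test_zero_budget_skips_inference():
    CountingCompletion.inference_calls = 0
    comps = [CountingCompletion(f"attr{i}") for i in range(50)]
    out = budgeted_names_and_types(comps, timeout=0.0)
    assert CountingCompletion.inference_calls == 0
    assert [name for name, _ in out] == [c.name for c in comps]
```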

Comment thread marimo/_runtime/complete.py Outdated
Comment on lines +326 to +330
start_time = time.time()
for completion in completions:
    if not _should_include_name(completion.name, prefix):
        continue
    elapsed_time = time.time() - start_time
    under_time_budget = (time.time() - start_time) < timeout

Copilot AI Apr 17, 2026


The timeout budget here is measured with time.time(), which is not monotonic and can jump backwards/forwards if the system clock changes (NTP, sleep/wake), potentially causing the code to incorrectly think it's still under budget and continue triggering expensive Jedi inference. Prefer time.monotonic() (or time.perf_counter()) for elapsed-time budgets; you can also compute a deadline = start + timeout once and compare time.monotonic() < deadline inside the loop.

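Following the suggestion above, the elapsed-time check could be written with a single precomputed deadline on a monotonic clock. A sketch under assumed names, not the PR's actual code:

```python
import time


def split_by_deadline(completions, timeout):
    # Compute the deadline once; time.monotonic() cannot jump backwards
    # when the wall clock is adjusted (NTP, sleep/wake), unlike time.time().
    deadline = time.monotonic() + timeout
    typed, name_only = [], []
    for completion in completions:
        if time.monotonic() < deadline:
            typed.append(completion)
        else:
            name_only.append(completion)
    return typed, name_only
```
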
Comment thread marimo/_runtime/complete.py Outdated
Comment on lines 330 to 339
  under_time_budget = (time.time() - start_time) < timeout
  completion_options.append(
      _get_completion_option(
          completion,
          script,
-         compute_completion_info=elapsed_time < timeout,
+         compute_completion_info=compute_docstrings
+         and under_time_budget,
          compute_type=under_time_budget,
      )
  )
Collaborator

@dmadisetti dmadisetti Apr 17, 2026


under_time_budget = (time.time() - start_time) < timeout
completion_options.append(
    _get_completion_option(
        completion,
        script,
        compute_completion_info=compute_docstrings
        and under_time_budget,
        compute_type=under_time_budget,
    )
)

# collect completions that missed the budget
if not under_time_budget:
    to_compute.append((completion, script))

...

# kick off bg processing here
if to_compute:
    compute_completions(to_compute)

Contributor Author


I don't think this backend compute is necessary

@mscolnick mscolnick added the enhancement (New feature or request) label Apr 17, 2026
dmadisetti previously approved these changes Apr 17, 2026
@mscolnick mscolnick merged commit 1faa7c4 into main Apr 17, 2026
29 of 43 checks passed
@mscolnick mscolnick deleted the ms/jedi-completions branch April 17, 2026 21:57
@github-actions

🚀 Development release published. You may be able to view the changes at https://marimo.app?v=0.23.2-dev52
