
Code Completion FIM

refact-planner edited this page Jun 7, 2026 · 1 revision


Refact’s fill-in-the-middle completion pipeline, including scratchpads, AST/RAG augmentation, and completion caching.

Refact serves code completion through the /code-completion endpoint, using a fill-in-the-middle scratchpad that builds prompts around the cursor and returns completion choices or streaming deltas.

Fill-in-the-middle completion

The FIM pipeline lives in crates/refact-scratchpads/src/code_completion_fim.rs. It builds a prompt from text before and after the cursor and inserts the model’s fill token at the completion position.

Two prompt orders are supported:

  • FIM-PSM: prefix, suffix, middle
  • FIM-SPM: suffix, prefix, middle

Each name gives the order in which the three segments appear in the prompt. In both orders the middle marker comes last, so the model generates the missing text after it.
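To make the two orders concrete, here is a minimal sketch of how such a prompt could be assembled. The `FimOrder`, `FimTokens`, and `build_fim_prompt` names are hypothetical and are not taken from code_completion_fim.rs. The exact layout a given model expects (for example, whether an extra token is prepended) is an assumption that should be checked against the model's documentation.

```rust
/// Segment order for a fill-in-the-middle prompt (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum FimOrder {
    Psm, // prefix, suffix, middle
    Spm, // suffix, prefix, middle
}

/// Marker tokens supplied by the caller (hypothetical type).
struct FimTokens<'a> {
    prefix: &'a str,
    suffix: &'a str,
    middle: &'a str,
}

/// Build a prompt from the text before and after the cursor.
/// The middle marker is always last, so generation continues from it.
fn build_fim_prompt(order: FimOrder, t: &FimTokens, before: &str, after: &str) -> String {
    match order {
        FimOrder::Psm => format!("{}{}{}{}{}", t.prefix, before, t.suffix, after, t.middle),
        FimOrder::Spm => format!("{}{}{}{}{}", t.suffix, after, t.prefix, before, t.middle),
    }
}
```

With markers `<P>`, `<S>`, and `<M>`, text `a` before the cursor and `b` after it, this sketch yields `<P>a<S>b<M>` for PSM and `<S>b<P>a<M>` for SPM.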

Scratchpad patch tokens

The model adaptation patch can define the tokens and formatting used by the scratchpad:

  • fim_prefix
  • fim_suffix
  • fim_middle
  • eot
  • extra_stop_tokens
  • context_format
  • rag_ratio

The implementation also accepts an eos token. When a tokenizer is available, it validates that each configured special token encodes to a single token.
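The single-token check can be sketched as follows. This is an illustration under assumptions, not the actual validation code: `check_single_token` is a hypothetical helper, and the `tokenize` callback stands in for whatever tokenizer the engine has loaded. Passing `None` models the case where no tokenizer is available and the check is skipped.

```rust
/// Verify that `token` encodes to exactly one token id.
/// `tokenize` is a stand-in for a real tokenizer; `None` skips the check.
fn check_single_token(
    tokenize: Option<&dyn Fn(&str) -> Vec<u32>>,
    name: &str,
    token: &str,
) -> Result<(), String> {
    // Without a tokenizer there is nothing to validate against.
    let Some(tokenize) = tokenize else {
        return Ok(());
    };
    let n = tokenize(token).len();
    if n == 1 {
        Ok(())
    } else {
        Err(format!("{name} = {token:?} encodes to {n} tokens, expected 1"))
    }
}
```

A check like this catches a misconfigured patch early, for example a fim_prefix string that the tokenizer splits into several pieces and that the model would therefore not recognize as a marker.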

AST + RAG context augmentation

When AST-aware retrieval is enabled, the scratchpad can augment the completion context with nearby usages that are derived from the AST and then postprocessed. The RAG path is implemented in crates/refact-scratchpads/src/completon_rag.rs and uses context_format to render attached files in formats such as chat, starcoder, and qwen2.5.

The AST/RAG augmentation is only used when the scratchpad has a non-empty context format, a positive rag_ratio, AST is enabled for the request, and an AST index is available.
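The four conditions combine as a plain conjunction. The sketch below shows that gate in isolation; `rag_enabled` and its parameter names are hypothetical and only mirror the conditions listed above.

```rust
/// Return true only when every precondition for AST/RAG augmentation holds.
fn rag_enabled(
    context_format: &str,      // scratchpad context format; empty means unset
    rag_ratio: f64,            // must be positive
    ast_requested: bool,       // AST enabled for this request
    ast_index_available: bool, // an AST index is available
) -> bool {
    !context_format.is_empty() && rag_ratio > 0.0 && ast_requested && ast_index_available
}
```

If any one condition fails, for example rag_ratio is 0 or the request disables AST, the completion proceeds as plain FIM without retrieved context.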

Caching

The FIM scratchpad records completion metadata for cache use, including:

  • the final completion text
  • finish reason
  • prompt context fields such as fim_ms, n_ctx, and rag_tokens_limit

The completion cache stores the first completion choice and its finish reason as the model response is built.
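As a rough illustration of "first choice plus finish reason", the sketch below derives a cache entry from a list of choices. The `Choice` and `CacheEntry` types and the `cache_entry_from_choices` function are hypothetical. The real cache's key and field names are not described here and are not assumed.

```rust
/// One completion choice as returned by the model (hypothetical shape).
#[derive(Clone, Debug, PartialEq)]
struct Choice {
    text: String,
    finish_reason: String,
}

/// What the cache keeps for a completion (hypothetical shape).
#[derive(Clone, Debug, PartialEq)]
struct CacheEntry {
    text: String,
    finish_reason: String,
}

/// Keep only the first choice; later choices are ignored.
/// Returns None when the model produced no choices at all.
fn cache_entry_from_choices(choices: &[Choice]) -> Option<CacheEntry> {
    choices.first().map(|c| CacheEntry {
        text: c.text.clone(),
        finish_reason: c.finish_reason.clone(),
    })
}
```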

/code-completion

The HTTP API exposes code completion at /code-completion, and the engine routes that endpoint into the scratchpad-based completion flow.
