# Ewa

An offline, local-first AI writing assistant that translates selected text in-place with a single hotkey — no cloud, no copy-paste, no context switching.

Ewa runs silently in the background and listens for a global hotkey. Highlight text in any application, press the shortcut, and the selection is replaced with the AI-translated result — all processed locally on your machine using a quantized Llama model.
## Features

- Inline translation — selected text is replaced in-place without leaving your current application
- Global hotkey — works in any text-editable field across all applications
- Fully offline — inference runs locally via LLamaSharp; no data ever leaves your machine
- GPU-accelerated — leverages CUDA on Windows (NVIDIA GPU) for fast inference
- Configurable — hotkey, injection delay, and model parameters are all adjustable
## How It Works

When you press the hotkey, Ewa executes a pipeline entirely on your local machine — no network calls, no external services.
```mermaid
sequenceDiagram
    actor User
    participant OS as Any Application
    participant Ewa as Ewa (Background)
    participant LLM as Local LLM<br/>(LLamaSharp + GGUF)
    User->>OS: ① Select text with mouse
    User->>Ewa: ② Press Ctrl+Alt+E
    Ewa->>Ewa: ③ Back up original clipboard content
    Ewa->>OS: ④ Simulate Ctrl+C → copy selected text
    Ewa->>LLM: ⑤ Send selected text + translation prompt
    LLM-->>Ewa: ⑥ Stream translated result (local inference)
    Ewa->>OS: ⑦ Write result to clipboard → simulate Ctrl+V
    Ewa->>Ewa: ⑧ Restore original clipboard content
```
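The copy → infer → paste → restore pipeline above can be sketched in C#. All type and member names below are illustrative assumptions, not Ewa's actual API; the clipboard and input interfaces stand in for the real Win32-backed services:

```csharp
// Illustrative sketch of the copy → infer → paste → restore pipeline.
// Interface and member names are hypothetical, not Ewa's actual API.
using System;
using System.Threading.Tasks;

public interface IClipboard
{
    string? GetText();
    void SetText(string text);
}

public interface IInputSimulator
{
    void SendCopy();   // synthetic Ctrl+C
    void SendPaste();  // synthetic Ctrl+V
}

public sealed class TranslationPipeline
{
    private readonly IClipboard _clipboard;
    private readonly IInputSimulator _input;
    private readonly Func<string, Task<string>> _translate;

    public TranslationPipeline(
        IClipboard clipboard, IInputSimulator input, Func<string, Task<string>> translate)
    {
        _clipboard = clipboard;
        _input = input;
        _translate = translate;
    }

    public async Task RunAsync()
    {
        // 1. Preserve whatever the user already had on the clipboard.
        string? backup = _clipboard.GetText();

        // 2. Copy the current selection (the real app simulates Ctrl+C, then waits briefly).
        _input.SendCopy();
        string? selection = _clipboard.GetText();
        if (string.IsNullOrEmpty(selection)) return;

        // 3. Translate locally; no network involved.
        string translated = await _translate(selection);

        // 4. Stage the result and paste it over the selection
        //    (the real app waits the configured injection delay before pasting).
        _clipboard.SetText(translated);
        _input.SendPaste();

        // 5. Restore the original clipboard content.
        if (backup is not null) _clipboard.SetText(backup);
    }
}
```

Backing up the clipboard before the simulated copy is what makes step ⑧ possible: the user's original clipboard content survives the round trip.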
## Architecture

```mermaid
graph TB
    subgraph OS["OS Layer — Any Application"]
        Text["Selected Text"]
        Hotkey["Global Hotkey"]
    end
    subgraph AppHost["App Host — Ewa.App.Windows"]
        HotkeyMgr["Hotkey Manager"]
        Coord["App Coordinator <br> Routes events to handlers"]
        ClipSvc["Clipboard Service"]
        InputSim["Input Simulator"]
        NotifySvc["Notification Service"]
    end
    subgraph Core["Core — Ewa.Core"]
        Handler["Feature Handler"]
        Feature["Inline Translation Feature"]
        Engine["LLM Inference Engine"]
    end
    subgraph AI["Local AI"]
        ModelSvc["Model Service"]
        GGUF[".gguf Model File"]
    end
    Hotkey --> HotkeyMgr --> Coord --> Handler
    Handler --> ClipSvc
    Handler --> InputSim
    Handler --> NotifySvc
    Handler --> Feature --> Engine --> ModelSvc --> GGUF
    Text -.->|"Ctrl+C (simulated)"| ClipSvc
    ClipSvc -.->|"Ctrl+V (simulated)"| Text
    classDef os fill:#f0f0f0,stroke:#999
    classDef host fill:#dbeafe,stroke:#3b82f6
    classDef core fill:#dcfce7,stroke:#16a34a
    classDef ai fill:#fef9c3,stroke:#ca8a04
    class Text,Hotkey os
    class HotkeyMgr,Coord,ClipSvc,InputSim,NotifySvc host
    class Handler,Feature,Engine core
    class ModelSvc,GGUF ai
```
- Hotkey Manager: Registers a system-wide Win32 hotkey and fires an event when triggered
- App Coordinator: The composition root — wires events from the Hotkey Manager to the correct Feature Handler
- Clipboard Service: Backs up the current clipboard, reads the copied text, writes the result, then restores the original
- Input Simulator: Sends synthetic `Ctrl+C` and `Ctrl+V` keystrokes to the active application via Win32
- Notification Service: Plays non-intrusive system sounds to signal processing start, completion, or error
- Feature Handler: Owns the end-to-end pipeline: copy → infer → paste → restore. Platform-agnostic by design
- Inline Translation Feature: Holds the translation system prompt and delegates execution to the inference engine
- LLM Inference Engine: Streams tokens from the model service and aggregates them into a final string
- Model Service: Wraps LLamaSharp, applies the Llama 3 chat template, and streams raw tokens from the GGUF model
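For reference, the Llama 3 chat template that the Model Service applies wraps the system prompt and the selected text in special header tokens. The special tokens below are defined by the Llama 3 template itself; the helper class is only an illustrative sketch, not Ewa's actual code:

```csharp
// Llama 3 chat template layout, as applied when building the prompt.
// The special tokens are defined by the Llama 3 chat format; this
// helper class is an illustrative sketch, not Ewa's actual code.
public static class Llama3Prompt
{
    public static string Build(string systemPrompt, string userText) =>
        "<|begin_of_text|>" +
        "<|start_header_id|>system<|end_header_id|>\n\n" + systemPrompt + "<|eot_id|>" +
        "<|start_header_id|>user<|end_header_id|>\n\n" + userText + "<|eot_id|>" +
        "<|start_header_id|>assistant<|end_header_id|>\n\n";
}
```

The trailing `assistant` header leaves the model positioned to generate the translation; generation stops when the model emits its end-of-turn token.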
## Requirements

| Requirement | Details |
|---|---|
| OS | Windows 10 / 11 (64-bit) |
| Runtime | .NET 9 Desktop Runtime |
| GPU | NVIDIA GPU with CUDA 12 support (e.g. RTX 3060) |
| Model file | Llama 3.1 / 3.2 / 3.3 Instruct GGUF (recommended: `Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf`) |
## Installation

1. Go to the Releases page and download the latest Ewa `.zip` file.
2. Extract it to any folder, for example `D:\Ewa\`.
3. Download the quantized model file from Hugging Face.
4. Place the `.gguf` file into the `Models/` folder under the Ewa root directory.

Model compatibility: Ewa's inference engine uses the Llama 3 chat template, so only Llama 3.x Instruct models are supported (3.1, 3.2, 3.3).

Recommended model: Meta-Llama-3.1-8B-Instruct (Q4_K_M quantization, ~5 GB), available from https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF/tree/main
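After these steps, the folder should look roughly like this (the drive and folder are just the example from above; the model file name matches the recommended download):

```text
D:\Ewa\
├── Ewa.App.Windows.exe
└── Models\
    └── Meta-Llama-3.1-8B-Instruct-Q4_K_M.gguf
```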
## Usage

- Double-click `Ewa.App.Windows.exe` to open the startup window.
- When loading finishes, press Launch Ewa; the application will continue to run silently in the background.
- Ewa is now active and listening for the global hotkey.
- Select any text by highlighting it with your mouse in any application (browser, Word, Notepad, email client, etc.)
- Press `Ctrl + Alt + E` (the default hotkey; you can change it in Settings, available from the system tray)
- Wait briefly while Ewa processes the text locally
- The selected text is automatically replaced with the translated result
## Settings

Right-click the Ewa icon in the system tray and select Settings.

| Setting | Description |
|---|---|
| Modifiers | Modifier keys for the global hotkey (e.g. Ctrl+Alt, Ctrl+Shift) |
| Key | Trigger key (e.g. E, T) |
| KeepTranslationInClipboard | If enabled, the translated text remains in the clipboard after pasting |
| Injection Delay (ms) | Delay between clipboard write and simulated paste; increase it if text is not pasted correctly |
| Context Window Size | Maximum number of tokens the model can process per translation. Higher values support longer input text but consume more GPU memory; the default of 4096 is sufficient for most use cases |
| GPU Acceleration Layers | Number of model layers to offload to the GPU |
| Temperature | Model creativity (lower = more deterministic) |
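For illustration only, the settings above map naturally onto a flat key-value file such as the JSON sketch below. The file name, key names, and values are assumptions, not Ewa's actual settings schema; change these values through the Settings window rather than by hand:

```json
{
  "Modifiers": "Ctrl+Alt",
  "Key": "E",
  "KeepTranslationInClipboard": false,
  "InjectionDelayMs": 150,
  "ContextWindowSize": 4096,
  "GpuAccelerationLayers": 33,
  "Temperature": 0.2
}
```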
Right-click the Ewa icon in the system tray and select Exit Ewa.
## License

This project is licensed under the MIT License. See the LICENSE file for details.



