vdavid
diff --git a/AGENTS.md b/AGENTS.md
Lines changed: 3 additions & 0 deletions
diff --git a/Cargo.lock b/Cargo.lock
Lines changed: 34 additions & 0 deletions
diff --git a/apps/desktop/knip.json b/apps/desktop/knip.json
Lines changed: 2 additions & 7 deletions
diff --git a/apps/desktop/src-tauri/Cargo.toml b/apps/desktop/src-tauri/Cargo.toml
Lines changed: 1 addition & 0 deletions
diff --git a/apps/desktop/src-tauri/src/ai/CLAUDE.md b/apps/desktop/src-tauri/src/ai/CLAUDE.md
Lines changed: 12 additions & 1 deletion
diff --git a/apps/desktop/src-tauri/src/ai/manager.rs b/apps/desktop/src-tauri/src/ai/manager.rs
Lines changed: 143 additions & 2 deletions
diff --git a/apps/desktop/src-tauri/src/lib.rs b/apps/desktop/src-tauri/src/lib.rs
Lines changed: 2 additions & 0 deletions
diff --git a/apps/desktop/src/lib/ai/AiNotification.test.ts b/apps/desktop/src/lib/ai/AiNotification.test.ts
Lines changed: 7 additions & 1 deletion
diff --git a/apps/desktop/src/lib/ai/ai-toast-sync.svelte.ts b/apps/desktop/src/lib/ai/ai-toast-sync.svelte.ts
Lines changed: 13 additions & 0 deletions
diff --git a/apps/desktop/src/lib/settings/CLAUDE.md b/apps/desktop/src/lib/settings/CLAUDE.md
Lines changed: 17 additions & 5 deletions
@@ -126,6 +126,9 @@ There are two MCP servers available to you:
   disabilities.
 - All actions longer than, say, 1 second should be immediately cancelable, canceling not just the UI but any background
   processes as well, to avoid wasting the user's resources.
+- Write _elegant_ code. Not quick code, not overengineered code, but elegant code. If you need to choose between a small
+  refactor that leads to a slightly better architecture or a larger refactor that leads to a near-perfect architecture,
+  choose the larger refactor.
 - When shortcuts are available for a feature, always display the shortcut in a tooltip or somewhere, less prominent than
   the main UI.
 - **Platform-native, not generic.** The app should look and feel as if it was specifically made for the user's OS. Never
 
@@ -1,16 +1,11 @@
 {
     "$schema": "https://unpkg.com/knip@5/schema.json",
     "ignoreBinaries": ["only-allow"],
-    "ignore": [
-        "src/routes/+layout.ts",
-        "src/lib/tauri-commands/**",
-        "src/lib/indexing/index-events.ts",
-        "src/lib/indexing/index-priority.ts"
-    ],
+    "ignore": ["src/lib/tauri-commands/**"],
     "ignoreDependencies": [
+        "oxlint",
         "@tauri-apps/cli",
         "@testing-library/svelte",
-        "@sveltejs/adapter-static",
         "prettier-plugin-svelte",
         "@wdio/globals",
         "@wdio/local-runner",
 
@@ -76,6 +76,7 @@ lru = "0.16"
 xattr = "1"
 filetime = "0.2"
 exacl = "0.12"
+sysinfo = { version = "0.38.4", default-features = false, features = ["system"] }
 
 [target.'cfg(unix)'.dependencies]
 libc = "0.2"
 
@@ -21,7 +21,7 @@ Three provider modes:
 
 ### Tauri commands
 
-Core: `get_ai_status`, `get_ai_model_info`, `get_ai_runtime_status`, `configure_ai`, `start_ai_server`, `stop_ai_server`, `start_ai_download`, `cancel_ai_download`, `get_folder_suggestions`.
+Core: `get_ai_status`, `get_ai_model_info`, `get_ai_runtime_status`, `configure_ai`, `start_ai_server`, `stop_ai_server`, `check_ai_connection`, `start_ai_download`, `cancel_ai_download`, `get_folder_suggestions`.
 Legacy (still wired, used by toast): `uninstall_ai`, `dismiss_ai_offer`, `opt_out_ai`, `opt_in_ai`, `is_ai_opted_out`.
 
 ## Startup flow
@@ -48,6 +48,17 @@ Frontend loads
 - `local` -> uses local llama-server (if running)
 - `openai-compatible` -> builds `AiBackend::OpenAi` from stored config, calls `chat_completion`
 
+## Download/install event sequence
+
+`do_download()` emits events for each install step so the frontend can show progress:
+1. `ai-extracting` -- binary extraction from bundled archive (usually instant)
+2. `ai-download-progress` (repeated) -- model download with bytes/total/speed/eta
+3. `ai-verifying` -- file size verification after download completes
+4. `ai-installing` -- server startup begins (health check polling)
+5. `ai-install-complete` -- server is healthy and ready
+
+The frontend (`AiSection.svelte`) tracks `installStep` state and displays "Step N of 4" labels.
+
 ## Key patterns
 
 - Two install flags: `AiState.installed` AND `AiState.model_download_complete` -- both must be true.
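Reviewer sketch (not part of the diff): one way the frontend could turn the event sequence documented in this hunk into "Step N of 4" labels. The event names come from the list; the `installStepLabel` helper, the lookup table, and the choice to show no label once `ai-install-complete` arrives are assumptions for illustration, not the actual `AiSection.svelte` code.

```typescript
// Event names are taken from the documented sequence; everything else here
// (type names, table, helper) is hypothetical.
type InstallEvent =
    | 'ai-extracting'
    | 'ai-download-progress'
    | 'ai-verifying'
    | 'ai-installing'
    | 'ai-install-complete'

const TOTAL_STEPS = 4

// Maps each in-progress event to its 1-based step. Completion has no step
// of its own (assumption), so it maps to null.
const STEP_BY_EVENT: Record<InstallEvent, number | null> = {
    'ai-extracting': 1,
    'ai-download-progress': 2,
    'ai-verifying': 3,
    'ai-installing': 4,
    'ai-install-complete': null,
}

// Returns the label to display, or null once the install has finished.
function installStepLabel(event: InstallEvent): string | null {
    const step = STEP_BY_EVENT[event]
    return step === null ? null : `Step ${step} of ${TOTAL_STEPS}`
}
```

Keeping the mapping in a single table means a new install step only needs one new entry and a bumped `TOTAL_STEPS`.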
 
@@ -199,8 +199,13 @@ pub fn cancel_ai_download() {
 }
 
 /// Uninstalls the AI model and binary, resets state.
+/// Async because `stop_process` can block up to 5 seconds (SIGTERM + wait + SIGKILL).
 #[tauri::command]
-pub fn uninstall_ai() {
+pub async fn uninstall_ai() {
+    tauri::async_runtime::spawn_blocking(uninstall_ai_sync).await.ok();
+}
+
+fn uninstall_ai_sync() {
     let mut manager = MANAGER.lock_ignore_poison();
     if let Some(ref mut m) = *manager {
         // Stop server if running
@@ -359,6 +364,30 @@ pub fn get_ai_runtime_status() -> AiRuntimeStatus {
     }
 }
 
+/// System memory info returned to frontend for the RAM gauge.
+#[derive(Debug, Clone, serde::Serialize)]
+#[serde(rename_all = "camelCase")]
+pub struct SystemMemoryInfo {
+    pub total_bytes: u64,
+    /// Memory actively used by processes (app + wired + compressed on macOS).
+    pub used_bytes: u64,
+    /// Memory available for new allocations (free + inactive + purgeable on macOS).
+    pub available_bytes: u64,
+}
+
+/// Returns system memory info (total, used by processes, and available).
+/// Uses the `sysinfo` crate for cross-platform accuracy.
+#[tauri::command]
+pub fn get_system_memory_info() -> SystemMemoryInfo {
+    let mut sys = sysinfo::System::new();
+    sys.refresh_memory();
+    SystemMemoryInfo {
+        total_bytes: sys.total_memory(),
+        used_bytes: sys.used_memory(),
+        available_bytes: sys.available_memory(),
+    }
+}
+
 /// Stores provider + context size + OpenAI config in manager state.
 /// If provider is `local` and model is installed and hardware is supported, starts the server
 /// in a background task. If provider is NOT `local` and a server is running, stops it.
@@ -497,6 +526,116 @@ pub fn start_ai_server<R: Runtime>(app: AppHandle<R>, ctx_size: u32) -> Result<(
     Ok(())
 }
 
+/// Result of checking connectivity to an AI API endpoint.
+#[derive(Debug, Clone, serde::Serialize)]
+#[serde(rename_all = "camelCase")]
+pub struct AiConnectionCheckResult {
+    pub connected: bool,
+    pub auth_error: bool,
+    pub models: Vec<String>,
+    pub error: Option<String>,
+}
+
+/// Checks connectivity to an AI API endpoint by calling GET {base_url}/models.
+/// Returns connection status, auth status, and available model list.
+#[tauri::command]
+pub async fn check_ai_connection(base_url: String, api_key: String) -> AiConnectionCheckResult {
+    let url = format!("{}/models", base_url.trim_end_matches('/'));
+
+    let client = match reqwest::Client::builder()
+        .timeout(std::time::Duration::from_secs(10))
+        .build()
+    {
+        Ok(c) => c,
+        Err(e) => {
+            return AiConnectionCheckResult {
+                connected: false,
+                auth_error: false,
+                models: vec![],
+                error: Some(format!("Can't create HTTP client: {e}")),
+            };
+        }
+    };
+
+    let mut request = client.get(&url);
+    if !api_key.is_empty() {
+        request = request.header("Authorization", format!("Bearer {api_key}"));
+    }
+
+    let response = match request.send().await {
+        Ok(r) => r,
+        Err(e) => {
+            let msg = if e.is_timeout() {
+                String::from("Can't reach server (timed out)")
+            } else if e.is_connect() {
+                String::from("Can't reach server")
+            } else {
+                format!("Can't reach server: {e}")
+            };
+            return AiConnectionCheckResult {
+                connected: false,
+                auth_error: false,
+                models: vec![],
+                error: Some(msg),
+            };
+        }
+    };
+
+    let status = response.status();
+
+    if status == reqwest::StatusCode::UNAUTHORIZED || status == reqwest::StatusCode::FORBIDDEN {
+        return AiConnectionCheckResult {
+            connected: true,
+            auth_error: true,
+            models: vec![],
+            error: Some(String::from("API key is invalid")),
+        };
+    }
+
+    if status == reqwest::StatusCode::OK {
+        let body = response.text().await.unwrap_or_default();
+        // Try parsing OpenAI-style response: { "data": [{ "id": "model-name" }, ...] }
+        let models = parse_model_ids(&body);
+        return AiConnectionCheckResult {
+            connected: true,
+            auth_error: false,
+            models,
+            error: None,
+        };
+    }
+
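+    // Other HTTP error. Truncate by chars, since byte-slicing at 200 can panic inside a multi-byte character.
+    let body = response.text().await.unwrap_or_default();
+    let body_preview = if body.chars().count() > 200 {
+        format!("{}...", body.chars().take(200).collect::<String>())
+    } else {
+        body
+    };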
+    AiConnectionCheckResult {
+        connected: true,
+        auth_error: false,
+        models: vec![],
+        error: Some(format!("HTTP {status}: {body_preview}")),
+    }
+}
+
+/// Parses model IDs from an OpenAI-compatible /models response.
+/// Returns empty vec on parse failure (connected but can't list models).
+fn parse_model_ids(body: &str) -> Vec<String> {
+    #[derive(serde::Deserialize)]
+    struct ModelsResponse {
+        data: Vec<ModelEntry>,
+    }
+    #[derive(serde::Deserialize)]
+    struct ModelEntry {
+        id: String,
+    }
+
+    serde_json::from_str::<ModelsResponse>(body)
+        .map(|r| r.data.into_iter().map(|m| m.id).collect())
+        .unwrap_or_default()
+}
+
 /// Formats bytes as GB with one decimal place (like "4.3 GB").
 fn format_bytes_gb(bytes: u64) -> String {
     let gb = bytes as f64 / 1_000_000_000.0;
@@ -633,6 +772,7 @@ async fn do_download<R: Runtime>(app: &AppHandle<R>) -> Result<(), String> {
     // Step 1: Extract llama-server from bundled archive (instant, no download needed)
     let binary_path = ai_dir.join(LLAMA_SERVER_BINARY);
     if !binary_path.exists() {
+        let _ = app.emit("ai-extracting", ());
         extract_bundled_llama_server(app, &ai_dir)?;
     }
 
@@ -660,7 +800,8 @@ async fn do_download<R: Runtime>(app: &AppHandle<R>) -> Result<(), String> {
 
     download_file(app, model.url, &model_path, is_cancel_requested).await?;
 
-    // Verify download integrity by checking file size
+    // Step 3: Verify download integrity by checking file size
+    let _ = app.emit("ai-verifying", ());
     let actual_size = fs::metadata(&model_path)
         .map(|m| m.len())
         .map_err(|e| format!("Failed to read downloaded model file: {e}"))?;
 
@@ -823,6 +823,8 @@ pub fn run() {
             ai::manager::configure_ai,
             ai::manager::start_ai_server,
             ai::manager::stop_ai_server,
+            ai::manager::check_ai_connection,
+            ai::manager::get_system_memory_info,
             ai::manager::start_ai_download,
             ai::manager::cancel_ai_download,
             ai::manager::dismiss_ai_offer,
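Reviewer sketch (not part of the diff): how a frontend caller might interpret the value returned by the newly registered `check_ai_connection` command. The interface mirrors the camelCase serialization of the Rust `AiConnectionCheckResult` struct; the `classifyConnection` helper and the extra `'http-error'` status are assumptions for illustration.

```typescript
// Shape of the command result as serialized with `rename_all = "camelCase"`.
// A Rust `Option<String>` that is `None` arrives as `null`.
interface AiConnectionCheckResult {
    connected: boolean
    authError: boolean
    models: string[]
    error: string | null
}

// 'http-error' is a hypothetical fourth status for non-auth HTTP failures;
// the other three correspond to the statuses the settings UI distinguishes.
type ConnectionStatus = 'connected' | 'auth-error' | 'unreachable' | 'http-error'

// Order matters: an unreachable server never sets authError, and an auth
// failure also carries an error message, so it must be checked first.
function classifyConnection(result: AiConnectionCheckResult): ConnectionStatus {
    if (!result.connected) return 'unreachable'
    if (result.authError) return 'auth-error'
    if (result.error !== null) return 'http-error'
    return 'connected'
}
```

Note that `connected: true` only means the server answered; a 401/403 or any other non-200 response still reports `connected: true` alongside an `error` string.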
 
@@ -13,15 +13,19 @@ vi.mock('./ai-state.svelte', () => ({
 import { getAiState, handleDownload, handleCancel, handleDismiss, handleOptOut, handleGotIt } from './ai-state.svelte'
 import AiToastContent from './AiToastContent.svelte'
 
+type AiNotificationState = 'hidden' | 'offer' | 'downloading' | 'installing' | 'ready' | 'starting'
+
 let mockState = {
-    notificationState: 'hidden' as string,
+    notificationState: 'hidden' as AiNotificationState,
     downloadProgress: null as { bytesDownloaded: number; totalBytes: number; speed: number; etaSeconds: number } | null,
     progressText: '',
     modelInfo: {
         id: 'ministral-3b-instruct-q4km',
         displayName: 'Ministral 3B',
         sizeBytes: 2147023008,
         sizeFormatted: '2.1 GB',
+        kvBytesPerToken: 106496,
+        baseOverheadBytes: 3500000000,
     },
 }
 
@@ -43,6 +47,8 @@ describe('AiToastContent', () => {
                 displayName: 'Ministral 3B',
                 sizeBytes: 2147023008,
                 sizeFormatted: '2.1 GB',
+                kvBytesPerToken: 106496,
+                baseOverheadBytes: 3500000000,
             },
         }
         vi.mocked(getAiState).mockReturnValue(mockState)
 
@@ -0,0 +1,13 @@
+import AiToastContent from './AiToastContent.svelte'
+import { getAiState } from './ai-state.svelte'
+import { addToast, dismissToast } from '$lib/ui/toast'
+
+export function initAiToastSync(): void {
+    $effect(() => {
+        if (getAiState().notificationState === 'hidden') {
+            dismissToast('ai')
+        } else {
+            addToast(AiToastContent, { id: 'ai', dismissal: 'persistent' })
+        }
+    })
+}
@@ -38,19 +38,31 @@ Single source of truth for all settings. Each `SettingDefinition` contains:
 `UpdatesSection`, `ThemesSection`, `AdvancedSection`, `DriveIndexingSection`, `AiSection`, `LicenseSection`.
 
 `AiSection` is a hybrid special section (like `LicenseSection` above): it combines dynamic runtime state from the
-backend (via `getAiRuntimeStatus()` and Tauri events) with registry settings (`ai.provider`, `ai.openaiApiKey`, etc.).
-It conditionally renders provider-specific content, handles auto-stop/start of the local server on provider switch, and
-debounces context size changes with a 2-second restart delay.
+backend (via `getAiRuntimeStatus()` and Tauri events) with registry settings (`ai.provider`, `ai.cloudProvider`,
+`ai.cloudProviderConfigs`, etc.). It conditionally renders provider-specific content and handles auto-stop/start of the
+local server on provider switch. Context size changes are not auto-applied; the user must click an explicit "Apply"
+button, which triggers a server restart. A RAM gauge (stacked bar) shows memory usage relative to system total, with
+warning icons at >70% and >90% projected usage. System memory info is polled every 5 seconds via
+`get_system_memory_info`. The "Cloud / API" provider mode uses a preset dropdown (`cloud-providers.ts`) with
+per-provider API key storage in a JSON blob (`ai.cloudProviderConfigs`). Old flat settings (`ai.openaiApiKey`,
+`ai.openaiBaseUrl`, `ai.openaiModel`) are migrated on first load. The Cloud/API section includes a two-step connection
+check (`check_ai_connection` Tauri command) that auto-triggers on API key or base URL changes (1s debounce), fetches
+available models from the `/models` endpoint, and shows connection status (connected, auth error, unreachable). When
+models are available, the Model field becomes a combobox with filtered dropdown; otherwise it's a plain text input.
 
 ### Components (`components/`)
 
 11 reusable setting UI primitives used by section components: `SettingsSection` (wrapper providing shared section title
 and action button styles), `SettingRow`, `SettingSwitch`, `SettingSelect`, `SettingSlider`, `SettingNumberInput`,
-`SettingPasswordInput`, `SettingRadioGroup`, `SettingToggleGroup`, `SettingsSidebar`, `SettingsContent`. Also
-`SectionSummary` for collapsed-section previews.
+`SettingPasswordInput` (supports both settings-store-driven and controlled/external value+onchange modes),
+`SettingRadioGroup`, `SettingToggleGroup`, `SettingsSidebar`, `SettingsContent`. Also `SectionSummary` for
+collapsed-section previews.
 
 ### Other files
 
+- **cloud-providers.ts** — Cloud provider preset definitions (OpenAI, Anthropic, Groq, etc.) and per-provider config
+  helpers (`getProviderConfigs`, `setProviderConfig`, `resolveCloudConfig`). Used by `AiSection` and the startup flow in
+  `+layout.svelte` to resolve the effective API key, base URL, and model before calling `configureAi`.
 - **settings-search.ts** — Fuzzy search over setting definitions; returns ranked matches with highlight ranges
 - **settings-applier.ts** — Listens for setting changes and applies side effects (CSS vars, backend config sync)
 - **network-settings.ts** — Network-specific setting helpers (proxy config, SMB auth defaults)
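Reviewer sketch (not part of the diff): a possible shape for the RAM gauge warning logic described in the `AiSection` paragraph. The >70% and >90% thresholds come from that description, and the field names follow `SystemMemoryInfo` and the model info used in the tests. The projection formula itself (current usage plus base overhead plus KV cache for the chosen context size) is an assumption, as are all function names; the real component may, for example, subtract memory already held by a running server.

```typescript
// camelCase view of the Rust `SystemMemoryInfo` struct.
interface SystemMemoryInfo {
    totalBytes: number
    usedBytes: number
    availableBytes: number
}

// Subset of the model info fields that appear in the test mocks.
interface ModelMemoryProfile {
    kvBytesPerToken: number
    baseOverheadBytes: number
}

type WarningLevel = 'none' | 'warning' | 'critical'

// ASSUMPTION: projected usage = memory in use now + model base overhead
// + KV cache for the configured context size, relative to system total.
function projectedUsageRatio(mem: SystemMemoryInfo, model: ModelMemoryProfile, ctxSize: number): number {
    if (mem.totalBytes <= 0) return 0
    const aiBytes = model.baseOverheadBytes + model.kvBytesPerToken * ctxSize
    return (mem.usedBytes + aiBytes) / mem.totalBytes
}

// Thresholds are strict: exactly 70% or 90% does not trigger the next level.
function warningLevel(ratio: number): WarningLevel {
    if (ratio > 0.9) return 'critical'
    if (ratio > 0.7) return 'warning'
    return 'none'
}
```

With the 5-second polling described above, a caller would recompute `warningLevel(projectedUsageRatio(...))` on each fresh `get_system_memory_info` result and whenever the pending context size changes.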