v2026.05.22 by stephamie7 · Pull Request #312 · AgentFlocks/flocks

stephamie7 · 2026-05-22T10:07:50Z

No description provided.

Wire the OneSEC, NGTIP, and Qingteng API handlers to respect the configured SSL verification toggle so private deployments can bypass certificate checks when needed. Add regression tests that cover both enabled and disabled verification paths and isolate Qingteng tests from local machine config.

Introduce a unified entry skill for all `onesig_*` tools, mirroring the style of `onesec-use` / `qingteng-use`. Any task that mentions OneSIG / SIG / Secure Internet Gateway must load this skill first instead of calling the tools directly.

- SKILL.md: API-vs-browser decision flow and write-action confirmation protocol; declares the skill as the single decision entry-point.
- references/api-reference.md: action routing table for the six grouped tools (`onesig_login` / `assets` / `device` / `helper` / `monitoring` / `strategy`), business primary keys (uniqueId / uid / pid / groupId / ruleId / srcIp / server+port, etc.), high-frequency call examples, binary/file endpoints (`document_preview`, multipart uploads), the RSA-OAEP auto-encrypted fields (`password`, `dupPassword`), the mandatory `type=physical` flag for API Key endpoints, and the fact that `ips_rule_create` is actually a query.
- references/browser-workflow.md: console navigation map and `agent-browser` operating rules for the fallback path when the API is unavailable.

The reference has been cross-checked against the vendor SIGWEBAPI docs and `onesig.handler.py`, fixing legacy drifts such as assetType -> type, name -> username, fileName -> id, port_protect_group_port_list -> port_protect_port_list, and the top-level placement of `condition` / `comments` in `whitelist_add`.

Made-with: Cursor

…pi_prefix Three interlocking changes that make OneSIG v2.5.x easier to bring online out of the box and align SSL / cookie behaviour with the other built-in providers (onesec / ngtip / qingteng). - Persistent login session: after a successful login the aiohttp CookieJar is serialised into `~/.flocks/config/.secret.json` under a key shaped like `onesig_session_cookie__<sha1[:12]>`. On process restart, if at least one cookie is still alive the jar is rehydrated in-place and the captcha -> pubkey -> /v3/login -> /v3/account chain is skipped entirely; the very first business request that returns 401 / responseCode 1019..1022 still triggers exactly one auto re-login to preserve previous behaviour. `logout()` wipes the on-disk entry, `close()` does not. The `persist_cookies` toggle defaults to True and honours camelCase / `custom_settings` / `ONESIG_PERSIST_COOKIES` fallbacks. - SSL verification defaults to OFF, matching onesec / ngtip / qingteng. Five sources are recognised in priority order: `verify_ssl`, `ssl_verify`, `verifySsl`, `custom_settings.verify_ssl`, and the `ONESIG_VERIFY_SSL` env var, falling back to False. The `verify_ssl` field is removed from `provider.credential_fields` so the global "SSL verify" switch on `ServiceDetailPanel` becomes the single source of truth. - `api_prefix` default changes from `"/api"` to `""`, matching the common v2.5.x deployments where nginx already routes `/v3/...` to the backend. Reverse-proxy deployments can set it back to `"/api"`. `_provider.yaml` notes are updated with the 404 -> flip-prefix troubleshooting tip. `tests/tool/test_onesig_api_tool.py` covers: the three `verify_ssl` aliases ending up as the right `ssl=` argument on `aiohttp.session.request`, cookie snapshot purity (round-trip and expired-cookie filtering), `__init__` load / `login()` save / `logout()` delete, and the persisted-cookie path bypassing the full captcha -> pubkey -> login -> account chain. Made-with: Cursor
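The SSL precedence described above can be sketched as follows. This is a minimal illustration and not the handler's actual code: the function name `resolve_verify_ssl`, the flat `config` mapping, and the accepted truthy/falsy strings are assumptions; only the five sources, their order, and the `False` fallback come from the commit message.

```python
import os
from typing import Any, Mapping, Optional

_TRUE = {"1", "true", "yes", "on"}    # assumed spellings, not taken from the handler
_FALSE = {"0", "false", "no", "off"}


def _to_bool(value: Any) -> Optional[bool]:
    """Coerce a config value to bool; None means 'unset or unrecognised'."""
    if isinstance(value, bool):
        return value
    if value is None:
        return None
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    return None


def resolve_verify_ssl(config: Mapping[str, Any],
                       env: Optional[Mapping[str, str]] = None) -> bool:
    """Return the first recognised value among the five sources, else False."""
    env = os.environ if env is None else env
    custom = config.get("custom_settings")
    if not isinstance(custom, Mapping):
        custom = {}
    candidates = (
        config.get("verify_ssl"),
        config.get("ssl_verify"),
        config.get("verifySsl"),
        custom.get("verify_ssl"),
        env.get("ONESIG_VERIFY_SSL"),
    )
    for candidate in candidates:
        parsed = _to_bool(candidate)
        if parsed is not None:
            return parsed
    return False  # verification is off unless something opts in
```

With aiohttp, a resolved `False` would typically be passed as `ssl=False` on the request, while `True` would leave the default certificate checks in place.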

Some providers (onesig in particular) reuse the persisted `base_url` as `default_value` on the metadata endpoint. The previous reload logic in `ServiceDetailPanel` treated `value === effectiveDefault` as a "placeholder" case and cleared the input, so reopening a configured service showed an empty API URL and saving immediately overwrote the backend record with an empty string. The form now renders whatever the backend returns under `fields` (falling back to legacy keys only) and never compares against `default_value`. `ServiceDetailPanelApi.test.tsx` adds a regression test where metadata returns the same value in both `default_value` and `fields`, asserting the form still shows the persisted URL. Made-with: Cursor

feat/onesig api integration

…irmware Older OneSIG v2.5 builds reject RSA-OAEP ciphertext on POST /v3/login and only accept the raw password, so we ship a sibling plugin (registered as service_id `onesig_v2_5_older_api`) that mirrors the standard `onesig` handler with one targeted change: the captcha → pubkey → encrypt → login chain is collapsed to captcha → login, sending `self.config.password` in the clear. All other paths — cookie session, persistence, captcha / TOTP fallback, 401 / 1019..1022 auto-relogin, and RSA-OAEP encryption of sensitive *non-login* write fields (change_password, user_create, user_delete, audit-log purge, interface password, device-upgrade password) — are kept identical to the encrypted variant so the two can coexist on the same flocks instance without rewriting business code. Namespacing keeps the two plugins isolated end-to-end: SERVICE_ID, secret_id (`onesig_v2_5_older_password`), persisted-cookie prefix (`onesig_v2_5_older_session_cookie__`), output filename prefix, and env vars (`ONESIG_V2_5_OLDER_*`, with `ONESIG_*` retained as fallback) are all distinct, so the persisted cookie jars and credentials of the new and legacy plugins never clobber each other. The skill (`plugins/skills/onesig-use`) and existing tests are left untouched — they continue to document the standard encrypted flow. Made-with: Cursor

…ount lockout

The "Test connectivity" button looped over `service_tools[:5]` and ran `_build_param_sets()` (4 to 6 enum-driven actions per tool) for each one. For session-based services (OneSIG / OneSec / Qingteng / ...) every failed attempt triggers a fresh login round-trip, so a single click on wrong credentials could fire ~30 failed logins and trip the server-side account lockout.

Changes

* Add `_is_action_dispatch_login_probe()`: classifies tools whose name ends in `_login` and whose required `action` parameter is an enum (e.g. `onesig_login`, `onesec_login`, `onesig_v2_5_older_login`).
* `_tool_sort_key()` now puts those at the top alongside parameter-free login probes (priority -1).
* Replace the nested probe loop with a **single tool, single param set** call. For action-dispatch login tools we force `action="test"`, which the handler's `_dispatch_group` special-cases to a read-only call from `_CONNECTIVITY_TEST_ACTIONS[group]` (onesig → `get_account`, onesec → `common_threat_type_list`, ...). No `login` / `logout` / `change_password` / `get_pubkey` enum value is ever invoked.
* Surface the lockout-prevention rationale in the failure message so users know a single failed probe is not a definitive verdict.

Tradeoff: a non-auth failure on the chosen probe (e.g. a synthesized param value the endpoint rejects) shows as a false negative; the user re-tests after fixing config, which is strictly better than discovering the service account got locked.

Tests

* Replace `test_failed_attempts_are_aggregated_in_message` (which asserted the removed multi-attempt aggregation) with `test_single_attempt_only_to_avoid_account_lockout`, asserting `execute.await_count == 1` and the new "为避免连续失败导致账号锁定" ("to avoid account lockout caused by consecutive failures") message.
* New `test_action_dispatch_login_tool_uses_test_action` pins the contract that `onesig_login`-style tools are picked over their sibling `_assets` / `_monitoring` groups and invoked with exactly `{"action": "test"}`.

Made-with: Cursor
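The single-probe selection can be illustrated with a short sketch. The tool shape used here (a dict with a JSON-schema-like `parameters` block) and the helper names without leading underscores are assumptions for illustration; the real `_is_action_dispatch_login_probe()` and `_tool_sort_key()` may inspect different structures and handle more cases (such as parameter-free login probes).

```python
from typing import Any, Dict, List, Tuple

Tool = Dict[str, Any]  # hypothetical shape: {"name": ..., "parameters": {...}}


def is_action_dispatch_login_probe(tool: Tool) -> bool:
    """True for `*_login` tools whose required `action` parameter is an enum."""
    if not tool["name"].endswith("_login"):
        return False
    schema = tool.get("parameters", {})
    action = schema.get("properties", {}).get("action", {})
    return "action" in schema.get("required", []) and bool(action.get("enum"))


def tool_sort_key(tool: Tool) -> Tuple[int, str]:
    """Login probes sort first (priority -1); everything else keeps name order."""
    return (-1 if is_action_dispatch_login_probe(tool) else 0, tool["name"])


def pick_probe(tools: List[Tool]) -> Tuple[str, Dict[str, str]]:
    """Choose exactly one tool and one parameter set for the connectivity test."""
    chosen = min(tools, key=tool_sort_key)
    params = {"action": "test"} if is_action_dispatch_login_probe(chosen) else {}
    return chosen["name"], params
```

Because only one call is made, a wrong password costs at most one failed login per click, at the price of the false-negative tradeoff the commit message describes.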

Avoid rendering generic MCP tools with category fields as delegate task cards, while preserving child-session links and URL-driven session selection behavior.

…pdate (#198)

- Add normalizeNpmRegistry and resolveAgentBrowserNpmRegistry helpers
- Pass explicit registry via npm_config_registry when spawning npm update -g
- Refresh bundled agent-browser core skill content and add trust-boundaries ref
- Add unit tests for registry normalization and resolution precedence

* add post-login notification modal
* fix notification dismissal behavior

…ies (#194)

- Default WebUI to same-origin /api proxy for non-loopback backends; opt in to direct VITE_* URLs via FLOCKS_WEBUI_DIRECT_BACKEND_URLS.
- Resolve session from cookie early in apply_auth_for_request; broaden browser-like detection for SSE/reverse-proxy (session cookie, Mozilla UA).
- Enable xfwd on Vite dev proxies so X-Forwarded-* headers reach the backend.
- Document LAN/reverse-proxy behavior in README; extend CLI and auth tests.

* feat(tdp-api): semantic tool params and handler mappings - Expand TDP tool YAML schemas with top-level filters, keyword, pagination - Extend tdp.handler.py to map params to condition/page/fuzzy for APIs - Update tdp-use skill and api-reference for preferred calling patterns - Add Skyeye API plugin regression test * feat(tdp): semantic tool params, log SQL guard, handler fixes - Extend TDP YAML tools and api-reference for clearer agent-facing params - Handler: interface risk condition defaults, disposal_log_list action, incident timeline show_attack, log_search default sql + reject full SQL - service_asset_list: document time range as N/A for inventory APIs - Tests: tdp_api_tools, skyeye plugins; add integration live config test

- Increase file read defaults: 2000 lines, 2000 chars/line, 20 KB cap
- Raise registry truncation: 1000 lines, 100 KB, 100k hard max chars
- Add tests for read limits and truncation constants
- Align skill tool tests with registry defaults

…tion (#201) Emit llm before/after hook stages around model calls and expose normalized query metadata. Improve tool discovery with canonical alias matching and select:name batching so known tools can be loaded deterministically in one call.

- Persist messages under message:<session_id> and message_parts:<session_id> instead of per-message keys so WebUI/Message.list_with_parts work.
- Normalize legacy exported parts (tool state, reasoning time, etc.).
- Add tests for aggregated storage format and legacy reasoning payloads.

Align the skill identifier and header with the agent-browser naming, and remove outdated quickstart/reference text to keep documentation focused on current usage.

…ks (#205)

- Restart loads the last recorded backend/webui host and port when CLI and env omit them; CLI and env still override runtime defaults.
- On Windows, validate tracked PIDs against command line and image name so stale PID reuse does not kill unrelated processes or skip cleanup.
- Add tests for restart defaults and Windows PID reuse scenarios.

Wires the QAX NGSOC-BD/NGSOC-LV V4.0 (R4.15.1) WebAPI into the flocks plugin framework, modelled after the onesig integration but simplified for NGSOC's static NGSOC-Access-Token header auth (no captcha / cookie / RSA negotiation). Coverage (manual sections 5.1 - 5.8): - alarms 14 actions (list / detail / dispose / PCAP export / AI judge SSE / judgment record) - assets 3 actions (asset detail, group id list, group list) - vuls 5 actions (per-asset vuls + config, vul / web / weak-pwd lists) - risks 2 actions (asset risk list, per-asset risk score) - users 1 action (account nicknames) - workorders 3 actions (status update, list, detail) - bigscreens 6 actions (vuln top, asset top, threat type, attack ip, attack list, victim survey) - storage 1 action (binary download persisted under ~/.flocks/workspace/outputs) Highlights of the handler design: - Declarative ActionSpec routing (rest_keys / query_keys / body_keys / passthrough_query / passthrough_body / binary / accept) shared by every group entry point; method-agnostic passthrough_query so POST endpoints that mix body + query (e.g. /risks/asset/asset-risks) ship every field correctly. - Long-lived aiohttp.ClientSession per device with double-checked locking; SSL verify defaults to False (private appliance reality) with verify_ssl / ssl_verify / verifySsl / NGSOC_VERIFY_SSL all honored for parity with sibling api plugins. - Boolean query params lowercased to "true" / "false" (NGSOC's Spring backend treats "True" as false) while JSON bodies keep native Python booleans. - Binary downloads prefer the user's fileName query param (sanitized against ../ traversal and Windows-style paths) and fall back to a Content-Type-derived extension. - action="test" routes to a no-arg connectivity probe per group (users / bigscreens / assets) for fast credential validation. - /app-ai-alarm-judgment/judge SSE response is surfaced as a raw event_stream string instead of forcing JSON envelope parsing. 
Tests: 67 pytest cases covering SSL precedence, token injection, URL composition + REST placeholder validation, ActionSpec request building, required-param validation, envelope unwrapping, binary persistence, group dispatch (action="test", unknown action), boolean coercion, body / query field separation, filename sanitization, and YAML manifest loading for all 8 groups. Made-with: Cursor
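Two of the handler behaviours above, lowercase boolean query params and filename sanitization, lend themselves to a small sketch. The function names `encode_query` and `sanitize_filename`, the `download.bin` fallback, and the choice to drop `None` values are assumptions for illustration; the real handler may differ in detail (for example, it derives the fallback extension from Content-Type).

```python
import re
from typing import Any, Dict


def encode_query(params: Dict[str, Any]) -> Dict[str, str]:
    """Stringify query params, rendering booleans as lowercase 'true'/'false'.

    str(True) would produce 'True', which the commit message says NGSOC's
    Spring backend treats as false, so booleans get special handling.
    """
    encoded: Dict[str, str] = {}
    for key, value in params.items():
        if value is None:
            continue  # assumption: unset params are omitted from the query string
        if isinstance(value, bool):
            encoded[key] = "true" if value else "false"
        else:
            encoded[key] = str(value)
    return encoded


def sanitize_filename(name: str, fallback: str = "download.bin") -> str:
    """Keep only the final path component, treating both / and \\ as separators."""
    base = re.split(r"[\\/]", name)[-1].strip()
    if base in ("", ".", ".."):
        return fallback
    return base
```

JSON bodies need no such treatment because a JSON encoder already serialises Python booleans as `true` / `false`.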

- Rename plugin dir sangfor_sip -> sangfor_sip_v92 (avoid `.` in dir name; service_id stays "sangfor_sip" so existing config & secrets continue to work unchanged).
- Add `version: "9.2"` and `defaults.product_version: "9.2"` to _provider.yaml, mirroring ngsoc convention.
- Fix login auth: `desc` field now sent as `auth_desc` (was empty string, breaking sha1 signature) and login URL appends `?verify=false` per 92-version spec.
- Correct endpoint names to lowercase per 92-version spec: riskBusiness->riskbusiness, secEvent->riskevent, weakPassword->weakpasswd, vulInfo->hole, plainTextInfo->plaintexttransmission.
- Add risk_terminal action (/data/riskterminal) and expose it via sangfor_sip_risk.yaml `action` parameter.
- Cap maxCount in _fetch_data: 10000 normal / 5000 vulnerability; align assets.yaml default & description accordingly.
- Fix terminal classfy1_id description: 2 -> 2,7,8.

- Add `_resolve_verify_ssl` helper that reads `verify_ssl` / `ssl_verify` / `custom_settings.verify_ssl` (matches the onesec/qingteng/ngtip pattern from PR #193). The bottom "SSL 验证" ("SSL verification") form toggle writes to `custom_settings.verify_ssl`, which the handler previously ignored, so the toggle had no effect at runtime.
- Drop the standalone `verify_ssl` credential_field from _provider.yaml so the WebUI no longer renders a duplicate text input next to the bottom SSL toggle.
- Refresh _provider.yaml notes to describe the new SSL toggle resolution precedence.

fix/sangfor sip endpoints

…ription) Surface the optional `version` field declared in `_provider.yaml` (e.g. `sangfor_sip_v92` → 9.2) to both the WebUI and the agent-facing tool schema, so operators see which upstream API version a service binds to and the model can pick version-appropriate parameters at call time. Backend: - Add reusable `extract_provider_version(provider_cfg)` helper that reads top-level `version`, falling back to `defaults.product_version` / `defaults.version`, and coerces to str (handles YAML floats). - `ToolInfo` gains optional `provider_version`; `yaml_to_tool` fills it from the loaded `_provider.yaml`. - `_load_provider_yaml_metadata` and `_build_api_service_summary` use the same helper so `GET /api/provider/{id}/metadata` and `GET /api/provider/api-services` both return `version`. - `APIServiceSummary` adds `version: Optional[str]`. - Session runner appends `\n\n[Provider: <name> | Version: <ver>]` to each tool's description before sending it to the LLM, gated on the tool actually carrying a `provider_version`. Annotation is built by the new module-level `_annotate_with_provider_version` helper to keep it pure and testable; original `ToolInfo` is never mutated. Frontend: - `APIServiceSummary` type gains `version?: string`. - API service card renders a `v9.2` badge next to the existing API tag, with `^v` prefix stripped to avoid a `vv9.2` double-prefix when an operator writes `version: "v9.2"` in the YAML. - Detail panel already supported `metadata?.version` via existing i18n key `serviceInfo.version`, no change needed there. Tests: - 3 new cases in `test_tool_plugin.py` cover top-level `version`, fallback to `defaults.product_version`, and the absent case. - New `tests/session/test_runner_provider_version.py` (7 cases) pins the description-annotation contract: presence/absence of version, empty/None description, missing provider, no mutation, whitespace handling, and tools that don't declare the attr at all (builtin/MCP). Made-with: Cursor
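A rough sketch of the two backend helpers described above follows. The lookup order and the `[Provider: ... | Version: ...]` suffix come from the commit message; the exact signatures, the whitespace trimming, and the behaviour for an empty description are assumptions, and the real `_annotate_with_provider_version` operates on `ToolInfo` objects rather than plain strings.

```python
from typing import Any, Mapping, Optional


def extract_provider_version(provider_cfg: Optional[Mapping[str, Any]]) -> Optional[str]:
    """Read `version`, then `defaults.product_version`, then `defaults.version`."""
    if not provider_cfg:
        return None
    defaults = provider_cfg.get("defaults") or {}
    for raw in (
        provider_cfg.get("version"),
        defaults.get("product_version"),
        defaults.get("version"),
    ):
        if raw is None:
            continue
        text = str(raw).strip()  # YAML may parse `version: 9.2` as a float
        if text:
            return text
    return None


def annotate_with_provider_version(description: Optional[str],
                                   provider: Optional[str],
                                   version: Optional[str]) -> str:
    """Return a new description string; the inputs are never mutated."""
    if not provider or not version or not str(version).strip():
        return description or ""
    suffix = f"[Provider: {provider} | Version: {str(version).strip()}]"
    base = (description or "").rstrip()
    # Assumption: with no description, the annotation stands alone.
    return f"{base}\n\n{suffix}" if base else suffix
```

Returning a fresh string instead of editing the tool record keeps the helper pure, which matches the stated goal of making it easy to test.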

feat(provider): expose service version end-to-end (UI + LLM tool desc…

…ed names Following the `sangfor_sip_v92` precedent, declare the deployed product version on every supported third-party API plugin (top-level `version:` plus `defaults.product_version:` for backward compatibility) and rename the plugin directories to include a `_v<dot-replaced-version>` suffix so the on-disk layout reflects the targeted release. Versions captured: - qingteng -> qingteng_v3_4_1_66 - sangfor_xdr -> sangfor_xdr_v2_2 - ngtip_api -> ngtip_v5_1_5 - onesig -> onesig_v2_5_3_D20260321 - onesec -> onesec_v2_8_2 - tdp_api -> tdp_v3_3_10 - skyeye_api -> skyeye_v4_0_14_0_SP2 `service_id` (and the `provider:` references inside each tool YAML) is intentionally left unchanged so existing `flocks.json` configs and `{secret:*_api_key}` references keep working without migration. Tests that hard-code the old plugin paths are updated accordingly. Made-with: Cursor

- Updater: npm ci when package-lock exists; 300s timeouts with explicit TimeoutExpired handling; retry uv sync without default-index after mirror failure; surface restart/build failure exceptions in UI messages - Server: filter noisy successful polling from request logs; log duration and errors; stream tail reads for large log files - CLI: disable uvicorn access logs (app middleware owns request lines) - WeCom: bridge SDK logger to Flocks and drop debug/info noise - WebUI: increase focus-triggered update check min gap to 10 minutes - Tests: cover log routes, request filters, updater retries and timeouts

Allow multiple versions of the same API product to coexist in flocks.json under distinct `<service_id>_v<version>` storage keys, so updating a plugin to a new version no longer overwrites the previous version's credentials. Core changes: - Add `flocks/config/api_service_versioning.py` with derive/discover helpers, bidirectional legacy<->storage_key resolution, shadowing detection, and an idempotent copy-only migration that backs up flocks.json before its first write. - Promote `info.provider` to the storage_key in `tool_loader` while preserving the unversioned `service_id` on the Tool instance for legacy lookups. - Make `ConfigWriter.get_api_service_raw` version-aware: prefer the versioned shadow when an unversioned id is requested, fall back to the legacy id when a storage_key has no entry yet (covers partially-upgraded environments and isolated tests). Warn on asymmetric writes that target a shadowed legacy id. - Run migration in the lifespan startup before `ToolRegistry.init` so freshly loaded tools observe the post-migration layout. - Hide shadowed legacy entries from the API service listing endpoint to avoid duplicate rows in the WebUI after migration. - Make the provider-route metadata loader resolve a `provider_id` against each candidate `_provider.yaml`'s derived storage_key, not just its directory name. Without this, plugins whose dir was renamed to a shorter form (e.g. `tdp_v3_3_10` for service_id `tdp_api`) returned no metadata to the WebUI. Tests: - New `tests/config/test_api_service_versioning.py` (41 cases) covers derivation, descriptor discovery, legacy resolution, shadowing, migration idempotency + backup, ConfigWriter fallback (incl. the `"api_services": null` defensive path), and the regression where the metadata loader must accept storage_keys whose directory name differs. - Existing tool-plugin tests refreshed for the new `info.provider` values (`tdp_api_v3_3_10`, `qingteng_v3_4_1_66`, etc.) and the versioned plugin directory paths. 
Made-with: Cursor
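The `<service_id>_v<version>` storage-key convention can be sketched in a few lines. The helper name `derive_storage_key` is hypothetical and the real derive helper in `api_service_versioning.py` may normalise versions differently; the sketch only replaces dots with underscores, which is enough to reproduce keys quoted above such as `tdp_api_v3_3_10`.

```python
from typing import Optional


def derive_storage_key(service_id: str, version: Optional[str]) -> str:
    """Build the versioned key under which a service's config is stored.

    Without a declared version the plain service_id is used, which is the
    legacy (unversioned) layout.
    """
    if version is None or not str(version).strip():
        return service_id
    suffix = str(version).strip().replace(".", "_")
    return f"{service_id}_v{suffix}"
```

Because the key embeds the version, installing a newer plugin writes to a new entry instead of overwriting the credentials stored for the previous one.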

…ovider-versions

Add the local Hub catalog, backend install APIs, WebUI browsing experience, and validation coverage so bundled plugins can be discovered and installed globally. Made-with: Cursor

Limit native tool discovery to direct payload files so provider metadata or nested files do not make a tool package appear installable by mistake. Made-with: Cursor

…versioning Centralize the storage-key resolution logic that used to be inlined in ConfigWriter and route handlers. Domain rules now live in a single module with a shorter, more intuitive name. Module rename: - flocks/config/api_service_versioning.py -> flocks/config/api_versioning.py (and the matching tests/config/test_*.py). Aligns length with sibling config.py / config_writer.py. New helpers in api_versioning: - resolve_api_service(service_id, services): three-step shadow / direct / legacy lookup; the only place this rule lives. - warn_if_shadowing_legacy(service_id, services): structured warning when a write targets a legacy id whose versioned shadow already exists. Slimmed call sites: - ConfigWriter.get_api_service_raw shrinks from ~45 lines to ~12, just reads flocks.json then delegates to resolve_api_service. Falls back to a plain dict lookup if the import fails so a versioning bug cannot break credential reads. - ConfigWriter.set_api_service uses setdefault and delegates the shadow-warning to the new helper. Log key renamed to api_service.write.shadowed_legacy. - _load_provider_yaml_metadata in server/routes/provider.py now reuses discover_api_service_descriptors instead of reimplementing the plugin directory walk. Drops ~45 lines and removes the duplicate matching logic between provider.py and api_versioning.py. Cleanups: - Drop is_legacy_shadowed (one-line wrapper around shadowed_legacy_ids with no production callers); tests use the batch API directly. - Tighten _extract_version's tail return. - Drop the verbose null-handling commentary in get_api_service_raw; a single isinstance check covers null / non-dict garbage. Net diff: +125 / -185 across 6 files, 41 versioning tests + 234 directly affected tests still pass, no lint regressions. Made-with: Cursor
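The three-step shadow / direct / legacy lookup might look roughly like the sketch below. It is not the real `resolve_api_service`: the function name `resolve_service_entry` and the explicit `legacy_to_storage` mapping are assumptions made so the example is self-contained (the actual helper takes only `service_id` and `services` and presumably discovers the mapping from plugin descriptors).

```python
from typing import Any, Dict, Mapping, Optional


def resolve_service_entry(service_id: str,
                          services: Any,
                          legacy_to_storage: Mapping[str, str]) -> Optional[Dict[str, Any]]:
    """Find the config entry for `service_id` across versioned and legacy keys."""
    if not isinstance(services, dict):
        return None  # covers null / non-dict garbage under "api_services"

    # 1. Shadow: an unversioned id was requested but a versioned entry exists.
    shadow_key = legacy_to_storage.get(service_id)
    if shadow_key and isinstance(services.get(shadow_key), dict):
        return services[shadow_key]

    # 2. Direct: the requested key is present as-is.
    if isinstance(services.get(service_id), dict):
        return services[service_id]

    # 3. Legacy: a storage_key was requested but only the old id has an entry.
    for legacy_id, storage_key in legacy_to_storage.items():
        if storage_key == service_id and isinstance(services.get(legacy_id), dict):
            return services[legacy_id]
    return None
```

Keeping the rule in one function means callers such as a config reader only need to load the JSON and delegate, which is the shape the commit message describes for `ConfigWriter.get_api_service_raw`.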

- skill: block disabled skills from `delegate_task(load_skills=...)` so the toggle in the Skills UI can no longer be bypassed via subagent injection. Disabled skills are now reported as "not found" to avoid signalling their existence to the LLM. - device: persist `enabled=false` for storage_keys whose last device instance was just removed. `sync_service_tool_state` now accepts a `deleted_storage_keys` hint (used by `route_delete_device`) and also sweeps the api_services config for entries whose backing devices no longer exist, so historical dirty state self-heals on next sync. The startup `_sync_all` additionally walks api_services to pick up service_ids that have zero remaining DB rows. Idempotent: writes are skipped when the config already matches. - agent: invalidate the per-worker agent cache when another uvicorn worker toggles a skill. `Agent.state` now checks the mtime of `skill_settings.json` before serving cached prompts, so a PATCH from one worker is observed across all workers without IPC. Co-authored-by: Cursor <cursoragent@cursor.com>
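The mtime-based invalidation in the last bullet can be sketched as a small cache wrapper. The class name `MtimeGuardedCache`, the injectable `mtime_fn`, and the rebuild callback are illustrative assumptions; the real `Agent.state` logic is not shown in the commit message.

```python
import os
from typing import Any, Callable, Optional


def file_mtime_ns(path: str) -> Optional[int]:
    """Return the file's modification time in nanoseconds, or None if missing."""
    try:
        return os.stat(path).st_mtime_ns
    except FileNotFoundError:
        return None


class MtimeGuardedCache:
    """Serve a cached value until the watched file's mtime changes."""

    def __init__(self, path: str, build: Callable[[], Any],
                 mtime_fn: Callable[[str], Optional[int]] = file_mtime_ns) -> None:
        self._path = path
        self._build = build
        self._mtime_fn = mtime_fn
        self._seen_mtime: Optional[int] = None
        self._value: Any = None
        self._primed = False

    def get(self) -> Any:
        current = self._mtime_fn(self._path)
        if not self._primed or current != self._seen_mtime:
            # Another process (e.g. a sibling worker) touched the file: rebuild.
            self._value = self._build()
            self._seen_mtime = current
            self._primed = True
        return self._value
```

Each worker pays one `stat` call per read, and in exchange no inter-process messaging is needed to learn that a sibling worker changed the settings file.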

Each chat part was wrapped in its own `<div key>`, so the thinking block's `mt-2 first:mt-0` always saw itself as the first child of its wrapper and collapsed the top gap to zero. The result was that a tool card followed by a thinking block (or vice versa) looked glued together, while two consecutive thinkings or two consecutive tools still had 8px between them — visibly uneven spacing. Move the inter-part gap to the wrapper itself (`mt-2 first:mt-0`) and strip the redundant `mt-2` from `ChatToolPart` (`<details>` + waiting- for-answer branch) and the thinking block. With a single source of truth at the wrapper level, every adjacent pair of parts now has a uniform 8px gap and the first part still sits flush with the message header. Co-authored-by: Cursor <cursoragent@cursor.com>

…ebar_and_add_shortcuts Merge `origin/dev` into the PR 131 sidebar/shortcuts branch. Nine files had textual conflicts; their resolution preserves PR 131's disabled-skill fixes and UI redesign while picking up dev's slash-command refactor and skill page throttled refresh. Conflict resolutions (9 files): - flocks/command/handler.py: drop legacy 240-line if-chain; route every direct slash command through `run_direct_command` (dev refactor). - flocks/command/direct.py (non-conflict, paired change): port PR 131's user-vs-agent visibility split to the new entry point. `/skills` shows the full inventory with `[disabled]` markers when called from a user surface (CLI/TUI/WebUI) and stays "enabled-only" when invoked by the agent via the `slash_command` tool. - flocks/session/runner.py: drop the obsolete `_build_system_prompts` method (now centralised in `SessionPrompt.build_system_prompts` on dev); migrate the PR 131 device-asset hint into a small `_build_device_asset_hint` helper, appended after the cached prompt list so live device state never pollutes the prompt cache. Keep PR 131's per-turn `skill` tool description refresh in the schema builder so disabled skills cannot leak into the tool index. - flocks/tool/system/skill.py: keep PR 131's enabled-only description refresh in the wrapper; merge dev's clarifying docstring. - flocks/tool/system/slash_command.py: drop inlined `/skills`/`/workflows` handlers in favour of `run_direct_command` (dev refactor). - webui/src/pages/Tool/index.tsx: use the narrower `refreshToolData()` on enabled-toggle (dev), avoiding redundant MCP refreshes. - webui/src/pages/Session/index.tsx: restore dev's `readLastSelectedSessionId` effect so the existing `writeLastSelectedSessionId` writer has a matching reader. 
- webui/src/pages/Skill/index.tsx: union of both sides — keep PR 131's pagination/source-filter/toggle state and lucide icons, adopt dev's throttled `refreshSkillsAndFetch` (already referenced downstream), and fix one stray `toast.error` → `showErrorToast` left over from the rename. - webui/src/pages/Workflow/index.tsx: keep PR 131's toolbar-based refresh/create actions and `WorkflowSection` component; drop dev's undefined `isUserManaged` helper in favour of the still-defined `isBuiltin`. - webui/src/components/common/SessionChat.tsx: keep PR 131's redesigned tool card (status pill + `ChevronDown` + dark code block) and absorb dev's `truncateToolDisplayText`/`buildToolInputSummary` helpers plus hover-title fallback for long inputs. Hidden semantic conflicts (silently auto-merged but failing tests): - webui/src/components/common/SessionChat.test.ts: update className assertions — PR 131 moved `max-w-2xl` from the inner bubble to the outer `max-w-[50%]` container, so the inner bubble now only owns `w-auto`/`w-full`. - webui/src/pages/Workflow/index.tsx (a11y): wrap each `WorkflowSection` in `<section aria-label={title}>` so the dev-side region-role tests keep passing without rolling back the redesign. Pre-existing PR 131 debt also tidied up in this commit: - webui/src/pages/Tool/ToolDetailDrawer.test.tsx: add the missing `listFixtures` stub (the new fixtures effect was added to `Tool/index.tsx` without a matching mock). - webui/src/pages/Skill/SkillSheet.test.tsx: change the Edit-mode fixture to `source: 'user'` so the editable code path is exercised. `source: 'project'` skills remain read-only on purpose to prevent the UI from overwriting repo-tracked files; the read-only branch is still covered by `should show name field in edit mode`. Verification: 192/192 webui vitests pass; `tsc --noEmit` clean; `py_compile` clean on all five touched Python modules. Co-authored-by: Cursor <cursoragent@cursor.com>

- Skill list: make the entire name+description+icon block clickable to open SkillSheet, matching the Hub catalog row pattern - Tool list (all tabs): make name+description blocks clickable links; use group/name scoped hover colour per tab accent (slate/blue/purple) - MCP / API / Local tabs: replace fixed `minmax(0,1fr)` name column with proportional `minmax(min, Xfr)` grid so excess width is shared across all columns instead of collecting as a single gap after the name; extract shared constant to `components/gridLayout.ts` so all three tabs stay in sync - ToolTable (all-tools tab): same proportional grid treatment; name capped at 420 px, font reduced one step (text-xs) to suit the denser row - SessionChat user bubble footer: wrap bubble+footer in a `flex flex-col` container so the footer stretches to the bubble's intrinsic width, keeping the timestamp on the bubble's left edge and action buttons on the right edge regardless of message length Co-authored-by: Cursor <cursoragent@cursor.com>

…rtcuts feat(webui): sidebar layout, version info, and keyboard shortcut improvements

* feat(provider): add interleaved reasoning replay across providers Support provider-specific reasoning field replay (interleaved/thinking) in Anthropic and OpenAI SDK paths, with catalog metadata and runner integration for multi-turn reasoning preservation. * fix(agent): harden agent.yaml loading against invalid configs Validate YAML mapping shape and wrap AgentInfo construction in try/except so malformed model configs are skipped without breaking agent scans. * feat(session): halt cross-step tool loops and cap default step budget Add runner-level guards for repeated exact tool calls and long same-tool streaks, apply a default max tool step limit when agents omit steps, and resolve message replay against the session's current model pin. * chore(session): raise default max tool steps to 1000 Agents without an explicit steps limit now get a 1000-step budget instead of 100 for longer coding and research tasks.
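The runner-level loop guards (repeated exact tool calls and long same-tool streaks) could be approximated as below. The class name `ToolLoopGuard`, both threshold values, and the returned messages are hypothetical; the commit message does not state the actual limits or how the runner halts.

```python
import json
from typing import Any, Dict, Optional, Tuple


class ToolLoopGuard:
    """Flag runs of identical calls and long streaks of the same tool."""

    def __init__(self, max_identical: int = 3, max_same_tool_streak: int = 8) -> None:
        self.max_identical = max_identical            # hypothetical threshold
        self.max_same_tool_streak = max_same_tool_streak  # hypothetical threshold
        self._last_call: Optional[Tuple[str, str]] = None
        self._identical = 0
        self._last_tool: Optional[str] = None
        self._streak = 0

    def check(self, tool: str, args: Dict[str, Any]) -> Optional[str]:
        """Record one tool call; return a halt reason, or None to continue."""
        # Canonical JSON makes argument order irrelevant when comparing calls.
        fingerprint = (tool, json.dumps(args, sort_keys=True, default=str))
        self._identical = self._identical + 1 if fingerprint == self._last_call else 1
        self._last_call = fingerprint
        self._streak = self._streak + 1 if tool == self._last_tool else 1
        self._last_tool = tool

        if self._identical >= self.max_identical:
            return f"halt: {tool} repeated with identical arguments {self._identical} times"
        if self._streak >= self.max_same_tool_streak:
            return f"halt: {tool} called {self._streak} times in a row"
        return None
```

The step budget mentioned in the last bullet (1000 when an agent omits `steps`) is a separate, coarser limit that applies even when calls vary enough to pass both checks.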

* fix(workflow): compact large outputs and trim execution history Persist compacted workflow outputs in storage and JSONL audit logs to avoid bloating SQLite rows. Cap per-workflow execution history at 30 and delete matching JSONL files when pruning old records. * fix(workflow): compact step inputs and tool result metadata Extend history compaction to step inputs via compact_step_for_storage, and return compacted outputs/history in run_workflow ToolResult metadata so agent context stays bounded alongside SQLite/JSONL storage.
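As an illustration of the two ideas above, compacting large values and capping history at 30 records, here is a minimal sketch. The helper names, the 2000-character limit, and the truncation marker are assumptions; only the cap of 30 is taken from the commit message, and the real `compact_step_for_storage` is not reproduced here.

```python
from typing import Any, List, Tuple

MAX_HISTORY = 30  # per-workflow cap stated in the commit message


def compact_value(value: Any, max_chars: int = 2000) -> Any:
    """Recursively truncate long strings inside nested dicts and lists."""
    if isinstance(value, str) and len(value) > max_chars:
        dropped = len(value) - max_chars
        return f"{value[:max_chars]}... [truncated {dropped} chars]"
    if isinstance(value, dict):
        return {k: compact_value(v, max_chars) for k, v in value.items()}
    if isinstance(value, list):
        return [compact_value(v, max_chars) for v in value]
    return value


def prune_history(records: List[Any], keep: int = MAX_HISTORY) -> Tuple[List[Any], List[Any]]:
    """Split records (oldest first) into (kept, pruned).

    The pruned list tells the caller which companion files, such as the
    JSONL audit logs, should be deleted alongside the database rows.
    """
    if len(records) <= keep:
        return list(records), []
    return list(records[-keep:]), list(records[:-keep])
```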

* Classify Python plugin tools as native by source path. Discover python tool file origins and mark project-scoped tools native while user ~/.flocks plugins remain non-native; update tool-builder smoke-test docs. * Rename skill tool to skill_load and refine plugin metadata. Split on-demand skill loading from flocks_skills management, reconcile python plugin source/native flags from disk, move tool-builder validator under scripts/, and update agents, compaction, and tests accordingly.

…ecycle (#298) Extract shared ToolContext builder, support local/docker publish drivers with health-aware service status, and make workflow/bash cancellation more reliable.

) Unify interleaved capability inference across OpenAI-compatible and Anthropic models, add reasoning transport resolution, remove unused Bedrock SDK, and improve thinking-block replay in runner and provider options.

…der singleton

The workflow `llm.ask` path shares the process-wide `Provider._providers` registry with the session/agent runner. Under concurrent load each workflow LLM call could destabilize an in-flight session call:

* `_prepare_provider` unconditionally rewrote `provider._config` and forced `provider._client = None`, dropping the session's live httpx connection pool and silently flipping `custom_settings` (notably `trust_env`) that the session had set.
* `Provider._ensure_initialized` flipped `_initialized = True` before the registry was populated, so a concurrent caller could take the fast path and observe `Provider.get(...) is None` for built-in providers.
* `Provider.apply_config` is called on every session step and on every workflow `llm.ask`; it unconditionally re-`configure`d the provider and rebuilt `provider._config_models`, racing readers on the mutable list across event loops.
* The same `httpx.AsyncClient` could be inherited across event loops (session: uvicorn main loop, workflow: `flocks-workflow-llm-loop`), triggering "got Future attached to a different loop" or silent hangs.

Changes
-------

`flocks/workflow/llm.py`:

* Serialize `_prepare_provider` per-provider via a `threading.Lock` keyed by `provider_id`.
* Make reconfigure idempotent: skip `provider.configure(...)` and the client reset when the desired `ProviderConfig` (api_key / base_url / custom_settings) already matches what the provider holds.
* Only override `custom_settings['trust_env']` when the user explicitly set `workflow.llm.trust_env` in flocks config.
* Track the owning event loop of `provider._client` in a `WeakKeyDictionary` and force a client reset when the workflow loop id differs from the marker, even if the config is unchanged.

`flocks/provider/provider.py`:

* `_ensure_initialized` uses double-checked locking with a `threading.Lock` and flips `_initialized = True` only after `_load_dynamic_providers()` returns, so concurrent callers never observe a partially populated registry.
* `apply_config` compares the desired `ProviderConfig` against the existing `provider._config` and skips `provider.configure(...)` when unchanged. The `_config_models` rebuild is gated by a signature-based equality check and assigned atomically.

Tests
-----

13 new tests, all passing:

* `tests/workflow/test_llm_provider_isolation.py`: idempotency, trust_env inheritance vs override, api_key / base_url change triggering reconfigure, cross-loop client reset, same-loop client reuse, per-provider lock identity.
* `tests/provider/test_provider_lazy_init_thread_safe.py`: races 20 threads through `_ensure_initialized` behind a `Barrier` and asserts every observer sees a fully populated registry.
* `tests/provider/test_provider_apply_config_idempotent.py`: no-op path on unchanged config; mutation path on api_key change.

Co-authored-by: Cursor <cursoragent@cursor.com>
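The locking and idempotency ideas in this commit can be illustrated with a minimal sketch. It is not the code from `flocks/provider/provider.py` or `flocks/workflow/llm.py`: the toy `Provider` class, the `"demo"` provider id, and the helpers `lock_for` and `prepare_provider` are hypothetical names chosen for illustration, and the real `ProviderConfig` may carry more fields.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class ProviderConfig:
    # Illustrative subset of fields; the real class may differ.
    api_key: str
    base_url: str
    custom_settings: dict = field(default_factory=dict)


class Provider:
    """Toy provider whose registry is guarded by double-checked locking."""

    _providers: dict = {}
    _initialized = False
    _init_lock = threading.Lock()

    def __init__(self) -> None:
        self.config = None
        self.client = object()  # stand-in for a live HTTP client
        self.configure_calls = 0

    def configure(self, config: ProviderConfig) -> None:
        self.config = config
        self.client = None  # a reconfigure drops the existing client
        self.configure_calls += 1

    @classmethod
    def ensure_initialized(cls) -> None:
        if cls._initialized:  # fast path, taken without the lock
            return
        with cls._init_lock:
            if cls._initialized:  # second check, now under the lock
                return
            cls._providers = {"demo": Provider()}
            cls._initialized = True  # flipped only after the registry is populated

    @classmethod
    def get(cls, provider_id: str):
        cls.ensure_initialized()
        return cls._providers.get(provider_id)


_provider_locks: dict = {}
_provider_locks_guard = threading.Lock()


def lock_for(provider_id: str) -> threading.Lock:
    """Return the single lock associated with a provider id."""
    with _provider_locks_guard:
        return _provider_locks.setdefault(provider_id, threading.Lock())


def prepare_provider(provider_id: str, desired: ProviderConfig) -> bool:
    """Reconfigure only when the desired config differs; return True if it did."""
    provider = Provider.get(provider_id)
    with lock_for(provider_id):
        if provider.config == desired:
            return False  # unchanged: keep the existing client
        provider.configure(desired)
        return True
```

Calling `prepare_provider("demo", cfg)` twice with equal configs triggers `configure` only once in this sketch, which is the property the idempotency tests in the commit message describe.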

Keep SkyEye alarm filtering canonical on `hazard_level` while accepting the legacy `threat_level` input to avoid schema precheck failures. Refresh the affected tests and README Docker mirror examples so the branch reflects the current device plugin layout and usage guidance.
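One way to keep `hazard_level` canonical while still accepting `threat_level` is to normalize the input before schema validation. The sketch below is an assumption about how that could look, not the plugin's code: the function name `normalize_alarm_filter` is hypothetical, and the rule that `hazard_level` wins when both keys are present is a guess.

```python
def normalize_alarm_filter(params: dict) -> dict:
    """Map the legacy `threat_level` key onto the canonical `hazard_level`."""
    normalized = dict(params)  # never mutate the caller's dict
    legacy = normalized.pop("threat_level", None)
    # Assumption: an explicit `hazard_level` takes precedence over the legacy key.
    if legacy is not None and "hazard_level" not in normalized:
        normalized["hazard_level"] = legacy
    return normalized
```

With this shape, downstream filtering only ever sees `hazard_level`, so the schema precheck needs a single field name.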

…dless (#301)

* feat(browser,web2cli): managed tab lifecycle and multi-operation CLI

  Track agent-created tabs in the daemon, expose open_or_attach_tab and managed_tabs helpers, and refuse closing unmanaged tabs by default. Extend web2cli spec/CLI generators for multi-operation captures with subcommands, and restart stale daemons when the IPC protocol is outdated.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* feat(browser-use): add cdp-headless mode and BU_CDP_URL support

  Document headless CDP workflow in browser-use skill, treat BU_CDP_WS and BU_CDP_URL as explicit remote endpoints in setup/doctor, and improve handshake errors for dedicated headless Chromium instances.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* docs(browser-use): clarify headless port and process lifecycle

  Require dedicated remote-debugging ports, background browser startup so the process outlives the shell, and explicit cleanup rules for task-owned headless Chromium instances.

* fix(browser): reconnect setup for explicit CDP endpoints

  Restart the daemon when setup runs with BU_CDP_WS/BU_CDP_URL while an old daemon is still alive, document PowerShell -c quoting guidance, and remove the background agent-browser npm auto-update from installation.

  Co-authored-by: Cursor <cursoragent@cursor.com>
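The managed-tab rule (only tabs the agent created may be closed by default) can be sketched as a small in-memory registry. The helper names `open_or_attach_tab` and `managed_tabs` come from the commit message; the `TabRegistry` class, the `close_tab` method, its `force` flag, and the tab id format are hypothetical and say nothing about the daemon's real API.

```python
class TabRegistry:
    """Tracks tabs created by the agent and guards against closing others."""

    def __init__(self) -> None:
        self._managed: dict[str, str] = {}  # url -> tab id
        self._next_id = 1

    def open_or_attach_tab(self, url: str) -> str:
        """Reuse the managed tab for this URL, or register a new one."""
        if url in self._managed:
            return self._managed[url]
        tab_id = f"tab-{self._next_id}"
        self._next_id += 1
        self._managed[url] = tab_id
        return tab_id

    def managed_tabs(self) -> list[str]:
        return sorted(self._managed.values())

    def close_tab(self, tab_id: str, force: bool = False) -> None:
        """Refuse to close tabs the agent did not open unless forced."""
        if tab_id not in self._managed.values() and not force:
            raise PermissionError(f"{tab_id} is not a managed tab")
        self._managed = {u: t for u, t in self._managed.items() if t != tab_id}
```

The default-deny behaviour means a user's own browser tabs stay untouched unless the caller opts in explicitly.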

…olation fix/workflow llm provider isolation

* fix(session): preserve streamed reasoning content for replay

  Accumulate streamed reasoning metadata so provider-facing reasoning_content keeps the full replay payload instead of the last chunk only.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* fix(session): remove same-tool streak loop guard

  Keep the loop guard focused on repeated identical tool calls so sessions can continue when the same tool is reused with different arguments.
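A loop guard that reacts only to identical calls can key its streak on the tool name together with the arguments, rather than on the tool name alone. The following is a sketch under assumptions: the `LoopGuard` class, its `max_repeats` threshold of 3, and the JSON-based argument comparison are illustrative and not taken from the session runner.

```python
import json


class LoopGuard:
    """Flags a loop only when the same tool is called with the same arguments."""

    def __init__(self, max_repeats: int = 3) -> None:
        self.max_repeats = max_repeats
        self._last_key = None
        self._streak = 0

    def should_stop(self, tool: str, args: dict) -> bool:
        # Canonical JSON makes argument comparison independent of key order.
        key = (tool, json.dumps(args, sort_keys=True))
        if key == self._last_key:
            self._streak += 1
        else:
            self._last_key = key
            self._streak = 1
        return self._streak >= self.max_repeats
```

Reusing one tool with different arguments resets the streak each time, so a session that reads many files with the same tool is not interrupted.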

Move browser daemon socket/port/pid/log paths from system temp to a stable per-user directory, update related skills/scripts, and fix tool hook workspace context in stream_processor.
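Resolving the daemon's IPC files under a per-user directory might look like the sketch below. The `~/.flocks/browser` location appears in the commit title for this change (#304); the function name `daemon_paths` and the individual file names are hypothetical placeholders.

```python
from pathlib import Path
from typing import Optional


def daemon_paths(home: Optional[Path] = None) -> dict:
    """Build socket/port/pid/log paths under a stable per-user directory."""
    base = (home or Path.home()) / ".flocks" / "browser"
    # File names below are illustrative; the real daemon may use others.
    return {
        "socket": base / "daemon.sock",
        "port": base / "daemon.port",
        "pid": base / "daemon.pid",
        "log": base / "daemon.log",
    }
```

Unlike the system temp directory, this location survives temp cleanup and is not shared between users on the same machine.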

… derivation

Introduce a first-class "device" plugin type in the Hub marketplace, classified by `integration_type: device` in `_provider.yaml` and installed under `~/.flocks/plugins/tools/device/`. Device plugins now surface in the marketplace listing, get recognized during device setup, and are uninstalled with the correct type.

Also fixes two regressions surfaced by versioned device plugins whose own `service_id` already contains a `_v...` token (e.g. `onesig_api` for both v2.5.3 D20260321 and D20250710):

* `storage_key_to_service_id` previously stripped trailing version suffixes with a greedy regex, collapsing e.g. `onesig_v2_5_3_D20250710_api_v2_5_3_D20250710` to `onesig`. It now prefers the exact `ApiServiceDescriptor` cache mapping and falls back to a non-greedy regex that removes only the last suffix.
* `row_to_device` recomputes `service_id` on read so historical rows with a corrupted column self-heal in the response.
* `device_startup` adds a one-shot migration that rewrites stale `device_integrations.service_id` rows.
* Device CRUD routes now derive `service_id` from the row's `storage_key` instead of trusting the stored column, keeping `sync_service_tool_state` aligned with the descriptor-aware logic.
* Frontend tool filter in the device detail panel now matches on the exact versioned `storage_key`, so two versions of the same product no longer cross-contaminate the displayed tool list.

Bundled `_provider.yaml` files for `onesig_v2_5_3_D20250710`, `sangfor_af_v8_0_48` and `sangfor_af_v8_0_85` are tagged with `integration_type: device`; their legacy `tools/api/` copies are removed in favour of the canonical `tools/device/` layout. Tests cover both the new plugin-type discovery path and the service_id derivation fix.

Co-authored-by: Cursor <cursoragent@cursor.com>
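The difference between the greedy and the last-suffix-only stripping can be reproduced with two small regexes. This is a sketch under assumptions: the suffix shape (`_v`, digit groups, optional `_D<date>`), both patterns, and the plain-dict stand-in for the `ApiServiceDescriptor` cache are guesses for illustration, since the commit message does not show the actual expressions.

```python
import re
from typing import Optional

# Assumed suffix shape: "_v2_5_3" optionally followed by "_D20250710", at end of string.
_LAST_VERSION_SUFFIX = re.compile(r"_v\d+(?:_\d+)*(?:_D\d+)?$")


def greedy_strip(storage_key: str) -> str:
    """Old behaviour (illustrative): drops everything from the first '_v<digit>'."""
    return re.sub(r"_v\d.*$", "", storage_key)


def storage_key_to_service_id(storage_key: str, cache: Optional[dict] = None) -> str:
    """Prefer an exact cache mapping, else strip only the trailing version suffix."""
    if cache and storage_key in cache:
        return cache[storage_key]
    return _LAST_VERSION_SUFFIX.sub("", storage_key, count=1)


key = "onesig_v2_5_3_D20250710_api_v2_5_3_D20250710"
print(greedy_strip(key))               # onesig
print(storage_key_to_service_id(key))  # onesig_v2_5_3_D20250710_api
```

Anchoring the pattern at the end of the string, without a leading wildcard, is what prevents the earlier `_v...` token inside the `service_id` from being consumed.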

* feat(provider,webui): enable vision for Qwen/Kimi and refresh SessionChat UI

  Mark qwen3.6-plus and kimi-k2.6 as vision-capable in the provider catalog and align WebUI vision gating with those models. Refresh SessionChat bubble layout, preserve partial streamed text on abort, and add regression tests.

* fix(webui): polish chat scroll, thinking indicator, and sidebar padding

  Use stable scrollbar gutter in SessionChat, replace thinking dots with a Brain icon, and align sidebar logo padding with the chat content column.

…ook-io-llm Add deepseek-v4-flash model to both ThreatBook LLM providers (CN and IO) with 200K context window, 128K max output, and CNY ¥1/¥2 per million tokens pricing (input/output). Update catalog tests to assert the new model's limits and pricing. Co-authored-by: Cursor <cursoragent@cursor.com>
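As a worked example of the stated pricing (¥1 per million input tokens, ¥2 per million output tokens), the cost of a single call can be estimated as follows. The function name `estimate_cost_cny` is hypothetical, and the sketch ignores any caching discounts or minimum charges the providers may apply.

```python
from decimal import Decimal

INPUT_CNY_PER_MILLION = Decimal("1")   # from the commit message
OUTPUT_CNY_PER_MILLION = Decimal("2")  # from the commit message


def estimate_cost_cny(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cost in CNY of one call at the listed per-million prices."""
    total = (
        Decimal(input_tokens) * INPUT_CNY_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_CNY_PER_MILLION
    )
    return total / Decimal(1_000_000)


# 150,000 input tokens and 50,000 output tokens:
# (150000 * 1 + 50000 * 2) / 1,000,000 = 0.25 CNY
print(estimate_cost_cny(150_000, 50_000))  # 0.25
```

`Decimal` is used instead of `float` so that small per-token amounts do not accumulate binary rounding error.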

feat(catalog): add deepseek-v4-flash to threatbook-cn-llm and threatb…

feat(hub,device): add device plugin type and fix versioned service_id…

#309)

* fix(browser): improve setup flow for stale daemon and remote debugging

  Restart stale local daemons during --setup, only prompt inspect on 403 handshake failures, and document the attach/reload workflow in browser-use.

* fix(webui): preserve aborted assistant output and simplify thinking indicator

  Mark in-flight assistant messages as stopped on abort or session idle, freeze running tools, and keep partial streamed text across refetch.

#313) The fetchData useCallback no longer needs react-hooks/exhaustive-deps suppression.

xiami762 and others added 30 commits April 27, 2026 13:11

Merge pull request #195 from AgentFlocks/feature/onesig-api-integration

25ccbb0

feat/onesig api integration

fix(webui): tighten delegate task card detection (#196)

2dcccd4

Avoid rendering generic MCP tools with category fields as delegate task cards, while preserving child-session links and URL-driven session selection behavior.

feat: add post-login notification modal (#197)

b995d4b

* add post-login notification modal
* fix notification dismissal behavior

docs(skill): rename agent-browser skill metadata (#204)

7122c3f

Align the skill identifier and header with the agent-browser naming, and remove outdated quickstart/reference text to keep documentation focused on current usage.

Merge pull request #206 from AgentFlocks/fix/sangfor-sip-endpoints

a819ce6

fix/sangfor sip endpoints

Merge pull request #208 from AgentFlocks/feat/provider-version-display

32ed32d

feat(provider): expose service version end-to-end (UI + LLM tool desc…

Merge branch 'main' of github.com:AgentFlocks/flocks into feat/api-pr…

8212cb2

…ovider-versions

feat(hub): add bundled Flocks Hub catalog

334b10a

Add the local Hub catalog, backend install APIs, WebUI browsing experience, and validation coverage so bundled plugins can be discovered and installed globally. Made-with: Cursor

fix(hub): ignore nested files during native tool discovery

b0ebcf5

Limit native tool discovery to direct payload files so provider metadata or nested files do not make a tool package appear installable by mistake. Made-with: Cursor

duguwanglong and others added 25 commits May 18, 2026 11:55

Merge pull request #131 from DearEmma/feat/Modify_sidebar_and_add_sho…

e4ec266

…rtcuts feat(webui): sidebar layout, version info, and keyboard shortcut improvements

feat(workflow): add local service driver and improve cancellation lif…

3a5bb9f

…ecycle (#298) Extract shared ToolContext builder, support local/docker publish drivers with health-aware service status, and make workflow/bash cancellation more reliable.

fix(provider): preserve Azure streaming tool calls (#299)

6cd2ba4

Merge pull request #300 from AgentFlocks/fix/workflow-llm-provider-is…

602fa04

…olation fix/workflow llm provider isolation

fix(browser): store daemon IPC files under ~/.flocks/browser (#304)

d9269fa

Move browser daemon socket/port/pid/log paths from system temp to a stable per-user directory, update related skills/scripts, and fix tool hook workspace context in stream_processor.

Merge pull request #307 from AgentFlocks/feat/add-deepseek-v4-flash

c80c5a8

feat(catalog): add deepseek-v4-flash to threatbook-cn-llm and threatb…

Merge pull request #306 from AgentFlocks/feat/hub-device-plugin-type

e7988a2

feat(hub,device): add device plugin type and fix versioned service_id…

fix(provider): preserve Azure tool call history (#308)

53a9726

chore: bump package version to v2026.5.22 (#311)

0af2942

stephamie7 requested review from duguwanglong and xiami762 May 22, 2026 10:07

fix(webui): remove stale eslint-disable in DeviceIntegration fetchData (

709706e

#313) The fetchData useCallback no longer needs react-hooks/exhaustive-deps suppression.

xiami762 approved these changes May 22, 2026

xiami762 merged commit bfaa415 into main May 22, 2026
2 checks passed

xiami762 merged 526 commits into main from dev

stephamie7 commented May 22, 2026


6 participants
