Release: open-source-readiness pass + CVE clear + captcha primitive#283

Open
djl11 wants to merge 28 commits into main from staging
@djl11 djl11 commented May 26, 2026

Promotes 6 commits from staging to main. Two themes plus one feature.

Open-source-readiness pass (3 commits)

  • aaabf3d46 chore(repo): tighten .gitignore for build artifacts and add AGENTS.md

    • .gitignore now covers build/, dist/, *.egg-info/, Local/
    • Removed ~12MB of build artifacts from the working tree
    • New AGENTS.md distilled from .cursor/rules/ so Claude Code, Codex, Aider, Cline, etc. pick up the same conventions Cursor does
  • bfe44c46f chore(github): add CODEOWNERS, PR/issue templates, dependabot, OSV scanner

    • CODEOWNERS: @unifyai/engineers as catch-all + explicit ownership of security-sensitive paths
    • PULL_REQUEST_TEMPLATE.md — references the .cursor/rules invariants
    • ISSUE_TEMPLATE/{config,bug_report,feature_request}.yml — routes bugs by surface; steers "please add this skill" feature requests toward GuidanceManager/FunctionManager
    • dependabot.yml — github-actions weekly (grouped) + agent-service/ npm weekly; deliberately skips scheduled pip per the editable-sibling install model
    • workflows/osv-scanner.yml — Google's reusable workflow pinned by SHA, SARIF to Security tab

Dependabot CVE triage (1 commit + 5 dismissals)

Captcha primitive + docs (2 commits)

  • c9ba90982 feat(computer): add solve_captcha primitive for reCAPTCHA v2 via AntiCaptcha
  • 39fe85099 docs(env): document ANTICAPTCHA_KEY placeholder in .env.example

Other in-flight work picked up incidentally

  • bd001c346 test(task_scheduler): pin Communication env-builder equivalence in shared contract tests — landed on staging before this session.

Test plan

The full test suite auto-runs on staging→main PRs (tests.yml line 130). No tags needed. Auto-merge on green.

djl11 added 7 commits May 26, 2026 00:37
…ared contract tests

Adds 4 new tests in test_offline_runner_contract.py that prove
field-for-field that Communication's NEW _build_offline_runner_env
composition (shared Unity contract + hosted-only assistant-identity
layer) produces dicts identical to the OLD monolithic Communication
builder, across the scheduled, triggered, entrypoint-override, and
sparse-assistant-data scenarios.

The golden reference function is a verbatim copy of Communication's
pre-refactor builder inlined into the test file. If anything in the
shared contract drifts from the old behaviour, these tests fail
loudly here, in Unity's test suite, before reaching Communication's
deployment.

Brings total contract-module test count to 35 (up from 31).
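
The golden-reference pattern described above can be sketched as follows.
This is a minimal illustration, not the code in
test_offline_runner_contract.py: every function name, env key, and
scenario payload below is hypothetical, and only the four scenario
names come from the commit message.

```python
# Hypothetical sketch of a golden-reference equivalence check.
# None of these builders or env keys are taken from the Unity codebase.

def legacy_build_env(task: dict, assistant: dict) -> dict:
    """Stand-in for the verbatim pre-refactor builder (the golden reference)."""
    env = {
        "TASK_ID": str(task["id"]),
        "MODE": task.get("mode", "scheduled"),
        "ENTRYPOINT": task.get("entrypoint", "default"),
    }
    if "name" in assistant:
        env["ASSISTANT_NAME"] = assistant["name"]
    return env


def shared_contract_env(task: dict) -> dict:
    """Stand-in for the shared contract layer."""
    return {
        "TASK_ID": str(task["id"]),
        "MODE": task.get("mode", "scheduled"),
        "ENTRYPOINT": task.get("entrypoint", "default"),
    }


def assistant_identity_env(assistant: dict) -> dict:
    """Stand-in for the hosted-only assistant-identity layer."""
    return {"ASSISTANT_NAME": assistant["name"]} if "name" in assistant else {}


def composed_build_env(task: dict, assistant: dict) -> dict:
    """New composition: shared contract plus identity layer."""
    return {**shared_contract_env(task), **assistant_identity_env(assistant)}


SCENARIOS = {
    "scheduled": ({"id": 1}, {"name": "ada"}),
    "triggered": ({"id": 2, "mode": "triggered"}, {"name": "ada"}),
    "entrypoint_override": ({"id": 3, "entrypoint": "custom"}, {"name": "ada"}),
    "sparse_assistant_data": ({"id": 4}, {}),
}


def mismatched_scenarios() -> list:
    """Return the scenario names where the two builders disagree."""
    return [
        name
        for name, (task, assistant) in SCENARIOS.items()
        if composed_build_env(task, assistant) != legacy_build_env(task, assistant)
    ]
```

Comparing whole dicts keeps the check field-for-field: a key that is
added, dropped, or changed on either side makes the scenario show up
in the returned list.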
…Captcha

Exposes a deterministic, Python-callable primitive
`WebSessionHandle.solve_captcha()` on every web session created via
`cp.web.new_session(...)`. The primitive delegates the visible
reCAPTCHA v2 challenge to the AntiCaptcha worker pool and injects the
returned Google-signed token back into the live page so the page's
own submit flow accepts the verification.

Layers wired:
- agent-service: new `POST /captcha/solve` handler (sitekey extraction
  + createTask/getTaskResult polling + page.evaluate injection).
  Reads `ANTICAPTCHA_KEY` only from `process.env`; token is never
  logged or echoed in the response.
- Python: `ComputerSession.solve_captcha` (+ matching mock-backend
  and `_MockSession` stubs) with rich docstring on
  `_LowLevelActionsMixin`. `ComputerSession._request` gains a keyword-
  only `timeout` parameter (default preserves existing behaviour).
- Runtime exposure: `"solve_captcha"` appended to `_COMPUTER_METHODS`
  and `ComputerPrimitives._LOW_LEVEL_METHODS`; excluded from
  `_DESKTOP_METHODS` (desktop sessions have no DOM target).
- Config: optional `ANTICAPTCHA_KEY` documented in
  `agent-service/README.md`; missing key surfaces as 503
  `anticaptcha_key_missing`.
- Tests: mock-backend coverage in `test_computer_multimode.py`
  guarding the auto-wiring and the default/invisible variant paths.

Magnitude-core is intentionally untouched: the primitive is not in
the LLM action vocabulary. Callers reach for it from their own
orchestration code after a prior `observe()` has confirmed a CAPTCHA
is on screen.

Out of scope: v3/Enterprise reCAPTCHA, hCaptcha, Turnstile,
FunCaptcha, GeeTest, desktop-mode equivalents, and wiring into
specific actor/extractor flows.
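
For orientation, the createTask/getTaskResult polling step can be
sketched in Python as below. The real handler lives in agent-service
and is not shown in this PR text, so this is only an illustration of
AntiCaptcha's public JSON API under stated assumptions: the function
names, the poll interval, and the poll limit are all hypothetical, and
the transport is injectable so the flow can be exercised without
network access.

```python
import json
import time
import urllib.request

API = "https://api.anti-captcha.com"


def http_post(url: str, payload: dict) -> dict:
    """Default transport: POST a JSON body and decode the JSON reply."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def solve_recaptcha_v2(client_key, website_url, sitekey, post=http_post,
                       poll_interval=5.0, max_polls=24, sleep=time.sleep):
    """Create a proxyless reCAPTCHA v2 task, then poll until it is ready.

    Returns the token string. The token is never printed or logged here.
    """
    created = post(f"{API}/createTask", {
        "clientKey": client_key,
        "task": {
            "type": "RecaptchaV2TaskProxyless",
            "websiteURL": website_url,
            "websiteKey": sitekey,
        },
    })
    if created.get("errorId"):
        raise RuntimeError(f"createTask failed: {created.get('errorCode')}")
    task_id = created["taskId"]

    for _ in range(max_polls):
        sleep(poll_interval)
        result = post(f"{API}/getTaskResult",
                      {"clientKey": client_key, "taskId": task_id})
        if result.get("errorId"):
            raise RuntimeError(f"getTaskResult failed: {result.get('errorCode')}")
        if result.get("status") == "ready":
            return result["solution"]["gRecaptchaResponse"]
    raise TimeoutError("captcha task was not ready within the poll budget")
```

Passing `post` and `sleep` as parameters is a choice made for this
sketch so that a fake transport can stand in for the worker pool in
tests.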

Clean up the open-source-ready repo surface:
- .gitignore now covers build/, dist/, *.egg-info/ (any name), and Local/
  so setuptools/uv build output and personal workspace dirs stay out of
  git status. Deleted ~12MB of build/, dist/, unity.egg-info/,
  unify_agent.egg-info/, Local/, __pycache__/, .cache.ndjson from disk.
- AGENTS.md distilled from .cursor/rules/ so Claude Code, Codex, Aider,
  Cline, and other assistants pick up the same conventions Cursor does
  (testing philosophy, no-defensive-coding, explicit-path commits,
  state-manager design rules, repo map).

No code changes.
…anner

Brings .github/ in line with peer open-source AI-assistant repos
(NousResearch/hermes-agent, openclaw/openclaw) so contributors land
on a familiar surface and supply-chain hygiene is visible.

Added:
- CODEOWNERS — @unifyai/engineers as catch-all + explicit ownership
  of security-sensitive paths (CODEOWNERS itself, dependabot.yml,
  workflows/, SECURITY.md, AGENTS.md, ARCHITECTURE.md, secret_manager/).
- PULL_REQUEST_TEMPLATE.md — Summary / type / areas / test plan /
  migration / checklist. References the .cursor/rules invariants
  (no-defensive-coding, no-temporal-comments, zero-backcompat target).
- ISSUE_TEMPLATE/{config,bug_report,feature_request}.yml — bug template
  routes by surface (CLI / voice / installer / specific manager /
  ConversationManager / etc.) and asks for `unity doctor` output; feature
  template explicitly steers users toward GuidanceManager/FunctionManager
  for runtime-extension requests so the issue queue isn't drowned in
  "please add this skill" tickets.
- dependabot.yml — github-actions weekly (grouped minor/patch) +
  agent-service npm weekly. Deliberately skips scheduled pip updates
  per the editable-sibling install model (unify/unillm/orchestra-core);
  CVE-driven pip security updates remain enabled at the repo-settings
  level. Comment explains the rationale.
- workflows/osv-scanner.yml — Google's reusable workflow pinned by SHA.
  Scans uv.lock + agent-service/package-lock.json on lockfile changes,
  push to main/staging, and weekly. SARIF results land in the Security
  tab; fail-on-vuln disabled so pre-existing CVEs don't block merges.

Lockfile bumps only; no pyproject.toml / package.json changes.
Triggered by the 15 open Dependabot alerts on the default branch
(see https://github.com/unifyai/unity/security/dependabot).

uv.lock (7 bumps):
- urllib3   2.6.3  -> 2.7.0    CVE-2026-44431 (high)  cross-origin header
                                  leak in proxied redirects
- urllib3   2.6.3  -> 2.7.0    CVE-2026-44432 (high)  decompression-bomb
                                  bypass in streaming API
- langchain-core 1.3.0 -> 1.4.0  CVE-2026-44843 (high)  unsafe deserialization
                                  via overly broad load() allowlists
                                  (pulls in new transitive langchain-protocol 0.0.15)
- python-multipart 0.0.26 -> 0.0.29  CVE-2026-42561 (high)  DoS via unbounded
                                       multipart part headers
- lxml      6.0.3  -> 6.1.1    CVE-2026-41066 (high)  XXE in default config of
                                  iterparse() and ETCompatXMLParser()
- langsmith 0.7.33 -> 0.8.5    CVE-2026-45134 (high)  public prompt pull
                                  deserializes untrusted manifests
- authlib   1.7.0  -> 1.7.2    CVE-2026-44681 (medium)  OIDC implicit/hybrid
                                  open redirect (not reachable — we don't run
                                  an OIDC provider — but bumped for hygiene)
- idna      3.11   -> 3.16     CVE-2026-45409 (medium)  IDNA encode() bypass
                                  of CVE-2024-3651 fix

agent-service/package-lock.json (2 bumps, via npm audit fix):
- qs        6.15.0 -> 6.15.2   CVE-2026-8723  (medium)  qs.stringify DoS on
                                  null/undefined entries in comma-format arrays
- ws        8.18.3 -> 8.21.0   CVE-2026-45736 (medium)  uninitialized memory
                                  disclosure

Not addressed in this commit (blocked on sibling repos):
- litellm 1.83.4 -> 1.83.10  (clears 4 alerts: 1 critical SQLi in proxy,
   3 high — sandbox escape, RCE via MCP stdio, SSTI in /prompts/test).
   All four CVEs are in the LiteLLM *proxy server* surface, which Unity
   does not run; reachability is effectively zero, but the bump should land
   for defense in depth. BLOCKED: unillm pins litellm==1.83.4 exactly.
   The unillm Dependabot PR is already open at unifyai/unillm#54.
- python-dotenv 1.0.1 -> 1.2.2  (CVE-2026-28684, medium — symlink-following
   in set_key; Unity only reads .env so not reachable). BLOCKED: litellm
   1.83.4 ships an unusual pin (python-dotenv>=1.0.1,<1.0.1+) that effectively
   freezes python-dotenv at 1.0.1. Will unblock once unillm#54 lands and
   `uv sync` brings litellm 1.83.10 in.

The agent-service /captcha/solve handler (added in c9ba909) reads
process.env.ANTICAPTCHA_KEY at request time and returns 503
anticaptcha_key_missing if it's unset. Document the env var alongside
the other optional integration keys so operators know where to put it
without having to read the agent-service README.

The actual key value lives in GCP Secret Manager under
projects/responsive-city-458413-a2/secrets/ANTICAPTCHA_KEY, alongside
the other runtime API keys (ANTHROPIC_API_KEY, DEEPGRAM_API_KEY,
LIVEKIT_API_KEY, etc.). The companion unity-deploy commit adds
ANTICAPTCHA_KEY to setup_k8s_config.py's required_secrets list so
the unity-secrets K8s Secret picks it up automatically on cluster
setup.
@djl11 djl11 temporarily deployed to unity-testing May 26, 2026 13:04 — with GitHub Actions Inactive
@github-advanced-security

You are seeing this message because GitHub Code Scanning has recently been set up for this repository, or this pull request contains the workflow file for the Code Scanning tool.

What Enabling Code Scanning Means:

  • The 'Security' tab will display more code scanning analysis results (e.g., for the default branch).
  • Depending on your configuration and choice of analysis tool, future pull requests will be annotated with code scanning analysis results.
  • You will be able to see the analysis results for the pull request's branch on this overview once the scans have completed and the checks have passed.

For more information about GitHub Code Scanning, check out the documentation.

The script's `discover_all()` was only recursing into top-level tests/
sub-directories whose names start with `test` — but Unity's convention
is to name per-manager test directories after the manager itself
(contact_manager/, knowledge_manager/, actor/, task_scheduler/,
conversation_manager/, etc.) without the `test_` prefix.

Effect: the staging→main CI matrix was silently collapsing to just 2
entries (tests/test_integration_status/ and tests/test_session_details.py
— the only top-level paths starting with `test`) instead of the ~67
leaf paths that actually exist. Every prior release went green on a
hollow signal exercising none of the manager test suites.

Fix: replace `item.name.startswith("test")` with
`item.name not in EXCLUDE_DIRS`. Safe because `collect_paths()` is
itself gated by `has_test_files`/`has_test_subdirs`, so recursing into
a non-test directory is a no-op. EXCLUDE_DIRS already covers
__pycache__, .pytest_cache, .venv, etc.

Verified locally: `python3 .github/scripts/discover_test_paths.py | wc -l`
returns 67 (was 2), and the output now includes tests/contact_manager,
tests/task_scheduler, tests/actor/*, tests/conversation_manager/*, etc.
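
A minimal sketch of the corrected discovery logic follows. It reuses
the names from this message (`discover_all`, `collect_paths`,
`has_test_files`, `has_test_subdirs`, `EXCLUDE_DIRS`), but the bodies
are assumptions written for illustration, including the rule that a
directory holding test files directly counts as a leaf; the actual
script may differ.

```python
from pathlib import Path

EXCLUDE_DIRS = {"__pycache__", ".pytest_cache", ".venv"}


def is_test_file(path: Path) -> bool:
    return path.is_file() and path.name.startswith("test_") and path.suffix == ".py"


def has_test_files(directory: Path) -> bool:
    return any(is_test_file(p) for p in directory.iterdir())


def has_test_subdirs(directory: Path) -> bool:
    return any(
        p.is_dir()
        and p.name not in EXCLUDE_DIRS
        and (has_test_files(p) or has_test_subdirs(p))
        for p in directory.iterdir()
    )


def collect_paths(directory: Path) -> list:
    """Return leaf test paths under `directory`; empty for non-test dirs."""
    if has_test_files(directory):
        return [directory.as_posix()]  # assumed leaf rule for this sketch
    if not has_test_subdirs(directory):
        return []  # recursing into a non-test directory is a no-op
    paths = []
    for sub in sorted(directory.iterdir()):
        if sub.is_dir() and sub.name not in EXCLUDE_DIRS:
            paths.extend(collect_paths(sub))
    return paths


def discover_all(tests_root: Path) -> list:
    paths = []
    for item in sorted(tests_root.iterdir()):
        if is_test_file(item):
            paths.append(item.as_posix())
        # Previously: item.name.startswith("test"), which skipped
        # directories named after managers (contact_manager/, actor/, ...).
        elif item.is_dir() and item.name not in EXCLUDE_DIRS:
            paths.extend(collect_paths(item))
    return paths
```

Because `collect_paths` returns an empty list for directories without
tests, widening the filter in `discover_all` only adds real test
paths to the matrix.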
@djl11 djl11 temporarily deployed to unity-testing May 26, 2026 13:21 — with GitHub Actions Inactive
djl11 added 8 commits May 26, 2026 16:32
Two related changes to the web-session primitive surface that have been
operating in concert on a feat branch and are now landing together:

1. **storageStateName plumbing through web sessions.**
   ``/start`` now optionally accepts a ``storageStateName`` body field
   (forwarded through ``startBrowser`` to magnitude-core's
   BrowserProvider), so a brand-new agent-service web session can boot
   pre-loaded with cookies + localStorage + sessionStorage from a
   previously-saved storage state.  Python: ``cp.web.new_session()``
   and ``ComputerSession.create_session`` accept ``storage_state_name``.
   Only honoured for ``mode="web"``.

   Callers use this to persist authenticated sessions across
   processes — e.g. log into Google once interactively
   (vision-driven email + password + operator-collaborative 2FA),
   save the state, then run all subsequent headless extractions
   pre-loaded with the storage state so they're already-signed-in.

2. **Event-based wait inside /captcha/solve replaces caller-side sleeps.**
   The handler previously injected the AntiCaptcha token + fired the
   callback and returned immediately, leaving the caller responsible
   for an arbitrary ``await asyncio.sleep(N)`` before observing whether
   the page had progressed past the captcha.  Brittle: the sleep is
   either too short (page hasn't settled, observe misses the revealed
   content) or too long (every solve eats N seconds even when the
   page settled instantly).

   The handler now blocks until the page has verifiably progressed,
   using three deterministic signals:

   - **Widget acceptance**: after injecting the token + firing
     callbacks, polls ``window.grecaptcha.getResponse()`` until it
     returns the token (the widget's own JS API confirming it has
     internalised the verification).  Up to 5s.
   - **Server-side verification**: races
     ``page.waitForResponse(/recaptcha\/(api2|enterprise)\/userverify/)``
     against
   - **Network idle**: ``page.waitForLoadState('networkidle')`` (500ms
     of zero in-flight requests).  Whichever lands first latches the
     wait; both bounded at 15s.

   Response shape gains three fields so callers can branch
   intelligently: ``widget_acked`` (bool), ``settled`` (bool),
   ``settled_via`` (``"userverify" | "networkidle" | "timeout"``).
   The token is still never echoed or logged.

   Callers (brain.influencers.youtube and onwards) can now drop the
   ``await asyncio.sleep(...)`` after ``solve_captcha`` entirely; the
   primitive returns only once the page is in a trustworthy
   post-captcha state, so the next ``observe()`` reflects it.

Mock backends (``MockComputerBackend`` + ``_MockSession``) return the
optimistic case for the new fields (``widget_acked=True``,
``settled=True``, ``settled_via="networkidle"``) so existing tests
that instantiate the mock keep working and new callers that branch on
these flags get a deterministic happy-path stub.

Test coverage in tests/function_manager/storage/test_computer_multimode.py
gains ``test_handle_solve_captcha_settle_fields`` to lock in the
response-shape contract.

Out of scope (unchanged): hCaptcha, Turnstile, FunCaptcha, GeeTest,
reCAPTCHA v3/Enterprise, web-vm-mode storageStateName.
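
The "whichever lands first latches the wait" behaviour can be
illustrated with a small asyncio analogue. The handler itself runs in
agent-service against the browser page, so this is not its code: the
two awaitables below stand in for the userverify response wait and
the network-idle wait, `wait_until_settled` is a hypothetical name,
and the widget-acceptance poll is left out.

```python
import asyncio

# Tie-break order if both signals finish in the same loop iteration.
# Preferring userverify is an arbitrary choice made for this sketch.
PRIORITY = ("userverify", "networkidle")


async def wait_until_settled(userverify, networkidle, timeout=15.0):
    """Race two awaitables and report which one landed first."""
    tasks = {
        asyncio.ensure_future(userverify): "userverify",
        asyncio.ensure_future(networkidle): "networkidle",
    }
    done, pending = await asyncio.wait(
        list(tasks), timeout=timeout, return_when=asyncio.FIRST_COMPLETED
    )
    for task in pending:
        task.cancel()  # the losing signal is no longer needed
    if not done:
        return {"settled": False, "settled_via": "timeout"}
    winner = min((tasks[t] for t in done), key=PRIORITY.index)
    return {"settled": True, "settled_via": winner}
```

A caller can then branch on `settled_via` instead of sleeping for a
fixed interval, for example by re-observing the page only when the
result is not `"timeout"`.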

When orchestra's local.sh start fails in CI, parallel_run.sh was
suppressing its output via >/dev/null 2>&1, leaving no trace of why.
The "Dump orchestra logs on failure" step in tests.yml tails
/tmp/orchestra-local-server.log — but that file is only written once
start_orchestra_server reaches its background-exec line. Earlier
failures (check_docker, start_db_container's pgvector docker pull,
run_migrations, etc.) leave zero breadcrumbs.

Symptom: the 48 jobs that need Orchestra all log
  "Warning: Could not start local orchestra"
with no follow-up diagnostic. Same root cause on rerun → chronic
infrastructure issue. Hypothesis: Docker Hub rate-limit on
pgvector/pgvector:pg15 image pull, hit by 67 parallel runners after
the discover_test_paths.py fix expanded the matrix from 2 → 67.

Fix: tee local.sh's output to /tmp/orchestra-startup.log and cat it
to stderr when start fails (or when start "succeeded" but check fails).
This is diagnostic only — no behaviour change on the happy path.

Once we have a real error in the next CI run, this can either stay
in place permanently (low cost, high signal) or be reverted.

After fixing orchestra's missing _platform_initial_schema.sql, alembic
now succeeds in CI and orchestra starts. But tests fail with HTTP 401
"Invalid API key" from local orchestra. Orchestra's auth_api_key looks
up the bearer in `api_key` table; 401 means no matching row.

seed_test_user runs at orchestra startup, but local.sh invokes it via
`if ! seed_test_user; then log_warn ...`, so a failure logs a warning
without failing orchestra startup. If the seed errored silently,
parallel_run.sh's success path doesn't dump anything.

This diag adds two probes on the orchestra-started success path:
1. Grep the orchestra-startup.log for seed-related output lines
2. Query the api_key table directly, redacting all but first 8 chars
   of each key; print UNIFY_KEY's first 8 chars + length for comparison.

If api_key has 0 rows → seed_test_user failed; the grep output should
show why. If api_key has a row whose first 8 chars differ from UNIFY_KEY
→ key truncation / encoding / heredoc mangling in seed_test_user.
Either way, the next CI cycle will give us a definitive answer in a
single workflow_dispatch run.
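
The comparison logic behind the two probes can be sketched as below.
The real diagnostic runs as shell inside parallel_run.sh, so this is
only a Python illustration; `redact_key` and `diagnose` are
hypothetical helper names.

```python
def redact_key(key: str, visible: int = 8) -> str:
    """Show only the first `visible` characters plus the total length."""
    return f"{key[:visible]}... (len={len(key)})"


def diagnose(api_keys: list, unify_key: str, visible: int = 8) -> str:
    """Map the probe output onto the two failure hypotheses."""
    if not api_keys:
        return "no rows: seed_test_user failed"
    if not any(k[:visible] == unify_key[:visible] for k in api_keys):
        return "prefix mismatch: key was altered during seeding"
    return "match"
```

Printing only a prefix and a length is enough to tell the two cases
apart without putting a usable key into CI logs.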

Unity has not yet cut a tagged release; the new file scaffolds an
Unreleased section capturing recent open-source-launch milestones
(install script, local voice, deploy_runtime SPI, comparative
architecture docs) and notes that entries will be regrouped under
[0.1.0] at first tag. Signals to OSS observers that the project is
alive and shipping.

Maps personal emails, university emails, and lowercase-username
variants to each contributor's canonical GitHub noreply address so
git shortlog -sn and GitHub's contributor graph deduplicate
correctly. Covers all 9 current committers; collapses 19 distinct
name+email tuples into 9 rows.

Static analysis via GitHub-hosted CodeQL on the same triggers as the
existing OSV-Scanner workflow (push/PR to main and staging, weekly
cron). Languages scoped to Unity's actual surface — python for the
unity/ package, javascript-typescript for agent-service/, and
actions for the workflow pipeline itself. Findings upload to the
repository Security tab. Complements OSV-Scanner (which covers
dependency CVEs) by scanning Unity's own source for common
vulnerability patterns.

Runs uv lock --check on every PR/push that touches pyproject.toml or
uv.lock against main/staging. Catches the case where pyproject.toml
moves without a corresponding lockfile regeneration (or vice versa)
before it can break scripts/install.sh and deploy/Dockerfile on the
target branch. Failure summary explains the merged-state gotcha and
the rebase + uv lock recovery flow.
…ccess path (replace narrow grep which missed psql errors)
Comment thread .github/workflows/uv-lockfile-check.yml Fixed
djl11 and others added 12 commits May 26, 2026 18:06
Wires the Yelp detect-secrets hook (v1.5.0) into the pre-commit
chain alongside black, autoflake, and pretty-format-yaml. New
commits that introduce secrets not present in .secrets.baseline
will be rejected locally; existing repo content is captured in
the baseline as known false positives (53 findings across 31
files — placeholder values in docs, dev-mode credentials in
voice.sh, OAuth keyword-style identifiers, etc.).

Baseline regeneration when intentionally adding new safe matches:

  git ls-files -z | xargs -0 -s 900000 \
    uvx detect-secrets@1.5.0 scan > .secrets.baseline
  git commit .secrets.baseline -m "chore: refresh secrets baseline"

This is a local checkpoint complementing GitHub's push-time secret
scanning — catches accidents before they leave the laptop.

Companion to the README's peer-comparison section. Records the two
architectural bets Unity is making (persistent reasoning loop above
the tool-caller; steerable handles all the way down) and lists the
things Unity is deliberately NOT trying to be — channel-breadth
product, single monolithic agent loop, coding agent, regex-routed,
cron/webhook-configured. Also captures the open/closed-source split
between this repo and the hosted product at console.unify.ai.

Most "why isn't there a PR for X?" routing questions become
self-answering once contributors can see the non-goals listed
explicitly.

The previous SECURITY.md was a 15-line "email security@unify.ai" stub.
The replacement keeps that reporting flow but adds an explicit trust
model section that names what Unity actually defends and what it
doesn't:

  - the operator/assistant/inbound-surface/action-surface vocabulary
  - the load-bearing fact that the Actor writes and executes Python
    inside a process-level (not OS-level) boundary
  - credential surfaces (.env, SecretManager's narrow API, the Actor
    subprocess environment)
  - in-process heuristics that are useful but NOT boundaries
  - inbound-surface posture (email/SMS/web/files all untrusted)
  - bundled Orchestra Postgres assumptions

Scope section names what's in (boundary bypasses, SecretManager
exposure, parsing-surface bugs, hard-coded credentials, supply
chain) and what's out (prompt injection alone, hosted product,
sibling repos, operator-chosen exposures, provider-side findings).

Modelled on hermes-agent's structure but tightened to Unity's
single-tenant local-install posture; roughly half the length of
the peer document, which feels proportional to the project's
current surface area.

Adds two things to CONTRIBUTING.md:

  - A maintainers list naming the eight current team members with
    GitHub handles (commit-count ordering, deduplicated via the
    .mailmap landed earlier).
  - An area-familiarity table mapping subsystems to the maintainers
    who have the deepest history there. Derived from `git log
    --use-mailmap` against each package directory; team members
    overlap and rotate, so the table is presented as a routing
    hint rather than an access-control claim. CODEOWNERS remains
    the canonical access-control file.

Also points to VISION.md from the design-principles section so new
contributors can find the explicit non-goals before opening a PR.

External contributors previously had only "@unifyai/engineers" via
CODEOWNERS as a routing target, which reads as a closed team from
the outside; this gives them named people to ping for fast review
when it matters.

Daily run (03:17 UTC) using actions/stale@v9 with conservative
timings appropriate to a small OSS team:

  Issues   warn after 60 days idle  ->  close 14 days after warning
  PRs      warn after 30 days idle  ->  close 14 days after warning

Exempt labels (`pinned`, `security`, `no-stale`, `help wanted`,
`work-in-progress`) skip staling entirely. Any new activity clears
the timer.

Uses the default GITHUB_TOKEN — no GitHub App, no secrets beyond
what the workflow is already granted. Becomes relevant the moment
external contributor traffic kicks in; benign no-op until then.

The api_key/UNIFY_KEY/orchestra-startup.log dumps from c19e6a8 and
f33c1e9 served their debugging purpose (revealed seed_test_user
failing on plan_group FK then on missing project '_'). With both
underlying bugs now fixed in orchestra/staging (commits aeb60607 +
8639062e), these dumps add noise to every successful CI run.

Keep the failure-path dump from 24c952a — that's permanent
observability for the next operator who hits an orchestra-start
issue; without it the failure mode is "Warning: Could not start local
orchestra" with no further context, exactly the dead-end we just
spent hours digging out of.
…ual-write

create_blacklist_entry calls unity_log(add_to_all_context=True), which
references the new log from 3 contexts (the assistant-specific one
plus the two All/* aggregation contexts). delete_blacklist_entry only
called unify.delete_logs(context=self._ctx, ...), which per the unify
API contract only decrements the log's reference count for that one
context — leaving the references in the All/* contexts intact. So a
"deleted" entry kept showing up in queries against aggregation
contexts.

Symptom: test_deleting_blacklist_entry_removes_from_all_ctxs fails
with `assert 1 == 0  where 1 = len([Log(id=3)])` against
.../All/BlackList after delete.

Why this only surfaced now
- The dual-write on create was added in 458543f (Dec 5, 2025) by
  @juliagsy along with test_all_ctx.py covering all four CRUD paths.
- The matching dual-delete was never added.
- Unity CI's matrix-discovery bug (introduced Jan 26 when the test_
  prefix was dropped) excluded tests/blacklist_manager/ from the
  matrix, so the failing test did not run in CI for ~4 months.
- Today's fix for the discovery bug (75d3921) exposed this and 2 missing
  orchestra seeds (plan_group, default '_' project); after fixing
  those, this is the last remaining failure in tests/blacklist_manager
  (16/17 pass).

Fix: mirror the create path's `add_to_all_context` semantic in
delete by iterating over [self._ctx, *_derive_all_contexts(self._ctx)]
and calling unify.delete_logs for each. Gated on
include_in_multi_assistant_table so it matches what the create side
actually wrote.
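
The reference-count behaviour and the fix can be modelled with a toy
in-memory store. This is a sketch under stated assumptions, not the
unify API: `FakeLogStore` is invented for illustration, and the
context naming inside `derive_all_contexts` is hypothetical (the
message only says there are two All/* aggregation contexts).

```python
from collections import defaultdict


class FakeLogStore:
    """Toy store in which one log can be referenced from several contexts."""

    def __init__(self):
        self._refs = defaultdict(set)  # context name -> set of log ids

    def add(self, log_id, contexts):
        for ctx in contexts:
            self._refs[ctx].add(log_id)

    def delete_logs(self, context, log_id):
        # Removes the reference from this one context only.
        self._refs[context].discard(log_id)

    def get(self, context):
        return sorted(self._refs[context])


def derive_all_contexts(ctx: str) -> list:
    """Hypothetical naming: two aggregation contexts per assistant context."""
    prefix, _assistant, table = ctx.rsplit("/", 2)
    return [f"{prefix}/All/{table}", f"All/{table}"]


def delete_entry(store, ctx, log_id, include_in_multi_assistant_table=True):
    """Mirror the create path: delete from every context that was written."""
    targets = [ctx]
    if include_in_multi_assistant_table:
        targets += derive_all_contexts(ctx)
    for target in targets:
        store.delete_logs(target, log_id)
```

Deleting from the assistant-specific context alone leaves the log
visible through both aggregation contexts, which is the symptom the
failing test reports.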

The uv.lock check workflow (added in b942b5a) was failing on every
PR because unity's pyproject.toml editable-installs unify + unillm
from `../unify` and `../unillm`:

  error: Failed to generate package metadata for `unify==0.9.10 @
         editable+../unify`
  Caused by: Distribution not found at: file:///home/runner/work/unity/unify

The main tests.yml workflow handles this by cloning both repos into
the workspace and sed-rewriting pyproject.toml paths from `../X` to
`./X` — but that approach also rewrites the lockfile-relative paths,
which would itself trip `uv lock --check`. Use the cleaner
nested-checkout pattern: check unity into a `./unity/` subdir so
`../unify` and `../unillm` from there resolve to sibling checkouts
at the workspace root, leaving pyproject.toml unmodified.

Also pin astral-sh/setup-uv to a SHA (was @v5 tag) per the CodeQL
"Unpinned tag for a non-immutable Action" rule. Same SHA as used in
the main tests.yml workflow.

Sibling-repo branch selection mirrors orchestra's: main->main,
everything else->staging. Uses CLONE_TOKEN secret when available,
falls back to GITHUB_TOKEN for forks (where CLONE_TOKEN isn't
exposed). Fork PRs will fail at sibling-clone time if those repos
are private, which is the existing behaviour for tests.yml.
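
Why the nested checkout helps can be seen from plain path arithmetic.
The sketch below assumes the standard runner workspace layout implied
by the error message above; `resolve_editable` and `sibling_branch`
are hypothetical helpers, not part of the workflow.

```python
import posixpath


def resolve_editable(checkout_dir: str, relative_dep: str) -> str:
    """Resolve an editable-install path relative to the unity checkout."""
    return posixpath.normpath(posixpath.join(checkout_dir, relative_dep))


def sibling_branch(base_ref: str) -> str:
    """main -> main, everything else -> staging."""
    return "main" if base_ref == "main" else "staging"


WORKSPACE = "/home/runner/work/unity/unity"

# Default checkout: unity sits at the workspace root, so ../unify
# points outside the workspace, where nothing has been cloned.
outside = resolve_editable(WORKSPACE, "../unify")

# Nested checkout: unity sits in ./unity/, so ../unify points at a
# sibling directory inside the workspace.
inside = resolve_editable(WORKSPACE + "/unity", "../unify")
```

With the nested layout, the sibling repos can be cloned to the
workspace root and pyproject.toml stays byte-for-byte unchanged.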
…, trim verbosity

Rework the intro to lead with the three superset hooks (typed persistent
memory, auto-grown skill library, natural-language schedules + triggers)
and reframe examples for technical individuals on personal projects
rather than office workers delegating outreach. Add an at-a-glance
comparison table near the top so casual visitors see the differentiation
without scrolling to the full architectural comparison, which is now
moved below Under-the-hood / Architecture so readers absorb the design
before evaluating it against OpenClaw and Hermes.

Drop the Alternatives section (no realistic "your own backend"
audience), collapse Voice setup behind a details block, merge the
standalone Architecture section into Under-the-hood as a closing
manager-map recap, and tighten the highlights table, comparison cards,
and project-structure tree to remove the bloat that had built up.

Net: 4714 -> 4156 words, 474 -> 424 lines, 15 -> 14 H2 sections, while
preserving every architectural claim.

test_ensure_creates_and_idempotent stubs unify.create_context with a
local _create_context. Production's _create_context_with_retry
(unity/common/context_store.py:_create_context_with_retry) gained an
optional `project=` parameter in cb8b006 (Yusha Arif,
"fix(tasks): read projected activations from the Unity project",
2026-04-13) — and now passes `project=project` unconditionally to
unify.create_context, even when the caller didn't supply one (passes
project=None).

The mock signature didn't include `project=`, and the mismatch went
unnoticed for 43 days because the test never ran in CI: the
matrix-discovery bug from 499de17 (2026-01-26) excluded
tests/local_storage/* until today's 75d3921 fix. So this test
silently broke on 2026-04-13 and only surfaced now.

Fix: add `project=None` to the mock signature so it matches what the
production wrapper actually passes through. No production change —
the wrapper passing project=None to the SDK is the correct behavior
(matches unify.create_context's own default).
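
The failure mode is easy to reproduce in isolation. In the sketch
below the wrapper and both stubs are simplified stand-ins with
hypothetical names; only the `project=None` pass-through mirrors what
this message describes.

```python
def create_context_with_retry(create_context, name, project=None):
    """Stand-in for the production wrapper: always forwards project=."""
    return create_context(name, project=project)


def old_stub(name):
    """Mock written before the wrapper gained project=."""
    return name


def fixed_stub(name, project=None):
    """Mock whose signature matches what the wrapper passes through."""
    return name


def call_fails(stub) -> bool:
    """True if calling through the wrapper raises TypeError."""
    try:
        create_context_with_retry(stub, "ctx")
    except TypeError:
        return True
    return False
```

The old stub fails even though the caller never supplied a project,
because the wrapper forwards the keyword unconditionally.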
…ed URLs

When parallel_run.sh exports ORCHESTRA_URL=http://127.0.0.1:8000/v0
(via local.sh's cmd_check), the helper was concatenating
f"{orchestra_url}/v0/projects" → http://127.0.0.1:8000/v0/v0/projects
→ 404 → returns False. pytest_sessionstart then short-circuits on
"Orchestra unreachable", skips unity.init(), leaves the EVENT_BUS
proxy uninitialized, and every eval test that publishes events later
crashes with:

  RuntimeError: EVENT_BUS has not been initialised yet – call unity.init() first.

Affected tests/actor/state_managers/simulated/{knowledge,tasks,
web_search,contacts,dashboards,files} and similar eval suites — they
ran through actor.act(), hit execute_function's event publish path,
and the proxy raised on first attribute access.

Why this hid in plain sight
- e47d5c6 (2026-05-23) added the correct `if base.endswith("/v0")`
  URL probe to tests/_prepare_shared_project.py and the commit
  message claimed pytest_sessionstart got the same handling — but
  sessionstart was wired through this helper, and this helper kept
  the year-old broken concat from dfcfe0c (2025-04-26).
- The matrix-discovery bug (Jan 26 → fixed today in 75d3921) had
  excluded actor/state_managers/simulated/* from CI for months, so
  the failure mode never ran in CI.
- Pure unify-SDK tests (contact_manager, etc.) don't go through
  EVENT_BUS, so they kept passing — masking how broken the eval
  paths actually were.

Fix: copy the same `endswith("/v0")` handling from
_prepare_shared_project.py:_orchestra_reachable. Both probes now
agree on URL construction.
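
The URL handling amounts to the following. This is a minimal sketch:
`projects_url` is a hypothetical name, and stripping a trailing slash
first is an extra assumption not stated in this message.

```python
def projects_url(orchestra_url: str) -> str:
    """Build the projects probe URL without doubling the /v0 segment."""
    base = orchestra_url.rstrip("/")  # assumption: tolerate a trailing slash
    if base.endswith("/v0"):
        return f"{base}/projects"
    return f"{base}/v0/projects"
```

Both `http://127.0.0.1:8000` and `http://127.0.0.1:8000/v0` then map
to the same probe URL, so the reachability check no longer depends on
which form the environment exports.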

Forward offline delivery through batched task creation and align scheduler tests with the invariant that delivery mode and symbolic entrypoint are independent. Normalize the Orchestra availability probe so remote /v0 endpoints still activate per-test random projects.

3 participants