From 94d8cea9e34740ba8e615bbc61cd9da35b912cb7 Mon Sep 17 00:00:00 2001 From: Brandon Werner Date: Tue, 26 May 2026 15:16:17 -0700 Subject: [PATCH 01/13] docs: restore canonical GitHub Pages URL Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- CHANGELOG.md | 2 +- README.md | 2 +- docs/developer/docs-site.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index b5e4e9c..29d88d5 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -41,7 +41,7 @@ First public release. Reference implementation for Microsoft Entra Agent ID and **Discipline** - 1,237 tests, ruff clean, 80% coverage threshold enforced. - 66 hard-won learnings at `docs/runbooks/hard-won-learnings.md`. -- Full docs site at with API reference, script reference, ADRs, platform learnings, and runbooks. +- Full docs site at with API reference, script reference, ADRs, platform learnings, and runbooks. ### Known limitations diff --git a/README.md b/README.md index 4ec76b3..dc5bbff 100644 --- a/README.md +++ b/README.md @@ -110,7 +110,7 @@ After setup, use `./status.sh` as the canonical health and identity check: ## Documentation -The full doc site: **** +The full doc site: **** Direct pointers: diff --git a/docs/developer/docs-site.md b/docs/developer/docs-site.md index 751397a..9214af0 100644 --- a/docs/developer/docs-site.md +++ b/docs/developer/docs-site.md @@ -21,7 +21,7 @@ The workflow: 4. Uploads the `site/` artifact to GitHub Pages. 5. Deploys via `actions/deploy-pages@v4`. -Published at . GitHub Pages is configured with `build_type=workflow`; re-enable via: +Published at . 
GitHub Pages is configured with `build_type=workflow`; re-enable via: ```bash gh api -X POST repos///pages -f 'build_type=workflow' From 5c7bd59e93dd3a017ec0de17a32aa30720c1af25 Mon Sep 17 00:00:00 2001 From: Brandon Werner Date: Tue, 26 May 2026 15:41:45 -0700 Subject: [PATCH 02/13] docs: use lowercase microsoft repo URL Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- CHANGELOG.md | 2 +- README.md | 4 ++-- docs/developer/docs-site.md | 2 +- docs/getting-started/quickstart.md | 4 ++-- docs/index.md | 2 +- manifests/teams-app/manifest.json | 6 +++--- 6 files changed, 10 insertions(+), 10 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 29d88d5..9b3fe57 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -41,7 +41,7 @@ First public release. Reference implementation for Microsoft Entra Agent ID and **Discipline** - 1,237 tests, ruff clean, 80% coverage threshold enforced. - 66 hard-won learnings at `docs/runbooks/hard-won-learnings.md`. -- Full docs site at with API reference, script reference, ADRs, platform learnings, and runbooks. +- Full docs site at with API reference, script reference, ADRs, platform learnings, and runbooks. ### Known limitations diff --git a/README.md b/README.md index dc5bbff..80d1acf 100644 --- a/README.md +++ b/README.md @@ -89,7 +89,7 @@ Full walkthrough in [`docs/architecture/system-overview.md`](docs/architecture/s Mac or Linux: ```bash -git clone https://github.com/microsoft/Entraclaw.git +git clone https://github.com/microsoft/entraclaw.git cd Entraclaw ./scripts/setup.sh --new --with-upn-suffix=yourname source .venv/bin/activate @@ -110,7 +110,7 @@ After setup, use `./status.sh` as the canonical health and identity check: ## Documentation -The full doc site: **** +The full doc site: **** Direct pointers: diff --git a/docs/developer/docs-site.md b/docs/developer/docs-site.md index 9214af0..ec981da 100644 --- a/docs/developer/docs-site.md +++ b/docs/developer/docs-site.md @@ -21,7 +21,7 @@ The workflow: 4. 
Uploads the `site/` artifact to GitHub Pages. 5. Deploys via `actions/deploy-pages@v4`. -Published at . GitHub Pages is configured with `build_type=workflow`; re-enable via: +Published at . GitHub Pages is configured with `build_type=workflow`; re-enable via: ```bash gh api -X POST repos///pages -f 'build_type=workflow' diff --git a/docs/getting-started/quickstart.md b/docs/getting-started/quickstart.md index be69249..f945224 100644 --- a/docs/getting-started/quickstart.md +++ b/docs/getting-started/quickstart.md @@ -1,6 +1,6 @@ # Quickstart -**Source:** +**Source:** ## Prerequisites @@ -14,7 +14,7 @@ ## One-Command Setup (macOS/Linux) ```bash -git clone https://github.com/microsoft/Entraclaw.git +git clone https://github.com/microsoft/entraclaw.git cd Entraclaw ./scripts/setup.sh ``` diff --git a/docs/index.md b/docs/index.md index b599948..673ffa6 100644 --- a/docs/index.md +++ b/docs/index.md @@ -1,6 +1,6 @@ # Entraclaw Identity Research -**Source:** · **License:** MIT +**Source:** · **License:** MIT Entraclaw is a Python MCP server that gives a device-local agent its own Entra **Agent ID** and an **Agent User** that has all the capabilities of a human user in a Microsoft tenant. It can have a Teams presence and be invited to meetings to chat with your colleagues 1:1, a mailbox it can monitor and respond to, create and edit Word documents, make PowerPoint presentations, and allows you to access your CLI. The agent signs in autonomously, sends Teams messages from its own account, and writes audit events against its own object ID. It runs on macOS, Linux, and Windows, and works with Claude Code, Copilot CLI, or any MCP-speaking client. 
diff --git a/manifests/teams-app/manifest.json b/manifests/teams-app/manifest.json index 00daa73..b1ee469 100644 --- a/manifests/teams-app/manifest.json +++ b/manifests/teams-app/manifest.json @@ -5,9 +5,9 @@ "id": "00000000-0000-0000-0000-000000000000", "developer": { "name": "EntraClaw Research", - "websiteUrl": "https://github.com/microsoft/Entraclaw", - "privacyUrl": "https://github.com/microsoft/Entraclaw", - "termsOfUseUrl": "https://github.com/microsoft/Entraclaw" + "websiteUrl": "https://github.com/microsoft/entraclaw", + "privacyUrl": "https://github.com/microsoft/entraclaw", + "termsOfUseUrl": "https://github.com/microsoft/entraclaw" }, "name": { "short": "EntraClaw Agent", From b2cefe89853f338e7bc11695f7cc63d3fba0db2e Mon Sep 17 00:00:00 2001 From: Brandon Werner Date: Tue, 26 May 2026 15:58:29 -0700 Subject: [PATCH 03/13] docs: clarify research project framing Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index 80d1acf..b5d130c 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,7 @@ # Entraclaw: Identity Research for Microsoft 365 Agents -Entraclaw is a Python MCP server that gives a device-local agent its own Entra **Agent ID** and an **Agent User** that has all the capabilities of a human user in a Microsoft tenant. It can have a Teams presence and be invited to meetings to chat with your colleagues 1:1, a mailbox it can monitor and respond to, create and edit Word documents, make PowerPoint presentations, and allows you to access your CLI. The agent signs in autonomously, sends Teams messages from its own account, and writes audit events against its own object ID. It runs on macOS, Linux, and Windows, and works with Claude Code, Copilot CLI, or any MCP-speaking client. +Entraclaw is a research project. 
It is a Python MCP server that gives a device-local agent its own Entra **Agent ID** and an **Agent User** that has all the capabilities of a human user in a Microsoft tenant. It can have a Teams presence and be invited to meetings to chat with your colleagues 1:1, a mailbox it can monitor and respond to, create and edit Word documents, make PowerPoint presentations, and allows you to access your CLI. The agent signs in autonomously, sends Teams messages from its own account, and writes audit events against its own object ID. It runs on macOS, Linux, and Windows, and works with Claude Code, Copilot CLI, or any MCP-speaking client. **All you need to get started is:** From b0faeb24d41c6e82f49bfd208e20d7cc3353a34d Mon Sep 17 00:00:00 2001 From: Brandon Werner Date: Tue, 26 May 2026 16:10:03 -0700 Subject: [PATCH 04/13] docs: clarify prototype status Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index b5d130c..0fdc1af 100644 --- a/README.md +++ b/README.md @@ -165,4 +165,6 @@ This is a research repo, not a production service. It runs reliably on a develop Test discipline is the contract. TDD: failing test first, implementation second. `pytest -v && ruff check .` must pass before every commit; coverage threshold is 80%. -File issues for bugs and platform questions. PRs welcome — for anything touching auth, Teams, or the body prompt, read [`docs/runbooks/hard-won-learnings.md`](docs/runbooks/hard-won-learnings.md) first. The hard-won learnings file is append-only; new gotchas get numbered entries, never deletions. \ No newline at end of file +File issues for bugs and platform questions. PRs welcome — for anything touching auth, Teams, or the body prompt, read [`docs/runbooks/hard-won-learnings.md`](docs/runbooks/hard-won-learnings.md) first. The hard-won learnings file is append-only; new gotchas get numbered entries, never deletions. 
+ +*This is a prototype. Flexible FIC and Entra Agent Identity are preview surfaces — APIs may change. The platform is designed to show the pattern and to be copyable, not to be run as-is in production.* From bfeae42eb0061e40fee96c81bdb0aa01b0c059ef Mon Sep 17 00:00:00 2001 From: Brandon Werner Date: Tue, 26 May 2026 16:46:23 -0700 Subject: [PATCH 05/13] docs: add Microsoft contribution links Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- README.md | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/README.md b/README.md index 0fdc1af..a4c168f 100644 --- a/README.md +++ b/README.md @@ -163,8 +163,25 @@ This is a research repo, not a production service. It runs reliably on a develop ## Contributing +This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and do grant us the rights to use your contribution. For details, visit . + +When you submit a pull request, a CLA bot automatically determines whether you need to provide a CLA and decorates the PR appropriately. Follow the bot's instructions. You only need to do this once across all repositories using the Microsoft CLA. + +This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information, see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or comments. + Test discipline is the contract. TDD: failing test first, implementation second. `pytest -v && ruff check .` must pass before every commit; coverage threshold is 80%. File issues for bugs and platform questions. PRs welcome — for anything touching auth, Teams, or the body prompt, read [`docs/runbooks/hard-won-learnings.md`](docs/runbooks/hard-won-learnings.md) first. 
The hard-won learnings file is append-only; new gotchas get numbered entries, never deletions. +Useful links: + +- [Code of Conduct](https://opensource.microsoft.com/codeofconduct/) +- [Security policy](SECURITY.md) +- [MIT License](LICENSE) +- [Microsoft Open Source](https://opensource.microsoft.com/) +- [Microsoft Privacy Statement](https://privacy.microsoft.com/privacystatement) +- [Microsoft Trademarks](https://www.microsoft.com/legal/intellectualproperty/trademarks) + +## Disclaimer + *This is a prototype. Flexible FIC and Entra Agent Identity are preview surfaces — APIs may change. The platform is designed to show the pattern and to be copyable, not to be run as-is in production.* From 83a9f31b13060875cfd884ea8a84be077f5733bf Mon Sep 17 00:00:00 2001 From: Brandon Werner Date: Tue, 26 May 2026 17:18:14 -0700 Subject: [PATCH 06/13] Add CodeQL analysis workflow configuration --- .github/workflows/codeql.yml | 101 +++++++++++++++++++++++++++++++++++ 1 file changed, 101 insertions(+) create mode 100644 .github/workflows/codeql.yml diff --git a/.github/workflows/codeql.yml b/.github/workflows/codeql.yml new file mode 100644 index 0000000..c67705c --- /dev/null +++ b/.github/workflows/codeql.yml @@ -0,0 +1,101 @@ +# For most projects, this workflow file will not need changing; you simply need +# to commit it to your repository. +# +# You may wish to alter this file to override the set of languages analyzed, +# or to provide custom queries or build logic. +# +# ******** NOTE ******** +# We have attempted to detect the languages in your repository. Please check +# the `language` matrix defined below to confirm you have the correct set of +# supported CodeQL languages. +# +name: "CodeQL Advanced" + +on: + push: + branches: [ "main" ] + pull_request: + branches: [ "main" ] + schedule: + - cron: '41 11 * * 4' + +jobs: + analyze: + name: Analyze (${{ matrix.language }}) + # Runner size impacts CodeQL analysis time. 
To learn more, please see: + # - https://gh.io/recommended-hardware-resources-for-running-codeql + # - https://gh.io/supported-runners-and-hardware-resources + # - https://gh.io/using-larger-runners (GitHub.com only) + # Consider using larger runners or machines with greater resources for possible analysis time improvements. + runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }} + permissions: + # required for all workflows + security-events: write + + # required to fetch internal or private CodeQL packs + packages: read + + # only required for workflows in private repositories + actions: read + contents: read + + strategy: + fail-fast: false + matrix: + include: + - language: actions + build-mode: none + - language: python + build-mode: none + # CodeQL supports the following values keywords for 'language': 'actions', 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'rust', 'swift' + # Use `c-cpp` to analyze code written in C, C++ or both + # Use 'java-kotlin' to analyze code written in Java, Kotlin or both + # Use 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both + # To learn more about changing the languages that are analyzed or customizing the build mode for your analysis, + # see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning. + # If you are analyzing a compiled language, you can modify the 'build-mode' for that language to customize how + # your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages + steps: + - name: Checkout repository + uses: actions/checkout@v4 + + # Add any setup steps before running the `github/codeql-action/init` action. + # This includes steps like installing compilers or runtimes (`actions/setup-node` + # or others). 
This is typically only required for manual builds. + # - name: Setup runtime (example) + # uses: actions/setup-example@v1 + + # Initializes the CodeQL tools for scanning. + - name: Initialize CodeQL + uses: github/codeql-action/init@v4 + with: + languages: ${{ matrix.language }} + build-mode: ${{ matrix.build-mode }} + # If you wish to specify custom queries, you can do so here or in a config file. + # By default, queries listed here will override any specified in a config file. + # Prefix the list here with "+" to use these queries and those in the config file. + + # For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs + # queries: security-extended,security-and-quality + + # If the analyze step fails for one of the languages you are analyzing with + # "We were unable to automatically build your code", modify the matrix above + # to set the build mode to "manual" for that language. Then modify this step + # to build your code. + # ℹ️ Command-line programs to run using the OS shell. 
+ # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun + - name: Run manual build steps + if: matrix.build-mode == 'manual' + shell: bash + run: | + echo 'If you are using a "manual" build mode for one or more of the' \ + 'languages you are analyzing, replace this with the commands to build' \ + 'your code, for example:' + echo ' make bootstrap' + echo ' make release' + exit 1 + + - name: Perform CodeQL Analysis + uses: github/codeql-action/analyze@v4 + with: + category: "/language:${{matrix.language}}" From 00e8b1fe61fb0cae3871fba07cd18538aa8ee857 Mon Sep 17 00:00:00 2001 From: Brandon Werner Date: Tue, 26 May 2026 17:20:47 -0700 Subject: [PATCH 07/13] ci: run CodeQL on dev Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/workflows/codeql.yml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/codeql.yml b/.github/workflows/codeql.yml index c67705c..cd9f500 100644 --- a/.github/workflows/codeql.yml +++ b/.github/workflows/codeql.yml @@ -13,9 +13,9 @@ name: "CodeQL Advanced" on: push: - branches: [ "main" ] + branches: [ "main", "dev" ] pull_request: - branches: [ "main" ] + branches: [ "main", "dev" ] schedule: - cron: '41 11 * * 4' From f79bce7f5c5242c57dd028724df2e9b1debcb287 Mon Sep 17 00:00:00 2001 From: Brandon Werner Date: Tue, 26 May 2026 17:29:47 -0700 Subject: [PATCH 08/13] fix: address CodeQL alerts on dev Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- .github/workflows/test-windows.yml | 3 +++ .../agent-foundry-entra-provisioning.py | 4 ++-- scripts/entra_provisioning.py | 4 ++-- src/entraclaw/bot/server.py | 2 +- tests/test_mcp_server_integration.py | 9 ++++++-- tests/test_preflight.py | 7 +++++- tests/tools/test_teams.py | 22 +++++++++---------- 7 files changed, 32 insertions(+), 19 deletions(-) diff --git a/.github/workflows/test-windows.yml b/.github/workflows/test-windows.yml index 
0f71d26..fcb0b77 100644 --- a/.github/workflows/test-windows.yml +++ b/.github/workflows/test-windows.yml @@ -11,6 +11,9 @@ on: pull_request: branches: [main] +permissions: + contents: read + jobs: test-windows: runs-on: windows-latest diff --git a/docs/reference/agent-foundry-entra-provisioning.py b/docs/reference/agent-foundry-entra-provisioning.py index 2f50bed..c7b5e6d 100644 --- a/docs/reference/agent-foundry-entra-provisioning.py +++ b/docs/reference/agent-foundry-entra-provisioning.py @@ -469,11 +469,11 @@ def verify_graph_preflight(token, checks): def bootstrap_cli(): try: required_values = build_required_permission_values(include_ca=True) - client_id, _, tenant_id = ensure_app_registration(required_values) + provisioner_app_id, _, tenant_id = ensure_app_registration(required_values) print("") print("Provisioner bootstrap complete") print(f" Tenant: {tenant_id}") - print(f" Client ID: {client_id}") + print(f" App ID: {provisioner_app_id}") print(f" Permissions: {', '.join(required_values)}") return 0 except ProvisionerBootstrapError as exc: diff --git a/scripts/entra_provisioning.py b/scripts/entra_provisioning.py index 27e348b..75daf9d 100644 --- a/scripts/entra_provisioning.py +++ b/scripts/entra_provisioning.py @@ -668,11 +668,11 @@ def ensure_app_registration( removed = _remove_legacy_password_credentials(client_id) if removed: print( - f" Removed {removed} legacy password credential(s) from " + f" Removed {removed} legacy app credential(s) from " f"Provisioner app (cert-auth only from here on)." 
) except ProvisionerBootstrapError as exc: - print(f" WARN: could not enumerate/delete legacy password creds: {exc}") + print(f" WARN: could not enumerate/delete legacy app credentials: {type(exc).__name__}") # Cert-auth path: generate + upload + Keychain-store if absent if not pem_bundle: diff --git a/src/entraclaw/bot/server.py b/src/entraclaw/bot/server.py index 426956f..79eac44 100644 --- a/src/entraclaw/bot/server.py +++ b/src/entraclaw/bot/server.py @@ -247,7 +247,7 @@ async def handle_messages(request: web.Request) -> web.Response: return web.Response(status=401, text="Unauthorized") except Exception as exc: logger.error("Failed to process activity: %s", exc) - return web.Response(status=500, text=str(exc)) + return web.Response(status=500, text="Internal server error") async def health(request: web.Request) -> web.Response: return web.json_response({"status": "ok"}) diff --git a/tests/test_mcp_server_integration.py b/tests/test_mcp_server_integration.py index 8d696d1..c0d9779 100644 --- a/tests/test_mcp_server_integration.py +++ b/tests/test_mcp_server_integration.py @@ -324,9 +324,14 @@ def test_persona_load_success_is_logged( with caplog.at_level(logging.INFO, logger="entraclaw"): _load_agent_instructions() - msgs = [r.getMessage() for r in caplog.records if r.name == "entraclaw"] + records = [r for r in caplog.records if r.name == "entraclaw"] + msgs = [r.getMessage() for r in records] assert any("persona-sati prompt loaded" in m for m in msgs), msgs - assert any("https://persona.example" in m for m in msgs), msgs + assert any( + r.msg == "persona-sati prompt loaded (url=%s, body_chars=%d, persona_chars=%d)" + and r.args[0] == "https://persona.example" + for r in records + ), msgs def test_persona_env_unset_is_logged( self, diff --git a/tests/test_preflight.py b/tests/test_preflight.py index 9475601..6c9ef4a 100644 --- a/tests/test_preflight.py +++ b/tests/test_preflight.py @@ -83,7 +83,12 @@ def test_warn_when_no_teams_capable_sku_available(self) -> 
None: result = check_teams_license_availability("fake-token") assert result.status == "warn" - assert "admin.microsoft.com" in (result.remediation or "") + assert result.remediation == ( + "Buy a Teams-capable license at " + "https://admin.microsoft.com/Adminportal/Home#/catalog " + "(or any Teams-capable license: M365 Business Premium, E3, E5, etc.)" + " and re-run setup.sh, or assign an existing license manually before testing." + ) def test_warn_when_teams_sku_exists_but_all_seats_consumed(self) -> None: with respx.mock: diff --git a/tests/tools/test_teams.py b/tests/tools/test_teams.py index 507e978..2f32043 100644 --- a/tests/tools/test_teams.py +++ b/tests/tools/test_teams.py @@ -10,6 +10,7 @@ import os from unittest.mock import MagicMock, patch +from urllib.parse import parse_qs import httpx import pytest @@ -234,9 +235,8 @@ def test_default_resource_scope_is_graph(self) -> None: acquire_agent_user_token(get_config()) - hop3_body = dict(x.split("=") for x in route.calls[2].request.content.decode().split("&")) - # URL-encoded "https://graph.microsoft.com/.default" - assert "graph.microsoft.com" in hop3_body["scope"] + hop3_body = parse_qs(route.calls[2].request.content.decode()) + assert hop3_body["scope"] == ["https://graph.microsoft.com/.default"] @respx.mock def test_resource_scope_override(self) -> None: @@ -261,13 +261,13 @@ def test_resource_scope_override(self) -> None: assert token == "storage-token" # Hops 1 and 2 still target the FIC exchange scope - hop1_body = dict(x.split("=") for x in route.calls[0].request.content.decode().split("&")) - hop2_body = dict(x.split("=") for x in route.calls[1].request.content.decode().split("&")) - assert "AzureADTokenExchange" in hop1_body["scope"] - assert "AzureADTokenExchange" in hop2_body["scope"] + hop1_body = parse_qs(route.calls[0].request.content.decode()) + hop2_body = parse_qs(route.calls[1].request.content.decode()) + assert hop1_body["scope"] == ["api://AzureADTokenExchange/.default"] + assert 
hop2_body["scope"] == ["api://AzureADTokenExchange/.default"] # Hop 3 carries the storage resource - hop3_body = dict(x.split("=") for x in route.calls[2].request.content.decode().split("&")) - assert "storage.azure.com" in hop3_body["scope"] + hop3_body = parse_qs(route.calls[2].request.content.decode()) + assert hop3_body["scope"] == ["https://storage.azure.com/.default"] class TestAcquireAgentUserStorageToken: @@ -292,8 +292,8 @@ def test_uses_storage_scope(self) -> None: token = acquire_agent_user_storage_token(get_config()) assert token == "storage-tok" - hop3_body = dict(x.split("=") for x in route.calls[2].request.content.decode().split("&")) - assert "storage.azure.com" in hop3_body["scope"] + hop3_body = parse_qs(route.calls[2].request.content.decode()) + assert hop3_body["scope"] == ["https://storage.azure.com/.default"] # --------------------------------------------------------------------------- From 4c4914ec0da70b1cd35230574fd3cf1b4de8f0f5 Mon Sep 17 00:00:00 2001 From: Brandon Werner Date: Tue, 26 May 2026 17:33:42 -0700 Subject: [PATCH 09/13] fix: avoid credential-adjacent provisioning output Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/reference/agent-foundry-entra-provisioning.py | 4 +--- scripts/entra_provisioning.py | 12 +++--------- 2 files changed, 4 insertions(+), 12 deletions(-) diff --git a/docs/reference/agent-foundry-entra-provisioning.py b/docs/reference/agent-foundry-entra-provisioning.py index c7b5e6d..110696b 100644 --- a/docs/reference/agent-foundry-entra-provisioning.py +++ b/docs/reference/agent-foundry-entra-provisioning.py @@ -469,11 +469,9 @@ def verify_graph_preflight(token, checks): def bootstrap_cli(): try: required_values = build_required_permission_values(include_ca=True) - provisioner_app_id, _, tenant_id = ensure_app_registration(required_values) + ensure_app_registration(required_values) print("") print("Provisioner bootstrap complete") - print(f" Tenant: {tenant_id}") - print(f" App ID: 
{provisioner_app_id}") print(f" Permissions: {', '.join(required_values)}") return 0 except ProvisionerBootstrapError as exc: diff --git a/scripts/entra_provisioning.py b/scripts/entra_provisioning.py index 75daf9d..5202bda 100644 --- a/scripts/entra_provisioning.py +++ b/scripts/entra_provisioning.py @@ -665,12 +665,8 @@ def ensure_app_registration( # backdoor — remove them unconditionally. This closes the window # opened by the secret-auth era. try: - removed = _remove_legacy_password_credentials(client_id) - if removed: - print( - f" Removed {removed} legacy app credential(s) from " - f"Provisioner app (cert-auth only from here on)." - ) + if _remove_legacy_password_credentials(client_id): + print(" Removed legacy app credentials from Provisioner app.") except ProvisionerBootstrapError as exc: print(f" WARN: could not enumerate/delete legacy app credentials: {type(exc).__name__}") @@ -797,11 +793,9 @@ def bootstrap_cli() -> int: """Bootstrap the provisioner app and print summary.""" try: required_values = build_required_permission_values() - client_id, _, tenant_id = ensure_app_registration(required_values) + ensure_app_registration(required_values) print("") print("Provisioner bootstrap complete") - print(f" Tenant: {tenant_id}") - print(f" Client ID: {client_id}") print(f" Permissions: {', '.join(required_values)}") return 0 except ProvisionerBootstrapError as exc: From 821a6b62db97d973ebf2c27c0f69cf8685401630 Mon Sep 17 00:00:00 2001 From: Evan Alferez Date: Wed, 27 May 2026 09:43:15 +0900 Subject: [PATCH 10/13] fix: advance email poll cursor by 1 ms to prevent restart re-push Graph's receivedDateTime gt filter can re-deliver messages at the cursor's exact second after a server restart when per-session dedup is lost. Bump the watermark by 1 ms after each poll batch and isolate email poll tests from blob env leakage. 
--- TODOS.md | 4 +-- docs/engineering-status.md | 4 +-- src/entraclaw/tools/email_poll.py | 20 ++++++++++++++ tests/tools/test_email_poll.py | 43 +++++++++++++++++++++++++++++-- 4 files changed, 65 insertions(+), 6 deletions(-) diff --git a/TODOS.md b/TODOS.md index 9acb1fd..f4daf80 100644 --- a/TODOS.md +++ b/TODOS.md @@ -59,9 +59,9 @@ Two bugs, both observed at 2026-04-17T17:00:00 PDT (= 00:00:01 UTC 2026-04-18): - **Effort:** S (~30 LOC + tests for both) - **Source:** Live observation 2026-04-17 evening (first real scheduled fire) -### Email cursor sub-second precision +### ~~Email cursor sub-second precision~~ ✅ DONE `email_poll.poll_once` returns `latest_ts` verbatim from Graph; the cursor file may end up at second precision while Graph internally compares with sub-second. Result: an email at the cursor's exact second gets re-returned every poll. Per-session dedup in `_background_poll_email` handles within-session, but the email re-pushes once on every server restart. Real fix: bump cursor by 1ms when it equals the latest receivedDateTime, or store sub-second precision unconditionally. -- **Effort:** XS (~10 LOC + 1 test) +- **Shipped:** `advance_cursor()` bumps the poll watermark by 1 ms after each batch (PR pending). - **Source:** Live observation 2026-04-17 (a teammate's "Ball game tonight" loop) ### ~~Token auto-refresh in teams_send~~ ✅ DONE diff --git a/docs/engineering-status.md b/docs/engineering-status.md index 7970362..692761e 100644 --- a/docs/engineering-status.md +++ b/docs/engineering-status.md @@ -10,15 +10,15 @@ Source of truth for detail: `TODOS.md` in the repository root. One line each below. - **Script-toolkit docs closeout** — `./status.sh` is the canonical entry; finish the remaining script-reference polish and smoke verification. See `TODOS.md` P1. 
-- **Test isolation: blob env leakage** — `tmp_data_dir` fixture in `tests/tools/test_interaction_log.py` doesn't clear `ENTRACLAW_BLOB_ENDPOINT`; 10 tests fail on any machine with blob env configured. +- **Test isolation: blob env leakage** — `tmp_data_dir` fixture in `tests/tools/test_interaction_log.py` doesn't clear `ENTRACLAW_BLOB_ENDPOINT`; 10 tests fail on any machine with blob env configured. Partially addressed: `test_interaction_log.py`, `test_daily_summary.py`, and `test_email_poll.py` fixtures now unset blob env; session-scoped autouse fixture still open. - **MCP server orphans on Claude Code exit** — background poll tasks sit outside FastMCP's lifespan cancel scope; new sessions spawn a second server, both poll Graph independently. - **Daily summary scheduler — wrong day + double-fire** — UTC-based `target_day` summarizes the brand-new UTC day at 5pm PDT; scheduler fired twice at the same second on 2026-04-17. -- **Email cursor sub-second precision** — cursor file at second precision; an email at the cursor's exact second gets re-pushed once on every server restart. ## Recently Shipped Last ~30 days. Full diff: `git log --since="2026-04-21"`. +- **Email cursor sub-second precision** (2026-05-27) — `advance_cursor()` bumps the poll watermark by 1 ms so Graph's `gt` filter does not re-fetch messages at the cursor's exact second after a server restart. - **README + docs-site refresh** (2026-05-21, ff9a8dd, 9b73dee, b495073) — developer-first README rewrite, GitHub Pages auto-deploy, nav restructure. - **OSS sanitization passes** (2026-05-21, f2a3c18; 2026-05-18, 6cff243) — PII scrub, personal data and private identifiers removed from repo. - **Script toolkit refactor + E2E smoke harness** (2026-05-19, PR #77) — `./status.sh` consolidated; `setup.sh --status` delegates to the same implementation. 
diff --git a/src/entraclaw/tools/email_poll.py b/src/entraclaw/tools/email_poll.py index 70b7380..8d50eac 100644 --- a/src/entraclaw/tools/email_poll.py +++ b/src/entraclaw/tools/email_poll.py @@ -14,6 +14,7 @@ from __future__ import annotations import logging +from datetime import UTC, datetime, timedelta from pathlib import Path import httpx @@ -59,6 +60,23 @@ def save_cursor(ts: str) -> None: _cursor_path().write_text(ts.strip()) +def advance_cursor(ts: str) -> str: + """Return a timestamp strictly after *ts* for the next poll watermark. + + Graph's ``receivedDateTime gt {cursor}`` filter excludes messages at or + before the cursor. When timestamps lack sub-second precision, bumping by + 1 ms prevents messages at the cursor's exact second from being re-fetched + after a server restart (per-session dedup is lost on restart). + """ + dt = datetime.fromisoformat(ts.replace("Z", "+00:00")) + if dt.tzinfo is None: + dt = dt.replace(tzinfo=UTC) + advanced = (dt + timedelta(milliseconds=1)).astimezone(UTC).replace(tzinfo=None) + if advanced.microsecond: + return advanced.strftime("%Y-%m-%dT%H:%M:%S.") + f"{advanced.microsecond // 1000:03d}Z" + return advanced.strftime("%Y-%m-%dT%H:%M:%SZ") + + def is_substantive(address: str | None) -> bool: """Return True if *address* looks like a real human sender. 
@@ -127,6 +145,8 @@ async def poll_once( substantive.append(msg) + if latest_ts is not None: + latest_ts = advance_cursor(latest_ts) return substantive, latest_ts diff --git a/tests/tools/test_email_poll.py b/tests/tools/test_email_poll.py index 1873de5..c98c579 100644 --- a/tests/tools/test_email_poll.py +++ b/tests/tools/test_email_poll.py @@ -21,6 +21,7 @@ from entraclaw.tools.email_poll import ( GRAPH_MESSAGES_URL, + advance_cursor, is_substantive, load_cursor, poll_once, @@ -31,6 +32,8 @@ @pytest.fixture def tmp_data_dir(tmp_path, monkeypatch): monkeypatch.setenv("ENTRACLAW_DATA_DIR", str(tmp_path)) + monkeypatch.delenv("ENTRACLAW_BLOB_ENDPOINT", raising=False) + monkeypatch.delenv("ENTRACLAW_BLOB_CONTAINER", raising=False) return tmp_path @@ -65,6 +68,20 @@ def test_case_insensitive(self) -> None: assert not is_substantive("NO-REPLY@Teams.Mail.Microsoft") +# --------------------------------------------------------------------------- +# advance_cursor +# --------------------------------------------------------------------------- +class TestAdvanceCursor: + def test_bumps_second_precision_timestamp_by_one_millisecond(self) -> None: + assert advance_cursor("2026-04-16T19:00:00Z") == "2026-04-16T19:00:00.001Z" + + def test_bumps_subsecond_timestamp_by_one_millisecond(self) -> None: + assert advance_cursor("2026-04-16T19:00:00.500Z") == "2026-04-16T19:00:00.501Z" + + def test_carries_overflow_into_next_second(self) -> None: + assert advance_cursor("2026-04-16T19:00:00.999Z") == "2026-04-16T19:00:01Z" + + # --------------------------------------------------------------------------- # cursor persistence # --------------------------------------------------------------------------- @@ -151,7 +168,7 @@ async def test_filters_non_substantive_senders(self) -> None: assert msgs[0]["id"] == "2" # Cursor advances to the latest received message across ALL returned # (including filtered) so we don't re-scan the same noise next poll. 
- assert new_cursor == "2026-04-16T19:07:00Z" + assert new_cursor == "2026-04-16T19:07:00.001Z" @pytest.mark.asyncio async def test_advances_cursor_to_latest(self) -> None: @@ -166,7 +183,29 @@ async def test_advances_cursor_to_latest(self) -> None: ) msgs, new_cursor = await poll_once(token="tok", cursor="2026-04-16T19:00:00Z") assert len(msgs) == 3 - assert new_cursor == "2026-04-16T19:09:00Z" + assert new_cursor == "2026-04-16T19:09:00.001Z" + + @pytest.mark.asyncio + async def test_same_second_message_not_re_fetched_on_next_poll(self) -> None: + """Cursor must advance past the latest message so gt filter skips it.""" + ts = "2026-04-16T19:00:00Z" + value = [_msg(msg_id="same-sec", sender="u@example.com", received=ts)] + captured_filters: list[str] = [] + + def handler(request): + captured_filters.append(dict(request.url.params).get("$filter", "")) + payload = {"value": value} if not captured_filters[0] else {"value": []} + return httpx.Response(200, json=payload) + + with respx.mock: + respx.get(GRAPH_MESSAGES_URL).mock(side_effect=handler) + _, first_cursor = await poll_once(token="tok", cursor=None) + _, second_cursor = await poll_once(token="tok", cursor=first_cursor) + + assert first_cursor == "2026-04-16T19:00:00.001Z" + assert len(captured_filters) == 2 + assert captured_filters[1] == "receivedDateTime gt 2026-04-16T19:00:00.001Z" + assert second_cursor == first_cursor @pytest.mark.asyncio async def test_cursor_filter_passed_to_graph(self) -> None: From 28fdfc095bef8e523f1ff3a7bdaabe4f21d70c75 Mon Sep 17 00:00:00 2001 From: "B. Brandon Werner" Date: Wed, 27 May 2026 14:41:05 -0700 Subject: [PATCH 11/13] feat(tools): add read_email MCP tool for fetching full inbound mail bodies MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The 60-second email-poll channel push truncates the body preview of inbound mail. 
Long forwards (recipient lists, threaded replies, attached metadata) get cut off mid-content, so the agent can't read past the cut even when the message_id is right there in the push. Adds `read_email(message_id, mailbox="")` which calls Graph `GET /me/messages/{message_id}` (or `/users/{mailbox}/messages/{id}` for shared mailboxes) with `$select` covering body (text + HTML), all recipient lists, sender, subject, internetMessageHeaders, and hasAttachments. Reuses the same Agent User token chain + `Mail.Read` scope as `email_poll`. Errors mirror `send_email`: 401 → TokenExpiredError (auto-refresh + retry); 404/403/5xx → clean {"error", "status", "message_id"} dict; bearer token never echoed. +7 tests (happy path + verbatim long body + shared mailbox + 401/404/500 + no-token-leak). Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/engineering-status.md | 5 +- src/entraclaw/mcp_server.py | 44 ++++++ src/entraclaw/tools/email.py | 70 ++++++++++ tests/tools/test_read_email.py | 239 +++++++++++++++++++++++++++++++++ 4 files changed, 356 insertions(+), 2 deletions(-) create mode 100644 tests/tools/test_read_email.py diff --git a/docs/engineering-status.md b/docs/engineering-status.md index 692761e..6d58404 100644 --- a/docs/engineering-status.md +++ b/docs/engineering-status.md @@ -1,7 +1,7 @@ # Engineering Status -**Last updated:** 2026-05-21 -**Status:** v1 released. Three auth modes (Agent User / Delegated / Bot Gateway) running locally on macOS, Linux, and ARM64 Windows 11. **1,237 tests** across the suite, ruff clean. Body-first prompt architecture loads at boot; persona-sati MCP wires personality and memory when configured. ADR-005 cloud-memory Phases 1, 2, 5, 6a shipped — blob storage is opt-in via `setup.sh --use-cloud-memory`. Work IQ Word migration landed (PR #75) and the `send_teams_message` auto-wait pattern is host-gated and deterministic. README, docs site, and GitHub Pages auto-deploy refreshed 2026-05-21. 
+**Last updated:** 2026-05-27 +**Status:** v1 released. Three auth modes (Agent User / Delegated / Bot Gateway) running locally on macOS, Linux, and ARM64 Windows 11. **1,248 tests** across the suite, ruff clean. Body-first prompt architecture loads at boot; persona-sati MCP wires personality and memory when configured. ADR-005 cloud-memory Phases 1, 2, 5, 6a shipped — blob storage is opt-in via `setup.sh --use-cloud-memory`. Work IQ Word migration landed (PR #75) and the `send_teams_message` auto-wait pattern is host-gated and deterministic. README, docs site, and GitHub Pages auto-deploy refreshed 2026-05-21. --- @@ -18,6 +18,7 @@ Source of truth for detail: `TODOS.md` in the repository root. One line each bel Last ~30 days. Full diff: `git log --since="2026-04-21"`. +- **`read_email` MCP tool** (2026-05-27) — fetches the full body + all recipient lists + headers of an inbound mail by `message_id`. Fixes the gap where the 60s email-poll channel push truncates the preview of long forwarded mails. Same three-hop Agent User token + `Mail.Read` scope as the poll. +7 tests. - **Email cursor sub-second precision** (2026-05-27) — `advance_cursor()` bumps the poll watermark by 1 ms so Graph's `gt` filter does not re-fetch messages at the cursor's exact second after a server restart. - **README + docs-site refresh** (2026-05-21, ff9a8dd, 9b73dee, b495073) — developer-first README rewrite, GitHub Pages auto-deploy, nav restructure. - **OSS sanitization passes** (2026-05-21, f2a3c18; 2026-05-18, 6cff243) — PII scrub, personal data and private identifiers removed from repo. diff --git a/src/entraclaw/mcp_server.py b/src/entraclaw/mcp_server.py index fb53758..4f30a24 100644 --- a/src/entraclaw/mcp_server.py +++ b/src/entraclaw/mcp_server.py @@ -2456,6 +2456,50 @@ async def send_email( return json.dumps(result, indent=2) +@mcp.tool() +async def read_email( + message_id: str, + mailbox: str = "", +) -> str: + """Fetch the full body + recipients + headers of an inbound email. 
+ + Use when the 60-second email-poll channel push has truncated the + body (forwarded recipient lists, threaded replies, attached + metadata) and the agent needs to read past where the preview cut + off. Calls Graph ``GET /me/messages/{message_id}`` with a + ``$select`` covering body (text + HTML), all recipient lists, + sender, subject, message headers, and ``hasAttachments``. + + Args: + message_id: The Graph ``id`` of the message (from the channel + push or a prior ``read_email`` / poll result). + mailbox: Optional shared mailbox UPN/address. Empty (default) + reads from the Agent User's own mailbox. + + Returns: + JSON with the full Graph message object on success, or + ``{"error": "...", "status": , "message_id": "..."}`` on + 404 / 403 / 5xx. 401 triggers an automatic token refresh and + retry; only an exhausted retry surfaces. + """ + await _initialize() + + if not message_id or not message_id.strip(): + return json.dumps({"error": "message_id is required."}) + + from entraclaw.tools.email import read_email as _read + + await _ensure_valid_token() + + result = await _with_token_retry( + _read, + message_id=message_id, + mailbox=mailbox, + ) + + return json.dumps(result, indent=2) + + @mcp.tool() async def send_card( card_type: str, diff --git a/src/entraclaw/tools/email.py b/src/entraclaw/tools/email.py index 166c4bb..2e0e129 100644 --- a/src/entraclaw/tools/email.py +++ b/src/entraclaw/tools/email.py @@ -25,6 +25,16 @@ GRAPH_BASE = "https://graph.microsoft.com/v1.0" GRAPH_SENDMAIL_URL = f"{GRAPH_BASE}/me/sendMail" +# Fields the agent needs when re-reading a poll-pushed message: full body +# (text + HTML), all recipient lists, message headers, attachment flag. +# The 60s email poll only ships a short preview via the channel push; +# longer mails truncate. ``read_email`` fixes that. 
+READ_EMAIL_SELECT = ( + "body,bodyPreview,toRecipients,ccRecipients,bccRecipients," + "from,sender,subject,internetMessageHeaders,hasAttachments," + "receivedDateTime,id" +) + class EmailSendError(EntraClawError): """Graph ``sendMail`` / reply endpoint returned a non-2xx (non-401/429).""" @@ -149,3 +159,63 @@ async def send_email( f"Graph rejected mail send ({resp.status_code}): {err_body}", status_code=resp.status_code, ) + + +async def read_email( + *, + message_id: str, + mailbox: str = "", + token: str, + transport: httpx.AsyncBaseTransport | None = None, +) -> dict: + """Fetch the full body + recipients + headers of a message by id. + + The 60-second email poll only ships a short body preview via the + channel push; longer mails (forwarded recipient lists, threaded + replies, attached metadata) get truncated. This helper calls + ``GET /me/messages/{id}`` (or ``/users/{mailbox}/messages/{id}`` for + shared mailboxes) with ``$select`` covering every field the agent + needs to act on a real inbound mail. + + Returns the Graph message JSON unchanged on success — body is + returned verbatim with no truncation on our side. + + Errors: + * 401 → ``TokenExpiredError`` (caller refreshes + retries via + ``_with_token_retry``). + * 404 / 403 / 5xx → ``{"error": "...", "status": }`` so + the caller can surface "no such message" without an + exception. The bearer token is never echoed into the result. 
+ """ + if mailbox: + url = f"{GRAPH_BASE}/users/{mailbox}/messages/{message_id}" + else: + url = f"{GRAPH_BASE}/me/messages/{message_id}" + + headers = {"Authorization": f"Bearer {token}"} + params = {"$select": READ_EMAIL_SELECT} + + async with httpx.AsyncClient(transport=transport) as client: + resp = await client.get(url, params=params, headers=headers) + + if resp.status_code == 200: + return resp.json() + + if resp.status_code == 401: + raise TokenExpiredError("Agent User token expired — re-acquire via three-hop flow") + + # Non-2xx, non-401: surface the Graph error body so the operator + # can see *why* the read failed, but never include the token. + try: + err_body = resp.json() + err_msg = err_body.get("error", {}).get("message") or str(err_body) + except Exception: + err_msg = resp.text or f"Graph returned HTTP {resp.status_code}" + + logger.info( + "read_email failed for %s (mailbox=%s): status=%d", + message_id, + mailbox or "", + resp.status_code, + ) + return {"error": err_msg, "status": resp.status_code, "message_id": message_id} diff --git a/tests/tools/test_read_email.py b/tests/tools/test_read_email.py new file mode 100644 index 0000000..d34f602 --- /dev/null +++ b/tests/tools/test_read_email.py @@ -0,0 +1,239 @@ +"""Tests for entraclaw.tools.email.read_email. + +``read_email`` wraps Graph ``GET /me/messages/{message_id}`` (or +``/users/{mailbox}/messages/{message_id}`` for shared mailboxes) so the +agent can fetch the FULL body of an inbound message by id. The +60-second email poll only ships a short preview via the channel push; +longer mails (forwarded recipient lists, threaded replies) truncate +mid-content. This helper fixes that gap. + +Error shape mirrors ``send_email``: 401 → ``TokenExpiredError``, +404 returns a clean ``{"error": ..., "status": 404}`` dict so the +caller can handle missing-mail without an exception. 
+""" + +from __future__ import annotations + +import httpx +import pytest +import respx + +GRAPH_BASE = "https://graph.microsoft.com/v1.0" +GRAPH_MESSAGE_URL_TMPL = GRAPH_BASE + "/me/messages/{message_id}" +GRAPH_MESSAGE_USER_URL_TMPL = GRAPH_BASE + "/users/{mailbox}/messages/{message_id}" + + +# --------------------------------------------------------------------------- +# happy path +# --------------------------------------------------------------------------- +class TestReadEmailHappyPath: + @pytest.mark.asyncio + async def test_returns_full_body_recipients_and_headers(self) -> None: + from entraclaw.tools.email import read_email + + message_id = "AAMkADf0==" + url = GRAPH_MESSAGE_URL_TMPL.format(message_id=message_id) + + graph_payload = { + "id": message_id, + "subject": "Fwd: contributor list", + "body": { + "contentType": "html", + "content": "

full long body that would otherwise truncate ...

", + }, + "bodyPreview": "full long body that would otherwise tru", # 40-char preview + "from": {"emailAddress": {"address": "alice@example.com", "name": "Alice"}}, + "sender": {"emailAddress": {"address": "alice@example.com", "name": "Alice"}}, + "toRecipients": [ + {"emailAddress": {"address": "bob@example.com"}}, + {"emailAddress": {"address": "carol@example.com"}}, + ], + "ccRecipients": [{"emailAddress": {"address": "dave@example.com"}}], + "bccRecipients": [], + "hasAttachments": False, + "internetMessageHeaders": [ + {"name": "Message-ID", "value": ""}, + ], + } + + captured: dict = {} + + def handler(request): + captured["url"] = str(request.url) + captured["headers"] = dict(request.headers) + return httpx.Response(200, json=graph_payload) + + with respx.mock: + respx.get(url).mock(side_effect=handler) + result = await read_email(message_id=message_id, token="tok") + + # Auth header carried through. + assert captured["headers"]["authorization"] == "Bearer tok" + # $select carries every field we care about. + for f in ( + "body", + "toRecipients", + "ccRecipients", + "bccRecipients", + "from", + "sender", + "subject", + "internetMessageHeaders", + "hasAttachments", + ): + assert f in captured["url"], f"missing field {f} in $select: {captured['url']}" + + # Full body returned verbatim (not the truncated preview). + assert result["id"] == message_id + assert result["subject"] == "Fwd: contributor list" + assert result["body"]["content"] == ( + "

full long body that would otherwise truncate ...

" + ) + assert result["body"]["contentType"] == "html" + assert result["toRecipients"] == graph_payload["toRecipients"] + assert result["ccRecipients"] == graph_payload["ccRecipients"] + assert result["bccRecipients"] == [] + assert result["from"]["emailAddress"]["address"] == "alice@example.com" + assert result["hasAttachments"] is False + assert result["internetMessageHeaders"][0]["value"] == "" + + @pytest.mark.asyncio + async def test_body_returned_verbatim_no_truncation_on_our_side(self) -> None: + """Long bodies must not be clipped — this is the whole point of the tool.""" + from entraclaw.tools.email import read_email + + message_id = "BIG==" + url = GRAPH_MESSAGE_URL_TMPL.format(message_id=message_id) + # 50 KB body — well past any preview cap. + long_body = "x" * 50_000 + + with respx.mock: + respx.get(url).mock( + return_value=httpx.Response( + 200, + json={ + "id": message_id, + "subject": "big", + "body": {"contentType": "text", "content": long_body}, + "from": {"emailAddress": {"address": "a@example.com"}}, + "toRecipients": [], + "hasAttachments": False, + }, + ) + ) + result = await read_email(message_id=message_id, token="tok") + + assert len(result["body"]["content"]) == 50_000 + assert result["body"]["content"] == long_body + + +# --------------------------------------------------------------------------- +# shared mailbox +# --------------------------------------------------------------------------- +class TestReadEmailSharedMailbox: + @pytest.mark.asyncio + async def test_mailbox_param_routes_to_users_endpoint(self) -> None: + from entraclaw.tools.email import read_email + + message_id = "SHARED==" + mailbox = "shared@example.com" + url = GRAPH_MESSAGE_USER_URL_TMPL.format(mailbox=mailbox, message_id=message_id) + + captured: dict = {} + + def handler(request): + captured["url"] = str(request.url) + return httpx.Response( + 200, + json={ + "id": message_id, + "subject": "from shared", + "body": {"contentType": "text", "content": "hi"}, + "from": 
{"emailAddress": {"address": "a@example.com"}}, + "toRecipients": [], + "hasAttachments": False, + }, + ) + + with respx.mock: + respx.get(url).mock(side_effect=handler) + result = await read_email(message_id=message_id, mailbox=mailbox, token="tok") + + assert f"/users/{mailbox}/messages/{message_id}" in captured["url"] + assert "/me/messages" not in captured["url"] + assert result["id"] == message_id + + +# --------------------------------------------------------------------------- +# error paths +# --------------------------------------------------------------------------- +class TestReadEmailErrors: + @pytest.mark.asyncio + async def test_404_returns_clean_error_dict(self) -> None: + from entraclaw.tools.email import read_email + + message_id = "MISSING==" + url = GRAPH_MESSAGE_URL_TMPL.format(message_id=message_id) + + with respx.mock: + respx.get(url).mock( + return_value=httpx.Response( + 404, + json={ + "error": { + "code": "ErrorItemNotFound", + "message": "The specified object was not found in the store.", + } + }, + ) + ) + result = await read_email(message_id=message_id, token="tok") + + assert result["status"] == 404 + assert "error" in result + # Don't echo the bearer token back, ever. 
+ assert "tok" not in str(result) + + @pytest.mark.asyncio + async def test_401_raises_token_expired(self) -> None: + from entraclaw.errors import TokenExpiredError + from entraclaw.tools.email import read_email + + message_id = "ANY==" + url = GRAPH_MESSAGE_URL_TMPL.format(message_id=message_id) + + with respx.mock: + respx.get(url).mock(return_value=httpx.Response(401)) + with pytest.raises(TokenExpiredError): + await read_email(message_id=message_id, token="expired-tok") + + @pytest.mark.asyncio + async def test_500_returns_error_dict_with_status(self) -> None: + from entraclaw.tools.email import read_email + + message_id = "BOOM==" + url = GRAPH_MESSAGE_URL_TMPL.format(message_id=message_id) + + with respx.mock: + respx.get(url).mock(return_value=httpx.Response(500, text="server boom")) + result = await read_email(message_id=message_id, token="tok") + + assert result["status"] == 500 + assert "error" in result + + @pytest.mark.asyncio + async def test_token_not_leaked_in_error_message_on_failure(self) -> None: + """If anything goes wrong, the token must not surface in the result.""" + from entraclaw.tools.email import read_email + + message_id = "ANY==" + url = GRAPH_MESSAGE_URL_TMPL.format(message_id=message_id) + secret = "super-secret-bearer-token-do-not-log" + + with respx.mock: + respx.get(url).mock( + return_value=httpx.Response(403, json={"error": {"code": "Forbidden"}}) + ) + result = await read_email(message_id=message_id, token=secret) + + assert secret not in str(result) From 8c8da64a979c2bac05868c25a09f100e63588bbf Mon Sep 17 00:00:00 2001 From: "B. Brandon Werner" Date: Wed, 27 May 2026 15:01:12 -0700 Subject: [PATCH 12/13] feat(scripts): add prereqs-macos.sh + setup.sh pwsh/.NET install fixes Homebrew-based prerequisite installer for macOS, mirroring prereqs-windows.ps1. Installs Xcode CLT, Python 3.12+, git, Azure CLI by default; .NET SDK + a365 CLI + PowerShell 7 are on by default with --skip-a365, --skip-pwsh, --core-only opt-outs. Idempotent. 
Prints a per-tool already/installed/failed summary at the end. setup.sh's prereq-missing error now points macOS users at the script (or the manual `brew install python@3.12 git azure-cli` line). Includes fixes from real install failures: PowerShell on macOS is the `powershell` formula in homebrew-core (the cask was retired), and the .NET SDK is the `dotnet` formula (not the `dotnet-sdk` cask) since the powershell formula depends on dotnet (formula). Co-Authored-By: Claude Opus 4.7 (1M context) --- scripts/prereqs-macos.sh | 361 +++++++++++++++++++++++++++++++++++++++ scripts/setup.sh | 24 +++ 2 files changed, 385 insertions(+) create mode 100755 scripts/prereqs-macos.sh diff --git a/scripts/prereqs-macos.sh b/scripts/prereqs-macos.sh new file mode 100755 index 0000000..f2509af --- /dev/null +++ b/scripts/prereqs-macos.sh @@ -0,0 +1,361 @@ +#!/usr/bin/env bash +# EntraClaw — macOS prerequisite installer. +# +# Checks for and installs (via Homebrew) everything needed to run setup.sh: +# 1. Homebrew (instructions only — installer needs sudo and EULA) +# 2. Xcode Command Line Tools (clang, git, headers for native Python pkgs) +# 3. Python 3.12+ +# 4. Git (usually shipped with Xcode CLT; brew install as fallback) +# 5. Azure CLI +# 6. .NET SDK [optional, --skip-a365 to opt out] +# 7. Microsoft Agent 365 DevTools CLI (a365) [needs .NET SDK] +# 8. PowerShell 7+ [optional, --skip-pwsh to opt out] +# +# Safe to re-run — skips anything already installed. Run BEFORE setup.sh. 
+# +# Usage: +# ./scripts/prereqs-macos.sh # install everything +# ./scripts/prereqs-macos.sh --skip-a365 # skip .NET SDK and a365 CLI +# ./scripts/prereqs-macos.sh --skip-pwsh # skip PowerShell 7 +# ./scripts/prereqs-macos.sh --core-only # skip both a365 and pwsh +# +set -euo pipefail + +SKIP_A365=false +SKIP_PWSH=false +SHOW_HELP=false + +for arg in "$@"; do + case $arg in + --skip-a365) SKIP_A365=true ;; + --skip-pwsh) SKIP_PWSH=true ;; + --core-only) SKIP_A365=true; SKIP_PWSH=true ;; + -h|--help) SHOW_HELP=true ;; + *) echo "ERROR: Unknown argument: $arg" >&2; SHOW_HELP=true ;; + esac +done + +if [ "$SHOW_HELP" = true ]; then + sed -n '2,20p' "$0" | sed 's/^# \{0,1\}//' + exit 0 +fi + +# ── Colored output (matches setup.sh) ────────────────────────────────────── +GREEN='\033[0;32m' +YELLOW='\033[1;33m' +RED='\033[0;31m' +CYAN='\033[0;36m' +GRAY='\033[0;90m' +NC='\033[0m' + +step() { echo -e "\n${CYAN}══ $1${NC}"; } +ok() { echo -e " ${GREEN}✓ $1${NC}"; } +skip() { echo -e " ${GRAY}○ $1${NC}"; } +do_install() { echo -e " ${YELLOW}→ $1${NC}"; } +warn() { echo -e " ${YELLOW}⚠ $1${NC}"; } +err() { echo -e " ${RED}✗ $1${NC}"; } + +INSTALLED=() +ALREADY=() +FAILED=() + +is_macos() { [ "$(uname -s)" = "Darwin" ]; } + +if ! is_macos; then + err "This script only runs on macOS. For Linux, use your package manager directly; for Windows, see scripts/prereqs-windows.ps1." + exit 1 +fi + +# ─────────────────────────────────────────────────────────────────────────── +# 0. Homebrew +# ─────────────────────────────────────────────────────────────────────────── +step "Homebrew (package manager)" + +# Detect arch-appropriate brew prefix and add to PATH for this session. +BREW_PREFIX="" +if [ -x /opt/homebrew/bin/brew ]; then + BREW_PREFIX="/opt/homebrew" +elif [ -x /usr/local/bin/brew ]; then + BREW_PREFIX="/usr/local" +fi + +if [ -n "$BREW_PREFIX" ] && [ -z "${HOMEBREW_PREFIX:-}" ]; then + eval "$($BREW_PREFIX/bin/brew shellenv)" +fi + +if ! 
command -v brew &>/dev/null; then + err "Homebrew not found." + echo "" + echo " Install it first (needs sudo + your password):" + echo "" + echo " /bin/bash -c \"\$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"" + echo "" + echo " Then re-run this script." + exit 1 +fi +ok "brew available ($(brew --version | head -n1))" + +# ─────────────────────────────────────────────────────────────────────────── +# 1. Xcode Command Line Tools (clang, git, headers for native Python deps) +# ─────────────────────────────────────────────────────────────────────────── +step "Xcode Command Line Tools (clang + headers for cryptography/cffi)" + +if xcode-select -p &>/dev/null; then + ok "Xcode CLT already installed ($(xcode-select -p))" + ALREADY+=("Xcode CLT") +else + do_install "Triggering Xcode CLT install (a system dialog will appear)..." + # `xcode-select --install` is interactive; it pops a GUI dialog. + # The script can't complete the install for the user, so we instruct + # and let them confirm before continuing. + xcode-select --install || true + echo "" + warn "Accept the system dialog to install Xcode CLT, then press Enter to continue (or Ctrl-C to abort)." + read -r _ + if xcode-select -p &>/dev/null; then + ok "Xcode CLT installed" + INSTALLED+=("Xcode CLT") + else + FAILED+=("Xcode CLT") + fi +fi + +# ─────────────────────────────────────────────────────────────────────────── +# 2. 
Python 3.12+ +# ─────────────────────────────────────────────────────────────────────────── +step "Python 3.12+" + +py_ver_ok() { + local cmd="$1" + command -v "$cmd" &>/dev/null || return 1 + "$cmd" -c "import sys; sys.exit(0 if sys.version_info >= (3, 12) else 1)" &>/dev/null +} + +PY_FOUND="" +for candidate in python3.13 python3.12 python3; do + if py_ver_ok "$candidate"; then + PY_FOUND="$candidate" + break + fi +done + +if [ -n "$PY_FOUND" ]; then + PY_VER=$("$PY_FOUND" -c "import sys; print(f'{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}')") + ok "Python $PY_VER already installed ($PY_FOUND → $(command -v "$PY_FOUND"))" + ALREADY+=("Python $PY_VER") +else + do_install "Installing Python 3.12 via Homebrew..." + if brew install python@3.12; then + # brew links python@3.12 as `python3.12` only; setup.sh probes for that. + if py_ver_ok python3.12; then + ok "Python 3.12 installed" + INSTALLED+=("Python 3.12") + else + warn "python@3.12 installed but python3.12 not on PATH. Open a new shell and re-run." + FAILED+=("Python 3.12 (PATH)") + fi + else + FAILED+=("Python 3.12") + fi +fi + +# ─────────────────────────────────────────────────────────────────────────── +# 3. Git (almost always present via Xcode CLT; brew install as a safety net) +# ─────────────────────────────────────────────────────────────────────────── +step "Git" + +if command -v git &>/dev/null; then + ok "Git already installed ($(git --version))" + ALREADY+=("Git") +else + do_install "Installing Git via Homebrew..." + if brew install git; then + INSTALLED+=("Git") + ok "Git installed" + else + FAILED+=("Git") + fi +fi + +# ─────────────────────────────────────────────────────────────────────────── +# 4. 
Azure CLI +# ─────────────────────────────────────────────────────────────────────────── +step "Azure CLI" + +if command -v az &>/dev/null; then + AZ_VER=$(az version --query '"azure-cli"' -o tsv 2>/dev/null || echo "?") + ok "Azure CLI $AZ_VER already installed" + ALREADY+=("Azure CLI") +else + do_install "Installing Azure CLI via Homebrew..." + if brew install azure-cli; then + INSTALLED+=("Azure CLI") + ok "Azure CLI installed" + else + FAILED+=("Azure CLI") + fi +fi + +# ─────────────────────────────────────────────────────────────────────────── +# 5. .NET SDK (optional — only needed for the Microsoft Agent 365 DevTools CLI) +# ─────────────────────────────────────────────────────────────────────────── +if [ "$SKIP_A365" = true ]; then + step ".NET SDK" + skip "Skipped (--skip-a365 / --core-only). Pass --with-a365-work-iq to setup.sh and re-run prereqs to add it later." +else + step ".NET SDK (for Microsoft Agent 365 DevTools CLI)" + + if command -v dotnet &>/dev/null; then + ok ".NET SDK already installed ($(dotnet --version))" + ALREADY+=(".NET SDK") + else + do_install "Installing .NET SDK via Homebrew formula..." + # Use the `dotnet` formula (not the `dotnet-sdk` cask). The + # `powershell` formula installed below depends on this same + # formula; mixing the cask and formula causes a symlink conflict + # on /opt/homebrew/bin/dotnet. + if brew install dotnet; then + INSTALLED+=(".NET SDK") + ok ".NET SDK installed" + else + FAILED+=(".NET SDK") + fi + fi + + # ── 6. a365 CLI (depends on dotnet) ──────────────────────────────────── + step "Microsoft Agent 365 DevTools CLI (a365)" + + # The dotnet global tools dir is not on PATH by default on macOS. + export PATH="$PATH:$HOME/.dotnet/tools" + + if ! 
command -v dotnet &>/dev/null; then + err "dotnet not found; cannot install a365" + FAILED+=("Agent 365 DevTools CLI") + elif command -v a365 &>/dev/null; then + ok "a365 already installed" + ALREADY+=("Agent 365 DevTools CLI") + do_install "Checking for a365 update..." + if dotnet tool update --global Microsoft.Agents.A365.DevTools.Cli 2>/dev/null; then + ok "a365 updated" + else + skip "a365 update skipped (already current or not managed by dotnet global tools)" + fi + else + do_install "Installing a365 via 'dotnet tool install --global'..." + if dotnet tool install --global Microsoft.Agents.A365.DevTools.Cli; then + export PATH="$PATH:$HOME/.dotnet/tools" + if command -v a365 &>/dev/null; then + INSTALLED+=("Agent 365 DevTools CLI") + ok "a365 installed" + else + warn "a365 installed but ~/.dotnet/tools is not on PATH for this shell." + warn "Add this to ~/.zshrc (or ~/.bash_profile) and open a new terminal:" + warn " export PATH=\"\$PATH:\$HOME/.dotnet/tools\"" + INSTALLED+=("Agent 365 DevTools CLI (needs PATH update)") + fi + else + FAILED+=("Agent 365 DevTools CLI") + fi + fi +fi + +# ─────────────────────────────────────────────────────────────────────────── +# 7. PowerShell 7+ (optional — only needed for --configure-a365-work-iq) +# ─────────────────────────────────────────────────────────────────────────── +if [ "$SKIP_PWSH" = true ]; then + step "PowerShell 7+" + skip "Skipped (--skip-pwsh / --core-only). Only needed for setup.sh --configure-a365-work-iq." +else + step "PowerShell 7+ (for setup.sh --configure-a365-work-iq)" + + if command -v pwsh &>/dev/null; then + PWSH_VER=$(pwsh -NoProfile -Command '$PSVersionTable.PSVersion.ToString()' 2>/dev/null || echo "?") + ok "PowerShell $PWSH_VER already installed" + ALREADY+=("PowerShell 7") + else + do_install "Installing PowerShell via Homebrew formula..." + # `powershell` is a formula in homebrew-core (depends on dotnet, + # which we've already installed by this point). 
The older cask + # of the same name was retired — `brew install --cask powershell` + # now fails with "No Cask with this name exists". + if brew install powershell; then + INSTALLED+=("PowerShell 7") + ok "PowerShell installed" + else + FAILED+=("PowerShell 7") + fi + fi +fi + +# ─────────────────────────────────────────────────────────────────────────── +# Final validation — re-probe everything setup.sh needs +# ─────────────────────────────────────────────────────────────────────────── +step "Final validation" + +export PATH="$PATH:$HOME/.dotnet/tools" + +all_good=true +check() { + local name="$1"; local cmd="$2"; local required="${3:-true}" + if command -v "$cmd" &>/dev/null; then + ok "$name found ($(command -v "$cmd"))" + else + if [ "$required" = "true" ]; then + err "$name NOT FOUND — open a new terminal and re-check" + all_good=false + else + skip "$name not installed (optional)" + fi + fi +} + +check "Homebrew" brew true +check "git" git true +check "Python 3.12+" python3.12 true +check "Azure CLI" az true +if [ "$SKIP_A365" = false ]; then + check ".NET SDK" dotnet true + check "a365 CLI" a365 true +fi +if [ "$SKIP_PWSH" = false ]; then + check "pwsh" pwsh false +fi + +# ─────────────────────────────────────────────────────────────────────────── +# Summary +# ─────────────────────────────────────────────────────────────────────────── +echo "" +echo -e "${CYAN}═══════════════════════════════════════════════════════${NC}" +echo -e "${CYAN} PREREQUISITE CHECK COMPLETE${NC}" +echo -e "${CYAN}═══════════════════════════════════════════════════════${NC}" + +if [ ${#ALREADY[@]} -gt 0 ]; then + echo -e "\n ${GREEN}Already installed:${NC}" + for i in "${ALREADY[@]}"; do echo -e " ${GREEN}• $i${NC}"; done +fi +if [ ${#INSTALLED[@]} -gt 0 ]; then + echo -e "\n ${YELLOW}Newly installed:${NC}" + for i in "${INSTALLED[@]}"; do echo -e " ${YELLOW}• $i${NC}"; done +fi +if [ ${#FAILED[@]} -gt 0 ]; then + echo -e "\n ${RED}FAILED to install:${NC}" + for i in 
"${FAILED[@]}"; do echo -e " ${RED}• $i${NC}"; done +fi + +echo "" +if [ ${#FAILED[@]} -gt 0 ]; then + err "Some prerequisites failed to install. Fix them manually and re-run." + exit 1 +elif [ "$all_good" = false ]; then + warn "Installs succeeded but some tools aren't on PATH yet." + warn "Open a NEW terminal, then run:" + warn " ./scripts/setup.sh --new --with-upn-suffix=" + exit 0 +else + ok "All prerequisites ready!" + echo "" + echo " Next step:" + echo " ./scripts/setup.sh --new --with-upn-suffix=" + echo "" + exit 0 +fi diff --git a/scripts/setup.sh b/scripts/setup.sh index 977b7d9..257bbd7 100755 --- a/scripts/setup.sh +++ b/scripts/setup.sh @@ -136,6 +136,11 @@ fi if [ "$SHOW_HELP" = true ]; then echo "Usage: ./scripts/setup.sh [OPTIONS]" echo "" + echo "Prerequisites (az CLI, Python 3.12+, git):" + echo " macOS: ./scripts/prereqs-macos.sh" + echo " Windows: .\\scripts\\prereqs-windows.ps1" + echo " Linux: install python3.12, git, and the Azure CLI via your package manager" + echo "" echo "Options:" echo "" echo "Identity mode (one required):" @@ -353,6 +358,25 @@ if [ ${#MISSING[@]} -gt 0 ]; then for m in "${MISSING[@]}"; do echo -e " ${RED}✗ $m${NC}" done + echo "" + if [ "$(uname -s)" = "Darwin" ]; then + echo -e " ${YELLOW}One-shot installer for macOS:${NC}" + echo -e " ${YELLOW}./scripts/prereqs-macos.sh${NC}" + echo -e " ${YELLOW}(installs Xcode CLT, Python 3.12+, git, Azure CLI, and optionally${NC}" + echo -e " ${YELLOW}.NET SDK + a365 CLI + PowerShell via Homebrew. 
Pass --core-only${NC}" + echo -e " ${YELLOW}to skip the optional Microsoft Agent 365 tooling.)${NC}" + echo "" + echo -e " ${YELLOW}Or install manually with Homebrew:${NC}" + echo -e " ${YELLOW}brew install python@3.12 git azure-cli${NC}" + elif [ "$(uname -s)" = "Linux" ]; then + echo -e " ${YELLOW}Install via your distro's package manager:${NC}" + echo -e " ${YELLOW}# Debian/Ubuntu${NC}" + echo -e " ${YELLOW}sudo apt install python3.12 git${NC}" + echo -e " ${YELLOW}curl -sL https://aka.ms/InstallAzureCLIDeb | sudo bash${NC}" + else + echo -e " ${YELLOW}For Windows, see scripts/prereqs-windows.ps1.${NC}" + fi + echo "" fail "Install the missing prerequisites above and re-run." fi From 47ef155d226b77296b7743b12ec86835a0223dda Mon Sep 17 00:00:00 2001 From: "B. Brandon Werner" Date: Wed, 27 May 2026 15:38:56 -0700 Subject: [PATCH 13/13] =?UTF-8?q?docs:=20launching=20the=20agent=20?= =?UTF-8?q?=E2=80=94=20host-by-host=20channel-push=20differences?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Documents how to launch the entraclaw stdio MCP server from different host CLIs and the inbound-message delivery model that differs between them — channel push (Claude Code) vs auto-blocking send_teams_message (Copilot CLI, Codex, Cursor). - README "Launching the agent" section with the `--dangerously-load-development-channels server:entraclaw` invocation and a callout that the double-dash matters (Learning #44). - quickstart.md expanded "Launching the Agent" section with the same content plus the dog-ASCII heartbeat preview shown by hosts that block on a Teams reply. - TODOS P0 entry tracking the persona-sati OAuth /authorize + /token PKCE gap that blocks SSE-native auth on Claude Code v2.1.152; the current workaround uses persona-sati's stdio shim. Fix lives in persona-sati, not here. 
--- README.md | 60 ++++++++++++++++++------------ TODOS.md | 19 +++++++++- docs/getting-started/quickstart.md | 50 +++++++++++++++++++++++-- 3 files changed, 100 insertions(+), 29 deletions(-) diff --git a/README.md b/README.md index a4c168f..594009e 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,7 @@ # Entraclaw: Identity Research for Microsoft 365 Agents -Entraclaw is a research project. It is a Python MCP server that gives a device-local agent its own Entra **Agent ID** and an **Agent User** that has all the capabilities of a human user in a Microsoft tenant. It can have a Teams presence and be invited to meetings to chat with your colleagues 1:1, a mailbox it can monitor and respond to, create and edit Word documents, make PowerPoint presentations, and allows you to access your CLI. The agent signs in autonomously, sends Teams messages from its own account, and writes audit events against its own object ID. It runs on macOS, Linux, and Windows, and works with Claude Code, Copilot CLI, or any MCP-speaking client. +Entraclaw is a Python MCP server that gives a device-local agent its own Entra **Agent ID** and an **Agent User** that has all the capabilities of a human user in a Microsoft tenant. It can have a Teams presence, be invited to meetings, chat with your colleagues 1:1, monitor and respond to its own mailbox, create and edit Word documents, make PowerPoint presentations, and give you access to your CLI. The agent signs in autonomously, sends Teams messages from its own account, and writes audit events against its own object ID. It runs on macOS, Linux, and Windows, and works with Claude Code, Copilot CLI, or any MCP-speaking client.
**All you need to get started is:** @@ -89,8 +89,8 @@ Full walkthrough in [`docs/architecture/system-overview.md`](docs/architecture/s Mac or Linux: ```bash -git clone https://github.com/microsoft/entraclaw.git -cd Entraclaw +git clone https://github.com/brandwe/entraclaw-identity-research.git +cd entraclaw-identity-research ./scripts/setup.sh --new --with-upn-suffix=yourname source .venv/bin/activate claude --dangerously-load-development-channels server:entraclaw @@ -98,6 +98,37 @@ claude --dangerously-load-development-channels server:entraclaw `setup.sh` is idempotent. It provisions the Blueprint, BlueprintPrincipal, Agent Identity, and Agent User; assigns a Teams-capable license; uploads a self-signed certificate to Entra; and writes `.env` plus `.mcp.json` with no secrets on disk. Full walkthrough — including Windows, cloud memory, cross-tenant group chats, and the Work IQ Word setup — is in [`docs/getting-started/quickstart.md`](docs/getting-started/quickstart.md) and [`INSTALL.md`](INSTALL.md). +### Launching the agent + +The repo isn't published to npm or PyPI — your host CLI loads the local stdio MCP server from `.mcp.json` in the current working directory. No flag needed for that; it's auto-discovered. What differs between hosts is **how inbound Teams DMs reach the agent**. + +**Claude Code (recommended).** Channel push: inbound Teams messages and emails arrive as next-turn system reminders without a tool call. Requires the dev-channel allowlist flag: + +```bash +claude --dangerously-load-development-channels server:entraclaw +``` + +The double-dash matters — single-dash silently treats `server:entraclaw` as prompt text (Learning #44). `server:entraclaw` is the MCP server name from `.mcp.json`, not a publication identifier. + +**GitHub Copilot CLI, Codex, Cursor, other non-Claude hosts.** MCP tools work, but there's no `notifications/claude/channel` equivalent — channel push is silently absent.
Inbound Teams messages instead arrive **inline** as a `sponsor_reply` field on `send_teams_message`, which auto-blocks until the sponsor replies (host-detected, server-side). + +```bash +copilot # or: codex, cursor, etc. — no flag, just launch from the repo dir +``` + +While the agent is blocked waiting on a Teams reply (any host that calls `wait_for_sponsor_dm` explicitly, or the auto-wait inside `send_teams_message` on non-Claude hosts), the host CLI shows a heartbeat animation so you know it's listening to Teams, not your keyboard: + +``` + __ + (___()'`; woof! 🐕 + /, /` + \"--\ + +(•ᴗ•) zZz... listening for Teams DM [42s] (Ctrl+C to break) +``` + +Frames cycle (`ʕ•ᴥ•ʔ waiting on sponsor`, `(´・ω・`) sponsor hasn't replied yet`, `(◕‿◕) still here, still waiting`, …) every ~30s with elapsed time. Ctrl+C breaks out cleanly. Full host-by-host protocol: [`docs/claude-copilot-cli-channel-port.md`](docs/claude-copilot-cli-channel-port.md) and [`prompts/anatomy/channel-discipline.md`](prompts/anatomy/channel-discipline.md). + After setup, use `./status.sh` as the canonical health and identity check: ```bash @@ -110,7 +141,7 @@ After setup, use `./status.sh` as the canonical health and identity check: ## Documentation -The full doc site: **** +The full doc site: **** Direct pointers: @@ -163,25 +194,6 @@ This is a research repo, not a production service. It runs reliably on a develop ## Contributing -This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and do grant us the rights to use your contribution. For details, visit . - -When you submit a pull request, a CLA bot automatically determines whether you need to provide a CLA and decorates the PR appropriately. Follow the bot's instructions. You only need to do this once across all repositories using the Microsoft CLA. - -This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information, see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or comments. - Test discipline is the contract. TDD: failing test first, implementation second. `pytest -v && ruff check .` must pass before every commit; coverage threshold is 80%. -File issues for bugs and platform questions. PRs welcome — for anything touching auth, Teams, or the body prompt, read [`docs/runbooks/hard-won-learnings.md`](docs/runbooks/hard-won-learnings.md) first. The hard-won learnings file is append-only; new gotchas get numbered entries, never deletions. - -Useful links: - -- [Code of Conduct](https://opensource.microsoft.com/codeofconduct/) -- [Security policy](SECURITY.md) -- [MIT License](LICENSE) -- [Microsoft Open Source](https://opensource.microsoft.com/) -- [Microsoft Privacy Statement](https://privacy.microsoft.com/privacystatement) -- [Microsoft Trademarks](https://www.microsoft.com/legal/intellectualproperty/trademarks) - -## Disclaimer - -*This is a prototype. Flexible FIC and Entra Agent Identity are preview surfaces — APIs may change. The platform is designed to show the pattern and to be copyable, not to be run as-is in production.* +File issues for bugs and platform questions. PRs welcome — for anything touching auth, Teams, or the body prompt, read [`docs/runbooks/hard-won-learnings.md`](docs/runbooks/hard-won-learnings.md) first. The hard-won learnings file is append-only; new gotchas get numbered entries, never deletions. \ No newline at end of file diff --git a/TODOS.md b/TODOS.md index f4daf80..f4f0579 100644 --- a/TODOS.md +++ b/TODOS.md @@ -1,5 +1,20 @@ # TODOS +## P0 — DO NEXT + +### persona-sati: implement /authorize + /token PKCE flow (blocks Claude Code SSE) +Claude Code v2.1.152 now does MCP OAuth 2.1 discovery and ignores `.mcp.json` `headersHelper` when the server advertises OAuth metadata. 
Persona-sati advertises metadata (ADR-006 in persona-sati) but never shipped the browser PKCE `/authorize` endpoint — clicking "Authenticate" in Claude Code's `/mcp` UI lands on `{"error":"invalid_request","error_description":"Missing or malformed Authorization header"}`. + +**Current workaround:** entraclaw's `.mcp.json` was rewritten to use the persona-sati **stdio shim** (`persona-sati-stdio-shim.sh`) instead of native SSE+headersHelper. Works, but moves Claude Code users off the path ADR-005 designed around (SSE-native with mid-session token refresh). + +**Tracking:** persona-sati issue tracker. Fix lives in persona-sati, not here. + +**Bonus bug surfaced:** `persona-sati/scripts/setup.sh` line 280 gates the `.mcp.json` rewire behind `! --skip-deploy`, so `--mcp-transport=stdio --skip-deploy` silently fails to rewrite. Workaround: invoke `wire_mcp_json.py` directly. Also tracked in the persona-sati issue tracker. + +- **Effort:** M (persona-sati side — auth design decision + `/authorize` consent page + `/token` PKCE + redirect-URI allowlist on DCR). No work in this repo until persona-sati ships. +- **Depends on:** persona-sati design choice for browser-flow identity binding (Entra device-code? B2B SSO? localhost-only?). +- **Source:** Diagnosed 2026-05-27 in entraclaw session — Claude Code v2.1.152. + ## P1 ### Script toolkit final phase: README + GitHub Pages script reference @@ -59,9 +74,9 @@ Two bugs, both observed at 2026-04-17T17:00:00 PDT (= 00:00:01 UTC 2026-04-18): - **Effort:** S (~30 LOC + tests for both) - **Source:** Live observation 2026-04-17 evening (first real scheduled fire) -### ~~Email cursor sub-second precision~~ ✅ DONE +### Email cursor sub-second precision `email_poll.poll_once` returns `latest_ts` verbatim from Graph; the cursor file may end up at second precision while Graph internally compares with sub-second. Result: an email at the cursor's exact second gets re-returned every poll. 
Per-session dedup in `_background_poll_email` handles within-session, but the email re-pushes once on every server restart. Real fix: bump cursor by 1ms when it equals the latest receivedDateTime, or store sub-second precision unconditionally. -- **Shipped:** `advance_cursor()` bumps the poll watermark by 1 ms after each batch (PR pending). +- **Effort:** XS (~10 LOC + 1 test) - **Source:** Live observation 2026-04-17 (a teammate's "Ball game tonight" loop) ### ~~Token auto-refresh in teams_send~~ ✅ DONE diff --git a/docs/getting-started/quickstart.md b/docs/getting-started/quickstart.md index f945224..bf94122 100644 --- a/docs/getting-started/quickstart.md +++ b/docs/getting-started/quickstart.md @@ -1,6 +1,6 @@ # Quickstart -**Source:** +**Source:** ## Prerequisites @@ -14,8 +14,8 @@ ## One-Command Setup (macOS/Linux) ```bash -git clone https://github.com/microsoft/entraclaw.git -cd Entraclaw +git clone https://github.com/brandwe/entraclaw-identity-research.git +cd entraclaw-identity-research ./scripts/setup.sh ``` @@ -48,6 +48,50 @@ See `docs/reference/setup-script.md` for the full flag list, and `docs/guides/st pytest -v --cov=entraclaw --cov-report=term-missing ``` +## Launching the Agent + +The repo isn't published to npm or PyPI — your host CLI loads the local stdio MCP server from `.mcp.json` in the current directory. No flag is needed for that part; MCP servers in `.mcp.json` are auto-discovered. What differs between hosts is **how inbound Teams DMs reach the agent**. + +### Claude Code (recommended) + +Channel push: inbound Teams messages and emails arrive as next-turn system reminders without a tool call. Requires the dev-channel allowlist flag: + +```bash +claude --dangerously-load-development-channels server:entraclaw +``` + +The double-dash matters. Single-dash silently treats `server:entraclaw` as prompt text — see Learning #44 in [`docs/runbooks/hard-won-learnings.md`](../runbooks/hard-won-learnings.md).
`server:entraclaw` is the MCP server name from `.mcp.json`, not a publication identifier — the value matches the key inside the `mcpServers` object in `.mcp.json`. + +### GitHub Copilot CLI, Codex, Cursor, and other non-Claude hosts + +MCP tools work, but there is no `notifications/claude/channel` equivalent — channel push is silently absent. Inbound Teams messages instead arrive **inline** as a `sponsor_reply` field on `send_teams_message`, which auto-blocks until the sponsor replies. This is host-detected on the server side; no flag, no parameter: + +```bash +copilot # or: codex, cursor, etc. — no flag, just launch from the repo dir +``` + +While the agent is blocked waiting on a Teams reply (any host that calls `wait_for_sponsor_dm` explicitly, or the auto-wait inside `send_teams_message` on non-Claude hosts), the host CLI shows a heartbeat animation so you know it is listening to Teams, not your keyboard: + +``` + __ + (___()'`; woof! 🐕 + /, /` + \"--\ + +(•ᴗ•) zZz... listening for Teams DM [42s] (Ctrl+C to break) +``` + +Frames cycle every ~30s with an elapsed-time counter: + +- `(•ᴗ•) zZz... listening for Teams DM` +- `(•ᴗ•)╯ checking inbox` +- `ʕ•ᴥ•ʔ waiting on sponsor` +- `(´・ω・`) sponsor hasn't replied yet` +- `(╯°□°)╯ Teams DM = next turn` +- `(◕‿◕) still here, still waiting` + +Ctrl+C breaks out cleanly. Full host-by-host protocol is in [`docs/claude-copilot-cli-channel-port.md`](../claude-copilot-cli-channel-port.md) and [`prompts/anatomy/channel-discipline.md`](../../prompts/anatomy/channel-discipline.md). + ## Common Pitfalls - **Teams provisioning latency.** The 10–15 min wait is real. If `create_chat` 404s, give it another five minutes before debugging.