101 changes: 101 additions & 0 deletions .github/workflows/codeql.yml
@@ -0,0 +1,101 @@
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL Advanced"

on:
  push:
    branches: [ "main", "dev" ]
  pull_request:
    branches: [ "main", "dev" ]
  schedule:
    - cron: '41 11 * * 4'

jobs:
  analyze:
    name: Analyze (${{ matrix.language }})
    # Runner size impacts CodeQL analysis time. To learn more, please see:
    #   - https://gh.io/recommended-hardware-resources-for-running-codeql
    #   - https://gh.io/supported-runners-and-hardware-resources
    #   - https://gh.io/using-larger-runners (GitHub.com only)
    # Consider using larger runners or machines with greater resources for possible analysis time improvements.
    runs-on: ${{ (matrix.language == 'swift' && 'macos-latest') || 'ubuntu-latest' }}
    permissions:
      # required for all workflows
      security-events: write

      # required to fetch internal or private CodeQL packs
      packages: read

      # only required for workflows in private repositories
      actions: read
      contents: read

    strategy:
      fail-fast: false
      matrix:
        include:
        - language: actions
          build-mode: none
        - language: python
          build-mode: none
        # CodeQL supports the following values keywords for 'language': 'actions', 'c-cpp', 'csharp', 'go', 'java-kotlin', 'javascript-typescript', 'python', 'ruby', 'rust', 'swift'
        # Use `c-cpp` to analyze code written in C, C++ or both
        # Use 'java-kotlin' to analyze code written in Java, Kotlin or both
        # Use 'javascript-typescript' to analyze code written in JavaScript, TypeScript or both
        # To learn more about changing the languages that are analyzed or customizing the build mode for your analysis,
        # see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/customizing-your-advanced-setup-for-code-scanning.
        # If you are analyzing a compiled language, you can modify the 'build-mode' for that language to customize how
        # your codebase is analyzed, see https://docs.github.com/en/code-security/code-scanning/creating-an-advanced-setup-for-code-scanning/codeql-code-scanning-for-compiled-languages
    steps:
    - name: Checkout repository
      uses: actions/checkout@v4

    # Add any setup steps before running the `github/codeql-action/init` action.
    # This includes steps like installing compilers or runtimes (`actions/setup-node`
    # or others). This is typically only required for manual builds.
    # - name: Setup runtime (example)
    #   uses: actions/setup-example@v1

    # Initializes the CodeQL tools for scanning.
    - name: Initialize CodeQL
      uses: github/codeql-action/init@v4
      with:
        languages: ${{ matrix.language }}
        build-mode: ${{ matrix.build-mode }}
        # If you wish to specify custom queries, you can do so here or in a config file.
        # By default, queries listed here will override any specified in a config file.
        # Prefix the list here with "+" to use these queries and those in the config file.

        # For more details on CodeQL's query packs, refer to: https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/configuring-code-scanning#using-queries-in-ql-packs
        # queries: security-extended,security-and-quality

    # If the analyze step fails for one of the languages you are analyzing with
    # "We were unable to automatically build your code", modify the matrix above
    # to set the build mode to "manual" for that language. Then modify this step
    # to build your code.
    # ℹ️ Command-line programs to run using the OS shell.
    # 📚 See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsrun
    - name: Run manual build steps
      if: matrix.build-mode == 'manual'
      shell: bash
      run: |
        echo 'If you are using a "manual" build mode for one or more of the' \
          'languages you are analyzing, replace this with the commands to build' \
          'your code, for example:'
        echo ' make bootstrap'
        echo ' make release'
        exit 1

    - name: Perform CodeQL Analysis
      uses: github/codeql-action/analyze@v4
      with:
        category: "/language:${{matrix.language}}"
3 changes: 3 additions & 0 deletions .github/workflows/test-windows.yml
@@ -11,6 +11,9 @@ on:
   pull_request:
     branches: [main]

+permissions:
+  contents: read
+
 jobs:
   test-windows:
     runs-on: windows-latest
2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -41,7 +41,7 @@ First public release. Reference implementation for Microsoft Entra Agent ID and
**Discipline**
- 1,237 tests, ruff clean, 80% coverage threshold enforced.
- 66 hard-won learnings at `docs/runbooks/hard-won-learnings.md`.
-- Full docs site at <https://legendary-adventure-k5npoz7.pages.github.io/> with API reference, script reference, ADRs, platform learnings, and runbooks.
+- Full docs site at <https://microsoft.github.io/entraclaw/> with API reference, script reference, ADRs, platform learnings, and runbooks.

### Known limitations

37 changes: 34 additions & 3 deletions README.md
@@ -89,15 +89,46 @@ Full walkthrough in [`docs/architecture/system-overview.md`](docs/architecture/s
Mac or Linux:

```bash
-git clone https://github.com/microsoft/Entraclaw.git
-cd Entraclaw
+git clone https://github.com/brandwe/entraclaw-identity-research.git
+cd entraclaw-identity-research
./scripts/setup.sh --new --with-upn-suffix=yourname
source .venv/bin/activate
claude --dangerously-load-development-channels server:entraclaw
```

`setup.sh` is idempotent. It provisions the Blueprint, BlueprintPrincipal, Agent Identity, and Agent User; assigns a Teams-capable license; uploads a self-signed certificate to Entra; and writes `.env` plus `.mcp.json` with no secrets on disk. Full walkthrough — including Windows, cloud memory, cross-tenant group chats, and the Work IQ Word setup — is in [`docs/getting-started/quickstart.md`](docs/getting-started/quickstart.md) and [`INSTALL.md`](INSTALL.md).

### Launching the agent

The repo isn't published to npm/pypi — your host CLI loads the local stdio MCP server from `.mcp.json` in the cwd. No flag needed for that; it's auto-discovered. What differs between hosts is **how inbound Teams DMs reach the agent**.

**Claude Code (recommended).** Channel push: inbound Teams messages and emails arrive as next-turn system reminders without a tool call. Requires the dev-channel allowlist flag:

```bash
claude --dangerously-load-development-channels server:entraclaw
```

The double-dash matters — single-dash silently treats `server:entraclaw` as prompt text (Learning #44). `server:entraclaw` is the MCP server name from `.mcp.json`, not a publication identifier.

**GitHub Copilot CLI, Codex, Cursor, other non-Claude hosts.** MCP tools work, but there's no `notifications/claude/channel` equivalent — channel push is silently absent. Inbound Teams messages instead arrive **inline** as `sponsor_reply` on `send_teams_message`, which auto-blocks until the sponsor replies (host-detected, server-side).

```bash
copilot # or: codex, cursor, etc. — no flag, just launch from the repo dir
```

While the agent is blocked waiting on a Teams reply (any host that calls `wait_for_sponsor_dm` explicitly), the host CLI shows a heartbeat animation so you know it's listening to Teams, not your keyboard:

```
__
(___()'`; woof! 🐕
/, /`
\"--\

(•ᴗ•) zZz... listening for Teams DM [42s] (Ctrl+C to break)
```

Frames cycle (`ʕ•ᴥ•ʔ waiting on sponsor`, `(´・ω・`) sponsor hasn't replied yet`, `(◕‿◕) still here, still waiting`, …) every ~30s with elapsed time. Ctrl+C breaks out cleanly. Full host-by-host protocol: [`docs/claude-copilot-cli-channel-port.md`](docs/claude-copilot-cli-channel-port.md) and [`prompts/anatomy/channel-discipline.md`](prompts/anatomy/channel-discipline.md).

After setup, use `./status.sh` as the canonical health and identity check:

```bash
@@ -110,7 +141,7 @@ After setup, use `./status.sh` as the canonical health and identity check:

## Documentation

-The full doc site: **<https://legendary-adventure-k5npoz7.pages.github.io/>**
+The full doc site: **<https://brandonwerner.com/entraclaw-identity-research/>**

Direct pointers:

15 changes: 15 additions & 0 deletions TODOS.md
@@ -1,5 +1,20 @@
# TODOS

## P0 — DO NEXT

### persona-sati: implement /authorize + /token PKCE flow (blocks Claude Code SSE)
Claude Code v2.1.152 now does MCP OAuth 2.1 discovery and ignores `.mcp.json` `headersHelper` when the server advertises OAuth metadata. Persona-sati advertises metadata (ADR-006 in persona-sati) but never shipped the browser PKCE `/authorize` endpoint — clicking "Authenticate" in Claude Code's `/mcp` UI lands on `{"error":"invalid_request","error_description":"Missing or malformed Authorization header"}`.

**Current workaround:** entraclaw's `.mcp.json` was rewritten to use the persona-sati **stdio shim** (`persona-sati-stdio-shim.sh`) instead of native SSE+headersHelper. Works, but moves Claude Code users off the path ADR-005 designed around (SSE-native with mid-session token refresh).

**Tracking:** persona-sati issue tracker. Fix lives in persona-sati, not here.

**Bonus bug surfaced:** `persona-sati/scripts/setup.sh` line 280 gates the `.mcp.json` rewire behind `! --skip-deploy`, so `--mcp-transport=stdio --skip-deploy` silently fails to rewrite. Workaround: invoke `wire_mcp_json.py` directly. Also tracked in the persona-sati issue tracker.

- **Effort:** M (persona-sati side — auth design decision + `/authorize` consent page + `/token` PKCE + redirect-URI allowlist on DCR). No work in this repo until persona-sati ships.
- **Depends on:** persona-sati design choice for browser-flow identity binding (Entra device-code? B2B SSO? localhost-only?).
- **Source:** Diagnosed 2026-05-27 in entraclaw session — Claude Code v2.1.152.

## P1

### Script toolkit final phase: README + GitHub Pages script reference
2 changes: 1 addition & 1 deletion docs/developer/docs-site.md
@@ -21,7 +21,7 @@ The workflow:
4. Uploads the `site/` artifact to GitHub Pages.
5. Deploys via `actions/deploy-pages@v4`.

-Published at <https://legendary-adventure-k5npoz7.pages.github.io/>. GitHub Pages is configured with `build_type=workflow`; re-enable via:
+Published at <https://microsoft.github.io/entraclaw/>. GitHub Pages is configured with `build_type=workflow`; re-enable via:

```bash
gh api -X POST repos/<owner>/<repo>/pages -f 'build_type=workflow'
9 changes: 5 additions & 4 deletions docs/engineering-status.md
@@ -1,7 +1,7 @@
# Engineering Status

-**Last updated:** 2026-05-21
-**Status:** v1 released. Three auth modes (Agent User / Delegated / Bot Gateway) running locally on macOS, Linux, and ARM64 Windows 11. **1,237 tests** across the suite, ruff clean. Body-first prompt architecture loads at boot; persona-sati MCP wires personality and memory when configured. ADR-005 cloud-memory Phases 1, 2, 5, 6a shipped — blob storage is opt-in via `setup.sh --use-cloud-memory`. Work IQ Word migration landed (PR #75) and the `send_teams_message` auto-wait pattern is host-gated and deterministic. README, docs site, and GitHub Pages auto-deploy refreshed 2026-05-21.
+**Last updated:** 2026-05-27
+**Status:** v1 released. Three auth modes (Agent User / Delegated / Bot Gateway) running locally on macOS, Linux, and ARM64 Windows 11. **1,248 tests** across the suite, ruff clean. Body-first prompt architecture loads at boot; persona-sati MCP wires personality and memory when configured. ADR-005 cloud-memory Phases 1, 2, 5, 6a shipped — blob storage is opt-in via `setup.sh --use-cloud-memory`. Work IQ Word migration landed (PR #75) and the `send_teams_message` auto-wait pattern is host-gated and deterministic. README, docs site, and GitHub Pages auto-deploy refreshed 2026-05-21.

---

@@ -10,15 +10,16 @@
Source of truth for detail: `TODOS.md` in the repository root. One line each below.

- **Script-toolkit docs closeout** — `./status.sh` is the canonical entry; finish the remaining script-reference polish and smoke verification. See `TODOS.md` P1.
-- **Test isolation: blob env leakage** — `tmp_data_dir` fixture in `tests/tools/test_interaction_log.py` doesn't clear `ENTRACLAW_BLOB_ENDPOINT`; 10 tests fail on any machine with blob env configured.
+- **Test isolation: blob env leakage** — `tmp_data_dir` fixture in `tests/tools/test_interaction_log.py` doesn't clear `ENTRACLAW_BLOB_ENDPOINT`; 10 tests fail on any machine with blob env configured. Partially addressed: `test_interaction_log.py`, `test_daily_summary.py`, and `test_email_poll.py` fixtures now unset blob env; session-scoped autouse fixture still open.
- **MCP server orphans on Claude Code exit** — background poll tasks sit outside FastMCP's lifespan cancel scope; new sessions spawn a second server, both poll Graph independently.
- **Daily summary scheduler — wrong day + double-fire** — UTC-based `target_day` summarizes the brand-new UTC day at 5pm PDT; scheduler fired twice at the same second on 2026-04-17.
-- **Email cursor sub-second precision** — cursor file at second precision; an email at the cursor's exact second gets re-pushed once on every server restart.
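
The blob-env leakage in the test-isolation item above could be contained with a helper along these lines. This is a minimal sketch under stated assumptions, not the repository's fixture: `blob_env_cleared` and `BLOB_ENV_VARS` are hypothetical names, and only `ENTRACLAW_BLOB_ENDPOINT` is taken from the note; any other blob variables would have to be added to the tuple.

```python
import os
from contextlib import contextmanager
from unittest import mock

# Only ENTRACLAW_BLOB_ENDPOINT is named in the status note; extend as needed.
BLOB_ENV_VARS = ("ENTRACLAW_BLOB_ENDPOINT",)


@contextmanager
def blob_env_cleared(names=BLOB_ENV_VARS):
    """Temporarily remove blob-related variables from os.environ (hypothetical helper)."""
    # patch.dict snapshots os.environ and restores it when the block exits.
    with mock.patch.dict(os.environ):
        for name in names:
            os.environ.pop(name, None)
        yield
```

If the project wanted the still-open session-wide behaviour, one option would be to enter this context manager from a session-scoped autouse fixture in `conftest.py`; that wiring is not shown here because the fixture layout is not visible in this diff.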

## Recently Shipped

Last ~30 days. Full diff: `git log --since="2026-04-21"`.

- **`read_email` MCP tool** (2026-05-27) — fetches the full body + all recipient lists + headers of an inbound mail by `message_id`. Fixes the gap where the 60s email-poll channel push truncates the preview of long forwarded mails. Same three-hop Agent User token + `Mail.Read` scope as the poll. +7 tests.
- **Email cursor sub-second precision** (2026-05-27) — `advance_cursor()` bumps the poll watermark by 1 ms so Graph's `gt` filter does not re-fetch messages at the cursor's exact second after a server restart.
- **README + docs-site refresh** (2026-05-21, ff9a8dd, 9b73dee, b495073) — developer-first README rewrite, GitHub Pages auto-deploy, nav restructure.
- **OSS sanitization passes** (2026-05-21, f2a3c18; 2026-05-18, 6cff243) — PII scrub, personal data and private identifiers removed from repo.
- **Script toolkit refactor + E2E smoke harness** (2026-05-19, PR #77) — `./status.sh` consolidated; `setup.sh --status` delegates to the same implementation.
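
The cursor fix listed above (bumping the watermark by 1 ms so a `gt` filter skips the message already delivered) can be illustrated roughly as follows. This is a sketch, not the shipped code: the signature of `advance_cursor` is assumed, and `graph_filter` and the `receivedDateTime` field are hypothetical choices made for the example.

```python
from datetime import datetime, timedelta, timezone


def advance_cursor(last_seen: datetime) -> datetime:
    """Return a watermark 1 ms past the newest message seen (assumed signature)."""
    return last_seen + timedelta(milliseconds=1)


def graph_filter(cursor: datetime) -> str:
    """Build a `gt` filter clause with millisecond precision (hypothetical helper)."""
    # %f yields microseconds; dropping the last three digits leaves milliseconds.
    stamp = cursor.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3]
    return f"receivedDateTime gt {stamp}Z"
```

With a second-precision cursor, a message stamped at exactly the cursor's second can satisfy the filter again after a restart; a cursor 1 ms later excludes it while still admitting anything newer.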
50 changes: 47 additions & 3 deletions docs/getting-started/quickstart.md
@@ -1,6 +1,6 @@
# Quickstart

-**Source:** <https://github.com/microsoft/Entraclaw>
+**Source:** <https://github.com/brandwe/entraclaw-identity-research>

## Prerequisites

@@ -14,8 +14,8 @@
## One-Command Setup (macOS/Linux)

```bash
-git clone https://github.com/microsoft/Entraclaw.git
-cd Entraclaw
+git clone https://github.com/brandwe/entraclaw-identity-research.git
+cd entraclaw-identity-research
./scripts/setup.sh
```

@@ -48,6 +48,50 @@ See `docs/reference/setup-script.md` for the full flag list, and `docs/guides/st
pytest -v --cov=entraclaw --cov-report=term-missing
```

## Launching the Agent

The repo isn't published to npm or pypi — your host CLI loads the local stdio MCP server from `.mcp.json` in the current directory. No flag is needed for that part; MCP servers in `.mcp.json` are auto-discovered. What differs between hosts is **how inbound Teams DMs reach the agent**.

### Claude Code (recommended)

Channel push: inbound Teams messages and emails arrive as next-turn system reminders without a tool call. Requires the dev-channel allowlist flag:

```bash
claude --dangerously-load-development-channels server:entraclaw
```

The double-dash matters. Single-dash silently treats `server:entraclaw` as prompt text — see Learning #44 in [`docs/runbooks/hard-won-learnings.md`](../runbooks/hard-won-learnings.md). `server:entraclaw` is the MCP server name from `.mcp.json`, not a publication identifier — the value matches the key inside the `mcpServers` object in `.mcp.json`.
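
To check which value belongs after the flag, the server names can be read straight out of `.mcp.json`. The sketch below assumes only what the paragraph above states, namely that the name is a key of the `mcpServers` object; `mcp_server_names` and `channel_argument` are hypothetical helpers, not part of the repo.

```python
import json
from pathlib import Path


def mcp_server_names(config_path=".mcp.json"):
    """List the keys of the `mcpServers` object in an MCP config file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return sorted(config.get("mcpServers", {}))


def channel_argument(server_name):
    """Format the value passed after the dev-channel flag, e.g. server:entraclaw."""
    return f"server:{server_name}"
```

Calling `mcp_server_names()` from the repo directory after `setup.sh` has written `.mcp.json` should list `entraclaw` among the keys, and `channel_argument("entraclaw")` gives the string used in the command above.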

### GitHub Copilot CLI, Codex, Cursor, and other non-Claude hosts

MCP tools work, but there is no `notifications/claude/channel` equivalent — channel push is silently absent. Inbound Teams messages instead arrive **inline** as a `sponsor_reply` field on `send_teams_message`, which auto-blocks until the sponsor replies. This is host-detected on the server side; no flag, no parameter:

```bash
copilot # or: codex, cursor, etc. — no flag, just launch from the repo dir
```

While the agent is blocked waiting on a Teams reply (any host that calls `wait_for_sponsor_dm` explicitly, or the auto-wait inside `send_teams_message` on non-Claude hosts), the host CLI shows a heartbeat animation so you know it is listening to Teams, not your keyboard:

```
__
(___()'`; woof! 🐕
/, /`
\"--\

(•ᴗ•) zZz... listening for Teams DM [42s] (Ctrl+C to break)
```

Frames cycle every ~30s with an elapsed-time counter:

- `(•ᴗ•) zZz... listening for Teams DM`
- `(•ᴗ•)╯ checking inbox`
- `ʕ•ᴥ•ʔ waiting on sponsor`
- `(´・ω・`) sponsor hasn't replied yet`
- `(╯°□°)╯ Teams DM = next turn`
- `(◕‿◕) still here, still waiting`

Ctrl+C breaks out cleanly. Full host-by-host protocol is in [`docs/claude-copilot-cli-channel-port.md`](../claude-copilot-cli-channel-port.md) and [`prompts/anatomy/channel-discipline.md`](../../prompts/anatomy/channel-discipline.md).

## Common Pitfalls

- **Teams provisioning latency.** The 10–15 min wait is real. If `create_chat` 404s, give it another five minutes before debugging.
2 changes: 1 addition & 1 deletion docs/index.md
@@ -1,6 +1,6 @@
# Entraclaw Identity Research

-**Source:** <https://github.com/microsoft/Entraclaw> · **License:** MIT
+**Source:** <https://github.com/microsoft/entraclaw> · **License:** MIT

Entraclaw is a Python MCP server that gives a device-local agent its own Entra **Agent ID** and an **Agent User** that has all the capabilities of a human user in a Microsoft tenant. It can have a Teams presence and be invited to meetings to chat with your colleagues 1:1, a mailbox it can monitor and respond to, create and edit Word documents, make PowerPoint presentations, and allows you to access your CLI. The agent signs in autonomously, sends Teams messages from its own account, and writes audit events against its own object ID. It runs on macOS, Linux, and Windows, and works with Claude Code, Copilot CLI, or any MCP-speaking client.

4 changes: 1 addition & 3 deletions docs/reference/agent-foundry-entra-provisioning.py
@@ -469,11 +469,9 @@ def verify_graph_preflight(token, checks):
 def bootstrap_cli():
     try:
         required_values = build_required_permission_values(include_ca=True)
-        client_id, _, tenant_id = ensure_app_registration(required_values)
+        ensure_app_registration(required_values)
         print("")
         print("Provisioner bootstrap complete")
-        print(f" Tenant: {tenant_id}")
-        print(f" Client ID: {client_id}")
         print(f" Permissions: {', '.join(required_values)}")
         return 0
     except ProvisionerBootstrapError as exc:
6 changes: 3 additions & 3 deletions manifests/teams-app/manifest.json
@@ -5,9 +5,9 @@
   "id": "00000000-0000-0000-0000-000000000000",
   "developer": {
     "name": "EntraClaw Research",
-    "websiteUrl": "https://github.com/microsoft/Entraclaw",
-    "privacyUrl": "https://github.com/microsoft/Entraclaw",
-    "termsOfUseUrl": "https://github.com/microsoft/Entraclaw"
+    "websiteUrl": "https://github.com/microsoft/entraclaw",
+    "privacyUrl": "https://github.com/microsoft/entraclaw",
+    "termsOfUseUrl": "https://github.com/microsoft/entraclaw"
   },
   "name": {
     "short": "EntraClaw Agent",