feat: implement per-user token auth for Vault (#18) #36

Merged
jh-lee-cryptolab merged 6 commits into main from issue-18-per-user-auth
Mar 24, 2026
jh-lee-cryptolab commented Mar 23, 2026

Summary

  • Per-user token management via runevault CLI (issue, revoke, list)
  • Role-based access control with CRUD operations (admin/agent defaults + custom roles)
  • In-memory token/role store with async YAML persistence (vault-tokens.yml, vault-roles.yml)
  • Admin HTTP API on container-internal unix socket (no admin token — SSH + docker group + container isolation)
  • Per-user rate limiting, top_k enforcement, scope checks, token expiry
  • Per-user monitoring labels in Prometheus metrics
  • Backward compatibility: legacy VAULT_TOKENS env var with deprecation warning
  • Cloud deployments (AWS/GCP/OCI): migrate VAULT_TOKENS → VAULT_TEAM_SECRET, add config volumes and runevault alias
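
The token lifecycle and scope checks listed above can be illustrated with a minimal in-memory sketch. This is not the PR's `token_store.py`: the `Role`, `TokenRecord`, and `TokenStore` shapes, the `search` scope, and the method names are assumptions made for illustration, and YAML persistence is omitted.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Role:
    # Hypothetical role shape: a set of scopes plus a top_k ceiling.
    name: str
    scopes: frozenset
    top_k: int


@dataclass
class TokenRecord:
    user: str
    role: str
    expires_at: Optional[float]  # Unix timestamp; None means no expiry


class TokenStore:
    """In-memory token/role store (persistence intentionally left out)."""

    def __init__(self, roles):
        self._roles = {role.name: role for role in roles}
        self._tokens = {}  # token string -> TokenRecord

    def issue(self, user, role, ttl_seconds=None, now=None):
        if role not in self._roles:
            raise ValueError(f"unknown role: {role}")
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        expires_at = None if ttl_seconds is None else now + ttl_seconds
        self._tokens[token] = TokenRecord(user, role, expires_at)
        return token

    def revoke(self, user):
        """Remove every token held by `user`; return how many were removed."""
        doomed = [t for t, rec in self._tokens.items() if rec.user == user]
        for token in doomed:
            del self._tokens[token]
        return len(doomed)

    def list_users(self):
        return sorted({rec.user for rec in self._tokens.values()})

    def validate(self, token, scope, now=None):
        """Return the username if the token is live and has `scope`, else None."""
        now = time.time() if now is None else now
        rec = self._tokens.get(token)
        if rec is None:
            return None
        if rec.expires_at is not None and now >= rec.expires_at:
            return None
        if scope not in self._roles[rec.role].scopes:
            return None
        return rec.user
```

Passing `now` explicitly keeps expiry checks deterministic in tests; a real store would also need locking if gRPC handlers and the admin API touch it concurrently.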

Design decisions

Metadata DEK: per-agent → single team key

Previously, metadata DEKs were issued per agent_id. With per-user tokens, every user with decrypt_metadata scope can decrypt metadata — so per-agent DEKs provide no meaningful isolation. If any single DEK is compromised, the attacker gains access to the same data that all other DEKs protect. Switched to a single team_secret-based metadata key to simplify key management without reducing effective security.

Test plan

  • pytest tests/unit/test_token_store.py — token lifecycle, role CRUD, rate limiting, persistence
  • pytest tests/unit/test_admin_server.py — admin HTTP API over unix socket
  • pytest tests/unit/test_auth.py — token validation, scope enforcement
  • Docker build + runevault token issue/list/revoke smoke test
  • Verify gRPC error codes (UNAUTHENTICATED, RESOURCE_EXHAUSTED, INVALID_ARGUMENT)
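
To make the error-code item above concrete, here is a hedged sketch of how request checks might map to the three status codes. The mapping is an assumption (unknown user to UNAUTHENTICATED, rate limit exceeded to RESOURCE_EXHAUSTED, out-of-range top_k to INVALID_ARGUMENT), the codes are plain strings rather than `grpc.StatusCode` members, and `SlidingWindowLimiter` and `check_request` are hypothetical names.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Per-user limiter: at most `max_calls` within any `window_seconds` span."""

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window = window_seconds
        self._calls = {}  # user -> deque of call timestamps

    def allow(self, user, now=None):
        now = time.monotonic() if now is None else now
        calls = self._calls.setdefault(user, deque())
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True


def check_request(user, requested_top_k, max_top_k, limiter, now=None):
    """Return the name of the status code a handler would respond with."""
    if user is None:  # token missing, unknown, revoked, or expired
        return "UNAUTHENTICATED"
    if not limiter.allow(user, now):
        return "RESOURCE_EXHAUSTED"
    if requested_top_k < 1 or requested_top_k > max_top_k:
        return "INVALID_ARGUMENT"
    return "OK"
```

In this sketch a request rejected for top_k still counts against the user's rate limit; whether the real server behaves that way is not stated in the PR.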

Closes #18

jh-lee-cryptolab and others added 3 commits March 23, 2026 12:27
Replace single shared VAULT_TOKENS with per-user token system:
- TokenStore with in-memory auth + async YAML persistence
- Admin HTTP server on internal unix socket for token/role CRUD
- runevault CLI (vault_admin_cli.py) for admin operations
- Per-role top_k, rate limiting, scope enforcement
- VAULT_TEAM_SECRET for shared DEK derivation (backward compat)
- Specific gRPC error codes (UNAUTHENTICATED, RESOURCE_EXHAUSTED, etc.)
- Per-user label in Prometheus metrics

New files: token_store.py, admin_server.py, vault_admin_cli.py
Tests: 38 passing (test_token_store.py, test_admin_server.py)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add per-user monitoring: resolve username before gRPC handlers so
  Prometheus user label reflects actual identity instead of "unknown"
- Change CLI --expires-days to --expires with duration syntax (90d, 12w, 6m)
  to match issue spec
- Add rate_limit format validation on role create/update to prevent
  deferred crashes during gRPC request handling
- Remove misleading delete_role CLI warning (server already rejects
  deletion when tokens reference the role)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
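
The `--expires` duration syntax and the up-front rate_limit validation described in this commit could look roughly like the sketch below. The unit meanings (`d` days, `w` weeks, `m` treated as 30-day months) and the `"<count>/<seconds>s"` rate-limit format are assumptions; the PR does not spell out either, and `parse_duration` and `parse_rate_limit` are hypothetical names.

```python
import re
from datetime import timedelta

_DURATION_RE = re.compile(r"(\d+)([dwm])")
_UNIT_DAYS = {"d": 1, "w": 7, "m": 30}  # assumption: "m" means a 30-day month

_RATE_RE = re.compile(r"(\d+)/(\d+)s")  # assumed format, e.g. "30/60s"


def parse_duration(text):
    """Parse strings such as '90d', '12w', or '6m' into a timedelta."""
    match = _DURATION_RE.fullmatch(text.strip())
    if match is None or int(match.group(1)) == 0:
        raise ValueError(f"invalid duration: {text!r}")
    return timedelta(days=int(match.group(1)) * _UNIT_DAYS[match.group(2)])


def parse_rate_limit(text):
    """Validate a rate limit when a role is created or updated.

    Returns (max_calls, window_seconds) so a malformed value fails here
    rather than later, in the middle of handling a gRPC request.
    """
    match = _RATE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"invalid rate_limit: {text!r}")
    max_calls, window_seconds = int(match.group(1)), int(match.group(2))
    if max_calls == 0 or window_seconds == 0:
        raise ValueError(f"rate_limit values must be positive: {text!r}")
    return max_calls, window_seconds
```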
- Rename vault_token → team_secret across all Terraform configs (AWS/GCP/OCI)
- Replace VAULT_TOKENS env var with VAULT_TEAM_SECRET in startup scripts
- Add config volume mount and generate vault-roles.yml/vault-tokens.yml
  so cloud instances boot in per-user mode instead of legacy single-token
- Add runevault CLI alias and docker group setup for admin user

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jh-lee-cryptolab self-assigned this Mar 23, 2026
jh-lee-cryptolab and others added 2 commits March 23, 2026 13:28
Unix socket bind failed with PermissionError in Docker due to /var/run
ownership. Switch to a standard HTTP server on localhost:8081, which is
not exposed in docker-compose and remains container-internal only.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
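
A minimal sketch of the loopback-only approach from this commit, using Python's standard `http.server`. Binding to `127.0.0.1` rather than `0.0.0.0` is what keeps the listener unreachable from outside the container when the port is not published. The `/tokens` route, the response shape, and the `make_admin_server` helper are hypothetical and not taken from `admin_server.py`.

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class AdminHandler(BaseHTTPRequestHandler):
    # Hypothetical stand-in for the real token store: username -> role.
    tokens = {"alice": "agent"}

    def do_GET(self):
        if self.path != "/tokens":
            self.send_error(404)
            return
        body = json.dumps(self.tokens).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        # Silence the default per-request stderr logging.
        pass


def make_admin_server(port=8081):
    # Loopback bind: only processes inside the same network namespace
    # (here, the container) can connect.
    return ThreadingHTTPServer(("127.0.0.1", port), AdminHandler)
```

Calling `make_admin_server().serve_forever()` would start the listener; passing `port=0` lets the OS choose a free port, which is convenient in tests.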
…onfig files (#18)

Remove hardcoded role seeding from cloud-init.yaml; TokenStore now
auto-persists default roles/tokens YAML on first boot when files
don't exist yet.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… member (#18)

AWS cloud-init was missing vault-roles.yml/vault-tokens.yml generation.
Also fix GCP/OCI startup scripts to use 'member' role instead of 'agent'.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
jh-lee-cryptolab commented

Additional changes planned for metadata DEK architecture:

Motivation: The current design derives a single shared DEK from VAULT_TEAM_SECRET and distributes it to all agents. If any agent's DEK is compromised, all team metadata encrypted with that key is exposed, and there is no way to rotate the key for the affected agent without impacting every other agent.

  1. Use VAULT_TEAM_SECRET as the master secret and HKDF for per-user DEK derivation.
    Each agent's DEK will be derived via HKDF, using the team secret as input key material and a per-user identifier as the info parameter: agent_id = SHA256(token)[:32], agent_dek = HKDF-SHA256(ikm=team_secret, info=agent_id). This ensures:

    • Each member receives a unique DEK tied to their token.
    • Vault can still decrypt any member's metadata using team_secret + the agent_id stored in the envelope {"a": ..., "c": ...}.
    • Token rotation automatically rotates the DEK — revoking a compromised token invalidates the associated DEK for future writes, while existing data remains decryptable via the envelope's agent_id.
  2. Remove dependency on pyenvector's MetadataKey.json for DEK derivation.
    Currently, Vault loads MetadataKey.json via get_key_stream() and uses it as the HMAC master key for per-agent DEK derivation. However, enVector Cloud's built-in metadata encryption is disabled (metadata_encryption=False on index creation), and the key file is only generated as a side effect of pyenvector's default KeyGenerator behavior (metadata_encryption defaults to True). This key is being repurposed outside its original intent. Item 1 above replaces this with Vault's own master secret.
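
The derivation proposed in item 1 can be sketched with the standard library alone. `hkdf_sha256` below follows the extract-then-expand construction from RFC 5869; the rest rests on assumptions: `agent_id` is taken as the first 32 hex characters of the SHA-256 digest of the token, no salt is used, the DEK is 32 bytes, and the function names are hypothetical.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm, info, length=32, salt=b""):
    """HKDF (RFC 5869) with HMAC-SHA256: extract, then expand."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF-SHA256")
    if not salt:
        salt = bytes(HASH_LEN)  # RFC 5869 default: HashLen zero bytes
    # Extract: condense the input key material into a pseudorandom key.
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks until enough output has been produced.
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def agent_id_for(token):
    # Assumption: SHA256(token)[:32] means the first 32 hex characters.
    return hashlib.sha256(token.encode("utf-8")).hexdigest()[:32]


def agent_dek(team_secret, agent_id):
    # Per-user DEK: team secret as IKM, agent_id as the HKDF info parameter.
    return hkdf_sha256(team_secret, info=agent_id.encode("ascii"))
```

Because `agent_dek` depends only on `team_secret` and the `agent_id` stored in the envelope, the Vault side can re-derive any member's key without keeping per-user key material, which is the property the list above relies on.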

Comment thread deployment/aws/cloud-init.yaml
jh-lee-cryptolab marked this pull request as ready for review March 24, 2026 22:54
jh-lee-cryptolab merged commit ef0d6cf into main Mar 24, 2026
jh-lee-cryptolab deleted the issue-18-per-user-auth branch March 24, 2026 22:54
Development

Successfully merging this pull request may close these issues.

Per-user token auth for Vault

2 participants